regression models explored: Topics by Science.gov

Sample records for regression models explored

EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed Privacy-Preserving Online Model Learning

PubMed Central

Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

2013-01-01

We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651
EXpectation Propagation LOgistic REgRession (EXPLORER): distributed privacy-preserving online model learning.

PubMed

Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

2013-06-01

We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection, etc.) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. Copyright © 2013 Elsevier Inc. All rights reserved.
Building Your Own Regression Model

ERIC Educational Resources Information Center

Horton, Robert M.; Phillips, Vicki; Kenelly, John

2004-01-01

Spreadsheets to explore regression with an algebra 2 class in a medium-sized rural high school are presented. The use of spreadsheets can help students develop sophisticated understanding of mathematical models and use them to describe real-world phenomena.
Mechanisms of Developmental Regression in Autism and the Broader Phenotype: A Neural Network Modeling Approach

ERIC Educational Resources Information Center

Thomas, Michael S. C.; Knowland, Victoria C. P.; Karmiloff-Smith, Annette

2011-01-01

Loss of previously established behaviors in early childhood constitutes a markedly atypical developmental trajectory. It is found almost uniquely in autism and its cause is currently unknown (Baird et al., 2008). We present an artificial neural network model of developmental regression, exploring the hypothesis that regression is caused by…
Using Gamma and Quantile Regressions to Explore the Association between Job Strain and Adiposity in the ELSA-Brasil Study: Does Gender Matter?

PubMed

Fonseca, Maria de Jesus Mendes da; Juvanhol, Leidjaira Lopes; Rotenberg, Lúcia; Nobre, Aline Araújo; Griep, Rosane Härter; Alves, Márcia Guimarães de Mello; Cardoso, Letícia de Oliveira; Giatti, Luana; Nunes, Maria Angélica; Aquino, Estela M L; Chor, Dóra

2017-11-17

This paper explores the association between job strain and adiposity, using two statistical analysis approaches and considering the role of gender. The research evaluated 11,960 active baseline participants (2008-2010) in the ELSA-Brasil study. Job strain was evaluated through a demand-control questionnaire, while body mass index (BMI) and waist circumference (WC) were evaluated in continuous form. The associations were estimated using gamma regression models with an identity link function. Quantile regression models were also estimated from the final set of co-variables established by gamma regression. The relationship that was found varied by analytical approach and gender. Among the women, no association was observed between job strain and adiposity in the fitted gamma models. In the quantile models, a pattern of increasing effects of high strain was observed at higher BMI and WC distribution quantiles. Among the men, high strain was associated with adiposity in the gamma regression models. However, when quantile regression was used, that association was found not to be homogeneous across outcome distributions. In addition, in the quantile models an association was observed between active jobs and BMI. Our results point to an association between job strain and adiposity, which follows a heterogeneous pattern. Modelling strategies can produce different results and should, accordingly, be used to complement one another.
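The contrast drawn above between mean-based models and quantile models can be illustrated with the check (pinball) loss that quantile regression minimizes. The following is a minimal sketch, not the study's analysis: the data values are hypothetical, the names `pinball_loss` and `best_constant` are illustrative, and the model is reduced to a single constant with no covariates.

```python
def pinball_loss(y, prediction, tau):
    """Mean check (pinball) loss of a constant prediction at quantile level tau."""
    total = 0.0
    for yi in y:
        residual = yi - prediction
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y)


def best_constant(y, tau):
    """Search the observed values for the constant that minimizes the loss."""
    return min(sorted(set(y)), key=lambda c: pinball_loss(y, c, tau))


# Hypothetical BMI-like values with one large observation in the upper tail.
bmi = [21.0, 23.0, 24.0, 26.0, 27.0, 30.0, 41.0]
median_fit = best_constant(bmi, 0.5)  # tau = 0.5 recovers the sample median
upper_fit = best_constant(bmi, 0.9)   # tau = 0.9 tracks the upper tail
```

With covariates added, the same loss is minimized over regression coefficients instead of a single constant, which is why estimated effects can differ from one quantile of the outcome distribution to another.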
Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

PubMed

Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

2015-01-01

This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. The objective was to explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model.
Furthermore, the developed CLR models tend to achieve better sensitivity of more than 10% over the standard classification models, which can be translated to correct labeling of an additional 400 to 500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.
Using Robust Variance Estimation to Combine Multiple Regression Estimates with Meta-Analysis

ERIC Educational Resources Information Center

Williams, Ryan

2013-01-01

The purpose of this study was to explore the use of robust variance estimation for combining commonly specified multiple regression models and for combining sample-dependent focal slope estimates from diversely specified models. The proposed estimator obviates traditionally required information about the covariance structure of the dependent…
Predicting nitrogen loading with land-cover composition: how can watershed size affect model performance?

PubMed

Zhang, Tao; Yang, Xiaojun

2013-01-01

Watershed-wide land-cover proportions can be used to predict the in-stream non-point source pollutant loadings through regression modeling. However, the model performance can vary greatly across different study sites and among various watersheds. Existing literature has shown that this type of regression modeling tends to perform better for large watersheds than for small ones, and that such a performance variation has been largely linked with different interwatershed landscape heterogeneity levels. The purpose of this study is to further examine the previously mentioned empirical observation based on a set of watersheds in the northern part of Georgia (USA) to explore the underlying causes of the variation in model performance. Through the combined use of the neutral landscape modeling approach and a spatially explicit nutrient loading model, we tested whether the regression model performance variation over the watershed groups ranging in size is due to the different watershed landscape heterogeneity levels. We adopted three neutral landscape modeling criteria that were tied with different similarity levels in watershed landscape properties and used the nutrient loading model to estimate the nitrogen loads for these neutral watersheds. Then we compared the regression model performance for the real and neutral landscape scenarios, respectively. We found that watershed size can affect the regression model performance both directly and indirectly. Along with the indirect effect through interwatershed heterogeneity, watershed size can directly affect the model performance over the watersheds varying in size. We also found that the regression model performance can be more significantly affected by other physiographic properties shaping nitrogen delivery effectiveness than the watershed land-cover heterogeneity. 
This study contrasts with many existing studies because it goes beyond hypothesis formulation based on empirical observations and into hypothesis testing to explore the fundamental mechanism.
Time series regression model for infectious disease and weather.

PubMed

Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro

2015-10-01

Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
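The two modifications described above (sums of past cases as a proxy for the immune population, and the logarithm of lagged counts to control autocorrelation) amount to building extra covariates from the case series itself. The sketch below shows one way this might look. The function name, the window length, the lag, and the `+ 1` inside the logarithm (added so that zero counts do not fail) are all assumptions for illustration, not details taken from the article.

```python
import math


def build_covariates(cases, immune_window, lag=1):
    """Return (t, past_sum, log_lag) rows for each usable time step t.

    past_sum: sum of cases over the `immune_window` steps before t,
              a rough proxy for the recently infected (immune) population.
    log_lag:  log(cases[t - lag] + 1); the + 1 is an assumption to avoid log(0).
    Steps without enough history are skipped.
    """
    start = max(immune_window, lag)
    rows = []
    for t in range(start, len(cases)):
        past_sum = sum(cases[t - immune_window:t])
        log_lag = math.log(cases[t - lag] + 1)
        rows.append((t, past_sum, log_lag))
    return rows


# Hypothetical weekly case counts.
weekly_cases = [0, 2, 5, 9, 4, 1]
rows = build_covariates(weekly_cases, immune_window=3, lag=1)
```

Rows built this way would then enter a count regression (for example quasi-Poisson or negative binomial, as the abstract suggests) alongside the weather terms.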
Robust geographically weighted regression of modeling the Air Polluter Standard Index (APSI)

NASA Astrophysics Data System (ADS)

Warsito, Budi; Yasin, Hasbi; Ispriyanti, Dwi; Hoyyi, Abdul

2018-05-01

The Geographically Weighted Regression (GWR) model has been widely applied in many practical fields to explore the spatial heterogeneity of a regression model. However, this method is inherently not robust to outliers. Outliers commonly exist in data sets and may lead to a distorted estimate of the underlying regression model. One solution for handling outliers in the regression model is to use robust models; the resulting model is called Robust Geographically Weighted Regression (RGWR). This research aims to aid the government in the policy making process related to air pollution mitigation by developing a standard index model for air polluters (Air Polluter Standard Index - APSI) based on the RGWR approach. In this research, we also consider seven variables that are directly related to the air pollution level: the traffic velocity, the population density, the business center aspect, the air humidity, the wind velocity, the air temperature, and the area size of the urban forest. The best model is determined by the smallest AIC value. There are significant differences between Regression and RGWR in this case, but Basic GWR using the Gaussian kernel is the best model for APSI because it has the smallest AIC.
Evaluation and prediction of shrub cover in coastal Oregon forests (USA)

Treesearch

Becky K. Kerns; Janet L. Ohmann

2004-01-01

We used data from regional forest inventories and research programs, coupled with mapped climatic and topographic information, to explore relationships and develop multiple linear regression (MLR) and regression tree models for total and deciduous shrub cover in the Oregon coastal province. Results from both types of models indicate that forest structure variables were...
Latent Variable Regression 4-Level Hierarchical Model Using Multisite Multiple-Cohorts Longitudinal Data. CRESST Report 801

ERIC Educational Resources Information Center

Choi, Kilchan

2011-01-01

This report explores a new latent variable regression 4-level hierarchical model for monitoring school performance over time using multisite multiple-cohorts longitudinal data. This kind of data set has a 4-level hierarchical structure: time-series observation nested within students who are nested within different cohorts of students. These…
Advanced statistics: linear regression, part I: simple linear regression.

PubMed

Marill, Keith A

2004-01-01

Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
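As a small illustration of the method of least squares mentioned above, the sketch below fits a line with one predictor using the closed-form estimates (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). The function name and the toy data are hypothetical and are not taken from the article.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sxx: spread of the predictor; Sxy: co-variation of predictor and outcome.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_simple_linear_regression(x, y)
```

The fitted line always passes through the point (mean of x, mean of y), which is why the intercept follows directly once the slope is known.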
Evaluating differential effects using regression interactions and regression mixture models

PubMed Central

Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

2015-01-01

Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results with those obtained using an interaction term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
Not Quite Normal: Consequences of Violating the Assumption of Normality in Regression Mixture Models

ERIC Educational Resources Information Center

Van Horn, M. Lee; Smith, Jessalyn; Fagan, Abigail A.; Jaki, Thomas; Feaster, Daniel J.; Masyn, Katherine; Hawkins, J. David; Howe, George

2012-01-01

Regression mixture models, which have only recently begun to be used in applied research, are a new approach for finding differential effects. This approach comes at the cost of the assumption that error terms are normally distributed within classes. This study uses Monte Carlo simulations to explore the effects of relatively minor violations of…
Exploring Student Characteristics of Retention That Lead to Graduation in Higher Education Using Data Mining Models

ERIC Educational Resources Information Center

Raju, Dheeraj; Schumacker, Randall

2015-01-01

The study used earliest available student data from a flagship university in the southeast United States to build data mining models like logistic regression with different variable selection methods, decision trees, and neural networks to explore important student characteristics associated with retention leading to graduation. The decision tree…
The Role of Violent Thinking in Violent Behavior: It's More About Thinking Than Drinking.

PubMed

Bowes, Nicola; Walker, Julian; Hughes, Elise; Lewis, Rhiannon; Hyde, Gemma

2017-08-01

This article aims to explore and report on violent thinking and alcohol misuse, and how these factors may predict self-reported violence. The role of violent thinking in violent behavior is well established in theoretical models, yet there are few measures that explain this role. One measure that has been identified is the Maudsley Violence Questionnaire (MVQ). This is the first study to explore the use of the MVQ with a general (nonoffender) adult sample, having already been shown to be valid with young people (under 18 years old), adult male offenders, and mentally disordered offenders. This study involved 808 adult participants: 569 female and 239 male. As figures demonstrate that around half of all violent crime in the United Kingdom is alcohol related, we also explored the role of alcohol misuse. Regression was used to explore how these factors predicted violence. The results demonstrate the important role of violent thinking in violent behavior. The MVQ factor of "Machismo" was the primary factor in regression models for both male and female self-reported violence. The role of alcohol in the regression models differed slightly between the male and female participants, with alcohol misuse involved in male violence. The study supports theoretical models including the role of violent thinking and encourages those hoping to address violence to consider "Machismo" as a treatment target. The study also provides further validation of the MVQ as a helpful tool for clinicians or researchers who may be interested in "measuring" violent thinking.
Application of logistic regression to case-control association studies involving two causative loci.

PubMed

North, Bernard V; Curtis, David; Sham, Pak C

2005-01-01

Models in which two susceptibility loci jointly influence the risk of developing disease can be explored using logistic regression analysis. Comparison of likelihoods of models incorporating different sets of disease model parameters allows inferences to be drawn regarding the nature of the joint effect of the loci. We have simulated case-control samples generated assuming different two-locus models and then analysed them using logistic regression. We show that this method is practicable and that, for the models we have used, it can be expected to allow useful inferences to be drawn from sample sizes consisting of hundreds of subjects. Interactions between loci can be explored, but interactive effects do not exactly correspond with classical definitions of epistasis. We have particularly examined the issue of the extent to which it is helpful to utilise information from a previously identified locus when investigating a second, unknown locus. We show that for some models conditional analysis can have substantially greater power while for others unconditional analysis can be more powerful. Hence we conclude that in general both conditional and unconditional analyses should be performed when searching for additional loci.
Regression Models for Identifying Noise Sources in Magnetic Resonance Images

PubMed Central

Zhu, Hongtu; Li, Yimei; Ibrahim, Joseph G.; Shi, Xiaoyan; An, Hongyu; Chen, Yashen; Gao, Wei; Lin, Weili; Rowe, Daniel B.; Peterson, Bradley S.

2009-01-01

Stochastic noise, susceptibility artifacts, magnetic field and radiofrequency inhomogeneities, and other noise components in magnetic resonance images (MRIs) can introduce serious bias into any measurements made with those images. We formally introduce three regression models including a Rician regression model and two associated normal models to characterize stochastic noise in various magnetic resonance imaging modalities, including diffusion-weighted imaging (DWI) and functional MRI (fMRI). Estimation algorithms are introduced to maximize the likelihood function of the three regression models. We also develop a diagnostic procedure for systematically exploring MR images to identify noise components other than simple stochastic noise, and to detect discrepancies between the fitted regression models and MRI data. The diagnostic procedure includes goodness-of-fit statistics, measures of influence, and tools for graphical display. The goodness-of-fit statistics can assess the key assumptions of the three regression models, whereas measures of influence can isolate outliers caused by certain noise components, including motion artifacts. The tools for graphical display permit graphical visualization of the values for the goodness-of-fit statistic and influence measures. Finally, we conduct simulation studies to evaluate performance of these methods, and we analyze a real dataset to illustrate how our diagnostic procedure localizes subtle image artifacts by detecting intravoxel variability that is not captured by the regression models. PMID:19890478
Mobile Phone-Based Unobtrusive Ecological Momentary Assessment of Day-to-Day Mood: An Explorative Study.

PubMed

Asselbergs, Joost; Ruwaard, Jeroen; Ejdys, Michal; Schrader, Niels; Sijbrandij, Marit; Riper, Heleen

2016-03-29

Ecological momentary assessment (EMA) is a useful method to tap the dynamics of psychological and behavioral phenomena in real-world contexts. However, the response burden of (self-report) EMA limits its clinical utility. The aim was to explore mobile phone-based unobtrusive EMA, in which mobile phone usage logs are considered as proxy measures of clinically relevant user states and contexts. This was an uncontrolled explorative pilot study. Our study consisted of 6 weeks of EMA/unobtrusive EMA data collection in a Dutch student population (N=33), followed by a regression modeling analysis. Participants self-monitored their mood on their mobile phone (EMA) with a one-dimensional mood measure (1 to 10) and a two-dimensional circumplex measure (arousal/valence, -2 to 2). Meanwhile, with participants' consent, a mobile phone app unobtrusively collected (meta) data from six smartphone sensor logs (unobtrusive EMA: calls/short message service (SMS) text messages, screen time, application usage, accelerometer, and phone camera events). Through forward stepwise regression (FSR), we built personalized regression models from the unobtrusive EMA variables to predict day-to-day variation in EMA mood ratings. The predictive performance of these models (ie, cross-validated mean squared error and percentage of correct predictions) was compared to naive benchmark regression models (the mean model and a lag-2 history model). A total of 27 participants (81%) provided a mean 35.5 days (SD 3.8) of valid EMA/unobtrusive EMA data. The FSR models accurately predicted 55% to 76% of EMA mood scores. However, the predictive performance of these models was significantly inferior to that of naive benchmark models. Mobile phone-based unobtrusive EMA is a technically feasible and potentially powerful EMA variant. The method is young and positive findings may not replicate. At present, we do not recommend the application of FSR-based mood prediction in real-world clinical settings. 
Further psychometric studies and more advanced data mining techniques are needed to unlock unobtrusive EMA's true potential.

[Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

PubMed

Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

2011-11-01

To explore the application of negative binomial regression and modified Poisson regression in analyzing the factors that influence injury frequency and the risk factors that lead to an increase in injury frequency. A total of 2917 primary and secondary school students were selected from Hefei by cluster random sampling and surveyed by questionnaire. Modified Poisson regression and negative binomial regression models were fitted to the count data on injury events. The risk factors associated with an increased frequency of unintentional injury among juvenile students were explored, so as to assess the efficiency of these two models in studying the factors that influence injury frequency. The Lagrange multiplier test showed over-dispersion in the Poisson model (P < 0.0001). The over-dispersed data were therefore fitted better by the modified Poisson regression and negative binomial regression models. Both models showed that male gender, younger age, a father working outside the hometown, a guardian with an education level above junior high school, and smoking might be associated with higher injury frequencies. For clustered frequency data on injury events, both modified Poisson regression and negative binomial regression can be used. However, based on our data, the modified Poisson regression fitted better, and this model could give a more accurate interpretation of the relevant factors affecting the frequency of injury.
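Over-dispersion, the reason the plain Poisson model was set aside above, means that the counts vary more than a Poisson distribution with the same mean would allow. The study used a Lagrange multiplier test; the sketch below swaps in a much simpler descriptive check, the variance-to-mean ratio, purely for illustration. The function name and the injury counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like data."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the index is undefined")
    return variance(counts) / m


# Hypothetical injury counts per student: mostly zeros with a few large values.
injuries = [0, 0, 0, 1, 0, 2, 0, 7, 0, 0]
index = dispersion_index(injuries)
```

A ratio well above 1, as in this toy sample, is a hint (not a formal test) that a negative binomial model or a robust-variance Poisson model may describe the counts better than a plain Poisson model.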
Exploring the Effects of Managerial Ownership on the Decision to Go Private: A Behavioral Agency Model Approach

ERIC Educational Resources Information Center

Valenti, Alix; Schneider, Marguerite

2012-01-01

This paper utilizes the behavioral agency model to investigate why many formerly public companies have been converted to privately held corporations. Using a matched pairs sample and categorical binary regression, and controlling for effects found in previous studies, we explore how the equity ownership of those entrusted to manage firms, the…
Electricity Load Forecasting Using Support Vector Regression with Memetic Algorithms

PubMed Central

Hu, Zhongyi; Xiong, Tao

2013-01-01

Electricity load forecasting is an important issue that is widely explored and examined in the literature on power systems operation as well as in the literature on commercial transactions in electricity markets. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Because the performance of SVR depends heavily on its parameters, this study proposed a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of the SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than four other evolutionary-algorithm-based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature. PMID:24459425
Electricity load forecasting using support vector regression with memetic algorithms.

PubMed

Hu, Zhongyi; Bao, Yukun; Xiong, Tao

2013-01-01

Electricity load forecasting is an important issue that is widely explored and examined in the literature on power systems operation as well as in the literature on commercial transactions in electricity markets. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Because the performance of SVR depends heavily on its parameters, this study proposed a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of the SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than four other evolutionary-algorithm-based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature.
Active Learning to Understand Infectious Disease Models and Improve Policy Making

PubMed Central

Vladislavleva, Ekaterina; Broeckhove, Jan; Beutels, Philippe; Hens, Niel

2014-01-01

Modeling plays a major role in policy making, especially for infectious disease interventions, but such models can be complex and computationally intensive. A more systematic exploration is needed to gain a thorough systems understanding. We present an active learning approach based on machine learning techniques such as iterative surrogate modeling and model-guided experimentation to systematically analyze both common and edge manifestations of complex model runs. Symbolic regression is used for nonlinear response surface modeling with automatic feature selection. First, we illustrate our approach using an individual-based model for influenza vaccination. After optimizing the parameter space, we observe an inverse relationship between vaccination coverage and cumulative attack rate reinforced by herd immunity. Second, we demonstrate the use of surrogate modeling techniques on input-response data from a deterministic dynamic model, which was designed to explore the cost-effectiveness of varicella-zoster virus vaccination. We use symbolic regression to handle high dimensionality and correlated inputs and to identify the most influential variables. The insight provided is used to focus research, reduce dimensionality and decrease decision uncertainty. We conclude that active learning is needed to fully understand complex systems behavior. Surrogate models can be readily explored at no computational expense, and can also be used as emulators to improve rapid policy making in various settings. PMID:24743387
Active learning to understand infectious disease models and improve policy making.

PubMed

Willem, Lander; Stijven, Sean; Vladislavleva, Ekaterina; Broeckhove, Jan; Beutels, Philippe; Hens, Niel

2014-04-01

Modeling plays a major role in policy making, especially for infectious disease interventions, but such models can be complex and computationally intensive. A more systematic exploration is needed to gain a thorough systems understanding. We present an active learning approach based on machine learning techniques such as iterative surrogate modeling and model-guided experimentation to systematically analyze both common and edge manifestations of complex model runs. Symbolic regression is used for nonlinear response surface modeling with automatic feature selection. First, we illustrate our approach using an individual-based model for influenza vaccination. After optimizing the parameter space, we observe an inverse relationship between vaccination coverage and cumulative attack rate reinforced by herd immunity. Second, we demonstrate the use of surrogate modeling techniques on input-response data from a deterministic dynamic model, which was designed to explore the cost-effectiveness of varicella-zoster virus vaccination. We use symbolic regression to handle high dimensionality and correlated inputs and to identify the most influential variables. The insight provided is used to focus research, reduce dimensionality and decrease decision uncertainty. We conclude that active learning is needed to fully understand complex systems behavior. Surrogate models can be readily explored at no computational expense, and can also be used as emulators to improve rapid policy making in various settings.
Students' Self-Regulation for Interaction with Others in Online Learning Environments

ERIC Educational Resources Information Center

Cho, Moon-Heum; Kim, B. Joon

2013-01-01

The purpose of this study was to explore variables explaining students' self-regulation (SR) for interaction with others, specifically peers and instructors, in online learning environments. A total of 407 students participated in the study. With hierarchical regression model (HRM), several variables were regressed on students' SR for interaction…
Unpacking commitment and exploration: preliminary validation of an integrative model of late adolescent identity formation.

PubMed

Luyckx, Koen; Goossens, Luc; Soenens, Bart; Beyers, Wim

2006-06-01

A model of identity formation comprising four structural dimensions (Commitment Making, Identification with Commitment, Exploration in Depth, and Exploration in Breadth) was developed through confirmatory factor analysis. In a sample of 565 emerging adults, this model provided a better fit than did alternative two- and three-dimensional models, thereby validating the unpacking of both exploration and commitment. Regression analyses indicated that Commitment Making was significantly related to family context in accordance with hypotheses. Identification with Commitment and both exploration dimensions were significantly related to adjustment and family context, again in accordance with hypotheses. Identification with Commitment was positively related to positive adjustment indicators and negatively to depressive symptoms, whereas Exploration in Breadth was positively related to depressive symptoms and substance use. Exploration in Depth, on the other hand, was positively related to academic adjustment and negatively to substance use. Implications and suggestions for future research are discussed.
Time series regression studies in environmental epidemiology.

PubMed

Bhaskaran, Krishnan; Gasparrini, Antonio; Hajat, Shakoor; Smeeth, Liam; Armstrong, Ben

2013-08-01

Time series regression studies have been widely used in environmental epidemiology, notably in investigating the short-term associations between exposures such as air pollution, weather variables or pollen, and health outcomes such as mortality, myocardial infarction or disease-specific hospital admissions. Typically, for both exposure and outcome, data are available at regular time intervals (e.g. daily pollution levels and daily mortality counts) and the aim is to explore short-term associations between them. In this article, we describe the general features of time series data, and we outline the analysis process, beginning with descriptive analysis, then focusing on issues in time series regression that differ from other regression methods: modelling short-term fluctuations in the presence of seasonal and long-term patterns, dealing with time varying confounding factors and modelling delayed ('lagged') associations between exposure and outcome. We finish with advice on model checking and sensitivity analysis, and some common extensions to the basic model.
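The ingredients listed in this abstract can be collected into one generic model equation. The form below is a common textbook-style sketch, assuming daily counts modelled with a log link; the symbols are illustrative and are not taken from the article itself.

```latex
\log \mathbb{E}[Y_t] \;=\; \beta_0 \;+\; \beta_1\, x_{t-\ell} \;+\; s(t) \;+\; \boldsymbol{\gamma}^{\top}\mathbf{z}_t
```

Here $Y_t$ is the outcome count on day $t$ (for example, deaths), $x_{t-\ell}$ is the exposure lagged by $\ell$ days, $s(t)$ is a smooth function of time that absorbs seasonal and long-term patterns, and $\mathbf{z}_t$ collects time-varying confounders such as temperature. The short-term association of interest is $\beta_1$.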
Discrete mixture modeling to address genetic heterogeneity in time-to-event regression

PubMed Central

Eng, Kevin H.; Hanlon, Bret M.

2014-01-01

Motivation: Time-to-event regression models are a critical tool for associating survival time outcomes with molecular data. Despite mounting evidence that genetic subgroups of the same clinical disease exist, little attention has been given to exploring how this heterogeneity affects time-to-event model building and how to accommodate it. Methods able to diagnose and model heterogeneity should be valuable additions to the biomarker discovery toolset. Results: We propose a mixture of survival functions that classifies subjects with similar relationships to a time-to-event response. This model incorporates multivariate regression and model selection and can be fit with an expectation maximization algorithm that we call Cox-assisted clustering (CAC). We illustrate a likely manifestation of genetic heterogeneity and demonstrate how it may affect survival models with little warning. An application to gene expression in ovarian cancer DNA repair pathways illustrates how the model may be used to learn new genetic subsets for risk stratification. We explore the implications of this model for censored observations and the effect on genomic predictors and diagnostic analysis. Availability and implementation: An R implementation of CAC using standard packages is available at https://gist.github.com/programeng/8620b85146b14b6edf8f. Data used in the analysis are publicly available. Contact: kevin.eng@roswellpark.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24532723
Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran

NASA Astrophysics Data System (ADS)

Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran

2018-03-01

This paper uses multivariate regression to build a mathematical model for mineral prospectivity mapping (MPM) in iron skarn exploration in the Sarvian area, central Iran. The main aim is to apply multivariate regression analysis (as an MPM method) to the mapped iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the probability value (p value), the coefficient of determination (R²) and the adjusted coefficient of determination (adjusted R²), the second regression model (which consisted of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrop map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
Modelling long-term fire occurrence factors in Spain by accounting for local variations with geographically weighted regression

NASA Astrophysics Data System (ADS)

Martínez-Fernández, J.; Chuvieco, E.; Koutsias, N.

2013-02-01

Humans are responsible for most forest fires in Europe, but anthropogenic factors behind these events are still poorly understood. We tried to identify the driving factors of human-caused fire occurrence in Spain by applying two different statistical approaches. Firstly, assuming stationary processes for the whole country, we created models based on multiple linear regression and binary logistic regression to find factors associated with fire density and fire presence, respectively. Secondly, we used geographically weighted regression (GWR) to better understand and explore the local and regional variations of those factors behind human-caused fire occurrence. The number of human-caused fires occurring within a 25-yr period (1983-2007) was computed for each of the 7638 Spanish mainland municipalities, creating a binary variable (fire/no fire) to develop logistic models, and a continuous variable (fire density) to build standard linear regression models. A total of 383 657 fires were registered in the study dataset. The binary logistic model, which estimates the probability of having/not having a fire, successfully classified 76.4% of the total observations, while the ordinary least squares (OLS) regression model explained 53% of the variation of the fire density patterns (adjusted R² = 0.53). Both approaches confirmed, in addition to forest and climatic variables, the importance of variables related to agrarian activities, land abandonment, rural population exodus and developmental processes as underlying factors of fire occurrence. For the GWR approach, the explanatory power of the GW linear model for fire density using an adaptive bandwidth increased from 53% to 67%, while for the GW logistic model the correctly classified observations improved only slightly, from 76.4% to 78.4%, but significantly according to the corrected Akaike Information Criterion (AICc), from 3451.19 to 3321.19.
The results from GWR indicated a significant spatial variation in the local parameter estimates for all the variables and an important reduction of the autocorrelation in the residuals of the GW linear model. Despite the fitting improvement of local models, GW regression, more than an alternative to "global" or traditional regression modelling, seems to be a valuable complement to explore the non-stationary relationships between the response variable and the explanatory variables. The synergy of global and local modelling provides insights into fire management and policy and helps further our understanding of the fire problem over large areas while at the same time recognizing its local character.
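The core idea of GWR, refitting the regression at each location with nearby observations weighted more heavily, can be sketched for a single predictor. This is a minimal illustration assuming a fixed Gaussian kernel and a closed-form weighted least-squares fit; the study itself used an adaptive bandwidth and many predictors, and the function name and data layout here are hypothetical.

```python
import math


def gwr_local_fit(points, target, bandwidth):
    """Fit y = a + b*x at one target location by weighted least squares.

    points: iterable of (u, v, x, y), where (u, v) are map coordinates.
    target: (u0, v0) location at which local coefficients are wanted.
    bandwidth: Gaussian kernel bandwidth, in the same units as u and v.
    Returns (intercept, slope) of the locally weighted fit.
    """
    u0, v0 = target
    sw = swx = swy = swxx = swxy = 0.0
    for u, v, x, y in points:
        d2 = (u - u0) ** 2 + (v - v0) ** 2
        w = math.exp(-0.5 * d2 / bandwidth ** 2)  # nearer points weigh more
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    xbar = swx / sw
    ybar = swy / sw
    slope = (swxy - sw * xbar * ybar) / (swxx - sw * xbar * xbar)
    intercept = ybar - slope * xbar
    return intercept, slope


# Synthetic data with a relationship that changes sign across space:
# y = 1 + 2x in the west (u = 0), y = 1 - x in the east (u = 10).
west = [(0.0, float(k), float(k), 1.0 + 2.0 * k) for k in range(5)]
east = [(10.0, float(k), float(k), 1.0 - 1.0 * k) for k in range(5)]
data = west + east

_, slope_west = gwr_local_fit(data, (0.0, 2.0), bandwidth=1.0)
_, slope_east = gwr_local_fit(data, (10.0, 2.0), bandwidth=1.0)
```

A single global fit to `data` would average the two regimes, whereas the local fits recover a different slope at each location, which is the kind of non-stationarity the abstract describes.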
Using destination image to predict visitors' intention to revisit three Hudson River Valley, New York, communities

Treesearch

Rudy M. Schuster; Laura Sullivan; Duarte Morais; Diane Kuehn

2009-01-01

This analysis explores the differences in Affective and Cognitive Destination Image among three Hudson River Valley (New York) tourism communities. Multiple regressions were used with six dimensions of visitors' images to predict future intention to revisit. Two of the three regression models were significant. The only significantly contributing independent...
Food Crops Response to Climate Change

NASA Astrophysics Data System (ADS)

Butler, E.; Huybers, P.

2009-12-01

Projections of future climate show a warming world and heterogeneous changes in precipitation. Generally, warming temperatures indicate a decrease in crop yields where crops are currently grown. However, a warmer climate will also open up new areas at high latitudes for crop production. Thus, there is a question whether the warmer climate, with decreased yields but potentially increased growing area, will produce a net increase or decrease of overall food crop production. We explore this question through a multiple linear regression model linking temperature and precipitation to crop yield. Prior studies have emphasised temporal regressions, which indicate uniformly decreased yields but neglect the potentially increased area opened up for crop production. This study provides a complement to the prior work by exploring this spatial variation. The model is fit to temperature, precipitation and crop yield data over the United States. The United States was chosen as the training region for the model because there are good crop data available over the same time frame as climate data and presumably the yield from crops in the United States is optimized with respect to potential yield. We study corn, soybeans, sorghum, hard red winter wheat and soft red winter wheat using monthly averages of temperature and precipitation from NCEP reanalysis and yearly yield data from the National Agriculture Statistics Service for 1948-2008. The use of monthly averaged temperature and precipitation, which neglects extreme events that can have a significant impact on crops, limits this study, as does the exclusive use of United States agricultural data. The GFDL 2.1 model under a 720 ppm CO2 scenario provides temperature and precipitation fields for 2040-2100, which are used to explore how the spatial regions available for crop production will change under these new conditions.
Effects of Employing Ridge Regression in Structural Equation Models.

ERIC Educational Resources Information Center

McQuitty, Shaun

1997-01-01

LISREL 8 invokes a ridge option when maximum likelihood or generalized least squares are used to estimate a structural equation model with a nonpositive definite covariance or correlation matrix. Implications of the ridge option for model fit, parameter estimates, and standard errors are explored through two examples. (SLD)
Determining factors influencing survival of breast cancer by fuzzy logistic regression model.

PubMed

Nikbakht, Roya; Bahrampour, Abbas

2017-01-01

The fuzzy logistic regression model can be used to determine factors that influence disease. This study explores the important predictive factors for survival of patients with breast cancer. We used breast cancer data collected by the cancer registry of Kerman University of Medical Sciences during the period 2000-2007. Variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were included in the fuzzy logistic regression model. Model performance was assessed in terms of mean degree of membership (MDM). The results showed that almost 41% of patients were in the neoplasm and malignant group and more than two-thirds of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, in that order. Furthermore, the MDM criterion shows that the fuzzy logistic regression fits the data well (MDM = 0.86). The fuzzy logistic regression model showed that chemotherapy is more important than radiotherapy in the survival of patients with breast cancer. In addition, this model can calculate possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Few studies have applied fuzzy logistic models, and we recommend using this model in various research areas.
Methods for estimating population density in data-limited areas: evaluating regression and tree-based models in Peru.

PubMed

Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William

2014-01-01

Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies.
Methods for Estimating Population Density in Data-Limited Areas: Evaluating Regression and Tree-Based Models in Peru

PubMed Central

Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William

2014-01-01

Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies. PMID:24992657
Tests of Alignment among Assessment, Standards, and Instruction Using Generalized Linear Model Regression

ERIC Educational Resources Information Center

Fulmer, Gavin W.; Polikoff, Morgan S.

2014-01-01

An essential component in school accountability efforts is for assessments to be well-aligned with the standards or curriculum they are intended to measure. However, relatively little prior research has explored methods to determine statistical significance of alignment or misalignment. This study explores analyses of alignment as a special case…
Weather variability and the incidence of cryptosporidiosis: comparison of time series poisson regression and SARIMA models.

PubMed

Hu, Wenbiao; Tong, Shilu; Mengersen, Kerrie; Connell, Des

2007-09-01

Few studies have examined the relationship between weather variables and cryptosporidiosis in Australia. This paper examines the potential impact of weather variability on the transmission of cryptosporidiosis and explores the possibility of developing an empirical forecast system. Data on weather variables, notified cryptosporidiosis cases, and population size in Brisbane were supplied by the Australian Bureau of Meteorology, Queensland Department of Health, and Australian Bureau of Statistics, respectively, for the period January 1, 1996-December 31, 2004. Time series Poisson regression and seasonal autoregressive integrated moving average (SARIMA) models were fitted to examine the potential impact of weather variability on the transmission of cryptosporidiosis. Both the time series Poisson regression and SARIMA models show that seasonal and monthly maximum temperature at a prior moving average of 1 and 3 months were significantly associated with cryptosporidiosis disease. The models suggest that there may be 50 more cases a year for an increase of 1 °C in maximum temperature on average in Brisbane. Model assessments indicated that the SARIMA model had better predictive ability than the Poisson regression model (SARIMA: root mean square error (RMSE): 0.40, Akaike information criterion (AIC): -12.53; Poisson regression: RMSE: 0.54, AIC: -2.84). Furthermore, the analysis of residuals shows that the time series Poisson regression appeared to violate a modeling assumption, in that residual autocorrelation persisted. The results of this study suggest that weather variability (particularly maximum temperature) may have played a significant role in the transmission of cryptosporidiosis. A SARIMA model may be a better predictive model than a Poisson regression model in the assessment of the relationship between weather variability and the incidence of cryptosporidiosis.
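The two criteria used in this comparison have simple definitions, sketched below. The helper names are illustrative, and the log-likelihood passed to `aic` is assumed to come from whatever fitting routine produced the model; nothing here reproduces the study's own calculations.

```python
import math


def rmse(observed, predicted):
    # Root mean square error: square root of the mean squared residual.
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def aic(log_likelihood, n_params):
    # Akaike information criterion: 2k - 2 ln(L). Lower is better, and
    # the value can be negative when the maximized log-likelihood is
    # large and positive.
    return 2 * n_params - 2 * log_likelihood
```

For both measures a lower value indicates the preferred model, which is how the abstract's figures (RMSE 0.40 vs 0.54, AIC -12.53 vs -2.84) favour SARIMA.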

Impact of work pressure, work stress and work-family conflict on firefighter burnout.

PubMed

Smith, Todd D; DeJoy, David M; Dyal, Mari-Amanda Aimee; Huang, Gaojian

2017-10-25

Little research has explored burnout and its causes in the American fire service. Data were collected from career firefighters in the southeastern United States (n = 208) to explore these relationships. A hierarchical regression model was tested to examine predictors of burnout including sociodemographic characteristics (model 1), work pressure (model 2), work stress and work-family conflict (model 3) and interaction terms (model 4). The main findings suggest that perceived work stress and work-family conflict emerged as the significant predictors of burnout (both p < .001). Interventions and programs aimed at these predictors could potentially curtail burnout among firefighters.
Forecasting daily patient volumes in the emergency department.

PubMed

Jones, Spencer S; Thomas, Alun; Evans, R Scott; Welch, Shari J; Haug, Peter J; Snow, Gregory L

2008-02-01

Shifts in the supply of and demand for emergency department (ED) resources make the efficient allocation of ED resources increasingly important. Forecasting is a vital activity that guides decision-making in many areas of economic, industrial, and scientific planning, but has gained little traction in the health care industry. There are few studies that explore the use of forecasting methods to predict patient volumes in the ED. The goals of this study are to explore and evaluate the use of several statistical forecasting methods to predict daily ED patient volumes at three diverse hospital EDs and to compare the accuracy of these methods to the accuracy of a previously proposed forecasting method. Daily patient arrivals at three hospital EDs were collected for the period January 1, 2005, through March 31, 2007. The authors evaluated the use of seasonal autoregressive integrated moving average, time series regression, exponential smoothing, and artificial neural network models to forecast daily patient volumes at each facility. Forecasts were made for horizons ranging from 1 to 30 days in advance. The forecast accuracy achieved by the various forecasting methods was compared to the forecast accuracy achieved when using a benchmark forecasting method already available in the emergency medicine literature. All time series methods considered in this analysis provided improved in-sample model goodness of fit. However, post-sample analysis revealed that time series regression models that augment linear regression models by accounting for serial autocorrelation offered only small improvements in terms of post-sample forecast accuracy, relative to multiple linear regression models, while seasonal autoregressive integrated moving average, exponential smoothing, and artificial neural network forecasting models did not provide consistently accurate forecasts of daily ED volumes. 
This study confirms the widely held belief that daily demand for ED services is characterized by seasonal and weekly patterns. The authors compared several time series forecasting methods to a benchmark multiple linear regression model. The results suggest that the existing methodology proposed in the literature, multiple linear regression based on calendar variables, is a reasonable approach to forecasting daily patient volumes in the ED. However, the authors conclude that regression-based models that incorporate calendar variables, account for site-specific special-day effects, and allow for residual autocorrelation provide a more appropriate, informative, and consistently accurate approach to forecasting daily ED patient volumes.
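A regression on calendar variables can be illustrated in its simplest form: with day-of-week indicators as the only predictors and no intercept, ordinary least squares reduces to the mean volume for each weekday. The sketch below assumes that simplified setting; the function names are hypothetical, and the models in the study also included other calendar and special-day effects.

```python
from collections import defaultdict
from datetime import date, timedelta


def fit_weekday_means(history):
    """history: iterable of (date, volume). Returns {weekday: mean volume}.

    Equivalent to least squares on seven day-of-week indicator variables
    with no intercept (weekday 0 is Monday, 6 is Sunday).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, volume in history:
        totals[day.weekday()] += volume
        counts[day.weekday()] += 1
    return {wd: totals[wd] / counts[wd] for wd in totals}


def forecast_volumes(model, last_day, horizon):
    # Forecast each of the next `horizon` days from its weekday mean.
    out = []
    for h in range(1, horizon + 1):
        day = last_day + timedelta(days=h)
        out.append((day, model[day.weekday()]))
    return out


# Synthetic four-week history in which volume depends only on the weekday.
start = date(2005, 1, 3)
history = [
    (start + timedelta(days=k),
     100.0 + 10.0 * (start + timedelta(days=k)).weekday())
    for k in range(28)
]
model = fit_weekday_means(history)
next_week = forecast_volumes(model, history[-1][0], horizon=7)
```

Residuals from such a fit could then be checked for autocorrelation, which is the refinement the authors recommend on top of the calendar-variable baseline.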
The Local Food Environment and Fruit and Vegetable Intake: A Geographically Weighted Regression Approach in the ORiEL Study.

PubMed

Clary, Christelle; Lewis, Daniel J; Flint, Ellen; Smith, Neil R; Kestens, Yan; Cummins, Steven

2016-12-01

Studies that explore associations between the local food environment and diet routinely use global regression models, which assume that relationships are invariant across space, yet such stationarity assumptions have been little tested. We used global and geographically weighted regression models to explore associations between the residential food environment and fruit and vegetable intake. Analyses were performed in 4 boroughs of London, United Kingdom, using data collected between April 2012 and July 2012 from 969 adults in the Olympic Regeneration in East London Study. Exposures were assessed both as absolute densities of healthy and unhealthy outlets, taken separately, and as a relative measure (proportion of total outlets classified as healthy). Overall, local models performed better than global models (lower Akaike information criterion). Locally estimated coefficients varied across space, regardless of the type of exposure measure, although changes of sign were observed only when absolute measures were used. Although global models showed significant associations only between the relative measure and fruit and vegetable intake (β = 0.022; P < 0.01), geographically weighted regression models using absolute measures outperformed models using relative measures. This study suggests that greater attention should be given to nonstationary relationships between the food environment and diet. It further challenges the idea that a single measure of exposure, whether relative or absolute, can reflect the many ways the food environment may shape health behaviors. © The Author 2016. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved.
The statistical analysis of multi-environment data: modeling genotype-by-environment interaction and its genetic basis

PubMed Central

Malosetti, Marcos; Ribaut, Jean-Marcel; van Eeuwijk, Fred A.

2013-01-01

Genotype-by-environment interaction (GEI) is an important phenomenon in plant breeding. This paper presents a series of models for describing, exploring, understanding, and predicting GEI. All models depart from a two-way table of genotype by environment means. First, a series of descriptive and explorative models/approaches are presented: Finlay–Wilkinson model, AMMI model, GGE biplot. All of these approaches have in common that they merely try to group genotypes and environments and do not use other information than the two-way table of means. Next, factorial regression is introduced as an approach to explicitly introduce genotypic and environmental covariates for describing and explaining GEI. Finally, QTL modeling is presented as a natural extension of factorial regression, where marker information is translated into genetic predictors. Tests for regression coefficients corresponding to these genetic predictors are tests for main effect QTL expression and QTL by environment interaction (QEI). QTL models for which QEI depends on environmental covariables form an interesting model class for predicting GEI for new genotypes and new environments. For realistic modeling of genotypic differences across multiple environments, sophisticated mixed models are necessary to allow for heterogeneity of genetic variances and correlations across environments. The use and interpretation of all models is illustrated by an example data set from the CIMMYT maize breeding program, containing environments differing in drought and nitrogen stress. To help readers to carry out the statistical analyses, GenStat® programs, 15th Edition and Discovery® version, are presented as “Appendix.” PMID:23487515
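Of the descriptive approaches listed, the Finlay–Wilkinson model is the easiest to sketch from a two-way table of means: each genotype's performance is regressed on an environment index, taken here as the environment mean minus the grand mean. This is a minimal illustration with invented numbers, not the GenStat programs from the article's appendix, and the function name is hypothetical.

```python
def finlay_wilkinson_slopes(table):
    """table[i][j]: mean of genotype i in environment j (complete table).

    Returns one regression slope per genotype on the environment index.
    A slope above 1 marks a genotype that responds more strongly than
    average to better environments; below 1, less strongly.
    """
    n_gen = len(table)
    n_env = len(table[0])
    env_means = [sum(row[j] for row in table) / n_gen for j in range(n_env)]
    grand_mean = sum(env_means) / n_env
    index = [m - grand_mean for m in env_means]  # environment index
    ss_index = sum(e * e for e in index)
    slopes = []
    for row in table:
        gen_mean = sum(row) / n_env
        sxy = sum(e * (y - gen_mean) for e, y in zip(index, row))
        slopes.append(sxy / ss_index)
    return slopes


# Invented means for 3 genotypes (rows) in 3 environments (columns).
means = [
    [4.5, 5.0, 5.5],
    [5.0, 6.0, 7.0],
    [5.5, 7.0, 8.5],
]
slopes = finlay_wilkinson_slopes(means)
```

As the abstract notes, this kind of model uses only the table of means; factorial regression replaces the environment index with explicit environmental covariates.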
Use of Two-Part Regression Calibration Model to Correct for Measurement Error in Episodically Consumed Foods in a Single-Replicate Study Design: EPIC Case Study

PubMed Central

Agogo, George O.; van der Voet, Hilko; Veer, Pieter van’t; Ferrari, Pietro; Leenders, Max; Muller, David C.; Sánchez-Cantalejo, Emilio; Bamia, Christina; Braaten, Tonje; Knüppel, Sven; Johansson, Ingegerd; van Eeuwijk, Fred A.; Boshuizen, Hendriek

2014-01-01

In epidemiologic studies, measurement error in dietary variables often attenuates the association between dietary intake and disease occurrence. To adjust for the attenuation caused by error in dietary intake, regression calibration is commonly used. To apply regression calibration, unbiased reference measurements are required. Short-term reference measurements for foods that are not consumed daily contain excess zeroes that pose challenges in the calibration model. We adapted a two-part regression calibration model, initially developed for multiple replicates of reference measurements per individual, to a single-replicate setting. We showed how to handle excess zero reference measurements with a two-step modeling approach, how to explore heteroscedasticity in the consumed amount with a variance-mean graph, how to explore nonlinearity with the generalized additive modeling (GAM) and empirical logit approaches, and how to select covariates in the calibration model. The performance of the two-part calibration model was compared with that of its one-part counterpart. We used vegetable intake and mortality data from the European Prospective Investigation on Cancer and Nutrition (EPIC) study. In EPIC, reference measurements were taken with 24-hour recalls. For each of the three vegetable subgroups assessed separately, correcting for error with an appropriately specified two-part calibration model resulted in about a threefold increase in the strength of association with all-cause mortality, as measured by the log hazard ratio. We further found that the standard way of including covariates in the calibration model can lead to overfitting the two-part calibration model. Moreover, the extent of adjusting for error is influenced by the number and forms of covariates in the calibration model. For episodically consumed foods, we advise researchers to pay special attention to response distribution, nonlinearity, and covariate inclusion in specifying the calibration model. PMID:25402487
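The logic of a two-part model for a reference measurement with excess zeroes can be written as a general identity. The notation below is illustrative and is not taken from the article.

```latex
\mathbb{E}[R_i \mid \mathbf{X}_i]
  \;=\; \Pr(R_i > 0 \mid \mathbf{X}_i)\;\cdot\;
        \mathbb{E}[R_i \mid R_i > 0,\ \mathbf{X}_i]
```

Here $R_i \ge 0$ is the reference measurement for individual $i$ (for example, a 24-hour recall of vegetable intake) and $\mathbf{X}_i$ collects the questionnaire intake and other covariates. The first factor is typically modelled with a logistic-type regression for whether any consumption was reported, and the second with a regression for the amount consumed on consumption days; their product gives the calibrated intake.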
Application of geographically-weighted regression analysis to assess risk factors for malaria hotspots in Keur Soce health and demographic surveillance site.

PubMed

Ndiath, Mansour M; Cisse, Badara; Ndiaye, Jean Louis; Gomis, Jules F; Bathiery, Ousmane; Dia, Anta Tal; Gaye, Oumar; Faye, Babacar

2015-11-18

In Senegal, considerable efforts have been made to reduce malaria morbidity and mortality during the last decade. This resulted in a marked decrease of malaria cases. With the decline of malaria cases, transmission has become sparse in most Senegalese health districts. This study investigated malaria hotspots in Keur Soce sites by using geographically-weighted regression. Because of the occurrence of hotspots, spatial modelling of malaria cases could have a considerable effect in disease surveillance. This study explored and analysed the spatial relationships between malaria occurrence and socio-economic and environmental factors in small communities in Keur Soce, Senegal, using 6 months of passive surveillance. Geographically-weighted regression was used to explore the spatial variability of relationships between malaria incidence or persistence and the selected socio-economic and human predictors. A comparison between ordinary least squares (OLS) and geographically-weighted regression models was also explored. A vector dataset (spatial) of the study area at village level and statistical data (non-spatial) on confirmed malaria cases, socio-economic status (bed net use), population data (size of the household) and environmental factors (temperature, rainfall) were used in this exploratory analysis. ArcMap 10.2 and Stata 11 were used to perform malaria hotspots analysis. From June to December, a total of 408 confirmed malaria cases were notified. The explanatory variables (household size, housing materials, sleeping rooms, sheep and distance to breeding site) returned significant t values of -0.25, 2.3, 4.39, 1.25 and 2.36, respectively. The global OLS model explained about 70% (adjusted R² = 0.70) of the variation in malaria occurrence, with AIC = 756.23. The geographically-weighted regression of malaria hotspots resulted in a coefficient intercept ranging from 1.89 to 6.22, with a median of 3.5. Large positive values are distributed mainly in the southeast of the district where hotspots are more accurate, while low values are mainly found in the centre and in the north. Geographically-weighted regression and OLS showed important risk factors of malaria hotspots in Keur Soce. The outputs of such models can be a useful tool to understand the occurrence of malaria hotspots in Senegal. An understanding of geographical variation and determination of the core areas of the disease may provide an explanation regarding possible proximal and distal contributors to malaria elimination in Senegal.
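Geographically-weighted regression fits a separate weighted least-squares model at each location, with nearby observations weighted more heavily than distant ones. The sketch below shows that idea for a single predictor with a Gaussian distance kernel. It is a simplified illustration, not the ArcMap procedure used in the study; the kernel choice, the bandwidth, and all data values are assumptions made for the example.

```python
import math

def gwr_local_fit(coords, x, y, target, bandwidth):
    """Locally weighted simple regression at one target location.

    Returns (intercept, slope). Weights follow an assumed Gaussian
    kernel of the distance between each observation and the target.
    """
    w = [
        math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2)
        for c in coords
    ]
    sw = sum(w)
    # Weighted means of predictor and response
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted least-squares slope and intercept
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ym - slope * xm, slope

# Hypothetical village coordinates, predictor values and case counts
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 7.0, 9.0, 11.0, 13.0]
local_coefs = [gwr_local_fit(coords, x, y, c, bandwidth=1.0) for c in coords]
```

Repeating the fit at every location yields one intercept and one slope per site; mapping those local coefficients is what reveals spatial variation such as the intercept range reported in the abstract.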
Influence diagnostics in meta-regression model.

PubMed

Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua

2017-09-01

This paper studies influence diagnostics in the meta-regression model, including case deletion diagnostics and local influence analysis. We derive the subset deletion formulae for the estimation of the regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measures are defined. Local influence analyses based on the case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method that simultaneously perturbs responses, covariate, and within-variance to obtain the local influence measure, which has the advantage of being able to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
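A case deletion diagnostic can be sketched by recomputing the heterogeneity variance with each study left out in turn. The example below uses the standard DerSimonian and Laird moment estimator for the simple random-effects model without covariates, which is a simplification of the meta-regression setting treated in the paper. It refits by brute force rather than using the closed-form deletion formulae the authors derive, and the effect sizes and variances are hypothetical.

```python
def dl_tau2(effects, variances):
    """DerSimonian-Laird estimate of between-study variance (no covariates)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q statistic
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)

def leave_one_out_tau2(effects, variances):
    """tau^2 re-estimated with each study deleted in turn (brute force)."""
    return [
        dl_tau2(effects[:i] + effects[i + 1:], variances[:i] + variances[i + 1:])
        for i in range(len(effects))
    ]

# Hypothetical study effects and within-study variances
effects = [0.1, 0.3, 0.5]
variances = [0.01, 0.01, 0.01]
full = dl_tau2(effects, variances)
deleted = leave_one_out_tau2(effects, variances)
```

Comparing each leave-one-out value against the full-data estimate shows which studies move the heterogeneity variance the most.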
Human and bovine viruses and bacteria at three great lakes beaches: Environmental variable associations and health risk

USDA-ARS's Scientific Manuscript database

Waterborne pathogens were detected in 96% of samples collected at three Lake Michigan beaches during the summer of 2010. Linear regression models were developed to explore environmental factors that may be influential for pathogen prevalence. Simulation of pathogen concentration using these models, ...
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.

PubMed

Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H

2016-01-01

Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal.
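Stabilized inverse probability weights, one of the methods compared above, divide the marginal probability of the received treatment by its conditional probability given confounders. A minimal sketch follows, assuming the propensity scores have already been estimated by some model; the treatment indicators and scores shown are hypothetical, and the simulation design of the study is not reproduced.

```python
def stabilized_weights(treated, propensity):
    """Stabilized IPW weights.

    treated: 0/1 treatment indicators.
    propensity: estimated P(treated = 1 | confounders) per subject,
    assumed to come from a previously fitted model.
    """
    p_treated = sum(treated) / len(treated)  # marginal treatment probability
    return [
        p_treated / e if a else (1.0 - p_treated) / (1.0 - e)
        for a, e in zip(treated, propensity)
    ]

# Hypothetical data: two treated and two untreated subjects
weights = stabilized_weights([1, 1, 0, 0], [0.5, 0.25, 0.5, 0.25])
```

Propensity scores close to 0 or 1 make the denominators tiny and the weights very large, which is one reason weighting can become unstable when data are sparse.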
The prediction of intelligence in preschool children using alternative models to regression.

PubMed

Finch, W Holmes; Chang, Mei; Davis, Andrew S; Holden, Jocelyn E; Rothlisberg, Barbara A; McIntosh, David E

2011-12-01

Statistical prediction of an outcome variable using multiple independent variables is a common practice in the social and behavioral sciences. For example, neuropsychologists are sometimes called upon to provide predictions of preinjury cognitive functioning for individuals who have suffered a traumatic brain injury. Typically, these predictions are made using standard multiple linear regression models with several demographic variables (e.g., gender, ethnicity, education level) as predictors. Prior research has shown conflicting evidence regarding the ability of such models to provide accurate predictions of outcome variables such as full-scale intelligence (FSIQ) test scores. The present study had two goals: (1) to demonstrate the utility of a set of alternative prediction methods that have been applied extensively in the natural sciences and business but have not been frequently explored in the social sciences and (2) to develop models that can be used to predict premorbid cognitive functioning in preschool children. Predictions of Stanford-Binet 5 FSIQ scores for preschool-aged children are used to compare the performance of a multiple regression model with several of these alternative methods. Results demonstrate that classification and regression trees provided more accurate predictions of FSIQ scores than did the more traditional regression approach. Implications of these results are discussed.
Mapping soil textural fractions across a large watershed in north-east Florida.

PubMed

Lamsal, S; Mishra, U

2010-08-01

Assessment of regional-scale soil spatial variation and mapping of its distribution are constrained by sparse data, which are collected using field surveys that are labor intensive and cost prohibitive. We explored geostatistical (ordinary kriging, OK), regression (regression tree, RT), and hybrid (RT plus residual sequential Gaussian simulation, SGS) methods to map soil textural fractions across the Santa Fe River Watershed (3585 km²) in north-east Florida. Soil samples collected from four depths (L1: 0-30 cm, L2: 30-60 cm, L3: 60-120 cm, and L4: 120-180 cm) at 141 locations were analyzed for soil textural fractions (sand, silt and clay contents), and combined with textural data (15 profiles) assembled under the Florida Soil Characterization program. Textural fractions in L1 and L2 were autocorrelated, and spatially mapped across the watershed. OK performance was poor, which may be attributed to the sparse sampling. RT model structure varied among textural fractions, and the variation explained by the model ranged from 25% for L1 silt to 61% for L2 clay content. Regression residuals were simulated using SGS, and the average of the simulated residuals was used to approximate a regression residual distribution map, which was added to the regression trend maps. Independent validation of the prediction maps showed that regression models performed slightly better than OK, and regression combined with the average of simulated regression residuals improved predictions beyond the regression model. Sand content >90% in both 0-30 and 30-60 cm covered 80.6% of the watershed area. Copyright 2010 Elsevier Ltd. All rights reserved.
Classification and regression trees

Treesearch

G. G. Moisen

2008-01-01

Frequently, ecologists are interested in exploring ecological relationships, describing patterns and processes, or making spatial or temporal predictions. These purposes often can be addressed by modeling the relationship between some outcome or response and a set of features or explanatory variables.
An Analysis of Advertising Effectiveness for U.S. Navy Recruiting

DTIC Science & Technology

1997-09-01

This thesis estimates the effect of Navy television advertising on enlistment rates of high quality male recruits (Armed Forces Qualification Test...Joint advertising is for all Armed Forces), Joint journal, and Joint direct mail advertising are explored. Enlistments are modeled as a function of...several factors including advertising, recruiters, and economic. Regression analyses (Ordinary Least Squares and Two Stage Least Squares) explore the
Convex Regression with Interpretable Sharp Partitions

PubMed Central

Petersen, Ashley; Simon, Noah; Witten, Daniela

2016-01-01

We consider the problem of predicting an outcome variable on the basis of a small number of covariates, using an interpretable yet non-additive model. We propose convex regression with interpretable sharp partitions (CRISP) for this task. CRISP partitions the covariate space into blocks in a data-adaptive way, and fits a mean model within each block. Unlike other partitioning methods, CRISP is fit using a non-greedy approach by solving a convex optimization problem, resulting in low-variance fits. We explore the properties of CRISP, and evaluate its performance in a simulation study and on a housing price data set. PMID:27635120
[Logistic regression model of noninvasive prediction for portal hypertensive gastropathy in patients with hepatitis B associated cirrhosis].

PubMed

Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo

2015-05-12

To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
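The fitted equation above gives a log-odds value; a predicted probability follows from the inverse logit. A minimal sketch, using the coefficients exactly as reported in the abstract. The abstract does not state how X1 to X4 are coded or scaled, so the input values in the example are purely hypothetical and the output should not be read as a clinical prediction.

```python
import math

# Coefficients as reported in the abstract:
# Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFS = (2.186, -2.167, 0.725, 0.976)

def phg_logit(x1, x2, x3, x4):
    """Log-odds of PHG from the reported model."""
    return INTERCEPT + sum(b * x for b, x in zip(COEFS, (x1, x2, x3, x4)))

def phg_probability(x1, x2, x3, x4):
    """Inverse logit: converts the log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-phg_logit(x1, x2, x3, x4)))

# Hypothetical input: every predictor set to 1 (the actual coding is not given)
example_probability = phg_probability(1, 1, 1, 1)
```

A classification rule would compare this probability with a cut-off; the abstract reports sensitivity and specificity but does not state the cut-off used.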
Does transport time help explain the high trauma mortality rates in rural areas? New and traditional predictors assessed by new and traditional statistical methods

PubMed Central

Røislien, Jo; Lossius, Hans Morten; Kristiansen, Thomas

2015-01-01

Background Trauma is a leading global cause of death. Trauma mortality rates are higher in rural areas, constituting a challenge for quality and equality in trauma care. The aim of the study was to explore population density and transport time to hospital care as possible predictors of geographical differences in mortality rates, and to what extent choice of statistical method might affect the analytical results and accompanying clinical conclusions. Methods Using data from the Norwegian Cause of Death registry, deaths from external causes 1998–2007 were analysed. Norway consists of 434 municipalities, and municipality population density and travel time to hospital care were entered as predictors of municipality mortality rates in univariate and multiple regression models of increasing model complexity. We fitted linear regression models with continuous and categorised predictors, as well as piecewise linear and generalised additive models (GAMs). Models were compared using Akaike's information criterion (AIC). Results Population density was an independent predictor of trauma mortality rates, while the contribution of transport time to hospital care was highly dependent on choice of statistical model. A multiple GAM or piecewise linear model was superior, and similar, in terms of AIC. However, while transport time was statistically significant in multiple models with piecewise linear or categorised predictors, it was not in GAM or standard linear regression. Conclusions Population density is an independent predictor of trauma mortality rates. The added explanatory value of transport time to hospital care is marginal and model-dependent, highlighting the importance of exploring several statistical models when studying complex associations in observational data. PMID:25972600
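Comparing models by AIC, as described above, can be sketched for least-squares fits with Gaussian errors, where AIC equals n·ln(RSS/n) + 2k up to an additive constant shared by all models fitted to the same data. The example compares an intercept-only model with a simple linear model. The data are hypothetical, and the piecewise linear models and GAMs used in the study are not reproduced here.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    b = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y)) / sum(
        (xi - xm) ** 2 for xi in x
    )
    a = ym - b * xm
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def aic_gaussian(rss, n, k):
    """AIC for a Gaussian least-squares fit, dropping the shared constant.

    k counts all estimated parameters, including the error variance.
    """
    return n * math.log(rss / n) + 2 * k

# Hypothetical predictor (e.g. transport time) and response (mortality rate)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
n = len(y)
y_mean = sum(y) / n
rss_null = sum((yi - y_mean) ** 2 for yi in y)
_, slope, rss_line = fit_line(x, y)
aic_null = aic_gaussian(rss_null, n, k=2)   # mean + variance
aic_line = aic_gaussian(rss_line, n, k=3)   # intercept + slope + variance
```

The model with the lower AIC is preferred; the 2k penalty means an extra parameter has to reduce the residual sum of squares enough to justify itself.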
Benchmark dose analysis via nonparametric regression modeling

PubMed Central

Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen

2013-01-01

Estimation of benchmark doses (BMDs) in quantitative risk assessment traditionally is based upon parametric dose-response modeling. It is a well-known concern, however, that if the chosen parametric model is uncertain and/or misspecified, inaccurate and possibly unsafe low-dose inferences can result. We describe a nonparametric approach for estimating BMDs with quantal-response data based on an isotonic regression method, and also study use of corresponding, nonparametric, bootstrap-based confidence limits for the BMD. We explore the confidence limits’ small-sample properties via a simulation study, and illustrate the calculations with an example from cancer risk assessment. It is seen that this nonparametric approach can provide a useful alternative for BMD estimation when faced with the problem of parametric model uncertainty. PMID:23683057
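The isotonic-regression idea above can be sketched with the pool-adjacent-violators algorithm: observed response proportions are forced to be nondecreasing in dose, and the benchmark dose is read off as the dose where extra risk first reaches the benchmark response. This is an illustrative sketch only. The linear interpolation between doses, the extra-risk definition (R(d) − R(0)) / (1 − R(0)), and the dose-response data are assumptions for the example, and the bootstrap confidence limits studied in the paper are not shown.

```python
def pava(y, w):
    """Weighted pool-adjacent-violators: nondecreasing fit to y."""
    blocks = []  # each block holds [mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge the last two blocks while they violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit

def bmd_estimate(doses, fit, bmr=0.10):
    """Dose at which extra risk reaches bmr, by linear interpolation.

    Returns None if the fitted curve never reaches the target.
    """
    r0 = fit[0]
    target = r0 + bmr * (1.0 - r0)
    for i in range(1, len(doses)):
        if fit[i] >= target:
            frac = (target - fit[i - 1]) / (fit[i] - fit[i - 1])
            return doses[i - 1] + frac * (doses[i] - doses[i - 1])
    return None

# Hypothetical quantal-response data: responders out of 50 animals per dose
doses = [0.0, 1.0, 2.0, 4.0]
group_sizes = [50, 50, 50, 50]
responders = [2, 1, 10, 25]
proportions = [r / n for r, n in zip(responders, group_sizes)]
fitted = pava(proportions, group_sizes)
bmd = bmd_estimate(doses, fitted, bmr=0.10)
```

In this example the first two dose groups violate monotonicity (0.04 then 0.02), so the algorithm pools them to a common value before the benchmark dose is located.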
RRegrs: an R package for computer-aided model selection with multiple regression models.

PubMed

Tsiliki, Georgia; Munteanu, Cristian R; Seoane, Jose A; Fernandez-Lozano, Carlos; Sarimveis, Haralambos; Willighagen, Egon L

2015-01-01

Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models, therefore raising model reproducibility and comparison issues. Cheminformatics and bioinformatics use predictive modelling extensively and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespective of their statistical knowledge, would be valuable if it tests several simple and complex regression models and validation schemes, produces unified reports, and offers the option to be integrated into more extensive studies. Additionally, such methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending the caret package. The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance as well as its adaptability in terms of parameter optimization could make RRegrs a popular framework to assist the initial exploration of predictive models, and with that, the design of more comprehensive in silico screening applications. Graphical abstract: RRegrs is a computer-aided model selection framework for R multiple regression models; this is a fully validated procedure with application to QSAR modelling.
An open-access CMIP5 pattern library for temperature and precipitation: description and methodology

NASA Astrophysics Data System (ADS)

Lynch, Cary; Hartin, Corinne; Bond-Lamberty, Ben; Kravitz, Ben

2017-05-01

Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squares regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90° N/S). Bias and mean errors between modeled and pattern-predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5 °C, but the choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. This paper describes our library of least squares regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns. The dataset and netCDF data generation code are available at doi:10.5281/zenodo.495632.
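The two pattern-generation methods named above can be contrasted for a single grid cell. In this sketch the least squares pattern is the slope of the local variable regressed on global mean temperature across years, and the delta pattern is the local change between a baseline and a future epoch divided by the corresponding global change. It is a simplified reading of the two methods, not the code released with the library; the epoch length and the time series are hypothetical.

```python
def ls_pattern(global_t, local_v):
    """Least squares pattern: slope of local variable vs global mean temperature."""
    n = len(global_t)
    gm, lm = sum(global_t) / n, sum(local_v) / n
    sxy = sum((g - gm) * (v - lm) for g, v in zip(global_t, local_v))
    sxx = sum((g - gm) ** 2 for g in global_t)
    return sxy / sxx

def delta_pattern(global_t, local_v, n_epoch):
    """Delta pattern: epoch-mean local change per unit of global change.

    The first n_epoch years form the baseline, the last n_epoch the future epoch.
    """
    def mean(values):
        return sum(values) / len(values)

    d_local = mean(local_v[-n_epoch:]) - mean(local_v[:n_epoch])
    d_global = mean(global_t[-n_epoch:]) - mean(global_t[:n_epoch])
    return d_local / d_global

# Hypothetical yearly series: global mean temperature anomaly and one grid cell
global_t = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
local_v = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pattern_ls = ls_pattern(global_t, local_v)
pattern_delta = delta_pattern(global_t, local_v, n_epoch=2)
```

On a perfectly linear series the two methods agree; with noisy or nonlinear local responses they diverge, which is the kind of difference the paper quantifies.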
Mapping the spatial pattern of temperate forest above ground biomass by integrating airborne lidar with Radarsat-2 imagery via geostatistical models

NASA Astrophysics Data System (ADS)

Li, Wang; Niu, Zheng; Gao, Shuai; Wang, Cheng

2014-11-01

Light Detection and Ranging (LiDAR) and Synthetic Aperture Radar (SAR) are two competitive active remote sensing techniques in forest above ground biomass estimation, which is important for forest management and global climate change study. This study aims to further explore their capabilities in temperate forest above ground biomass (AGB) estimation by emphasizing the spatial auto-correlation of variables obtained from these two remote sensing tools, which is a usually overlooked aspect in remote sensing applications to vegetation studies. Remote sensing variables including airborne LiDAR metrics, backscattering coefficients for different SAR polarizations and their ratio variables for Radarsat-2 imagery were calculated. First, simple linear regression (SLR) models were established between the field-estimated above ground biomass and the remote sensing variables. Pearson's correlation coefficient (R²) was used to find which LiDAR metric showed the most significant correlation with the regression residuals and could be selected as the co-variable in regression co-kriging (RCoKrig). Second, regression co-kriging was conducted by choosing the regression residuals as the dependent variable and the LiDAR metric (Hmean) with the highest R² as the co-variable. Third, above ground biomass over the study area was estimated using the SLR model and the RCoKrig model, respectively. The results for these two models were validated using the same ground points. Results showed that both methods achieved satisfactory prediction accuracy, while regression co-kriging showed the lower estimation error. The results indicate that the regression co-kriging model is feasible and effective in mapping the spatial pattern of AGB in the temperate forest using Radarsat-2 data calibrated by airborne LiDAR metrics.

Does the high–tech industry consistently reduce CO₂ emissions? Results from nonparametric additive regression model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Bin; Research Center of Applied Statistics, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013; Lin, Boqiang, E-mail: bqlin@xmu.edu.cn

China is currently the world's largest carbon dioxide (CO₂) emitter. Moreover, total energy consumption and CO₂ emissions in China will continue to increase due to the rapid growth of industrialization and urbanization. Therefore, vigorously developing the high–tech industry becomes an inevitable choice to reduce CO₂ emissions at the moment or in the future. However, ignoring the existing nonlinear links between economic variables, most scholars use traditional linear models to explore the impact of the high–tech industry on CO₂ emissions from an aggregate perspective. Few studies have focused on nonlinear relationships and regional differences in China. Based on panel data of 1998–2014, this study uses the nonparametric additive regression model to explore the nonlinear effect of the high–tech industry from a regional perspective. The estimated results show that the residual sums of squares (SSR) of the nonparametric additive regression model in the eastern, central and western regions are 0.693, 0.054 and 0.085 respectively, which are much less than those of the traditional linear regression model (3.158, 4.227 and 7.196). This verifies that the nonparametric additive regression model has a better fitting effect. Specifically, the high–tech industry produces an inverted “U–shaped” nonlinear impact on CO₂ emissions in the eastern region, but a positive “U–shaped” nonlinear effect in the central and western regions. Therefore, the nonlinear impact of the high–tech industry on CO₂ emissions in the three regions should be given adequate attention in developing effective abatement policies. - Highlights: • The nonlinear effect of the high–tech industry on CO₂ emissions was investigated. • The high–tech industry yields an inverted “U–shaped” effect in the eastern region. • The high–tech industry has a positive “U–shaped” nonlinear effect in other regions. • The linear impact of the high–tech industry in the eastern region is the strongest.
A Model for Predicting Student Performance on High-Stakes Assessment

ERIC Educational Resources Information Center

Dammann, Matthew Walter

2010-01-01

This research study examined the use of student achievement on reading and math state assessments to predict success on the science state assessment. Multiple regression analysis was utilized to test the prediction for all students in grades 5 and 8 in a mid-Atlantic state. The prediction model developed from the analysis explored the combined…
A Nationwide Epidemiologic Modeling Study of LD: Risk, Protection, and Unintended Impact

ERIC Educational Resources Information Center

McDermott, Paul A.; Goldberg, Michelle M.; Watkins, Marley W.; Stanley, Jeanne L.; Glutting, Joseph J.

2006-01-01

Through multiple logistic regression modeling, this article explores the relative importance of risk and protective factors associated with learning disabilities (LD). A representative national sample of 6- to 17-year-old students (N = 1,268) was drawn by random stratification and classified by the presence versus absence of LD in reading,…
Data Mining and Predictive Modeling in Institutional Advancement: How Ten Schools Found Success. Technical Report

ERIC Educational Resources Information Center

Luperchio, Dan

2009-01-01

This technical report, produced in partnership by the Council for Advancement and Support of Education (CASE) and SPSS Inc., explores the promise of data mining alumni records at educational institutions. Working with individual alumni records from The Johns Hopkins Zanvyl Krieger School of Arts and Sciences, a predictive regression model is…
Exploring and accounting for publication bias in mental health: a brief overview of methods.

PubMed

Mavridis, Dimitris; Salanti, Georgia

2014-02-01

OBJECTIVE Publication bias undermines the integrity of published research. The aim of this paper is to present a synopsis of methods for exploring and accounting for publication bias. METHODS We discussed the main features of the following methods to assess publication bias: funnel plot analysis; trim-and-fill methods; regression techniques and selection models. We applied these methods to a well-known example of antidepressant trials that compared trials submitted to the Food and Drug Administration (FDA) for regulatory approval. RESULTS The funnel plot-related methods (visual inspection, trim-and-fill, regression models) revealed an association between effect size and SE. Contours of statistical significance showed that asymmetry in the funnel plot is probably due to publication bias. The selection model found a significant correlation between effect size and propensity for publication. CONCLUSIONS Researchers should always consider the possible impact of publication bias. Funnel plot-related methods should be seen as a means of examining for small-study effects and not be directly equated with publication bias. Possible causes for funnel plot asymmetry should be explored. Contours of statistical significance may help disentangle whether asymmetry in a funnel plot is caused by publication bias or not. Selection models, although underused, could be a useful resource when publication bias and heterogeneity are suspected because they address directly the problem of publication bias and not that of small-study effects.
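The abstract does not name the regression technique it applied. One widely used option is Egger's regression, sketched here as an assumption rather than as the authors' method: the standardized effect (effect divided by its SE) is regressed on precision (1/SE), and an intercept far from zero suggests funnel plot asymmetry. The effect sizes and standard errors below are hypothetical, and the significance test on the intercept is omitted.

```python
def egger_regression(effects, ses):
    """Egger-style regression of standardized effect on precision.

    Returns (intercept, slope). The intercept measures asymmetry;
    the slope estimates the underlying effect.
    """
    z = [e / s for e, s in zip(effects, ses)]      # standardized effects
    precision = [1.0 / s for s in ses]
    n = len(z)
    pm, zm = sum(precision) / n, sum(z) / n
    slope = sum((p - pm) * (zi - zm) for p, zi in zip(precision, z)) / sum(
        (p - pm) ** 2 for p in precision
    )
    return zm - slope * pm, slope

# Hypothetical trials with the same effect regardless of study size
effects = [0.5, 0.5, 0.5, 0.5]
ses = [0.1, 0.2, 0.25, 0.5]
intercept, slope = egger_regression(effects, ses)
```

When the effect does not depend on study size, as in this example, the intercept is zero and the slope recovers the common effect. As the abstract cautions, a nonzero intercept indicates small-study effects, which are not necessarily publication bias.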
A classical regression framework for mediation analysis: fitting one model to estimate mediation effects.

PubMed

Saunders, Christina T; Blume, Jeffrey D

2017-10-26

Mediation analysis explores the degree to which an exposure's effect on an outcome is diverted through a mediating variable. We describe a classical regression framework for conducting mediation analyses in which estimates of causal mediation effects and their variance are obtained from the fit of a single regression model. The vector of changes in exposure pathway coefficients, which we named the essential mediation components (EMCs), is used to estimate standard causal mediation effects. Because these effects are often simple functions of the EMCs, an analytical expression for their model-based variance follows directly. Given this formula, it is instructive to revisit the performance of routinely used variance approximations (e.g., delta method and resampling methods). Requiring the fit of only one model reduces the computation time required for complex mediation analyses and permits the use of a rich suite of regression tools that are not easily implemented on a system of three equations, as would be required in the Baron-Kenny framework. Using data from the BRAIN-ICU study, we provide examples to illustrate the advantages of this framework and compare it with the existing approaches. © The Author 2017. Published by Oxford University Press.
Prediction of HDR quality by combining perceptually transformed display measurements with machine learning

NASA Astrophysics Data System (ADS)

Choudhury, Anustup; Farrell, Suzanne; Atkins, Robin; Daly, Scott

2017-09-01

We present an approach to predict overall HDR display quality as a function of key HDR display parameters. We first performed subjective experiments on a high quality HDR display that explored five key HDR display parameters: maximum luminance, minimum luminance, color gamut, bit-depth and local contrast. Subjects rated overall quality for different combinations of these display parameters. We explored two models: a physical model solely based on physically measured display characteristics and a perceptual model that transforms physical parameters using human vision system models. For the perceptual model, we use a family of metrics based on a recently published color volume model (ICT-CP), which consists of the PQ luminance non-linearity (ST2084) and LMS-based opponent color, as well as an estimate of the display point spread function. To predict overall visual quality, we apply linear regression and machine learning techniques such as Multilayer Perceptron, RBF and SVM networks. We use RMSE and Pearson/Spearman correlation coefficients to quantify performance. We found that the perceptual model is better at predicting subjective quality than the physical model and that SVM is better at prediction than linear regression. The significance and contribution of each display parameter were investigated. In addition, we found that combined parameters such as contrast do not improve prediction. Traditional perceptual models were also evaluated and we found that models based on the PQ non-linearity performed better.
Exploring unobserved heterogeneity in bicyclists' red-light running behaviors at different crossing facilities.

PubMed

Guo, Yanyong; Li, Zhibin; Wu, Yao; Xu, Chengcheng

2018-06-01

Bicyclists running the red light at crossing facilities increase the potential of colliding with motor vehicles. Exploring the contributing factors could improve the prediction of red-light running probability and help develop countermeasures to reduce such behaviors. However, individuals could have unobserved heterogeneities in running a red light, which makes accurate prediction more challenging. Traditional models assume that factor parameters are fixed and cannot capture the varying impacts on red-light running behaviors. In this study, we employed the full Bayesian random parameters logistic regression approach to account for the unobserved heterogeneous effects. Two types of crossing facilities were considered: signalized intersection crosswalks and road segment crosswalks. Electric and conventional bikes were distinguished in the modeling. Data were collected from 16 crosswalks in an urban area of Nanjing, China. Factors such as individual characteristics, road geometric design, environmental features, and traffic variables were examined. Model comparison indicates that the full Bayesian random parameters logistic regression approach is statistically superior to the standard logistic regression model. More red-light runners are predicted at signalized intersection crosswalks than at road segment crosswalks. Factors affecting red-light running behaviors are gender, age, bike type, road width, presence of raised median, separation width, signal type, green ratio, bike and vehicle volume, and average vehicle speed. Factors associated with the unobserved heterogeneity are gender, bike type, signal type, separation width, and bike volume. Copyright © 2018 Elsevier Ltd. All rights reserved.
Gender differences in body consciousness and substance use among high-risk adolescents.

PubMed

Black, David Scott; Sussman, Steve; Unger, Jennifer; Pokhrel, Pallav; Sun, Ping

2010-08-01

This study explores the association between private and public body consciousness and past 30-day cigarette, alcohol, marijuana, and hard drug use among adolescents. Self-reported data from alternative high school students in California were analyzed (N = 976) using multilevel regression models to account for student clustering within schools. Separate regression analyses were conducted for males and females. Both cross-sectional baseline data and one-year longitudinal prediction models indicated that body consciousness is associated with specific drug use categories differentially by gender. Findings suggest that body consciousness accounts for additional variance in substance use etiology not explained by previously recognized dispositional variables.
Tradespace Exploration for the Engineering of Resilient Systems

DTIC Science & Technology

2015-05-01

world scenarios. The types of tools within the SAE set include visualization, decision analysis, and M&S, so it is difficult to categorize this toolset... overpopulated , or questionable. ERS Tradespace Workshop Create predictive models using multiple techniques (e.g., regression, Kriging, neural nets
LiDAR based prediction of forest biomass using hierarchical models with spatially varying coefficients

USGS Publications Warehouse

Babcock, Chad; Finley, Andrew O.; Bradford, John B.; Kolka, Randall K.; Birdsey, Richard A.; Ryan, Michael G.

2015-01-01

Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both residual spatial dependence and non-stationarity of model covariates through the introduction of spatial random effects. We explored this objective using four forest inventory datasets that are part of the North American Carbon Program, each comprising point-referenced measures of above-ground forest biomass and discrete LiDAR. For each dataset, we considered at least five regression model specifications of varying complexity. Models were assessed based on goodness of fit criteria and predictive performance using a 10-fold cross-validation procedure. Results showed that the addition of spatial random effects to the regression model intercept improved fit and predictive performance in the presence of substantial residual spatial dependence. Additionally, in some cases, allowing either some or all regression slope parameters to vary spatially, via the addition of spatial random effects, further improved model fit and predictive performance. In other instances, models showed improved fit but decreased predictive performance—indicating over-fitting and underscoring the need for cross-validation to assess predictive ability. The proposed Bayesian modeling framework provided access to pixel-level posterior predictive distributions that were useful for uncertainty mapping, diagnosing spatial extrapolation issues, revealing missing model covariates, and discovering locally significant parameters.
An empirical study using permutation-based resampling in meta-regression

PubMed Central

2012-01-01

Background: In meta-regression, as the number of trials in the analyses decreases, the risk of false positives or false negatives increases. This is partly due to the assumption of normality that may not hold in small samples. Creation of a distribution from the observed trials using permutation methods to calculate P values may allow for less spurious findings. Permutation has not been empirically tested in meta-regression. The objective of this study was to perform an empirical investigation to explore the differences in results for meta-analyses on a small number of trials using standard large sample approaches versus permutation-based methods for meta-regression. Methods: We isolated a sample of randomized controlled clinical trials (RCTs) for interventions that have a small number of trials (herbal medicine trials). Trials were then grouped by herbal species and condition and assessed for methodological quality using the Jadad scale, and data were extracted for each outcome. Finally, we performed meta-analyses on the primary outcome of each group of trials and meta-regression for methodological quality subgroups within each meta-analysis. We used large sample methods and permutation methods in our meta-regression modeling. We then compared final models and final P values between methods. Results: We collected 110 trials across 5 intervention/outcome pairings and 5 to 10 trials per covariate. When applying large sample methods and permutation-based methods in our backwards stepwise regression, the covariates in the final models were identical in all cases. The P values for the covariates in the final model were larger in 78% (7/9) of the cases for permutation and identical for 22% (2/9) of the cases. Conclusions: We present empirical evidence that permutation-based resampling may not change final models when using backwards stepwise regression, but may increase P values in meta-regression of multiple covariates for a relatively small number of trials. PMID:22587815
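The permutation idea in the record above can be illustrated with a minimal sketch. This is not the authors' meta-regression code: it assumes a simple unweighted least-squares slope (a real meta-regression would weight trials by precision), and the function names and toy data are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the unweighted least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P value for the slope.

    The outcome values are shuffled against the covariate to build a
    reference distribution, so no normality assumption is needed.
    """
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(ols_slope(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: a quality score and an effect size per trial.
quality = [1, 2, 2, 3, 4, 5]
effect = [0.10, 0.35, 0.20, 0.40, 0.55, 0.60]
p = permutation_p_value(quality, effect)
```

With only a handful of trials the number of distinct permutations is small, so the attainable P values are coarse; that granularity is one reason permutation P values can differ from large-sample ones.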
Measuring the impact of urbanization on scenic quality: land use change in the northeast

Treesearch

Robert O. Brush; James F. Palmer

1979-01-01

The changes in scenic quality resulting from urbanization are explored for a region in the Northeast. The relative contributions to scenic quality of certain landscape features are examined by developing regression models for the region and for town landscapes within that region. The models provide empirical evidence of the importance of trees for maintaining high...
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions

PubMed Central

Fernandes, Bruno J. T.; Roque, Alexandre

2018-01-01

Height and weight are measurements used to track nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a combination of these factors may not allow accurate estimation or measurement; in those cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all the cases, the predictions are insensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
Student Moon Observations and Spatial-Scientific Reasoning

ERIC Educational Resources Information Center

Cole, Merryn; Wilhelm, Jennifer; Yang, Hongwei

2015-01-01

Relationships between sixth grade students' moon journaling and students' spatial-scientific reasoning after implementation of an Earth/Space unit were examined. Teachers used the project-based Realistic Explorations in Astronomical Learning curriculum. We used a regression model to analyze the relationship between the students' Lunar Phases…
Avoiding overstating the strength of forensic evidence: Shrunk likelihood ratios/Bayes factors.

PubMed

Morrison, Geoffrey Stewart; Poh, Norman

2018-05-01

When strength of forensic evidence is quantified using sample data and statistical models, a concern may be raised as to whether the output of a model overestimates the strength of evidence. This is particularly the case when the amount of sample data is small, and hence sampling variability is high. This concern is related to concern about precision. This paper describes, explores, and tests three procedures which shrink the value of the likelihood ratio or Bayes factor toward the neutral value of one. The procedures are: (1) a Bayesian procedure with uninformative priors, (2) use of empirical lower and upper bounds (ELUB), and (3) a novel form of regularized logistic regression. As a benchmark, they are compared with linear discriminant analysis, and in some instances with non-regularized logistic regression. The behaviours of the procedures are explored using Monte Carlo simulated data, and tested on real data from comparisons of voice recordings, face images, and glass fragments. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Individualized Prediction of Heat Stress in Firefighters: A Data-Driven Approach Using Classification and Regression Trees.

PubMed

Mani, Ashutosh; Rao, Marepalli; James, Kelley; Bhattacharya, Amit

2015-01-01

The purpose of this study was to explore data-driven models, based on decision trees, to develop practical and easy-to-use predictive models for early identification of firefighters who are likely to cross the threshold of hyperthermia during live-fire training. Predictive models were created for three consecutive live-fire training scenarios. The final predicted outcome was a categorical variable: will a firefighter cross the upper threshold of hyperthermia (Yes/No). Two tiers of models were built, one with and one without taking into account the outcome (whether a firefighter crossed hyperthermia or not) from the previous training scenario. The first tier of models included age, baseline heart rate and core body temperature, body mass index, and duration of training scenario as predictors. The second tier of models included the outcome of the previous scenario in the prediction space, in addition to all the predictors from the first tier of models. Classification and regression trees were used independently for prediction. The response variable for the regression tree was the quantitative variable: core body temperature at the end of each scenario. The predicted quantitative variable from regression trees was compared to the upper threshold of hyperthermia (38°C) to predict whether a firefighter would enter hyperthermia. The performance of classification and regression tree models was satisfactory for the second (success rate = 79%) and third (success rate = 89%) training scenarios but not for the first (success rate = 43%). Data-driven models based on decision trees can be a useful tool for predicting physiological response without modeling the underlying physiological systems. Early prediction of heat stress coupled with proactive interventions, such as pre-cooling, can help reduce heat stress in firefighters.
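The step in which a regression-tree temperature prediction is turned into a Yes/No outcome can be sketched as follows. Only the 38°C threshold comes from the abstract; the strict inequality, the reading of "success rate" as the fraction of correct Yes/No predictions, the function names and the toy numbers are all assumptions made for illustration.

```python
HYPERTHERMIA_THRESHOLD_C = 38.0  # upper threshold quoted in the abstract


def crosses_threshold(predicted_temp_c, threshold=HYPERTHERMIA_THRESHOLD_C):
    """Turn a predicted core temperature into a Yes/No outcome.

    Assumption: 'crossing' means strictly exceeding the threshold.
    """
    return predicted_temp_c > threshold


def success_rate(predicted_temps_c, observed_crossed):
    """Fraction of firefighters whose Yes/No outcome is predicted correctly."""
    correct = sum(
        crosses_threshold(temp) == observed
        for temp, observed in zip(predicted_temps_c, observed_crossed)
    )
    return correct / len(observed_crossed)


# Hypothetical predictions (degrees C) and observed outcomes, not study data.
predicted = [37.5, 38.4, 38.1, 37.9]
observed = [False, True, False, False]
rate = success_rate(predicted, observed)
```

Here three of the four hypothetical cases are classified correctly, so the function returns 0.75.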
Exploring the modeling of spatiotemporal variations in ambient air pollution within the land use regression framework: Estimation of PM10 concentrations on a daily basis.

PubMed

Alam, Md Saniul; McNabola, Aonghus

2015-05-01

Estimation of daily average exposure to PM10 (particulate matter with an aerodynamic diameter <10 μm) using the available fixed-site monitoring stations (FSMs) in a city poses a great challenge. This is because typically FSMs are limited in number when considering the spatial representativeness of their measurements and also because statistical models of citywide exposure have yet to be explored in this context. This paper deals with the latter aspect of this challenge and extends the widely used land use regression (LUR) approach to deal with temporal changes in air pollution and the influence of transboundary air pollution on short-term variations in PM10. Using the concept of multiple linear regression (MLR) modeling, the average daily concentrations of PM10 in two European cities, Vienna and Dublin, were modeled. Models were initially developed using the standard MLR approach in Vienna using the most recently available data. Efforts were subsequently made to (i) assess the stability of model predictions over time and (ii) explore the applicability of nonparametric regression (NPR) and artificial neural networks (ANNs) to deal with the nonlinearity of input variables. The predictive performance of the MLR models of both cities was demonstrated to be stable over time and to produce similar results. However, NPR and ANN were found to improve predictive performance further in both cities. Using ANN produced the highest result, with daily PM10 exposure predicted at R2 = 66% for Vienna and 51% for Dublin. In addition, two new predictor variables were also assessed for the Dublin model. The variables representing transboundary air pollution and peak traffic count were found to account for 6.5% and 12.7%, respectively, of the variation in average daily PM10 concentration. The variable representing transboundary air pollution that was derived from air mass history (from back-trajectory analysis) and population density has demonstrated a positive impact on model performance.
The implications of this research would suggest that it is possible to produce a model of ambient air quality on a citywide scale using the readily available data. Most European cities typically have a limited FSM network with average daily concentrations of air pollutants as well as available meteorological, traffic, and land-use data. This research highlights that using these data in combination with advanced statistical techniques such as NPR or ANNs will produce reasonably accurate predictions of ambient air quality across a city, including temporal variations. Therefore, this approach reduces the need for additional measurement data to supplement existing historical records and enables a lower-cost method of air pollution model development for practitioners and policy makers.
An event-based approach to understanding decadal fluctuations in the Atlantic meridional overturning circulation

NASA Astrophysics Data System (ADS)

Allison, Lesley; Hawkins, Ed; Woollings, Tim

2015-01-01

Many previous studies have shown that unforced climate model simulations exhibit decadal-scale fluctuations in the Atlantic meridional overturning circulation (AMOC), and that this variability can have impacts on surface climate fields. However, the robustness of these surface fingerprints across different models is less clear. Furthermore, with the potential for coupled feedbacks that may amplify or damp the response, it is not known whether the associated climate signals are linearly related to the strength of the AMOC changes, or if the fluctuation events exhibit nonlinear behaviour with respect to their strength or polarity. To explore these questions, we introduce an objective and flexible method for identifying the largest natural AMOC fluctuation events in multicentennial/multimillennial simulations of a variety of coupled climate models. The characteristics of the events are explored, including their magnitude, meridional coherence and spatial structure, as well as links with ocean heat transport and the horizontal circulation. The surface fingerprints in ocean temperature and salinity are examined, and compared with the results of linear regression analysis. It is found that the regressions generally provide a good indication of the surface changes associated with the largest AMOC events. However, there are some exceptions, including a nonlinear change in the atmospheric pressure signal, particularly at high latitudes, in HadCM3. Some asymmetries are also found between the changes associated with positive and negative AMOC events in the same model. Composite analysis suggests that there are signals that are robust across the largest AMOC events in each model, which provides reassurance that the surface changes associated with one particular event will be similar to those expected from regression analysis. 
However, large differences are found between the AMOC fingerprints in different models, which may hinder the prediction and attribution of such events in reality.
The Mantel-Haenszel procedure revisited: models and generalizations.

PubMed

Fidler, Vaclav; Nagelkerke, Nico

2013-01-01

Several statistical methods have been developed for adjusting the Odds Ratio of the relation between two dichotomous variables X and Y for some confounders Z. With the exception of the Mantel-Haenszel method, commonly used methods, notably binary logistic regression, are not symmetrical in X and Y. The classical Mantel-Haenszel method however only works for confounders with a limited number of discrete strata, which limits its utility, and appears to have no basis in statistical models. Here we revisit the Mantel-Haenszel method and propose an extension to continuous and vector valued Z. The idea is to replace the observed cell entries in strata of the Mantel-Haenszel procedure by subject specific classification probabilities for the four possible values of (X,Y) predicted by a suitable statistical model. For situations where X and Y can be treated symmetrically we propose and explore the multinomial logistic model. Under the homogeneity hypothesis, which states that the odds ratio does not depend on Z, the logarithm of the odds ratio estimator can be expressed as a simple linear combination of three parameters of this model. Methods for testing the homogeneity hypothesis are proposed. The relationship between this method and binary logistic regression is explored. A numerical example using survey data is presented.
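For orientation, the classical stratified estimator that the record above sets out to generalize can be sketched in a few lines. This is the standard Mantel-Haenszel pooled odds ratio, not the authors' multinomial-logistic extension; the function name and the toy tables are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Classical Mantel-Haenszel pooled odds ratio.

    Each stratum is a 2x2 table given as (a, b, c, d):
        a = X=1, Y=1    b = X=1, Y=0
        c = X=0, Y=1    d = X=0, Y=0
    The estimator is sum(a*d/n) / sum(b*c/n) over strata,
    where n is the stratum total.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical strata that each have a within-stratum odds ratio of 4.
tables = [(2, 1, 1, 2), (4, 2, 2, 4)]
pooled = mantel_haenszel_or(tables)
```

Because the estimator sums over a finite set of 2x2 tables, it needs Z to fall into a limited number of discrete strata, which is the restriction the abstract describes.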

The Mantel-Haenszel Procedure Revisited: Models and Generalizations

PubMed Central

Fidler, Vaclav; Nagelkerke, Nico

2013-01-01

Several statistical methods have been developed for adjusting the Odds Ratio of the relation between two dichotomous variables X and Y for some confounders Z. With the exception of the Mantel-Haenszel method, commonly used methods, notably binary logistic regression, are not symmetrical in X and Y. The classical Mantel-Haenszel method however only works for confounders with a limited number of discrete strata, which limits its utility, and appears to have no basis in statistical models. Here we revisit the Mantel-Haenszel method and propose an extension to continuous and vector valued Z. The idea is to replace the observed cell entries in strata of the Mantel-Haenszel procedure by subject specific classification probabilities for the four possible values of (X,Y) predicted by a suitable statistical model. For situations where X and Y can be treated symmetrically we propose and explore the multinomial logistic model. Under the homogeneity hypothesis, which states that the odds ratio does not depend on Z, the logarithm of the odds ratio estimator can be expressed as a simple linear combination of three parameters of this model. Methods for testing the homogeneity hypothesis are proposed. The relationship between this method and binary logistic regression is explored. A numerical example using survey data is presented. PMID:23516463
Life Satisfaction and Violent Behaviors among Middle School Students

ERIC Educational Resources Information Center

Valois, Robert F.; Paxton, Raheem J.; Zullig, Keith J.; Huebner, E. Scott

2006-01-01

We explored relationships between violent behaviors and perceived life satisfaction among 2,138 middle school students in a southern state using the CDC Middle School Youth Risk Behavior Survey (MSYRBS) and the Brief Multidimensional Student Life Satisfaction Scale (BMSLSS). Logistic regression analyses and multivariate models constructed…
A hybrid PSO-SVM-based method for predicting the friction coefficient between aircraft tire and coating

NASA Astrophysics Data System (ADS)

Zhan, Liwei; Li, Chengwei

2017-02-01

A hybrid PSO-SVM-based model is proposed to predict the friction coefficient between aircraft tire and coating. The presented hybrid model combines a support vector machine (SVM) with the particle swarm optimization (PSO) technique. SVM has been adopted to solve regression problems successfully. Its regression accuracy depends strongly on optimizing parameters such as the regularization constant C, the RBF kernel parameter γ, and the epsilon parameter ε in the SVM training procedure. However, SVM-based prediction of the friction coefficient between aircraft tire and coating has yet to be explored. The experiment reveals that drop height and tire rotational speed are the factors affecting the friction coefficient. With this in mind, the friction coefficient can be predicted using the hybrid PSO-SVM-based model from the measured friction coefficient between aircraft tire and coating. To compare regression accuracy, a grid search (GS) method and a genetic algorithm (GA) are also used to optimize the relevant parameters (C, γ and ε). The regression accuracy is reflected by the coefficient of determination (R2). The result shows that the hybrid PSO-RBF-SVM-based model has better accuracy compared with the GS-RBF-SVM- and GA-RBF-SVM-based models. The agreement of this model (PSO-RBF-SVM) with experiment data confirms its good performance.
Spatial Analysis of Severe Fever with Thrombocytopenia Syndrome Virus in China Using a Geographically Weighted Logistic Regression Model

PubMed Central

Wu, Liang; Deng, Fei; Xie, Zhong; Hu, Sheng; Shen, Shu; Shi, Junming; Liu, Dan

2016-01-01

Severe fever with thrombocytopenia syndrome (SFTS) is caused by severe fever with thrombocytopenia syndrome virus (SFTSV), which has had a serious impact on public health in parts of Asia. There is no specific antiviral drug or vaccine for SFTSV and, therefore, it is important to determine the factors that influence the occurrence of SFTSV infections. This study aimed to explore the spatial associations between SFTSV infections and several potential determinants, and to predict the high-risk areas in mainland China. The analysis was carried out at the level of provinces in mainland China. The potential explanatory variables that were investigated consisted of meteorological factors (average temperature, average monthly precipitation and average relative humidity), the average proportion of rural population and the average proportion of primary industries over three years (2010–2012). We constructed a geographically weighted logistic regression (GWLR) model in order to explore the associations between the selected variables and confirmed cases of SFTSV. The study showed that: (1) meteorological factors have a strong influence on the SFTSV cover; (2) a GWLR model is suitable for exploring SFTSV cover in mainland China; (3) our findings can be used for predicting high-risk areas and highlighting when meteorological factors pose a risk in order to aid in the implementation of public health strategies. PMID:27845737
Random forests and stochastic gradient boosting for predicting tree canopy cover: Comparing tuning processes and model performance

Treesearch

E. Freeman; G. Moisen; J. Coulston; B. Wilson

2014-01-01

Random forests (RF) and stochastic gradient boosting (SGB), both involving an ensemble of classification and regression trees, are compared for modeling tree canopy cover for the 2011 National Land Cover Database (NLCD). The objectives of this study were twofold. First, sensitivity of RF and SGB to choices in tuning parameters was explored. Second, performance of the...
Predicting clicks of PubMed articles.

PubMed

Mao, Yuqing; Lu, Zhiyong

2013-01-01

Predicting the popularity or access usage of an article has the potential to improve the quality of PubMed searches. We can model the click trend of each article as its access changes over time by mining the PubMed query logs, which contain the previous access history for all articles. In this article, we examine the access patterns produced by PubMed users in two years (July 2009 to July 2011). We explore the time series of accesses for each article in the query logs, model the trends with regression approaches, and subsequently use the models for prediction. We show that the click trends of PubMed articles are best fitted with a log-normal regression model. This model allows the number of accesses an article receives and the time since it first becomes available in PubMed to be related via quadratic and logistic functions, with the model parameters to be estimated via maximum likelihood. Our experiments predicting the number of accesses for an article based on its past usage demonstrate that the mean absolute error and mean absolute percentage error of our model are 4.0% and 8.1% lower than the power-law regression model, respectively. The log-normal distribution is also shown to perform significantly better than a previous prediction method based on a human memory theory in cognitive science. This work warrants further investigation on the utility of such a log-normal regression approach towards improving information access in PubMed.
Predicting clicks of PubMed articles

PubMed Central

Mao, Yuqing; Lu, Zhiyong

2013-01-01

Predicting the popularity or access usage of an article has the potential to improve the quality of PubMed searches. We can model the click trend of each article as its access changes over time by mining the PubMed query logs, which contain the previous access history for all articles. In this article, we examine the access patterns produced by PubMed users in two years (July 2009 to July 2011). We explore the time series of accesses for each article in the query logs, model the trends with regression approaches, and subsequently use the models for prediction. We show that the click trends of PubMed articles are best fitted with a log-normal regression model. This model allows the number of accesses an article receives and the time since it first becomes available in PubMed to be related via quadratic and logistic functions, with the model parameters to be estimated via maximum likelihood. Our experiments predicting the number of accesses for an article based on its past usage demonstrate that the mean absolute error and mean absolute percentage error of our model are 4.0% and 8.1% lower than the power-law regression model, respectively. The log-normal distribution is also shown to perform significantly better than a previous prediction method based on a human memory theory in cognitive science. This work warrants further investigation on the utility of such a log-normal regression approach towards improving information access in PubMed. PMID:24551386
Bayesian semi-parametric analysis of Poisson change-point regression models: application to policy making in Cali, Colombia.

PubMed

Park, Taeyoung; Krafty, Robert T; Sánchez, Alvaro I

2012-07-27

A Poisson regression model with an offset assumes a constant baseline rate after accounting for measured covariates, which may lead to biased estimates of coefficients in an inhomogeneous Poisson process. To correctly estimate the effect of time-dependent covariates, we propose a Poisson change-point regression model with an offset that allows a time-varying baseline rate. When the nonconstant pattern of a log baseline rate is modeled with a nonparametric step function, the resulting semi-parametric model involves a model component of varying dimension and thus requires a sophisticated varying-dimensional inference to obtain correct estimates of model parameters of fixed dimension. To fit the proposed varying-dimensional model, we devise a state-of-the-art MCMC-type algorithm based on partial collapse. The proposed model and methods are used to investigate an association between daily homicide rates in Cali, Colombia and policies that restrict the hours during which the legal sale of alcoholic beverages is permitted. While simultaneously identifying the latent changes in the baseline homicide rate which correspond to the incidence of sociopolitical events, we explore the effect of policies governing the sale of alcohol on homicide rates and seek a policy that balances the economic and cultural dependencies on alcohol sales to the health of the public.
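The contrast between the constant-baseline model and the change-point model described above can be written out as a short worked equation. The notation is an assumption, since the abstract gives none: Y_t is the count at time t, N_t the exposure entering as the offset, x_t the covariates, and τ_1 < ... < τ_K the unknown change points, with τ_0 and τ_{K+1} marking the start and end of the observation period.

```latex
% Standard Poisson regression with an offset: constant log baseline rate \alpha
\log \mathbb{E}[Y_t] = \log N_t + \alpha + x_t^{\top}\beta

% Change-point version: the log baseline rate is a step function of time
\log \mathbb{E}[Y_t] = \log N_t + \alpha(t) + x_t^{\top}\beta,
\qquad
\alpha(t) = \sum_{k=1}^{K+1} \alpha_k \,\mathbf{1}\{\tau_{k-1} \le t < \tau_k\}
```

Because K is itself unknown, the number of step heights α_k changes with it, which is the varying-dimensional component the abstract refers to, while β keeps a fixed dimension.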
Evaluation of the Use of Zero-Augmented Regression Techniques to Model Incidence of Campylobacter Infections in FoodNet.

PubMed

Tremblay, Marlène; Crim, Stacy M; Cole, Dana J; Hoekstra, Robert M; Henao, Olga L; Döpfer, Dörte

2017-10-01

The Foodborne Diseases Active Surveillance Network (FoodNet) is currently using a negative binomial (NB) regression model to estimate temporal changes in the incidence of Campylobacter infection. FoodNet active surveillance in 483 counties collected data on 40,212 Campylobacter cases between years 2004 and 2011. We explored models that disaggregated these data to allow us to account for demographic, geographic, and seasonal factors when examining changes in incidence of Campylobacter infection. We hypothesized that modeling structural zeros and including demographic variables would increase the fit of FoodNet's Campylobacter incidence regression models. Five different models were compared: NB without demographic covariates, NB with demographic covariates, hurdle NB with covariates in the count component only, hurdle NB with covariates in both zero and count components, and zero-inflated NB with covariates in the count component only. Of the models evaluated, the nonzero-augmented NB model with demographic variables provided the best fit. Results suggest that even though zero inflation was not present at this level, individualizing the level of aggregation and using different model structures and predictors per site might be required to correctly distinguish between structural and observational zeros and account for risk factors that vary geographically.
Shrinkage Estimation of Varying Covariate Effects Based On Quantile Regression

PubMed Central

Peng, Limin; Xu, Jinfeng; Kutner, Nancy

2013-01-01

Varying covariate effects often manifest meaningful heterogeneity in covariate-response associations. In this paper, we adopt a quantile regression model that assumes linearity at a continuous range of quantile levels as a tool to explore such data dynamics. The consideration of potential non-constancy of covariate effects necessitates a new perspective for variable selection, which, under the assumed quantile regression model, is to retain variables that have effects on all quantiles of interest as well as those that influence only part of quantiles considered. Current work on l1-penalized quantile regression either does not concern varying covariate effects or may not produce consistent variable selection in the presence of covariates with partial effects, a practical scenario of interest. In this work, we propose a shrinkage approach by adopting a novel uniform adaptive LASSO penalty. The new approach enjoys easy implementation without requiring smoothing. Moreover, it can consistently identify the true model (uniformly across quantiles) and achieve the oracle estimation efficiency. We further extend the proposed shrinkage method to the case where responses are subject to random right censoring. Numerical studies confirm the theoretical results and support the utility of our proposals. PMID:25332515
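For readers unfamiliar with quantile regression, the sketch below illustrates only the check (pinball) loss that underlies it, not the uniform adaptive LASSO penalty proposed in the paper. The data and function names are hypothetical; the example shows that the constant minimising the check loss at level tau is a sample quantile.

```python
def pinball_loss(residual, tau):
    """Check loss for one residual: tau*r if r >= 0, (1 - tau)*|r| otherwise."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(values, q, tau):
    """Sum of check losses when every value is predicted by the constant q."""
    return sum(pinball_loss(v - q, tau) for v in values)


def best_constant(values, tau):
    """Search the observed values; a minimiser of the check loss is always
    attained at one of the data points."""
    return min(sorted(values), key=lambda q: total_loss(values, q, tau))


# Hypothetical sample, for illustration only.
data = [2.0, 3.0, 5.0, 7.0, 11.0, 13.0, 17.0, 19.0, 23.0]

median_fit = best_constant(data, 0.5)  # the sample median, 11.0
upper_fit = best_constant(data, 0.9)   # an upper quantile, 23.0
print(median_fit, upper_fit)
```

Replacing the constant q with a linear predictor, and fitting it at each quantile level of interest, gives the kind of quantile-varying coefficients the abstract discusses.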
[Exploring novel hyperspectral band and key index for leaf nitrogen accumulation in wheat].

PubMed

Yao, Xia; Zhu, Yan; Feng, Wei; Tian, Yong-Chao; Cao, Wei-Xing

2009-08-01

The objectives of the present study were to explore new sensitive spectral bands and ratio spectral indices based on precise analysis of ground-based hyperspectral information, and then to develop a regression model for estimating leaf N accumulation per unit soil area (LNA) in winter wheat (Triticum aestivum L.). Three field experiments were conducted with different N rates and cultivar types in three consecutive growing seasons, and time-course measurements were taken on canopy hyperspectral reflectance and LNA under the various treatments. By adopting the method of reduced precise sampling, the detailed ratio spectral indices (RSI) within the range of 350-2500 nm were constructed, and the quantitative relationships between LNA (g N m(-2)) and RSI (i, j) were analyzed. It was found that several key spectral bands and spectral indices were suitable for estimating LNA in wheat, and the spectral parameter RSI (990, 720) was the most reliable indicator for LNA in wheat. The regression model based on the best RSI was formulated as y = 5.095x - 6.040, with R2 of 0.814. In testing the derived equations with independent experimental data, the model on RSI (990, 720) had R2 of 0.847 and RRMSE of 24.7%. Thus, it is concluded that the hyperspectral parameter RSI (990, 720) and the derived regression model can be reliably used for estimating LNA in winter wheat. These results provide feasible key bands and a technical basis for developing a portable instrument for monitoring wheat nitrogen status and for extracting useful spectral information from remote sensing images.
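The reported equation can be applied as in the short sketch below. It assumes that RSI (i, j) is the simple ratio of canopy reflectance at i nm to that at j nm, which the abstract does not spell out, and the reflectance values are hypothetical; only the coefficients 5.095 and -6.040 come from the abstract.

```python
def ratio_spectral_index(reflectance, band_i, band_j):
    """RSI(i, j), assumed here to be reflectance at band i over band j (nm)."""
    return reflectance[band_i] / reflectance[band_j]


def estimate_lna(rsi):
    """Leaf N accumulation (g N per m^2) from y = 5.095x - 6.040."""
    return 5.095 * rsi - 6.040


# Hypothetical canopy reflectance at the two key bands (illustration only).
canopy = {990: 0.48, 720: 0.24}

rsi = ratio_spectral_index(canopy, 990, 720)  # 2.0
lna = estimate_lna(rsi)                       # about 4.15 g N per m^2
print(rsi, round(lna, 3))
```

Since the intercept is negative, the equation returns negative LNA for RSI values below roughly 1.19, so it should presumably only be applied within the RSI range observed in the calibration data.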
Non-ignorable missingness in logistic regression.

PubMed

Wang, Joanna J J; Bartlett, Mark; Ryan, Louise

2017-08-30

Nonresponses and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimation, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non-ignorable missingness where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in regression estimates and show the observed likelihood is non-identifiable under non-ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis to study changes in estimated parameters and the robustness of study conclusions against different assumptions. A Bayesian framework for model estimation is used as it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to follow-up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality-of-life. Copyright © 2017 John Wiley & Sons, Ltd.
An Exploring Model of Intelligence and Personality in Different Culture

ERIC Educational Resources Information Center

Wu, Yufeng; Qian, Guoying

2005-01-01

Middle school subjects of 13-21 years (from 4 nationalities) were used for studying the relationship between progressive cognition and personality characteristics by Raven's Standard Progressive Matrices and Eysenck's Personality Questionnaire. The results showed: (1) the correlation and stepwise regression were completely identical: P score was…
Epidemiological determinants of successful vaccine development.

PubMed

Nishiura, Hiroshi; Mizumoto, Kenji

2013-01-01

Epidemiological determinants of successful vaccine development were explored using measurable biological variables including antigenic stability and requirement of T-cell immunity. Employing a logistic regression model, we demonstrate that a high affinity with blood and immune cells and pathogen interactions (e.g. interference) would be the risk factors of failure for vaccine development.
Exploring the Relationships among Multicultural Training Experiences and Attitudes toward Diversity among Counseling Students

ERIC Educational Resources Information Center

Dickson, Ginger L.; Jepsen, David A.; Barbee, Phillip W.

2008-01-01

The authors surveyed a national sample of master's-level counseling students regarding their multicultural training experiences and their attitudes toward racial diversity and gender equity. Hierarchical regression models showed that student perceptions of program cultural ambience predicted positive cognitive attitudes toward racial diversity.…
Using Artificial Neural Networks in Educational Research: Some Comparisons with Linear Statistical Models.

ERIC Educational Resources Information Center

Everson, Howard T.; And Others

This paper explores the feasibility of neural computing methods such as artificial neural networks (ANNs) and abductory induction mechanisms (AIM) for use in educational measurement. ANNs and AIMS methods are contrasted with more traditional statistical techniques, such as multiple regression and discriminant function analyses, for making…
The Omitted Variable in Accounting Education Research: The Non-Traditional Student

ERIC Educational Resources Information Center

Mohrweis, Lawrence C.

2010-01-01

Few studies have examined the empirical question of whether nontraditional students are different from traditional students in learning performance. This study explores this issue. Specifically, is there a performance difference between traditional and nontraditional students in the first course in accounting? The model regressed students'…
Correlates and Risk Markers for Sleep Disturbance in Participants of the Autism Treatment Network

ERIC Educational Resources Information Center

Hollway, Jill A.; Aman, Michael G.; Butter, Eric

2013-01-01

We explored possible cognitive, behavioral, emotional, and physiological risk markers for sleep disturbance in children with autism spectrum disorders. Data from 1,583 children in the Autism Treatment Network were analyzed. Approximately 45 potential predictors were analyzed using hierarchical regression modeling. As medication could confound…
Refining cost-effectiveness analyses using the net benefit approach and econometric methods: an example from a trial of anti-depressant treatment.

PubMed

Sabes-Figuera, Ramon; McCrone, Paul; Kendricks, Antony

2013-04-01

Economic evaluation analyses can be enhanced by employing regression methods, allowing for the identification of important sub-groups and to adjust for imperfect randomisation in clinical trials or to analyse non-randomised data. The aim was to explore the benefits of combining regression techniques and the standard Bayesian approach to refine cost-effectiveness analyses using data from randomised clinical trials. Data from a randomised trial of anti-depressant treatment were analysed and a regression model was used to explore the factors that have an impact on the net benefit (NB) statistic with the aim of using these findings to adjust the cost-effectiveness acceptability curves. Exploratory sub-sample analyses were carried out to explore possible differences in cost-effectiveness. The analysis found that having suffered a previous similar depression is strongly correlated with a lower NB, independent of the outcome measure or follow-up point. In patients with previous similar depression, adding a selective serotonin reuptake inhibitor (SSRI) to supportive care for mild-to-moderate depression is probably cost-effective at the level used by the English National Institute for Health and Clinical Excellence to make recommendations. This analysis highlights the need for incorporation of econometric methods into cost-effectiveness analyses using the NB approach.
Strengthen forensic entomology in court--the need for data exploration and the validation of a generalised additive mixed model.

PubMed

Baqué, Michèle; Amendt, Jens

2013-01-01

Developmental data of juvenile blow flies (Diptera: Calliphoridae) are typically used to calculate the age of immature stages found on or around a corpse and thus to estimate a minimum post-mortem interval (PMI(min)). However, many of those data sets do not take into account that immature blow flies grow in a non-linear fashion. Linear models do not supply sufficiently reliable age estimates and may even lead to an erroneous determination of the PMI(min). In line with the Daubert standard and the need for improvements in forensic science, new statistical tools like smoothing methods and mixed models allow the modelling of non-linear relationships and expand the field of statistical analyses. The present study introduces the background and application of these statistical techniques by analysing a model which describes the development of the forensically important blow fly Calliphora vicina at different temperatures. The comparison of three statistical methods (linear regression, generalised additive modelling and generalised additive mixed modelling) clearly demonstrates that only the latter provided regression parameters that reflect the data adequately. We focus explicitly on both the exploration of the data--to assure their quality and to show the importance of checking it carefully prior to conducting the statistical tests--and the validation of the resulting models. Hence, we present a common method for evaluating and testing forensic entomological data sets by using, for the first time, generalised additive mixed models.

[Influences of environmental factors and interaction of several chemokines gene-environmental on systemic lupus erythematosus].

PubMed

Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui

2004-11-01

The aim was to explore the impact of environmental factors, daily lifestyle, psycho-social factors and the interactions between environmental factors and chemokine genes on systemic lupus erythematosus (SLE). A case-control study was carried out and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokine polymorphisms contributing to SLE were also analyzed with a logistic regression model. Nineteen factors were associated with SLE in univariate unconditional logistic regression. However, in multivariate unconditional logistic regression only five factors were associated with the disease: drinking well water (OR=0.099) was a protective factor for SLE, and multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630) and oral contraceptives were risk factors for SLE. The unconditional logistic regression model showed an interaction between eating irritable food and the -2518MCP-1G/G genotype (OR=4.387). No interaction among environmental factors contributing to SLE was found in this study. Many environmental factors were related to SLE, and there was an interaction between the -2518MCP-1G/G genotype and eating irritable food.
The Norwegian Healthier Goats program--modeling lactation curves using a multilevel cubic spline regression model.

PubMed

Nagel-Alne, G E; Krontveit, R; Bohlin, J; Valle, P S; Skjerve, E; Sølverød, L S

2014-07-01

In 2001, the Norwegian Goat Health Service initiated the Healthier Goats program (HG), with the aim of eradicating caprine arthritis encephalitis, caseous lymphadenitis, and Johne's disease (caprine paratuberculosis) in Norwegian goat herds. The aim of the present study was to explore how control and eradication of the above-mentioned diseases by enrolling in HG affected milk yield by comparison with herds not enrolled in HG. Lactation curves were modeled using a multilevel cubic spline regression model where farm, goat, and lactation were included as random effect parameters. The data material contained 135,446 registrations of daily milk yield from 28,829 lactations in 43 herds. The multilevel cubic spline regression model was applied to 4 categories of data: enrolled early, control early, enrolled late, and control late. For enrolled herds, the early and late notations refer to the situation before and after enrolling in HG; for nonenrolled herds (controls), they refer to development over time, independent of HG. Total milk yield increased in the enrolled herds after eradication: the total milk yields in the fourth lactation were 634.2 and 873.3 kg in enrolled early and enrolled late herds, respectively, and 613.2 and 701.4 kg in the control early and control late herds, respectively. Day of peak yield differed between enrolled and control herds. The day of peak yield came on d 6 of lactation for the control early category for parities 2, 3, and 4, indicating an inability of the goats to further increase their milk yield from the initial level. For enrolled herds, on the other hand, peak yield came between d 49 and 56, indicating a gradual increase in milk yield after kidding. Our results indicate that enrollment in the HG disease eradication program improved the milk yield of dairy goats considerably, and that the multilevel cubic spline regression was a suitable model for exploring effects of disease control and eradication on milk yield. 
Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Drivers of Variability in Public-Supply Water Use Across the Contiguous United States

NASA Astrophysics Data System (ADS)

Worland, Scott C.; Steinschneider, Scott; Hornberger, George M.

2018-03-01

This study explores the relationship between municipal water use and an array of climate, economic, behavioral, and policy variables across the contiguous U.S. The relationship is explored using Bayesian-hierarchical regression models for over 2,500 counties, 18 covariates, and three higher-level grouping variables. Additionally, a second analysis is included for 83 cities where water price and water conservation policy information is available. A hierarchical model using the nine climate regions (product of National Oceanic and Atmospheric Administration) as the higher-level groups results in the best out-of-sample performance, as estimated by the Widely Applicable Information Criterion, compared to counties grouped by urban continuum classification or primary economic activity. The regression coefficients indicate that the controls on water use are not uniform across the nation: e.g., counties in the Northeast and Northwest climate regions are more sensitive to social variables, whereas counties in the Southwest and East North Central climate regions are more sensitive to environmental variables. For the national city-level model, it appears that arid cities with a high cost of living and relatively low water bills sell more water per customer, but as with the county-level model, the effect of each variable depends heavily on where a city is located.
Improving virtual screening predictive accuracy of Human kallikrein 5 inhibitors using machine learning models.

PubMed

Fang, Xingang; Bagui, Sikha; Bagui, Subhash

2017-08-01

The readily available high throughput screening (HTS) data from the PubChem database provides an opportunity for mining of small molecules in a variety of biological systems using machine learning techniques. From the thousands of available molecular descriptors developed to encode useful chemical information representing the characteristics of molecules, descriptor selection is an essential step in building an optimal quantitative structural-activity relationship (QSAR) model. For the development of a systematic descriptor selection strategy, we need the understanding of the relationship between: (i) the descriptor selection; (ii) the choice of the machine learning model; and (iii) the characteristics of the target bio-molecule. In this work, we employed the Signature descriptor to generate a dataset on the Human kallikrein 5 (hK 5) inhibition confirmatory assay data and compared multiple classification models including logistic regression, support vector machine, random forest and k-nearest neighbor. Under optimal conditions, the logistic regression model provided extremely high overall accuracy (98%) and precision (90%), with good sensitivity (65%) in the cross validation test. In testing the primary HTS screening data with more than 200K molecular structures, the logistic regression model exhibited the capability of eliminating more than 99.9% of the inactive structures. As part of our exploration of the descriptor-model-target relationship, the excellent predictive performance of the combination of the Signature descriptor and the logistic regression model on the assay data of the Human kallikrein 5 (hK 5) target suggested a feasible descriptor/model selection strategy on similar targets. Copyright © 2017 Elsevier Ltd. All rights reserved.
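The combination of very high accuracy with moderate sensitivity reported above is typical of heavily imbalanced screening data, where inactive compounds dominate. The sketch below uses a purely hypothetical confusion matrix (these counts are not from the paper) chosen so that the three metrics land near the reported 98%, 90% and 65%.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision and sensitivity (recall) from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all calls that are correct
        "precision": tp / (tp + fp),     # share of predicted actives that are active
        "sensitivity": tp / (tp + fn),   # share of true actives that are found
    }


# Hypothetical counts: 100 actives among 2,100 compounds (illustration only).
counts = {"tp": 65, "fp": 7, "tn": 1993, "fn": 35}

metrics = classification_metrics(**counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts a classifier that labels every compound inactive would already reach about 95% accuracy, so precision and sensitivity are the more informative figures for judging a virtual screening model.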
Exploring simple, transparent, interpretable and predictive QSAR models for classification and quantitative prediction of rat toxicity of ionic liquids using OECD recommended guidelines.

PubMed

Das, Rudra Narayan; Roy, Kunal; Popelier, Paul L A

2015-11-01

The present study explores the chemical attributes of diverse ionic liquids responsible for their cytotoxicity in a rat leukemia cell line (IPC-81) by developing predictive classification as well as regression-based mathematical models. Simple and interpretable descriptors derived from a two-dimensional representation of the chemical structures along with quantum topological molecular similarity indices have been used for model development, employing unambiguous modeling strategies that strictly obey the guidelines of the Organization for Economic Co-operation and Development (OECD) for quantitative structure-activity relationship (QSAR) analysis. The structure-toxicity relationships that emerged from both classification and regression-based models were in accordance with the findings of some previous studies. The models suggested that the cytotoxicity of ionic liquids is dependent on the cationic surfactant action, long alkyl side chains, cationic lipophilicity as well as aromaticity, the presence of a dialkylamino substituent at the 4-position of the pyridinium nucleus and a bulky anionic moiety. The models have been transparently presented in the form of equations, thus allowing their easy transferability in accordance with the OECD guidelines. The models have also been subjected to rigorous validation tests proving their predictive potential and can hence be used for designing novel and "greener" ionic liquids. The major strength of the present study lies in the use of a diverse and large dataset, use of simple reproducible descriptors and compliance with the OECD norms. Copyright © 2015 Elsevier Ltd. All rights reserved.
Using decision trees to understand structure in missing data

PubMed Central

Tierney, Nicholas J; Harden, Fiona A; Harden, Maurice J; Mengersen, Kerrie L

2015-01-01

Objectives Demonstrate the application of decision trees—classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs)—to understand structure in missing data. Setting Data taken from employees at 3 different industrial sites in Australia. Participants 7915 observations were included. Materials and methods The approach was evaluated using an occupational health data set comprising results of questionnaires, medical tests and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits, and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured as compared to structured missingness. Discussion Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusions Researchers are encouraged to use CART and BRT models to explore and understand missing data. PMID:26124509
Quantum regression theorem and non-Markovianity of quantum dynamics

NASA Astrophysics Data System (ADS)

Guarnieri, Giacomo; Smirne, Andrea; Vacchini, Bassano

2014-08-01

We explore the connection between two recently introduced notions of non-Markovian quantum dynamics and the validity of the so-called quantum regression theorem. While non-Markovianity of a quantum dynamics has been defined looking at the behavior in time of the statistical operator, which determines the evolution of mean values, the quantum regression theorem makes statements about the behavior of system correlation functions of order two and higher. The comparison relies on an estimate of the validity of the quantum regression hypothesis, which can be obtained exactly evaluating two-point correlation functions. To this aim we consider a qubit undergoing dephasing due to interaction with a bosonic bath, comparing the exact evaluation of the non-Markovianity measures with the violation of the quantum regression theorem for a class of spectral densities. We further study a photonic dephasing model, recently exploited for the experimental measurement of non-Markovianity. It appears that while a non-Markovian dynamics according to either definition brings with itself violation of the regression hypothesis, even Markovian dynamics can lead to a failure of the regression relation.
Dose response explorer: an integrated open-source tool for exploring and modelling radiotherapy dose volume outcome relationships

NASA Astrophysics Data System (ADS)

El Naqa, I.; Suneja, G.; Lindsay, P. E.; Hope, A. J.; Alaly, J. R.; Vicic, M.; Bradley, J. D.; Apte, A.; Deasy, J. O.

2006-11-01

Radiotherapy treatment outcome models are a complicated function of treatment, clinical and biological factors. Our objective is to provide clinicians and scientists with an accurate, flexible and user-friendly software tool to explore radiotherapy outcomes data and build statistical tumour control or normal tissue complications models. The software tool, called the dose response explorer system (DREES), is based on Matlab, and uses a named-field structure array data type. DREES/Matlab in combination with another open-source tool (CERR) provides an environment for analysing treatment outcomes. DREES provides many radiotherapy outcome modelling features, including (1) fitting of analytical normal tissue complication probability (NTCP) and tumour control probability (TCP) models, (2) combined modelling of multiple dose-volume variables (e.g., mean dose, max dose, etc) and clinical factors (age, gender, stage, etc) using multi-term regression modelling, (3) manual or automated selection of logistic or actuarial model variables using bootstrap statistical resampling, (4) estimation of uncertainty in model parameters, (5) performance assessment of univariate and multivariate analyses using Spearman's rank correlation and chi-square statistics, boxplots, nomograms, Kaplan-Meier survival plots, and receiver operating characteristics curves, and (6) graphical capabilities to visualize NTCP or TCP prediction versus selected variable models using various plots. DREES provides clinical researchers with a tool customized for radiotherapy outcome modelling. DREES is freely distributed. We expect to continue developing DREES based on user feedback.
Wilderness recreation participation: Projections for the next half century

Treesearch

J. M. Bowker; D. Murphy; H. K. Cordell; D. B. K. English; J. C. Bergstrom; C. M. Starbuck; C. J. Betz; G. T. Green; P. Reed

2007-01-01

This paper explores the influence of demographic and spatial variables on individual participation in wildland area recreation. Data from the National Survey on Recreation and the Environment (NSRE) are combined with GIS-based distance measures to develop nonlinear regression models used to predict both participation and the number of days of participation in...
Social Influence on Information Technology Adoption and Sustained Use in Healthcare: A Hierarchical Bayesian Learning Method Analysis

ERIC Educational Resources Information Center

Hao, Haijing

2013-01-01

Information technology adoption and diffusion is currently a significant challenge in the healthcare delivery setting. This thesis includes three papers that explore social influence on information technology adoption and sustained use in the healthcare delivery environment using conventional regression models and novel hierarchical Bayesian…
Fixing Advising: A Model for Faculty Advising

ERIC Educational Resources Information Center

Crocker, Robert M.; Kahla, Marlene; Allen, Charlotte

2014-01-01

This paper addresses mandates to fix the advising process with a focus on faculty advising systems. Measures of student success and satisfaction, administrative issues, and faculty concerns are among the many factors discussed. Regression analysis is used to explore long-voiced faculty complaints that students do not follow advice. A case study is…
The Relationship of Institutional Tuition Discounts with Enrollment at Private, Not-for-Profit Institutions

ERIC Educational Resources Information Center

Lassila, Nathan E.

2010-01-01

Empirical studies exploring the impact of student aid on postsecondary enrollment often stop short of the specific examination of institutional tuition discounting. This research uses separate empirical ordinary least squares (OLS) regression models to examine three questions using public choice theory, positing that enrollment decisions may be…
Modelling Student Satisfaction and Motivation in the Integrated Educational Environment: An Empirical study

ERIC Educational Resources Information Center

Stukalina, Yulia

2016-01-01

Purpose: The purpose of this paper is to explore some issues related to enhancing the quality of educational services provided by a university in the agenda of integrating quality assurance activities and strategic management procedures. Design/methodology/approach: Employing multiple regression analysis the author has examined some factors that…
Self-Reported Weight Perceptions, Dieting Behavior, and Breakfast Eating among High School Adolescents

ERIC Educational Resources Information Center

Zullig, Keith; Ubbes, Valerie A.; Pyle, Jennifer; Valois, Robert F.

2006-01-01

This study explored the relationships among weight perceptions, dieting behavior, and breakfast eating in 4597 public high school adolescents using the Centers for Disease Control and Prevention Youth Risk Behavior Survey. Adjusted multiple logistic regression models were constructed separately for race and gender groups via SUDAAN (Survey Data…
Predicting Knowledge Workers' Participation in Voluntary Learning with Employee Characteristics and Online Learning Tools

ERIC Educational Resources Information Center

Hicks, Catherine

2018-01-01

Purpose: This paper aims to explore predicting employee learning activity via employee characteristics and usage for two online learning tools. Design/methodology/approach: Statistical analysis focused on observational data collected from user logs. Data are analyzed via regression models. Findings: Findings are presented for over 40,000…
Stream-flow forecasting using extreme learning machines: A case study in a semi-arid region in Iraq

NASA Astrophysics Data System (ADS)

Yaseen, Zaher Mundher; Jaafar, Othman; Deo, Ravinesh C.; Kisi, Ozgur; Adamowski, Jan; Quilty, John; El-Shafie, Ahmed

2016-11-01

Monthly stream-flow forecasting can yield important information for hydrological applications including sustainable design of rural and urban water management systems, optimization of water resource allocations, water use, pricing and water quality assessment, and agriculture and irrigation operations. The motivation for exploring and developing expert predictive models is an ongoing endeavor for hydrological applications. In this study, the potential of a relatively new data-driven method, namely the extreme learning machine (ELM) method, was explored for forecasting monthly stream-flow discharge rates in the Tigris River, Iraq. The ELM algorithm is a single-layer feedforward neural network (SLFN) which randomly selects the input weights and hidden layer biases and analytically determines the output weights of the SLFN. Based on the partial autocorrelation functions of historical stream-flow data, a set of five input combinations with lagged stream-flow values are employed to establish the best forecasting model. A comparative investigation is conducted to evaluate the performance of the ELM compared to other data-driven models: support vector regression (SVR) and generalized regression neural network (GRNN). The forecasting metrics defined as the correlation coefficient (r), Nash-Sutcliffe efficiency (ENS), Willmott's Index (WI), root-mean-square error (RMSE) and mean absolute error (MAE) computed between the observed and forecasted stream-flow data are employed to assess the ELM model's effectiveness. The results revealed that the ELM model outperformed the SVR and the GRNN models across a number of statistical measures. In quantitative terms, superiority of ELM over SVR and GRNN models was exhibited by ENS = 0.578, 0.378 and 0.144, r = 0.799, 0.761 and 0.468 and WI = 0.853, 0.802 and 0.689, respectively, and the ELM model attained a lower RMSE value by approximately 21.3% (relative to SVR) and by approximately 44.7% (relative to GRNN).
Based on the findings of this study, several recommendations were suggested for further exploration of the ELM model in hydrological forecasting problems.
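The five forecasting metrics named in the abstract above can be sketched as follows. The abstract does not give formulas, so this assumes the commonly used definitions of each metric; the function name `forecast_metrics` and the sample values are illustrative only and are not taken from the study.

```python
import numpy as np

def forecast_metrics(obs, sim):
    """Return r, ENS, WI, RMSE and MAE between observed and forecasted series.

    Assumes the usual textbook definitions; the paper may use variants.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = sim - obs
    obs_mean = obs.mean()
    # Pearson correlation coefficient between observations and forecasts
    r = np.corrcoef(obs, sim)[0, 1]
    # Nash-Sutcliffe efficiency: 1 minus error variance over observed variance
    ens = 1.0 - np.sum(err**2) / np.sum((obs - obs_mean) ** 2)
    # Willmott's index of agreement
    potential = (np.abs(sim - obs_mean) + np.abs(obs - obs_mean)) ** 2
    wi = 1.0 - np.sum(err**2) / np.sum(potential)
    rmse = np.sqrt(np.mean(err**2))
    mae = np.mean(np.abs(err))
    return {"r": r, "ENS": ens, "WI": wi, "RMSE": rmse, "MAE": mae}

# Hypothetical example: a forecast that is biased high by one unit everywhere
metrics = forecast_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

A constant bias leaves r at 1 while lowering ENS and WI, which is one reason studies of this kind report several metrics side by side.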
Characterizing multivariate decoding models based on correlated EEG spectral features

PubMed Central

McFarland, Dennis J.

2013-01-01

Objective Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. PMID:23466267
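The interpretative problem described above can be illustrated with a tiny constructed example. This is a sketch on made-up numbers, not the study's EEG data: two predictors are almost identical, the target depends only on the first, and the multivariate weight of the second is zero even though its univariate correlation with the target is above 0.99.

```python
import numpy as np

# Constructed data (hypothetical): x2 is x1 plus a small component orthogonal to x1
x1 = np.array([1.0, 1.0, -1.0, -1.0])
noise = np.array([0.1, -0.1, 0.1, -0.1])  # sums to zero and is orthogonal to x1
x2 = x1 + noise
y = x1.copy()  # the target depends on x1 only

# Multivariate least-squares weights (all variables already have zero mean)
X = np.column_stack([x1, x2])
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

# Univariate correlation of each predictor with the target
univariate_r = np.array([np.corrcoef(x1, y)[0, 1], np.corrcoef(x2, y)[0, 1]])
```

Reading the weight pattern alone would suggest that `x2` carries no information about the target, which is why the abstract advises comparing weights with univariate statistics.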
Lidar aboveground vegetation biomass estimates in shrublands: Prediction, uncertainties and application to coarser scales

USGS Publications Warehouse

Li, Aihua; Dhakal, Shital; Glenn, Nancy F.; Spaete, Luke P.; Shinneman, Douglas; Pilliod, David S.; Arkle, Robert; McIlroy, Susan

2017-01-01

Our study objectives were to model the aboveground biomass in a xeric shrub-steppe landscape with airborne light detection and ranging (Lidar) and explore the uncertainty associated with the models we created. We incorporated vegetation vertical structure information obtained from Lidar with ground-measured biomass data, allowing us to scale shrub biomass from small field sites (1 m subplots and 1 ha plots) to a larger landscape. A series of airborne Lidar-derived vegetation metrics were trained and linked with the field-measured biomass in Random Forests (RF) regression models. A Stepwise Multiple Regression (SMR) model was also explored as a comparison. Our results demonstrated that the important predictors from Lidar-derived metrics had a strong correlation with field-measured biomass in the RF regression models with a pseudo R² of 0.76 and RMSE of 125 g/m² for shrub biomass and a pseudo R² of 0.74 and RMSE of 141 g/m² for total biomass, and a weak correlation with field-measured herbaceous biomass. The SMR results were similar but slightly better than RF, explaining 77–79% of the variance, with RMSE ranging from 120 to 129 g/m² for shrub and total biomass, respectively. We further explored the computational efficiency and relative accuracies of using point cloud and raster Lidar metrics at different resolutions (1 m to 1 ha). Metrics derived from the Lidar point cloud processing led to improved biomass estimates at nearly all resolutions in comparison to raster-derived Lidar metrics. Only at 1 m were the results from the point cloud and raster products nearly equivalent. The best Lidar prediction models of biomass at the plot-level (1 ha) were achieved when Lidar metrics were derived from an average of fine resolution (1 m) metrics to minimize boundary effects and to smooth variability. 
Overall, both RF and SMR methods explained more than 74% of the variance in biomass, with the most important Lidar variables being associated with vegetation structure and statistical measures of this structure (e.g., standard deviation of height was a strong predictor of biomass). Using our model results, we developed spatially-explicit Lidar estimates of total and shrub biomass across our study site in the Great Basin, U.S.A., for monitoring and planning in this imperiled ecosystem.
Bone marrow endothelial progenitors augment atherosclerotic plaque regression in a mouse model of plasma lipid lowering

PubMed Central

Yao, Longbiao; Heuser-Baker, Janet; Herlea-Pana, Oana; Iida, Ryuji; Wang, Qilong; Zou, Ming-Hui; Barlic-Dicen, Jana

2012-01-01

The major event initiating atherosclerosis is hypercholesterolemia-induced disruption of vascular endothelium integrity. In settings of endothelial damage, endothelial progenitor cells (EPCs) are mobilized from bone marrow into circulation and home to sites of vascular injury where they aid endothelial regeneration. Given the beneficial effects of EPCs in vascular repair, we hypothesized that these cells play a pivotal role in atherosclerosis regression. We tested our hypothesis in the atherosclerosis-prone mouse model in which hypercholesterolemia, one of the main factors affecting EPC homeostasis, is reversible (Reversa mice). In these mice normalization of plasma lipids decreased atherosclerotic burden; however, plaque regression was incomplete. To explore whether endothelial progenitors contribute to atherosclerosis regression, bone marrow EPCs from a transgenic strain expressing green fluorescent protein under the control of endothelial cell-specific Tie2 promoter (Tie2-GFP+) were isolated. These cells were then adoptively transferred into atheroregressing Reversa recipients where they augmented plaque regression induced by reversal of hypercholesterolemia. Advanced plaque regression correlated with engraftment of Tie2-GFP+ EPCs into endothelium and resulted in an increase in atheroprotective nitric oxide and improved vascular relaxation. Similarly augmented plaque regression was also detected in regressing Reversa mice treated with the stem cell mobilizer AMD3100 which also mobilizes EPCs to peripheral blood. We conclude that correction of hypercholesterolemia in Reversa mice leads to partial plaque regression that can be augmented by AMD3100 treatment or by adoptive transfer of EPCs. This suggests that direct cell therapy or indirect progenitor cell mobilization therapy may be used in combination with statins to treat atherosclerosis. PMID:23081735
Homogeneity Pursuit

PubMed Central

Ke, Tracy; Fan, Jianqing; Wu, Yichao

2014-01-01

This paper explores the homogeneity of coefficients in high-dimensional regression, which extends the sparsity concept and is more general and suitable for many applications. Homogeneity arises when regression coefficients corresponding to neighboring geographical regions or a similar cluster of covariates are expected to be approximately the same. Sparsity corresponds to a special case of homogeneity with a large cluster of known atom zero. In this article, we propose a new method called clustering algorithm in regression via data-driven segmentation (CARDS) to explore homogeneity. New mathematics are provided on the gain that can be achieved by exploring homogeneity. Statistical properties of two versions of CARDS are analyzed. In particular, the asymptotic normality of our proposed CARDS estimator is established, which reveals better estimation accuracy for homogeneous parameters than that without homogeneity exploration. When our methods are combined with sparsity exploration, further efficiency can be achieved beyond the exploration of sparsity alone. This provides additional insights into the power of exploring low-dimensional structures in high-dimensional regression: homogeneity and sparsity. Our results also shed light on the properties of the fused Lasso. The newly developed method is further illustrated by simulation studies and applications to real data. Supplementary materials for this article are available online. PMID:26085701

Africa Knowledge, Data Source, and Analytic Effort (KDAE) Exploration

DTIC Science & Technology

2012-08-20

The World Bank’s web site contains a substantial amount of data, organized by 18 broad topic areas like Agriculture and Rural Development, Education...wb.indicators) <- c(" Agriculture & Rural Development", "Aid Effectiveness", "Climate Change", "Economic Policy & External Debt", "Education", "Energy...Services,Equality))) IV. Model Building #### Function to iterate regression models IOT pick the best ones 75 library(MASS) data.best
Animal models of maternal high fat diet exposure and effects on metabolism in offspring: a meta-regression analysis.

PubMed

Ribaroff, G A; Wastnedge, E; Drake, A J; Sharpe, R M; Chambers, T J G

2017-06-01

Animal models of maternal high fat diet (HFD) demonstrate perturbed offspring metabolism although the effects differ markedly between models. We assessed studies investigating metabolic parameters in the offspring of HFD fed mothers to identify factors explaining these inter-study differences. A total of 171 papers were identified, which provided data from 6047 offspring. Data were extracted regarding body weight, adiposity, glucose homeostasis and lipidaemia. Information regarding the macronutrient content of diet, species, time point of exposure and gestational weight gain were collected and utilized in meta-regression models to explore predictive factors. Publication bias was assessed using Egger's regression test. Maternal HFD exposure did not affect offspring birthweight but increased weaning weight, final bodyweight, adiposity, triglyceridaemia, cholesterolaemia and insulinaemia in both female and male offspring. Hyperglycaemia was found in female offspring only. Meta-regression analysis identified lactational HFD exposure as a key moderator. The fat content of the diet did not correlate with any outcomes. There was evidence of significant publication bias for all outcomes except birthweight. Maternal HFD exposure was associated with perturbed metabolism in offspring, but the variation between studies was not accounted for by dietary constituents, species, strain or maternal gestational weight gain. Specific weaknesses in experimental design predispose many of the results to bias. © 2017 The Authors. Obesity Reviews published by John Wiley & Sons Ltd on behalf of World Obesity Federation.
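Egger's regression test, mentioned in the abstract above, is usually formulated as an ordinary regression of each study's standardized effect on its precision, with the intercept measuring funnel-plot asymmetry. The sketch below assumes that standard formulation; the function name and the input values are hypothetical and unrelated to the 171 papers analyzed.

```python
import numpy as np

def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE; return (intercept, slope, SE of intercept).

    An intercept far from zero relative to its standard error suggests
    small-study effects such as publication bias.
    """
    effects = np.asarray(effects, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    z = effects / std_errors          # standardized effects
    precision = 1.0 / std_errors      # study precision
    X = np.column_stack([np.ones_like(precision), precision])
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ coef
    dof = len(z) - 2                  # two fitted parameters
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[0], coef[1], np.sqrt(cov[0, 0])

# Hypothetical studies that all report the same effect: no asymmetry expected
intercept, slope, intercept_se = egger_regression(
    [0.5, 0.5, 0.5, 0.5, 0.5], [0.1, 0.2, 0.3, 0.4, 0.5]
)
```

When every study reports the same effect, the standardized effect is exactly proportional to precision, so the intercept is zero and the slope recovers the common effect.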
A Continuous Threshold Expectile Model.

PubMed

Zhang, Feipeng; Li, Qunhua

2017-12-01

Expectile regression is a useful tool for exploring the relation between the response and the explanatory variables beyond the conditional mean. A continuous threshold expectile regression is developed for modeling data in which the effect of a covariate on the response variable is linear but varies below and above an unknown threshold in a continuous way. The estimators for the threshold and the regression coefficients are obtained using a grid search approach. The asymptotic properties for all the estimators are derived, and the estimator for the threshold is shown to achieve root-n consistency. A weighted CUSUM type test statistic is proposed for the existence of a threshold at a given expectile, and its asymptotic properties are derived under both the null and the local alternative models. This test only requires fitting the model under the null hypothesis in the absence of a threshold, so it is computationally more efficient than the likelihood-ratio type tests. Simulation studies show that the proposed estimators and test have desirable finite sample performance in both homoscedastic and heteroscedastic cases. Applying the proposed method to Dutch growth data and baseball pitcher salary data reveals interesting insights. The proposed method is implemented in the R package cthreshER.
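The grid search idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the cthreshER implementation or its API: it assumes a single covariate, the model y = b0 + b1·x + b2·max(x − t, 0), and an iteratively reweighted least-squares fit of the asymmetric squared loss for each candidate threshold t. All function names and the data are hypothetical.

```python
import numpy as np

def fit_expectile(X, y, tau, n_iter=100, tol=1e-10):
    """Fit a linear expectile regression at level tau by reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    for _ in range(n_iter):
        # Weight tau for non-negative residuals, 1 - tau for negative ones
        w = np.where(y - X @ beta >= 0, tau, 1.0 - tau)
        Xw = X * w[:, None]
        new_beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    resid = y - X @ beta
    loss = np.sum(np.where(resid >= 0, tau, 1.0 - tau) * resid**2)
    return beta, loss

def fit_threshold_expectile(x, y, tau, grid):
    """Pick the threshold on the grid that minimizes the asymmetric squared loss."""
    best = None
    for t in grid:
        # Continuous bent-line design: intercept, x, and the hinge term (x - t)+
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - t, 0.0)])
        beta, loss = fit_expectile(X, y, tau)
        if best is None or loss < best[2]:
            best = (t, beta, loss)
    return best

# Hypothetical noiseless data with a kink at x = 4
x = np.linspace(0.0, 10.0, 101)
y = 1.0 + 2.0 * x + 3.0 * np.maximum(x - 4.0, 0.0)
threshold, coefs, loss = fit_threshold_expectile(x, y, tau=0.8, grid=np.linspace(1.0, 9.0, 9))
```

The hinge term keeps the fitted line continuous at the threshold, which is what distinguishes this model from fitting two unrelated segments.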
Work performance decrements are associated with Australian working conditions, particularly the demand to work longer hours.

PubMed

Holden, Libby; Scuffham, Paul A; Hilton, Michael F; Vecchio, Nerina N; Whiteford, Harvey A

2010-03-01

To demonstrate the importance of including a range of working conditions in models exploring the association between health- and work-related performance. The Australian Work Outcomes Research Cost-benefit study cross-sectional screening data set was used to explore health-related absenteeism and work performance losses on a sample of approximately 78,000 working Australians, including available demographic and working condition factors. Data collected using the World Health Organization Health and Productivity Questionnaire were analyzed with negative binomial logistic regression and multinomial logistic regressions for absenteeism and work performance, respectively. Hours expected to work, annual wage, and job insecurity play a vital role in the association between health- and work-related performance for both work attendance and self-reported work performance. Australian working conditions are contributing to both absenteeism and low work performance, regardless of health status.
Characterizing mammographic images by using generic texture features

PubMed Central

2012-01-01

Introduction Although mammographic density is an established risk factor for breast cancer, its use is limited in clinical practice because of a lack of automated and standardized measurement methods. The aims of this study were to evaluate a variety of automated texture features in mammograms as risk factors for breast cancer and to compare them with the percentage mammographic density (PMD) by using a case-control study design. Methods A case-control study including 864 cases and 418 controls was analyzed automatically. Four hundred seventy features were explored as possible risk factors for breast cancer. These included statistical features, moment-based features, spectral-energy features, and form-based features. An elaborate variable selection process using logistic regression analyses was performed to identify those features that were associated with case-control status. In addition, PMD was assessed and included in the regression model. Results Of the 470 image-analysis features explored, 46 remained in the final logistic regression model. An area under the curve of 0.79, with an odds ratio per standard deviation change of 2.88 (95% CI, 2.28 to 3.65), was obtained with validation data. Adding the PMD did not improve the final model. Conclusions Using texture features to predict the risk of breast cancer appears feasible. PMD did not show any additional value in this study. With regard to the features assessed, most of the analysis tools appeared to reflect mammographic density, although some features did not correlate with PMD. It remains to be investigated in larger case-control studies whether these features can contribute to increased prediction accuracy. PMID:22490545
Evaluation of logistic regression models and effect of covariates for case-control study in RNA-Seq analysis.

PubMed

Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L

2017-02-06

Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
Local linear regression for function learning: an analysis based on sample discrepancy.

PubMed

Cervellera, Cristiano; Macciò, Danilo

2014-11-01

Local linear regression models, a kind of nonparametric structure that locally performs a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and compute the empirical risk is considered. This allows the case in which the samples come from a random external source and the one in which the input space can be freely explored to be treated in the same way. Both consistency of the ERM procedure and approximating capabilities of the estimator are analyzed, proving conditions to ensure convergence. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided.
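A one-dimensional sketch of the two ingredients discussed above, a local linear estimator and a low-discrepancy sampling of the input space, might look like this. The Gaussian kernel, the bandwidth value, the van der Corput sequence as the low-discrepancy example, and all names are assumptions made for illustration; the abstract does not state which choices the authors used.

```python
import numpy as np

def van_der_corput(n, base=2):
    """First n points of the van der Corput low-discrepancy sequence in [0, 1)."""
    seq = []
    for i in range(1, n + 1):
        value, denom = 0.0, 1.0
        while i > 0:
            i, digit = divmod(i, base)  # peel off digits in the given base
            denom *= base
            value += digit / denom      # mirror the digits around the radix point
        seq.append(value)
    return np.array(seq)

def local_linear_predict(x_train, y_train, x0, bandwidth):
    """Weighted least-squares line centred at x0; its intercept is the estimate."""
    d = x_train - x0
    w = np.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weights (assumed)
    X = np.column_stack([np.ones_like(d), d])
    Xw = X * w[:, None]
    beta = np.linalg.solve(X.T @ Xw, Xw.T @ y_train)
    return beta[0]

# Hypothetical target function sampled at low-discrepancy points
x_train = van_der_corput(64)
y_train = 2.0 * x_train + 1.0
estimate = local_linear_predict(x_train, y_train, x0=0.37, bandwidth=0.2)
```

Because the local model is itself linear, it reproduces a linear target exactly, so any error on real problems comes from curvature of the target and from how evenly the observation points cover the input space.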
Introduction to statistical modelling 2: categorical variables and interactions in linear regression.

PubMed

Lunt, Mark

2015-07-01

In the first article in this series we explored the use of linear regression to predict an outcome variable from a number of predictive factors. It assumed that the predictive factors were measured on an interval scale. However, this article shows how categorical variables can also be included in a linear regression model, enabling predictions to be made separately for different groups and allowing for testing the hypothesis that the outcome differs between groups. The use of interaction terms to measure whether the effect of a particular predictor variable differs between groups is also explained. An alternative approach to testing the difference between groups of the effect of a given predictor, which consists of measuring the effect in each group separately and seeing whether the statistical significance differs between the groups, is shown to be misleading. © The Author 2013. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
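The way a categorical variable and an interaction term enter a linear regression can be shown with a small sketch. It assumes a two-level group coded as 0/1 and one continuous predictor; the function name and the data are made up for illustration and do not come from the article.

```python
import numpy as np

def fit_group_interaction(x, group, y):
    """Fit y = b0 + b1*x + b2*group + b3*(x*group) by least squares.

    b2 is the difference in intercepts between the groups and b3 is the
    difference in slopes, which is what an interaction term measures.
    """
    x = np.asarray(x, dtype=float)
    group = np.asarray(group, dtype=float)  # dummy variable: 0 or 1
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x, group, x * group])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical data: group 0 follows y = 1 + 2x, group 1 follows y = 4 + 5x
x = [0, 1, 2, 3, 0, 1, 2, 3]
group = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 3, 5, 7, 4, 9, 14, 19]
b0, b1, b2, b3 = fit_group_interaction(x, group, y)
```

Testing whether `b3` differs from zero addresses the difference in slopes directly, unlike fitting each group separately and comparing which fits reach statistical significance, the approach the abstract describes as misleading.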
Computational intelligence models to predict porosity of tablets using minimum features

PubMed Central

Khalid, Mohammad Hassan; Kazemi, Pezhman; Perez-Gandarillas, Lucia; Michrafy, Abderrahim; Szlęk, Jakub; Jachowicz, Renata; Mendyk, Aleksander

2017-01-01

The effects of different formulations and manufacturing process conditions on the physical properties of a solid dosage form are of importance to the pharmaceutical industry. It is vital to have in-depth understanding of the material properties and governing parameters of its processes in response to different formulations. Understanding the mentioned aspects will allow tighter control of the process, leading to implementation of quality-by-design (QbD) practices. Computational intelligence (CI) offers an opportunity to create empirical models that can be used to describe the system and predict future outcomes in silico. CI models can help explore the behavior of input parameters, unlocking deeper understanding of the system. This research endeavor presents CI models to predict the porosity of tablets created by roll-compacted binary mixtures, which were milled and compacted under systematically varying conditions. CI models were created using tree-based methods, artificial neural networks (ANNs), and symbolic regression trained on an experimental data set and screened using root-mean-square error (RMSE) scores. The experimental data were composed of proportion of microcrystalline cellulose (MCC) (in percentage), granule size fraction (in micrometers), and die compaction force (in kilonewtons) as inputs and porosity as an output. The resulting models show impressive generalization ability, with ANNs (normalized root-mean-square error [NRMSE] =1%) and symbolic regression (NRMSE =4%) as the best-performing methods, also exhibiting reliable predictive behavior when presented with a challenging external validation data set (best achieved symbolic regression: NRMSE =3%). Symbolic regression demonstrates the transition from the black box modeling paradigm to more transparent predictive models. Predictive performance and feature selection behavior of CI models hints at the most important variables within this factor space. PMID:28138223
Additive Genetic Variability and the Bayesian Alphabet

PubMed Central

Gianola, Daniel; de los Campos, Gustavo; Hill, William G.; Manfredi, Eduardo; Fernando, Rohan

2009-01-01

The use of all available molecular markers in statistical models for prediction of quantitative traits has led to what could be termed a genomic-assisted selection paradigm in animal and plant breeding. This article provides a critical review of some theoretical and statistical concepts in the context of genomic-assisted genetic evaluation of animals and crops. First, relationships between the (Bayesian) variance of marker effects in some regression models and additive genetic variance are examined under standard assumptions. Second, the connection between marker genotypes and resemblance between relatives is explored, and linkages between a marker-based model and the infinitesimal model are reviewed. Third, issues associated with the use of Bayesian models for marker-assisted selection, with a focus on the role of the priors, are examined from a theoretical angle. The sensitivity of a Bayesian specification that has been proposed (called “Bayes A”) with respect to priors is illustrated with a simulation. Methods that can solve potential shortcomings of some of these Bayesian regression procedures are discussed briefly. PMID:19620397
Wilderness and primitive area recreation participation and consumption: an examination of demographic and spatial factors

Treesearch

J. Michael Bowker; D. Murphy; H. Ken Cordell; Donald B.K. English; J.C. Bergstrom; C.M. Starbuck; C.J. Betz; G.T. Green

2006-01-01

This paper explores the influence of demographic and spatial variables on individual participation and consumption of wildland area recreation. Data from the National Survey on Recreation and the Environment are combined with geographical information system-based distance measures to develop nonlinear regression models used to predict both participation and the number...
Executing and Teaching Science--The Breast Cancer Genetics and Technology-Rich Curriculum Professional Development Studies of a Science Educator

ERIC Educational Resources Information Center

Wragg, Regina E.

2013-01-01

This dissertation presents my explorations in both molecular biology and science education research. In study one, we determined the "ADIPOQ" and "ADIPORI" genotypes of 364 White and 148 Black BrCa patients and used dominant model univariate logistic regression analyses to determine individual SNP and haplotype associations…
Community-Based Addiction Treatment Staff Attitudes about the Usefulness of Evidence-Based Addiction Treatment and CBO Organizational Linkages to Research Institutions

ERIC Educational Resources Information Center

Lundgren, Lena; Krull, Ivy; Zerden, Lisa de Saxe; McCarty, Dennis

2011-01-01

This national study of community-based addiction-treatment organizations' (CBOs) implementation of evidence-based practices explored CBO Program Directors' (n = 296) and clinical staff (n = 518) attitudes about the usefulness of science-based addiction treatment. Through multivariable regression modeling, the study identified that identical…
Differences in Household Saving between Non-Hispanic White and Hispanic Households

ERIC Educational Resources Information Center

Fisher, Patti J.; Hsu, Chungwen

2012-01-01

This study uses the 2007 Survey of Consumer Finances to empirically explore differences in saving behavior between Hispanic (N = 533) and non-Hispanic White (N = 2,473) households. The results of the logistic regression model show that self-employed Hispanics were more likely to save, while self-employment was not significant for Whites. Being…
On the interest of combining an analog model to a regression model for the adaptation of the downscaling link. Application to probabilistic prediction of precipitation over France.

NASA Astrophysics Data System (ADS)

Chardon, Jérémy; Hingray, Benoit; Favre, Anne-Catherine

2016-04-01

Scenarios of surface weather required for impact studies have to be unbiased and adapted to the space and time scales of the considered hydro-systems. Hence, surface weather scenarios obtained from global climate models and/or numerical weather prediction models are not directly appropriate. Outputs of these models have to be post-processed, which is often done with Statistical Downscaling Methods (SDMs). Among those SDMs, approaches based on regression are often applied. For a given station, a regression link can be established between a set of large scale atmospheric predictors and the surface weather variable. These links are then used for the prediction of the latter. However, physical processes generating surface weather vary in time. This is well known for precipitation, for instance. The most relevant predictors and the regression link are also likely to vary in time. A better prediction skill is thus classically obtained with a seasonal stratification of the data. Another strategy is to identify the most relevant predictor set and establish the regression link from dates that are similar - or analog - to the target date. In practice, these dates can be selected using an analog model. In this study, we explore the possibility of improving the local performance of an analog model - where the analogy is applied to the geopotential heights at 1000 and 500 hPa - using additional local scale predictors for the probabilistic prediction of the Safran precipitation over France. For each prediction day, the prediction is obtained from two GLM regression models - for both the occurrence and the quantity of precipitation - for which predictors and parameters are estimated from the analog dates. Firstly, the resulting combined model noticeably increases the prediction performance by adapting the downscaling link for each prediction day. 
Secondly, the selected predictors for a given prediction depend on the large scale situation and on the considered region. Finally, even with such an adaptive predictor identification, the downscaling link appears to be robust: for the same prediction day, predictors selected for different locations of a given region are similar and the regression parameters are consistent within the region of interest.
Calibration Designs for Non-Monolithic Wind Tunnel Force Balances

NASA Technical Reports Server (NTRS)

Johnson, Thomas H.; Parker, Peter A.; Landman, Drew

2010-01-01

This research paper investigates current experimental designs and regression models for calibrating internal wind tunnel force balances of non-monolithic design. Such calibration methods are necessary for this class of balance because it has an electrical response that is dependent upon the sign of the applied forces and moments. This dependency gives rise to discontinuities in the response surfaces that are not easily modeled using traditional response surface methodologies. An analysis of current recommended calibration models is shown to lead to correlated response model terms. Alternative modeling methods are explored which feature orthogonal or near-orthogonal terms.
PM10 modeling in the Oviedo urban area (Northern Spain) by using multivariate adaptive regression splines

NASA Astrophysics Data System (ADS)

Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza

2014-10-01

The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. 
Finally, the conclusions of this research work are presented on the basis of these numerical calculations using the multivariate adaptive regression splines (MARS) technique.
Forecasting carbon dioxide emissions based on a hybrid of mixed data sampling regression model and back propagation neural network in the USA.

PubMed

Zhao, Xin; Han, Meng; Ding, Lili; Calin, Adrian Cantemir

2018-01-01

The accurate forecast of carbon dioxide emissions is critical for policy makers to take proper measures to establish a low carbon society. This paper discusses a hybrid of the mixed data sampling (MIDAS) regression model and BP (back propagation) neural network (MIDAS-BP model) to forecast carbon dioxide emissions. Such analysis uses mixed frequency data to study the effects of quarterly economic growth on annual carbon dioxide emissions. The forecasting ability of MIDAS-BP is remarkably better than that of the MIDAS, ordinary least squares (OLS), polynomial distributed lags (PDL), autoregressive distributed lags (ADL), and auto-regressive moving average (ARMA) models. The MIDAS-BP model is suitable for forecasting carbon dioxide emissions for both the short and longer term. This research is expected to influence the methodology for forecasting carbon dioxide emissions by improving the forecast accuracy. Empirical results show that economic growth has both negative and positive effects on carbon dioxide emissions that last 15 quarters. Carbon dioxide emissions are also affected by their own change within 3 years. Therefore, there is a need for policy makers to explore an alternative way to develop the economy, especially applying new energy policies to establish a low carbon society.
Association of Emotional Labor and Occupational Stressors with Depressive Symptoms among Women Sales Workers at a Clothing Shopping Mall in the Republic of Korea: A Cross-Sectional Study

PubMed Central

Chung, Yuh-Jin; Jung, Woo-Chul

2017-01-01

In the distribution service industry, sales people often experience multiple occupational stressors such as excessive emotional labor, workplace mistreatment, and job insecurity. The present study aimed to explore the associations of these stressors with depressive symptoms among women sales workers at a clothing shopping mall in Korea. A cross-sectional study was conducted on 583 women, comprising clothing sales workers and manual workers, using a structured questionnaire to assess demographic factors, occupational stressors, and depressive symptoms. Multiple regression analyses were performed to explore the association of these stressors with depressive symptoms. Scores for job stress subscales such as job demand, job control, and job insecurity were higher among sales workers than among manual workers (p < 0.01). The multiple regression analysis revealed an association between occupation and depressive symptoms after controlling for age, educational level, cohabiting status, and occupational stressors (sβ = 0.08, p = 0.04). A significant interaction effect between occupation and social support was also observed in this model (sβ = −0.09, p = 0.02). The multiple regression analysis stratified by occupation showed that job demand, job insecurity, and workplace mistreatment were significantly associated with depressive symptoms in both occupations (p < 0.05), although the strengths of the statistical associations differed slightly. We found negative associations of social support (sβ = −0.22, p < 0.01) and emotional effort (sβ = −0.17, p < 0.01) with depressive symptoms in another multiple regression model for sales workers. Emotional dissonance (sβ = 0.23, p < 0.01) showed a positive association with depressive symptoms in this model. The results of this study indicate that reducing occupational stressors would be effective in preventing depressive symptoms among women sales workers. 
In particular, promoting social support could be the most effective way to promote women sales workers’ mental health. PMID:29168777

The power of siblings and caregivers: under-explored types of social support among children affected by HIV and AIDS.

PubMed

Sharer, Melissa; Cluver, Lucie; Shields, Joseph J; Ahearn, Frederick

2016-03-01

Children affected by HIV and AIDS have significantly higher rates of mental health problems than unaffected children. There is a need for research to examine how social support functions as a source of resiliency for children in high HIV-prevalence settings such as South Africa. The purpose of this research was to explore how family social support relates to depression, anxiety, and post-traumatic stress (PTS). Using the ecological model as a frame, data were drawn from a 2011 cross-sectional study of 1380 children classified as either orphaned by AIDS and/or living with an AIDS sick family member. The children were from high-poverty, high HIV-prevalent rural and urban communities in South Africa. Social support was analyzed in depth by examining the source (e.g. caregiver, sibling) and the type (e.g. emotional, instrumental, quality). These variables were entered into multiple regression analyses to estimate the most parsimonious regression models to show the relationships between social support and depression, anxiety, and PTS symptoms among the children. Siblings emerged as the most consistent source of social support on mental health. Overall caregiver and sibling support explained 13% of the variance in depression, 12% in anxiety, and 11% in PTS. Emotional support was the most frequent type of social support associated with mental health in all regression models, with higher levels of quality and instrumental support having the strongest relation to positive mental health outcomes. Although instrumental and quality support from siblings were related to positive mental health, unexpectedly, higher levels of emotional support received from a sibling were associated with the child reporting more symptoms of depression, anxiety, and PTS. The opposite was true for emotional support provided by caregivers: higher levels of this support were related to lower levels of all mental health symptoms. Sex was significant in all regressions, indicating the presence of moderation.
The power of siblings and caregivers: under-explored types of social support among children affected by HIV and AIDS

PubMed Central

Sharer, Melissa; Cluver, Lucie; Shields, Joseph J.; Ahearn, Frederick

2016-01-01

Children affected by HIV and AIDS have significantly higher rates of mental health problems than unaffected children. There is a need for research to examine how social support functions as a source of resiliency for children in high HIV-prevalence settings such as South Africa. The purpose of this research was to explore how family social support relates to depression, anxiety, and post-traumatic stress (PTS). Using the ecological model as a frame, data were drawn from a 2011 cross-sectional study of 1380 children classified as either orphaned by AIDS and/or living with an AIDS sick family member. The children were from high-poverty, high HIV-prevalent rural and urban communities in South Africa. Social support was analyzed in depth by examining the source (e.g. caregiver, sibling) and the type (e.g. emotional, instrumental, quality). These variables were entered into multiple regression analyses to estimate the most parsimonious regression models to show the relationships between social support and depression, anxiety, and PTS symptoms among the children. Siblings emerged as the most consistent source of social support on mental health. Overall caregiver and sibling support explained 13% of the variance in depression, 12% in anxiety, and 11% in PTS. Emotional support was the most frequent type of social support associated with mental health in all regression models, with higher levels of quality and instrumental support having the strongest relation to positive mental health outcomes. Although instrumental and quality support from siblings were related to positive mental health, unexpectedly, higher levels of emotional support received from a sibling were associated with the child reporting more symptoms of depression, anxiety, and PTS. The opposite was true for emotional support provided by caregivers: higher levels of this support were related to lower levels of all mental health symptoms. 
Sex was significant in all regressions, indicating the presence of moderation. PMID:27392006
Association of Emotional Labor and Occupational Stressors with Depressive Symptoms among Women Sales Workers at a Clothing Shopping Mall in the Republic of Korea: A Cross-Sectional Study.

PubMed

Chung, Yuh-Jin; Jung, Woo-Chul; Kim, Hyunjoo; Cho, Seong-Sik

2017-11-23

In the distribution service industry, sales people often experience multiple occupational stressors such as excessive emotional labor, workplace mistreatment, and job insecurity. The present study aimed to explore the associations of these stressors with depressive symptoms among women sales workers at a clothing shopping mall in Korea. A cross-sectional study was conducted on 583 women, comprising clothing sales workers and manual workers, using a structured questionnaire to assess demographic factors, occupational stressors, and depressive symptoms. Multiple regression analyses were performed to explore the association of these stressors with depressive symptoms. Scores for job stress subscales such as job demand, job control, and job insecurity were higher among sales workers than among manual workers (p < 0.01). The multiple regression analysis revealed an association between occupation and depressive symptoms after controlling for age, educational level, cohabiting status, and occupational stressors (sβ = 0.08, p = 0.04). A significant interaction effect between occupation and social support was also observed in this model (sβ = -0.09, p = 0.02). The multiple regression analysis stratified by occupation showed that job demand, job insecurity, and workplace mistreatment were significantly associated with depressive symptoms in both occupations (p < 0.05), although the strengths of the statistical associations differed slightly. We found negative associations of social support (sβ = -0.22, p < 0.01) and emotional effort (sβ = -0.17, p < 0.01) with depressive symptoms in another multiple regression model for sales workers. Emotional dissonance (sβ = 0.23, p < 0.01) showed a positive association with depressive symptoms in this model. The results of this study indicate that reducing occupational stressors would be effective in preventing depressive symptoms among women sales workers. 
In particular, promoting social support could be the most effective way to promote women sales workers' mental health.
Accounting for informatively missing data in logistic regression by means of reassessment sampling.

PubMed

Lin, Ji; Lyles, Robert H

2015-05-20

We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mechanism, with emphasis on non-ignorable missingness. The estimation is carried out by numerical maximization of the joint likelihood function with close approximation of the accompanying Hessian matrix, using sharable programs that take advantage of general optimization routines in standard software. We show how likelihood ratio tests can be used for model selection and how they facilitate direct hypothesis testing for whether missingness is at random. Examples and simulations are presented to demonstrate the performance of the proposed method. Copyright © 2015 John Wiley & Sons, Ltd.
Commitment to personal values and guilt feelings in dementia caregivers.

PubMed

Gallego-Alberto, Laura; Losada, Andrés; Márquez-González, María; Romero-Moreno, Rosa; Vara, Carlos

2017-01-01

Caregivers' commitment to personal values is linked to caregivers' well-being, although the effects of personal values on caregivers' guilt have not been explored to date. The goal of this study is to analyze the relationship between caregivers' commitment to personal values and guilt feelings. Participants were 179 dementia family caregivers. Face-to-face interviews were carried out to describe sociodemographic variables and assess stressors, caregivers' commitment to personal values and guilt feelings. Commitment to values was conceptualized as two factors (commitment to own values and commitment to family values) and 12 specific individual values (e.g. education, family or caregiving role). Hierarchical regressions were performed controlling for sociodemographic variables and stressors, and introducing the two commitment factors (in a first regression) or the commitment to individual/specific values (in a second regression) as predictors of guilt. In terms of the commitment to values factors, the analyzed regression model explained 21% of the variance of guilt feelings. Only the factor commitment to family values contributed significantly to the model, explaining 7% of variance. With regard to the regression analyzing the contribution of specific values to caregivers' guilt, commitment to the caregiving role and to leisure contributed negatively and significantly to the explanation of caregivers' guilt. Commitment to work contributed positively to guilt feelings. The full model explained 30% of guilt feelings variance. The specific values explained 16% of the variance. Our findings suggest that commitment to personal values is a relevant variable to understand guilt feelings in caregivers.
Hypothesis testing in functional linear regression models with Neyman's truncation and wavelet thresholding for longitudinal data.

PubMed

Yang, Xiaowei; Nie, Kun

2008-03-15

Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.
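The final step of the three-step procedure described above (testing transformed-domain coefficients) can be sketched as follows. This is a minimal illustration only: it assumes the coefficients have already been standardized to roughly unit variance under the null hypothesis and ordered from low to high frequency, it uses the commonly cited form of the adaptive Neyman statistic, max over m of (1/sqrt(2m)) * sum of (z_i^2 - 1), and the function name `adaptive_neyman` is hypothetical rather than taken from the paper.

```python
import math

def adaptive_neyman(z):
    """Return (statistic, m_hat) for standardized coefficients z.

    z is assumed to be ordered from low to high frequency and to have
    approximately unit variance under the null (an assumption of this sketch).
    """
    best, best_m, running = -math.inf, 0, 0.0
    for m, zi in enumerate(z, start=1):
        running += zi * zi - 1.0             # cumulative sum of (z_i^2 - 1)
        stat = running / math.sqrt(2.0 * m)  # normalize by sqrt(2m)
        if stat > best:
            best, best_m = stat, m
    return best, best_m
```

A large value of the statistic suggests that the leading coefficients carry signal; the null distribution needed to turn it into a p-value is not reproduced here.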
Exploring the Spatial Association between Social Deprivation and Cardiovascular Disease Mortality at the Neighborhood Level.

PubMed

Ford, Mary Margaret; Highfield, Linda D

2016-01-01

Cardiovascular disease (CVD), the leading cause of death in the United States, is impacted by neighborhood-level factors including social deprivation. To measure the association between social deprivation and CVD mortality in Harris County, Texas, global (Ordinary Least Squares (OLS)) and local (Geographically Weighted Regression (GWR)) models were built. The models explored the spatial variation in the relationship at a census-tract level while controlling for age, income by race, and education. A significant and spatially varying association (p < .01) was found between social deprivation and CVD mortality, when controlling for all other factors in the model. The GWR model provided a better model fit over the analogous OLS model (R² = .65 vs. .57), reinforcing the importance of geography and neighborhood of residence in the relationship between social deprivation and CVD mortality. Findings from the GWR model can be used to identify neighborhoods at greatest risk for poor health outcomes and to inform the placement of community-based interventions.
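To make the global-versus-local distinction concrete, the sketch below fits a distance-weighted simple linear regression at one target location, which is the core idea behind GWR. It is an illustration under stated assumptions, not the study's model: a single predictor, a Gaussian kernel, a fixed bandwidth, and the hypothetical function name `gwr_local_fit` are all choices made here.

```python
import math

def gwr_local_fit(points, target, bandwidth):
    """Weighted least squares at one location.

    points: iterable of (x_coord, y_coord, predictor, response).
    target: (x_coord, y_coord) where the local model is estimated.
    Returns (intercept, slope). Not defined if all predictors are equal.
    """
    tx, ty = target
    sw = swx = swy = swxx = swxy = 0.0
    for px, py, x, y in points:
        d2 = (px - tx) ** 2 + (py - ty) ** 2
        w = math.exp(-0.5 * d2 / bandwidth ** 2)  # Gaussian kernel weight
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope
```

Repeating the fit at every census-tract centroid yields a map of local coefficients; letting the bandwidth grow very large recovers the single global OLS fit.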
A Heckman selection model for the safety analysis of signalized intersections

PubMed Central

Wong, S. C.; Zhu, Feng; Pei, Xin; Huang, Helai; Liu, Youjun

2017-01-01

Purpose: The objective of this paper is to provide a new method for estimating crash rate and severity simultaneously. Methods: This study explores a Heckman selection model of the crash rate and severity simultaneously at different levels and a two-step procedure is used to investigate the crash rate and severity levels. The first step uses a probit regression model to determine the sample selection process, and the second step develops a multiple regression model to simultaneously evaluate the crash rate and severity for slight injury/kill or serious injury (KSI), respectively. The model uses 555 observations from 262 signalized intersections in the Hong Kong metropolitan area, integrated with information on the traffic flow, geometric road design, road environment, traffic control and any crashes that occurred during two years. Results: The results of the proposed two-step Heckman selection model illustrate the necessity of different crash rates for different crash severity levels. Conclusions: A comparison with the existing approaches suggests that the Heckman selection model offers an efficient and convenient alternative method for evaluating the safety performance at signalized intersections. PMID:28732050
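In the textbook form of the Heckman two-step procedure, the link between the probit step and the regression step is the inverse Mills ratio, computed from the probit linear index and added as an extra regressor in the second step. The sketch below shows only that quantity; the function names are hypothetical, and whether the paper uses exactly this correction term is an assumption.

```python
import math

def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills_ratio(z):
    """pdf(z) / cdf(z) for a probit linear index z.

    Extreme negative z, where the cdf underflows to zero, is not handled
    in this sketch.
    """
    return std_normal_pdf(z) / std_normal_cdf(z)
```

A significant coefficient on this term in the second-step regression is usually read as evidence that sample selection matters.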
Should the poor have no medicines to cure? A study on the association between social class and social security among the rural migrant workers in urban China.

PubMed

Guan, Ming

2017-11-07

The rampant urbanization and medical marketization in China have resulted in increased vulnerabilities to health and socioeconomic disparities among the rural migrant workers in urban China. In the Chinese context, the socioeconomic characteristics of rural migrant workers have attracted considerable research attention in recent years. However, to date, no previous studies have explored the association between the socioeconomic factors and social security among the rural migrant workers in urban China. This study aims to explore the association between socioeconomic inequity and social security inequity and the subsequent associations with medical inequity and reimbursement rejection. Data from a regionally representative sample of 2009 Survey of Migrant Workers in Pearl River Delta in China were used for analyses. Multiple logistic regressions were used to analyze the impacts of socioeconomic factors on the eight dimensions of social security (sick pay, paid leave, maternity pay, medical insurance, pension insurance, occupational injury insurance, unemployment insurance, and maternity insurance) and the impacts of social security on medical reimbursement rejection. The zero-inflated negative binomial regression model (ZINB regression) was adopted to explore the relationship between socioeconomic factors and hospital visits among the rural migrant workers with social security. The study population consisted of 848 rural migrant workers with high income who were young and middle-aged, low-educated, and covered by social security. Reimbursement rejection and abusive supervision for the rural migrant workers were observed. Logistic regression analysis showed that there were significant associations between socioeconomic factors and social security. ZINB regression showed that there were significant associations between socioeconomic factors and hospital visits among the rural migrant workers. 
Also, several dimensions of social security had significant associations with reimbursement rejections. This study showed that social security inequity, medical inequity, and reimbursement inequity happened to the rural migrant workers simultaneously. Future policy should strengthen health justice and enterprises' medical responsibilities to the employed rural migrant workers.
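For readers unfamiliar with the ZINB model mentioned above, its probability mass function in a common parameterization is given below. The symbols are assumptions of this sketch rather than notation from the paper: \pi is the probability of a structural zero (for example, a worker who never visits a hospital), \mu is the mean of the negative binomial component, and r is its dispersion parameter.

```latex
\[
P(Y = y) =
\begin{cases}
\pi + (1-\pi)\left(\dfrac{r}{r+\mu}\right)^{r}, & y = 0,\\[2ex]
(1-\pi)\,\dfrac{\Gamma(y+r)}{\Gamma(r)\,y!}
\left(\dfrac{r}{r+\mu}\right)^{r}\left(\dfrac{\mu}{r+\mu}\right)^{y}, & y = 1, 2, \dots
\end{cases}
\]
```

In a regression setting, \mu is typically tied to covariates through a log link and \pi through a logit link, so socioeconomic factors can affect both whether any visits occur and how many.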
Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.

PubMed

Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam

2017-11-01

The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). The modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.
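The combination of ordinal logistic regression and Monte Carlo simulation described above can be sketched as follows, assuming a proportional-odds model in which P(Y <= k) = logistic(cutpoint_k - eta). The cut-points, the linear predictor `eta`, and both function names are hypothetical; the fitted coefficients from the study are not reproduced.

```python
import math
import random

def category_probabilities(eta, cutpoints):
    """Probabilities of each ordered level under a proportional-odds model.

    eta: linear predictor (e.g. from age, sex and DLQI items).
    cutpoints: increasing thresholds; K-1 cut-points give K levels.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probs

def sample_level(probs, rng):
    """Draw one level (1-based) from the category probabilities."""
    u, acc = rng.random(), 0.0
    for level, p in enumerate(probs, start=1):
        acc += p
        if u < acc:
            return level
    return len(probs)
```

Drawing a level for each EQ-5D dimension in this way, and then applying a utility tariff to the simulated profile, is one plausible reading of how predicted utilities could be generated for comparison with observed values.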
Seasonality in trauma admissions - Are daylight and weather variables better predictors than general cyclic effects?

PubMed

Røislien, Jo; Søvik, Signe; Eken, Torsten

2018-01-01

Trauma is a leading global cause of death, and predicting the burden of trauma admissions is vital for good planning of trauma care. Seasonality in trauma admissions has been found in several studies. Seasonal fluctuations in daylight hours, temperature and weather affect social and cultural practices but also individual neuroendocrine rhythms that may ultimately modify behaviour and potentially predispose to trauma. The aim of the present study was to explore to what extent the observed seasonality in daily trauma admissions could be explained by changes in daylight and weather variables throughout the year. Retrospective registry study on trauma admissions in the 10-year period 2001-2010 at Oslo University Hospital, Ullevål, Norway, where the amount of daylight varies from less than 6 hours to almost 19 hours per day throughout the year. Daily number of admissions was analysed by fitting non-linear Poisson time series regression models, simultaneously adjusting for several layers of temporal patterns, including a non-linear long-term trend and both seasonal and weekly cyclic effects. Five daylight and weather variables were explored, including hours of daylight and amount of precipitation. Models were compared using Akaike's Information Criterion (AIC). A regression model including daylight and weather variables significantly outperformed a traditional seasonality model in terms of AIC. A cyclic week effect was significant in all models. Daylight and weather variables are better predictors of seasonality in daily trauma admissions than mere information on day-of-year.
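The AIC comparison used above can be illustrated with a small sketch. It assumes a Poisson model for daily counts and a single sine/cosine pair as the "traditional" cyclic seasonality term; the function names and coefficients are hypothetical, and the long-term trend and weekly effects from the study are omitted.

```python
import math

def seasonal_mean(day_of_year, b0, b_sin, b_cos, period=365.25):
    """Expected daily count under a log-linear cyclic seasonality model."""
    angle = 2.0 * math.pi * day_of_year / period
    return math.exp(b0 + b_sin * math.sin(angle) + b_cos * math.cos(angle))

def poisson_log_likelihood(counts, means):
    """Sum of Poisson log-probabilities of the counts given positive means."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(counts, means))

def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: lower values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * log_likelihood
```

Computing `aic` for a model with daylight and weather covariates and for the cyclic model on the same data, and preferring the lower value, mirrors the comparison reported in the abstract.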
The extension of total gain (TG) statistic in survival models: properties and applications.

PubMed

Choodari-Oskooei, Babak; Royston, Patrick; Parmar, Mahesh K B

2015-07-01

The results of multivariable regression models are usually summarized in the form of parameter estimates for the covariates, goodness-of-fit statistics, and the relevant p-values. These statistics do not inform us about whether covariate information will lead to any substantial improvement in prediction. Predictive ability measures can be used for this purpose since they provide important information about the practical significance of prognostic factors. R²-type indices are the most familiar forms of such measures in survival models, but they all have limitations and none is widely used. In this paper, we extend the total gain (TG) measure, proposed for a logistic regression model, to survival models and explore its properties using simulations and real data. TG is based on the binary regression quantile plot, otherwise known as the predictiveness curve. Standardised TG ranges from 0 (no explanatory power) to 1 ('perfect' explanatory power). The results of our simulations show that unlike many of the other R²-type predictive ability measures, TG is independent of random censoring. It increases as the effect of a covariate increases and can be applied to different types of survival models, including models with time-dependent covariate effects. We also apply TG to quantify the predictive ability of multivariable prognostic models developed in several disease areas. Overall, TG performs well in our simulation studies and can be recommended as a measure to quantify the predictive ability in survival models.
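For the original binary-outcome setting, standardised TG can be sketched as the mean absolute deviation of predicted risks from the prevalence, divided by its maximum possible value 2ρ(1 − ρ). The sketch below assumes a well-calibrated model, so that the mean predicted risk stands in for the prevalence ρ; the function name is hypothetical and the survival extension proposed in the paper is not reproduced.

```python
def standardized_total_gain(risks):
    """Standardised total gain for a list of predicted risks in [0, 1].

    Uses the mean predicted risk as the prevalence (assumes calibration).
    Returns 0 when every risk equals the prevalence and 1 when risks are
    all 0 or 1.
    """
    n = len(risks)
    rho = sum(risks) / n
    if rho <= 0.0 or rho >= 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    total_gain = sum(abs(r - rho) for r in risks) / n
    return total_gain / (2.0 * rho * (1.0 - rho))
```

The two boundary cases in the docstring correspond to the "no explanatory power" and "perfect explanatory power" ends of the scale described in the abstract.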
INDIVIDUAL-BASED MODELS: POWERFUL OR POWER STRUGGLE?

PubMed

Willem, L; Stijven, S; Hens, N; Vladislavleva, E; Broeckhove, J; Beutels, P

2015-01-01

Individual-based models (IBMs) offer endless possibilities to explore various research questions but come with high model complexity and computational burden. Large-scale IBMs have become feasible but the novel hardware architectures require adapted software. The increased model complexity also requires systematic exploration to gain thorough system understanding. We elaborate on the development of IBMs for vaccine-preventable infectious diseases and model exploration with active learning. Investment in IBM simulator code can lead to significant runtime reductions. We found large performance differences due to data locality. Sorting the population once reduced simulation time by a factor of two. Storing person attributes separately instead of using person objects also seemed more efficient. Next, we improved model performance up to 70% by structuring potential contacts based on health status before processing disease transmission. The active learning approach we present is based on iterative surrogate modelling and model-guided experimentation. Symbolic regression is used for nonlinear response surface modelling with automatic feature selection. We illustrate our approach using an IBM for influenza vaccination. After optimizing the parameter space, we observed an inverse relationship between vaccination coverage and the clinical attack rate reinforced by herd immunity. These insights can be used to focus and optimise research activities, and to reduce both dimensionality and decision uncertainty.
Exploring students' patterns of reasoning

NASA Astrophysics Data System (ADS)

Matloob Haghanikar, Mojgan

As part of a collaborative study of the science preparation of elementary school teachers, we investigated the quality of students' reasoning and explored the relationship between sophistication of reasoning and the degree to which the courses were considered inquiry oriented. To probe students' reasoning, we developed open-ended written content questions with the distinguishing feature of applying recently learned concepts in a new context. We devised a protocol for developing written content questions that provided a common structure for probing and classifying students' sophistication level of reasoning. In designing our protocol, we considered several distinct criteria, and classified students' responses based on their performance for each criterion. First, we classified concepts into three types: Descriptive, Hypothetical, and Theoretical and categorized the abstraction levels of the responses in terms of the types of concepts and the inter-relationship between the concepts. Second, we devised a rubric based on Bloom's revised taxonomy with seven traits (both knowledge types and cognitive processes) and a defined set of criteria to evaluate each trait. Along with analyzing students' reasoning, we visited universities and observed the courses in which the students were enrolled. We used the Reformed Teaching Observation Protocol (RTOP) to rank the courses with respect to characteristics that are valued for the inquiry courses. We conducted logistic regression for a sample of 18 courses with about 900 students and reported the results of using logistic regression to estimate the relationship between traits of reasoning and RTOP score. In addition, we analyzed the conceptual structure of students' responses, based on conceptual classification schemes, and clustered students' responses into six categories. We derived a regression model to estimate the relationship between the sophistication of the categories of conceptual structure and RTOP scores. However, the outcome variable with six categories required a more complicated regression model, known as multinomial logistic regression, generalized from binary logistic regression. With the large amount of collected data, we found that the higher cognitive processes were more likely in classes with higher measures of inquiry. However, the usage of more abstract concepts with higher order conceptual structures was less prevalent in higher RTOP courses.
Probabilistic Forecasting of Surface Ozone with a Novel Statistical Approach

NASA Technical Reports Server (NTRS)

Balashov, Nikolay V.; Thompson, Anne M.; Young, George S.

2017-01-01

The recent change in the Environmental Protection Agency's surface ozone regulation, lowering the surface ozone daily maximum 8-h average (MDA8) exceedance threshold from 75 to 70 ppbv, poses significant challenges to U.S. air quality (AQ) forecasters responsible for ozone MDA8 forecasts. The forecasters, supplied by only a few AQ model products, end up relying heavily on self-developed tools. To help U.S. AQ forecasters, this study explores a surface ozone MDA8 forecasting tool that is based solely on statistical methods and standard meteorological variables from the numerical weather prediction (NWP) models. The model combines the self-organizing map (SOM), which is a clustering technique, with a stepwise weighted quadratic regression using meteorological variables as predictors for ozone MDA8. The SOM method identifies different weather regimes, to distinguish between various modes of ozone variability, and groups them according to similarity. In this way, when a regression is developed for a specific regime, data from the other regimes are also used, with weights that are based on their similarity to this specific regime. This approach, regression in SOM (REGiS), yields a distinct model for each regime taking into account both the training cases for that regime and other similar training cases. To produce probabilistic MDA8 ozone forecasts, REGiS weighs and combines all of the developed regression models on the basis of the weather patterns predicted by an NWP model. REGiS is evaluated over the San Joaquin Valley in California and the northeastern plains of Colorado. The results suggest that the model performs best when trained and adjusted separately for an individual AQ station and its corresponding meteorological site.
Association of Brain-Derived Neurotrophic Factor and Vitamin D with Depression and Obesity: A Population-Based Study.

PubMed

Goltz, Annemarie; Janowitz, Deborah; Hannemann, Anke; Nauck, Matthias; Hoffmann, Johanna; Seyfart, Tom; Völzke, Henry; Terock, Jan; Grabe, Hans Jörgen

2018-06-19

Depression and obesity are widespread and closely linked. Brain-derived neurotrophic factor (BDNF) and vitamin D are both assumed to be associated with depression and obesity. Little is known about the interplay between vitamin D and BDNF. We explored the putative associations and interactions between serum BDNF and vitamin D levels with depressive symptoms and abdominal obesity in a large population-based cohort. Data were obtained from the population-based Study of Health in Pomerania (SHIP)-Trend (n = 3,926). The associations of serum BDNF and vitamin D levels with depressive symptoms (measured using the Patient Health Questionnaire) were assessed with binary and multinomial logistic regression models. The associations of serum BDNF and vitamin D levels with obesity (measured by the waist-to-hip ratio [WHR]) were assessed with binary logistic and linear regression models with restricted cubic splines. Logistic regression models revealed inverse associations of vitamin D with depression (OR = 0.966; 95% CI 0.951-0.981) and obesity (OR = 0.976; 95% CI 0.967-0.985). No linear association of serum BDNF with depression or obesity was found. However, linear regression models revealed a U-shaped association of BDNF with WHR (p < 0.001). Vitamin D was inversely associated with depression and obesity. BDNF was associated with abdominal obesity, but not with depression. At the population level, our results support the relevant roles of vitamin D and BDNF in mental and physical health-related outcomes. © 2018 S. Karger AG, Basel.
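As a worked illustration of how a per-unit odds ratio such as OR = 0.966 scales, the sketch below converts it to a logistic regression coefficient and back. It assumes the effect is log-linear in vitamin D; the abstract does not state the measurement unit, so the 10-unit change is a hypothetical example.

```python
import math

# Odds ratio per one-unit increase in vitamin D (depression), from the abstract.
OR_DEPRESSION = 0.966


def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio implied by a change of `delta` units, assuming a log-linear effect."""
    beta = math.log(or_per_unit)  # logistic regression coefficient
    return math.exp(beta * delta)


# Hypothetical 10-unit increase (unit not given in the abstract).
or_10 = odds_ratio_for_change(OR_DEPRESSION, 10)
print(round(or_10, 3))
```

Under this assumption a 10-unit increase corresponds to roughly 0.71 times the odds, a much larger effect than the per-unit figure suggests at first glance.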
Factors Affecting Success in the Professional Entry Exam for Accountants in Brazil

ERIC Educational Resources Information Center

Lima Rodrigues, Lúcia; Pinho, Carlos; Bugarim, Maria Clara; Craig, Russell; Machado, Diego

2018-01-01

This paper explores factors that have affected the success of candidates in the professional entry exam conducted by Brazil's Federal Council of Accounting. We analyse results of 18,948 candidates who sat for the exam in 2012, using a logistic regression model and the key indicators used by government to monitor the performance of higher education…
Associations between Resilience and the Well-Being of Mothers of Children with Autism Spectrum Disorder and Other Developmental Disabilities

ERIC Educational Resources Information Center

Halstead, Elizabeth; Ekas, Naomi; Hastings, Richard P.; Griffith, Gemma M.

2018-01-01

There is variability in the extent to which mothers are affected by the behavior problems of their children with developmental disabilities (DD). We explore whether maternal resilience functions as a protective or compensatory factor. In Studies 1 and 2, using moderated multiple regression models, we found evidence that maternal resilience…
Interest Profile Elevation, Big Five Personality Traits, and Secondary Constructs on the Self-Directed Search: A Replication and Extension

ERIC Educational Resources Information Center

Bullock, Emily E.; Reardon, Robert C.

2008-01-01

The study used the Self-Directed Search (SDS) and the NEO-FFI to explore profile elevation, four secondary constructs, and the Big Five personality factors in a sample of college students in a career course. Regression model results showed that openness, conscientiousness, differentiation high-low, differentiation Iachan, and consistency accounted…
Won't You Be My Neighbor? Using an Ecological Approach to Examine the Impact of Community on Revictimization

ERIC Educational Resources Information Center

Obasaju, Mayowa A.; Palin, Frances L.; Jacobs, Carli; Anderson, Page; Kaslow, Nadine J.

2009-01-01

An ecological model is used to explore the moderating effects of community-level variables on the relation between childhood sexual, physical, and emotional abuse and adult intimate partner violence (IPV) within a sample of 98 low-income African American women. Results from hierarchical binary logistic regression analyses show that…

Multinomial-Regression Modeling of the Environmental Attitudes of Higher Education Students Based on the Revised New Ecological Paradigm Scale

ERIC Educational Resources Information Center

Jowett, Tim; Harraway, John; Lovelock, Brent; Skeaff, Sheila; Slooten, Liz; Strack, Mick; Shephard, Kerry

2014-01-01

Higher education is increasingly interested in its impact on the sustainability attributes of its students, so we wanted to explore how our students' environmental concern changed during their higher education experiences. We used the Revised New Ecological Paradigm Scale (NEP) with 505 students and developed and tested a multinomial…
Economics of Scholarly Publishing: Exploring the Causes of Subscription Price Variations of Scholarly Journals in Business Subject-Specific Areas

ERIC Educational Resources Information Center

Liu, Lewis G.

2011-01-01

This empirical research investigates subscription price variations of scholarly journals in five business subject-specific areas using the semilogarithmic regression model. It has two main purposes. The first is to address the unsettled debate over whether or not and to what extent commercial publishers reap monopoly profits by overcharging…
Exploring the Ups and Downs of Mathematics Engagement in the Middle Years of School

ERIC Educational Resources Information Center

Martin, Andrew J.; Way, Jennifer; Bobis, Janette; Anderson, Judy

2015-01-01

This study of 1,601 students in the middle years of schooling (Grades 5-8, each student measured twice, 1 year apart) from 200 classrooms in 44 schools sought to identify factors explaining gains and declines in mathematics engagement at key transition points. In multilevel regression modeling, findings showed that compared with Grade 6 students…
Examining the Influence of Selected Factors on Perceived Co-Op Work-Term Quality from a Student Perspective

ERIC Educational Resources Information Center

Drewery, David; Nevison, Colleen; Pretti, T. Judene; Cormier, Lauren; Barclay, Sage; Pennaforte, Antoine

2016-01-01

This study discusses and tests a conceptual model of co-op work-term quality from a student perspective. Drawing from an earlier exploration of co-op students' perceptions of work-term quality, variables related to role characteristics, interpersonal dynamics, and organizational elements were used in a multiple linear regression analysis to…
Exploring the use of random regression models with legendre polynomials to analyze measures of volume of ejaculate in Holstein bulls.

PubMed

Carabaño, M J; Díaz, C; Ugarte, C; Serrano, M

2007-02-01

Artificial insemination centers routinely collect records of quantity and quality of semen of bulls throughout the animals' productive period. The goal of this paper was to explore the use of random regression models with orthogonal polynomials to analyze repeated measures of semen production of Spanish Holstein bulls. A total of 8,773 records of volume of first ejaculate (VFE) collected between 12 and 30 mo of age from 213 Spanish Holstein bulls was analyzed under alternative random regression models. Legendre polynomial functions of increasing order (0 to 6) were fitted to the average trajectory, additive genetic and permanent environmental effects. Age at collection and days in production were used as time variables. Heterogeneous and homogeneous residual variances were alternatively assumed. Analyses were carried out within a Bayesian framework. The logarithm of the marginal density and the cross-validation predictive ability of the data were used as model comparison criteria. Based on both criteria, age at collection as a time variable and heterogeneous residuals models are recommended to analyze changes of VFE over time. Both criteria indicated that fitting random curves for genetic and permanent environmental components as well as for the average trajectory improved the quality of models. Furthermore, models with a higher order polynomial for the permanent environmental (5 to 6) than for the genetic components (4 to 5) and the average trajectory (2 to 3) tended to perform best. High-order polynomials were needed to accommodate the highly oscillating nature of the phenotypic values. Heritability and repeatability estimates, disregarding the extremes of the studied period, ranged from 0.15 to 0.35 and from 0.20 to 0.50, respectively, indicating that selection for VFE may be effective at any stage. Small differences among models were observed. Apart from the extremes, estimated correlations between ages decreased steadily from 0.9 and 0.4 for measures 1 mo apart to 0.4 and 0.2 for most distant measures for additive genetic and phenotypic components, respectively. Further investigation to account for environmental factors that may be responsible for the oscillating observations of VFE is needed.
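To make the Legendre covariates used in such random regression models concrete, here is a minimal sketch that maps age at collection onto [-1, 1] and evaluates the polynomials by Bonnet's recursion. It assumes the 12 to 30 mo range quoted in the abstract as the standardization interval and uses unnormalized polynomials; the paper may have used a normalized variant, which the abstract does not specify.

```python
def standardize_age(age_mo, lo=12.0, hi=30.0):
    """Map age in months from [lo, hi] onto [-1, 1], the Legendre domain."""
    return 2.0 * (age_mo - lo) / (hi - lo) - 1.0


def legendre_basis(x, order):
    """Return [P_0(x), ..., P_order(x)] using Bonnet's recursion:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    values = [1.0]
    if order >= 1:
        values.append(x)
    for n in range(1, order):
        nxt = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
        values.append(nxt)
    return values


# Covariates for a record collected at 21 mo (the midpoint, so x = 0).
basis_at_21 = legendre_basis(standardize_age(21.0), order=3)
print(basis_at_21)
```

Each animal's curve is then a linear combination of these covariates, with the combination weights treated as random effects.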
Exploring bikeability in a metropolitan setting: stimulating and hindering factors in commuting route environments

PubMed Central

2012-01-01

Background: Route environments may influence people's active commuting positively and thereby contribute to public health. Assessments of route environments are, however, needed in order to better understand the possible relationship between active commuting and the route environment. The aim of this study was, therefore, to assess the potential associations between perceptions of whether the route environment on the whole hinders or stimulates bicycle commuting and perceptions of environmental factors. Methods: The Active Commuting Route Environment Scale (ACRES) was used for the assessment of bicycle commuters' perceptions of their route environments in the inner urban parts of Greater Stockholm, Sweden. Bicycle commuters (n = 827) were recruited by advertisements in newspapers. Simultaneous multiple regression analyses were used to assess the relation between predictor variables (such as levels of exhaust fumes, noise, traffic speed, traffic congestion and greenery) and the outcome variable (hindering - stimulating route environments). Two models were run, (Model 1) without and (Model 2) with the item traffic: unsafe or safe included as a predictor. Results: Overall, about 40% of the variance of hindering - stimulating route environments was explained by the environmental predictors in our models (Model 1, R² = 0.415, and Model 2, R² = 0.435). The regression equation for Model 1 was: y = 8.53 + 0.33 ugly or beautiful + 0.14 greenery + (-0.14) course of the route + (-0.13) exhaust fumes + (-0.09) congestion: all types of vehicles (p ≤ 0.019). The regression equation for Model 2 was: y = 6.55 + 0.31 ugly or beautiful + 0.16 traffic: unsafe or safe + (-0.13) exhaust fumes + 0.12 greenery + (-0.12) course of the route (p ≤ 0.001). Conclusions: The main results indicate that beautiful, green and safe route environments seem to be, independently of each other, stimulating factors for bicycle commuting in inner urban areas. On the other hand, exhaust fumes, traffic congestion and low 'directness' of the route seem to be hindering factors. Furthermore, the overall results illustrate the complexity of a research area at the beginning of exploration. PMID:22401492
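The Model 1 equation quoted in this abstract can be evaluated directly. The sketch below encodes its coefficients and applies them to a made-up set of item ratings; the variable names and the example ratings are hypothetical, and the sketch assumes the coefficients apply to raw item scores, which the abstract does not confirm.

```python
# Coefficients of Model 1 as quoted in the abstract; key names are illustrative.
MODEL_1 = {
    "intercept": 8.53,
    "ugly_or_beautiful": 0.33,
    "greenery": 0.14,
    "course_of_the_route": -0.14,
    "exhaust_fumes": -0.13,
    "congestion_all_vehicles": -0.09,
}


def predict(model, ratings):
    """Evaluate y = intercept + sum(coefficient * rating) for one respondent."""
    return model["intercept"] + sum(
        coef * ratings[name]
        for name, coef in model.items()
        if name != "intercept"
    )


# Hypothetical respondent who rates every item 8 (not data from the study).
example_ratings = {name: 8.0 for name in MODEL_1 if name != "intercept"}
y_hat = predict(MODEL_1, example_ratings)
print(round(y_hat, 2))
```

Because the positive and negative coefficients nearly cancel, equal ratings on all items move the prediction only slightly away from the intercept.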
Computational approaches for predicting biomedical research collaborations.

PubMed

Zhang, Qing; Yu, Hong

2014-01-01

Biomedical research is increasingly collaborative, and successful collaborations often produce high impact work. Computational approaches can be developed for automatically predicting biomedical research collaborations. Previous works of collaboration prediction mainly explored the topological structures of research collaboration networks, leaving out rich semantic information from the publications themselves. In this paper, we propose supervised machine learning approaches to predict research collaborations in the biomedical field. We explored both the semantic features extracted from author research interest profile and the author network topological features. We found that the most informative semantic features for author collaborations are related to research interest, including similarity of out-citing citations, similarity of abstracts. Of the four supervised machine learning models (naïve Bayes, naïve Bayes multinomial, SVMs, and logistic regression), the best performing model is logistic regression with an ROC ranging from 0.766 to 0.980 on different datasets. To our knowledge we are the first to study in depth how research interest and productivities can be used for collaboration prediction. Our approach is computationally efficient, scalable and yet simple to implement. The datasets of this study are available at https://github.com/qingzhanggithub/medline-collaboration-datasets.
Spiritual Well-Being, Depression, and Stress Among Hemodialysis Patients in Jordan.

PubMed

Musa, Ahmad S; Pevalin, David J; Al Khalaileh, Murad A A

2017-10-01

The spiritual dimension of a patient's life is an important factor that may mediate detrimental impacts on mental health. The lack of research investigating spiritual well-being, religiosity, and mental health among Jordanian hemodialysis patients encouraged this research. This study explored levels of spiritual well-being and its associations with depression, anxiety, and stress. A quantitative, cross-sectional correlational study. A sample of 218 Jordanian Muslim hemodialysis patients completed a structured, self-administered questionnaire. The data were analyzed using descriptive statistics and linear multivariate regression models. The hemodialysis patients had, on average, relatively low levels of spiritual well-being, moderate depression, severe anxiety, and mild to moderate stress. The results of the regression models indicated that aspects of spiritual well-being were negatively associated with depression, anxiety, and stress, but only existential well-being consistently retained significant associations after controlling for religious well-being, religiosity, and sociodemographic variables. Greater spiritual and existential well-being of Jordanian hemodialysis patients were significantly associated with less depression, anxiety, and stress. It appears that these patients use religious and spiritual beliefs and practices as coping mechanisms to overcome their depression, anxiety, and stress. The implications for holistic clinical practice are explored.
Spirituality and Resilience Among Mexican American IPV Survivors.

PubMed

de la Rosa, Iván A; Barnett-Queen, Timothy; Messick, Madeline; Gurrola, Maria

2016-12-01

Women with abusive partners use a variety of coping strategies. This study examined the correlation between spirituality, resilience, and intimate partner violence using a cross-sectional survey of 54 Mexican American women living along the U.S.-Mexico border. The meaning-making coping model provides the conceptual framework to explore how spirituality is used as a coping strategy. Multiple ordinary least squares (OLS) regression results indicate women who score higher on spirituality also report greater resilient characteristics. Poisson regression analyses revealed that an increase in level of spirituality is associated with lower number of types of abuse experienced. Clinical, programmatic, and research implications are discussed. © The Author(s) 2015.
Longitudinal study of stress, social support, and depression in married Arab immigrant women.

PubMed

Aroian, Karen; Uddin, Nizam; Blbas, Hazar

2017-02-01

Using a stress and social support framework, this study explored the trajectory of depression in 388 married Arab immigrant women. The women provided three panels of data approximately 18 months apart. Depression at Time 3 was regressed on Time 1 depression, socio-demographic variables, and rate of change over time in stress and social support. The regression model was significant and accounted for 41.16% of the variation in Time 3 depression scores. Time 1 depression, English reading ability, husband's employment status, changes over time in immigration demands, daily hassles, and social support from friends were associated with Time 3 depression.
Longitudinal Study of Stress, Social Support, and Depression In Married Arab Immigrant Women

PubMed Central

Aroian, Karen; Uddin, Nizam; Blbas, Hazar

2017-01-01

Using a stress and social support framework, this study explored the trajectory of depression in 388 married Arab immigrant women. The women provided three panels of data approximately 18 months apart. Depression at Time 3 was regressed on Time 1 depression, socio-demographic variables, and rate of change over time in stress and social support. The regression model was significant and accounted for 41.16% of the variation in Time 3 depression scores. Time 1 depression, English reading ability, husband’s employment status, and changes over time in immigration demands, daily hassles, and social support from friends were associated with Time 3 Depression. PMID:27791495
Characterizing multivariate decoding models based on correlated EEG spectral features.

PubMed

McFarland, Dennis J

2013-07-01

Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Relationship between the clinical global impression of severity for schizoaffective disorder scale and established mood scales for mania and depression.

PubMed

Turkoz, Ibrahim; Fu, Dong-Jing; Bossie, Cynthia A; Sheehan, John J; Alphs, Larry

2013-08-15

This analysis explored the relationship between ratings on HAM-D-17 or YMRS and those on the depressive or manic subscale of CGI-S for schizoaffective disorder (CGI-S-SCA). This post hoc analysis used the database (N=614) from two 6-week, randomized, placebo-controlled studies of paliperidone ER versus placebo in symptomatic subjects with schizoaffective disorder assessed using HAM-D-17, YMRS, and CGI-S-SCA scales. Parametric and nonparametric regression models explored the relationships between ratings on YMRS and HAM-D-17 and on depressive and manic domains of the CGI-S-SCA from baseline to the 6-week end point. A clinically meaningful improvement was defined as a change of 1 point in the CGI-S-SCA score. No adjustment was made for multiplicity. Multiple linear regression models suggested that a 1-point change in the depressive domain of CGI-S-SCA corresponded to an average 3.6-point (SE=0.2) change in HAM-D-17 score. Similarly, a 1-point change in the manic domain of CGI-S-SCA corresponded to an average 5.8-point (SE=0.2) change in YMRS score. Results were confirmed using local and cumulative logistic regression models in addition to equipercentile linking. Lack of subjects scoring over the complete range of possible scores may limit broad application of the analyses. Clinically meaningful score changes in depressive and manic domains of CGI-S-SCA corresponded to approximately 4- and 6-point score changes on HAM-D-17 and YMRS, respectively, in symptomatic subjects with schizoaffective disorder. Copyright © 2013 Elsevier B.V. All rights reserved.
A consistent positive association between landscape simplification and insecticide use across the Midwestern US from 1997 through 2012

DOE PAGES

Meehan, Timothy D.; Gratton, Claudio

2015-10-27

During 2007, counties across the Midwestern US with relatively high levels of landscape simplification (i.e., widespread replacement of seminatural habitats with cultivated crops) had relatively high crop-pest abundances which, in turn, were associated with relatively high insecticide application. These results suggested a positive relationship between landscape simplification and insecticide use, mediated by landscape effects on crop pests or their natural enemies. A follow-up study, in the same region but using different statistical methods, explored the relationship between landscape simplification and insecticide use between 1987 and 2007, and concluded that the relationship varied substantially in sign and strength across years. Here, we explore this relationship from 1997 through 2012, using a single dataset and two different analytical approaches. We demonstrate that, when using ordinary least squares (OLS) regression, the relationship between landscape simplification and insecticide use is, indeed, quite variable over time. However, the residuals from OLS models show strong spatial autocorrelation, indicating spatial structure in the data not accounted for by explanatory variables, and violating a standard assumption of OLS. When modeled using spatial regression techniques, relationships between landscape simplification and insecticide use were consistently positive between 1997 and 2012, and model fits were dramatically improved. We argue that spatial regression methods are more appropriate for these data, and conclude that there remains compelling correlative support for a link between landscape simplification and insecticide use in the Midwestern US. We discuss the limitations of inference from this and related studies, and suggest improved data collection campaigns for better understanding links between landscape structure, crop-pest pressure, and pest-management practices.
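Spatial autocorrelation in regression residuals, the problem this abstract identifies in the OLS models, is commonly measured with Moran's I. The sketch below is a plain-Python illustration of that statistic; the abstract does not say which diagnostic the authors used, and the four-county layout and residual values are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for `values` given a square spatial weight matrix `weights`.

    I = (n / sum of weights) * sum_ij w_ij d_i d_j / sum_i d_i**2,
    where d_i is the deviation of value i from the mean.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical residuals for four counties arranged in a line; each county
# neighbours only the counties directly next to it.
residuals = [1.0, 1.0, -1.0, -1.0]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_stat = morans_i(residuals, adjacency)
print(round(i_stat, 3))
```

Values clearly above zero indicate that neighbouring residuals resemble each other, which is the pattern that motivates moving from OLS to a spatial regression model.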
A consistent positive association between landscape simplification and insecticide use across the Midwestern US from 1997 through 2012

DOE Office of Scientific and Technical Information (OSTI.GOV)

Meehan, Timothy D.; Gratton, Claudio

During 2007, counties across the Midwestern US with relatively high levels of landscape simplification (i.e., widespread replacement of seminatural habitats with cultivated crops) had relatively high crop-pest abundances which, in turn, were associated with relatively high insecticide application. These results suggested a positive relationship between landscape simplification and insecticide use, mediated by landscape effects on crop pests or their natural enemies. A follow-up study, in the same region but using different statistical methods, explored the relationship between landscape simplification and insecticide use between 1987 and 2007, and concluded that the relationship varied substantially in sign and strength across years. Here, we explore this relationship from 1997 through 2012, using a single dataset and two different analytical approaches. We demonstrate that, when using ordinary least squares (OLS) regression, the relationship between landscape simplification and insecticide use is, indeed, quite variable over time. However, the residuals from OLS models show strong spatial autocorrelation, indicating spatial structure in the data not accounted for by explanatory variables, and violating a standard assumption of OLS. When modeled using spatial regression techniques, relationships between landscape simplification and insecticide use were consistently positive between 1997 and 2012, and model fits were dramatically improved. We argue that spatial regression methods are more appropriate for these data, and conclude that there remains compelling correlative support for a link between landscape simplification and insecticide use in the Midwestern US. We discuss the limitations of inference from this and related studies, and suggest improved data collection campaigns for better understanding links between landscape structure, crop-pest pressure, and pest-management practices.
Network characteristics and patent value-Evidence from the Light-Emitting Diode industry.

PubMed

Huang, Way-Ren; Hsieh, Chia-Jen; Chang, Ke-Chiun; Kiang, Yen-Jo; Yuan, Chien-Chung; Chu, Woei-Chyn

2017-01-01

This study proposes a different angle to social network analysis that evaluates patent value and explores its influencing factors using the network centrality and network position. This study utilizes a logistic regression model to explore the relationships in the LED industry between patent value and network centrality as measured from out-degree centrality, in-degree centrality, in-closeness centrality, and network position, which is measured from effect size. The empirical result shows that out-degree centrality and in-degree centrality have significant positive effects on patent value and that effect size has a significant negative effect on patent value.
Modelling exploration of non-stationary hydrological system

NASA Astrophysics Data System (ADS)

Kim, Kue Bum; Kwon, Hyun-Han; Han, Dawei

2015-04-01

Traditional hydrological modelling assumes that the catchment does not change with time (i.e., stationary conditions) which means the model calibrated for the historical period is valid for the future period. However, in reality, due to change of climate and catchment conditions this stationarity assumption may not be valid in the future. It is a challenge to make the hydrological model adaptive to the future climate and catchment conditions that are not observable at the present time. In this study a lumped conceptual rainfall-runoff model called IHACRES was applied to a catchment in southwest England. Long observation data from 1961 to 2008 were used and seasonal calibration (in this study only summer period is further explored because it is more sensitive to climate and land cover change than the other three seasons) has been done since there are significant seasonal rainfall patterns. We expect that the model performance can be improved by calibrating the model based on individual seasons. The data is split into calibration and validation periods with the intention of using the validation period to represent the future unobserved situations. The success of the non-stationary model will depend not only on good performance during the calibration period but also the validation period. Initially, the calibration is based on changing the model parameters with time. Methodology is proposed to adapt the parameters using the step forward and backward selection schemes. However, in the validation both the forward and backward multiple parameter changing models failed. One problem is that the regression with time is not reliable since the trend may not be in a monotonic linear relationship with time. The second issue is that changing multiple parameters makes the selection process very complex which is time consuming and not effective in the validation period. As a result, two new concepts are explored. 
First, only one parameter is selected for adjustment while the other parameters are set as constant. Secondly, regression is made against climate condition instead of against time. It has been found that such a new approach is very effective and this non-stationary model worked very well both in the calibration and validation period. Although the catchment is specific in southwest England and the data are for only the summer period, the methodology proposed in this study is general and applicable to other catchments. We hope this study will stimulate the hydrological community to explore a variety of sites so that valuable experiences and knowledge could be gained to improve our understanding of such a complex modelling issue in climate change impact assessment.
Inter-model comparison of the landscape determinants of vector-borne disease: implications for epidemiological and entomological risk modeling.

PubMed

Lorenz, Alyson; Dhingra, Radhika; Chang, Howard H; Bisanzio, Donal; Liu, Yang; Remais, Justin V

2014-01-01

Extrapolating landscape regression models for use in assessing vector-borne disease risk and other applications requires thoughtful evaluation of fundamental model choice issues. To examine implications of such choices, an analysis was conducted to explore the extent to which disparate landscape models agree in their epidemiological and entomological risk predictions when extrapolated to new regions. Agreement between six literature-drawn landscape models was examined by comparing predicted county-level distributions of either Lyme disease or Ixodes scapularis vector using Spearman ranked correlation. AUC analyses and multinomial logistic regression were used to assess the ability of these extrapolated landscape models to predict observed national data. Three models based on measures of vegetation, habitat patch characteristics, and herbaceous landcover emerged as effective predictors of observed disease and vector distribution. An ensemble model containing these three models improved precision and predictive ability over individual models. A priori assessment of qualitative model characteristics effectively identified models that subsequently emerged as better predictors in quantitative analysis. Both a methodology for quantitative model comparison and a checklist for qualitative assessment of candidate models for extrapolation are provided; both tools aim to improve collaboration between those producing models and those interested in applying them to new areas and research questions.
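The agreement measure named in this abstract, Spearman rank correlation, can be sketched in plain Python as the Pearson correlation of average ranks. The two lists of county-level predictions below are hypothetical values invented for illustration; they are not outputs of the landscape models compared in the study.

```python
def average_ranks(xs):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical risk predictions from two models for the same five counties.
model_a = [0.10, 0.40, 0.35, 0.80, 0.65]
model_b = [1.0, 3.0, 5.0, 9.0, 7.0]
rho = spearman(model_a, model_b)
print(round(rho, 3))
```

Working on ranks means the two models can use entirely different output scales and still be compared, which suits models drawn from separate publications.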
Disconcordance in Statistical Models of Bisphenol A and Chronic Disease Outcomes in NHANES 2003-08

PubMed Central

Casey, Martin F.; Neidell, Matthew

2013-01-01

Background: Bisphenol A (BPA), a high production chemical commonly found in plastics, has drawn great attention from researchers due to the substance’s potential toxicity. Using data from three National Health and Nutrition Examination Survey (NHANES) cycles, we explored the consistency and robustness of BPA’s reported effects on coronary heart disease and diabetes. Methods and Findings: We report the use of three different statistical models in the analysis of BPA: (1) logistic regression, (2) log-linear regression, and (3) dose-response logistic regression. In each variation, confounders were added in six blocks to account for demographics, urinary creatinine, source of BPA exposure, healthy behaviours, and phthalate exposure. Results were sensitive to the variations in functional form of our statistical models, but no single model yielded consistent results across NHANES cycles. Reported ORs were also found to be sensitive to inclusion/exclusion criteria. Further, observed effects, which were most pronounced in NHANES 2003-04, could not be explained away by confounding. Conclusions: Limitations in the NHANES data and a poor understanding of the mode of action of BPA have made it difficult to develop informative statistical models. Given the sensitivity of effect estimates to functional form, researchers should report results using multiple specifications with different assumptions about BPA measurement, thus allowing for the identification of potential discrepancies in the data. PMID:24223205
Multiple Regression Analysis of mRNA-miRNA Associations in Colorectal Cancer Pathway

PubMed Central

Wang, Fengfeng; Wong, S. C. Cesar; Chan, Lawrence W. C.; Cho, William C. S.; Yip, S. P.; Yung, Benjamin Y. M.

2014-01-01

Background. MicroRNA (miRNA) is a short and endogenous RNA molecule that regulates posttranscriptional gene expression. It is an important factor for tumorigenesis of colorectal cancer (CRC), and a potential biomarker for diagnosis, prognosis, and therapy of CRC. Our objective is to identify the related miRNAs and their associations with genes frequently involved in CRC microsatellite instability (MSI) and chromosomal instability (CIN) signaling pathways. Results. A regression model was adopted to identify the significantly associated miRNAs targeting a set of candidate genes frequently involved in colorectal cancer MSI and CIN pathways. Multiple linear regression analysis was used to construct the model and find the significant mRNA-miRNA associations. We identified three significantly associated mRNA-miRNA pairs: BCL2 was positively associated with miR-16 and SMAD4 was positively associated with miR-567 in the CRC tissue, while MSH6 was positively associated with miR-142-5p in the normal tissue. As for the whole model, BCL2 and SMAD4 models were not significant, and MSH6 model was significant. The significant associations were different in the normal and the CRC tissues. Conclusion. Our results have laid down a solid foundation in exploration of novel CRC mechanisms, and identification of miRNA roles as oncomirs or tumor suppressor mirs in CRC. PMID:24895601

Assessing the Relationships among Delinquent Male Students' Disruptive and Violent Behavior and Staff's Proactive and Reactive Behavior in a Secure Residential Treatment Center

ERIC Educational Resources Information Center

Rozalski, Michael; Drasgow, Erik; Drasgow, Fritz; Yell, Mitchell

2009-01-01

The purpose of this study was to examine the relationships among students' disruptive and violent behavior and staff's use of proactive and reactive strategies in a secure residential treatment center serving delinquent adolescent males. One hundred hours of observational data were collected, and linear regression models were used to explore the…
Evolution of the Marine Officer Fitness Report: A Multivariate Analysis

DTIC Science & Technology

This thesis explores the evaluation behavior of United States Marine Corps (USMC) Reporting Seniors (RSs) from 2010 to 2017. Using fitness report...RSs evaluate the performance of subordinate active component unrestricted officer MROs over time. I estimate logistic regression models of the...lowest. However, these correlations indicating the effects of race matching on FITREP evaluations narrow in significance when performance-based factors
America's Democracy Colleges: The Civic Engagement of Community College Students

ERIC Educational Resources Information Center

Angeli Newell, Mallory

2014-01-01

This study explored the civic engagement of current two- and four-year students to explore whether differences exist between the groups and what may explain the differences. Using binary logistic regression and Ordinary Least Squares regression it was found that community-based engagement was lower for two- than four-year students, though…
Validation of Statistical Models for Estimating Hospitalization Associated with Influenza and Other Respiratory Viruses

PubMed Central

Chan, King-Pan; Chan, Kwok-Hung; Wong, Wilfred Hing-Sang; Peiris, J. S. Malik; Wong, Chit-Ming

2011-01-01

Background Reliable estimates of disease burden associated with respiratory viruses are keys to deployment of preventive strategies such as vaccination and resource allocation. Such estimates are particularly needed in tropical and subtropical regions where some methods commonly used in temperate regions are not applicable. While a number of alternative approaches to assess the influenza associated disease burden have been recently reported, none of these models have been validated with virologically confirmed data. Even fewer methods have been developed for other common respiratory viruses such as respiratory syncytial virus (RSV), parainfluenza and adenovirus. Methods and Findings We had recently conducted a prospective population-based study of virologically confirmed hospitalization for acute respiratory illnesses in persons <18 years residing in Hong Kong Island. Here we used this dataset to validate two commonly used models for estimation of influenza disease burden, namely the rate difference model and Poisson regression model, and also explored the applicability of these models to estimate the disease burden of other respiratory viruses. The Poisson regression models with different link functions all yielded estimates well correlated with the virologically confirmed influenza associated hospitalization, especially in children older than two years. The disease burden estimates for RSV, parainfluenza and adenovirus were less reliable with wide confidence intervals. The rate difference model was not applicable to RSV, parainfluenza and adenovirus and grossly underestimated the true burden of influenza associated hospitalization. Conclusion The Poisson regression model generally produced satisfactory estimates in calculating the disease burden of respiratory viruses in a subtropical region such as Hong Kong. PMID:21412433
Development and Validation of an Instrument to Predict Functional Recovery in Tibial Fracture Patients: The Somatic Pre-Occupation and Coping (SPOC) Questionnaire

PubMed Central

Busse, Jason W.; Bhandari, Mohit; Guyatt, Gordon H.; Heels-Ansdell, Diane; Kulkarni, Abhaya V.; Mandel, Scott; Sanders, David; Schemitsch, Emil; Swiontkowski, Marc; Tornetta, Paul; Wai, Eugene; Walter, Stephen D.

2011-01-01

Objective To explore the role of patients’ beliefs in their likelihood of recovery from severe physical trauma. Methods We developed and validated an instrument designed to capture the impact of patients’ beliefs on functional recovery from injury; the Somatic Pre-occupation and Coping (SPOC) questionnaire. At 6-weeks post-surgical fixation, we administered the SPOC questionnaire to 359 consecutive patients with operatively managed tibial shaft fractures. We constructed multivariable regression models to explore the association between SPOC scores and functional outcome at 1-year, as measured by return to work and short form-36 (SF-36) physical component summary (PCS) and mental component summary (MCS) scores. Results In our adjusted multivariable regression models that included pre-injury SF-36 scores, SPOC scores at 6-weeks post-surgery accounted for 18% of the variation in SF-36 PCS scores and 18% of SF-36 MCS scores at 1-year. In both models, 6-week SPOC scores were a far more powerful predictor of functional recovery than age, gender, fracture type, smoking status, or the presence of multi-trauma. Our adjusted analysis found that for each 14 point increment in SPOC score at 6-weeks (14 chosen on the basis of half a standard deviation of the mean SPOC score) the odds of returning to work at 1-year decreased by 40% (odds ratio = 0.60; 95% CI = 0.50 to 0.73). Conclusion The SPOC questionnaire is a valid measurement of illness beliefs in tibial fracture patients and is highly predictive of their long-term functional recovery. Future research should explore if these results extend to other trauma populations and if modification of unhelpful illness beliefs is feasible and would result in improved functional outcomes. PMID:22011635
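The reported odds ratio of 0.60 per 14-point increment in SPOC score can be re-expressed for other increments, assuming the usual logistic model in which the log-odds are linear in the score. The sketch below shows that arithmetic only; the helper name is hypothetical, and the 1-point and 28-point figures are derived values, not results from the study.

```python
import math


def odds_ratio_for_increment(or_reported, reported_step, new_step):
    """Rescale an odds ratio to a different increment (log-linear assumption)."""
    beta_per_point = math.log(or_reported) / reported_step
    return math.exp(beta_per_point * new_step)


or_14 = 0.60                       # reported odds ratio per 14 points
percent_decrease_14 = (1 - or_14) * 100
or_1 = odds_ratio_for_increment(or_14, 14, 1)
or_28 = odds_ratio_for_increment(or_14, 14, 28)

print(f"14-point increment: odds fall by {percent_decrease_14:.0f}%")
print(f"1-point odds ratio:  {or_1:.4f}")
print(f"28-point odds ratio: {or_28:.2f}")
```

An odds ratio of 0.60 corresponds to the stated 40% decrease in the odds of returning to work; under the same assumption, two such increments multiply to 0.60 × 0.60 = 0.36.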
Time-series panel analysis (TSPA): multivariate modeling of temporal associations in psychotherapy process.

PubMed

Ramseyer, Fabian; Kupper, Zeno; Caspar, Franz; Znoj, Hansjörg; Tschacher, Wolfgang

2014-10-01

Processes occurring in the course of psychotherapy are characterized by the simple fact that they unfold in time and that the multiple factors engaged in change processes vary highly between individuals (idiographic phenomena). Previous research, however, has neglected the temporal perspective by its traditional focus on static phenomena, which were mainly assessed at the group level (nomothetic phenomena). To support a temporal approach, the authors introduce time-series panel analysis (TSPA), a statistical methodology explicitly focusing on the quantification of temporal, session-to-session aspects of change in psychotherapy. TSPA-models are initially built at the level of individuals and are subsequently aggregated at the group level, thus allowing the exploration of prototypical models. TSPA is based on vector auto-regression (VAR), an extension of univariate auto-regression models to multivariate time-series data. The application of TSPA is demonstrated in a sample of 87 outpatient psychotherapy patients who were monitored by postsession questionnaires. Prototypical mechanisms of change were derived from the aggregation of individual multivariate models of psychotherapy process. In a 2nd step, the associations between mechanisms of change (TSPA) and pre- to postsymptom change were explored. TSPA allowed a prototypical process pattern to be identified, where patient's alliance and self-efficacy were linked by a temporal feedback-loop. Furthermore, therapist's stability over time in both mastery and clarification interventions was positively associated with better outcomes. TSPA is a statistical tool that sheds new light on temporal mechanisms of change. Through this approach, clinicians may gain insight into prototypical patterns of change in psychotherapy. PsycINFO Database Record (c) 2014 APA, all rights reserved.
A comparison of time dependent Cox regression, pooled logistic regression and cross sectional pooling with simulations and an application to the Framingham Heart Study.

PubMed

Ngwa, Julius S; Cabral, Howard J; Cheng, Debbie M; Pencina, Michael J; Gagnon, David R; LaValley, Michael P; Cupples, L Adrienne

2016-11-03

Typical survival studies follow individuals to an event and measure explanatory variables for that event, sometimes repeatedly over the course of follow up. The Cox regression model has been used widely in the analyses of time to diagnosis or death from disease. The associations between the survival outcome and time dependent measures may be biased unless they are modeled appropriately. In this paper we explore the Time Dependent Cox Regression Model (TDCM), which quantifies the effect of repeated measures of covariates in the analysis of time to event data. This model is commonly used in biomedical research but sometimes does not explicitly adjust for the times at which time dependent explanatory variables are measured. This approach can yield different estimates of association compared to a model that adjusts for these times. In order to address the question of how different these estimates are from a statistical perspective, we compare the TDCM to Pooled Logistic Regression (PLR) and Cross Sectional Pooling (CSP), considering models that adjust and do not adjust for time in PLR and CSP. In a series of simulations we found that time adjusted CSP provided identical results to the TDCM while the PLR showed larger parameter estimates compared to the time adjusted CSP and the TDCM in scenarios with high event rates. We also observed upwardly biased estimates in the unadjusted CSP and unadjusted PLR methods. The time adjusted PLR had a positive bias in the time dependent Age effect with reduced bias when the event rate is low. The PLR methods showed a negative bias in the Sex effect, a subject level covariate, when compared to the other methods. The Cox models yielded reliable estimates for the Sex effect in all scenarios considered. We conclude that survival analyses that explicitly account in the statistical model for the times at which time dependent covariates are measured provide more reliable estimates compared to unadjusted analyses. 
We present results from the Framingham Heart Study in which lipid measurements and myocardial infarction data events were collected over a period of 26 years.
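Pooled logistic regression is usually fitted to a person-period data set, in which each subject contributes one record per follow-up interval at risk, carrying the covariate value measured for that interval. The sketch below shows that restructuring step only, under assumed inputs; the function name, field names and example values are hypothetical and are not taken from the Framingham analysis.

```python
def person_period_rows(subject_id, interval_covariates, event_interval=None):
    """Expand one subject into one record per interval at risk.

    interval_covariates: covariate value measured at the start of each interval.
    event_interval: 0-based index of the interval in which the event occurred,
                    or None if the subject was censored without an event.
    """
    rows = []
    for k, value in enumerate(interval_covariates):
        event = 1 if event_interval == k else 0
        rows.append(
            {"id": subject_id, "interval": k, "covariate": value, "event": event}
        )
        if event:
            break  # no further intervals at risk after the event
    return rows


# Hypothetical subjects: lipid values at four exams.
with_event = person_period_rows("A", [210, 225, 240, 238], event_interval=2)
censored = person_period_rows("B", [180, 182, 179, 185])

for row in with_event + censored:
    print(row)
```

Including the `interval` field as a predictor corresponds to the time-adjusted variants discussed in the abstract; omitting it corresponds to the unadjusted ones.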
Reverse engineering model structures for soil and ecosystem respiration: the potential of gene expression programming

NASA Astrophysics Data System (ADS)

Ilie, Iulia; Dittrich, Peter; Carvalhais, Nuno; Jung, Martin; Heinemeyer, Andreas; Migliavacca, Mirco; Morison, James I. L.; Sippel, Sebastian; Subke, Jens-Arne; Wilkinson, Matthew; Mahecha, Miguel D.

2017-09-01

Accurate model representation of land-atmosphere carbon fluxes is essential for climate projections. However, the exact responses of carbon cycle processes to climatic drivers often remain uncertain. Presently, knowledge derived from experiments, complemented by a steadily evolving body of mechanistic theory, provides the main basis for developing such models. The strongly increasing availability of measurements may facilitate new ways of identifying suitable model structures using machine learning. Here, we explore the potential of gene expression programming (GEP) to derive relevant model formulations based solely on the signals present in data by automatically applying various mathematical transformations to potential predictors and repeatedly evolving the resulting model structures. In contrast to most other machine learning regression techniques, the GEP approach generates readable models that allow for prediction and possibly for interpretation. Our study is based on two cases: artificially generated data and real observations. Simulations based on artificial data show that GEP is successful in identifying prescribed functions, with the prediction capacity of the models comparable to four state-of-the-art machine learning methods (random forests, support vector machines, artificial neural networks, and kernel ridge regressions). Based on real observations we explore the responses of the different components of terrestrial respiration at an oak forest in south-eastern England. We find that the GEP-retrieved models are often better in prediction than some established respiration models. Based on their structures, we find previously unconsidered exponential dependencies of respiration on seasonal ecosystem carbon assimilation and water dynamics. We noticed that the GEP models are only partly portable across respiration components, the identification of a general terrestrial respiration model possibly prevented by equifinality issues. 
Overall, GEP is a promising tool for uncovering new model structures for terrestrial ecology in the data-rich era, complementing more traditional modelling approaches.
Shrinkage regression-based methods for microarray missing value imputation.

PubMed

Wang, Hsiuying; Chiu, Chia-Chun; Wu, Yi-Ching; Wu, Wei-Sheng

2013-01-01

Missing values commonly occur in the microarray data, which usually contain more than 5% missing values with up to 90% of genes affected. Inaccurate missing value estimation results in reducing the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods in many testing microarray datasets. To further improve the performances of the regression-based methods, we propose shrinkage regression-based methods. Our methods take the advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. Besides, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.
An open-access CMIP5 pattern library for temperature and precipitation: Description and methodology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lynch, Cary D.; Hartin, Corinne A.; Bond-Lamberty, Benjamin

Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squared regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90°N/S). Bias and mean errors between modeled and pattern predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5°C, but choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. As a result, this paper describes our library of least squared regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns.
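The two pattern-generation approaches named above can be illustrated for a single grid cell. This is a simplified sketch under stated assumptions, not the library's code: the regression pattern is taken as the least-squares slope of local change on global mean temperature change, the delta pattern as the local change between the first and last values divided by the global change, and all numbers are invented.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented series: global mean temperature change (°C) and the change in
# one high-latitude grid cell that warms about twice as fast.
global_change = [0.0, 0.5, 1.0, 1.5, 2.0]
local_change = [0.1, 1.1, 2.1, 3.1, 4.1]

# Regression pattern: local change per degree of global change.
regression_pattern, intercept = ols_slope_intercept(global_change, local_change)

# Delta pattern (simplified): end-minus-start local change per unit global change.
delta_pattern = (local_change[-1] - local_change[0]) / (
    global_change[-1] - global_change[0]
)

# Emulated local change for a scenario with 3.0 °C of global warming.
emulated_local = intercept + regression_pattern * 3.0
print(regression_pattern, delta_pattern, emulated_local)
```

With perfectly linear data the two patterns coincide; they differ when the local response is noisy or nonlinear, which is where the choice of method matters.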
An open-access CMIP5 pattern library for temperature and precipitation: Description and methodology

DOE PAGES

Lynch, Cary D.; Hartin, Corinne A.; Bond-Lamberty, Benjamin; ...

2017-05-15

Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squared regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90°N/S). Bias and mean errors between modeled and pattern predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5°C, but choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. As a result, this paper describes our library of least squared regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns.
Development of flood regressions and climate change scenarios to explore estimates of future peak flows

USGS Publications Warehouse

Burns, Douglas A.; Smith, Martyn J.; Freehafer, Douglas A.

2015-12-31

The application uses predictions of future annual precipitation from five climate models and two future greenhouse gas emissions scenarios and provides results that are averaged over three future periods—2025 to 2049, 2050 to 2074, and 2075 to 2099. Results are presented in ensemble form as the mean, median, maximum, and minimum values among the five climate models for each greenhouse gas emissions scenario and period. These predictions of future annual precipitation are substituted into either the precipitation variable or a water balance equation for runoff to calculate potential future peak flows. This application is intended to be used only as an exploratory tool because (1) the regression equations on which the application is based have not been adequately tested outside the range of the current climate and (2) forecasting future precipitation with climate models and downscaling these results to a fine spatial resolution have a high degree of uncertainty. This report includes a discussion of the assumptions, uncertainties, and appropriate use of this exploratory application.
Latent Class Models in action: bridging social capital & Internet usage.

PubMed

Neves, Barbara Barbosa; Fonseca, Jaime R S

2015-03-01

This paper explores how Latent Class Models (LCM) can be applied in social research, when the basic assumptions of regression models cannot be validated. We examine the usefulness of this method with data collected from a study on the relationship between bridging social capital and the Internet. Social capital is defined here as the resources that are potentially available in one's social ties. Bridging is a dimension of social capital, usually related to weak ties (acquaintances), and a source of instrumental resources such as information. The study surveyed a stratified random sample of 417 inhabitants of Lisbon, Portugal. We used LCM to create the variable bridging social capital, but also to estimate the relationship between bridging social capital and Internet usage when we encountered convergence problems with the logistic regression analysis. We conclude by showing a positive relationship between bridging and Internet usage, and by discussing the potential of LCM for social science research. Copyright © 2014 Elsevier Inc. All rights reserved.
From concepts, theory, and evidence of heterogeneity of treatment effects to methodological approaches: a primer.

PubMed

Willke, Richard J; Zheng, Zhiyuan; Subedi, Prasun; Althin, Rikard; Mullins, C Daniel

2012-12-13

Implicit in the growing interest in patient-centered outcomes research is a growing need for better evidence regarding how responses to a given intervention or treatment may vary across patients, referred to as heterogeneity of treatment effect (HTE). A variety of methods are available for exploring HTE, each associated with unique strengths and limitations. This paper reviews a selected set of methodological approaches to understanding HTE, focusing largely but not exclusively on their uses with randomized trial data. It is oriented for the "intermediate" outcomes researcher, who may already be familiar with some methods, but would value a systematic overview of both more and less familiar methods with attention to when and why they may be used. Drawing from the biomedical, statistical, epidemiological and econometrics literature, we describe the steps involved in choosing an HTE approach, focusing on whether the intent of the analysis is for exploratory, initial testing, or confirmatory testing purposes. We also map HTE methodological approaches to data considerations as well as the strengths and limitations of each approach. Methods reviewed include formal subgroup analysis, meta-analysis and meta-regression, various types of predictive risk modeling including classification and regression tree analysis, series of n-of-1 trials, latent growth and growth mixture models, quantile regression, and selected non-parametric methods. In addition to an overview of each HTE method, examples and references are provided for further reading. By guiding the selection of the methods and analysis, this review is meant to better enable outcomes researchers to understand and explore aspects of HTE in the context of patient-centered outcomes research.
Association between MRI structural features and cognitive measures in pediatric multiple sclerosis

NASA Astrophysics Data System (ADS)

Amoroso, N.; Bellotti, R.; Fanizzi, A.; Lombardi, A.; Monaco, A.; Liguori, M.; Margari, L.; Simone, M.; Viterbo, R. G.; Tangaro, S.

2017-09-01

Multiple sclerosis (MS) is an inflammatory and demyelinating disease associated with neurodegenerative processes that lead to brain structural changes. The disease affects mostly young adults, but 3-5% of cases have a pediatric onset (POMS). Magnetic Resonance Imaging (MRI) is generally used for diagnosis and follow-up in MS patients; however, the most common MRI measures (e.g. new or enlarging T2-weighted lesions, T1-weighted gadolinium-enhancing lesions) have often failed as surrogate markers of MS disability and progression. MS is clinically heterogeneous, with symptoms that can include both physical changes (such as visual loss or walking difficulties) and cognitive impairment. 30-50% of POMS patients experience prominent cognitive dysfunction. In order to investigate the association between cognitive measures and brain morphometry, in this work we present a fully automated pipeline for processing and analyzing MRI brain scans. Relevant anatomical structures are segmented with FreeSurfer, and statistical features are computed. We thus describe the data from 12 patients with early POMS (mean age at MRI: 15.5 ± 2.7 years) with a set of 181 structural features. The major cognitive abilities measured are verbal and visuo-spatial learning, expressive language and complex attention. Data were collected at the Department of Basic Sciences, Neurosciences and Sense Organs, University of Bari. Different regression models and parameter configurations are explored to assess the robustness of the results; in particular, Generalized Linear Models, Bayes Regression, Random Forests, Support Vector Regression and Artificial Neural Networks are discussed.
Exploring the Mechanisms of Ecological Land Change Based on the Spatial Autoregressive Model: A Case Study of the Poyang Lake Eco-Economic Zone, China

PubMed Central

Xie, Hualin; Liu, Zhifei; Wang, Peng; Liu, Guiying; Lu, Fucai

2013-01-01

Ecological land is one of the key resources and conditions for the survival of humans because it can provide ecosystem services and is particularly important to public health and safety. It is extremely valuable for effective ecological management to explore the evolution mechanisms of ecological land. Based on spatial statistical analyses, we explored the spatial disparities and primary potential drivers of ecological land change in the Poyang Lake Eco-economic Zone of China. The results demonstrated that the global Moran’s I value is 0.1646 during the 1990 to 2005 time period and indicated significant positive spatial correlation (p < 0.05). The results also imply that the clustering trend of ecological land changes weakened in the study area. Some potential driving forces were identified by applying the spatial autoregressive model in this study. The results demonstrated that the higher economic development level and industrialization rate were the main drivers for the faster change of ecological land in the study area. This study also tested the superiority of the spatial autoregressive model to study the mechanisms of ecological land change by comparing it with the traditional linear regressive model. PMID:24384778
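The global Moran's I statistic reported above measures how similar values are across neighbouring spatial units. A minimal sketch of the standard formula follows; the four-unit layout, the binary adjacency weights and the values are invented for illustration and have no connection to the Poyang Lake data.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij.

    I = (n / sum_ij w_ij) * sum_ij w_ij (x_i - mean)(x_j - mean)
                            / sum_i (x_i - mean)^2
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    sum_sq = sum(d * d for d in dev)
    return (n / w_total) * (cross / sum_sq)


# Four hypothetical units arranged in a line; neighbours share a border.
change = [1.0, 2.0, 3.0, 4.0]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
moran = morans_i(change, adjacency)
print(round(moran, 4))
```

Positive values indicate that neighbouring units tend to have similar values (clustering), values near zero suggest no spatial pattern, and negative values indicate that neighbours tend to differ.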
Assessing the impact of local meteorological variables on surface ozone in Hong Kong during 2000-2015 using quantile and multiple line regression models

NASA Astrophysics Data System (ADS)

Zhao, Wei; Fan, Shaojia; Guo, Hai; Gao, Bo; Sun, Jiaren; Chen, Laiguo

2016-11-01

The quantile regression (QR) method has been increasingly introduced to atmospheric environmental studies to explore the non-linear relationship between local meteorological conditions and ozone mixing ratios. In this study, we applied QR for the first time, together with multiple linear regression (MLR), to analyze the dominant meteorological parameters influencing the mean, 10th percentile, 90th percentile and 99th percentile of maximum daily 8-h average (MDA8) ozone concentrations in 2000-2015 in Hong Kong. The dominance analysis (DA) was used to assess the relative importance of meteorological variables in the regression models. Results showed that the MLR models worked better at suburban and rural sites than at urban sites, and worked better in winter than in summer. QR models performed better in summer for 99th and 90th percentiles and performed better in autumn and winter for 10th percentile. And QR models also performed better in suburban and rural areas for 10th percentile. The top 3 dominant variables associated with MDA8 ozone concentrations, changing with seasons and regions, were frequently associated with the six meteorological parameters: boundary layer height, humidity, wind direction, surface solar radiation, total cloud cover and sea level pressure. Temperature rarely became a significant variable in any season, which could partly explain the peak of monthly average ozone concentrations in October in Hong Kong. And we found the effect of solar radiation would be enhanced during extremely ozone pollution episodes (i.e., the 99th percentile). Finally, meteorological effects on MDA8 ozone had no significant changes before and after the 2010 Asian Games.
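Quantile regression differs from least-squares regression in the loss it minimizes: instead of squared error it uses the asymmetric "pinball" (check) loss, which penalizes under- and over-prediction with weights tau and 1 - tau. The sketch below shows this for the simplest case, an intercept-only model, where the minimizer is the sample quantile. The ozone values are invented and the helper names are hypothetical; a real analysis would add meteorological predictors and use a dedicated solver.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (check) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        residual = y - q
        total += tau * residual if residual >= 0 else (tau - 1) * residual
    return total / len(y_true)


def best_constant(y, tau):
    """Intercept-only quantile regression by search over observed values."""
    return min(y, key=lambda c: pinball_loss(y, [c] * len(y), tau))


# Invented MDA8 ozone values with one extreme episode.
ozone = [20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 120]

q50 = best_constant(ozone, 0.5)   # median
q90 = best_constant(ozone, 0.9)   # upper tail
mean = sum(ozone) / len(ozone)
print(q50, q90, mean)
```

The single extreme value pulls the mean above the median while leaving the median unchanged, which is why models for the 90th or 99th percentile can respond to different drivers than a model for the mean.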
Psychosocial factors influencing smokeless tobacco use by teen-age military dependents.

PubMed

Lee, S; Raker, T; Chisick, M C

1994-02-01

Using bivariate and logistic regression analysis, we explored psychosocial correlates of smokeless tobacco (SLT) use in a sample of 2,257 teenage military dependents. We built separate regression models for males and females to explain triers and users of SLT. Results show female and male triers share five factors regarding SLT use--parental and peer approval, trying smoking, relatives using SLT, and athletic team membership. Male trial of SLT was additionally associated with race, difficulty in purchasing SLT, relatives who smoke, current smoking, and belief that SLT can cause mouth cancer. Male use of SLT was associated with race, seeing a dentist regularly, SLT counseling by a dentist, parental approval, trying and current smoking, and grade level. In all models, trying smoking was the strongest explanatory variable. Relatives and peers exert considerable influence on SLT use. Few triers or users had received SLT counseling from their dentist despite high dental utilization rates.
Testing a model of research intention among U.K. clinical psychologists: a logistic regression analysis.

PubMed

Eke, Gemma; Holttum, Sue; Hayward, Mark

2012-03-01

Previous research highlights barriers to clinical psychologists conducting research, but has rarely examined U.K. clinical psychologists. The study investigated U.K. clinical psychologists' self-reported research output and tested part of a theoretical model of factors influencing their intention to conduct research. Questionnaires were mailed to 1,300 U.K. clinical psychologists. Three hundred and seventy-four questionnaires were returned (29% response-rate). This study replicated in a U.K. sample the finding that the modal number of publications was zero, highlighted in a number of U.K. and U.S. studies. Research intention was bimodally distributed, and logistic regression classified 78% of cases successfully. Outcome expectations, perceived behavioral control and normative beliefs mediated between research training environment and intention. Further research should explore how research is negotiated in clinical roles, and this issue should be incorporated into prequalification training. © 2012 Wiley Periodicals, Inc.
Self-efficacy and physical activity in adolescent and parent dyads.

PubMed

Rutkowski, Elaine M; Connelly, Cynthia D

2012-01-01

The study examined the relationships between self-efficacy and physical activity in adolescent and parent dyads. A cross-sectional, correlational design was used to explore the relationships among levels of parent physical activity, parent-adolescent self-efficacy, and adolescent physical activity. Descriptive and multivariate regression analyses were conducted in a purposive sample of 94 adolescent/parent dyads. Regression results indicated the overall model significantly predicted adolescent physical activity (R² = .20, adjusted R² = .14, F[5, 70] = 3.28, p = .01). Only one of the five predictor variables significantly contributed to the model: higher levels of adolescent self-efficacy were positively related to greater levels of adolescent physical activity (β = .29, p = .01). Practitioners are encouraged to examine the level of self-efficacy and physical activity in families in an effort to develop strategies that impact these areas and ultimately to mediate obesity-related challenges in families seeking care. © 2011, Wiley Periodicals, Inc.
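The fit statistics in this abstract are linked by standard formulas, which the following minimal sketch illustrates. It assumes the usual definitions of adjusted R² and the overall F statistic; the sample size n = 76 is inferred from the reported degrees of freedom (5 and 70) and is not stated in the abstract, and the function names are illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for n observations and k predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2: float, n: int, k: int) -> float:
    """Overall F statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Assumption: F[5, 70] implies k = 5 predictors and n = 70 + 5 + 1 = 76 cases.
n, k = 76, 5
print(round(adjusted_r2(0.20, n, k), 2))  # rounds to the reported .14
print(round(overall_f(0.20, n, k), 2))    # computed from the rounded R^2
```

Because the abstract reports R² rounded to two decimals, the F value recomputed this way is close to, but not identical with, the reported 3.28.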

Application of classification tree and logistic regression for the management and health intervention plans in a community-based study.

PubMed

Teng, Ju-Hsi; Lin, Kuan-Chia; Ho, Bin-Shenq

2007-10-01

A community-based aboriginal study was conducted and analysed to explore the application of classification tree and logistic regression. A total of 1066 aboriginal residents in Yilan County were screened during 2003-2004. The independent variables include demographic characteristics, physical examinations, geographic location, health behaviours, dietary habits and family history of hereditary diseases. Risk factors of cardiovascular diseases were selected as the dependent variables in further analysis. The completion rate for the health interview was 88.9%. The classification tree results show that if body mass index is higher than 25.72 kg/m² and age is above 51 years, the predicted probability of having ≥3 cardiovascular risk factors is 73.6%, in a group of 322 residents. If body mass index is higher than 26.35 kg/m² and the geographical latitude of the village is lower than 24°22.8', the predicted probability of having ≥4 cardiovascular risk factors is 60.8%, in a group of 74 residents. The logistic regression results indicate that body mass index, drinking habit and menopause are the top three significant independent variables. The classification tree model specifically shows the discrimination paths and interactions between the risk groups. The logistic regression model presents and analyses the statistically independent factors of cardiovascular risks. Applying both models to specific situations will provide a different angle for the design and management of future health intervention plans after a community-based study.
Tobit analysis of vehicle accident rates on interstate highways.

PubMed

Anastasopoulos, Panagiotis Ch; Tarko, Andrew P; Mannering, Fred L

2008-03-01

There has been an abundance of research that has used Poisson models and their variants (negative binomial and zero-inflated models) to improve our understanding of the factors that affect accident frequencies on roadway segments. This study explores the application of an alternate method, tobit regression, by viewing vehicle accident rates directly (instead of frequencies) as a continuous variable that is left-censored at zero. Using data from vehicle accidents on Indiana interstates, the estimation results show that many factors relating to pavement condition, roadway geometrics and traffic characteristics significantly affect vehicle accident rates.
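To make the idea of a variable that is left-censored at zero concrete, here is a minimal sketch of the standard tobit log-likelihood for a single predictor. It is not the authors' estimation code: the function and parameter names and the single-predictor form are assumptions made for illustration only.

```python
import math


def norm_logpdf(z: float) -> float:
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, x, beta0, beta1, sigma):
    """Log-likelihood of a one-predictor tobit model left-censored at zero."""
    total = 0.0
    for yi, xi in zip(y, x):
        mu = beta0 + beta1 * xi  # mean of the latent (uncensored) rate
        if yi <= 0.0:
            # Censored observation: probability that the latent rate is <= 0.
            total += math.log(norm_cdf(-mu / sigma))
        else:
            # Uncensored observation: ordinary normal density.
            total += norm_logpdf((yi - mu) / sigma) - math.log(sigma)
    return total


# Hypothetical accident rates and one hypothetical road covariate.
rates = [0.0, 0.0, 1.2, 2.5, 0.8]
covariate = [0.1, 0.3, 0.9, 1.6, 0.7]
print(tobit_loglik(rates, covariate, beta0=-0.2, beta1=1.5, sigma=0.6))
```

Maximizing this function over the coefficients and sigma yields the tobit estimates. Segments with zero observed accidents contribute through the censoring probability rather than being dropped or treated as exact zeros.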
The importance of molecular structures, endpoints' values, and predictivity parameters in QSAR research: QSAR analysis of a series of estrogen receptor binders.

PubMed

Li, Jiazhong; Gramatica, Paola

2010-11-01

Quantitative structure-activity relationship (QSAR) methodology aims to explore the relationship between molecular structures and experimental endpoints, producing a model for the prediction of new data; the predictive performance of the model must be checked by external validation. Clearly, the quality of the chemical structure information and experimental endpoints, as well as the statistical parameters used to verify the external predictivity, have a strong influence on QSAR model reliability. Here, we emphasize the importance of these three aspects by analyzing our models on estrogen receptor binders (Endocrine Disruptor Knowledge Base (EDKB) database). Endocrine disrupting chemicals, which mimic or antagonize endogenous hormones such as estrogens, are a hot topic in environmental and toxicological sciences. QSAR shows great value in predicting the estrogenic activity and exploring the interactions between the estrogen receptor and ligands. We have subjected our previously published model to additional external validation on new EDKB chemicals. Having found some errors in the 3D molecular conformations used, we developed a new model using the same data set with corrected structures, the same method (ordinary least squares regression, OLS) and DRAGON descriptors. The new model, based on some different descriptors, is more predictive on external prediction sets. Three different formulas to calculate the correlation coefficient for the external prediction set (Q²ext) were compared, and the results indicated that the new proposal of Consonni et al. gave more reasonable results, consistent with the conclusions from the regression line, Williams plot and root mean square error (RMSE) values.
Finally, the importance of reliable endpoint values has been highlighted by comparing the classification assignments of EDKB with those of another estrogen receptor binders database (METI): we found that 16.1% of the assignments for the common compounds were opposite (20 of 124 common compounds). In order to verify the real assignments for these inconsistent compounds, we predicted these samples, as a blind external set, by our regression models and compared the results with the two databases. The results indicated that most of the predictions were consistent with METI. Furthermore, we built a kNN classification model using the 104 consistent compounds to predict the inconsistent ones, and most of the predictions were also in agreement with the METI database.
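The abstract does not spell out the three external-validation formulas it compares. The sketch below assumes they are the three variants commonly written in the QSAR literature as Q²F1, Q²F2 and Q²F3 (the last being the one usually attributed to Consonni et al.); the function names and the toy numbers are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def _press(y_ext, y_pred):
    """Sum of squared prediction errors on the external set."""
    return sum((o - p) ** 2 for o, p in zip(y_ext, y_pred))


def q2_f1(y_ext, y_pred, y_train):
    """External deviations are measured from the training-set mean."""
    ybar_train = _mean(y_train)
    return 1.0 - _press(y_ext, y_pred) / sum((o - ybar_train) ** 2 for o in y_ext)


def q2_f2(y_ext, y_pred):
    """External deviations are measured from the external-set mean."""
    ybar_ext = _mean(y_ext)
    return 1.0 - _press(y_ext, y_pred) / sum((o - ybar_ext) ** 2 for o in y_ext)


def q2_f3(y_ext, y_pred, y_train):
    """Mean squared external error relative to the training-set variance."""
    ybar_train = _mean(y_train)
    ss_train = sum((o - ybar_train) ** 2 for o in y_train)
    return 1.0 - (_press(y_ext, y_pred) / len(y_ext)) / (ss_train / len(y_train))


# Hypothetical endpoint values, for illustration only.
y_train = [1.0, 2.0, 3.0]
y_ext = [2.0, 4.0]
y_pred = [2.5, 3.5]
print(q2_f1(y_ext, y_pred, y_train), q2_f2(y_ext, y_pred), q2_f3(y_ext, y_pred, y_train))
```

Under these definitions the third variant does not depend on how the external compounds happen to be spread around their own mean, which is one reason it can behave differently from the other two on small external sets.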
Network characteristics and patent value—Evidence from the Light-Emitting Diode industry

PubMed Central

Huang, Way-Ren; Hsieh, Chia-Jen; Chang, Ke-Chiun; Kiang, Yen-Jo; Yuan, Chien-Chung; Chu, Woei-Chyn

2017-01-01

This study proposes a different angle to social network analysis that evaluates patent value and explores its influencing factors using the network centrality and network position. This study utilizes a logistic regression model to explore the relationships in the LED industry between patent value and network centrality as measured from out-degree centrality, in-degree centrality, in-closeness centrality, and network position, which is measured from effect size. The empirical result shows that out-degree centrality and in-degree centrality have significant positive effects on patent value and that effect size has a significant negative effect on patent value. PMID:28817587
Estimating Biochemical Parameters of Tea (Camellia sinensis (L.)) Using Hyperspectral Techniques

NASA Astrophysics Data System (ADS)

Bian, M.; Skidmore, A. K.; Schlerf, M.; Liu, Y.; Wang, T.

2012-07-01

Tea (Camellia sinensis (L.)) is an important economic crop, and the market price of tea depends largely on its quality. This research aims to explore the potential of hyperspectral remote sensing for predicting the concentration of biochemical components, namely total tea polyphenols, as indicators of tea quality at canopy scale. Experiments were carried out for tea plants growing in the field and greenhouse. Partial least squares regression (PLSR), which has proven to be one of the most successful empirical approaches, was performed to establish the relationship between reflectance and biochemical concentration across six tea varieties in the field. Moreover, a novel integrated approach involving the successive projections algorithm as a band selection method and neural networks was developed and applied to detect the concentration of total tea polyphenols for one tea variety, in order to explore and model complex nonlinear relationships between independent (wavebands) and dependent (biochemicals) variables. The good prediction accuracies (r² > 0.8 and relative RMSEP < 10%) achieved for tea plants using both linear (partial least squares regression) and nonlinear (artificial neural networks) modelling approaches in this study demonstrate the feasibility of using airborne and spaceborne sensors to cover wide areas of tea plantation for in situ monitoring of tea quality cheaply and rapidly.
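The two accuracy measures quoted above can be computed as in the following sketch. It assumes that r² is the squared Pearson correlation between observed and predicted values and that relative RMSEP is the root mean square error of prediction divided by the mean observed value; the abstract does not define either measure, other normalizations (for example by the observed range) are also in use, and the sample values are hypothetical.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def relative_rmsep(observed, predicted):
    """RMSEP as a percentage of the mean observed value (assumed definition)."""
    return 100.0 * rmsep(observed, predicted) / (sum(observed) / len(observed))


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    mo = sum(observed) / len(observed)
    mp = sum(predicted) / len(predicted)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


# Hypothetical polyphenol concentrations (arbitrary units).
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(r_squared(observed, predicted), relative_rmsep(observed, predicted))
```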
Exploring models for the roles of health systems’ responsiveness and social determinants in explaining universal health coverage and health outcomes

PubMed Central

Bonsel, Gouke J.

2016-01-01

Background: Intersectoral perspectives of health are present in the rhetoric of the sustainable development goals. Yet its descriptions of systematic approaches for an intersectoral monitoring vision, joining determinants of health, and barriers or facilitators to accessing healthcare services are lacking. Objective: To explore models of associations between health outcomes and health service coverage, and health determinants and health systems responsiveness, and thereby to contribute to monitoring, analysis, and assessment approaches informed by an intersectoral vision of health. Design: The study is designed as a series of ecological, cross-country regression analyses, covering between 23 and 57 countries with dependent health variables concentrated on the years 2002–2003. Countries cover a range of development contexts. Health outcome and health service coverage dependent variables were derived from World Health Organization (WHO) information sources. Predictor variables representing determinants are derived from the WHO and World Bank databases; variables used for health systems’ responsiveness are derived from the WHO World Health Survey. Responsiveness is a measure of acceptability of health services to the population, complementing financial health protection. Results: Health determinants’ indicators – access to improved drinking sources, accountability, and average years of schooling – were statistically significant in particular health outcome regressions. Statistically significant coefficients were more common for mortality rate regressions than for coverage rate regressions. Responsiveness was systematically associated with poorer health and health service coverage. With respect to levels of inequality in health, the indicator of responsiveness problems experienced by the unhealthy poor groups in the population was statistically significant for regressions on measles vaccination inequalities between rich and poor.
For the broader determinants, the Gini mattered most for inequalities in child mortality; education mattered more for inequalities in births attended by skilled personnel. Conclusions: This paper adds to the literature on comparative health systems research. National and international health monitoring frameworks need to incorporate indicators on trends in and impacts of other policy sectors on health. This will empower the health sector to carry out public health practices that promote health and health equity. PMID:26942516
Exploring models for the roles of health systems' responsiveness and social determinants in explaining universal health coverage and health outcomes.

PubMed

Valentine, Nicole Britt; Bonsel, Gouke J

2016-01-01

Intersectoral perspectives of health are present in the rhetoric of the sustainable development goals. Yet its descriptions of systematic approaches for an intersectoral monitoring vision, joining determinants of health, and barriers or facilitators to accessing healthcare services are lacking. The objective was to explore models of associations between health outcomes and health service coverage, and health determinants and health systems responsiveness, and thereby to contribute to monitoring, analysis, and assessment approaches informed by an intersectoral vision of health. The study is designed as a series of ecological, cross-country regression analyses, covering between 23 and 57 countries with dependent health variables concentrated on the years 2002-2003. Countries cover a range of development contexts. Health outcome and health service coverage dependent variables were derived from World Health Organization (WHO) information sources. Predictor variables representing determinants are derived from the WHO and World Bank databases; variables used for health systems' responsiveness are derived from the WHO World Health Survey. Responsiveness is a measure of acceptability of health services to the population, complementing financial health protection. Health determinants' indicators - access to improved drinking sources, accountability, and average years of schooling - were statistically significant in particular health outcome regressions. Statistically significant coefficients were more common for mortality rate regressions than for coverage rate regressions. Responsiveness was systematically associated with poorer health and health service coverage. With respect to levels of inequality in health, the indicator of responsiveness problems experienced by the unhealthy poor groups in the population was statistically significant for regressions on measles vaccination inequalities between rich and poor.
For the broader determinants, the Gini mattered most for inequalities in child mortality; education mattered more for inequalities in births attended by skilled personnel. This paper adds to the literature on comparative health systems research. National and international health monitoring frameworks need to incorporate indicators on trends in and impacts of other policy sectors on health. This will empower the health sector to carry out public health practices that promote health and health equity.
Does money matter in inflation forecasting?

NASA Astrophysics Data System (ADS)

Binner, J. M.; Tino, P.; Tepper, J.; Anderson, R.; Jones, B.; Kendall, G.

2010-11-01

This paper provides the most comprehensive evidence to date on whether or not monetary aggregates are valuable for forecasting US inflation in the early to mid 2000s. We explore a wide range of different definitions of money, including different methods of aggregation and different collections of included monetary assets. In our forecasting experiment we use two nonlinear techniques, namely, recurrent neural networks and kernel recursive least squares regression, techniques that are new to macroeconomics. Recurrent neural networks operate with potentially unbounded input memory, while the kernel regression technique is a finite memory predictor. The two methodologies compete to find the best fitting US inflation forecasting models and are then compared to forecasts from a naïve random walk model. The best models were nonlinear autoregressive models based on kernel methods. Our findings do not provide much support for the usefulness of monetary aggregates in forecasting inflation. Beyond its economic findings, our study is in the tradition of physicists’ long-standing interest in the interconnections among statistical mechanics, neural networks, and related nonparametric statistical methods, and suggests potential avenues of extension for such studies.
Blood oxygen level dependent magnetic resonance imaging for detecting pathological patterns in lupus nephritis patients: a preliminary study using a decision tree model.

PubMed

Shi, Huilan; Jia, Junya; Li, Dong; Wei, Li; Shang, Wenya; Zheng, Zhenfeng

2018-02-09

Precise renal histopathological diagnosis will guide therapy strategy in patients with lupus nephritis. Blood oxygen level dependent (BOLD) magnetic resonance imaging (MRI) has been an applicable noninvasive technique in renal disease. The current study was performed to explore whether BOLD MRI could contribute to diagnosing renal pathological patterns. Adult patients with a renal pathological diagnosis of lupus nephritis were recruited for this study. Renal biopsy tissues were assessed based on the lupus nephritis ISN/RPS 2003 classification. BOLD MRI was used to obtain the functional magnetic resonance parameter, R2* values. Several functions of R2* values were calculated and used to construct algorithmic models for renal pathological patterns. In addition, the algorithmic models were compared as to their diagnostic capability. Both histopathology and BOLD MRI were used to examine a total of twelve patients. Renal pathological patterns included five class III cases (including 3 as class III + V) and seven class IV cases (including 4 as class IV + V). Three algorithmic models, including decision tree, linear discriminant, and logistic regression, were constructed to distinguish the renal pathological patterns of class III and class IV. The sensitivity of the decision tree model was better than that of the linear discriminant model (71.87% vs 59.48%, P < 0.001) and inferior to that of the logistic regression model (71.87% vs 78.71%, P < 0.001). The specificity of the decision tree model was equivalent to that of the linear discriminant model (63.87% vs 63.73%, P = 0.939) and higher than that of the logistic regression model (63.87% vs 38.0%, P < 0.001). The area under the ROC curve (AUROCC) of the decision tree model was greater than that of the linear discriminant model (0.765 vs 0.629, P < 0.001) and the logistic regression model (0.765 vs 0.662, P < 0.001).
BOLD MRI is a useful non-invasive imaging technique for the evaluation of lupus nephritis. Decision tree models constructed using functions of R2* values may facilitate the prediction of renal pathological patterns.
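For readers comparing the sensitivity and specificity figures above, the sketch below shows how the two measures are defined for a binary classifier. The labels (1 for class IV, 0 for class III) and the predictions are hypothetical and do not reproduce the study's data or its resampling procedure, which the abstract does not describe.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example with seven class IV (1) and five class III (0) cases.
y_true = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

A model can raise one measure at the expense of the other, which is why the abstract also reports the area under the ROC curve as a threshold-independent summary.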
A binary genetic programming model for teleconnection identification between global sea surface temperature and local maximum monthly rainfall events

NASA Astrophysics Data System (ADS)

Danandeh Mehr, Ali; Nourani, Vahid; Hrnjica, Bahrudin; Molajou, Amir

2017-12-01

The effectiveness of genetic programming (GP) for solving regression problems in hydrology has been recognized in recent studies. However, its capability to solve classification problems has not been sufficiently explored so far. This study develops and applies a novel classification-forecasting model, namely Binary GP (BGP), for teleconnection studies between sea surface temperature (SST) variations and maximum monthly rainfall (MMR) events. The BGP integrates certain types of data pre-processing and post-processing methods with a conventional GP engine to enhance its ability to solve both regression and classification problems simultaneously. The model was trained and tested using SST series of the Black Sea, Mediterranean Sea, and Red Sea as potential predictors and classified MMR events at two locations in Iran as the predictand. The skill of the model was measured with regard to different rainfall thresholds and SST lags and compared to that of the hybrid decision tree-association rule (DTAR) model available in the literature. The results indicated that the proposed model can identify potential teleconnection signals of surrounding seas beneficial to long-term forecasting of the occurrence of the classified MMR events.
Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

PubMed Central

Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

2013-01-01

Background: Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective: We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design: Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results: At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions: Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
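Quantile regression, as used above, targets a chosen quantile of the outcome rather than its mean by minimizing the check (pinball) loss. The sketch below shows only that loss function for a quantile level tau; it is a generic illustration under that standard definition, not the additive model fitted in the study, and the Z-score values are hypothetical.

```python
def pinball_loss(y, pred, tau):
    """Mean check (pinball) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for yi, pi in zip(y, pred):
        residual = yi - pi
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y)


# Hypothetical height-for-age Z-scores and a constant candidate prediction.
z_scores = [-2.5, -1.8, -1.0, -0.4, 0.3]
candidate = [-1.5] * len(z_scores)
print(pinball_loss(z_scores, candidate, tau=0.35))
```

With tau = 0.35, over-predictions cost more than under-predictions of the same size, which pulls the fitted value toward the lower part of the height-for-age distribution.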
Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

PubMed

Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A

2013-01-01

Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.
4D-LQTA-QSAR and docking study on potent Gram-negative specific LpxC inhibitors: a comparison to CoMFA modeling.

PubMed

Ghasemi, Jahan B; Safavi-Sohi, Reihaneh; Barbosa, Euzébio G

2012-02-01

A quasi 4D-QSAR has been carried out on a series of potent Gram-negative LpxC inhibitors. This approach makes use of the molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. This new methodology is based on the generation of a conformational ensemble profile (CEP) for each compound instead of only one conformation, followed by the calculation of intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from MD simulations. These interaction energies are the independent variables employed in a QSAR analysis. The proposed methodology was compared to the comparative molecular field analysis (CoMFA) formalism. This methodology jointly explores the main features of CoMFA and 4D-QSAR models. Step-wise multiple linear regression was used for the selection of the most informative variables. After variable selection, multiple linear regression (MLR) and partial least squares (PLS) methods were used for building the regression models. Leave-N-out cross-validation (LNO) and Y-randomization were performed in order to confirm the robustness of the model, in addition to analysis of the independent test set. The best models provided the following statistics: [Formula in text] (PLS) and [Formula in text] (MLR). A docking study with the CDOCKER algorithm was applied to investigate the major interactions in the protein-ligand complex. Visualization of the descriptors of the best model helps us to interpret the model from the chemical point of view, supporting the applicability of this new approach in rational drug design.
High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics

PubMed Central

Carvalho, Carlos M.; Chang, Jeffrey; Lucas, Joseph E.; Nevins, Joseph R.; Wang, Quanli; West, Mike

2010-01-01

We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived “factors” as representing biological “subpathway” structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple comparisons, as well as scalability of the methodology. The models include practically relevant non-Gaussian/nonparametric components for latent structure, underlying often quite complex non-Gaussianity in multivariate expression patterns. Model search and fitting are addressed through stochastic simulation and evolutionary stochastic search methods that are exemplified in the oncogenic pathway studies. Supplementary supporting material provides more details of the applications, as well as examples of the use of freely available software tools for implementing the methodology. PMID:21218139
Nitrogen dioxide concentrations in neighborhoods adjacent to a commercial airport: a land use regression modeling study

PubMed Central

2010-01-01

Background: There is growing concern in communities surrounding airports regarding the contribution of various emission sources (such as aircraft and ground support equipment) to nearby ambient concentrations. We used extensive monitoring of nitrogen dioxide (NO2) in neighborhoods surrounding T.F. Green Airport in Warwick, RI, and land-use regression (LUR) modeling techniques to determine the impact of proximity to the airport and local traffic on these concentrations. Methods: Palmes diffusion tube samplers were deployed along the airport's fence line and within surrounding neighborhoods for one to two weeks. In total, 644 measurements were collected over three sampling campaigns (October 2007, March 2008 and June 2008) and each sampling location was geocoded. GIS-based variables were created as proxies for local traffic and airport activity. A forward stepwise regression methodology was employed to create general linear models (GLMs) of NO2 variability near the airport. The effect of local meteorology on associations with GIS-based variables was also explored. Results: Higher concentrations of NO2 were seen near the airport terminal, entrance roads to the terminal, and near major roads, with qualitatively consistent spatial patterns between seasons. In our final multivariate model (R2 = 0.32), the local influences of highways and arterial/collector roads were statistically significant, as were local traffic density and distance to the airport terminal (all p < 0.001). Local meteorology did not significantly affect associations with principal GIS variables, and the regression model structure was robust to various model-building approaches. Conclusion: Our study has shown that there are clear local variations in NO2 in the neighborhoods that surround an urban airport, which are spatially consistent across seasons.
LUR modeling demonstrated a strong influence of local traffic, except the smallest roads that predominate in residential areas, as well as proximity to the airport terminal. PMID:21083910
Nitrogen dioxide concentrations in neighborhoods adjacent to a commercial airport: a land use regression modeling study.

PubMed

Adamkiewicz, Gary; Hsu, Hsiao-Hsien; Vallarino, Jose; Melly, Steven J; Spengler, John D; Levy, Jonathan I

2010-11-17

There is growing concern in communities surrounding airports regarding the contribution of various emission sources (such as aircraft and ground support equipment) to nearby ambient concentrations. We used extensive monitoring of nitrogen dioxide (NO2) in neighborhoods surrounding T.F. Green Airport in Warwick, RI, and land-use regression (LUR) modeling techniques to determine the impact of proximity to the airport and local traffic on these concentrations. Palmes diffusion tube samplers were deployed along the airport's fence line and within surrounding neighborhoods for one to two weeks. In total, 644 measurements were collected over three sampling campaigns (October 2007, March 2008 and June 2008) and each sampling location was geocoded. GIS-based variables were created as proxies for local traffic and airport activity. A forward stepwise regression methodology was employed to create general linear models (GLMs) of NO2 variability near the airport. The effect of local meteorology on associations with GIS-based variables was also explored. Higher concentrations of NO2 were seen near the airport terminal, entrance roads to the terminal, and near major roads, with qualitatively consistent spatial patterns between seasons. In our final multivariate model (R2 = 0.32), the local influences of highways and arterial/collector roads were statistically significant, as were local traffic density and distance to the airport terminal (all p < 0.001). Local meteorology did not significantly affect associations with principal GIS variables, and the regression model structure was robust to various model-building approaches. Our study has shown that there are clear local variations in NO2 in the neighborhoods that surround an urban airport, which are spatially consistent across seasons. LUR modeling demonstrated a strong influence of local traffic, except the smallest roads that predominate in residential areas, as well as proximity to the airport terminal.
Exploring visuospatial abilities and their contribution to constructional abilities and nonverbal intelligence.

PubMed

Trojano, Luigi; Siciliano, Mattia; Cristinzio, Chiara; Grossi, Dario

2018-01-01

The present study aimed at exploring relationships among the visuospatial tasks included in the Battery for Visuospatial Abilities (BVA), and at assessing the relative contribution of different facets of visuospatial processing to tests tapping constructional abilities and nonverbal abstract reasoning. One hundred forty-four healthy subjects with a normal score on the Mini Mental State Examination completed the BVA plus Raven's Coloured Progressive Matrices and the Constructional Apraxia test. We used Principal Axis Factoring and Parallel Analysis to investigate relationships among the BVA visuospatial tasks, and performed regression analyses to assess the visuospatial contribution to constructional abilities and nonverbal abstract reasoning. Principal Axis Factoring and Parallel Analysis revealed two eigenvalues exceeding 1, accounting for about 60% of the variance. A 2-factor model provided the best fit. Factor 1 included subtests exploring "complex" visuospatial skills, whereas Factor 2 included two subtests tapping "simple" visuospatial skills. Regression analyses revealed that both Factor 1 and Factor 2 significantly affected performance on Raven's Coloured Progressive Matrices, whereas only Factor 1 affected performance on the Constructional Apraxia test. Our results supported the functional segregation proposed by De Renzi, suggested caution in relying on a single test to assess the visuospatial domain in clinical practice, and qualified the visuospatial contribution to drawing and nonverbal intelligence tests.
Linking in situ LAI and fine resolution remote sensing data to map reference LAI over cropland and grassland using geostatistical regression method

NASA Astrophysics Data System (ADS)

He, Yaqian; Bo, Yanchen; Chai, Leilei; Liu, Xiaolong; Li, Aihua

2016-08-01

Leaf Area Index (LAI) is an important parameter of vegetation structure. A number of moderate resolution LAI products have been produced to meet the urgent need for large-scale vegetation monitoring. High resolution LAI reference maps are necessary to validate these LAI products. This study used a geostatistical regression (GR) method to estimate LAI reference maps by linking in situ LAI and Landsat TM/ETM+ and SPOT-HRV data over two cropland and two grassland sites. To explore how the choice of vegetation index (VI) affects the estimated LAI reference maps, this study established GR models for different VIs, including the difference vegetation index (DVI), normalized difference vegetation index (NDVI), and ratio vegetation index (RVI). To further assess the performance of the GR model, the results from the GR and Reduced Major Axis (RMA) models were compared. The results show that the performance of the GR model varies between the cropland and grassland sites. At the cropland sites, the GR model based on DVI provides the best estimation, while at the grassland sites, the GR model based on DVI performs poorly. Compared to the RMA model, the GR model improves the accuracy of reference LAI maps in terms of root mean square error (RMSE) and bias.
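The three vegetation indices and the two accuracy measures named in this abstract have standard textbook definitions, sketched below in Python. The reflectance values are hypothetical and are not taken from the study; the geostatistical regression itself is not reproduced here.

```python
import math

def dvi(nir, red):
    """Difference vegetation index: NIR minus red reflectance."""
    return nir - red

def ndvi(nir, red):
    """Normalized difference vegetation index, bounded in [-1, 1]."""
    return (nir - red) / (nir + red)

def rvi(nir, red):
    """Ratio vegetation index: NIR divided by red reflectance."""
    return nir / red

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

def bias(predicted, observed):
    """Mean signed error (predicted minus observed)."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)

# Hypothetical surface reflectances for one pixel (illustration only).
nir, red = 0.5, 0.1
print(dvi(nir, red), ndvi(nir, red), rvi(nir, red))
```

DVI is unbounded and depends on absolute brightness, whereas NDVI and RVI are ratios; this difference is one plausible reason an index can behave differently across land-cover types.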
Exploring the Assessment of the DSM-5 Alternative Model for Personality Disorders With the Personality Assessment Inventory.

PubMed

Busch, Alexander J; Morey, Leslie C; Hopwood, Christopher J

2017-01-01

Section III of the Diagnostic and Statistical Manual of Mental Disorders (5th ed. [DSM-5]; American Psychiatric Association, 2013) contains an alternative model for the diagnosis of personality disorder involving the assessment of 25 traits and a global level of overall personality functioning. There is hope that this model will be increasingly used in clinical and research settings, and the ability to apply established instruments to assess these concepts could facilitate this process. This study sought to develop scoring algorithms for these alternative model concepts using scales from the Personality Assessment Inventory (PAI). A multiple regression strategy was used to predict scores in 2 undergraduate samples on DSM-5 alternative model instruments: the Personality Inventory for the DSM-5 (PID-5) and the General Personality Pathology scale (GPP; Morey et al., 2011). These regression functions resulted in scores that demonstrated promising convergent and discriminant validity across the alternative model concepts, as well as a factor structure in a cross-validation sample that was congruent with the putative structure of the alternative model traits. Results were linked to the PAI community normative data to provide normative information regarding these alternative model concepts that can be used to identify elevated traits and personality functioning level scores.
The relationship between severity of violence in the home and dating violence.

PubMed

Sims, Eva Nowakowski; Dodd, Virginia J Noland; Tejeda, Manuel J

2008-01-01

This study used propositions from the social learning theory to explore the effects of the combined influences of child maltreatment, childhood witness to parental violence, sibling violence, and gender on dating violence perpetration using a modified version of the Conflict Tactics Scale 2 (CTS2). A weighted scoring method was utilized to determine how severity of violence in the home impacts dating violence perpetration. Bivariate correlations and linear regression models indicate significant associations between child maltreatment, sibling violence perpetration, childhood witness to parental violence, gender, and subsequent dating violence perpetration. Multiple regression analyses indicate that for men, history of severe violence victimization (i.e., child maltreatment and childhood witness to parental violence) and severe perpetration (sibling violence) significantly predict dating violence perpetration.

[Gaussian process regression and its application in near-infrared spectroscopy analysis].

PubMed

Feng, Ai-Ming; Fang, Li-Min; Lin, Min

2011-06-01

Gaussian process (GP) is applied in the present paper as a chemometric method to explore the complicated relationship between the near infrared (NIR) spectra and ingredients. After the outliers were detected by the Monte Carlo cross validation (MCCV) method and removed from the dataset, different preprocessing methods, such as multiplicative scatter correction (MSC), smoothing and derivative, were tried for the best performance of the models. Furthermore, uninformative variable elimination (UVE) was introduced as a variable selection technique and the characteristic wavelengths obtained were further employed as input for modeling. A public dataset with 80 NIR spectra of corn was introduced as an example for evaluating the new algorithm. The optimal models for oil, starch and protein were obtained by the GP regression method. The performance of the final models was evaluated according to the root mean square error of calibration (RMSEC), root mean square error of cross-validation (RMSECV), root mean square error of prediction (RMSEP) and correlation coefficient (r). The models give good calibration ability with r values above 0.99 and the prediction ability is also satisfactory with r values higher than 0.96. The overall results demonstrate that the GP algorithm is an effective chemometric method and is promising for NIR analysis.
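Of the preprocessing steps listed in this abstract, multiplicative scatter correction is simple enough to sketch. In its usual form, each spectrum is regressed on the mean spectrum and then has the fitted offset removed and the fitted slope divided out. The Python sketch below follows that textbook form; the spectra are synthetic and the function names are illustrative, not the authors' code.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = a + b * x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    n_wavelengths = len(spectra[0])
    reference = [sum(s[k] for s in spectra) / len(spectra)
                 for k in range(n_wavelengths)]
    corrected = []
    for s in spectra:
        a, b = fit_line(reference, s)          # s ~ a + b * reference
        corrected.append([(v - a) / b for v in s])
    return reference, corrected

# Synthetic spectra: one base shape with different offsets and gains.
base = [0.2, 0.5, 0.9, 0.4, 0.3]
spectra = [[offset + gain * v for v in base]
           for offset, gain in [(0.0, 1.0), (0.1, 1.2), (-0.05, 0.8)]]
reference, corrected = msc(spectra)
print(corrected[1])
```

Because the synthetic spectra differ only by offset and gain, the corrected spectra all coincide with the mean spectrum; real spectra retain the chemical variation that the correction is meant to expose.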
Analysis on the adaptive countermeasures to ecological management under changing environment in the Tarim River Basin, China

NASA Astrophysics Data System (ADS)

Yang, Fan; Xue, Lianqing; Zhang, Luochen; Chen, Xinfang; Chi, Yixia

2017-12-01

This article aims to explore adaptive utilization strategies for the flow regime, as opposed to traditional practices, in the context of climate change and human activities in an arid area. The study presents a quantitative analysis of the contributions of climatic and anthropogenic factors to streamflow alteration in the Tarim River Basin (TRB) using the Budyko method, and develops adaptive utilization strategies for the eco-hydrological regime by comparing the applicability of the autoregressive moving average (ARMA) model and a combined regression model. Our results suggest that human activities played a dominant role in streamflow reduction in the mainstream, with a contribution of 120.7%~190.1%. In the headstreams, climatic variables were the primary determinant of streamflow, accounting for 56.5%~152.6% of the increase. The comparison revealed that the combined regression model performed better than the ARMA model, with a qualified rate of 80.49%~90.24%. Based on the forecasts of streamflow for different purposes, the adaptive utilization scheme of water flow is established from the perspective of time and space. Our study presents an effective water resources scheduling scheme for the ecological environment and provides references for ecological protection and water allocation in the arid area.
Beyond Reading Alone: The Relationship Between Aural Literacy And Asthma Management

PubMed Central

Rosenfeld, Lindsay; Rudd, Rima; Emmons, Karen M.; Acevedo-García, Dolores; Martin, Laurie; Buka, Stephen

2010-01-01

Objectives To examine the relationship between literacy and asthma management with a focus on the oral exchange. Methods Study participants, all of whom reported asthma, were drawn from the New England Family Study (NEFS), an examination of links between education and health. NEFS data included reading, oral (speaking), and aural (listening) literacy measures. An additional survey was conducted with this group of study participants related to asthma issues, particularly asthma management. Data analysis focused on bivariate and multivariable logistic regression. Results In bivariate logistic regression models exploring aural literacy, there was a statistically significant association between those participants with lower aural literacy skills and less successful asthma management (OR:4.37, 95%CI:1.11, 17.32). In multivariable logistic regression analyses, controlling for gender, income, and race in separate models (one-at-a-time), there remained a statistically significant association between those participants with lower aural literacy skills and less successful asthma management. Conclusion Lower aural literacy skills seem to complicate asthma management capabilities. Practice Implications Greater attention to the oral exchange, in particular the listening skills highlighted by aural literacy, as well as other related literacy skills may help us develop strategies for clear communication related to asthma management. PMID:20399060
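The reported odds ratio and interval can be checked for internal consistency. The Python sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which is the usual default for logistic regression; the abstract does not state how the interval was computed, so the derived standard error, z statistic and p-value are approximations, not figures from the study.

```python
import math

def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Back out log-scale quantities from an odds ratio and its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    # The geometric mean of the CI limits should reproduce the OR.
    ci_center = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)
    return {"se": se, "z": z, "p": p_two_sided, "ci_center": ci_center}

# Values reported in the abstract: OR 4.37, 95% CI 1.11 to 17.32.
summary = wald_summary(4.37, 1.11, 17.32)
print(summary)
```

Under that assumption the interval is centred close to the reported odds ratio, and its width reflects a large standard error on the log scale, consistent with a small sample.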
Modeling the spatio-temporal heterogeneity in the PM10-PM2.5 relationship

NASA Astrophysics Data System (ADS)

Chu, Hone-Jay; Huang, Bo; Lin, Chuan-Yao

2015-02-01

This paper explores the spatio-temporal patterns of particulate matter (PM) in Taiwan based on a series of methods. Using fuzzy c-means clustering first, the spatial heterogeneity (six clusters) in the PM data collected between 2005 and 2009 in Taiwan are identified and the industrial and urban areas of Taiwan (southwestern, west central, northwestern, and northern Taiwan) are found to have high PM concentrations. The PM10-PM2.5 relationship is then modeled with global ordinary least squares regression, geographically weighted regression (GWR), and geographically and temporally weighted regression (GTWR). The GTWR and GWR produce consistent results; however, GTWR provides more detailed information of spatio-temporal variations of the PM10-PM2.5 relationship. The results also show that GTWR provides a relatively high goodness of fit and sufficient space-time explanatory power. In particular, the PM2.5 or PM10 varies with time and space, depending on weather conditions and the spatial distribution of land use and emission patterns in local areas. Such information can be used to determine patterns of spatio-temporal heterogeneity in PM that will allow the control of pollutants and the reduction of public exposure.
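The core idea of geographically weighted regression is to refit a weighted least squares model at each location, with weights that decay with distance from it. The Python sketch below shows that idea for a single predictor with a Gaussian kernel. The station coordinates, concentrations and bandwidth are hypothetical, and the kernel is an assumption, since the abstract does not specify one.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian distance-decay kernel (an assumed choice)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(target, sites, x, y, bandwidth):
    """Weighted least squares fit y = a + b * x centred on `target`."""
    w = [gaussian_weight(math.dist(target, s), bandwidth) for s in sites]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical stations: a western group where PM2.5 = 0.4 * PM10
# and an eastern group where PM2.5 = 0.7 * PM10.
sites = [(float(i), 0.0) for i in range(5)] + [(float(i), 0.0) for i in range(10, 15)]
pm10 = [30.0, 45.0, 60.0, 40.0, 55.0, 35.0, 50.0, 65.0, 45.0, 60.0]
pm25 = [0.4 * v for v in pm10[:5]] + [0.7 * v for v in pm10[5:]]

_, slope_west = local_fit((2.0, 0.0), sites, pm10, pm25, bandwidth=1.5)
_, slope_east = local_fit((12.0, 0.0), sites, pm10, pm25, bandwidth=1.5)
print(slope_west, slope_east)
```

A geographically and temporally weighted variant would extend the distance in `gaussian_weight` with a time separation term, so that observations close in both space and time receive the largest weights.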
Teacher characteristics, social classroom relationships, and children's social, emotional, and behavioral classroom adjustment in special education.

PubMed

Breeman, L D; Wubbels, T; van Lier, P A C; Verhulst, F C; van der Ende, J; Maras, A; Hopman, J A B; Tick, N T

2015-02-01

The goal of this study was to explore relations between teacher characteristics (i.e., competence and wellbeing); social classroom relationships (i.e., teacher-child and peer interactions); and children's social, emotional, and behavioral classroom adjustment. These relations were explored at both the individual and classroom levels among 414 children with emotional and behavioral disorders placed in special education. Two models were specified. In the first model, children's classroom adjustment was regressed on social relationships and teacher characteristics. In the second model, reversed links were examined by regressing teacher characteristics on social relationships and children's adjustment. Results of model 1 showed that, at the individual level, better social and emotional adjustment of children was predicted by higher levels of teacher-child closeness and better behavioral adjustment was predicted by both positive teacher-child and peer interactions. At the classroom level, positive social relationships were predicted by higher levels of teacher competence, which in turn were associated with lower classroom levels of social problems. Higher levels of teacher wellbeing were directly associated with classroom adaptive and maladaptive child outcomes. Results of model 2 showed that, at the individual and classroom levels, only the emotional and behavioral problems of children predicted social classroom relationships. At the classroom level, teacher competence was best predicted by positive teacher-child relationships and teacher wellbeing was best predicted by classroom levels of prosocial behavior. We discuss the importance of positive teacher-child and peer interactions for children placed in special education and suggest ways of improving classroom processes by targeting teacher competence. Copyright © 2014 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Robust inference in the negative binomial regression model with an application to falls data.

PubMed

Aeberhard, William H; Cantoni, Eva; Heritier, Stephane

2014-12-01

A popular way to model overdispersed count data, such as the number of falls reported during intervention studies, is by means of the negative binomial (NB) distribution. Classical estimating methods are well-known to be sensitive to model misspecifications, taking the form of patients falling much more than expected in such intervention studies where the NB regression model is used. We extend in this article two approaches for building robust M-estimators of the regression parameters in the class of generalized linear models to the NB distribution. The first approach achieves robustness in the response by applying a bounded function on the Pearson residuals arising in the maximum likelihood estimating equations, while the second approach achieves robustness by bounding the unscaled deviance components. For both approaches, we explore different choices for the bounding functions. Through a unified notation, we show how close these approaches may actually be as long as the bounding functions are chosen and tuned appropriately, and provide the asymptotic distributions of the resulting estimators. Moreover, we introduce a robust weighted maximum likelihood estimator for the overdispersion parameter, specific to the NB distribution. Simulations under various settings show that redescending bounding functions yield estimates with smaller biases under contamination while keeping high efficiency at the assumed model, and this for both approaches. We present an application to a recent randomized controlled trial measuring the effectiveness of an exercise program at reducing the number of falls among people suffering from Parkinson's disease to illustrate the diagnostic use of such robust procedures and their need for reliable inference. © 2014, The International Biometric Society.
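The first approach described in this abstract bounds the Pearson residuals. The Python sketch below illustrates the ingredients only, not the authors' estimator. It assumes the NB2 parameterization with variance mu + alpha * mu^2, and it uses the Huber function as a monotone bound and the Tukey biweight as a redescending bound. The tuning constants are the conventional linear-regression defaults, not values from the paper, and the counts are hypothetical.

```python
import math

def nb_pearson_residual(y, mu, alpha):
    """Pearson residual under an assumed NB2 variance mu + alpha * mu**2."""
    return (y - mu) / math.sqrt(mu + alpha * mu ** 2)

def huber_psi(r, c=1.345):
    """Monotone bounding function: identity inside [-c, c], clipped outside."""
    return max(-c, min(c, r))

def tukey_psi(r, c=4.685):
    """Redescending bounding function: returns 0 once |r| exceeds c."""
    if abs(r) > c:
        return 0.0
    return r * (1 - (r / c) ** 2) ** 2

# Hypothetical fall counts with fitted mean 2 and overdispersion 0.5.
for y in (2, 5, 30):
    r = nb_pearson_residual(y, mu=2.0, alpha=0.5)
    print(y, round(r, 3), round(huber_psi(r), 3), round(tukey_psi(r), 3))
```

An extreme count such as 30 still contributes the capped value under the Huber bound but contributes nothing under the redescending bound, which is the intuition behind the smaller bias under contamination reported in the abstract.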
From concepts, theory, and evidence of heterogeneity of treatment effects to methodological approaches: a primer

PubMed Central

2012-01-01

Implicit in the growing interest in patient-centered outcomes research is a growing need for better evidence regarding how responses to a given intervention or treatment may vary across patients, referred to as heterogeneity of treatment effect (HTE). A variety of methods are available for exploring HTE, each associated with unique strengths and limitations. This paper reviews a selected set of methodological approaches to understanding HTE, focusing largely but not exclusively on their uses with randomized trial data. It is oriented for the “intermediate” outcomes researcher, who may already be familiar with some methods, but would value a systematic overview of both more and less familiar methods with attention to when and why they may be used. Drawing from the biomedical, statistical, epidemiological and econometrics literature, we describe the steps involved in choosing an HTE approach, focusing on whether the intent of the analysis is for exploratory, initial testing, or confirmatory testing purposes. We also map HTE methodological approaches to data considerations as well as the strengths and limitations of each approach. Methods reviewed include formal subgroup analysis, meta-analysis and meta-regression, various types of predictive risk modeling including classification and regression tree analysis, series of n-of-1 trials, latent growth and growth mixture models, quantile regression, and selected non-parametric methods. In addition to an overview of each HTE method, examples and references are provided for further reading. By guiding the selection of the methods and analysis, this review is meant to better enable outcomes researchers to understand and explore aspects of HTE in the context of patient-centered outcomes research. PMID:23234603
Reducing the Complexity of an Agent-Based Local Heroin Market Model

PubMed Central

Heard, Daniel; Bobashev, Georgiy V.; Morris, Robert J.

2014-01-01

This project explores techniques for reducing the complexity of an agent-based model (ABM). The analysis involved a model developed from the ethnographic research of Dr. Lee Hoffer in the Larimer area heroin market, which involved drug users, drug sellers, homeless individuals and police. The authors used statistical techniques to create a reduced version of the original model which maintained simulation fidelity while reducing computational complexity. This involved identifying key summary quantities of individual customer behavior as well as overall market activity and replacing some agents with probability distributions and regressions. The model was then extended to allow external market interventions in the form of police busts. Extensions of this research perspective, as well as its strengths and limitations, are discussed. PMID:25025132
Exploring precipitation pattern scaling methodologies and robustness among CMIP5 models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kravitz, Ben; Lynch, Cary; Hartin, Corinne

Pattern scaling is a well-established method for approximating modeled spatial distributions of changes in temperature by assuming a time-invariant pattern that scales with changes in global mean temperature. We compare two methods of pattern scaling for annual mean precipitation (regression and epoch difference) and evaluate which method is better in particular circumstances by quantifying their robustness to interpolation/extrapolation in time, inter-model variations, and inter-scenario variations. Both the regression and epoch-difference methods (the two most commonly used methods of pattern scaling) have good absolute performance in reconstructing the climate model output, measured as an area-weighted root mean square error. We decompose the precipitation response in the RCP8.5 scenario into a CO2 portion and a non-CO2 portion. Extrapolating RCP8.5 patterns to reconstruct precipitation change in the RCP2.6 scenario results in large errors due to violations of pattern scaling assumptions when this CO2-/non-CO2-forcing decomposition is applied. As a result, the methodologies discussed in this paper can help provide precipitation fields to be utilized in other models (including integrated assessment models or impacts assessment models) for a wide variety of scenarios of future climate change.
Exploring precipitation pattern scaling methodologies and robustness among CMIP5 models

DOE PAGES

Kravitz, Ben; Lynch, Cary; Hartin, Corinne; ...

2017-05-12

Pattern scaling is a well-established method for approximating modeled spatial distributions of changes in temperature by assuming a time-invariant pattern that scales with changes in global mean temperature. We compare two methods of pattern scaling for annual mean precipitation (regression and epoch difference) and evaluate which method is better in particular circumstances by quantifying their robustness to interpolation/extrapolation in time, inter-model variations, and inter-scenario variations. Both the regression and epoch-difference methods (the two most commonly used methods of pattern scaling) have good absolute performance in reconstructing the climate model output, measured as an area-weighted root mean square error. We decompose the precipitation response in the RCP8.5 scenario into a CO2 portion and a non-CO2 portion. Extrapolating RCP8.5 patterns to reconstruct precipitation change in the RCP2.6 scenario results in large errors due to violations of pattern scaling assumptions when this CO2-/non-CO2-forcing decomposition is applied. As a result, the methodologies discussed in this paper can help provide precipitation fields to be utilized in other models (including integrated assessment models or impacts assessment models) for a wide variety of scenarios of future climate change.
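The two pattern scaling methods compared in this record can be illustrated for a single grid cell. In the Python sketch below, the regression method takes the least squares slope of local precipitation change on global mean temperature change, and the epoch-difference method divides the change in local precipitation between two epochs by the corresponding change in global mean temperature. The time series, epoch length and units are hypothetical, and details such as fitting with an intercept are assumptions rather than the authors' exact procedure.

```python
def mean(values):
    return sum(values) / len(values)

def regression_pattern(global_dt, local_dp):
    """OLS slope of local precipitation change on global temperature change."""
    mt, mp = mean(global_dt), mean(local_dp)
    sxx = sum((t - mt) ** 2 for t in global_dt)
    sxy = sum((t - mt) * (p - mp) for t, p in zip(global_dt, local_dp))
    return sxy / sxx

def epoch_difference_pattern(global_dt, local_dp, epoch_len):
    """Change between the last and first epochs, per degree of warming."""
    d_t = mean(global_dt[-epoch_len:]) - mean(global_dt[:epoch_len])
    d_p = mean(local_dp[-epoch_len:]) - mean(local_dp[:epoch_len])
    return d_p / d_t

def reconstruct(pattern, global_dt):
    """Scale the time-invariant pattern by global mean temperature change."""
    return [pattern * t for t in global_dt]

# Hypothetical cell whose precipitation scales exactly with warming.
global_dt = [0.1 * k for k in range(30)]        # K
local_dp = [0.05 * t for t in global_dt]        # mm/day

p_reg = regression_pattern(global_dt, local_dp)
p_epoch = epoch_difference_pattern(global_dt, local_dp, epoch_len=10)
print(p_reg, p_epoch)
```

For an exactly linear response the two estimates coincide; they diverge when the response is noisy or nonlinear in temperature, which is where the robustness comparison described in the abstract becomes informative.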
[Prediction of schistosomiasis infection rates of population based on ARIMA-NARNN model].

PubMed

Ke-Wei, Wang; Yu, Wu; Jin-Ping, Li; Yu-Yu, Jiang

2016-07-12

To explore the effect of the autoregressive integrated moving average model-nonlinear auto-regressive neural network (ARIMA-NARNN) model on predicting schistosomiasis infection rates of population. The ARIMA model, NARNN model and ARIMA-NARNN model were established based on monthly schistosomiasis infection rates from January 2005 to February 2015 in Jiangsu Province, China. The fitting and prediction performances of the three models were compared. Compared to the ARIMA model and NARNN model, the mean square error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) of the ARIMA-NARNN model were the lowest, with values of 0.0111, 0.0900 and 0.2824, respectively. The ARIMA-NARNN model could effectively fit and predict schistosomiasis infection rates of population, which might have a great application value for the prevention and control of schistosomiasis.
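The three error measures used to compare the models have standard definitions, sketched below in Python. MAPE is returned as a fraction here; the abstract does not say whether its reported value is a fraction or a percentage. The series are hypothetical.

```python
def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be nonzero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical observed and forecast infection rates.
actual = [2.0, 4.0]
predicted = [1.0, 5.0]
print(mse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

MAPE weights errors relative to the observed value, so it can be large for low infection rates even when MSE and MAE are small.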
Acupuncture for musculoskeletal pain: A meta-analysis and meta-regression of sham-controlled randomized clinical trials

PubMed Central

Yuan, Qi-ling; Wang, Peng; Liu, Liang; Sun, Fu; Cai, Yong-song; Wu, Wen-tao; Ye, Mao-lin; Ma, Jiang-tao; Xu, Bang-bang; Zhang, Yin-gang

2016-01-01

The aims of this systematic review were to study the analgesic effect of real acupuncture and to explore whether sham acupuncture (SA) type is related to the estimated effect of real acupuncture for musculoskeletal pain. Five databases were searched. The outcome was pain or disability immediately (≤1 week) following an intervention. Standardized mean differences (SMDs) with 95% confidence intervals were calculated. Meta-regression was used to explore possible sources of heterogeneity. Sixty-three studies (6382 individuals) were included. Eight condition types were included. The pooled effect size was moderate for pain relief (59 trials, 4980 individuals, SMD −0.61, 95% CI −0.76 to −0.47; P < 0.001) and large for disability improvement (31 trials, 4876 individuals, −0.77, −1.05 to −0.49; P < 0.001). In a univariate meta-regression model, sham needle location and/or depth could explain most or all of the heterogeneity for some conditions (e.g., shoulder pain, low back pain, osteoarthritis, myofascial pain, and fibromyalgia); however, the interactions between subgroups via these covariates were not significant (P < 0.05). Our review provided low-quality evidence that real acupuncture has a moderate effect (approximately a 12-point reduction on the 100-mm visual analogue scale) on musculoskeletal pain. SA type did not appear to be related to the estimated effect of real acupuncture. PMID:27471137
Spatiotemporal analysis of the relationship between socioeconomic factors and stroke in the Portuguese mainland population under 65 years old.

PubMed

Oliveira, André; Cabral, António J R; Mendes, Jorge M; Martins, Maria R O; Cabral, Pedro

2015-11-04

Stroke risk has been shown to display varying patterns of geographic distribution not only amongst countries but also between regions of the same country. Although stroke is traditionally a disease of older persons, a global 25% increase in incidence was noticed between 1990 and 2010 in persons aged 20-≤64 years, particularly in low- and medium-income countries. Understanding spatial disparities in the association between socioeconomic factors and stroke is critical to target public health initiatives aiming to mitigate or prevent this disease, including in younger persons. We aimed to identify socioeconomic determinants of geographic disparities of stroke risk in people <65 years old, in municipalities of mainland Portugal, and the spatiotemporal variation of the association between these determinants and stroke risk during two study periods (1992-1996 and 2002-2006). Poisson and negative binomial global regression models were used to explore determinants of disease risk. Geographically weighted regression (GWR) represents a distinctive approach, allowing estimation of local regression coefficients. Models for both study periods were identified. Significant variables included education attainment, work hours per week and unemployment. Local Poisson GWR models achieved the best fit and evidenced spatially varying regression coefficients. Spatiotemporal inequalities were observed in significant variables, with dissimilarities between men and women. This study contributes to a better understanding of the relationship between stroke and socioeconomic factors in the population <65 years of age, an age group seldom analysed separately. It can thus help to improve the targeting of public health initiatives, all the more so in a context of economic crisis.
Exploring Individual and Structural Factors Associated with Employment Among Young Transgender Women of Color Using a No-Cost Transgender Legal Resource Center.

PubMed

Hill, Brandon J; Rosentel, Kris; Bak, Trevor; Silverman, Michael; Crosby, Richard; Salazar, Laura; Kipke, Michele

2017-01-01

The purpose of this study was to explore individual and structural factors associated with employment among young transgender women (TW) of color. Sixty-five trans women of color were recruited from the Transgender Legal Defense and Education Fund to complete a 30-min interviewer-assisted survey assessing sociodemographics, housing, workplace discrimination, job-seeking self-efficacy, self-esteem, perceived public passability, and transactional sex work. Logistic regression models revealed that stable housing (structural factor) and job-seeking self-efficacy (individual factor) were significantly associated with currently being employed. Our findings underscore the need for multilevel approaches to assist TW of color gain employment.
Understanding catastrophizing from a misdirected problem-solving perspective.

PubMed

Flink, Ida K; Boersma, Katja; MacDonald, Shane; Linton, Steven J

2012-05-01

The aim is to explore pain catastrophizing from a problem-solving perspective. The links between catastrophizing, problem framing, and problem-solving behaviour are examined through two possible models of mediation as inferred by two contemporary and complementary theoretical models, the misdirected problem solving model (Eccleston & Crombez, 2007) and the fear-anxiety-avoidance model (Asmundson, Norton, & Vlaeyen, 2004). In this prospective study, a general population sample (n= 173) with perceived problems with spinal pain filled out questionnaires twice; catastrophizing and problem framing were assessed on the first occasion and health care seeking (as a proxy for medically oriented problem solving) was assessed 7 months later. Two different approaches were used to explore whether the data supported any of the proposed models of mediation. First, multiple regressions were used according to traditional recommendations for mediation analyses. Second, a bootstrapping method (n= 1000 bootstrap resamples) was used to explore the significance of the indirect effects in both possible models of mediation. The results verified the concepts included in the misdirected problem solving model. However, the direction of the relations was more in line with the fear-anxiety-avoidance model. More specifically, the mediation analyses provided support for viewing catastrophizing as a mediator of the relation between biomedical problem framing and medically oriented problem-solving behaviour. These findings provide support for viewing catastrophizing from a problem-solving perspective and imply a need to examine and address problem framing and catastrophizing in back pain patients. ©2011 The British Psychological Society.
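The bootstrap test of an indirect effect can be sketched as follows. In this Python example, X stands for problem framing, M for catastrophizing and Y for health care seeking, following the mediation model the abstract supports. The indirect effect is the product of the X to M slope and the M to Y slope adjusted for X, and its confidence interval comes from percentile bootstrap resampling. The data are simulated, and the percentile interval is an assumption because the abstract does not say which bootstrap interval was used.

```python
import random

def slope(x, y):
    """OLS slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def partial_slope(x, m, y):
    """Coefficient of m in the OLS regression y ~ x + m."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    smm = sum((b - mm) ** 2 for b in m)
    sxm = sum((a - mx) * (b - mm) for a, b in zip(x, m))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    smy = sum((b - mm) * (c - my) for b, c in zip(m, y))
    return (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)

def indirect_effect(x, m, y):
    """Product of the X -> M path and the M -> Y path adjusted for X."""
    return slope(x, m) * partial_slope(x, m, y)

def bootstrap_ci(x, m, y, n_boot=1000, seed=0):
    """95% percentile bootstrap interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]

# Simulated data with a true indirect effect of 2 * 3 = 6.
rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(80)]
m = [2 * xi + rng.gauss(0, 1) for xi in x]
y = [3 * mi + 0.5 * xi + rng.gauss(0, 0.2) for xi, mi in zip(x, m)]

ci_low, ci_high = bootstrap_ci(x, m, y)
print(indirect_effect(x, m, y), ci_low, ci_high)
```

An interval that excludes zero is the usual criterion for a significant indirect effect; resampling avoids assuming that the product of two coefficients is normally distributed.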
Structural exploration for the refinement of anticancer matrix metalloproteinase-2 inhibitor designing approaches through robust validated multi-QSARs

NASA Astrophysics Data System (ADS)

Adhikari, Nilanjan; Amin, Sk. Abdul; Saha, Achintya; Jha, Tarun

2018-03-01

Matrix metalloproteinase-2 (MMP-2) is a promising pharmacological target for designing potential anticancer drugs. MMP-2 plays critical functions in apoptosis by cleaving the DNA repair enzyme poly (ADP-ribose) polymerase (PARP). Moreover, MMP-2 expression triggers the vascular endothelial growth factor (VEGF), which has a positive influence on tumor size, invasion, and angiogenesis. Therefore, there is an urgent need to develop potential MMP-2 inhibitors with no toxicity and better pharmacokinetic properties. In this article, robust validated multi-quantitative structure-activity relationship (QSAR) modeling approaches were attempted on a dataset of 222 MMP-2 inhibitors to explore the important structural and pharmacophoric requirements for higher MMP-2 inhibition. Different validated regression and classification-based QSARs, pharmacophore mapping and 3D-QSAR techniques were performed. These results were challenged and subjected to further validation by using them to explain 24 in-house MMP-2 inhibitors, to judge the reliability of these models further. All these models were individually validated internally as well as externally and were supported and validated by each other. These results were further justified by molecular docking analysis. The modeling techniques adopted here help not only to explore the necessary structural and pharmacophoric requirements but also to validate and refine the overall techniques for designing potential MMP-2 inhibitors.
[Exploration of influencing factors of price of herbal based on VAR model].

PubMed

Wang, Nuo; Liu, Shu-Zhen; Yang, Guang

2014-10-01

Based on a vector auto-regression (VAR) model, this paper uses the Granger causality test, variance decomposition and impulse response analysis to carry out a comprehensive study of the factors influencing the price of Chinese herbal medicines, including herbal cultivation costs, acreage, natural disasters, residents' needs and inflation. The study found Granger causality relationships between inflation and herbal prices, and between cultivation costs and herbal prices. In the variance decomposition of the Chinese herbal medicine price index, the largest contribution comes from its own fluctuations, followed by cultivation costs and inflation.
Social Support in Children With ADHD: An Exploration of Resilience.

PubMed

Mastoras, Sarah M; Saklofske, Donald H; Schwean, Vicki L; Climie, Emma A

2018-06-01

This study investigated the role of perceived social support in promoting emotional well-being among children with ADHD. Specifically, it examined how children with ADHD perceive support from key individuals in their lives and the relationships between this support and aspects of emotional well-being. Main versus buffering models of social support in the context of social preference status were also explored. Participants were 55 school-age children with ADHD-combined or hyperactive/impulsive (ADHD-C/HI). Parent and child ratings evaluated source-specific social support, social status, and aspects of self-concept, anxiety, and depression. Children with ADHD reported lower social support than normative samples. Social support had moderate positive associations with self-concept, with source-specific differences, but was not associated with internalizing symptoms. Regression models with social preference status supported a main effect model of perceived social support. Social support may provide a target for resilience-based interventions among children with ADHD in promoting their self-concept and well-being.
Mixed kernel function support vector regression for global sensitivity analysis

NASA Astrophysics Data System (ADS)

Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng

2017-11-01

Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Among the many sensitivity analysis methods in the literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. With the proposed derivation, estimates of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is composed of an orthogonal polynomial kernel function and a Gaussian radial basis kernel function, so it combines the global characteristics of the polynomial kernel with the local characteristics of the Gaussian radial basis kernel. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated on various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.
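For orientation, the quantity that the SVR meta-model is meant to estimate cheaply can also be approximated by brute-force Monte Carlo. The sketch below is not the MKF-SVR method of the abstract: it uses a standard pick-freeze estimator, and the linear test function `model` and all parameter values are hypothetical choices made only because their exact indices are known.

```python
import random

def model(x):
    # Hypothetical linear test function (not from the paper): Y = X1 + 2*X2 + 3*X3.
    return 1.0 * x[0] + 2.0 * x[1] + 3.0 * x[2]

def first_order_sobol(f, dim, n, seed=0):
    """Pick-freeze Monte Carlo estimate of first-order Sobol indices
    for independent inputs, each uniform on [0, 1]."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(dim)] for _ in range(n)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n)]
    ya = [f(row) for row in a]
    yb = [f(row) for row in b]
    mean = (sum(ya) + sum(yb)) / (2 * n)
    var = sum((y - mean) ** 2 for y in ya + yb) / (2 * n)
    indices = []
    for i in range(dim):
        acc = 0.0
        for j in range(n):
            mixed = list(a[j])
            mixed[i] = b[j][i]  # row of A with column i taken from B
            # Centring f(B) leaves the expectation unchanged and lowers the noise.
            acc += (yb[j] - mean) * (f(mixed) - ya[j])
        indices.append(acc / n / var)
    return indices

print(first_order_sobol(model, dim=3, n=20000, seed=1))
# Exact values for this test function are 1/14, 4/14 and 9/14.
```

The estimator needs n × (dim + 2) model evaluations, which is the cost a meta-model approach tries to avoid when each evaluation is expensive.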
Exploring the Influence of Nurse Work Environment and Patient Safety Culture on Attitudes Toward Incident Reporting.

PubMed

Yoo, Moon Sook; Kim, Kyoung Ja

2017-09-01

The aim of this study was to explore the influence of nurse work environments and patient safety culture on attitudes toward incident reporting. Patient safety culture is known to be a factor in incident reporting by nurses. A positive work environment could be an important influence on the safety behavior of nurses. A cross-sectional survey design was used. The structured questionnaire was administered to 191 nurses working at a tertiary university hospital in South Korea. Nurses' perceptions of work environment and patient safety culture were positively correlated with attitudes toward incident reporting. A regression model of attitudes toward incident reporting on clinical career, work area, nurse work environment, and patient safety culture was statistically significant. The model explained approximately 50.7% of the variance in attitudes toward incident reporting. Improving nurses' attitudes toward incident reporting can be achieved with a broad approach that includes improvements in work environment and patient safety culture.

An exploration of spatial patterns of seasonal diarrhoeal morbidity in Thailand.

PubMed

McCormick, B J J; Alonso, W J; Miller, M A

2012-07-01

Studies of temporal and spatial patterns of diarrhoeal disease can suggest putative aetiological agents and environmental or socioeconomic drivers. Here, the seasonal patterns of monthly acute diarrhoeal morbidity in Thailand, where diarrhoeal morbidity is increasing, are explored. Climatic data (2003-2006) and Thai Ministry of Health annual reports (2003-2009) were used to construct a spatially weighted panel regression model. Seasonal patterns of diarrhoeal disease were generally bimodal with aetiological agents peaking at different times of the year. There is a strong association between daily mean temperature and precipitation and the incidence of hospitalization due to acute diarrhoea in Thailand leading to a distinct spatial pattern in the seasonal pattern of diarrhoea. Model performance varied across the country in relation to per capita GDP and population density. While climatic factors are likely to drive the general pattern of diarrhoeal disease in Thailand, the seasonality of diarrhoeal disease is dampened in affluent urban populations.
Isotopic patterns in caps and stipes in sporocarps reveal patterns of organic nitrogen use by ectomycorrhizal fungi

NASA Astrophysics Data System (ADS)

Hobbie, Erik; Ouimette, Andrew; Chen, Janet

2016-04-01

Current ecosystem models use inorganic nitrogen as the currency of nitrogen acquisition by plants. However, many trees may gain access to otherwise unavailable soil resources, such as soil organic nitrogen, through their symbioses with ectomycorrhizal fungi, and this pathway of nitrogen acquisition may therefore be important in understanding plant responses to environmental change. Different functional groups of ectomycorrhizal fungi vary in their ability to enzymatically access protein and other soil resources. Such fungal parameters as hyphal hydrophobicity, the presence of rhizomorphs (long-distance transport structures), and exploration strategies (e.g., short-distance versus long-distance, mat formation) correspond with how fungi interact with and explore the environment, and the proportions of different exploration types present will shift along environmental gradients such as nitrogen availability. Isotopic differences between caps and stipes may provide a means to test for organic nitrogen use, since caps and stipes differ in δ13C and δ15N as a result of variable proportions of protein and other classes of compounds, and protein should differ isotopically among de novo synthesis, litter sources, and soil sources. Here, we propose that (1) isotopic differences between caps and stipes could be related to organic nitrogen uptake and to the δ13C and δ15N values of different pools of soil-derived or de novo-synthesized amino acids; (2) these isotopic differences will reflect greater acquisition of soil-derived organic nitrogen by exploration types of greater enzymatic capabilities to degrade recalcitrant nitrogen forms, specifically long-distance, medium-distance fringe, and medium-distance mat exploration types. To test these hypotheses, we use a dataset of isotopic values, %N, and %C in 208 cap/stipe samples collected from Oregon, western USA. 
δ13C differences in caps and stipes in a multiple regression model had an adjusted r2 of 0.292 (p < 0.0001), and were explained best by exploration type (45% of explained variance), the interaction of exploration type and %Ncap-stipe (20%), the interaction of exploration type and %Ncap/stipe (22%), %Ccap-stipe (8%), and %Ncap-stipe (5%). δ15N differences between caps and stipes in a multiple regression model had an adjusted r2 of 0.486 (p < 0.0001), and were explained best by exploration type (47% of explained variance), the interaction of exploration type and %Ncap-stipe (26%), the interaction of exploration type and %Ncap/stipe (14%), %Ncap/stipe (11%), and %Ccap-stipe (2%). We argue that these differences in the 13C and 15N enrichment of caps relative to stipes reflect not only shifts in the proportions of protein and carbohydrates, but also differences in the extent of fluxes and the δ13C and δ15N signatures of soil- and litter-derived organic nitrogen taken up by these fungi. We also propose equations to quantify this uptake. Organic nitrogen from litter (lower δ13C and δ15N) may be incorporated by medium-distance mat, short-distance, and contact exploration types of ectomycorrhizal fungi, whereas long-distance and medium-distance fringe exploration types appeared to incorporate deeper soil organic nitrogen.
The effect of access restrictions on the vintage of drugs used by Medicaid enrollees.

PubMed

Lichtenberg, Frank R

2005-01-01

To examine the extent to which recent Medicaid drug access restrictions, such as preferred drug lists (PDLs), may affect the vintage (or time since Food and Drug Administration approval) of 6 types of drugs used by Medicaid beneficiaries. Retrospective claims database analysis using National Drug Code pharmacy claims data. A regression model was developed to analyze the effect that Medicaid access restrictions had on the vintage of medications prescribed in 6 different therapeutic categories. A "difference in differences" approach was used to compare the change in vintage of medications prescribed in Medicaid versus non-Medicaid patients between the January-June 2001 and July-December 2003 study periods. The results of the regression model showed that PDLs increased the age of Medicaid prescriptions by less than 1 year for drugs in 5 of the 6 therapeutic classes analyzed. In the case of pain management medications, the increase was more than 1.2 years. The results of the regression model suggest that Medicaid drug access restriction programs (e.g., PDLs) have resulted in an increase in the age of drugs prescribed for Medicaid beneficiaries versus non-Medicaid patients. Since previous research has suggested a clinical and economic advantage to utilizing newer versus older drugs, further research should be conducted to explore how these medication restriction policies may unduly affect Medicaid beneficiaries compared with privately insured patients.
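The "difference in differences" comparison described above reduces to simple arithmetic on four group means. The sketch below illustrates only that arithmetic; the function name and the vintage values are hypothetical and are not the study's data. In a regression setting the same quantity is the coefficient on the interaction of the group indicator and the period indicator.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical mean drug vintages in years since approval (illustrative only).
effect = diff_in_diff(
    treated_pre=10.0,   # Medicaid, earlier period
    treated_post=11.5,  # Medicaid, later period
    control_pre=10.2,   # non-Medicaid, earlier period
    control_post=10.9,  # non-Medicaid, later period
)
print(round(effect, 3))
```

Subtracting the control group's change removes any trend shared by both groups, so the remainder is attributed to the policy under the usual parallel-trends assumption.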
Exploring the impact of different multi-level measures of physician communities in patient-centric care networks on healthcare outcomes: A multi-level regression approach.

PubMed

Uddin, Shahadat

2016-02-04

A patient-centric care network can be defined as a network among a group of healthcare professionals who provide treatments to common patients. Various multi-level attributes of the members of this network have substantial influence on its perceived level of performance. In order to assess the impact of different multi-level attributes of patient-centric care networks on healthcare outcomes, this study first captured patient-centric care networks for 85 hospitals using a health insurance claims dataset. From these networks, this study then constructed physician collaboration networks based on the concept of a patient-sharing network among physicians. A multi-level regression model was then developed to explore the impact of different attributes, organised at two levels, on hospitalisation cost and hospital length of stay. For the Level-1 model, the average visit per physician significantly predicted both hospitalisation cost and hospital length of stay. The number of different physicians significantly predicted only the hospitalisation cost, an effect that was significantly moderated by the age, gender and comorbidity score of patients. All Level-1 findings showed significant variance across physician collaboration networks with different community structure and density. These findings could be utilised as a reflective measure by healthcare decision makers. Moreover, healthcare managers could consider them in developing effective healthcare environments.
Cytotoxic Effects of the Therapeutic Radionuclide Rhenium-188 Combined with Taxanes in Human Prostate Carcinoma Cell Lines.

PubMed

Lange, Rogier; ter Heine, Rob; van Wieringen, Wessel N; Tromp, Adrienne M; Paap, Mayke; Bloemendal, Haiko J; de Klerk, John M H; Hendrikse, N Harry; Geldof, Albert A

2017-02-01

Rhenium-188-HEDP is an effective radiopharmaceutical for the treatment of painful bone metastases from prostate cancer. The effectiveness of the β-radiation emitted by 188Re might be enhanced by combination with chemotherapy, using the radiosensitization concept. Therefore, the authors investigated the combined treatment of the taxanes, docetaxel and cabazitaxel, with 188Re in prostate carcinoma cell lines. The cytotoxic effects of single and combined treatment with taxanes and 188Re were investigated in three human prostate carcinoma cell lines (PC-3, DU 145, and LNCaP), using the colony-forming assay. The half maximal effective concentration (EC50) of all individual agents was determined. The combined treatment was studied at 0.25, 0.5, 1, 2, and 4 times the EC50 of each agent. The interaction was investigated with a regression model. The survival curves showed dose-dependent cell growth inhibition for both the taxanes and 188Re. The regression model showed a good capability of explaining the data. It proved additivity in all combination experiments and confirmed a general trend to a slight subadditive effect. This proof-of-mechanism study exploring radiosensitization by combining 188Re and taxanes showed no synergism, but significant additivity. This encourages the design of in vivo studies. Future research should explore the potential added value of concomitant treatment of bone metastases with chemotherapy and 188Re-HEDP.
Psychopathology, psychopharmacological properties, decision-making capacity to consent to clinical research and the willingness to participate among long-term hospitalized patients with schizophrenia.

PubMed

Wu, Bo-Jian; Liao, Hsun-Yi; Chen, Hsing-Kang; Lan, Tsuo-Hung

2016-03-30

Many studies discuss factors related to the decision-making capacity to consent to clinical research (DMC) of patients with schizophrenia. However, these studies rarely addressed willingness to participate and the association between psychopharmacological properties (e.g., antipsychotic-induced side effects) and DMC. This study aimed to explore factors related to DMC and willingness to participate in patients with schizophrenia. All 139 patients with schizophrenia were assessed with the MacArthur Competence Assessment Tool for Clinical Research (MacCAT-CR) and other measures. A linear regression model was used to find the predictors of MacCAT-CR scores. A logistic regression model was used to explore the predictors of willingness to participate. Patients with more severe negative symptoms performed poorly in DMC outcomes. In addition, females, those with fewer years of education and those with reduced cognitive function were more likely to experience difficulties in decision-making. Forty-three subjects (30.4%) chose to participate. Patients with a higher level of positive symptoms, longer length of stay, higher burden of anticholinergics and users of atypical antipsychotics were more likely to participate in a clinical study which aimed to "enhance cognition". These findings suggest that research investigators should consider many variables for patients who require more intensive screening for impaired DMC. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.

PubMed

Haoliang Yuan; Yuan Yan Tang

2017-04-01

Classification of the pixels in a hyperspectral image (HSI) is an important task and has been widely applied in many practical applications. Its major challenge is the high-dimensional, small-sample-size problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by the original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed method outperforms many SL methods.
Predictors of quality of life: A quantitative investigation of the stress-coping model in children with asthma

PubMed Central

Peeters, Yvette; Boersma, Sandra N; Koopman, Hendrik M

2008-01-01

Background The aim of this study is to further explore predictors of health related quality of life in children with asthma using factors derived from the extended stress-coping model. While the stress-coping model has often been used as a frame of reference in studying health related quality of life in chronic illness, few have actually tested the model in children with asthma. Method In this survey study data were obtained by means of self-report questionnaires from seventy-eight children with asthma and their parents. Based on data derived from these questionnaires the constructs of the extended stress-coping model were assessed, using regression analysis and path analysis. Results The results of both regression analysis and path analysis reveal tentative support for the proposed relationships between predictors and health related quality of life in the stress-coping model. Moreover, as indicated in the stress-coping model, HRQoL is only directly predicted by coping. Both coping strategies 'emotional reaction' (significantly) and 'avoidance' are directly related to HRQoL. Conclusion In children with asthma, the extended stress-coping model appears to be a useful theoretical framework for understanding the impact of the illness on their quality of life. Consequently, the factors suggested by this model should be taken into account when designing optimal psychosocial-care interventions. PMID:18366753
Near infrared spectrometric technique for testing fruit quality: optimisation of regression models using genetic algorithms

NASA Astrophysics Data System (ADS)

Isingizwe Nturambirwe, J. Frédéric; Perold, Willem J.; Opara, Umezuruike L.

2016-02-01

Near infrared (NIR) spectroscopy has gained extensive use in quality evaluation. It is arguably one of the most advanced spectroscopic tools in non-destructive quality testing of food stuff, from measurement to data analysis and interpretation. NIR spectral data are interpreted through means often involving multivariate statistical analysis, sometimes associated with optimisation techniques for model improvement. The objective of this research was to explore the extent to which genetic algorithms (GA) can be used to enhance model development, for predicting fruit quality. Apple fruits were used, and NIR spectra in the range from 12000 to 4000 cm-1 were acquired on both bruised and healthy tissues, with different degrees of mechanical damage. GAs were used in combination with partial least squares regression methods to develop bruise severity prediction models, and compared to PLS models developed using the full NIR spectrum. A classification model was developed, which clearly separated bruised from unbruised apple tissue. GAs helped improve prediction models by over 10%, in comparison with full spectrum-based models, as evaluated in terms of error of prediction (Root Mean Square Error of Cross-validation). PLS models to predict internal quality, such as sugar content and acidity were developed and compared to the versions optimized by genetic algorithm. Overall, the results highlighted the potential use of GA method to improve speed and accuracy of fruit quality prediction.
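The evaluation criterion named above, the Root Mean Square Error of Cross-validation (RMSECV), can be sketched as follows. This is a simplified illustration, not the study's pipeline: ordinary least squares on a single predictor stands in for partial least squares on full spectra, and the fold assignment and toy data are assumptions.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def rmsecv(xs, ys, k=5):
    """k-fold RMSECV: every sample is predicted by a model that never saw it."""
    squared_error = 0.0
    for fold in range(k):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k != fold]
        held_out = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k == fold]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        squared_error += sum((y - (slope * x + intercept)) ** 2 for x, y in held_out)
    return math.sqrt(squared_error / len(xs))

# Toy data (hypothetical): a linear trend with a small alternating disturbance.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
print(rmsecv(xs, ys))
```

A variable-selection step such as a genetic algorithm would be scored by recomputing this value for each candidate subset of wavelengths and preferring subsets with a lower RMSECV.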
A non-linear data mining parameter selection algorithm for continuous variables

PubMed Central

Razavi, Marianne; Brady, Sean

2017-01-01

In this article, we propose a new data mining algorithm, by which one can both capture the non-linearity in data and also find the best subset model. To produce an enhanced subset of the original variables, a preferred selection method should have the potential of adding a supplementary level of regression analysis that would capture complex relationships in the data via mathematical transformation of the predictors and exploration of synergistic effects of combined variables. The method that we present here has the potential to produce an optimal subset of variables, rendering the overall process of model selection more efficient. This algorithm introduces interpretable parameters by transforming the original inputs and also a faithful fit to the data. The core objective of this paper is to introduce a new estimation technique for the classical least square regression framework. This new automatic variable transformation and model selection method could offer an optimal and stable model that minimizes the mean square error and variability, while combining all possible subset selection methodology with the inclusion variable transformations and interactions. Moreover, this method controls multicollinearity, leading to an optimal set of explanatory variables. PMID:29131829
Does attitude matter in computer use in Australian general practice? A zero-inflated Poisson regression analysis.

PubMed

Khan, Asaduzzaman; Western, Mark

The purpose of this study was to explore factors that facilitate or hinder effective use of computers in Australian general medical practice. This study is based on data extracted from a national telephone survey of 480 general practitioners (GPs) across Australia. Clinical functions performed by GPs using computers were examined using zero-inflated Poisson (ZIP) regression modelling. About 17% of GPs were not using a computer for any clinical function, while 18% reported using computers for all clinical functions. The ZIP model showed that computer anxiety was negatively associated with effective computer use, while practitioners' belief about the usefulness of computers was positively associated with effective computer use. Being a female GP or working in a partnership or group practice increased the odds of effectively using computers for clinical functions. To fully capitalise on the benefits of computer technology, GPs need to be convinced that this technology is useful and can make a difference.
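A zero-inflated Poisson distribution mixes a point mass at zero with an ordinary Poisson count, which is why it suits data where a sizeable share of respondents report no use at all. The sketch below shows only the probability mass function; the parameter values are hypothetical and are not estimates from the survey.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson model.

    With probability pi the observation is a structural zero;
    otherwise it is drawn from Poisson(lam).
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Hypothetical parameters: 15% structural zeros, mean of 4 for the count part.
lam, pi = 4.0, 0.15
print([round(zip_pmf(k, lam, pi), 4) for k in range(6)])
```

In a ZIP regression both `lam` and `pi` are linked to covariates, typically through a log link for the count part and a logit link for the zero-inflation part.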
First molecular modeling report on novel arylpyrimidine kynurenine monooxygenase inhibitors through multi-QSAR analysis against Huntington's disease: A proposal to chemists!

PubMed

Amin, Sk Abdul; Adhikari, Nilanjan; Jha, Tarun; Gayen, Shovanlal

2016-12-01

Huntington's disease (HD) is caused by mutation of huntingtin protein (mHtt) leading to neuronal cell death. The mHtt-induced toxicity can be rescued by inhibiting the kynurenine monooxygenase (KMO) enzyme. Therefore, KMO is a promising drug target for neurodegenerative disorders such as Huntington's disease. Fifty-six arylpyrimidine KMO inhibitors are structurally explored through regression and classification based multi-QSAR modeling, pharmacophore mapping and molecular docking approaches. Moreover, ten new compounds are proposed and validated through the modeling that may be effective in accelerating Huntington's disease drug discovery efforts. Copyright © 2016 Elsevier Ltd. All rights reserved.
Statistical-learning strategies generate only modestly performing predictive models for urinary symptoms following external beam radiotherapy of the prostate: A comparison of conventional and machine-learning methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yahya, Noorazrul, E-mail: noorazrul.yahya@research.uwa.edu.au; Ebert, Martin A.; Bulsara, Max

Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication-intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2 and longitudinal) with event rate between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized in endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance using sample size to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1 with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC>0.6 while all haematuria endpoints and longitudinal incontinence models produced AUROC<0.6.
Conclusions: Logistic regression and MARS were most likely to be the best-performing strategy for the prediction of urinary symptoms with elastic-net and random forest producing competitive results. The predictive power of the models was modest and endpoint-dependent. New features, including spatial dose maps, may be necessary to achieve better models.
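The comparison metric used in this record, AUROC, equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the scores and labels are made-up illustrative values, not study data.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted risks and observed outcomes (1 = symptom present).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(auroc(scores, labels))  # 0.75: three of the four positive/negative pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking, which puts the reported best AUROC of 0.649 in context as modest discrimination.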
Stepwise group sparse regression (SGSR): gene-set-based pharmacogenomic predictive models with stepwise selection of functional priors.

PubMed

Jang, In Sock; Dienstmann, Rodrigo; Margolin, Adam A; Guinney, Justin

2015-01-01

Complex mechanisms involving genomic aberrations in numerous proteins and pathways are believed to be a key cause of many diseases such as cancer. With recent advances in genomics, elucidating the molecular basis of cancer at a patient level is now feasible, and has led to personalized treatment strategies whereby a patient is treated according to his or her genomic profile. However, there is growing recognition that existing treatment modalities are overly simplistic, and do not fully account for the deep genomic complexity associated with sensitivity or resistance to cancer therapies. To overcome these limitations, large-scale pharmacogenomic screens of cancer cell lines--in conjunction with modern statistical learning approaches--have been used to explore the genetic underpinnings of drug response. While these analyses have demonstrated the ability to infer genetic predictors of compound sensitivity, to date most modeling approaches have been data-driven, i.e. they do not explicitly incorporate domain-specific knowledge (priors) in the process of learning a model. While a purely data-driven approach offers an unbiased perspective of the data--and may yield unexpected or novel insights--this strategy introduces challenges for both model interpretability and accuracy. In this study, we propose a novel prior-incorporated sparse regression model in which the choice of informative predictor sets is carried out by knowledge-driven priors (gene sets) in a stepwise fashion. Under regularization in a linear regression model, our algorithm is able to incorporate prior biological knowledge across the predictive variables thereby improving the interpretability of the final model with no loss--and often an improvement--in predictive performance. 
We evaluate the performance of our algorithm compared to well-known regularization methods such as LASSO, Ridge and Elastic net regression in the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (Sanger) pharmacogenomics datasets, demonstrating that incorporation of the biological priors selected by our model confers improved predictability and interpretability, despite much fewer predictors, over existing state-of-the-art methods.
Impacts of land use and population density on seasonal surface water quality using a modified geographically weighted regression.

PubMed

Chen, Qiang; Mei, Kun; Dahlgren, Randy A; Wang, Ting; Gong, Jian; Zhang, Minghua

2016-12-01

As an important regulator of pollutants in overland flow and interflow, land use has become an essential research component for determining the relationships between surface water quality and pollution sources. This study investigated the use of ordinary least squares (OLS) and geographically weighted regression (GWR) models to identify the impact of land use and population density on surface water quality in the Wen-Rui Tang River watershed of eastern China. A manual variable excluding-selecting method was explored to resolve multicollinearity issues. Standard regression coefficient analysis coupled with cluster analysis was introduced to determine which variable had the greatest influence on water quality. Results showed that: (1) Impact of land use on water quality varied with spatial and seasonal scales. Both positive and negative effects for certain land-use indicators were found in different subcatchments. (2) Urban land was the dominant factor influencing N, P and chemical oxygen demand (COD) in highly urbanized regions, but the relationship was weak as the pollutants were mainly from point sources. Agricultural land was the primary factor influencing N and P in suburban and rural areas; the relationship was strong as the pollutants were mainly from agricultural surface runoff. Subcatchments located in suburban areas were identified with urban land as the primary influencing factor during the wet season while agricultural land was identified as a more prevalent influencing factor during the dry season. (3) Adjusted R2 values in OLS models using the manual variable excluding-selecting method averaged 14.3% higher than using stepwise multiple linear regressions. However, the corresponding GWR models had adjusted R2 values ~59.2% higher than the optimal OLS models, confirming that GWR models demonstrated better prediction accuracy.
Based on our findings, water resource protection policies should consider site-specific land-use conditions within each watershed to optimize mitigation strategies for contrasting land-use characteristics and seasonal variations. Copyright © 2016 Elsevier B.V. All rights reserved.
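The core idea of geographically weighted regression is that coefficients are re-estimated at each location, with nearby observations weighted more heavily than distant ones. The sketch below assumes a Gaussian distance kernel, a single predictor and invented data; it does not reproduce the modified GWR or the variable-selection procedure of the study.

```python
import math

def gwr_local_fit(points, target, bandwidth):
    """Weighted least squares at one location.

    points: list of (u, v, x, y) with coordinates (u, v), predictor x, response y.
    Returns (intercept, slope) estimated for the location `target`.
    """
    u0, v0 = target
    weights = [
        math.exp(-0.5 * ((u - u0) ** 2 + (v - v0) ** 2) / bandwidth ** 2)
        for u, v, _, _ in points
    ]
    total = sum(weights)
    mx = sum(w * p[2] for w, p in zip(weights, points)) / total
    my = sum(w * p[3] for w, p in zip(weights, points)) / total
    sxx = sum(w * (p[2] - mx) ** 2 for w, p in zip(weights, points))
    sxy = sum(w * (p[2] - mx) * (p[3] - my) for w, p in zip(weights, points))
    slope = sxy / sxx
    return my - slope * mx, slope

# Invented example: the predictor has slope 1 in the west and slope 3 in the east.
west = [(0.0, 0.0, float(x), 1.0 * x) for x in range(5)]
east = [(10.0, 0.0, float(x), 3.0 * x) for x in range(5)]
print(gwr_local_fit(west + east, target=(0.0, 0.0), bandwidth=1.0))
print(gwr_local_fit(west + east, target=(10.0, 0.0), bandwidth=1.0))
```

A single global OLS fit on the same data would return one compromise slope, which is the kind of spatial variation the site-specific recommendation above is concerned with.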
Power of theta waves in the EEG of human subjects increases during recall of haptic information.

PubMed

Grunwald, M; Weiss, T; Krause, W; Beyer, L; Rost, R; Gutberlet, I; Gertz, H J

1999-02-05

Several studies have reported a functional relationship between spectral power within the theta-band of the EEG (theta-power) and memory load while processing visual or semantic information. We investigated theta power during the processing of different complex haptic stimuli using a delayed recall design. The haptic explorations consisted of palpating the structure of twelve sunken reliefs with closed eyes. Subjects had to reproduce each relief by drawing it 10 s after the end of the exploration. The relationship between mean theta power and mean exploration time was analysed using a regression model. A linear relationship was found between the exploration time and theta power over fronto-central regions (Fp1, Fp2, F3, F7, F8, Fz, C3) directly before the recall of the relief. This result is interpreted in favour of the hypothesis that fronto-central theta power of the EEG correlates with the load of working memory independent of stimulus modality.
A longitudinal examination of adolescent career planning and exploration using a social cognitive career theory framework.

PubMed

Rogers, Mary E; Creed, Peter A

2011-02-01

This study used social cognitive career theory (Lent, Brown, & Hackett, 1994), as a framework to investigate predictors of career choice actions, operationalised as career planning and career exploration. The model was tested cross-sectionally and longitudinally with 631 high school students enrolled in Grades 10-12. Students completed measures of self-efficacy, outcome expectations, goals, supports and personality. Results of the hierarchical regression analyses indicated strong support for self-efficacy and goals predicting career planning and exploration across all grades at T1, and predicting change in career planning and exploration from T1 to T2. Whilst support for pathways among other predictor variables (personality, contextual influences and biographic variables) to choice actions was found, these pathways varied across grades at T1, and also from T1 to T2. Implications for social cognitive career theory, career counselling practice and future research are discussed. Copyright © 2010 The Association for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Use of Longitudinal Regression in Quality Control. Research Report. ETS RR-14-31

ERIC Educational Resources Information Center

Lu, Ying; Yen, Wendy M.

2014-01-01

This article explores the use of longitudinal regression as a tool for identifying scoring inaccuracies. Student progression patterns, as evaluated through longitudinal regressions, typically are more stable from year to year than are scale score distributions and statistics, which require representative samples to conduct credibility checks.…
Exploring prediction uncertainty of spatial data in geostatistical and machine learning Approaches

NASA Astrophysics Data System (ADS)

Klump, J. F.; Fouedjio, F.

2017-12-01

Geostatistical methods such as kriging with external drift as well as machine learning techniques such as quantile regression forest have been intensively used for modelling spatial data. In addition to providing predictions for target variables, both approaches are able to deliver a quantification of the uncertainty associated with the prediction at a target location. Geostatistical approaches are, by essence, adequate for providing such prediction uncertainties and their behaviour is well understood. However, they often require significant data pre-processing and rely on assumptions that are rarely met in practice. Machine learning algorithms such as random forest regression, on the other hand, require less data pre-processing and are non-parametric. This makes the application of machine learning algorithms to geostatistical problems an attractive proposition. The objective of this study is to compare kriging with external drift and quantile regression forest with respect to their ability to deliver reliable prediction uncertainties of spatial data. In our comparison we use both simulated and real world datasets. Apart from classical performance indicators, comparisons make use of accuracy plots, probability interval width plots, and the visual examinations of the uncertainty maps provided by the two approaches. By comparing random forest regression to kriging we found that both methods produced comparable maps of estimated values for our variables of interest. However, the measure of uncertainty provided by random forest seems to be quite different to the measure of uncertainty provided by kriging. In particular, the lack of spatial context can give misleading results in areas without ground truth data. These preliminary results raise questions about assessing the risks associated with decisions based on the predictions from geostatistical and machine learning algorithms in a spatial context, e.g. mineral exploration.
Characterizing Touch Using Pressure Data and Auto Regressive Models

PubMed Central

Laufer, Shlomi; Pugh, Carla M.; Van Veen, Barry D.

2014-01-01

Palpation plays a critical role in medical physical exams. Despite the wide range of exams, there are several reproducible and subconscious sets of maneuvers that are common to examination by palpation. Previous studies by our group demonstrated the use of manikins and pressure sensors for measuring and quantifying how physicians palpate during different physical exams. In this study we develop mathematical models that describe some of these common maneuvers. Dynamic pressure data was measured using a simplified testbed and different autoregressive models were used to describe the motion of interest. The frequency, direction and type of motion used were identified from the models. We believe these models can provide a better understanding of how humans explore objects in general and, more specifically, give insights into medical physical exams. PMID:25570335
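The abstract above says that the frequency of a motion was identified from autoregressive models. As a minimal sketch of that general idea only (not the authors' procedure), the following fits a second-order autoregressive model to a synthetic oscillation by least squares and reads the oscillation frequency off the model's complex pole pair. The sampling rate, the test signal, and the function names `fit_ar2` and `dominant_frequency` are assumptions made for illustration.

```python
import math

def fit_ar2(x):
    """Least-squares fit of x[n] ~= a1*x[n-1] + a2*x[n-2]; returns (a1, a2)."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for n in range(2, len(x)):
        p1, p2 = x[n - 1], x[n - 2]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        b1 += x[n] * p1
        b2 += x[n] * p2
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 * s12
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    return a1, a2

def dominant_frequency(a1, a2, fs):
    """Frequency in Hz of the complex pole pair of an oscillatory AR(2) model.

    Assumes a1**2 + 4*a2 < 0, so the poles are complex with modulus sqrt(-a2).
    """
    radius = math.sqrt(-a2)
    cos_theta = max(-1.0, min(1.0, a1 / (2.0 * radius)))
    return math.acos(cos_theta) * fs / (2.0 * math.pi)

FS = 100.0     # hypothetical sampling rate in Hz
F_TRUE = 2.0   # hypothetical frequency of the simulated motion in Hz
signal = [math.sin(2.0 * math.pi * F_TRUE * n / FS) for n in range(500)]

a1, a2 = fit_ar2(signal)
f_est = dominant_frequency(a1, a2, FS)
print(f_est)
```

For a noise-free sinusoid the fitted coefficients satisfy a1 = 2*cos(omega) and a2 = -1, so the recovered frequency matches the simulated one. With real, noisy pressure data a higher model order and a more robust estimator would probably be needed.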

The Social Anxiety and Depression Life Interference—24 Inventory: Classical and modern psychometric evaluations

PubMed Central

Berzins, Tiffany L.; Garcia, Antonio F.; Acosta, Melina; Osman, Augustine

2017-01-01

Two instrument validation studies broadened the research literature exploring the factor structure, internal consistency reliability, and concurrent validity of scores on the Social Anxiety and Depression Life Interference—24 Inventory (SADLI-24; Osman, Bagge, Freedenthal, Guiterrez, & Emmerich, 2011). Study 1 (N = 1065) was undertaken to concurrently appraise three competing factor models for the instrument: a unidimensional model, a two-factor oblique model and a bifactor model. The bifactor model provided the best fit to the study sample data. Study 2 (N = 220) extended the results from Study 1 with an investigation of the convergent and discriminant validity for the bifactor model of the SADLI-24 with multiple regression analyses and scale-level exploratory structural equation modeling. This project yields data that augments the initial instrument development investigations for the target measure. PMID:28781401
Explaining public support for space exploration funding in America: A multivariate analysis

NASA Astrophysics Data System (ADS)

Nadeau, François

2013-05-01

Recent studies have identified the need to understand what shapes public attitudes toward space policy. I address this gap in the literature by developing a multivariate regression model explaining why many Americans support government spending on space exploration. Using pooled data from the 2006 and 2008 General Social Surveys, the study reveals that spending preferences on space exploration are largely apolitical and associated instead with knowledge and opinions about science. In particular, the odds of wanting to increase funding for space exploration are significantly higher for white, male Babyboomers with a higher socio-economic status, a fondness for organized science, and a post-secondary science education. As such, I argue that public support for NASA's spending epitomizes what Launius termed "Apollo Nostalgia" in American culture. That is, Americans benefitting most from the old social order of the 1960s developed a greater fondness for science that makes them more likely to lament the glory days of space exploration. The article concludes with suggestions for how to elaborate on these findings in future studies.
Poisson Mixture Regression Models for Heart Disease Prediction.

PubMed

Mufudza, Chipo; Erol, Hamza

2016-01-01

Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
Poisson Mixture Regression Models for Heart Disease Prediction

PubMed Central

Erol, Hamza

2016-01-01

Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model. PMID:27999611
Antibody treatment of human tumor xenografts elicits active anti-tumor immunity in nude mice

PubMed Central

Liebman, Meredith A.; Roche, Marly I.; Williams, Brent R.; Kim, Jae; Pageau, Steven C.; Sharon, Jacqueline

2007-01-01

Athymic nude mice bearing subcutaneous tumor xenografts of the human anti-colorectal cancer cell line SW480 were used as a preclinical model to explore anti-tumor immunotherapies. Intratumor or systemic treatment of the mice with murine anti-SW480 serum, recombinant anti-SW480 polyclonal antibodies, or the anti-colorectal cancer monoclonal antibody CO17-1A, caused retardation or regression of SW480 tumor xenografts. Interestingly, when mice that had regressed their tumors were re-challenged with SW480 cells, these mice regressed the new tumors without further antibody treatment. Adoptive transfer of spleen cells from mice that had regressed their tumors conferred anti-tumor immunity to naïve nude mice. Pilot experiments suggest that the transferred anti-tumor immunity is mediated by T cells of both γδ and αβ lineages. These results demonstrate that passive anti-tumor immunotherapy can elicit active immunity and support a role for extra-thymic γδ and αβ T cells in tumor rejection. Implications for potential immunotherapies include injection of tumor nodules in cancer patients with anti-tumor antibodies to induce anti-tumor T cell immunity. PMID:17920694
Data mining: Potential applications in research on nutrition and health.

PubMed

Batterham, Marijka; Neale, Elizabeth; Martin, Allison; Tapsell, Linda

2017-02-01

Data mining enables further insights from nutrition-related research, but caution is required. The aim of this analysis was to demonstrate and compare the utility of data mining methods in classifying a categorical outcome derived from a nutrition-related intervention. Baseline data (23 variables, 8 categorical) on participants (n = 295) in an intervention trial were used to classify participants in terms of meeting the criteria of achieving 10 000 steps per day. Results from classification and regression trees (CARTs), random forests, adaptive boosting, logistic regression, support vector machines and neural networks were compared using area under the curve (AUC) and error assessments. The CART produced the best model when considering the AUC (0.703), overall error (18%) and within class error (28%). Logistic regression also performed reasonably well compared to the other models (AUC 0.675, overall error 23%, within class error 36%). All the methods gave different rankings of variables' importance. CART found that body fat, quality of life using the SF-12 Physical Component Summary (PCS) and the cholesterol: HDL ratio were the most important predictors of meeting the 10 000 steps criteria, while logistic regression showed the SF-12PCS, glucose levels and level of education to be the most significant predictors (P ≤ 0.01). Differing outcomes suggest caution is required with a single data mining method, particularly in a dataset with nonlinear relationships and outliers and when exploring relationships that were not the primary outcomes of the research. © 2017 Dietitians Association of Australia.
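The abstract above compares classifiers by area under the curve (AUC) and overall error. As a small sketch of how those two summaries can be computed from predicted scores, the following uses the pairwise-ranking definition of AUC. The labels, scores, and the 0.5 threshold are made-up illustrative values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def overall_error(labels, scores, threshold=0.5):
    """Fraction of cases misclassified when scores >= threshold are called positive."""
    wrong = sum(1 for y, s in zip(labels, scores) if (s >= threshold) != (y == 1))
    return wrong / len(labels)

# Hypothetical toy data: 1 = met the step target, 0 = did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

auc_value = roc_auc(toy_labels, toy_scores)
error_value = overall_error(toy_labels, toy_scores)
print(auc_value, error_value)
```

Because AUC depends only on how cases are ranked while the error rate depends on the chosen threshold, two models can order differently on the two measures, which is one reason the abstract reports both.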
Fast Screening Technology for Drug Emergency Management: Predicting Suspicious SNPs for ADR with Information Theory-based Models.

PubMed

Liang, Zhaohui; Liu, Jun; Huang, Jimmy X; Zeng, Xing

2018-01-01

The genetic polymorphism of Cytochrome P450 (CYP 450) is considered as one of the main causes for adverse drug reactions (ADRs). In order to explore the latent correlations between ADRs and potentially corresponding single-nucleotide polymorphism (SNPs) in CYP450, three algorithms based on information theory are used as the main method to predict the possible relation. The study uses a retrospective case-control study to explore the potential relation of ADRs to specific genomic locations and single-nucleotide polymorphism (SNP). The genomic data collected from 53 healthy volunteers are applied for the analysis, another group of genomic data collected from 30 healthy volunteers excluded from the study are used as the control group. The SNPs respective on five loci of CYP2D6*2,*10,*14 and CYP1A2*1C, *1F are detected by the Applied Biosystem 3130xl. The raw data is processed by ChromasPro to detect the specific alleles on the above loci from each sample. The secondary data are reorganized and processed by R combined with the reports of ADRs from clinical reports. Three information theory based algorithms are implemented for the screening task: JMI, CMIM, and mRMR. If a SNP is selected by more than two algorithms, we are confident to conclude that it is related to the corresponding ADR. The selection results are compared with the control decision tree + LASSO regression model. In the study group where ADRs occur, 10 SNPs are considered relevant to the occurrence of a specific ADR by the combined information theory model. In comparison, only 5 SNPs are considered relevant to a specific ADR by the decision tree + LASSO regression model. In addition, the new method detects more relevant pairs of SNP and ADR which are affected by both SNP and dosage. This implies that the new information theory based model is effective to discover correlations of ADRs and CYP 450 SNPs and is helpful in predicting the potential vulnerable genotype for some ADRs. 
The newly proposed information theory based model performs better at detecting the relation between SNP and ADR than the decision tree + LASSO regression model. The new model is more sensitive in detecting ADRs than the old method, while the old method is more reliable. Therefore, the criteria for selecting algorithms should depend on practical needs. Copyright © Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Bayesian block-diagonal variable selection and model averaging

PubMed Central

Papaspiliopoulos, O.; Rossell, D.

2018-01-01

Summary We propose a scalable algorithmic framework for exact Bayesian variable selection and model averaging in linear models under the assumption that the Gram matrix is block-diagonal, and as a heuristic for exploring the model space for general designs. In block-diagonal designs our approach returns the most probable model of any given size without resorting to numerical integration. The algorithm also provides a novel and efficient solution to the frequentist best subset selection problem for block-diagonal designs. Posterior probabilities for any number of models are obtained by evaluating a single one-dimensional integral, and other quantities of interest such as variable inclusion probabilities and model-averaged regression estimates are obtained by an adaptive, deterministic one-dimensional numerical integration. The overall computational cost scales linearly with the number of blocks, which can be processed in parallel, and exponentially with the block size, rendering it most adequate in situations where predictors are organized in many moderately-sized blocks. For general designs, we approximate the Gram matrix by a block-diagonal matrix using spectral clustering and propose an iterative algorithm that capitalizes on the block-diagonal algorithms to explore efficiently the model space. All methods proposed in this paper are implemented in the R library mombf. PMID:29861501
Emotional processing during experiential treatment of depression.

PubMed

Pos, Alberta E; Greenberg, Leslie S; Goldman, Rhonda N; Korman, Lorne M

2003-12-01

This study explored the importance of early and late emotional processing to change in depressive and general symptomology, self-esteem, and interpersonal problems for 34 clients who received 16-20 sessions of experiential treatment for depression. The independent contribution to outcome of the early working alliance was also explored. Early and late emotional processing predicted reductions in reported symptoms and gains in self-esteem. More important, emotional-processing skill significantly improved during treatment. Hierarchical regression models demonstrated that late emotional processing both mediated the relationship between clients' early emotional processing capacity and outcome and was the sole emotional-processing variable that independently predicted improvement. After controlling for emotional processing, the working alliance added an independent contribution to explaining improvement in reported symptomology only. (c) 2003 APA
Exploring Individual and Structural Factors Associated with Employment Among Young Transgender Women of Color Using a No-Cost Transgender Legal Resource Center

PubMed Central

Hill, Brandon J.; Rosentel, Kris; Bak, Trevor; Silverman, Michael; Crosby, Richard; Salazar, Laura; Kipke, Michele

2017-01-01

Abstract Purpose: The purpose of this study was to explore individual and structural factors associated with employment among young transgender women (TW) of color. Methods: Sixty-five trans women of color were recruited from the Transgender Legal Defense and Education Fund to complete a 30-min interviewer-assisted survey assessing sociodemographics, housing, workplace discrimination, job-seeking self-efficacy, self-esteem, perceived public passability, and transactional sex work. Results: Logistic regression models revealed that stable housing (structural factor) and job-seeking self-efficacy (individual factor) were significantly associated with currently being employed. Conclusion: Our findings underscore the need for multilevel approaches to assist TW of color gain employment. PMID:28795154
Internalized stigma among psychiatric outpatients: Associations with quality of life, functioning, hope and self-esteem.

PubMed

Picco, Louisa; Pang, Shirlene; Lau, Ying Wen; Jeyagurunathan, Anitha; Satghare, Pratika; Abdin, Edimansyah; Vaingankar, Janhavi Ajit; Lim, Susan; Poh, Chee Lien; Chong, Siow Ann; Subramaniam, Mythily

2016-12-30

This study aimed to: (i) determine the prevalence, socio-demographic and clinical correlates of internalized stigma and (ii) explore the association between internalized stigma and quality of life, general functioning, hope and self-esteem, among a multi-ethnic Asian population of patients with mental disorders. This cross-sectional survey recruited adult patients (n=280) who were seeking treatment at outpatient and affiliated clinics of the only tertiary psychiatric hospital in Singapore. Internalized stigma was measured using the Internalized Stigma of Mental Illness scale. 43.6% experienced moderate to high internalized stigma. After making adjustments in multiple logistic regression analysis, results revealed there were no significant socio-demographic or clinical correlates relating to internalized stigma. Individual logistic regression models found a negative relationship between quality of life, self-esteem, general functioning and internalized stigma whereby lower scores were associated with higher internalized stigma. In the final regression model, which included all psychosocial variables together, self-esteem was the only variable significantly and negatively associated with internalized stigma. The results of this study contribute to our understanding of the role internalized stigma plays in patients with mental illness, and the impact it can have on psychosocial aspects of their lives. Copyright © 2016 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
Determining delayed admission to intensive care unit for mechanically ventilated patients in the emergency department.

PubMed

Hung, Shih-Chiang; Kung, Chia-Te; Hung, Chih-Wei; Liu, Ber-Ming; Liu, Jien-Wei; Chew, Ghee; Chuang, Hung-Yi; Lee, Wen-Huei; Lee, Tzu-Chi

2014-08-23

The adverse effects of delayed admission to the intensive care unit (ICU) have been recognized in previous studies. However, the definition of delayed admission varies across studies. This study proposed a model to define "delayed admission", and explored the effect of ICU-waiting time on patients' outcome. This retrospective cohort study included non-traumatic adult patients on mechanical ventilation in the emergency department (ED), from July 2009 to June 2010. The primary outcome measures were 21-ventilator-day mortality and prolonged hospital stays (over 30 days). Models of Cox regression and logistic regression were used for multivariate analysis. The non-delayed ICU-waiting was defined as a period in which the time effect on mortality was not statistically significant in a Cox regression model. To identify a suitable cut-off point between "delayed" and "non-delayed", subsets from the overall data were made based on ICU-waiting time and the hazard ratio of ICU-waiting hour in each subset was iteratively calculated. The cut-off time was then used to evaluate the impact of delayed ICU admission on mortality and prolonged length of hospital stay. The final analysis included 1,242 patients. The time effect on mortality emerged after 4 hours, thus we deduced ICU-waiting time in ED > 4 hours as delayed. By logistic regression analysis, delayed ICU admission affected the outcomes of 21-ventilator-day mortality and prolonged hospital stay, with odds ratios of 1.41 (95% confidence interval, 1.05 to 1.89) and 1.56 (95% confidence interval, 1.07 to 2.27) respectively. For patients on mechanical ventilation at the ED, delayed ICU admission is associated with higher probability of mortality and additional resource expenditure. A benchmark waiting time of no more than 4 hours for ICU admission is recommended.
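The odds ratios and 95% confidence intervals reported in the abstract above can be related to each other on the log scale. The sketch below assumes the intervals are ordinary Wald intervals (log odds ratio plus or minus 1.96 standard errors), which the abstract does not state. Under that assumption it backs out the implied standard error from each interval and checks that the reported bounds are reproduced. The function names are illustrative.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def log_se_from_ci(lower, upper, z=Z_95):
    """Standard error of log(OR) implied by a Wald confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def wald_ci(odds_ratio, se_log, z=Z_95):
    """Wald interval for an odds ratio given the standard error of its log."""
    centre = math.log(odds_ratio)
    return math.exp(centre - z * se_log), math.exp(centre + z * se_log)

# Values quoted in the abstract: (outcome, OR, lower bound, upper bound).
reported = [
    ("21-ventilator-day mortality", 1.41, 1.05, 1.89),
    ("prolonged hospital stay", 1.56, 1.07, 2.27),
]

implied = {}
for outcome, odds_ratio, lower, upper in reported:
    se_log = log_se_from_ci(lower, upper)
    implied[outcome] = (se_log, wald_ci(odds_ratio, se_log))
    print(outcome, se_log, implied[outcome][1])
```

Both intervals exclude 1, which is what makes the associations statistically significant at the 5% level. The recovered bounds agree with the reported ones to within rounding, which is consistent with (but does not prove) the Wald assumption.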
The preliminary exploration of 64-slice volume computed tomography in the accurate measurement of pleural effusion.

PubMed

Guo, Zhi-Jun; Lin, Qiang; Liu, Hai-Tao; Lu, Jun-Ying; Zeng, Yan-Hong; Meng, Fan-Jie; Cao, Bin; Zi, Xue-Rong; Han, Shu-Ming; Zhang, Yu-Huan

2013-09-01

Using computed tomography (CT) to rapidly and accurately quantify pleural effusion volume benefits medical and scientific research. However, precisely measuring the volume of pleural effusions still involves many challenges, and there is currently no recognized accurate measurement method. This study explored the feasibility of using 64-slice CT volume-rendering technology to accurately measure pleural fluid volume and then analyzed the correlation between the volume of the free pleural effusion and the different diameters of the pleural effusion. The 64-slice CT volume-rendering technique was used to measure and analyze three parts. First, the fluid volume of a self-made thoracic model was measured and compared with the actual injected volume. Second, the pleural effusion volume was measured before and after pleural fluid drainage in 25 patients, and the volume reduction was compared with the actual volume of the liquid extract. Finally, the free pleural effusion volume was measured in 26 patients to analyze the correlation between it and the diameter of the effusion, which was then used to calculate the regression equation. After using the 64-slice CT volume-rendering technique to measure the fluid volume of the self-made thoracic model, the results were compared with the actual injection volume. No significant differences were found, P = 0.836. For the 25 patients with drained pleural effusions, the comparison of the reduction volume with the actual volume of the liquid extract revealed no significant differences, P = 0.989. The following linear regression equation was used to compare the pleural effusion volume (V) (measured by the CT volume-rendering technique) with the pleural effusion greatest depth (d): V = 158.16 × d - 116.01 (r = 0.91, P = 0.000). The following linear regression was used to compare the volume with the product of the pleural effusion diameters (l × h × d): V = 0.56 × (l × h × d) + 39.44 (r = 0.92, P = 0.000).
The 64-slice CT volume-rendering technique can accurately measure the volume in pleural effusion patients, and a linear regression equation can be used to estimate the volume of the free pleural effusion.
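The two regression equations quoted in the abstract above can be applied directly. The sketch below simply evaluates them. The abstract does not state measurement units, so the inputs are left unit-free here, and the example dimensions are hypothetical values chosen only to show the arithmetic.

```python
def volume_from_depth(d):
    """Estimated effusion volume from greatest depth d: V = 158.16 * d - 116.01."""
    return 158.16 * d - 116.01

def volume_from_diameters(l, h, d):
    """Estimated effusion volume from diameters: V = 0.56 * (l * h * d) + 39.44."""
    return 0.56 * (l * h * d) + 39.44

# Hypothetical measurements (units as used in the original study, not given in the abstract).
example_depth = 5.0
example_l, example_h = 15.0, 18.0

v_depth = volume_from_depth(example_depth)
v_diam = volume_from_diameters(example_l, example_h, example_depth)
print(v_depth, v_diam)
```

The depth-only equation turns negative for very small depths (below about 0.73 in the study's units), so estimates like these should only be trusted within the range of effusions the equations were fitted on.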
An example of complex modelling in dentistry using Markov chain Monte Carlo (MCMC) simulation.

PubMed

Helfenstein, Ulrich; Menghini, Giorgio; Steiner, Marcel; Murati, Francesca

2002-09-01

In the usual regression setting one regression line is computed for a whole data set. In a more complex situation, each person may be observed for example at several points in time and thus a regression line might be calculated for each person. Additional complexities, such as various forms of errors in covariables may make a straightforward statistical evaluation difficult or even impossible. During recent years methods have been developed allowing convenient analysis of problems where the data and the corresponding models show these and many other forms of complexity. The methodology makes use of a Bayesian approach and Markov chain Monte Carlo (MCMC) simulations. The methods allow the construction of increasingly elaborate models by building them up from local sub-models. The essential structure of the models can be represented visually by directed acyclic graphs (DAG). This attractive property allows communication and discussion of the essential structure and the substantial meaning of a complex model without needing algebra. After presentation of the statistical methods an example from dentistry is presented in order to demonstrate their application and use. The dataset of the example had a complex structure; each of a set of children was followed up over several years. The number of new fillings in permanent teeth had been recorded at several ages. The dependent variables were markedly different from the normal distribution and could not be transformed to normality. In addition, explanatory variables were assumed to be measured with different forms of error. Illustration of how the corresponding models can be estimated conveniently via MCMC simulation, in particular, 'Gibbs sampling', using the freely available software BUGS is presented. In addition, how the measurement error may influence the estimates of the corresponding coefficients is explored. 
It is demonstrated that the effect of the independent variable on the dependent variable may be markedly underestimated if the measurement error is not taken into account ('regression dilution bias'). Markov chain Monte Carlo methods may be of great value to dentists in allowing analysis of data sets which exhibit a wide range of different forms of complexity.
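The regression dilution bias mentioned in the abstract above can be seen in a short simulation. The sketch below uses ordinary least squares on simulated data rather than the Bayesian MCMC approach with BUGS described in the abstract. The true slope, noise levels, sample size, and seed are all made-up values. With classical measurement error, the expected slope shrinks by the factor var(x) / (var(x) + var(error)).

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)   # fixed seed so the run is repeatable
N = 20000
TRUE_SLOPE = 2.0
ERROR_SD = 1.0           # measurement error on the explanatory variable

x_true = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [TRUE_SLOPE * x + rng.gauss(0.0, 0.5) for x in x_true]
x_observed = [x + rng.gauss(0.0, ERROR_SD) for x in x_true]

slope_clean = ols_slope(x_true, y)       # uses the error-free covariate
slope_noisy = ols_slope(x_observed, y)   # uses the error-prone covariate
attenuation = 1.0 / (1.0 + ERROR_SD ** 2)  # var(x) = 1 in this simulation

print(slope_clean, slope_noisy, TRUE_SLOPE * attenuation)
```

In this setup the slope estimated from the error-prone covariate is close to half the true slope, which illustrates how an effect can be markedly underestimated when measurement error is ignored.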
Parametric regression model for survival data: Weibull regression model as an example

PubMed Central

2016-01-01

The Weibull regression model is one of the most popular parametric regression models because it provides an estimate of the baseline hazard function as well as coefficients for covariates. Because of technical difficulties, the Weibull regression model is seldom used in the medical literature compared with the semi-parametric proportional hazards model. To make clinical investigators familiar with the Weibull regression model, this article introduces some basic knowledge of the model and then illustrates how to fit it with R software. The SurvRegCensCov package is useful for converting estimated coefficients to clinically relevant statistics such as the hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by a categorical variable. The eha package provides an alternative method for fitting the Weibull regression model. The check.dist() function helps to assess the goodness-of-fit of the model. Variable selection is based on the importance of a covariate, which can be tested using the anova() function. Alternatively, backward elimination starting from a full model is an efficient way to develop a model. Visualizing the Weibull regression model after model development provides another way to report findings. PMID:28149846
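The conversion from fitted coefficients to a hazard ratio and an event time ratio mentioned in the abstract above can be written out explicitly. The sketch below assumes the accelerated failure time parameterisation log T = intercept + coef * x + scale * W, the form reported by R's survreg. Under that assumption HR = exp(-coef / scale) and ETR = exp(coef). It is a Python illustration of the arithmetic, not the SurvRegCensCov code, and the coefficient values are hypothetical.

```python
import math

def convert_weibull(coef, scale):
    """Convert an AFT-form Weibull coefficient to shape, hazard ratio and event time ratio."""
    return {
        "shape": 1.0 / scale,
        "HR": math.exp(-coef / scale),
        "ETR": math.exp(coef),
    }

def weibull_hazard(t, x, intercept, coef, scale):
    """Hazard at time t for covariate value x under the same AFT parameterisation."""
    shape = 1.0 / scale
    lam = math.exp(intercept + coef * x)  # characteristic event time
    return (shape / lam) * (t / lam) ** (shape - 1.0)

# Hypothetical fitted values for a single binary covariate.
INTERCEPT, COEF, SCALE = 2.0, 0.5, 0.8

converted = convert_weibull(COEF, SCALE)

# Check the formula against the hazard function at an arbitrary time point.
t = 3.0
hr_numeric = (weibull_hazard(t, 1, INTERCEPT, COEF, SCALE)
              / weibull_hazard(t, 0, INTERCEPT, COEF, SCALE))
print(converted, hr_numeric)
```

A positive AFT coefficient lengthens event times (ETR above 1) and therefore lowers the hazard (HR below 1), so the two statistics move in opposite directions for the same covariate.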
Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

ERIC Educational Resources Information Center

Williams, Ryan T.

2012-01-01

Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…
Introduction to the use of regression models in epidemiology.

PubMed

Bender, Ralf

2009-01-01

Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Use of geographically weighted logistic regression to quantify spatial variation in the environmental and sociodemographic drivers of leptospirosis in Fiji: a modelling study.

PubMed

Mayfield, Helen J; Lowry, John H; Watson, Conall H; Kama, Mike; Nilles, Eric J; Lau, Colleen L

2018-05-01

Leptospirosis is a globally important zoonotic disease, with complex exposure pathways that depend on interactions between human beings, animals, and the environment. Major drivers of outbreaks include flooding, urbanisation, poverty, and agricultural intensification. The intensity of these drivers and their relative importance vary between geographical areas; however, non-spatial regression methods are incapable of capturing the spatial variations. This study aimed to explore the use of geographically weighted logistic regression (GWLR) to provide insights into the ecoepidemiology of human leptospirosis in Fiji. We obtained field data from a cross-sectional community survey done in 2013 in the three main islands of Fiji. A blood sample obtained from each participant (aged 1-90 years) was tested for anti-Leptospira antibodies and household locations were recorded using GPS receivers. We used GWLR to quantify the spatial variation in the relative importance of five environmental and sociodemographic covariates (cattle density, distance to river, poverty rate, residential setting [urban or rural], and maximum rainfall in the wettest month) on leptospirosis transmission in Fiji. We developed two models, one using GWLR and one with standard logistic regression; for each model, the dependent variable was the presence or absence of anti-Leptospira antibodies. GWLR results were compared with results obtained with standard logistic regression, and used to produce a predictive risk map and maps showing the spatial variation in odds ratios (OR) for each covariate. The dataset contained location information for 2046 participants from 1922 households representing 81 communities. The Akaike information criterion value of the GWLR model was 1935·2 compared with 1254·2 for the standard logistic regression model, indicating that the GWLR model was more efficient.
Both models produced similar OR for the covariates, but GWLR also detected spatial variation in the effect of each covariate. Maximum rainfall had the least variation across space (median OR 1·30, IQR 1·27-1·35), and distance to river varied the most (1·45, 1·35-2·05). The predictive risk map indicated that the highest risk was in the interior of Viti Levu, and the agricultural region and southern end of Vanua Levu. GWLR provided a valuable method for modelling spatial heterogeneity of covariates for leptospirosis infection and their relative importance over space. Results of GWLR could be used to inform more place-specific interventions, particularly for diseases with strong environmental or sociodemographic drivers of transmission. Funding: WHO, Australian National Health & Medical Research Council, University of Queensland, UK Medical Research Council, Chadwick Trust. Copyright © 2018 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
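The idea behind geographically weighted regression is that each location gets its own local model, fitted with observations weighted by their distance to that location. The sketch below shows only the weighting step. The Gaussian kernel, the planar distances in kilometres, the 10 km bandwidth, and the function names are all assumptions made for illustration; the abstract does not state which kernel or bandwidth the study used.

```python
import math


def gaussian_weight(distance_km, bandwidth_km):
    """Gaussian kernel weight: 1 at zero distance, decaying smoothly with distance."""
    return math.exp(-0.5 * (distance_km / bandwidth_km) ** 2)


def weights_for_location(target, observations, bandwidth_km):
    """Weight each observation (x_km, y_km) by its planar distance to the target point."""
    tx, ty = target
    return [
        gaussian_weight(math.hypot(x - tx, y - ty), bandwidth_km)
        for x, y in observations
    ]


# Hypothetical household coordinates in kilometres (not data from the study).
households = [(0.0, 0.0), (3.0, 4.0), (30.0, 40.0)]
weights = weights_for_location((0.0, 0.0), households, bandwidth_km=10.0)
print(weights)
```

In a full analysis these weights would enter a weighted logistic regression fitted separately at each location, which is what yields a map of local odds ratios.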
The High Prevalence of Incarceration History Among Black Men Who Have Sex With Men in the United States: Associations and Implications

PubMed Central

Magnus, Manya; Kuo, Irene; Wang, Lei; Liu, Ting-Yuan; Mayer, Kenneth H.

2014-01-01

Objectives. We examined lifetime incarceration history and its association with key characteristics among 1553 Black men who have sex with men (BMSM) recruited in 6 US cities. Methods. We conducted bivariate analyses of data collected from the HIV Prevention Trials Network 061 study from July 2009 through December 2011 to examine the relationship between incarceration history and demographic and psychosocial variables predating incarceration and multivariate logistic regression analyses to explore the associations between incarceration history and demographic and psychosocial variables found to be significant. We then used multivariate logistic regression models to explore the independent association between incarceration history and 6 outcome variables. Results. After adjusting for confounders, we found that increasing age, transgender identity, heterosexual or straight identity, history of childhood violence, and childhood sexual experience were significantly associated with incarceration history. A history of incarceration was also independently associated with any alcohol and drug use in the past 6 months. Conclusions. The findings highlight an elevated lifetime incarceration history among a geographically diverse sample of BMSM and the need to adequately assess the impact of incarceration among BMSM in the United States. PMID:24432948
Utilization of maternal health care services among indigenous women in Bangladesh: A study on the Mru tribe.

PubMed

Islam, Rakibul M

2017-01-01

Despite startling developments in maternal health care services, use of these services has been disproportionately distributed among different minority groups in Bangladesh. This study aimed to explore the factors associated with the use of these services among the Mru indigenous women in Bangladesh. A total of 374 currently married Mru women were interviewed using convenience sampling from three administrative sub-districts of the Bandarban district from June to August of 2009. Associations were assessed using Chi-square tests, and a binary logistic regression model was employed to explore factors associated with the use of maternal health care services. Among the women surveyed, 30% had ever visited maternal health care services in the Mru community, a very low proportion compared with mainstream society. Multivariable logistic regression analyses revealed that place of residence, religion, school attendance, place of service provided, distance to the service center, and exposure to mass media were factors significantly associated with the use of maternal health care services among Mru women. Considering indigenous socio-cultural beliefs and practices, comprehensive community-based outreach health programs are recommended in the community with a special emphasis on awareness through maternal health education and training packages for the Mru adolescents.

Exploring Audiologists' Language and Hearing Aid Uptake in Initial Rehabilitation Appointments.

PubMed

Sciacca, Anna; Meyer, Carly; Ekberg, Katie; Barr, Caitlin; Hickson, Louise

2017-06-13

The study aimed (a) to profile audiologists' language during the diagnosis and management planning phase of hearing assessment appointments and (b) to explore associations between audiologists' language and patients' decisions to obtain hearing aids. Sixty-two audiologist-patient dyads participated. Patient participants were aged 55 years or older. Hearing assessment appointments were audiovisually recorded and transcribed for analysis. Audiologists' language was profiled using two measures: general language complexity and use of jargon. A binomial, multivariate logistic regression analysis was conducted to investigate the associations between these language measures and hearing aid uptake. The logistic regression model revealed that the Flesch-Kincaid reading grade level of audiologists' language was significantly associated with hearing aid uptake. Patients were less likely to obtain hearing aids when audiologists' language was at a higher reading grade level. No associations were found between audiologists' use of jargon and hearing aid uptake. Audiologists' use of complex language may present a barrier for patients to understand hearing rehabilitation recommendations. Reduced understanding may limit patient participation in the decision-making process and result in patients being less willing to trial hearing aids. Clear, concise language is recommended to facilitate shared decision making.
Understanding spatio-temporal strategies of adult zebrafish exploration in the open field test.

PubMed

Stewart, Adam Michael; Gaikwad, Siddharth; Kyzar, Evan; Kalueff, Allan V

2012-04-27

Zebrafish (Danio rerio) are emerging as a useful model organism for neuroscience research. Mounting evidence suggests that various traditional rodent paradigms may be adapted for testing zebrafish behavior. The open field test is a popular rodent test of novelty exploration, recently applied to zebrafish research. To better understand fish novelty behavior, we exposed adult zebrafish to two different open field arenas for 30 min, assessing the amount and temporal patterning of their exploration. While (similar to rodents) zebrafish scale their locomotory activity depending on the size of the tank, the temporal patterning of their activity was independent of arena size. These observations strikingly parallel similar rodent behaviors, suggesting that spatio-temporal strategies of animal exploration may be evolutionarily conserved across vertebrate species. In addition, we found interesting oscillations in zebrafish exploration, with the per-minute distribution of their horizontal activity demonstrating sinusoidal-like patterns. While such patterning is not reported for rodents and other higher vertebrates, a nonlinear regression analysis confirmed the oscillation patterning of all assessed zebrafish behavioral endpoints in both open field arenas, revealing a potentially important aspect of novelty exploration in lower vertebrates. Copyright © 2012 Elsevier B.V. All rights reserved.
The Role of Hierarchy in Response Surface Modeling of Wind Tunnel Data

NASA Technical Reports Server (NTRS)

DeLoach, Richard

2010-01-01

This paper is intended as a tutorial introduction to certain aspects of response surface modeling, for the experimentalist who has started to explore these methods as a means of improving productivity and quality in wind tunnel testing and other aerospace applications. A brief review of the productivity advantages of response surface modeling in aerospace research is followed by a description of the advantages of a common coding scheme that scales and centers independent variables. The benefits of model term reduction are reviewed. A constraint on model term reduction with coded factors is described in some detail, which requires such models to be well-formulated, or hierarchical. Examples illustrate the consequences of ignoring this constraint. The implication for automated regression model reduction procedures is discussed, and some opinions formed from the author's experience are offered on coding, model reduction, and hierarchy.
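A minimal sketch of a coding scheme that scales and centers an independent variable: the value is shifted by the midpoint of its tested range and divided by the half-range, so the range limits map to -1 and +1. The function name and the angle-of-attack range are illustrative assumptions, not details taken from the paper.

```python
def code_factor(x, low, high):
    """Map x from [low, high] in physical units onto the coded range [-1, +1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    center = (high + low) / 2.0      # midpoint of the tested range
    half_range = (high - low) / 2.0  # distance from the midpoint to either limit
    return (x - center) / half_range


# Hypothetical example: angle of attack tested between -4 and 12 degrees.
coded = [code_factor(a, -4.0, 12.0) for a in (-4.0, 4.0, 12.0)]
print(coded)
```

With coded factors, all regressors share a common scale, which makes the magnitudes of fitted coefficients directly comparable.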
Normal Tissue Complication Probability (NTCP) Modelling of Severe Acute Mucositis using a Novel Oral Mucosal Surface Organ at Risk.

PubMed

Dean, J A; Welsh, L C; Wong, K H; Aleksic, A; Dunne, E; Islam, M R; Patel, A; Patel, P; Petkar, I; Phillips, I; Sham, J; Schick, U; Newbold, K L; Bhide, S A; Harrington, K J; Nutting, C M; Gulliford, S L

2017-04-01

A normal tissue complication probability (NTCP) model of severe acute mucositis would be highly useful to guide clinical decision making and inform radiotherapy planning. We aimed to improve upon our previous model by using a novel oral mucosal surface organ at risk (OAR) in place of an oral cavity OAR. Predictive models of severe acute mucositis were generated using radiotherapy dose to the oral cavity OAR or mucosal surface OAR and clinical data. Penalised logistic regression and random forest classification (RFC) models were generated for both OARs and compared. Internal validation was carried out with 100-iteration stratified shuffle split cross-validation, using multiple metrics to assess different aspects of model performance. Associations between treatment covariates and severe mucositis were explored using RFC feature importance. Penalised logistic regression and RFC models using the oral cavity OAR performed at least as well as the models using mucosal surface OAR. Associations between dose metrics and severe mucositis were similar between the mucosal surface and oral cavity models. The volumes of oral cavity or mucosal surface receiving intermediate and high doses were most strongly associated with severe mucositis. The simpler oral cavity OAR should be preferred over the mucosal surface OAR for NTCP modelling of severe mucositis. We recommend minimising the volume of mucosa receiving intermediate and high doses, where possible. Copyright © 2016 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
Drought Patterns Forecasting using an Auto-Regressive Logistic Model

NASA Astrophysics Data System (ADS)

del Jesus, M.; Sheffield, J.; Méndez Incera, F. J.; Losada, I. J.; Espejo, A.

2014-12-01

Drought is characterized by a water deficit that may manifest across a large range of spatial and temporal scales. Drought may create important socio-economic consequences, often of catastrophic dimensions. A quantifiable definition of drought is elusive because, depending on its impacts, consequences and generation mechanism, different water deficit periods may be identified as a drought by virtue of some definitions but not by others. Droughts are linked to the water cycle and, although a climate change signal may not have emerged yet, they are also intimately linked to climate. In this work we develop an auto-regressive logistic model for drought prediction at different temporal scales that makes use of a spatially explicit framework. Our model allows covariates, continuous or categorical, to be included to improve the performance of the auto-regressive component. Our approach makes use of dimensionality reduction (principal component analysis) and classification techniques (K-Means and maximum dissimilarity) to simplify the representation of complex climatic patterns, such as sea surface temperature (SST) and sea level pressure (SLP), while including information on their spatial structure, i.e. considering their spatial patterns. This procedure allows us to include in the analysis multivariate representations of complex climatic phenomena, such as the El Niño-Southern Oscillation. We also explore the impact of other climate-related variables such as sun spots. The model allows the uncertainty of the forecasts to be quantified and can be easily adapted to make predictions under future climatic scenarios. The framework presented here may be extended to other applications such as flash flood analysis or risk assessment of natural hazards.
Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

PubMed

Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

2017-01-01

The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM with two integrative genomics analysis examples, a breast cancer dataset and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.
Interpretation of commonly used statistical regression models.

PubMed

Kasza, Jessica; Wolfe, Rory

2014-01-01

A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
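As a small illustration of coefficient interpretation in logistic regression: a coefficient is a change in log-odds, so exponentiating it gives the odds ratio for a one-unit increase in the predictor. The sketch below uses a hypothetical coefficient and hypothetical function names; it is not drawn from the article's respiratory health example.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Odds ratio for a `delta`-unit increase in a predictor with log-odds coefficient beta."""
    return math.exp(beta * delta)


def predicted_probability(intercept, beta, x):
    """Probability of the outcome from a one-predictor logistic model."""
    log_odds = intercept + beta * x
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficient, chosen so the odds roughly double per unit increase.
beta_exposure = 0.693
print(round(odds_ratio(beta_exposure), 2))
print(predicted_probability(intercept=-1.0, beta=beta_exposure, x=1.0))
```

Note that the odds ratio is constant across values of the predictor, whereas the change in predicted probability depends on where on the logistic curve the comparison is made.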
Exploration of walking behavior in Vermont using spatial regression.

DOT National Transportation Integrated Search

2015-06-01

This report focuses on the relationship between walking and its contributing factors by applying spatial regression methods. Using the Vermont data from the New England Transportation Survey (NETS), walking variables as well as 170 independent va...
Analysis of the rate of wildcat drilling and deposit discovery

USGS Publications Warehouse

Drew, L.J.

1975-01-01

The rate at which petroleum deposits were discovered during a 16-yr period (1957-72) was examined in relation to changes in a suite of economic and physical variables. The study area encompasses 11,000 mi² and is located on the eastern flank of the Powder River Basin. A two-stage multiple-regression model was used as a basis for this analysis. The variables employed in this model were: (1) the yearly wildcat drilling rate, (2) a measure of the extent of the physical exhaustion of the resource base of the region, (3) a proxy for the discovery expectation of the exploration operators active in the region, (4) an exploration price/cost ratio, and (5) the expected depths of the exploration targets sought. The rate at which wildcat wells were drilled was strongly correlated with the discovery expectation of the exploration operators. Small additional variations in the wildcat drilling rate were explained by the price/cost ratio and target-depth variables. The number of deposits discovered each year was highly dependent on the wildcat drilling rate, but the aggregate quantity of petroleum discovered each year was independent of the wildcat drilling rate. The independence between these last two variables is a consequence of the cyclical behavior of the exploration play mechanism. Although the discovery success ratio declined sharply during the initial phases of the two exploration plays which developed in the study area, a learning effect occurred whereby the discovery success ratio improved steadily with the passage of time during both exploration plays. © 1975 Plenum Publishing Corporation.
Using a binary logistic regression method and GIS for evaluating and mapping the groundwater spring potential in the Sultan Mountains (Aksehir, Turkey)

NASA Astrophysics Data System (ADS)

Ozdemir, Adnan

2011-07-01

Summary: The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains.
The evolved model was found to be in strong agreement with the available groundwater spring test data. Hence, this method can be used routinely in groundwater exploration under favourable conditions.
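The area under an operating characteristic curve, such as the 0.82 reported above, can be read as the probability that a randomly chosen spring location receives a higher model score than a randomly chosen non-spring location. The sketch below computes that quantity by direct pairwise comparison. The scores are hypothetical and the function name is an assumption; the abstract does not describe how the study computed its curve.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison (ties count as one half)."""
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical model scores for cells with and without an observed spring.
spring_scores = [0.9, 0.8, 0.6, 0.4]
non_spring_scores = [0.7, 0.5, 0.3, 0.2]
print(roc_auc(spring_scores, non_spring_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.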
The interactive effects of genetic polymorphisms within LFA-1/ICAM-1/GSK-3β pathway and environmental hazards on the development of Graves' opthalmopathy.

PubMed

Yang, Ge; Fu, Yang; Lu, Xiaoyan; Wang, Menghua; Dong, Hongtao; Li, Qiuming

2018-05-22

The purpose of this investigation was to explore the combined effects of single nucleotide polymorphisms (SNPs) within LFA-1/ICAM-1/GSK-3β pathway and environmental hazards on susceptibility to Graves' ophthalmopathy (GO) among a Chinese Han population. Altogether 305 GO patients and 283 Graves' disease (GD) subjects were recruited. Information relevant to the participants' age, gender, body mass index (BMI), regular physical activity, smoking history, alcohol intake, stressful work environment, stress at work, family history of thyroid disease and ¹³¹I treatment were summarized, and the participants' related SNPs of LFA-1/ICAM-1/GSK-3β were also detected. Then the gene-gene and gene-environment interactions were evaluated by logistic regression model and multi-factor dimensionality reduction (MDR) modeling. The results showed that age, BMI, smoking history, stressful work, stress at home, family history of thyroid disease and ¹³¹I treatment appeared as potential indicators regulating GO risk, when either univariate or multivariate regression analysis was performed (all P < 0.05). Moreover, rs12716977 (T > C) and rs2230433 (G > C) of LFA-1, rs1799969 (G > A) and rs5498 (A > G) of ICAM-1, as well as rs6438552 (T > C) and rs334558 (T > C) of GSK-3β were significantly associated with altered susceptibility to GO under the allelic models (all P < 0.05). Also haplotype TGAATC acted as a protective factor against GO risk (P < 0.05), whereas haplotype CGAACC largely elevated risk of GO (P < 0.05). Besides, logistic regression analysis demonstrated that rs12716927, rs5498 and rs6438552 all would affect the influences exerted by age, BMI, smoking history, stressful work, stress at home, family history of thyroid disease or ¹³¹I treatment on GO susceptibility (all P < 0.05).
MDR modeling implied that the combined model of rs12716977, rs2230433 and rs1799969 was the best interactive model when BMI was co-assessed, and the interactive model of rs12716977, rs334558 and rs5491 was the most desirable among the smoking population. In conclusion, gene-gene and gene-environment interactions played a crucial role in susceptibility to GO, providing solid evidence for screening effective GO-susceptible biomarkers and exploring potential GO treatment strategies. Copyright © 2018. Published by Elsevier Ltd.
Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach.

PubMed

Nagelkerke, Nico; Fidler, Vaclav

2015-01-01

The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.
Application of artificial intelligence to the management of urological cancer.

PubMed

Abbod, Maysam F; Catto, James W F; Linkens, Derek A; Hamdy, Freddie C

2007-10-01

Artificial intelligence techniques, such as artificial neural networks, Bayesian belief networks and neuro-fuzzy modeling systems, are complex mathematical models based on the human neuronal structure and thinking. Such tools are capable of generating data driven models of biological systems without making assumptions based on statistical distributions. Many studies have reported on the use of artificial intelligence in urology. We reviewed the basic concepts behind artificial intelligence techniques and explored the applications of this new dynamic technology in various aspects of urological cancer management. A detailed and systematic review of the literature was performed using the MEDLINE and Inspec databases to discover reports using artificial intelligence in urological cancer. The characteristics of machine learning and their implementation were described and reports of artificial intelligence use in urological cancer were reviewed. While most researchers in this field were found to focus on artificial neural networks to improve the diagnosis, staging and prognostic prediction of urological cancers, some groups are exploring other techniques, such as expert systems and neuro-fuzzy modeling systems. Compared to traditional regression statistics, artificial intelligence methods appear to be accurate and more explorative for analyzing large data cohorts. Furthermore, they allow individualized prediction of disease behavior. Each artificial intelligence method has characteristics that make it suitable for different tasks. The lack of transparency of artificial neural networks hinders global scientific community acceptance of this method but this can be overcome by neuro-fuzzy modeling systems.
Using Patient Demographics and Statistical Modeling to Predict Knee Tibia Component Sizing in Total Knee Arthroplasty.

PubMed

Ren, Anna N; Neher, Robert E; Bell, Tyler; Grimm, James

2018-06-01

Preoperative planning is important to achieve successful implantation in primary total knee arthroplasty (TKA). However, traditional TKA templating techniques are not accurate enough to predict the component size to a very close range. With the goal of developing a general predictive statistical model using patient demographic information, ordinal logistic regression was applied to build a proportional odds model to predict the tibia component size. The study retrospectively collected the data of 1992 primary Persona Knee System TKA procedures. Of them, 199 procedures were randomly selected as testing data and the rest of the data were randomly partitioned between model training data and model evaluation data with a ratio of 7:3. Different models were trained and evaluated on the training and validation data sets after data exploration. The final model had patient gender, age, weight, and height as independent variables and predicted the tibia size within 1 size difference 96% of the time on the validation data, 94% of the time on the testing data, and 92% on a prospective cadaver data set. The study results indicated the statistical model built by ordinal logistic regression can increase the accuracy of tibia sizing information for Persona Knee preoperative templating. This research shows statistical modeling may be used with radiographs to dramatically enhance the templating accuracy, efficiency, and quality. In general, this methodology can be applied to other TKA products when the data are applicable. Copyright © 2018 Elsevier Inc. All rights reserved.
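A minimal sketch of how a proportional odds model turns a linear predictor into probabilities for ordered categories, together with a "within one size" accuracy measure like the one quoted above. The thresholds, the linear predictor value, the size labels, and the function names are hypothetical and chosen for illustration; they are not the fitted values from the study.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(linear_predictor, thresholds):
    """Proportional odds model: P(Y <= j) = sigmoid(threshold_j - linear_predictor).

    `thresholds` must be increasing; returns one probability per ordered category.
    """
    cumulative = [sigmoid(t - linear_predictor) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probs


def within_one_size_rate(predicted, actual):
    """Share of cases where the predicted size is at most one size from the actual size."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= 1)
    return hits / len(actual)


# Hypothetical values: three thresholds define four ordered size categories.
probs = category_probabilities(0.4, [-1.0, 0.5, 2.0])
print([round(p, 3) for p in probs])
print(within_one_size_rate([3, 4, 6, 5], [3, 5, 3, 5]))
```

The same coefficients apply at every threshold, which is the proportional odds assumption; only the intercepts differ between categories.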
Using Explanatory Item Response Models to Evaluate Complex Scientific Tasks Designed for the Next Generation Science Standards

NASA Astrophysics Data System (ADS)

Chiu, Tina

This dissertation includes three studies that analyze a new set of assessment tasks developed by the Learning Progressions in Middle School Science (LPS) Project. These assessment tasks were designed to measure science content knowledge on the structure of matter domain and scientific argumentation, while following the goals from the Next Generation Science Standards (NGSS). The three studies focus on the evidence available for the success of this design and its implementation, generally labelled as "validity" evidence. I use explanatory item response models (EIRMs) as the overarching framework to investigate these assessment tasks. These models can be useful when gathering validity evidence for assessments as they can help explain student learning and group differences. In the first study, I explore the dimensionality of the LPS assessment by comparing the fit of unidimensional, between-item multidimensional, and Rasch testlet models to see which is most appropriate for this data. By applying multidimensional item response models, multiple relationships can be investigated, and in turn, allow for a more substantive look into the assessment tasks. The second study focuses on person predictors through latent regression and differential item functioning (DIF) models. Latent regression models show the influence of certain person characteristics on item responses, while DIF models test whether one group is differentially affected by specific assessment items, after conditioning on latent ability. Finally, the last study applies the linear logistic test model (LLTM) to investigate whether item features can help explain differences in item difficulties.
Boosted structured additive regression for Escherichia coli fed-batch fermentation modeling.

PubMed

Melcher, Michael; Scharl, Theresa; Luchner, Markus; Striedner, Gerald; Leisch, Friedrich

2017-02-01

The quality of biopharmaceuticals and patients' safety are of highest priority and there are tremendous efforts to replace empirical production process designs by knowledge-based approaches. The main challenge in this context is that real-time access to process variables related to product quality and quantity is severely limited. To date, comprehensive on- and offline monitoring platforms are used to generate process data sets that allow for development of mechanistic and/or data driven models for real-time prediction of these important quantities. The ultimate goal is to implement model based feed-back control loops that facilitate online control of product quality. In this contribution, we explore structured additive regression (STAR) models in combination with boosting as a variable selection tool for modeling the cell dry mass, product concentration, and optical density on the basis of online available process variables and two-dimensional fluorescence spectroscopic data. STAR models are powerful extensions of linear models allowing for inclusion of smooth effects or interactions between predictors. Boosting constructs the final model in a stepwise manner and provides a variable importance measure via predictor selection frequencies. Our results show that the cell dry mass can be modeled with a relative error of about ±3%, the optical density with ±6%, the soluble protein with ±16%, and the insoluble product with an accuracy of ±12%. Biotechnol. Bioeng. 2017;114: 321-334. © 2016 Wiley Periodicals, Inc.
A new approach to correct the QT interval for changes in heart rate using a nonparametric regression model in beagle dogs.

PubMed

Watanabe, Hiroyuki; Miyazaki, Hiroyasu

2006-01-01

Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fit over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fit of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correction models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
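For reference, the two classical parametric corrections named above divide the QT interval by a fixed power of the RR interval (in seconds): the square root for Bazett and the cube root for Fridericia. The sketch below shows both; the function names and the example observation are hypothetical and not taken from the study's data.

```python
def qtc_bazett(qt_ms, rr_s):
    """Bazett correction: QT divided by the square root of RR (RR in seconds)."""
    return qt_ms / rr_s ** 0.5


def qtc_fridericia(qt_ms, rr_s):
    """Fridericia correction: QT divided by the cube root of RR (RR in seconds)."""
    return qt_ms / rr_s ** (1.0 / 3.0)


# Hypothetical observation at a fast heart rate (RR = 0.5 s, i.e. 120 beats/min).
qt_ms, rr_s = 200.0, 0.5
print(qtc_bazett(qt_ms, rr_s))
print(qtc_fridericia(qt_ms, rr_s))
```

Because the exponent is fixed in advance, neither formula can adapt to the observed QT-RR relationship, which is the limitation a data-driven smoother is meant to address.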
Logistic regression model can reduce unnecessary artificial liver support in hepatitis B virus-associated acute-on-chronic liver failure: decision curve analysis.

PubMed

Qin, Gang; Bian, Zhao-Lian; Shen, Yi; Zhang, Lei; Zhu, Xiao-Hong; Liu, Yan-Mei; Shao, Jian-Guo

2016-06-04

Several models have been proposed to predict the short-term outcome of acute-on-chronic liver failure (ACLF) after treatment. We aimed to determine whether better decisions for artificial liver support system (ALSS) treatment could be made with a model than without, through decision curve analysis (DCA). The medical profiles of a cohort of 232 patients with hepatitis B virus (HBV)-associated ACLF were retrospectively analyzed to explore the role of plasma prothrombin activity (PTA), model for end-stage liver disease (MELD) and logistic regression model (LRM) in identifying patients who could benefit from ALSS. The accuracy and reliability of PTA, MELD and LRM were evaluated with previously reported cutoffs. DCA was performed to evaluate the clinical role of these models in predicting the treatment outcome. With the cut-off value of 0.2, LRM had sensitivity of 92.6%, specificity of 42.3% and an area under the receiver operating characteristic curve (AUC) of 0.68, which showed superior discrimination over PTA and MELD. DCA revealed that the LRM-guided ALSS treatment was superior to other strategies, including "treating all" and MELD-guided therapy, for the midrange threshold probabilities of 16 to 64%. The use of LRM-guided ALSS treatment could increase both the accuracy and efficiency of this procedure, allowing the avoidance of unnecessary ALSS.
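The quantity plotted in a decision curve is the net benefit at each threshold probability. The sketch below assumes the standard definition (true positives per patient minus false positives per patient, weighted by the odds of the threshold). The counts are hypothetical and are not the cohort's data.

```python
def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit = TP/n - (FP/n) * pt / (1 - pt) at threshold probability pt."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(events: int, n: int, threshold: float) -> float:
    """'Treat all' strategy: every event is a true positive, every non-event a false positive."""
    return net_benefit(events, n - events, n, threshold)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(net_benefit(tp=50, fp=20, n=200, threshold=0.2))
    print(net_benefit_treat_all(events=60, n=200, threshold=0.2))
```

A model-guided strategy is preferred over "treat all" at a given threshold when its net benefit is higher there. The abstract reports this for thresholds of 16 to 64%.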
Exploring QSARs of the interaction of flavonoids with GABA(A) receptor using MLR, ANN and SVM techniques.

PubMed

Deeb, Omar; Shaik, Basheerulla; Agrawal, Vijay K

2014-10-01

Quantitative Structure-Activity Relationship (QSAR) models for binding affinity constants (log Ki) of 78 flavonoid ligands towards the benzodiazepine site of GABA(A) receptor complex were calculated using the machine learning methods: artificial neural network (ANN) and support vector machine (SVM) techniques. The models obtained were compared with those obtained using multiple linear regression (MLR) analysis. The descriptor selection and model building were performed with 10-fold cross-validation using the training data set. The SVM and MLR coefficient of determination values are 0.944 and 0.879, respectively, for the training set and are higher than those of ANN models. Though the SVM model shows improvement of training set fitting, the ANN model was superior to SVM and MLR in predicting the test set. Randomization test is employed to check the suitability of the models.
Does cancer in a child affect parents' employment and earnings? A population-based study.

PubMed

Syse, Astri; Larsen, Inger Kristin; Tretli, Steinar

2011-06-01

Cancer in a child may adversely affect parents' work opportunities due to enlarged care burdens and/or altered priorities. Few studies exist, and possible effects on parental employment and earnings were therefore explored. Data on the entire Norwegian population aged 27-65 with children under the age of 20 in 1990-2002 (N=1.2 million) was retrieved from national registries. Employment rates for parents of 3263 children with cancer were compared to those of parents with children without cancer by means of logistic regression models. Log-linear regression models were used to explore childhood cancer's effect on parental earnings for the large majority of parents who remained employed. Cancer in a child was in general not associated with a reduced risk of employment, although some exceptions exist among both mothers and fathers. For employed mothers, CNS cancers, germinal cell cancers, and unspecified leukemia were associated with significant reductions in earnings (10%, 21%, and 60%, respectively). Reductions were particularly pronounced for mothers with a young and alive child, and became more pronounced with time elapsed from diagnosis. Fathers' earnings were not affected significantly. Parents' employment is not adversely affected by a child's cancer in Norway. Earnings are reduced in certain instances, but the overall effects are minor. Generous welfare options and flexible labor markets typical for Nordic welfare states may account for this. In line with traditional caregiving responsibilities, reductions in earnings were most pronounced for mothers. Copyright © 2010 Elsevier Ltd. All rights reserved.

Acute Effects of Nitrogen Dioxide on Cardiovascular Mortality in Beijing: An Exploration of Spatial Heterogeneity and the District-specific Predictors

NASA Astrophysics Data System (ADS)

Luo, Kai; Li, Runkui; Li, Wenjing; Wang, Zongshuang; Ma, Xinming; Zhang, Ruiming; Fang, Xin; Wu, Zhenglai; Cao, Yang; Xu, Qun

2016-12-01

The exploration of spatial variation and predictors of the effects of nitrogen dioxide (NO2) on fatal health outcomes is still sparse. In a multilevel case-crossover study in Beijing, China, we used mixed Cox proportional hazard model to examine the citywide effects and conditional logistic regression to evaluate the district-specific effects of NO2 on cardiovascular mortality. District-specific predictors that could be related to the spatial pattern of NO2 effects were examined by robust regression models. We found that a 10 μg/m3 increase in daily mean NO2 concentration was associated with a 1.89% [95% confidence interval (CI): 1.33-2.45%], 2.07% (95% CI: 1.23-2.91%) and 1.95% (95% CI: 1.16-2.72%) increase in daily total cardiovascular (lag03), cerebrovascular (lag03) and ischemic heart disease (lag02) mortality, respectively. For spatial variation of NO2 effects across 16 districts, significant effects were only observed in 5, 4 and 2 districts for the above three outcomes, respectively. Generally, NO2 was likely having greater adverse effects on districts with larger population, higher consumption of coal and more civilian vehicles. Our results suggested independent and spatially varied effects of NO2 on total and subcategory cardiovascular mortalities. The identification of districts with higher risk can provide important insights for reducing NO2 related health hazards.
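Effect sizes of the form "x% increase per 10 μg/m3" are conventionally derived from a log-scale coefficient. The sketch below assumes that convention, so that the percent change is (exp(β·Δ) − 1)·100. The abstract does not report the coefficients, so the value of β here is back-calculated purely for illustration and the function names are hypothetical.

```python
import math


def percent_change(beta: float, delta: float = 10.0) -> float:
    """Percent change in risk for a `delta`-unit exposure increase, given log-scale coefficient `beta`."""
    return (math.exp(beta * delta) - 1.0) * 100.0


def beta_from_percent(pct: float, delta: float = 10.0) -> float:
    """Invert percent_change: the per-unit log-scale coefficient implied by a reported percent change."""
    return math.log(1.0 + pct / 100.0) / delta


if __name__ == "__main__":
    # Back-calculated from the reported 1.89% per 10 ug/m3 (illustrative only).
    beta = beta_from_percent(1.89)
    print(beta)
    # Implied change for a 20 ug/m3 increase under the same coefficient.
    print(percent_change(beta, 20.0))
```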
Inferring the use of forelimb suspensory locomotion by extinct primate species via shape exploration of the ulna.

PubMed

Rein, Thomas R; Harvati, Katerina; Harrison, Terry

2015-01-01

Uncovering links between skeletal morphology and locomotor behavior is an essential component of paleobiology because it allows researchers to infer the locomotor repertoire of extinct species based on preserved fossils. In this study, we explored ulnar shape in anthropoid primates using 3D geometric morphometrics to discover novel aspects of shape variation that correspond to observed differences in the relative amount of forelimb suspensory locomotion performed by species. The ultimate goal of this research was to construct an accurate predictive model that can be applied to infer the significance of these behaviors. We studied ulnar shape variation in extant species using principal component analysis. Species mainly clustered into phylogenetic groups along the first two principal components. Upon closer examination, the results showed that the position of species within each major clade corresponded closely with the proportion of forelimb suspensory locomotion that they have been observed to perform in nature. We used principal component regression to construct a predictive model for the proportion of these behaviors that would be expected to occur in the locomotor repertoire of anthropoid primates. We then applied this regression analysis to Pliopithecus vindobonensis, a stem catarrhine from the Miocene of central Europe, and found strong evidence that this species was adapted to perform a proportion of forelimb suspensory locomotion similar to that observed in the extant woolly monkey, Lagothrix lagothricha. Copyright © 2014 Elsevier Ltd. All rights reserved.
Low-level violence in schools: is there an association between school safety measures and peer victimization?

PubMed

Blosnich, John; Bossarte, Robert

2011-02-01

Low-level violent behavior, particularly school bullying, remains a critical public health issue that has been associated with negative mental and physical health outcomes. School-based prevention programs, while a valuable line of defense to stave off bullying, have shown inconsistent results in terms of decreasing bullying. This study explored whether school safety measures (eg, security guards, cameras, ID badges) were associated with student reports of different forms of peer victimization related to bullying. Data came from the 2007 School Crime Supplement of the National Crime Victimization Survey. Chi-square tests of independence were used to examine differences among categorical variables. Logistic regression models were constructed for the peer victimization outcomes. A count variable was constructed among the bullying outcomes (0-7) with which a Poisson regression model was constructed to analyze school safety measures' impacts on degree of victimization. Of the various school safety measures, only having adults in hallways resulted in a significant reduction in odds of being physically bullied, having property vandalized, or having rumors spread. In terms of degree of victimization, having adults and/or staff supervising hallways was associated with an approximate 26% decrease in students experiencing an additional form of peer victimization. Results indicated that school safety measures overall were not associated with decreased reports of low-level violent behaviors related to bullying. More research is needed to further explore what best promotes comprehensive safety in schools. © 2011, American School Health Association.
Performance-Based Contracting Within a State Substance Abuse Treatment System: A Preliminary Exploration of Differences in Client Access and Client Outcomes

PubMed Central

Brucker, Debra L.; Stewart, Maureen

2013-01-01

To explore whether the implementation of performance-based contracting (PBC) within the State of Maine’s substance abuse treatment system resulted in improved performance, one descriptive and two empirical analyses were conducted. The first analysis examined utilization and payment structure. The second study was designed to examine whether timeliness of access to outpatient (OP) and intensive outpatient (IOP) substance abuse assessments and treatment, measures that only became available after the implementation of PBC, differed between PBC and non-PBC agencies in the year following implementation of PBC. Using treatment admission records from the state treatment data system (N=9,128), logistic regression models run using generalized estimating equation techniques found no significant difference between PBC agencies and other agencies on timeliness of access to assessments or treatment, for both OP and IOP services. The third analysis, conducted using discharge data from the years prior to and after the implementation of performance-based contracting (N=6,740) for those agencies that became a part of the performance-based contracting system, was designed to assess differences in level of participation, retention, and completion of treatment. Regression models suggest that performance on OP client engagement and retention measures was significantly poorer the year after the implementation of PBC, but that temporal rather than PBC effects were more significant. No differences were found between years for IOP level of participation or completion of treatment measures. PMID:21249461
A Comparison between the Use of Beta Weights and Structure Coefficients in Interpreting Regression Results

ERIC Educational Resources Information Center

Tong, Fuhui

2006-01-01

Background: An extensive body of research has favored the use of regression over other parametric analyses that are based on OVA. In case of noteworthy regression results, researchers tend to explore magnitude of beta weights for the respective predictors. Purpose: The purpose of this paper is to examine both beta weights and structure…
Observing Consistency in Online Communication Patterns for User Re-Identification.

PubMed

Adeyemi, Ikuesan Richard; Razak, Shukor Abd; Salleh, Mazleena; Venter, Hein S

2016-01-01

Comprehension of the statistical and structural mechanisms governing human dynamics in online interaction plays a pivotal role in online user identification, online profile development, and recommender systems. However, building a characteristic model of human dynamics on the Internet involves a complete analysis of the variations in human activity patterns, which is a complex process. This complexity is inherent in human dynamics and has not been extensively studied to reveal the structural composition of human behavior. A typical method of anatomizing such a complex system is viewing all independent interconnectivity that constitutes the complexity. An examination of the various dimensions of human communication pattern in online interactions is presented in this paper. The study employed reliable server-side web data from 31 known users to explore characteristics of human-driven communications. Various machine-learning techniques were explored. The results revealed that each individual exhibited a relatively consistent, unique behavioral signature and that the logistic regression model and model tree can be used to accurately distinguish online users. These results are applicable to one-to-one online user identification processes, insider misuse investigation processes, and online profiling in various areas.
[Impact analysis of shuxuetong injection on abnormal changes of ALT based on generalized boosted models propensity score weighting].

PubMed

Yang, Wei; Yi, Dan-Hui; Xie, Yan-Ming; Yang, Wei; Dai, Yi; Zhi, Ying-Jie; Zhuang, Yan; Yang, Hu

2013-09-01

To estimate treatment effects of Shuxuetong injection on abnormal changes in the ALT index, that is, to explore whether Shuxuetong injection harms liver function in clinical settings and to provide clinical guidance for its safe application. Clinical information on traditional Chinese medicine (TCM) injections was gathered from the hospital information systems (HIS) of eighteen general hospitals. This is a retrospective cohort study, using abnormal changes in the ALT index as the outcome. A large number of confounding biases are taken into account through generalized boosted models (GBM) and a multiple logistic regression model (MLRM) to estimate the treatment effects of Shuxuetong injection on abnormal changes in the ALT index and to explore possible influencing factors. The advantages and application process of GBM have been demonstrated with examples that eliminate the biases from most confounding variables between groups. This serves to adjust the estimate of the treatment effect of Shuxuetong injection on the ALT index, making the results more reliable. Based on large-scale clinical observational data from the HIS database, significant effects of Shuxuetong injection on abnormal changes in ALT have not been found.
Statistical tools for analysis and modeling of cosmic populations and astronomical time series: CUDAHM and TSE

NASA Astrophysics Data System (ADS)

Loredo, Thomas; Budavari, Tamas; Scargle, Jeffrey D.

2018-01-01

This presentation provides an overview of open-source software packages addressing two challenging classes of astrostatistics problems. (1) CUDAHM is a C++ framework for hierarchical Bayesian modeling of cosmic populations, leveraging graphics processing units (GPUs) to enable applying this computationally challenging paradigm to large datasets. CUDAHM is motivated by measurement error problems in astronomy, where density estimation and linear and nonlinear regression must be addressed for populations of thousands to millions of objects whose features are measured with possibly complex uncertainties, potentially including selection effects. An example calculation demonstrates accurate GPU-accelerated luminosity function estimation for simulated populations of $10^6$ objects in about two hours using a single NVIDIA Tesla K40c GPU. (2) Time Series Explorer (TSE) is a collection of software in Python and MATLAB for exploratory analysis and statistical modeling of astronomical time series. It comprises a library of stand-alone functions and classes, as well as an application environment for interactive exploration of time series data. The presentation will summarize key capabilities of this emerging project, including new algorithms for analysis of irregularly-sampled time series.
The recovery of bladder epithelial hyperplasia caused by a melamine diet-induced bladder calculus in mice.

PubMed

Sun, Ying; Jiang, Yi-Na; Xu, Chang-Fu; Du, Yun-Xia; Zhang, Jiao-Jiao; Yan, Yang; Gao, Xiao-Li

2014-02-01

Applying a model of bladder epithelial hyperplasia (BEH) caused by melamine-induced bladder calculus (BC), the recovery of BEH after melamine withdrawal was investigated. One experiment, comprising untreated, melamine and recovery groups, was conducted in Balb/c mice. Each group included 4 subgroups. Mice were fed a normal diet in the untreated group or a melamine diet in the other groups. The melamine diet was then substituted with a normal diet in the recovery group. Both BC and BEH were observed after 14 and 56 days of melamine diet. The BC was relatively uniform at the same melamine-diet durations. The BEH was diffuse with many mitotic figures, 4-7 rows of nuclei, and well-defined umbrella/intermediate cells. No marked differences in BEH degree were observed between the two melamine-diet durations. At 4-42 days after melamine withdrawal, BC was not found, and progressive regression of BEH, ending in complete regression, was observed, along with well-defined ageing/apoptotic cells in the superficial regions of BEH regression tissue. In conclusion, the melamine-induced BEH is relatively uniform, may be self-limiting in rows of nuclei, and can return to normal. Melamine withdrawal duration is critical for the BEH regression. Tissue of the BEH and its regression is ideal for exploring the renewal as well as growth biology of mammalian urothelium. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
Outsourcing primary health care services--how politicians explain the grounds for their decisions.

PubMed

Laamanen, Ritva; Simonsen-Rehn, Nina; Suominen, Sakari; Øvretveit, John; Brommels, Mats

2008-12-01

To explore outsourcing of primary health care (PHC) services in four municipalities in Finland with varying amounts and types of outsourcing: a Southern municipality (SM) which contracted all PHC services to a not-for-profit voluntary organization, and Eastern (EM), South-Western (SWM) and Western (WM) municipalities which had contracted out only a few services to profit or public organizations. A mail survey to all municipality politicians (response rate 52%, N=101) in 2004. Data were analyzed using cross-tabulations, Spearman correlation and linear regression analyses. Politicians were willing to outsource PHC services only partially, and many problems relating to outsourcing were reported. Politicians in all municipalities were least likely to outsource preventive services. A multiple linear regression model showed that reported preference to outsource in EM and in SWM was lower than in SM, and also lower among politicians from "leftist" political parties than "rightist" political parties. Perceived difficulties in local health policy issues were related to reduced preference to outsource. The model explained 27% of the variance of the inclination to outsource PHC services. The findings highlight how important it is to take into account local health policy issues when assessing service-provision models.
A quantile regression approach can reveal the effect of fruit and vegetable consumption on plasma homocysteine levels.

PubMed

Verly, Eliseu; Steluti, Josiane; Fisberg, Regina Mara; Marchioni, Dirce Maria Lobo

2014-01-01

A reduction in homocysteine concentration due to the use of supplemental folic acid is well recognized, although evidence of the same effect for natural folate sources, such as fruits and vegetables (FV), is lacking. The traditional statistical analysis approaches do not provide further information. As an alternative, quantile regression allows for the exploration of the effects of covariates through percentiles of the conditional distribution of the dependent variable. To investigate how the associations of FV intake with plasma total homocysteine (tHcy) differ through percentiles in the distribution using quantile regression. A cross-sectional population-based survey was conducted among 499 residents of Sao Paulo City, Brazil. The participants provided food intake and fasting blood samples. Fruit and vegetable intake was predicted by adjusting for day-to-day variation using a proper measurement error model. We performed a quantile regression to verify the association between tHcy and the predicted FV intake. The predicted values of tHcy for each percentile model were calculated considering an increase of 200 g in the FV intake for each percentile. The results showed that tHcy was inversely associated with FV intake when assessed by linear regression, whereas the association was different when using quantile regression. The relationship with FV consumption was inverse and significant for almost all percentiles of tHcy. The coefficients increased as the percentile of tHcy increased. A simulated increase of 200 g in the FV intake could decrease the tHcy levels in the overall percentiles, but the higher percentiles of tHcy benefited more. Using an innovative statistical approach, this study confirms that the effect of FV intake on lowering tHcy levels depends on the level of tHcy. From a public health point of view, encouraging people to increase FV intake would benefit people with high levels of tHcy.
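Quantile regression replaces the squared-error criterion of linear regression with the asymmetric "check" (pinball) loss, which is what lets it target a chosen percentile rather than the mean. The sketch below illustrates this for the simplest, intercept-only case, where minimizing the loss recovers a sample quantile. The data and function names are hypothetical, and this is not the authors' model, which also includes covariates.

```python
def pinball_loss(y, pred, tau):
    """Mean check loss of a constant prediction `pred` at quantile level `tau` (0 < tau < 1)."""
    total = 0.0
    for yi in y:
        r = yi - pred
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total / len(y)


def best_constant(y, tau):
    """Constant minimizing the check loss; a minimizer always exists among the observed values."""
    return min(sorted(y), key=lambda c: pinball_loss(y, c, tau))


if __name__ == "__main__":
    # Hypothetical skewed sample, for illustration only.
    sample = [1.0, 2.0, 3.0, 4.0, 100.0]
    print(best_constant(sample, 0.5))  # targets the median
    print(best_constant(sample, 0.9))  # targets the upper tail
```

In a full quantile regression the constant is replaced by a linear predictor, and a separate set of coefficients is fitted for each percentile. This is why the reported FV coefficients can differ across percentiles of tHcy.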
An ultra low power feature extraction and classification system for wearable seizure detection.

PubMed

Page, Adam; Pramod Tim Oates, Siddharth; Mohsenin, Tinoosh

2015-01-01

In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198-feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among the five classifiers that were explored, logistic regression (LR) proved to have minimum hardware complexity while providing an average F1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area, power consumption, and the lowest latency when compared to the previous work.
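The input layout and the headline metric can be sketched in a few lines. The code below assumes the features are simply concatenated channel by channel (9 × 22 = 198), which the abstract does not spell out, and uses the standard F1 definition. All names are hypothetical.

```python
N_CHANNELS = 22            # EEG channels, as stated in the abstract
FEATURES_PER_CHANNEL = 9   # features per channel per 1-second window


def flatten_features(per_channel):
    """Concatenate per-channel feature lists into one classifier input (assumed layout)."""
    if len(per_channel) != N_CHANNELS:
        raise ValueError("expected one feature list per channel")
    if any(len(ch) != FEATURES_PER_CHANNEL for ch in per_channel):
        raise ValueError("expected 9 features per channel")
    return [value for ch in per_channel for value in ch]


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    window = [[0.0] * FEATURES_PER_CHANNEL for _ in range(N_CHANNELS)]
    print(len(flatten_features(window)))
    # Hypothetical confusion counts, not results from the paper.
    print(f1_score(tp=8, fp=2, fn=2))
```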
Modified Regression Correlation Coefficient for Poisson Regression Model

NASA Astrophysics Data System (ADS)

Kaengthong, Nattacha; Domthong, Uthumporn

2017-09-01

This study examines indicators of the predictive power of the Generalized Linear Model (GLM), which are widely used but often subject to restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables with multicollinearity among them. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
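As a point of reference, the traditional regression correlation coefficient can be sketched as the Pearson correlation between the observed counts and the fitted means. The code assumes a log link with a single predictor and already-estimated coefficients. The coefficients and data are hypothetical, and the authors' modified coefficient is not reproduced here because the abstract does not define it.

```python
import math


def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def regression_correlation(y, x, b0, b1):
    """corr(Y, E(Y|X)) with E(Y|X) = exp(b0 + b1*x), i.e. a Poisson model with log link."""
    fitted = [math.exp(b0 + b1 * xi) for xi in x]
    return pearson(y, fitted)


if __name__ == "__main__":
    # Hypothetical counts and coefficients, for illustration only.
    x = [0.0, 1.0, 2.0, 3.0, 4.0]
    y = [1, 1, 3, 4, 9]
    print(regression_correlation(y, x, b0=0.0, b1=0.5))
```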
Land Use Regression Modeling of Outdoor Noise Exposure in Informal Settlements in Western Cape, South Africa.

PubMed

Sieber, Chloé; Ragettli, Martina S; Brink, Mark; Toyib, Olaniyan; Baatjies, Roslyn; Saucy, Apolline; Probst-Hensch, Nicole; Dalvie, Mohamed Aqiel; Röösli, Martin

2017-10-20

In low- and middle-income countries, noise exposure and its negative health effects have been little explored. The present study aimed to assess the noise exposure situation in adults living in informal settings in the Western Cape Province, South Africa. We conducted continuous one-week outdoor noise measurements at 134 homes in four different areas. These data were used to develop a land use regression (LUR) model to predict A-weighted day-evening-night equivalent sound levels (Lden) from geographic information system (GIS) variables. Mean noise exposure during day (6:00-18:00) was 60.0 A-weighted decibels (dB(A)) (interquartile range 56.9-62.9 dB(A)), during night (22:00-6:00) 52.9 dB(A) (49.3-55.8 dB(A)) and average Lden was 63.0 dB(A) (60.1-66.5 dB(A)). Main predictors of the LUR model were related to road traffic and household density. Model performance was low (adjusted R² = 0.130) suggesting that other influences than those represented in the geographic predictors are relevant for noise exposure. This is one of the few studies on the noise exposure situation in low- and middle-income countries. It demonstrates that noise exposure levels are high in these settings.
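The day-evening-night level combines the three period levels on an energy basis, with penalties for evening and night. The sketch below assumes the common definition with a 12-hour day, a 4-hour evening (+5 dB) and an 8-hour night (+10 dB), which is consistent with the day and night windows given in the abstract. The evening window (18:00-22:00) and the input values are assumptions, since the abstract does not report an evening level.

```python
import math


def lden(l_day: float, l_evening: float, l_night: float) -> float:
    """Day-evening-night level in dB(A): energy average over 24 h with +5 dB evening and +10 dB night penalties."""
    energy = (
        12 * 10 ** (l_day / 10)                 # 12 h day
        + 4 * 10 ** ((l_evening + 5) / 10)      # 4 h evening, +5 dB penalty
        + 8 * 10 ** ((l_night + 10) / 10)       # 8 h night, +10 dB penalty
    ) / 24
    return 10 * math.log10(energy)


if __name__ == "__main__":
    # Hypothetical period levels in dB(A), not measurements from the study.
    print(round(lden(l_day=61.0, l_evening=58.0, l_night=52.0), 1))
```

Because of the penalties, Lden is typically higher than the plain 24-hour average, which is in line with the reported mean Lden exceeding the reported daytime mean.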
Estimating carbon and showing impacts of drought using satellite data in regression-tree models

USGS Publications Warehouse

Boyte, Stephen; Wylie, Bruce K.; Howard, Danny; Dahal, Devendra; Gilmanov, Tagir G.

2018-01-01

Integrating spatially explicit biogeophysical and remotely sensed data into regression-tree models enables the spatial extrapolation of training data over large geographic spaces, allowing a better understanding of broad-scale ecosystem processes. The current study presents annual gross primary production (GPP) and annual ecosystem respiration (RE) for 2000–2013 in several short-statured vegetation types using carbon flux data from towers that are located strategically across the conterminous United States (CONUS). We calculate carbon fluxes (annual net ecosystem production [NEP]) for each year in our study period, which includes 2012 when drought and higher-than-normal temperatures influence vegetation productivity in large parts of the study area. We present and analyse carbon flux dynamics in the CONUS to better understand how drought affects GPP, RE, and NEP. Model accuracy metrics show strong correlation coefficients (r) (r ≥ 94%) between training and estimated data for both GPP and RE. Overall, average annual GPP, RE, and NEP are relatively constant throughout the study period except during 2012 when almost 60% less carbon is sequestered than normal. These results allow us to conclude that this modelling method effectively estimates carbon dynamics through time and allows the exploration of impacts of meteorological anomalies and vegetation types on carbon dynamics.
School Collective Efficacy and Bullying Behaviour: A Multilevel Study.

PubMed

Olsson, Gabriella; Låftman, Sara Brolin; Modin, Bitte

2017-12-20

As with other forms of violent behaviour, bullying is the result of multiple influences acting on different societal levels. Yet the majority of studies on bullying focus primarily on the characteristics of individual bullies and bullied. Fewer studies have explored how the characteristics of central contexts in young people's lives are related to bullying behaviour over and above the influence of individual-level characteristics. This study explores how teacher-rated school collective efficacy is related to student-reported bullying behaviour (traditional and cyberbullying victimization and perpetration). A central focus is to explore if school collective efficacy is related similarly to both traditional bullying and cyberbullying. Analyses are based on combined information from two independent data collections conducted in 2016 among 11th grade students (n = 6067) and teachers (n = 1251) in 58 upper secondary schools in Stockholm. The statistical method used is multilevel modelling, estimating two-level binary logistic regression models. The results demonstrate statistically significant between-school differences in all outcomes, except traditional bullying perpetration. Strong school collective efficacy is related to less traditional bullying perpetration and less cyberbullying victimization and perpetration, indicating that collective norm regulation and school social cohesion may contribute to reducing the occurrence of bullying.
School Collective Efficacy and Bullying Behaviour: A Multilevel Study

PubMed Central

Olsson, Gabriella; Låftman, Sara Brolin; Modin, Bitte

2017-01-01

As with other forms of violent behaviour, bullying is the result of multiple influences acting on different societal levels. Yet the majority of studies on bullying focus primarily on the characteristics of individual bullies and bullied. Fewer studies have explored how the characteristics of central contexts in young people’s lives are related to bullying behaviour over and above the influence of individual-level characteristics. This study explores how teacher-rated school collective efficacy is related to student-reported bullying behaviour (traditional and cyberbullying victimization and perpetration). A central focus is to explore if school collective efficacy is related similarly to both traditional bullying and cyberbullying. Analyses are based on combined information from two independent data collections conducted in 2016 among 11th grade students (n = 6067) and teachers (n = 1251) in 58 upper secondary schools in Stockholm. The statistical method used is multilevel modelling, estimating two-level binary logistic regression models. The results demonstrate statistically significant between-school differences in all outcomes, except traditional bullying perpetration. Strong school collective efficacy is related to less traditional bullying perpetration and less cyberbullying victimization and perpetration, indicating that collective norm regulation and school social cohesion may contribute to reducing the occurrence of bullying. PMID:29261114
Social deprivation and use of mental health legislation in New Zealand.

PubMed

O'Brien, Anthony John; Kydd, Robert; Frampton, Christopher

2012-11-01

Low socioeconomic status has consistently been associated with poorer health outcomes. Few studies have used ecological analysis to explore relationships between area measures of deprivation and use of mental health legislation. We used an ecological design to explore associations between two area measures of relative deprivation and the two most commonly used sections of New Zealand mental health legislation. High levels of relative deprivation were positively correlated with use of both acute and long-term community care provisions of mental health legislation with the correlation with long-term care achieving significance (r = .518, p = .016). Low levels of relative deprivation showed negative correlations with use of both provisions. The correlation of -.493 between low levels of relative deprivation and acute care provisions was significant at p = .023. In stepwise regression, the proportion of the population aged 15-64 contributed to the model for section 11, but ethnicity contributed to neither model. Mental health legislation is used disproportionately in areas with high levels of relative deprivation. The results have implications for regional allocation of funding for mental health and social services to support community-based care. Further research is needed to explore other factors that may account for the regional variation.
Psychosocial work factors and long sickness absence in Europe.

PubMed

Slany, Corinna; Schütte, Stefanie; Chastang, Jean-François; Parent-Thirion, Agnès; Vermeylen, Greet; Niedhammer, Isabelle

2014-01-01

Studies exploring a wide range of psychosocial work factors separately and together in association with long sickness absence are still lacking. The objective of this study was to explore the associations between psychosocial work factors measured following a comprehensive instrument (Copenhagen psychosocial questionnaire, COPSOQ) and long sickness absence (> 7 days/year) in European employees of 34 countries. An additional objective was to study the differences in these associations according to gender and countries. The study population consisted of 16 120 male and 16 588 female employees from the 2010 European working conditions survey. Twenty-five psychosocial work factors were explored. Statistical analysis was performed using multilevel logistic regression models and interaction testing. When studied together in the same model, factors related to job demands (quantitative demands and demands for hiding emotions), possibilities for development, social relationships (role conflicts, quality of leadership, social support, and sense of community), workplace violence (physical violence, bullying, and discrimination), shift work, and job promotion were associated with long sickness absence. Almost no difference was observed according to gender and country. Comprehensive prevention policies oriented to psychosocial work factors may be useful to prevent long sickness absence at European level.
Psychosocial work factors and long sickness absence in Europe

PubMed Central

Slany, Corinna; Schütte, Stefanie; Chastang, Jean-François; Parent-Thirion, Agnès; Vermeylen, Greet; Niedhammer, Isabelle

2014-01-01

Background: Studies exploring a wide range of psychosocial work factors separately and together in association with long sickness absence are still lacking. Objectives: The objective of this study was to explore the associations between psychosocial work factors measured following a comprehensive instrument (Copenhagen psychosocial questionnaire, COPSOQ) and long sickness absence (>7 days/year) in European employees of 34 countries. An additional objective was to study the differences in these associations according to gender and countries. Methods: The study population consisted of 16 120 male and 16 588 female employees from the 2010 European working conditions survey. Twenty-five psychosocial work factors were explored. Statistical analysis was performed using multilevel logistic regression models and interaction testing. Results: When studied together in the same model, factors related to job demands (quantitative demands and demands for hiding emotions), possibilities for development, social relationships (role conflicts, quality of leadership, social support, and sense of community), workplace violence (physical violence, bullying, and discrimination), shift work, and job promotion were associated with long sickness absence. Almost no difference was observed according to gender and country. Conclusions: Comprehensive prevention policies oriented to psychosocial work factors may be useful to prevent long sickness absence at European level. PMID:24176393

Determination of total iron-reactive phenolics, anthocyanins and tannins in wine grapes of skins and seeds based on near-infrared hyperspectral imaging.

PubMed

Zhang, Ni; Liu, Xu; Jin, Xiaoduo; Li, Chen; Wu, Xuan; Yang, Shuqin; Ning, Jifeng; Yanne, Paul

2017-12-15

Phenolics contents in wine grapes are key indicators for assessing ripeness. Near-infrared hyperspectral images during ripening have been explored to achieve an effective method for predicting phenolics contents. Principal component regression (PCR), partial least squares regression (PLSR) and support vector regression (SVR) models were built, respectively. The results show that SVR behaves globally better than PLSR and PCR, except in predicting tannins content of seeds. For the best prediction results, the squared correlation coefficient and root mean square error reached 0.8960 and 0.1069 g/L (+)-catechin equivalents (CE), respectively, for tannins in skins, 0.9065 and 0.1776 (g/L CE) for total iron-reactive phenolics (TIRP) in skins, 0.8789 and 0.1442 (g/L M3G) for anthocyanins in skins, 0.9243 and 0.2401 (g/L CE) for tannins in seeds, and 0.8790 and 0.5190 (g/L CE) for TIRP in seeds. Our results indicated that NIR hyperspectral imaging has good prospects for evaluation of phenolics in wine grapes. Copyright © 2017 Elsevier Ltd. All rights reserved.
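The two figures of merit quoted in this abstract, the squared correlation coefficient and the root mean square error, can be computed as in the minimal sketch below. It assumes the squared correlation coefficient is the squared Pearson correlation between measured and predicted values; the function names and the four data points are illustrative and are not taken from the study.

```python
import numpy as np

def squared_correlation(y_true, y_pred):
    """Squared Pearson correlation between measured and predicted values."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return float(r ** 2)

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the measurements."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical measured and predicted contents (g/L), for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r2_value = squared_correlation(measured, predicted)
rmse_value = rmse(measured, predicted)
print(f"r^2 = {r2_value:.4f}, RMSE = {rmse_value:.4f} g/L")
```

Note that the squared Pearson correlation ignores any constant offset or scale error in the predictions, which is why it is usually reported together with an error measure such as RMSE.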
Near-infrared reflectance spectroscopy predicts protein, starch, and seed weight in intact seeds of common bean (Phaseolus vulgaris L.).

PubMed

Hacisalihoglu, Gokhan; Larbi, Bismark; Settles, A Mark

2010-01-27

The objective of this study was to explore the potential of near-infrared reflectance (NIR) spectroscopy to determine individual seed composition in common bean (Phaseolus vulgaris L.). NIR spectra and analytical measurements of seed weight, protein, and starch were collected from 267 individual bean seeds representing 91 diverse genotypes. Partial least-squares (PLS) regression models were developed with 61 bean accessions randomly assigned to a calibration data set and 30 accessions assigned to an external validation set. Protein gave the most accurate PLS regression, with the external validation set having a standard error of prediction (SEP) = 1.6%. PLS regressions for seed weight and starch had sufficient accuracy for seed sorting applications, with SEP = 41.2 mg and 4.9%, respectively. Seed color had a clear effect on the NIR spectra, with black beans having a distinct spectral type. Seed coat color did not impact the accuracy of PLS predictions. This research demonstrates that NIR is a promising technique for simultaneous sorting of multiple seed traits in single bean seeds with no sample preparation.
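The calibration/validation design described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the split is done at the accession level (61 versus 30 of 91), the seed value and function names are hypothetical, and SEP is computed with one common bias-corrected definition; other definitions exist and the abstract does not say which was used.

```python
import numpy as np

def split_accessions(accession_ids, n_calibration, seed=0):
    """Randomly assign whole accessions to calibration and validation sets."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(accession_ids))
    return shuffled[:n_calibration], shuffled[n_calibration:]

def standard_error_of_prediction(y_true, y_pred):
    """Bias-corrected standard deviation of the prediction residuals."""
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    bias = residuals.mean()
    return float(np.sqrt(np.sum((residuals - bias) ** 2) / (len(residuals) - 1)))

# 91 accessions, labelled 0..90 here purely for illustration.
calibration, validation = split_accessions(np.arange(91), n_calibration=61)
print(len(calibration), len(validation))
```

Splitting by accession rather than by seed keeps seeds of the same genotype from appearing in both sets, which would otherwise make the external validation look better than it is.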
Bayesian Factor Analysis as a Variable Selection Problem: Alternative Priors and Consequences

PubMed Central

Lu, Zhao-Hua; Chow, Sy-Miin; Loken, Eric

2016-01-01

Factor analysis is a popular statistical technique for multivariate data analysis. Developments in the structural equation modeling framework have enabled the use of hybrid confirmatory/exploratory approaches in which factor loading structures can be explored relatively flexibly within a confirmatory factor analysis (CFA) framework. Recently, a Bayesian structural equation modeling (BSEM) approach (Muthén & Asparouhov, 2012) has been proposed as a way to explore the presence of cross-loadings in CFA models. We show that the issue of determining factor loading patterns may be formulated as a Bayesian variable selection problem in which Muthén and Asparouhov’s approach can be regarded as a BSEM approach with ridge regression prior (BSEM-RP). We propose another Bayesian approach, denoted herein as the Bayesian structural equation modeling with spike and slab prior (BSEM-SSP), which serves as a one-stage alternative to the BSEM-RP. We review the theoretical advantages and disadvantages of both approaches and compare their empirical performance relative to two modification indices-based approaches and exploratory factor analysis with target rotation. A teacher stress scale data set (Byrne, 2012; Pettegrew & Wolf, 1982) is used to demonstrate our approach. PMID:27314566
Translating statistical species-habitat models to interactive decision support tools

USGS Publications Warehouse

Wszola, Lyndsie S.; Simonsen, Victoria L.; Stuber, Erica F.; Gillespie, Caitlyn R.; Messinger, Lindsey N.; Decker, Karie L.; Lusk, Jeffrey J.; Jorgensen, Christopher F.; Bishop, Andrew A.; Fontaine, Joseph J.

2017-01-01

Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences.
Employment status and intimate partner violence among Mexican women.

PubMed

Terrazas-Carrillo, Elizabeth C; McWhirter, Paula T

2015-04-01

Exploring risk factors and profiles of intimate partner violence in other countries provides information about whether existing theories of this phenomenon hold consistent in different cultural settings. This study will present results of a regression analysis involving domestic violence among Mexican women (n = 83,159). Significant predictors of domestic violence among Mexican women included age, number of children in the household, income, education, self-esteem, family history of abuse, and controlling behavior of the husband. Women's employment status was not a significant predictor when all variables were included in the model; however, when controlling behavior of the husband was withdrawn from the model, women's employment status was a significant predictor of domestic violence toward women. Results from this research indicate that spousal controlling behavior may serve as a mediator of the predictive relationship between women's employment status and domestic violence among Mexican women. Findings provide support for continued exploration of the factors that mediate experiences of domestic violence among women worldwide. © The Author(s) 2014.
Caregiving in a patient's place of residence: turnover of direct care workers in home care and hospice agencies.

PubMed

Dill, Janette S; Cagle, John

2010-09-01

High turnover and staff shortages among home care and hospice workers may compromise the quality and availability of in-home care. This study explores turnover rates of direct care workers for home care and hospice agencies. Ordinary least squares (OLS) regression models are run using organizational data from 93 home care agencies and 29 hospice agencies in North Carolina. Home care agencies have higher total turnover rates than hospice agencies, but profit status may be an important covariate. Higher unemployment rates are associated with lower voluntary turnover. Agencies that do not offer health benefits experience higher involuntary turnover. Differences in turnover between hospice and home health agencies suggest that organizational characteristics of hospice care contribute to lower turnover rates. However, the variation in turnover rates is not fully explained by the proposed multivariate models. Future research should explore individual and structural-level variables that affect voluntary and involuntary turnover in these settings.
Translating statistical species-habitat models to interactive decision support tools.

PubMed

Wszola, Lyndsie S; Simonsen, Victoria L; Stuber, Erica F; Gillespie, Caitlyn R; Messinger, Lindsey N; Decker, Karie L; Lusk, Jeffrey J; Jorgensen, Christopher F; Bishop, Andrew A; Fontaine, Joseph J

2017-01-01

Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences.
Translating statistical species-habitat models to interactive decision support tools

PubMed Central

Wszola, Lyndsie S.; Simonsen, Victoria L.; Stuber, Erica F.; Gillespie, Caitlyn R.; Messinger, Lindsey N.; Decker, Karie L.; Lusk, Jeffrey J.; Jorgensen, Christopher F.; Bishop, Andrew A.; Fontaine, Joseph J.

2017-01-01

Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences. PMID:29236707
[Physical and sexual abuse during childhood and revictimization during adulthood in Mexican women].

PubMed

Rivera-Rivera, Leonor; Allen, Betania; Chávez-Ayala, Rubén; Avila-Burgos, Leticia

2006-01-01

To quantify the association between physical and sexual abuse during childhood and violence during adulthood in a representative sample of female health care users in Mexico. A questionnaire was administered to 26 042 women over 14 years of age who sought medical consultation from public health care services between October 2002 and March 2003, in all 32 states in Mexico. Two models were constructed: a) Multiple polytomous logistic regression models to explore the association between violent victimization by the partner during adulthood and violence during childhood. b) Multiple logistic regression models to explore the association between experiencing rape during adulthood and violence during childhood. Among women studied, an association was found between experiencing physical violence during childhood and suffering physical and sexual violence from the male partner or experiencing rape during adulthood. When physical violence during childhood occurred "almost always", the woman was more likely to undergo physical and sexual violence (OR = 3.1; 95% CI 2.6-3.7) and rape (OR = 2.9; 95% CI 2.4-3.6) during her adult life. In addition, when violence during childhood was more frequent, the likelihood of experiencing violence during adulthood was greater. A positive association was found between physical and sexual abuse before 15 years of age (OR = 2.8; 95% CI 2.2-3.5). Experiencing rape during adulthood was also associated with sexual abuse before 15 years of age (OR = 11.8; 95% CI 10.2-13.7). In this sample of Mexican women, both physical and sexual violence during childhood have negative consequences during adulthood, including a greater likelihood of revictimization by the male partner and rape. Physical and sexual abuse during childhood must be prevented or at least detected and treated.
Associations between ambient fine particulate air pollution and hypertension: A nationwide cross-sectional study in China.

PubMed

Liu, Cong; Chen, Renjie; Zhao, Yaohui; Ma, Zongwei; Bi, Jun; Liu, Yang; Meng, Xia; Wang, Yafeng; Chen, Xinxin; Li, Weihua; Kan, Haidong

2017-04-15

Limited evidence is available regarding the long-term effects of fine particulate (PM2.5) air pollution on hypertension in developing countries. This study aimed to explore the associations of long-term exposure to PM2.5 with hypertension prevalence and blood pressure (BP) in China. We conducted a cross-sectional study based on a nationally representative survey (13,975 participants). We estimated the long-term average exposure to PM2.5 for all subjects during the study period (June 2011 to March 2012) by a satellite-based model with a spatial resolution of 10 × 10 km. We applied multivariable logistic regression models to evaluate the associations between PM2.5 and hypertension prevalence and linear regression models for the associations between PM2.5 and systolic BP and diastolic BP. We also explored potential effect modification by stratification analyses. There were 5715 cases of hypertension, accounting for 40.9% of the study population in this analysis. The annual mean exposure to PM2.5 for all participants was 72.8 μg/m³ on average. An interquartile range (IQR, 41.7 μg/m³) increase in PM2.5 was associated with higher prevalence of hypertension with an odds ratio of 1.11 [95% confidence interval (CI): 1.05, 1.17]. Systolic BP increased by 0.60 mmHg (95% CI: 0.05, 1.15) per IQR increase in PM2.5. The effects of PM2.5 on hypertension prevalence were stronger among middle-aged, obese and urban participants. This national study indicated that long-term exposure to PM2.5 was associated with increased prevalence of hypertension and slightly higher systolic BP in China. Copyright © 2017 Elsevier B.V. All rights reserved.
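An odds ratio reported per IQR increase can be re-expressed per any other exposure increment, provided the exposure enters the logistic model as a single linear term. The sketch below makes that assumption (the abstract does not state the model form) and uses only the figures quoted above; the function name and the 10 μg/m³ increment are illustrative choices.

```python
import math

def rescale_odds_ratio(or_per_step, step, new_step):
    """Convert an odds ratio per `step` exposure units to one per `new_step` units.

    Assumes a log-linear exposure term, so log(OR) scales with the increment.
    """
    beta = math.log(or_per_step) / step  # log-odds change per exposure unit
    return math.exp(beta * new_step)

or_per_iqr = 1.11   # odds ratio per IQR, from the abstract
iqr = 41.7          # IQR of PM2.5 in ug/m3, from the abstract

or_per_10 = rescale_odds_ratio(or_per_iqr, iqr, 10.0)
prevalence_pct = 100 * 5715 / 13975  # cases / participants, from the abstract

print(f"OR per 10 ug/m3: {or_per_10:.3f}")
print(f"Hypertension prevalence: {prevalence_pct:.1f}%")
```

Under that assumption the reported estimate corresponds to an odds ratio of roughly 1.025 per 10 μg/m³, and the case count reproduces the stated 40.9% prevalence.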
Reconstruction of spatio-temporal temperature from sparse historical records using robust probabilistic principal component regression

USGS Publications Warehouse

Tipton, John; Hooten, Mevin B.; Goring, Simon

2017-01-01

Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. 
The result of our analysis is a spatially explicit reconstruction of spatio-temporal temperature from a very sparse historical record.
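For orientation, classical principal component regression, the starting point that this abstract extends, can be sketched in a few lines. This is the textbook version only, not the authors' Bayesian hierarchical model: it has no probabilistic latent components, no outlier-robust likelihood, and it uses plain truncation rather than the alternative regularizations they explore. The synthetic data and all names are illustrative.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the leading principal components of the centered X."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_components].T                 # shape (p, k)
    scores = Xc @ V_k                         # component scores, shape (n, k)
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V_k @ gamma                        # coefficients on original predictors
    return x_mean, y_mean, beta

def pcr_predict(model, X_new):
    x_mean, y_mean, beta = model
    return y_mean + (X_new - x_mean) @ beta

# Synthetic example: 50 observations, 5 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.01 * rng.normal(size=50)

model_truncated = pcr_fit(X, y, n_components=2)   # keeps 2 components
model_full = pcr_fit(X, y, n_components=5)        # equivalent to least squares
pred_full = pcr_predict(model_full, X)
```

Keeping every component reproduces ordinary least squares, so the choice of how many components to retain is what regularizes the fit.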
Association between surgeon volume and hospitalisation costs for patients with oral cancer: a nationwide population base study in Taiwan.

PubMed

Lee, C-C; Ho, H-C; Jack, Lee C-C; Su, Y-C; Lee, M-S; Hung, S-K; Chou, Pesus

2010-02-01

Oral cancer leads to a considerable use of and expenditure on health care. Wide resection of the tumour and reconstruction with a pedicle flap/free flap is widely used. This study was conducted to explore the relationship between hospitalisation costs and surgeon case volume when this operation was performed. A population-based study. This study uses data for the years 2005-2006 obtained from the National Health Insurance Research Database published in the Taiwanese National Health Research Institute. From this population-based data, the authors selected a total of 2663 oral cancer patients who underwent tumour resection and reconstruction. Case volume relationships were based on the following criteria; low-, medium-, high-, very high-volume surgeons were defined by or= 56 resections with reconstruction, respectively. Hierarchical linear regression analysis was subsequently performed to explore the relationship between surgeon case volume and the cost and length of hospitalisation. The mean hospitalisation cost among the 2663 patients was US$ 9528 (all costs are given in US dollars). After adjusting for physician, hospital, and patient characteristics in a hierarchical linear regression model, the cost per patient for low-volume surgeons was found to be US$ 741 (P = 0.012) higher than that for medium-volume surgeons, US$ 1546 (P < 0.001) higher than that for high-volume surgeons, and US$ 1820 (P < 0.001) higher than that for very-high-volume surgeons. After adjustment for physician, hospital, and patient characteristics, the hierarchical linear regression model revealed that the mean length of stay per patient for low-volume surgeons was the highest (P < 0.001). After adjustment for physician, hospital, and patient characteristics, low-volume surgeons performing wide excision with reconstructive surgery in oral cancer patients incurred significantly higher costs and longer hospital stays per patient than did other surgeons. 
Treatment strategies adopted by high- and very-high-volume surgeons should be analysed further and utilised more widely.
Is the maturity of hospitals' quality improvement systems associated with measures of quality and patient safety?

PubMed Central

2011-01-01

Background: Previous research addressed the development of a classification scheme for quality improvement systems in European hospitals. In this study we explore associations between the 'maturity' of the hospitals' quality improvement system and clinical outcomes. Methods: The maturity classification scheme was developed based on survey results from 389 hospitals in eight European countries. We matched the hospitals from the Spanish sample (113 hospitals) with those hospitals participating in a nation-wide, voluntary hospital performance initiative. We then compared sample distributions and explored associations between the 'maturity' of the hospitals' quality improvement system and a range of composite outcomes measures, such as adjusted hospital-wide mortality, -readmission, -complication and -length of stay indices. Statistical analysis includes bivariate correlations for parametrically and non-parametrically distributed data, multiple robust regression models and bootstrapping techniques to obtain confidence-intervals for the correlation and regression estimates. Results: Overall, 43 hospitals were included. Compared to the original sample of 113, this sample was characterized by a higher representation of university hospitals. Maturity of the quality improvement system was similar, although the matched sample showed less variability. Analysis of associations between the quality improvement system and hospital-wide outcomes suggests significant correlations for the indicator adjusted hospital complications, borderline significance for adjusted hospital readmissions and non-significance for the adjusted hospital mortality and length of stay indicators. These results are confirmed by the bootstrap estimates of the robust regression model after adjusting for hospital characteristics. Conclusions: We assessed associations between hospitals' quality improvement systems and clinical outcomes. From these data it seems that having a more developed quality improvement system is associated with lower rates of adjusted hospital complications. A number of methodological and logistic hurdles remain to link hospital quality improvement systems to outcomes. Further research should aim at identifying the latent dimensions of quality improvement systems that predict quality and safety outcomes. Such research would add pertinent knowledge regarding the implementation of organizational strategies related with quality of care outcomes. PMID:22185479
Spectral Estimation Model Construction of Heavy Metals in Mining Reclamation Areas

PubMed Central

Dong, Jihong; Dai, Wenting; Xu, Jiren; Li, Songnian

2016-01-01

The study reported here examined, as the research subject, surface soils in the Liuxin mining area of Xuzhou, and explored the heavy metal content and spectral data by establishing quantitative models with Multivariable Linear Regression (MLR), Generalized Regression Neural Network (GRNN) and Sequential Minimal Optimization for Support Vector Machine (SMO-SVM) methods. The study results are as follows: (1) the estimations of the spectral inversion models established based on MLR, GRNN and SMO-SVM are satisfactory, and the MLR model provides the worst estimation, with R² of more than 0.46. This result suggests that the stress sensitive bands of heavy metal pollution contain enough effective spectral information; (2) the GRNN model can simulate the data from small samples more effectively than the MLR model, and the R² between the contents of the five heavy metals estimated by the GRNN model and the measured values are approximately 0.7; (3) the stability and accuracy of the spectral estimation using the SMO-SVM model are obviously better than that of the GRNN and MLR models. Among all five types of heavy metals, the estimation for cadmium (Cd) is the best when using the SMO-SVM model, and its R² value reaches 0.8628; (4) using the optimal model to invert the Cd content in wheat that are planted on mine reclamation soil, the R² and RMSE between the measured and the estimated values are 0.6683 and 0.0489, respectively. This result suggests that the method using the SMO-SVM model to estimate the contents of heavy metals in wheat samples is feasible. PMID:27367708
Spectral Estimation Model Construction of Heavy Metals in Mining Reclamation Areas.

PubMed

Dong, Jihong; Dai, Wenting; Xu, Jiren; Li, Songnian

2016-06-28

The study reported here examined, as the research subject, surface soils in the Liuxin mining area of Xuzhou, and explored the heavy metal content and spectral data by establishing quantitative models with Multivariable Linear Regression (MLR), Generalized Regression Neural Network (GRNN) and Sequential Minimal Optimization for Support Vector Machine (SMO-SVM) methods. The study results are as follows: (1) the estimations of the spectral inversion models established based on MLR, GRNN and SMO-SVM are satisfactory, and the MLR model provides the worst estimation, with R² of more than 0.46. This result suggests that the stress sensitive bands of heavy metal pollution contain enough effective spectral information; (2) the GRNN model can simulate the data from small samples more effectively than the MLR model, and the R² between the contents of the five heavy metals estimated by the GRNN model and the measured values are approximately 0.7; (3) the stability and accuracy of the spectral estimation using the SMO-SVM model are obviously better than that of the GRNN and MLR models. Among all five types of heavy metals, the estimation for cadmium (Cd) is the best when using the SMO-SVM model, and its R² value reaches 0.8628; (4) using the optimal model to invert the Cd content in wheat that are planted on mine reclamation soil, the R² and RMSE between the measured and the estimated values are 0.6683 and 0.0489, respectively. This result suggests that the method using the SMO-SVM model to estimate the contents of heavy metals in wheat samples is feasible.
Regression modeling of ground-water flow

USGS Publications Warehouse

Cooley, R.L.; Naff, R.L.

1985-01-01

Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Discovery of potent NEK2 inhibitors as potential anticancer agents using structure-based exploration of NEK2 pharmacophoric space coupled with QSAR analyses.

PubMed

Khanfar, Mohammad A; Banat, Fahmy; Alabed, Shada; Alqtaishat, Saja

2017-02-01

High expression of Nek2 has been detected in several types of cancer and it represents a novel target for human cancer. In the current study, structure-based pharmacophore modeling combined with multiple linear regression (MLR)-based QSAR analyses was applied to disclose the structural requirements for NEK2 inhibition. Generated pharmacophoric models were initially validated with receiver operating characteristic (ROC) curve, and optimum models were subsequently implemented in QSAR modeling with other physiochemical descriptors. QSAR-selected models were implied as 3D search filters to mine the National Cancer Institute (NCI) database for novel NEK2 inhibitors, whereas the associated QSAR model prioritized the bioactivities of captured hits for in vitro evaluation. Experimental validation identified several potent NEK2 inhibitors of novel structural scaffolds. The most potent captured hit exhibited an [Formula: see text] value of 237 nM.
Retention modelling of polychlorinated biphenyls in comprehensive two-dimensional gas chromatography.

PubMed

D'Archivio, Angelo Antonio; Incani, Angela; Ruggieri, Fabrizio

2011-01-01

In this paper, we use a quantitative structure-retention relationship (QSRR) method to predict the retention times of polychlorinated biphenyls (PCBs) in comprehensive two-dimensional gas chromatography (GC×GC). We analyse the GC×GC retention data taken from the literature by comparing predictive capability of different regression methods. The various models are generated using 70 out of 209 PCB congeners in the calibration stage, while their predictive performance is evaluated on the remaining 139 compounds. The two-dimensional chromatogram is initially estimated by separately modelling retention times of PCBs in the first and in the second column ($^{1}t_R$ and $^{2}t_R$, respectively). In particular, multilinear regression (MLR) combined with genetic algorithm (GA) variable selection is performed to extract two small subsets of predictors for $^{1}t_R$ and $^{2}t_R$ from a large set of theoretical molecular descriptors provided by the popular software Dragon, which after removal of highly correlated or almost constant variables consists of 237 structure-related quantities. Based on GA-MLR analysis, a four-dimensional and a five-dimensional relationship modelling $^{1}t_R$ and $^{2}t_R$, respectively, are identified. Single-response partial least square (PLS-1) regression is alternatively applied to independently model $^{1}t_R$ and $^{2}t_R$ without the need for preliminary GA variable selection. Further, we explore the possibility of predicting the two-dimensional chromatogram of PCBs in a single calibration procedure by using a two-response PLS (PLS-2) model or a feed-forward artificial neural network (ANN) with two output neurons. In the first case, regression is carried out on the full set of 237 descriptors, while the variables previously selected by GA-MLR are initially considered as ANN inputs and subjected to a sensitivity analysis to remove the redundant ones. Results show PLS-1 regression exhibits a noticeably better descriptive and predictive performance than the other investigated approaches. The observed values of determination coefficients for $^{1}t_R$ and $^{2}t_R$ in calibration (0.9999 and 0.9993, respectively) and prediction (0.9987 and 0.9793, respectively) provided by PLS-1 demonstrate that GC×GC behaviour of PCBs is properly modelled. In particular, the predicted two-dimensional GC×GC chromatogram of 139 PCBs not involved in the calibration stage closely resembles the experimental one. Based on the above lines of evidence, the proposed approach ensures accurate simulation of the whole GC×GC chromatogram of PCBs using experimental determination of only 1/3 retention data of representative congeners.
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

ERIC Educational Resources Information Center

Haberman, Shelby J.; Sinharay, Sandip

2010-01-01

Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
Injury risk functions based on population-based finite element model responses: Application to femurs under dynamic three-point bending.

PubMed

Park, Gwansik; Forman, Jason; Kim, Taewung; Panzer, Matthew B; Crandall, Jeff R

2018-02-28

The goal of this study was to explore a framework for developing injury risk functions (IRFs) in a bottom-up approach based on responses of parametrically variable finite element (FE) models representing exemplar populations. First, a parametric femur modeling tool was developed and validated using a subject-specific (SS)-FE modeling approach. Second, principal component analysis and regression were used to identify parametric geometric descriptors of the human femur and the distribution of those factors for 3 target occupant sizes (5th, 50th, and 95th percentile males). Third, distributions of material parameters of cortical bone were obtained from the literature for 3 target occupant ages (25, 50, and 75 years) using regression analysis. A Monte Carlo method was then implemented to generate populations of FE models of the femur for target occupants, using the parametric femur modeling tool. Simulations were conducted with each of these models under 3-point dynamic bending. Finally, model-based IRFs were developed using logistic regression analysis, based on the moment at fracture observed in the FE simulation. In total, 100 femur FE models incorporating the variation in the population of interest were generated, and 500,000 moments at fracture were observed (applying 5,000 ultimate strains to each of the 100 synthesized femur FE models) for each set of target occupant characteristics. Using the proposed framework in this study, the model-based IRFs for 3 target male occupant sizes (5th, 50th, and 95th percentiles) and ages (25, 50, and 75 years) were developed. The model-based IRF was located in the 95% confidence interval of the test-based IRF for the range of 15 to 70% injury risks. The 95% confidence interval of the developed IRF was almost in line with the mean curve due to a large number of data points.
The framework proposed in this study would be beneficial for developing the IRFs in a bottom-up manner, whose range of variabilities is informed by the population-based FE model responses. Specifically, this method mitigates the uncertainties in applying empirical scaling and may improve IRF fidelity when a limited number of experimental specimens are available.

A microarray whole-genome gene expression dataset in a rat model of inflammatory corneal angiogenesis.

PubMed

Mukwaya, Anthony; Lindvall, Jessica M; Xeroudaki, Maria; Peebo, Beatrice; Ali, Zaheer; Lennikov, Anton; Jensen, Lasse Dahl Ejby; Lagali, Neil

2016-11-22

In angiogenesis with concurrent inflammation, many pathways are activated, some linked to VEGF and others largely VEGF-independent. Pathways involving inflammatory mediators, chemokines, and micro-RNAs may play important roles in maintaining a pro-angiogenic environment or mediating angiogenic regression. Here, we describe a gene expression dataset to facilitate exploration of pro-angiogenic, pro-inflammatory, and remodelling/normalization-associated genes during both an active capillary sprouting phase, and in the restoration of an avascular phenotype. The dataset was generated by microarray analysis of the whole transcriptome in a rat model of suture-induced inflammatory corneal neovascularisation. Regions of active capillary sprout growth or regression in the cornea were harvested and total RNA extracted from four biological replicates per group. High quality RNA was obtained for gene expression analysis using microarrays. Fold change of selected genes was validated by qPCR, and protein expression was evaluated by immunohistochemistry. We provide a gene expression dataset that may be re-used to investigate corneal neovascularisation, and may also have implications in other contexts of inflammation-mediated angiogenesis.
Nonparametric regression applied to quantitative structure-activity relationships

PubMed

Constans; Hirst

2000-03-01

Several nonparametric regressors have been applied to modeling quantitative structure-activity relationship (QSAR) data. The simplest regressor, the Nadaraya-Watson, was assessed in a genuine multivariate setting. Other regressors, the local linear and the shifted Nadaraya-Watson, were implemented within additive models--a computationally more expedient approach, better suited for low-density designs. Performances were benchmarked against the nonlinear method of smoothing splines. A linear reference point was provided by multilinear regression (MLR). Variable selection was explored using systematic combinations of different variables and combinations of principal components. For the data set examined, 47 inhibitors of dopamine beta-hydroxylase, the additive nonparametric regressors have greater predictive accuracy (as measured by the mean absolute error of the predictions or the Pearson correlation in cross-validation trials) than MLR. The use of principal components did not improve the performance of the nonparametric regressors over use of the original descriptors, since the original descriptors are not strongly correlated. It remains to be seen if the nonparametric regressors can be successfully coupled with better variable selection and dimensionality reduction in the context of high-dimensional QSARs.
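
As a rough illustration of the simplest regressor named in this abstract, the Nadaraya-Watson estimator predicts a response as a kernel-weighted average of the training responses. The sketch below assumes a Gaussian kernel, a single descriptor, and a hand-picked bandwidth; the function name and the toy data are illustrative and are not taken from the study.

```python
import math

def nadaraya_watson(x_train, y_train, x_query, bandwidth=1.0):
    """Kernel-weighted average of y_train, using a Gaussian kernel (an assumption)."""
    weights = [math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in x_train]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, y_train)) / total

# Toy data (hypothetical): one descriptor and a nonlinear response.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [0.0, 1.0, 4.0, 9.0, 16.0]
estimate = nadaraya_watson(x_train, y_train, 2.0, bandwidth=0.5)
```

A smaller bandwidth makes the estimate follow nearby points more closely, while a larger one smooths more heavily; in practice the bandwidth would be chosen by cross-validation rather than fixed by hand.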
A cross-sectional study for estimation of associations between education level and osteoporosis in a Chinese men sample.

PubMed

Yu, Cai-Xia; Zhang, Xiu-Zhen; Zhang, Keqin; Tang, Zihui

2015-12-09

The main aim of this study was to evaluate the association between education level and osteoporosis (OP) in the general population of Chinese men. We conducted a large-scale, community-based, cross-sectional study to investigate the association, using a self-report questionnaire to assess education levels. Data from 1092 men were available for analysis. Multiple regression models that included education level and controlled for confounding factors were used to explore the relationship between education level and OP. A positive correlation between education level and the T-score of quantitative bone ultrasound (QUS-T score) was reported (β = 0.108, P value < 0.001). Multiple regression analysis indicated that education level was independently and significantly associated with OP (P < 0.1 for all models). Men with a lower education level had a higher prevalence of OP. Education level was independently and significantly associated with OP, and OP was more prevalent in Chinese men with a lower education level. (ClinicalTrials.gov Identifier: NCT02451397; date of registration: 05/28/2015).
Squared exponential covariance function for prediction of hydrocarbon in seabed logging application

NASA Astrophysics Data System (ADS)

Mukhtar, Siti Mariam; Daud, Hanita; Dass, Sarat Chandra

2016-11-01

Seabed logging (SBL) technology has progressively emerged as one of the most in-demand technologies in the exploration and production (E&P) industry. Hydrocarbon prediction in deep-water areas is a crucial task for a driller in any oil and gas company, as drilling is very expensive. Simulation data generated by Computer Software Technology (CST) are used to predict the presence of hydrocarbon, where the models replicate a real SBL environment. These models indicate that hydrocarbon-filled reservoirs are more resistive than the surrounding water-filled sediments. As hydrocarbon depth increases, it becomes more challenging to differentiate data with and without hydrocarbon. MATLAB is used to extract data for curve fitting with a Gaussian process (GP). GPs can be applied to both regression and classification problems; this work focuses only on Gaussian process regression (GPR). The most popular choice of covariance function for GPR is the squared exponential (SE), as it provides stability and probabilistic prediction for large amounts of data. Hence, SE is used to predict the presence or absence of hydrocarbon in the reservoir from the data generated.
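
To make the squared exponential covariance concrete, the following minimal sketch computes the SE kernel and the posterior mean of a zero-mean Gaussian process for scalar inputs. It is written in Python rather than the MATLAB used in the study, and the hyperparameter values, the small noise term, the toy data, and the helper names are all assumptions for illustration.

```python
import math

def se_kernel(x1, x2, signal_var=1.0, length_scale=1.0):
    """Squared exponential covariance between two scalar inputs."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def gp_posterior_mean(x_train, y_train, x_query, noise_var=1e-6, **kernel_args):
    """Posterior mean k(x*, X) (K + noise_var * I)^{-1} y of a zero-mean GP."""
    k = [[se_kernel(a, b, **kernel_args) + (noise_var if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    alpha = solve(k, y_train)
    return sum(se_kernel(x_query, x, **kernel_args) * a
               for x, a in zip(x_train, alpha))

# Toy data (hypothetical): three observations of a smooth curve.
prediction = gp_posterior_mean([0.0, 1.0, 2.0], [0.0, 1.0, 0.5], 1.5)
```

The length scale controls how quickly the covariance decays with distance between inputs; a production implementation would normally estimate the hyperparameters from the data and use a Cholesky factorization instead of plain elimination.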
Moderation analysis using a two-level regression model.

PubMed

Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

2014-10-01

Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
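
The link between the two formulations can be sketched for a single predictor x and a single moderator z. The notation below is chosen here for illustration and is not necessarily the paper's; it writes the level-1 regression, regresses its coefficients on the moderator, and substitutes.

```latex
\begin{align}
y_i &= \beta_{0i} + \beta_{1i} x_i + e_i, \\
\beta_{0i} &= \gamma_{00} + \gamma_{01} z_i + u_{0i}, \\
\beta_{1i} &= \gamma_{10} + \gamma_{11} z_i + u_{1i}, \\
y_i &= \gamma_{00} + \gamma_{01} z_i + \gamma_{10} x_i + \gamma_{11} x_i z_i
       + \left(u_{0i} + u_{1i} x_i + e_i\right).
\end{align}
```

The product term in the last line is the familiar MMR term. Under this sketch the composite error contains the term with u_{1i} multiplied by x_i, so its variance changes with x_i, which is one way to see why least squares, which assumes constant error variance, can be less efficient when the coefficients vary randomly.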
The microcomputer scientific software series 2: general linear model--regression.

Treesearch

Harold M. Rauscher

1983-01-01

The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
What makes home health workers think about leaving their job? The role of physical injury and organizational support.

PubMed

Lee, Ahyoung Anna; Jang, Yuri

2016-01-01

Based on the job demands-resources (JD-R) model, this study explored the role of physical injury and organizational support in predicting home health workers' turnover intention. In a sample of home health workers in Central Texas (n = 150), about 37% reported turnover intention. The logistic regression model showed that turnover intention was 3.23 times more likely among those who had experienced work-related injury. On the other hand, organizational support was found to reduce the likelihood of turnover intention. Findings suggest that injury and organizational support should be prioritized in prevention and intervention efforts to promote home health workers' safety and retention.
Carrying the burdens of poverty, parenting, and addiction: depression symptoms and self-silencing among ethnically diverse women.

PubMed

Grant, Therese M; Jack, Dana C; Fitzpatrick, Annette L; Ernst, Cara C

2011-02-01

Depression among women commonly co-occurs with substance abuse. We explore the association between women's depressive symptoms and self-silencing accounting for the effects of known childhood and adult risk indicators. Participants are 233 ethnically diverse, low-income women who abused alcohol/drugs prenatally. Depressive symptomatology was assessed using the Addiction Severity Index. Multivariate logistic regression models examined the association between self-silencing and the dependent depression variable. The full model indicated a 3% increased risk for depressive distress for each point increase in self-silencing score (OR = 1.03; P = .001). Differences in depressive symptomatology by ethnic groups were accounted for by their differences in self-silencing.
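
Because a logistic regression coefficient acts on the log-odds scale, a per-point odds ratio compounds multiplicatively over larger score differences. The short sketch below shows that arithmetic for the reported OR of 1.03; the 10-point increase is a hypothetical example, and strictly speaking the quantity is a ratio of odds rather than of risks.

```python
import math

# Reported odds ratio per one-point increase in self-silencing score.
odds_ratio_per_point = 1.03

# In a logistic model the odds ratio equals exp(coefficient).
coefficient = math.log(odds_ratio_per_point)

def odds_ratio_for_increase(points):
    """Odds ratio implied by a score increase of `points`, other covariates held fixed."""
    return math.exp(coefficient * points)

ten_point_or = odds_ratio_for_increase(10)  # hypothetical 10-point difference
print(round(ten_point_or, 3))  # 1.344
```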
Exploration of charity toward busking (street performance) as a function of religion.

PubMed

Lemay, John O; Bates, Larry W

2013-04-01

To examine conceptions of religion and charity in a new venue--busking (street performance)--103 undergraduate students at a regional university in the southeastern U.S. completed a battery of surveys regarding religion, and attitudes and behaviors toward busking. For those 85 participants who had previously encountered a busker, stepwise regression was used to predict increased frequency of giving to buskers. The best predictive model of giving to buskers consisted of three variables including less experienced irritation toward buskers, prior experience with giving to the homeless, and lower religious fundamentalism.
Results of the 2009 AORN salary survey.

PubMed

Bacon, Donald

2009-12-01

AORN conducted its seventh annual compensation survey for perioperative nurses in August of 2009. A multiple regression model was used to examine how a variety of variables including job title, education level, certification, experience, and geographic region affect nursing compensation. Comparisons between the 2009 data and previous years' data are presented. The effects of other forms of compensation, such as on-call compensation, overtime, bonuses, and shift differentials on average base compensation rates also are examined. Additional analyses explore the effect of the current economic downturn on the perioperative work environment. (c) AORN, Inc, 2009.
Climate variations and salmonellosis transmission in Adelaide, South Australia: a comparison between regression models

NASA Astrophysics Data System (ADS)

Zhang, Ying; Bi, Peng; Hiller, Janet

2008-01-01

This is the first study to identify appropriate regression models for the association between climate variation and salmonellosis transmission. A comparison between different regression models was conducted using surveillance data in Adelaide, South Australia. By using notified salmonellosis cases and climatic variables from the Adelaide metropolitan area over the period 1990-2003, four regression methods were examined: standard Poisson regression, autoregressive adjusted Poisson regression, multiple linear regression, and a seasonal autoregressive integrated moving average (SARIMA) model. Notified salmonellosis cases in 2004 were used to test the forecasting ability of the four models. Parameter estimation, goodness-of-fit and forecasting ability of the four regression models were compared. Temperatures occurring 2 weeks prior to cases were positively associated with cases of salmonellosis. Rainfall was also inversely related to the number of cases. The comparison of the goodness-of-fit and forecasting ability suggest that the SARIMA model is better than the other three regression models. Temperature and rainfall may be used as climatic predictors of salmonellosis cases in regions with climatic characteristics similar to those of Adelaide. The SARIMA model could, thus, be adopted to quantify the relationship between climate variations and salmonellosis transmission.
QSAR studies of the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors by multiple linear regression (MLR) and support vector machine (SVM).

PubMed

Qin, Zijian; Wang, Maolin; Yan, Aixia

2017-07-01

In this study, quantitative structure-activity relationship (QSAR) models using various descriptor sets and training/test set selection methods were explored to predict the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors by using a multiple linear regression (MLR) and a support vector machine (SVM) method. 512 HCV NS3/4A protease inhibitors and their IC50 values, which were determined by the same FRET assay, were collected from the reported literature to build a dataset. All the inhibitors were represented with nine selected global and 12 2D property-weighted autocorrelation descriptors calculated with the program CORINA Symphony. The dataset was divided into a training set and a test set by a random method and a Kohonen's self-organizing map (SOM) method. The correlation coefficients (r²) of training sets and test sets were 0.75 and 0.72 for the best MLR model, and 0.87 and 0.85 for the best SVM model, respectively. In addition, a series of sub-dataset models were also developed. The performances of all the best sub-dataset models were better than those of the whole dataset models. We believe that the combination of the best sub- and whole dataset SVM models can be used as reliable lead designing tools for new NS3/4A protease inhibitor scaffolds in a drug discovery pipeline. Copyright © 2017 Elsevier Ltd. All rights reserved.
Priors in Whole-Genome Regression: The Bayesian Alphabet Returns

PubMed Central

Gianola, Daniel

2013-01-01

Whole-genome enabled prediction of complex traits has received enormous attention in animal and plant breeding and is making inroads into human and even Drosophila genetics. The term “Bayesian alphabet” denotes a growing number of letters of the alphabet used to denote various Bayesian linear regressions that differ in the priors adopted, while sharing the same sampling model. We explore the role of the prior distribution in whole-genome regression models for dissecting complex traits in what is now a standard situation with genomic data where the number of unknown parameters (p) typically exceeds sample size (n). Members of the alphabet aim to confront this overparameterization in various manners, but it is shown here that the prior is always influential, unless n ≫ p. This happens because parameters are not likelihood identified, so Bayesian learning is imperfect. Since inferences are not devoid of the influence of the prior, claims about genetic architecture from these methods should be taken with caution. However, all such procedures may deliver reasonable predictions of complex traits, provided that some parameters (“tuning knobs”) are assessed via a properly conducted cross-validation. It is concluded that members of the alphabet have a room in whole-genome prediction of phenotypes, but have somewhat doubtful inferential value, at least when sample size is such that n ≪ p. PMID:23636739
Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics

PubMed Central

2016-01-01

Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075
Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

PubMed

Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

2016-01-01

Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.
Quantile Regression for Analyzing Heterogeneity in Ultra-high Dimension

PubMed Central

Wang, Lan; Wu, Yichao

2012-01-01

Ultra-high dimensional data often display heterogeneity due to either heteroscedastic variance or other forms of non-location-scale covariate effects. To accommodate heterogeneity, we advocate a more general interpretation of sparsity which assumes that only a small number of covariates influence the conditional distribution of the response variable given all candidate covariates; however, the sets of relevant covariates may differ when we consider different segments of the conditional distribution. In this framework, we investigate the methodology and theory of nonconvex penalized quantile regression in ultra-high dimension. The proposed approach has two distinctive features: (1) it enables us to explore the entire conditional distribution of the response variable given the ultra-high dimensional covariates and provides a more realistic picture of the sparsity pattern; (2) it requires substantially weaker conditions compared with alternative methods in the literature; thus, it greatly alleviates the difficulty of model checking in the ultra-high dimension. In theoretic development, it is challenging to deal with both the nonsmooth loss function and the nonconvex penalty function in ultra-high dimensional parameter space. We introduce a novel sufficient optimality condition which relies on a convex differencing representation of the penalized loss function and the subdifferential calculus. Exploring this optimality condition enables us to establish the oracle property for sparse quantile regression in the ultra-high dimension under relaxed conditions. The proposed method greatly enhances existing tools for ultra-high dimensional data analysis. Monte Carlo simulations demonstrate the usefulness of the proposed procedure. The real data example we analyzed demonstrates that the new approach reveals substantially more information compared with alternative methods. PMID:23082036
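
The nonsmooth loss mentioned in this abstract is the quantile check loss. The sketch below shows only that unpenalized loss on a toy sample (hypothetical values) and illustrates that minimizing it over a constant recovers a sample quantile; the nonconvex penalty and the ultra-high dimensional setting studied in the paper are not implemented here.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def best_constant(values, tau):
    """Data value minimizing the summed check loss, i.e. a sample tau-quantile."""
    return min(values, key=lambda c: sum(check_loss(v - c, tau) for v in values))

values = [1, 2, 3, 4, 100]          # toy sample with one large outlier
median = best_constant(values, 0.5)
upper = best_constant(values, 0.9)
print(median, upper)  # 3 100
```

Different values of tau weight positive and negative residuals asymmetrically, which is why different covariates can matter in different parts of the conditional distribution.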
[Evaluation of estimation of prevalence ratio using Bayesian log-binomial regression model].

PubMed

Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L

2017-03-10

To evaluate the estimation of the prevalence ratio (PR) using a Bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence in relation to caregivers' recognition of risk signs of diarrhea in their infants by using a Bayesian log-binomial regression model in OpenBUGS software. The results showed that caregivers' recognition of infants' risk signs of diarrhea was significantly associated with a 13% increase in medical care-seeking. Meanwhile, we compared the point and interval estimates of the PR and the convergence of three models (model 1: not adjusting for covariates; model 2: adjusting for duration of caregivers' education; model 3: adjusting for distance between village and township and child age in months, based on model 2) between the Bayesian log-binomial regression model and the conventional log-binomial regression model. The results showed that all three Bayesian log-binomial regression models converged, and the estimated PRs were 1.130 (95% CI: 1.005-1.265), 1.128 (95% CI: 1.001-1.264) and 1.132 (95% CI: 1.004-1.267), respectively. Conventional log-binomial regression models 1 and 2 converged, and their PRs were 1.130 (95% CI: 1.055-1.206) and 1.126 (95% CI: 1.051-1.203), respectively, but model 3 failed to converge, so the COPY method was used to estimate the PR, which was 1.125 (95% CI: 1.051-1.200). In addition, the point and interval estimates of the PRs from the three Bayesian log-binomial regression models differed slightly from those from the conventional log-binomial regression model, but the two approaches were consistent in estimating the PR. Therefore, the Bayesian log-binomial regression model can effectively estimate the PR with fewer convergence failures and has advantages in application over the conventional log-binomial regression model.
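
In a log-binomial model the log of the outcome probability is linear in the covariates, so the exponentiated coefficient of a binary exposure is a prevalence ratio rather than an odds ratio. The sketch below contrasts the two measures on a 2×2 table; the counts are hypothetical, chosen only so that the PR comes out near the reported 1.13, and are not data from the study.

```python
def prevalence_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of outcome prevalence in the exposed group to that in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of outcome odds in the exposed group to that in the unexposed group."""
    exposed_odds = exposed_cases / (exposed_total - exposed_cases)
    unexposed_odds = unexposed_cases / (unexposed_total - unexposed_cases)
    return exposed_odds / unexposed_odds

# Hypothetical counts: 113 of 200 exposed and 100 of 200 unexposed sought care.
pr = prevalence_ratio(113, 200, 100, 200)
or_value = odds_ratio(113, 200, 100, 200)
print(round(pr, 3), round(or_value, 3))  # 1.13 1.299
```

When the outcome is common, as in this toy table, the odds ratio overstates the prevalence ratio, which is one reason a log-binomial model may be preferred over logistic regression for cross-sectional prevalence data.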
High-frequency fluctuations in Denmark Strait transport

NASA Astrophysics Data System (ADS)

Haine, T. W. N.

2010-07-01

Denmark Strait ocean current transport exhibits quasi-regular fluctuations immediately south of the sill with periods of 2-4 days. The transport variability is similar to the mean transport itself. Using a circulation model we explore prospects to monitor the fluctuations. The model has realistic transport and shows water leaving Denmark Strait in equivalent-barotropic cyclones that are nearly geostrophic and correlate with sea-surface height (SSH). Existing satellite altimeter observations of SSH have adequate space/time sampling to reconstruct the transport fluctuations using a regression developed from the model results, but measurement error overwhelms the signal. From the model results, the pending Surface Water and Ocean Topography (SWOT) wide-swath altimeter appears accurate enough, and with good-enough coverage, to allow the transport fluctuations to be reconstructed. Bottom pressure recorders at the exit of the Denmark Strait can also reproduce the transport variability.
Non-destructive evaluation of chlorophyll content in quinoa and amaranth leaves by simple and multiple regression analysis of RGB image components.

PubMed

Riccardi, M; Mele, G; Pulvento, C; Lavini, A; d'Andria, R; Jacobsen, S-E

2014-06-01

Leaf chlorophyll content provides valuable information about physiological status of plants; it is directly linked to photosynthetic potential and primary production. In vitro assessment by wet chemical extraction is the standard method for leaf chlorophyll determination. This measurement is expensive, laborious, and time consuming. Over the years alternative methods, rapid and non-destructive, have been explored. The aim of this work was to evaluate the applicability of a fast and non-invasive field method for estimation of chlorophyll content in quinoa and amaranth leaves based on RGB components analysis of digital images acquired with a standard SLR camera. Digital images of leaves from different genotypes of quinoa and amaranth were acquired directly in the field. Mean values of each RGB component were evaluated via image analysis software and correlated to leaf chlorophyll provided by standard laboratory procedure. Single and multiple regression models using RGB color components as independent variables have been tested and validated. The performance of the proposed method was compared to that of the widely used non-destructive SPAD method. Sensitivity of the best regression models for different genotypes of quinoa and amaranth was also checked. Color data acquisition of the leaves in the field with a digital camera was quick, more effective, and lower cost than SPAD. The proposed RGB models provided better correlation (highest R²) and prediction (lowest RMSEP) of the true value of foliar chlorophyll content and had a lower amount of noise in the whole range of chlorophyll studied compared with SPAD and other leaf image processing based models when applied to quinoa and amaranth.
Evaluation of weighted regression and sample size in developing a taper model for loblolly pine

Treesearch

Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold

1992-01-01

A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeast. The weighted regression increased model fit marginally, but did not substantially increase model performance. In all cases, the unweighted regression models performed as well as the...

Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

NASA Astrophysics Data System (ADS)

Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

2017-06-01

A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate the odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model that is influenced by the geographical location of the observation site. Parameter estimation is needed to infer population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probability of each category of the number of dengue fever patients.
Analyzing the non-stationary space relationship of a city's degree of vegetation and social economic conditions in Shanghai, China using OLS and GWR models

NASA Astrophysics Data System (ADS)

Wang, Kejing; Zhang, Yuan; An, Youzhi; Jing, Zhuoxin; Wang, Chao

2013-09-01

With rapid urbanization, how does the vegetation environment change in Shanghai, one of the most economically developed metropolises in East China? To answer this question, there is a pressing need to explore the non-stationary relationship between socio-economic conditions and vegetation across Shanghai. In this study, environmental data on vegetation cover, the Normalized Difference Vegetation Index (NDVI) derived from MODIS imagery in 2003, were integrated with socio-economic data to reflect the city's vegetative conditions at the census block group level. To explore regional variations in the relationship between vegetation and socio-economic conditions, Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR) models were applied to characterize mean NDVI against three independent socio-economic variables: urban land use ratio, Gross Domestic Product (GDP) and population density. The results show that considerable spatial variation exists in the relationship for each model. The GWR model fits better and with higher precision than the OLS model at the census block group scale, so it is more suitable for accounting for local effects and geographical variations. This study also indicates that excessive urbanization, together with non-sustainable economic development, has a negative influence on vegetation vigor in some neighborhoods of Shanghai.
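
The difference between OLS and GWR can be sketched as follows: OLS fits one global set of coefficients, while GWR refits a weighted regression at each location, giving nearby observations more weight. The example below assumes a Gaussian distance kernel, a single predictor, and a fixed bandwidth; the coordinates, data, and function name are hypothetical and unrelated to the Shanghai dataset.

```python
import math

def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted least squares fit of y = a + b * x, with Gaussian weights by distance to target."""
    w = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

# Toy map (hypothetical): the relationship is y = x in the west and y = 3x in the east.
coords = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]
west = gwr_local_fit(coords, x, y, target=(0, 0), bandwidth=1.0)
east = gwr_local_fit(coords, x, y, target=(10, 0), bandwidth=1.0)
```

Here the local slopes come out close to 1 in the west and close to 3 in the east, a spatial variation that a single global OLS slope would average away. Real GWR applications use several predictors and select the bandwidth from the data.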
Work-related injuries involving a hand or fingers among union carpenters in Washington State, 1989 to 2008.

PubMed

Lipscomb, Hester J; Schoenfisch, Ashley; Cameron, Wilfrid

2013-07-01

We evaluated work-related injuries involving a hand or fingers and associated costs among a cohort of 24,830 carpenters between 1989 and 2008. Injury rates and rate ratios were calculated by using Poisson regression to explore higher risk on the basis of age, sex, time in the union, predominant work, and calendar time. Negative binomial regression was used to model dollars paid per claim after adjustment for inflation and discounting. Hand injuries accounted for 21.1% of reported injuries and 9.5% of paid lost time injuries. Older carpenters had proportionately more amputations, fractures, and multiple injuries, but their rates of these more severe injuries were not higher. Costs exceeded $21 million, a cost burden of $0.11 per hour worked. Older carpenters' higher proportion of serious injuries in the absence of higher rates likely reflects age-related reporting differences.
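The rate-ratio calculation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest case of a Poisson model with a single two-level covariate and a person-time offset, where the exponentiated coefficient equals the crude ratio of rates. The function name `rate_ratio` and all counts are hypothetical and are not taken from the study, which adjusted for several covariates.

```python
import math

def rate_ratio(events_a, time_a, events_b, time_b, z=1.96):
    """Rate ratio of group A versus group B with a Wald confidence interval.

    For a Poisson model with one binary covariate and a log person-time
    offset, exp(coefficient) equals this crude ratio of rates.
    """
    rate_a = events_a / time_a
    rate_b = events_b / time_b
    ratio = rate_a / rate_b
    # Standard error of log(ratio) under the Poisson assumption.
    se_log = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper

# Hypothetical injury counts and hours worked, for illustration only.
ratio, lower, upper = rate_ratio(events_a=120, time_a=1_000_000,
                                 events_b=80, time_b=1_000_000)
print(f"rate ratio = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With more than one covariate the ratios have to come from a fitted Poisson regression rather than from this closed form.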
Parental Vaccine Acceptance: A Logistic Regression Model Using Previsit Decisions.

PubMed

Lee, Sara; Riley-Behringer, Maureen; Rose, Jeanmarie C; Meropol, Sharon B; Lazebnik, Rina

2017-07-01

This study explores how parents' intentions regarding vaccination prior to their children's visit were associated with actual vaccine acceptance. A convenience sample of parents accompanying 6-week-old to 17-year-old children completed a written survey at 2 pediatric practices. Using hierarchical logistic regression, for hospital-based participants (n = 216), vaccine refusal history (P < .01) and vaccine decision made before the visit (P < .05) explained 87% of vaccine refusals. In community-based participants (n = 100), vaccine refusal history (P < .01) explained 81% of refusals. Over 1 in 5 parents changed their minds about vaccination during the visit. Thirty parents who were previous vaccine refusers accepted current vaccines, and 37 who had intended not to vaccinate chose vaccination. Twenty-nine parents without a refusal history declined vaccines, and 32 who did not intend to refuse before the visit declined vaccination. Future research should identify key factors to nudge parent decision making in favor of vaccination.
Modelling Fourier regression for time series data - a case study: modelling inflation in foods sector in Indonesia

NASA Astrophysics Data System (ADS)

Prahutama, Alan; Suparti; Wahyu Utami, Tiani

2018-03-01

Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression imposes strict assumptions, whereas a nonparametric regression model does not require an assumed model form. Time series data are observations of a variable recorded over time, so to model a time series by regression, the response and predictor variables must be determined first. The response variable is the value at time t (yt), while the predictor variables are the significant lags. In nonparametric regression modeling, one developing approach is to use Fourier series. One advantage of nonparametric regression using Fourier series is its ability to handle data with a trigonometric pattern. Modeling with Fourier series requires the parameter K. The number K can be determined by the Generalized Cross Validation method. Modeling inflation for the transportation, communication and financial services sector using Fourier series yields an optimal K of 120 parameters with an R-square of 99%, whereas multiple linear regression yields an R-square of 90%.
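The choice of K by Generalized Cross Validation described in the abstract above can be sketched in a simplified setting. This is an assumption-laden illustration, not the authors' procedure: it regresses on an equally spaced index covering one full period instead of on lagged values, it omits any linear trend term, and it uses the common form GCV(K) = (RSS/n) / (1 - p/n)^2 with p = 2K + 1 parameters. The names `fit_fourier`, `gcv` and `choose_k` and the simulated series are hypothetical.

```python
import math
import random

def fit_fourier(y, K):
    """Least-squares Fourier fit on an equally spaced grid over one period.

    With t_i = 2*pi*i/n and K < n/2 the basis columns are orthogonal, so the
    least-squares coefficients reduce to simple projections.
    """
    n = len(y)
    t = [2.0 * math.pi * i / n for i in range(n)]
    a0 = sum(y) / n
    a = [2.0 / n * sum(yi * math.cos(k * ti) for yi, ti in zip(y, t))
         for k in range(1, K + 1)]
    b = [2.0 / n * sum(yi * math.sin(k * ti) for yi, ti in zip(y, t))
         for k in range(1, K + 1)]
    fitted = [a0 + sum(a[k - 1] * math.cos(k * ti) + b[k - 1] * math.sin(k * ti)
                       for k in range(1, K + 1))
              for ti in t]
    return a0, a, b, fitted

def gcv(y, fitted, n_params):
    """Generalized cross-validation score for a linear smoother."""
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return (rss / n) / (1.0 - n_params / n) ** 2

def choose_k(y, k_max):
    """Return the K in 0..k_max with the smallest GCV score, plus all scores."""
    scores = {}
    for K in range(k_max + 1):
        fitted = fit_fourier(y, K)[3]
        scores[K] = gcv(y, fitted, 2 * K + 1)
    return min(scores, key=scores.get), scores

# Simulated series (hypothetical): two harmonics plus small Gaussian noise.
rng = random.Random(0)
n = 48
y = [2.0 + 3.0 * math.cos(2 * math.pi * i / n)
     - 1.5 * math.sin(2 * 2 * math.pi * i / n) + rng.gauss(0.0, 0.1)
     for i in range(n)]
best_k, scores = choose_k(y, k_max=8)
print("K selected by GCV:", best_k)
```

The GCV denominator penalizes each extra pair of sine and cosine terms, so a larger K is selected only when it reduces the residual sum of squares enough to offset the penalty.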
Within-person variation in security of attachment: a self-determination theory perspective on attachment, need fulfillment, and well-being.

PubMed

La Guardia, J G; Ryan, R M; Couchman, C E; Deci, E L

2000-09-01

Attachment research has traditionally focused on individual differences in global patterns of attachment to important others. The current research instead focuses primarily on within-person variability in attachments across relational partners. It was predicted that within-person variability would be substantial, even among primary attachment figures of mother, father, romantic partner, and best friend. The prediction was supported in three studies. Furthermore, in line with self-determination theory, multilevel modeling and regression analyses showed that, at the relationship level, individuals' experience of fulfillment of the basic needs for autonomy, competence, and relatedness positively predicted overall attachment security, model of self, and model of other. Relations of both attachment and need satisfaction to well-being were also explored.
Novel Methods in Disease Biogeography: A Case Study with Heterosporosis

PubMed Central

Escobar, Luis E.; Qiao, Huijie; Lee, Christine; Phelps, Nicholas B. D.

2017-01-01

Disease biogeography is currently a promising field to complement epidemiology, and ecological niche modeling theory and methods are a key component. Therefore, applying the concepts and tools from ecological niche modeling to disease biogeography and epidemiology will provide biologically sound and analytically robust descriptive and predictive analyses of disease distributions. As a case study, we explored the ecologically important fish disease Heterosporosis, a relatively poorly understood disease caused by the intracellular microsporidian parasite Heterosporis sutherlandae. We explored two novel ecological niche modeling methods, the minimum-volume ellipsoid (MVE) and the Marble algorithm, which were used to reconstruct the fundamental and the realized ecological niche of H. sutherlandae, respectively. Additionally, we assessed how the management of occurrence reports can impact the output of the models. Ecological niche models were able to reconstruct a proxy of the fundamental and realized niche for this aquatic parasite, identifying specific areas suitable for Heterosporosis. We found that the conceptual and methodological advances in ecological niche modeling provide accessible tools to update the current practices of spatial epidemiology. However, careful data curation and a detailed understanding of the algorithm employed are critical for a clear definition of the assumptions implicit in the modeling process and to ensure biologically sound forecasts. In this paper, we show how sensitive MVE is to the input data, while Marble algorithm may provide detailed forecasts with a minimum of parameters. We showed that exploring algorithms of different natures such as environmental clusters, climatic envelopes, and logistic regressions (e.g., Marble, MVE, and Maxent) provide different scenarios of potential distribution. Thus, no single algorithm should be used for disease mapping. 
Instead, different algorithms should be employed for a more informed and complete understanding of the pathogen or parasite in question. PMID:28770215
Regression models for predicting peak and continuous three-dimensional spinal loads during symmetric and asymmetric lifting tasks.

PubMed

Fathallah, F A; Marras, W S; Parnianpour, M

1999-09-01

Most biomechanical assessments of spinal loading during industrial work have focused on estimating peak spinal compressive forces under static and sagittally symmetric conditions. The main objective of this study was to explore the potential of feasibly predicting three-dimensional (3D) spinal loading in industry from various combinations of trunk kinematics, kinetics, and subject-load characteristics. The study used spinal loading, predicted by a validated electromyography-assisted model, from 11 male participants who performed a series of symmetric and asymmetric lifts. Three classes of models were developed: (a) models using workplace, subject, and trunk motion parameters as independent variables (kinematic models); (b) models using workplace, subject, and measured moments variables (kinetic models); and (c) models incorporating workplace, subject, trunk motion, and measured moments variables (combined models). The results showed that peak 3D spinal loading during symmetric and asymmetric lifting were predicted equally well using all three types of regression models. Continuous 3D loading was predicted best using the combined models. When the use of such models is infeasible, the kinematic models can provide adequate predictions. Finally, lateral shear forces (peak and continuous) were consistently underestimated using all three types of models. The study demonstrated the feasibility of predicting 3D loads on the spine under specific symmetric and asymmetric lifting tasks without the need for collecting EMG information. However, further validation and development of the models should be conducted to assess and extend their applicability to lifting conditions other than those presented in this study. Actual or potential applications of this research include exposure assessment in epidemiological studies, ergonomic intervention, and laboratory task assessment.
Temporal and long-term trend analysis of class C notifiable diseases in China from 2009 to 2014

PubMed Central

Zhang, Xingyu; Hou, Fengsu; Qiao, Zhijiao; Li, Xiaosong; Zhou, Lijun; Liu, Yuanyuan; Zhang, Tao

2016-01-01

Objectives: Time series models are effective tools for disease forecasting. This study aims to explore the time series behaviour of 11 notifiable diseases in China and to predict their incidence through effective models. Settings and participants: The Chinese Ministry of Health started to publish class C notifiable diseases in 2009. The monthly reported case time series of 11 infectious diseases from the surveillance system between 2009 and 2014 was collected. Methods: We performed a descriptive and a time series study using the surveillance data. Decomposition methods were used to explore (1) their seasonality expressed in the form of seasonal indices and (2) their long-term trend in the form of a linear regression model. Autoregressive integrated moving average (ARIMA) models have been established for each disease. Results: The number of cases and deaths caused by hand, foot and mouth disease ranks number 1 among the detected diseases. It occurred most often in May and July and increased, on average, by 0.14126/100 000 per month. The remaining incidence models show good fit except the influenza and hydatid disease models. Both the hydatid disease and influenza series become white noise after differencing, so no available ARIMA model can be fitted for these two diseases. Conclusion: Time series analysis of effective surveillance time series is useful for better understanding the occurrence of the 11 types of infectious disease. PMID:27797981
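The decomposition into a linear long-term trend and seasonal indices described in the abstract above can be sketched as follows. The sketch assumes a multiplicative model (observed value = trend × seasonal index) and a simple ratio-to-trend method, which may differ from the decomposition the authors used. The function names and the monthly series are hypothetical.

```python
def linear_trend(y):
    """Ordinary least-squares line y = intercept + slope * t for t = 0..n-1."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (yt - y_mean) for t, yt in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

def seasonal_indices(y, period=12):
    """Average ratio of observed value to fitted trend for each season.

    The indices are rescaled so that their mean is exactly 1.
    """
    intercept, slope = linear_trend(y)
    ratios = [[] for _ in range(period)]
    for t, yt in enumerate(y):
        ratios[t % period].append(yt / (intercept + slope * t))
    raw = [sum(r) / len(r) for r in ratios]
    mean_raw = sum(raw) / period
    return [r / mean_raw for r in raw]

# Hypothetical six-year monthly incidence series: rising trend, peak in May
# (position 4) and trough in January (position 0).
season = [0.5, 1.0, 1.0, 1.0, 1.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
series = [(10.0 + 0.14 * t) * season[t % 12] for t in range(72)]

intercept, slope = linear_trend(series)
indices = seasonal_indices(series)
print("monthly increase:", round(slope, 3))
print("peak month position:", indices.index(max(indices)))
```

An index above 1 marks a month with incidence above the trend level, and the slope of the fitted line is the average change per month.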
Design an optimum safety policy for personnel safety management - A system dynamic approach

NASA Astrophysics Data System (ADS)

Balaji, P.

2014-10-01

Personnel safety management (PSM) ensures that employees' work conditions are healthy and safe through various proactive and reactive approaches. Nowadays it is a complex phenomenon because of the increasingly dynamic nature of organisations, which results in an increase in accidents. An important part of accident prevention is to understand the existing system properly and make safety strategies for that system. System dynamics modelling appears to be an appropriate methodology to explore and make strategy for PSM. Many system dynamics models of industrial systems have been built entirely for specific host firms. This thesis illustrates an alternative approach. A generic system dynamics model of personnel safety management was developed and tested in a host firm. The model underwent various structural, behavioural and policy tests. The utility and effectiveness of the model were further explored through modelling a safety scenario. In order to create an effective safety policy under resource constraints, DOE (design of experiments) was used. DOE uses classic designs, namely fractional factorials and central composite designs. It was used to build a second-order regression equation, which serves as an objective function. That function was optimized under a budget constraint, and the optimum values were used for the safety policy that showed the greatest improvement in overall PSM. The outcome of this research indicates that the personnel safety management model can act as an instruction tool to improve understanding of safety management and also as an aid to policy making.
Air pollution and environmental justice in the Great Lakes region

NASA Astrophysics Data System (ADS)

Comer, Bryan

While it is true that air quality has steadily improved in the Great Lakes region, air pollution remains at unhealthy concentrations in many areas. Research suggests that vulnerable and susceptible groups in society -- e.g., minorities, the poor, children, and poorly educated -- are often disproportionately impacted by exposure to environmental hazards, including air pollution. This dissertation explores the relationship between exposure to ambient air pollution (interpolated concentrations of fine particulate matter, PM2.5) and sociodemographic factors (race, housing value, housing status, education, age, and population density) at the Census block-group level in the Great Lakes region of the United States. A relatively novel approach to quantitative environmental justice analysis, geographically weighted regression (GWR), is compared with a simplified approach: ordinary least squares (OLS) regression. While OLS creates one global model to describe the relationship between air pollution exposure and sociodemographic factors, GWR creates many local models (one at each Census block group) that account for local variations in this relationship by allowing the value of regression coefficients to vary over space, overcoming OLS's assumption of homogeneity and spatial independence. Results suggest that GWR can elucidate patterns of potential environmental injustices that OLS models may miss. In fact, GWR results show that the relationship between exposure to ambient air pollution and sociodemographic characteristics is non-stationary and can vary geographically and temporally throughout the Great Lakes region. This suggests that regulators may need to address environmental justice issues at the neighborhood level, while understanding that the severity of environmental injustices can change throughout the year.
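The contrast drawn in the abstract above between one global OLS model and many local GWR models can be sketched with a single predictor. This is a minimal illustration under stated assumptions: a Gaussian distance kernel with a fixed bandwidth, one predictor plus an intercept, and hypothetical site coordinates and values. Bandwidth selection, multiple predictors and diagnostics are omitted, and the function names are invented for the example.

```python
import math

def weighted_fit(x, y, w):
    """Weighted least-squares line; returns (intercept, slope)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def ols(x, y):
    """Global model: every observation has the same weight."""
    return weighted_fit(x, y, [1.0] * len(x))

def gwr(coords, x, y, bandwidth):
    """Local models: one weighted fit per site, weights decaying with distance."""
    fits = []
    for u0, v0 in coords:
        w = [math.exp(-0.5 * ((u - u0) ** 2 + (v - v0) ** 2) / bandwidth ** 2)
             for u, v in coords]
        fits.append(weighted_fit(x, y, w))
    return fits

# Hypothetical data: 20 sites on a line; the true slope is 1 in the west
# (sites 0-9) and 3 in the east (sites 10-19).
coords = [(float(i), 0.0) for i in range(20)]
x = [1.0 + (i * 3) % 5 for i in range(20)]
y = [(1.0 if i < 10 else 3.0) * xi for i, xi in enumerate(x)]

global_intercept, global_slope = ols(x, y)
local_fits = gwr(coords, x, y, bandwidth=2.0)
print("global slope:", round(global_slope, 2))
print("local slope at west end:", round(local_fits[0][1], 2))
print("local slope at east end:", round(local_fits[-1][1], 2))
```

The single global slope averages over both halves of the study area, while the local fits recover a different slope at each end. That spatial variation in coefficients is the non-stationarity GWR is designed to expose.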
Long Term Association of Tropospheric Trace gases over Pakistan by exploiting satellite observations and development of Econometric Regression based Model

NASA Astrophysics Data System (ADS)

Zeb, Naila; Fahim Khokhar, Muhammad; Khan, Saud Ahmed; Noreen, Asma; Murtaza, Rabbia

2017-04-01

Air pollution is expected to be a key environmental issue for Pakistan, which is ranked among the most polluted countries in the region. Ongoing rapid economic growth without adequate control measures is worsening air quality over time. The study aims to monitor long-term atmospheric composition and the association of trace gases over Pakistan. Tropospheric concentrations of CO, TOC, NO2 and HCHO derived from multiple satellite instruments are used for the years 2005 to 2014. The study will provide the first database of tropospheric trace gases over Pakistan. Spatio-temporal assessment identified hotspots and possible sources of trace gases over Pakistan. High concentrations of trace gases are mainly observed over the Punjab region, which may be attributed to its metropolitan importance. It is the major agricultural, industrialized and urbanized (nearly 60% of Pakistan's population) sector of the country. The expected sources are agricultural fires, biomass/fossil fuel burning for heating purposes, urbanization, industrialization and meteorological variations. Seasonal variability is examined to explore seasonal patterns over the decade. Well-defined seasonal cycles of trace gases are observed over the whole study period. The observed seasonal patterns also showed some noteworthy associations among trace gases, which are further explored by different statistical tests. The seasonal Mann-Kendall test is applied to test the significance of the trend in each series, whereas correlation is used to measure the strength of association among trace gases. Strong correlation is observed among trace gases, especially between CO and TOC. The partial Mann-Kendall test is used to identify the impact of each covariate on the long-term trend of CO and TOC by partialling out each correlating trace gas (covariate). It is observed that TOC, NO2 and HCHO have a significant impact on the long-term trend of CO, whereas TOC critically depends on NO2 concentrations for its long-term increase over the region. Furthermore, to explore the causal relation, regression analysis is employed to estimate a model for CO and TOC. This model numerically estimated the long-term association of trace gases over the region.
Screen and clean: a tool for identifying interactions in genome-wide association studies.

PubMed

Wu, Jing; Devlin, Bernie; Ringquist, Steven; Trucco, Massimo; Roeder, Kathryn

2010-04-01

Epistasis could be an important source of risk for disease. How interacting loci might be discovered is an open question for genome-wide association studies (GWAS). Most researchers limit their statistical analyses to testing individual pairwise interactions (i.e., marginal tests for association). A more effective means of identifying important predictors is to fit models that include many predictors simultaneously (i.e., higher-dimensional models). We explore a procedure called screen and clean (SC) for identifying liability loci, including interactions, by using the lasso procedure, which is a model selection tool for high-dimensional regression. We approach the problem by using a varying dictionary consisting of terms to include in the model. In the first step the lasso dictionary includes only main effects. The most promising single-nucleotide polymorphisms (SNPs) are identified using a screening procedure. Next the lasso dictionary is adjusted to include these main effects and the corresponding interaction terms. Again, promising terms are identified using lasso screening. Then significant terms are identified through the cleaning process. Implementation of SC for GWAS requires algorithms to explore the complex model space induced by the many SNPs genotyped and their interactions. We propose and explore a set of algorithms and find that SC successfully controls Type I error while yielding good power to identify risk loci and their interactions. When the method is applied to data obtained from the Wellcome Trust Case Control Consortium study of Type 1 Diabetes it uncovers evidence supporting interaction within the HLA class II region as well as within Chromosome 12q24.
Characterization and spatial modeling of urban sprawl in the Wuhan Metropolitan Area, China

NASA Astrophysics Data System (ADS)

Zeng, Chen; Liu, Yaolin; Stein, Alfred; Jiao, Limin

2015-02-01

Urban sprawl has led to environmental problems and large losses of arable land in China. In this study, we monitor and model urban sprawl by means of a combination of remote sensing, geographical information system and spatial statistics. We use time-series data to explore the potential socio-economic driving forces behind urban sprawl, and spatial models in different scenarios to explore the spatio-temporal interactions. The methodology is applied to the city of Wuhan, China, for the period from 1990 to 2013. The results reveal that the built-up land has expanded and has dispersed in urban clusters. Population growth, and economic and transportation development are still the main causes of urban sprawl; however, when they have developed to certain levels, the area affected by construction in urban areas (Jian Cheng Qu (JCQ)) and the area of cultivated land (ACL) tend to be stable. Spatial regression models are shown to be superior to the traditional models. The interaction among districts with the same administrative status is stronger than if one of those neighbors is in the city center and the other in the suburban area. The expansion of urban built-up land is driven by the socio-economic development at the same period, and greatly influenced by its spatio-temporal neighbors. We conclude that the integration of remote sensing, a geographical information system, and spatial statistics offers an excellent opportunity to explore the spatio-temporal variation and interactions among the districts in the sprawling metropolitan areas. Relevant regulations to control the urban sprawl process are suggested accordingly.
Evaluation of a Pharmacokinetic-Pharmacodynamic Model for Hypouricemic Effects of Febuxostat Using Datasets Obtained from Real-world Patients.

PubMed

Hirai, Toshinori; Itoh, Toshimasa; Kimura, Toshimi; Echizen, Hirotoshi

2018-06-06

Febuxostat is an active xanthine oxidase (XO) inhibitor that is widely used in the treatment of hyperuricemia. We aimed to evaluate the predictive performance of a pharmacokinetic-pharmacodynamic (PK-PD) model for hypouricemic effects of febuxostat. Previously, we formulated a PK-PD model for predicting hypouricemic effects of febuxostat as a function of baseline serum urate levels, body weight, renal function, and drug dose using datasets reported in preapproval studies (Hirai T et al., Biol Pharm Bull 2016; 39: 1013-21). Using an updated model with sensitivity analysis, we examined the predictive performance of the PK-PD model using datasets obtained from the medical records of patients who received febuxostat from March 2011 to December 2015 at Tokyo Women's Medical University Hospital. Multivariate regression analysis was performed to explore clinical variables to improve the predictive performance of the model. A total of 1,199 serum urate data were retrieved from 168 patients (age: 60.5 ± 17.7 years, 71.4% males) who received febuxostat as hyperuricemia treatment. There was a significant correlation (r = 0.68, p < 0.01) between serum urate levels observed and those predicted by the modified PK-PD model. A multivariate regression analysis revealed that the predictive performance of the model may be improved further by considering comorbidities, such as diabetes mellitus, estimated glomerular filtration rate (eGFR), and co-administration of loop diuretics (r = 0.77, p < 0.01). The PK-PD model may be useful for predicting individualized maintenance doses of febuxostat in real-world patients. This article is protected by copyright. All rights reserved.
The Plumbing of Land Surface Models: Is Poor Performance a Result of Methodology or Data Quality?

NASA Technical Reports Server (NTRS)

Haughton, Ned; Abramowitz, Gab; Pitman, Andy J.; Or, Dani; Best, Martin J.; Johnson, Helen R.; Balsamo, Gianpaolo; Boone, Aaron; Cuntz, Matthias; Decharme, Bertrand;

2016-01-01

The PALS Land sUrface Model Benchmarking Evaluation pRoject (PLUMBER) illustrated the value of prescribing a priori performance targets in model intercomparisons. It showed that the performance of turbulent energy flux predictions from different land surface models, at a broad range of flux tower sites using common evaluation metrics, was on average worse than relatively simple empirical models. For sensible heat fluxes, all land surface models were outperformed by a linear regression against downward shortwave radiation. For latent heat flux, all land surface models were outperformed by a regression against downward shortwave, surface air temperature and relative humidity. These results are explored here in greater detail and possible causes are investigated. We examine whether particular metrics or sites unduly influence the collated results, whether results change according to time-scale aggregation and whether a lack of energy conservation in flux tower data gives the empirical models an unfair advantage in the intercomparison. We demonstrate that energy conservation in the observational data is not responsible for these results. We also show that the partitioning between sensible and latent heat fluxes in LSMs, rather than the calculation of available energy, is the cause of the original findings. Finally, we present evidence suggesting that the nature of this partitioning problem is likely shared among all contributing LSMs. While we do not find a single candidate explanation for why land surface models perform poorly relative to empirical benchmarks in PLUMBER, we do exclude multiple possible explanations and provide guidance on where future research should focus.

Applying Kaplan-Meier to Item Response Data

ERIC Educational Resources Information Center

McNeish, Daniel

2018-01-01

Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…
Exploring the Influence of Nursing Work Environment and Patient Safety Culture on Missed Nursing Care in Korea.

PubMed

Kim, Kyoung-Ja; Yoo, Moon Sook; Seo, Eun Ji

2018-04-20

This study aimed to explore the influence of nurse work environment and patient safety culture in hospital on instances of missed nursing care in South Korea. A cross-sectional design was used, in which a structured questionnaire was administered to 186 nurses working at a tertiary university hospital. Data were analyzed using descriptive statistics, t-test or ANOVA, Pearson correlation and multiple regression analysis. Missed nursing care was found to be correlated with clinical career, nursing work environment and patient safety culture. The regression model explained approximately 30.3 % of missed nursing care. Meanwhile, staffing and resource adequacy (β = -.31, p = .001), nurse manager ability, leadership and support of nurses (β = -.26, p = .004), clinical career (β = -.21, p = .004), and perception on patient safety culture within unit (β = -.19, p = .041) were determined to be influencing factors on missed nursing care. This study has significance as it suggested that missed nursing care is affected by work environment factors within unit. This means that missed nursing care is a unit outcome affected by nurse work environment factors and patient safety culture. Therefore, missed nursing care can be managed through the implementation of interventions that promote a positive nursing work environment and patient safety culture. Copyright © 2018. Published by Elsevier B.V.
Demand-supply dynamics in tourism systems: A spatio-temporal GIS analysis. The Alberta ski industry case study

NASA Astrophysics Data System (ADS)

Bertazzon, Stefania

The present research focuses on the interaction of supply and demand of down-hill ski tourism in the province of Alberta. The main hypothesis is that the demand for skiing depends on the socio-economic and demographic characteristics of the population living in the province and outside it. A second, consequent hypothesis is that the development of ski resorts (supply) is a response to the demand for skiing. From the latter derives the hypothesis of a dynamic interaction between supply (ski resorts) and demand (skiers). Such interaction occurs in space, within a range determined by physical distance and the means available to overcome it. The above hypotheses implicitly define interactions that take place in space and evolve over time. The hypotheses are tested by temporal, spatial, and spatio-temporal regression models, using the best available data and the latest commercially available software. The main purpose of this research is to explore analytical techniques to model spatial, temporal, and spatio-temporal dynamics in the context of regional science. The completion of the present research has produced more significant contributions than was originally expected. Many of the unexpected contributions resulted from theoretical and applied needs arising from the application of spatial regression models. Spatial regression models are a new and largely under-applied technique. The models are fairly complex and a considerable amount of preparatory work is needed, prior to their specification and estimation. Most of this work is specific to the field of application. The originality of the solutions devised is increased by the lack of applications in the field of tourism. The scarcity of applications in other fields adds to their value for other applications. The estimation of spatio-temporal models has been only partially attained in the present research. This apparent limitation is due to the novelty and complexity of the analytical methods applied. 
This opens new directions for further work in the field of spatial analysis, in conjunction with the development of specific software.
In rheumatoid arthritis, country of residence has an important influence on fatigue: results from the multinational COMORA study.

PubMed

Hifinger, Monika; Putrik, Polina; Ramiro, Sofia; Keszei, András P; Hmamouchi, Ihsane; Dougados, Maxime; Gossec, Laure; Boonen, Annelies

2016-04-01

To investigate the relationship between country of residence and fatigue in RA, and to explore which country characteristics are related to fatigue. Data from the multinational COMORA study were analysed. Contribution of country of residence to level of fatigue [0-10 on visual analogue scale (VAS)] and presence of severe fatigue (VAS ⩾ 5) was explored in multivariable linear or logistic regression models including first socio-demographics and objective disease outcomes (M1), and then also subjective outcomes (M2). Next, country of residence was replaced by country characteristics: gross domestic product (GDP), human development index (HDI), latitude (as indicator of climate), language and income inequality index (gini-index). Model fit (R²) for linear models was compared. A total of 3920 patients from 17 countries were included, mean age 56 years (s.d. 13), 82% females. Mean fatigue across countries ranged from 1.86 (s.d. 2.46) to 4.99 (s.d. 2.64) and proportion of severe fatigue from 14% (Venezuela) to 65% (Egypt). Objective disease outcomes did not explain much of the variation in fatigue (R² = 0.12), while subjective outcomes had a strong negative impact and partly explained the variation in fatigue (R² = 0.27). Country of residence had a significant additional effect (increasing model fit to R² = 0.20 and R² = 0.36, respectively). Remarkably, higher GDP and better HDI were associated with higher fatigue, and explained a large part of the country effect. Logistic regression confirmed the limited contribution of objective outcomes and the relevant contribution of country of residence. Country of residence has an important influence on fatigue. Paradoxically, patients from wealthier countries had higher fatigue. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

The Association of Age, Insomnia, and Self-Efficacy with Continuous Positive Airway Pressure Adherence in Black, White, and Hispanic US Veterans

PubMed Central

Wallace, Douglas M.; Shafazand, Shirin; Aloia, Mark S.; Wohlgemuth, William K.

2013-01-01

Study Objectives: Studies of continuous positive airway pressure (CPAP) adherence in multi-ethnic samples are lacking. This study explores previously described factors associated with therapeutic CPAP use in South Florida veterans with obstructive sleep apnea-hypopnea syndrome (OSAHS). Methods: We performed a retrospective, cross-sectional analysis of CPAP adherence comparing white, black, and Hispanic veterans returning to the Miami VA sleep clinic over a 4-month period. Participants had CPAP use download and completed questionnaires on demographics, sleepiness, insomnia, and social cognitive measures related to adherence. Linear regression modeling was used to explore the impact of measured variables and potential interactions with race-ethnicity on mean daily CPAP use. Results: Participants (N = 248) were 94% male with mean age of 59 ± 11 years and included 95 blacks (38%), 91 whites (37%), and 62 Hispanic (25%) veterans. Blacks had less mean daily CPAP use than whites (-1.6 h, p < 0.001) and Hispanics (-1.3 h, p < 0.01). Blacks reported worse sleep onset insomnia symptoms compared to whites. In the final multivariable regression model, black race-ethnicity (p < 0.01), insomnia symptoms (p < 0.001), and self-efficacy (p < 0.001) were significantly associated with mean daily CPAP use. In addition, the black race by age interaction term showed a trend towards significance (p = 0.10). Conclusions: In agreement with recent studies, we found that mean daily CPAP use in blacks was 1 hour less than whites after adjusting for covariates. No CPAP adherence differences were noted between whites and Hispanics. Further investigations exploring sociocultural barriers to regular CPAP use in minority individuals with OSAHS are needed. Citation: Wallace DM; Shafazand S; Aloia MS; Wohlgemuth WK. The association of age, insomnia, and self-efficacy with continuous positive airway pressure adherence in black, white, and Hispanic US veterans. J Clin Sleep Med 2013;9(9):885-895. 
PMID:23997701
Mobile phone use during driving: Effects on speed and effectiveness of driver compensatory behaviour.

PubMed

Choudhary, Pushpa; Velaga, Nagendra R

2017-09-01

This study analysed and modelled the effects of conversation and texting (each with two difficulty levels) on driving performance of Indian drivers in terms of their mean speed and accident avoiding abilities; and further explored the relationship between speed reduction strategy of the drivers and their corresponding accident frequency. 100 drivers of three different age groups (young, mid-age and old-age) participated in the simulator study. Two sudden events of Indian context: unexpected crossing of pedestrians and joining of parked vehicles from road side, were simulated for estimating the accident probabilities. Generalized linear mixed models approach was used for developing linear regression models for mean speed and binary logistic regression models for accident probability. The results of the models showed that the drivers significantly compensated the increased workload by reducing their mean speed by 2.62m/s and 5.29m/s in the presence of conversation and texting tasks respectively. The logistic models for accident probabilities showed that the accident probabilities increased by 3 and 4 times respectively when the drivers were conversing or texting on a phone during driving. Further, the relationship between the speed reduction patterns and their corresponding accident frequencies showed that all the drivers compensated differently; but, among all the drivers, only few drivers, who compensated by reducing the speed by 30% or more, were able to fully offset the increased accident risk associated with the phone use. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bayesian hierarchical modelling of continuous non‐negative longitudinal data with a spike at zero: An application to a study of birds visiting gardens in winter

PubMed Central

Buckland, Stephen T.; King, Ruth; Toms, Mike P.

2015-01-01

The development of methods for dealing with continuous data with a spike at zero has lagged behind those for overdispersed or zero‐inflated count data. We consider longitudinal ecological data corresponding to an annual average of 26 weekly maximum counts of birds, and are hence effectively continuous, bounded below by zero but also with a discrete mass at zero. We develop a Bayesian hierarchical Tweedie regression model that can directly accommodate the excess number of zeros common to this type of data, whilst accounting for both spatial and temporal correlation. Implementation of the model is conducted in a Markov chain Monte Carlo (MCMC) framework, using reversible jump MCMC to explore uncertainty across both parameter and model spaces. This regression modelling framework is very flexible and removes the need to make strong assumptions about mean‐variance relationships a priori. It can also directly account for the spike at zero, whilst being easily applicable to other types of data and other model formulations. Whilst a correlative study such as this cannot prove causation, our results suggest that an increase in an avian predator may have led to an overall decrease in the number of one of its prey species visiting garden feeding stations in the United Kingdom. This may reflect a change in behaviour of house sparrows to avoid feeding stations frequented by sparrowhawks, or a reduction in house sparrow population size as a result of sparrowhawk increase. PMID:25737026
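As a rough illustration of why a Tweedie model can accommodate a spike at zero: for a power parameter 1 < p < 2, the Tweedie distribution is a compound Poisson-gamma distribution with variance φμ^p, and it places probability exp(-λ) on exactly zero, where λ = μ^(2-p) / (φ(2-p)). The sketch below uses this standard parameterization. The function names and example values are illustrative assumptions and are not taken from the study, whose hierarchical model and priors are not described in the abstract.

```python
import math


def tweedie_zero_probability(mu, phi, p):
    """P(Y = 0) for a compound Poisson-gamma Tweedie variable.

    mu: mean (> 0); phi: dispersion (> 0); p: power parameter with 1 < p < 2.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("a point mass at zero requires 1 < p < 2")
    if mu <= 0 or phi <= 0:
        raise ValueError("mu and phi must be positive")
    # Poisson rate of the latent number of gamma-distributed contributions.
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))
    return math.exp(-lam)


def tweedie_variance(mu, phi, p):
    """Tweedie mean-variance relationship: Var(Y) = phi * mu**p."""
    return phi * mu ** p


# Hypothetical values: a smaller mean puts more mass on exactly zero.
for mean_count in (0.5, 1.0, 2.0):
    print(mean_count, tweedie_zero_probability(mean_count, phi=1.0, p=1.5))
```

Treating p as unknown, rather than fixing it in advance, is one way to avoid committing to a mean-variance relationship a priori.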
Negative correlation between altitudes and oxygen isotope ratios of seeds: exploring its applicability to assess vertical seed dispersal.

PubMed

Naoe, Shoji; Tayasu, Ichiro; Masaki, Takashi; Koike, Shinsuke

2016-10-01

Vertical seed dispersal, which plays a key role in plant escape and/or expansion under climate change, was recently evaluated for the first time using negative correlation between altitudes and oxygen isotope ratio of seeds. Although this method is innovative, its applicability to other plants is unknown. To explore the applicability of the method, we regressed altitudes on δ¹⁸O of seeds of five woody species constituting three families in temperate forests in central Japan. Because climatic factors, including temperature and precipitation that influence δ¹⁸O of plant materials, demonstrate intensive seasonal fluctuation in the temperate zone, we also evaluated the effect of fruiting season of each species on δ¹⁸O of seeds using generalized linear mixed models (GLMM). Negative correlation between altitudes and δ¹⁸O of seeds was found in four of five species tested. The slope of regression lines tended to be lower in late-fruiting species. The GLMM analysis revealed that altitudes and date of fruiting peak negatively affected δ¹⁸O of seeds. These results indicate that the estimation of vertical seed dispersal using δ¹⁸O of seeds can be applicable for various species, not just confined to specific taxa, by identifying the altitudes of plants that produced seeds. The results also suggest that the regression line between altitudes and δ¹⁸O of seeds is rather species specific and that vertical seed dispersal in late-fruiting species is estimated at a low resolution due to their small regression slopes. A future study on the identification of environmental factors and plant traits that cause a difference in δ¹⁸O of seeds, combined with an improvement of analysis, will lead to effective evaluation of vertical seed dispersal in various species and thereby promote our understanding about the mechanism and ecological functions of vertical seed dispersal.
Is the Critical Shields Stress for Incipient Sediment Motion Dependent on Bed Slope in Natural Channels? No.

NASA Astrophysics Data System (ADS)

Phillips, C. B.; Jerolmack, D. J.

2017-12-01

Understanding when coarse sediment begins to move in a river is essential for linking rivers to the evolution of mountainous landscapes. Unfortunately, the threshold of surface particle motion is notoriously difficult to measure in the field. However, recent studies have shown that the threshold of surface motion is empirically correlated with channel slope, a property that is easy to measure and readily available from the literature. These studies have thoroughly examined the mechanistic underpinnings behind the observed correlation and produced suitably complex models. These models are difficult to implement for natural rivers using widely available data, and thus others have treated the empirical regression between slope and the threshold of motion as a predictive model. We note that none of the authors of the original studies exploring this correlation suggested their empirical regressions be used in a predictive fashion; nevertheless, these regressions between slope and the threshold of motion have found their way into numerous recent studies, engendering potentially spurious conclusions. We demonstrate that there are two significant problems with using these empirical equations for prediction: (1) the empirical regressions are based on a limited sampling of the phase space of bed-load rivers and (2) the empirical measurements of bankfull and critical shear stresses are paired. The upshot is that the empirical relations' predictive capacity is limited to field sites drawn from the same region of the bed-load river phase space, and that the paired nature of the data introduces a spurious correlation when considering the ratio of bankfull to critical shear stress. Using a large compilation of bed-load river hydraulic geometry data, we demonstrate that the variation within independently measured values of the threshold of motion changes systematically with bankfull Shields stress and not channel slope. Additionally, using several recent datasets, we highlight the potential pitfalls that one can encounter when using simplistic empirical regressions to predict the threshold of motion, showing that while these concerns could be construed as subtle, the resulting implications can be substantial.
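A generic illustration of how a shared factor can induce correlation may help here. A Shields stress computed from the depth-slope product is proportional to depth × slope / grain size, so slope enters it by construction. The sketch below draws synthetic, mutually independent log-slope and log depth-to-grain-size values and shows that their sum (the log of a slope-containing stress) still correlates with log-slope. All names and distributions are assumptions for illustration; this does not reproduce the authors' data or analysis.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=2000, seed=0):
    """Return (r_independent, r_shared) for synthetic log-scale data."""
    rng = random.Random(seed)
    log_slope = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Depth-to-grain-size ratio drawn independently of slope (an assumption).
    log_depth_ratio = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # log(stress) = log(depth ratio) + log(slope): slope is a shared term.
    log_stress = [a + b for a, b in zip(log_depth_ratio, log_slope)]
    return pearson(log_slope, log_depth_ratio), pearson(log_slope, log_stress)


r_independent, r_shared = simulate()
print(f"independent inputs: r = {r_independent:.2f}")
print(f"slope vs slope-containing stress: r = {r_shared:.2f}")
```

With equal variances for the two inputs, the second correlation should sit near 1/√2 even though nothing physical links the inputs.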
[The warning model and influence of climatic changes on hemorrhagic fever with renal syndrome in Changsha city].

PubMed

Xiao, Hong; Tian, Huai-yu; Zhang, Xi-xing; Zhao, Jian; Zhu, Pei-juan; Liu, Ru-chun; Chen, Tian-mu; Dai, Xiang-yu; Lin, Xiao-ling

2011-10-01

To understand the influence of climatic changes on the transmission of hemorrhagic fever with renal syndrome (HFRS), and to explore the use of climatic factors in warning of HFRS. A total of 2171 cases of HFRS and the synchronous climatic data in Changsha from 2000 to 2009 were collected to build a climate-based forecasting model for HFRS transmission. The Cochran-Armitage trend test was employed to explore the variation trend of the annual incidence of HFRS. Cross-correlation analysis was then adopted to assess the time-lag period between the climatic factors, including monthly average temperature, relative humidity, rainfall and Multivariate El Niño-Southern Oscillation Index (MEI), and the monthly HFRS cases. Finally the time-series Poisson regression model was constructed to analyze the influence of different climatic factors on the HFRS transmission. The annual incidence of HFRS in Changsha between 2000 - 2009 was 13.09/100 000 (755 cases), 9.92/100 000 (578 cases), 5.02/100 000 (294 cases), 2.55/100 000 (150 cases), 1.13/100 000 (67 cases), 1.16/100 000 (70 cases), 0.95/100 000 (58 cases), 1.40/100 000 (87 cases), 0.75/100 000 (47 cases) and 1.02/100 000 (65 cases), respectively. The incidence showed a decline during these years (Z = -5.78, P < 0.01). The results of Poisson regression model indicated that the monthly average temperature (18.00°C, r = 0.26, P < 0.01, 1-month lag period; IRR = 1.02, 95%CI: 1.00 - 1.03, P < 0.01), relative humidity (75.50%, r = 0.62, P < 0.01, 3-month lag period; IRR = 1.03, 95%CI: 1.02 - 1.04, P < 0.01), rainfall (112.40 mm, r = 0.25, P < 0.01, 6-month lag period; IRR = 1.01, 95%CI: 1.01 - 1.02, P = 0.02), and MEI (r = 0.31, P < 0.01, 3-month lag period; IRR = 0.77, 95%CI: 0.67 - 0.88, P < 0.01) were closely associated with monthly HFRS cases (18.10 cases). Climate factors significantly influence the incidence of HFRS. If the influence of variable-autocorrelation, seasonality, and long-term trend were controlled, the accuracy of forecasting by the time-series Poisson regression model in Changsha would be comparatively high, and we could forecast the incidence of HFRS in advance.
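The lag-selection step described above can be sketched as a lagged Pearson correlation: shift the climatic series against the case series and keep the lag with the strongest correlation. The code below is a minimal sketch on synthetic monthly data with a built-in 3-month delay. The function names, the series, and the maximum lag are assumptions for illustration, not the study's data or software.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(driver, response, lag):
    """Correlate driver[t] with response[t + lag] (lag in months, >= 0)."""
    if lag == 0:
        return pearson(driver, response)
    return pearson(driver[:-lag], response[lag:])


def best_lag(driver, response, max_lag):
    """Lag in 0..max_lag with the largest lagged correlation."""
    return max(range(max_lag + 1),
               key=lambda k: lagged_correlation(driver, response, k))


# Synthetic example: a seasonal driver and a response delayed by 3 months.
driver = [math.sin(2 * math.pi * t / 12) for t in range(60)]
response = [0.0] * 3 + driver[:-3]
print(best_lag(driver, response, max_lag=6))
```

In practice the selected lag would then be used to shift each climatic covariate before it enters the Poisson regression.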
Are there specific health-related factors that can accentuate the risk of suicide among men with prostate cancer?

PubMed

Lehuluante, Abraraw; Fransson, Per

2014-06-01

The aim of this study was to explore if there were some specific factors pertinent to health-related quality of life (HRQoL) that could affect self-experienced suicide ideation in men with prostate cancer (PCa). Questionnaires containing 45 items were distributed to members of the Swedish Prostate Cancer Federation in May 2012. Out of 6,400 distributed questionnaires, 3,165 members (50 %) with PCa completed the questionnaires. Those members expressed their experienced HRQoL and experienced suicide ideation using VAS-like scales as well as multiple-choice questions. Both descriptive and analytical statistical methods were employed. A regression model was used to explore the relationship between experienced health-related quality of life and experienced suicide ideation. Generally, the respondents rated their self-experienced health-related quality of life as good. About 40 % of the participants had experienced problem with incontinence, and 23 % had obstructions during miction. About 7 % of the respondents experienced suicidal ideation, at least sometime. The regression model showed statistically significant relationships between suicide ideation, on the one hand, and lower self-rated health-related quality of life (P < 0.001), physical pain (P = 0.04), pain during miction (P = 0.03), and low-rated mental / physical energy (P = 0.03), on the other. It is quite necessary to know which specific disease and treatment-related problems can trigger suicide ideations in men with prostate cancer and to try to direct treatment, care, and psychosocial resources to alleviate these problems in time.
Compulsive cell phone use and history of motor vehicle crash.

PubMed

O'Connor, Stephen S; Whitehill, Jennifer M; King, Kevin M; Kernic, Mary A; Boyle, Linda Ng; Bresnahan, Brian W; Mack, Christopher D; Ebel, Beth E

2013-10-01

Few studies have examined the psychological factors underlying the association between cell phone use and motor vehicle crash. We sought to examine the factor structure and convergent validity of a measure of problematic cell phone use, and to explore whether compulsive cell phone use is associated with a history of motor vehicle crash. We recruited a sample of 383 undergraduate college students to complete an online assessment that included cell phone use and driving history. We explored the dimensionality of the Cell Phone Overuse Scale (CPOS) using factor analytic methods. Ordinary least-squares regression models were used to examine associations between identified subscales and measures of impulsivity, alcohol use, and anxious relationship style, to establish convergent validity. We used negative binomial regression models to investigate associations between the CPOS and motor vehicle crash incidence. We found the CPOS to be composed of four subscales: anticipation, activity interfering, emotional reaction, and problem recognition. Each displayed significant associations with aspects of impulsivity, problematic alcohol use, and anxious relationship style characteristics. Only the anticipation subscale demonstrated statistically significant associations with reported motor vehicle crash incidence, controlling for clinical and demographic characteristics (relative ratio, 1.13; confidence interval, 1.01-1.26). For each 1-point increase on the 6-point anticipation subscale, risk for previous motor vehicle crash increased by 13%. Crash risk is strongly associated with heightened anticipation about incoming phone calls or messages. The mean score on the CPOS is associated with increased risk of motor vehicle crash but does not reach statistical significance. Copyright © 2013 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
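The step from a relative ratio of 1.13 to a 13% increase follows from the log link of count regression models: effects multiply, so a k-point increase scales the expected count by 1.13^k. The sketch below works through that arithmetic and also recovers an approximate standard error from the reported interval. It assumes the interval is a 95% interval that is symmetric on the log scale, which the abstract does not state, and the function names are illustrative.

```python
import math


def percent_change(rate_ratio, points=1.0):
    """Percent change in the expected count for a `points`-unit increase."""
    return (rate_ratio ** points - 1.0) * 100.0


def log_scale_standard_error(lower, upper, z=1.96):
    """Approximate SE of log(rate ratio) from a confidence interval.

    Assumes a 95% interval (z = 1.96) symmetric on the log scale.
    """
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


ratio, ci_lower, ci_upper = 1.13, 1.01, 1.26  # values reported in the abstract
print(f"1-point increase: {percent_change(ratio):.1f}%")
print(f"2-point increase: {percent_change(ratio, 2):.1f}%")
print(f"approx. SE of log ratio: {log_scale_standard_error(ci_lower, ci_upper):.3f}")
```

Note that a 2-point increase corresponds to about 27.7%, not 26%, because the effect compounds rather than adds.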
Compulsive Cell Phone Use and History of Motor Vehicle Crash

PubMed Central

O’Connor, Stephen S.; Whitehill, Jennifer M.; King, Kevin M.; Kernic, Mary A.; Boyle, Linda Ng; Bresnahan, Brian; Mack, Christopher D.; Ebel, Beth E.

2013-01-01

Introduction Few studies have examined the psychological factors underlying the association between cell phone use and motor vehicle crash. We sought to examine the factor structure and convergent validity of a measure of problematic cell phone use and explore whether compulsive cell phone use is associated with a history of motor vehicle crash. Methods We recruited a sample of 383 undergraduate college students to complete an on-line assessment that included cell phone use and driving history. We explored the dimensionality of the Cell Phone Overuse Scale (CPOS) using factor analytic methods. Ordinary least squares regression models were used to examine associations between identified subscales and measures of impulsivity, alcohol use, and anxious relationship style to establish convergent validity. We used negative binomial regression models to investigate associations between the CPOS and motor vehicle crash incidence. Results We found the CPOS to be comprised of four subscales: anticipation, activity interfering, emotional reaction, and problem recognition. Each displayed significant associations with aspects of impulsivity, problematic alcohol use, and anxious relationship style characteristics. Only the anticipation subscale demonstrated statistically significant associations with reported motor vehicle crash incidence, controlling for clinical and demographic characteristics (RR 1.13, CI 1.01 to 1.26). For each one-point increase on the 6-point anticipation subscale, risk for previous motor vehicle crash increased by 13%. Conclusions Crash risk is strongly associated with heightened anticipation about incoming phone calls or messages. The mean score on the CPOS is associated with increased risk of motor vehicle crash but does not reach statistical significance. PMID:23910571
Determinants of Exclusive Breast Feeding in sub-Saharan Africa: A Multilevel Approach.

PubMed

Yalçin, Siddika Songül; Berde, Anselm S; Yalçin, Suzan

2016-09-01

The study aimed to provide an overall picture of the general pattern of exclusive breast feeding (EBF) in sub-Saharan Africa (SSA) by examining maternal sociodemographic, antenatal and postnatal factors associated with EBF in the region, as well as explore countries variations in EBF rates. We utilised cross-sectional data from the Demographic Health Surveys in 27 SSA countries. Our study sample included 25 084 infants under 6 months of age. The key outcome variable was EBF in the last 24 h. Due to the hierarchical structure of the data, a multilevel logistic regression model was used to explore factors associated with EBF. The overall prevalence of EBF in SSA was 36.0%, the prevalence was highest in Rwanda and lowest in Gabon. In the multilevel regression model, factors that were associated with increased likelihood of EBF included secondary and above maternal education, mothers within the ages of 25-34 years, rural residence, richer household wealth quantile, 4+ antenatal care visit, delivering in a health facility, singleton births, female infants, early initiation of breast feeding (EIBF), and younger infants. However, countries with higher gross national income per capita had lower EBF rates. To achieve a substantial increase in EBF rates in SSA, breast-feeding interventions and policies should target all women but with more emphasis to mothers with younger age, low educational status, urban residence, poor status, multiple births, and male infants. In addition, there is a need to promote antenatal care utilisation, hospital deliveries, and EIBF. © 2016 John Wiley & Sons Ltd.
Provincial income inequality and self‐reported health status in China during 1991–7

PubMed Central

Pei, X; Rodriguez, E

2006-01-01

Background The relationship between income inequality and health has been widely explored. Today there is some evidence suggesting that good health is inversely related to income inequality. After the economic reforms initiated in the early 1980s, China experienced one of the fastest‐growing income inequalities in the world. The state of China in the 1990s is focussed on and possible effects of provincial income inequality on individual health status are explored. Methods A multilevel regression model is used to analyse the data collected in 1991, 1993 and 1997 from nine provinces included in the China Health and Nutrition Survey. The effects of provincial Gini coefficients on self‐rated health in each year are evaluated by two logistic regressions estimating the odds ratios of reporting poor or fair health. The patterns of this effect are compared among the survey years and also among different demographic groups. Results The analyses show an independent effect of income inequality on self‐reported health after adjusting for individual and household variables. Furthermore, the effect of income distribution is not attenuated when household income and provincial gross domestic product per capita are included in the model. The results show that there is an increased risk of about 10–15% on average for fair or poor health for people living in provinces with greater income inequalities compared with provinces with modest income inequalities. Conclusions In China, societal income inequality appears to be an important determinant of population health during 1991–7. PMID:17108303
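For readers unfamiliar with the inequality measure used here, the Gini coefficient ranges from 0 (everyone has the same income) towards 1 (one unit holds all income). The sketch below computes it from a list of incomes using the standard sorted-rank formula. The function name and the example incomes are hypothetical, and the abstract does not say how the provincial coefficients were actually derived.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes via the sorted-rank formula.

    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with x sorted
    ascending and ranks i = 1..n.
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0 or values[0] < 0:
        raise ValueError("need non-negative incomes with a positive total")
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical household incomes for two provinces.
print(gini([5, 5, 5, 5]))   # perfectly equal incomes
print(gini([0, 0, 0, 1]))   # one household holds everything
```

With only four households the maximum attainable value is (n - 1)/n = 0.75, which is why small samples understate extreme inequality.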
Risk Factors and Stroke Characteristic in Patients with Postoperative Strokes.

PubMed

Dong, Yi; Cao, Wenjie; Cheng, Xin; Fang, Kun; Zhang, Xiaolong; Gu, Yuxiang; Leng, Bing; Dong, Qiang

2017-07-01

Intravenous thrombolysis and intra-arterial thrombectomy are now the standard therapies for patients with acute ischemic stroke. In-house strokes have often been overlooked even at stroke centers and there is no consensus on how they should be managed. Perioperative stroke happens rather frequently but a treatment protocol is lacking. In China, the issue of in-house strokes has not been explored. The aim of this study is to explore the current management of in-house stroke and identify the common risk factors associated with perioperative strokes. Altogether, 51,841 patients were admitted to a tertiary hospital in Shanghai and the records of those who had a neurological consult for stroke were reviewed. Their demographics, clinical characteristics, in-hospital complications and operations, and management plans were prospectively studied. Routine laboratory test results and risk factors of these patients were analyzed with a multiple logistic regression model. From January 1, 2015, to December 31, 2015, over 1800 patients had neurological consultations. Among these patients, 37 had an in-house stroke and 20 had more severe stroke during the postoperative period. Compared to in-house stroke patients without a procedure or operation, leukocytosis and elevated fasting glucose levels were more common in perioperative strokes. In the multiple logistic regression model, perioperative strokes were more likely related to large vessel occlusion. Patients with perioperative strokes had different risk factors and severity from other in-house strokes. For these patients, obtaining a neurological consultation prior to surgery may be appropriate in order to evaluate the risk of perioperative stroke. Copyright © 2017. Published by Elsevier Inc.
Effect of price and information on the food choices of women university students in Saudi Arabia: An experimental study.

PubMed

Halimic, Aida; Gage, Heather; Raats, Monique; Williams, Peter

2018-04-01

To explore the impact of price manipulation and healthy eating information on intended food choices. Health information was provided to a random half of subjects (vs. information on Saudi agriculture). Each subject chose from the same lunch menu, containing two healthy and two unhealthy entrees, desserts and beverages, on five occasions. Reference case prices were 5, 3 and 2 Saudi Arabian Reals (SARs). Prices of healthy and unhealthy items were manipulated up (taxed) and down (subsidized) by 1 SAR in four menu variations (random order); subjects were given a budget enabling full choice within any menu. The number of healthy food choices was compared across different price combinations, and between information groups. Linear regression modelling explored the effect of relative prices of healthy/unhealthy options and information on number of healthy choices controlling for dietary behaviours and hunger levels. University campus, Saudi Arabia, 2013. 99 women students. In the reference case, 49.5% of choices were for healthy items. When the price of healthy items was reduced, 58.5% of selections were healthy; 57.2% when the price of unhealthy items rose. In regression modelling, reducing the price of healthy items and increasing the price of unhealthy items increased the number of healthy choices by 5% and 6% respectively. Students reporting a less healthy usual diet selected significantly fewer healthy items. Providing healthy eating information was not a significant influence. Price manipulation offers potential for altering behaviours to combat rising youth obesity in Saudi Arabia. Copyright © 2018 Elsevier Ltd. All rights reserved.
Learning Through Experience: Influence of Formal and Informal Training on Medical Error Disclosure Skills in Residents.

PubMed

Wong, Brian M; Coffey, Maitreya; Nousiainen, Markku T; Brydges, Ryan; McDonald-Blumer, Heather; Atkinson, Adelle; Levinson, Wendy; Stroud, Lynfa

2017-02-01

Residents' attitudes toward error disclosure have improved over time. It is unclear whether this has been accompanied by improvements in disclosure skills. To measure the disclosure skills of internal medicine (IM), paediatrics, and orthopaedic surgery residents, and to explore resident perceptions of formal versus informal training in preparing them for disclosure in real-world practice. We assessed residents' error disclosure skills using a structured role play with a standardized patient in 2012-2013. We compared disclosure skills across programs using analysis of variance. We conducted a multiple linear regression, including data from a historical cohort of IM residents from 2005, to investigate the influence of predictor variables on performance: training program, cohort year, and prior disclosure training and experience. We conducted a qualitative descriptive analysis of data from semistructured interviews with residents to explore resident perceptions of formal versus informal disclosure training. In a comparison of disclosure skills for 49 residents, there was no difference in overall performance across specialties (4.1 to 4.4 of 5, P = .19). In regression analysis, only the current cohort was significantly associated with skill: current residents performed better than a historical cohort of 42 IM residents ( P < .001). Qualitative analysis identified the importance of both formal (workshops, morbidity and mortality rounds) and informal (role modeling, debriefing) activities in preparation for disclosure in real-world practice. Residents across specialties have similar skills in disclosure of errors. Residents identified role modeling and a strong local patient safety culture as key facilitators for disclosure.
Paleohydrogeology of the San Joaquin basin, California

USGS Publications Warehouse

Wilson, A.M.; Garven, G.; Boles, J.R.

1999-01-01

Mass transport can have a significant effect on chemical diagenetic processes in sedimentary basins. This paper presents results from the first part of a study that was designed to explore the role of an evolving hydrodynamic system in driving mass transport and chemical diagenesis, using the San Joaquin basin of California as a field area. We use coupled hydrogeologic models to establish the paleohydrogeology, thermal history, and behavior of nonreactive solutes in the basin. These models rely on extensive geological information and account for variable-density fluid flow, heat transport, solute transport, tectonic uplift, sediment compaction, and clay dehydration. In our numerical simulations, tectonic uplift and ocean regression led to large-scale changes in fluid flow and composition by strengthening topography-driven fluid flow and allowing deep influx of fresh ground water in the San Joaquin basin. Sediment compaction due to rapid deposition created moderate overpressures, leading to upward flow from depth. The unusual distribution of salinity in the basin reflects influx of fresh ground water to depths of as much as 2 km and dilution of saline fluids by dehydration reactions at depths greater than ~2.5 km. Simulations projecting the future salinity of the basin show marine salinities persisting for more than 10 m.y. after ocean regression. Results also show a change from topography- to compaction-driven flow in the Stevens Sandstone at ca. 5 Ma that coincides with an observed change in the diagenetic sequence. Results of this investigation provide a framework for future hydrologic research exploring the link between fluid flow and diagenesis.
Diabetes, depressive symptoms, and functional disability in African Americans: the Jackson Heart Study.

PubMed

Kalyani, Rita Rastogi; Ji, Nan; Carnethon, Mercedes; Bertoni, Alain G; Selvin, Elizabeth; Gregg, Edward W; Sims, Mario; Golden, Sherita Hill

2017-08-01

To investigate the degree to which comorbid depression contributes to the relationship of diabetes with functional disability in African Americans (AAs), a population at high-risk for complications. We examined 2989 African Americans (AAs) in the Jackson Heart Study who had diabetes and depressive symptoms (CES-D) assessed at baseline. Overall functional disability was defined as the inability to perform at least one task of daily living. Multivariable logistic regression models explored the association of diabetes and depressive symptoms with functional disability. Prevalence of overall functional disability was highest with both diabetes and depressive symptoms (54%), similar with diabetes alone (31%) or depressive symptoms alone (33%), and lowest with neither (15%). Adjusting for demographics, smoking, BMI, cardiovascular comorbidities, and hsCRP, the association of depressive symptoms alone (OR=2.30,95% CI 1.75-3.03) and both diabetes and depressive symptoms (OR=2.75,1.88-4.04) with overall functional disability was significant, but not for diabetes alone (OR=1.26,0.95-1.67), compared to neither. In regression analyses including any diabetes and any depressive symptoms together in models, the main effect of depressive symptoms but not diabetes was associated with overall functional disability, and the interaction term was not significant (p-value=0.84). Functional disability was highest among AAs who have both diabetes and depressive symptoms; the latter was a stronger contributor. Future studies should explore mechanisms underlying functional disability in diabetes, particularly the role of depression. Copyright © 2017 Elsevier Inc. All rights reserved.
Prediction of Biological Motion Perception Performance from Intrinsic Brain Network Regional Efficiency

PubMed Central

Wang, Zengjian; Zhang, Delong; Liang, Bishan; Chang, Song; Pan, Jinghua; Huang, Ruiwang; Liu, Ming

2016-01-01

Biological motion perception (BMP) refers to the ability to perceive the moving form of a human figure from a limited amount of stimuli, such as from a few point lights located on the joints of a moving body. BMP is commonplace and important, but there is great inter-individual variability in this ability. This study used multiple regression model analysis to explore the association between BMP performance and intrinsic brain activity, in order to investigate the neural substrates underlying inter-individual variability of BMP performance. The resting-state functional magnetic resonance imaging (rs-fMRI) and BMP performance data were collected from 24 healthy participants, for whom intrinsic brain networks were constructed, and a graph-based network efficiency metric was measured. Then, a multiple linear regression model was used to explore the association between network regional efficiency and BMP performance. We found that the local and global network efficiency of many regions was significantly correlated with BMP performance. Further analysis showed that the local efficiency rather than global efficiency could be used to explain most of the BMP inter-individual variability, and the regions involved were predominately located in the Default Mode Network (DMN). Additionally, discrimination analysis showed that the local efficiency of certain regions such as the thalamus could be used to classify BMP performance across participants. Notably, the association pattern between network nodal efficiency and BMP was different from the association pattern of static directional/gender information perception. Overall, these findings show that intrinsic brain network efficiency may be considered a neural factor that explains BMP inter-individual variability. PMID:27853427
Diesel engine exhaust and lung cancer risks - evaluation of the meta-analysis by Vermeulen et al. 2014.

PubMed

Morfeld, Peter; Spallek, Michael

2015-01-01

Vermeulen et al. 2014 published a meta-regression analysis of three relevant epidemiological US studies (Steenland et al. 1998, Garshick et al. 2012, Silverman et al. 2012) that estimated the association between occupational diesel engine exhaust (DEE) exposure and lung cancer mortality. The DEE exposure was measured as cumulative exposure to estimated respirable elemental carbon in μg/m³-years. Vermeulen et al. 2014 found a statistically significant dose-response association and described elevated lung cancer risks even at very low exposures. We performed an extended re-analysis using different modelling approaches (fixed and random effects regression analyses, Greenland/Longnecker method) and explored the impact of varying input data (modified coefficients of Garshick et al. 2012, results from Crump et al. 2015 replacing Silverman et al. 2012, modified analysis of Möhner et al. 2013). We reproduced the individual and main meta-analytical results of Vermeulen et al. 2014. However, our analysis demonstrated a heterogeneity of the baseline relative risk levels between the three studies. This heterogeneity was reduced after the coefficients of Garshick et al. 2012 were modified while the dose coefficient dropped by an order of magnitude for this study and was far from being significant (P = 0.6). A (non-significant) threshold estimate for the cumulative DEE exposure was found at 150 μg/m³-years when extending the meta-analyses of the three studies by hockey-stick regression modelling (including the modified coefficients for Garshick et al. 2012). The data used by Vermeulen and colleagues led to the highest relative risk estimate across all sensitivity analyses performed. The lowest relative risk estimate was found after exclusion of the explorative study by Steenland et al. 1998 in a meta-regression analysis of Garshick et al. 2012 (modified), Silverman et al. 2012 (modified according to Crump et al. 2015) and Möhner et al. 2013. The meta-coefficient was estimated to be about 10-20 % of the main effect estimate in Vermeulen et al. 2014 in this analysis. The findings of Vermeulen et al. 2014 should not be used without reservations in any risk assessments. This is particularly true for the low end of the exposure scale.
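As an illustration of the hockey-stick (threshold) modelling named in this abstract, the following minimal sketch assumes a log-linear relative risk that stays at 1 below a threshold and rises above it. The function name and the slope value are hypothetical; only the 150 μg/m³-years threshold is taken from the abstract, and this is not the authors' fitted model.

```python
import math

def hockey_stick_rr(exposure, threshold, slope):
    """Relative risk under a threshold model:
    ln RR = slope * max(0, exposure - threshold)."""
    return math.exp(slope * max(0.0, exposure - threshold))

THRESHOLD = 150.0  # ug/m^3-years, the (non-significant) estimate quoted in the abstract
SLOPE = 0.001      # hypothetical ln-RR increase per ug/m^3-year, for illustration only

for x in (0.0, 100.0, 150.0, 500.0):
    # Below or at the threshold the relative risk is exactly 1
    print(x, round(hockey_stick_rr(x, THRESHOLD, SLOPE), 3))
```

A model of this shape lets the data decide whether risk rises from zero exposure (threshold near 0) or only above some cumulative dose.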
A Model Comparison for Count Data with a Positively Skewed Distribution with an Application to the Number of University Mathematics Courses Completed

ERIC Educational Resources Information Center

Liou, Pey-Yan

2009-01-01

The current study examines three regression models: OLS (ordinary least squares) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…
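One quick way to see why Poisson and negative binomial regression can disagree on skewed count data is to compare the sample variance with the sample mean. The sketch below is not the study's procedure: the data are hypothetical, and the dispersion estimate assumes the common NB2 form Var(Y) = μ + αμ².

```python
from statistics import mean, variance

def overdispersion_summary(counts):
    """Compare sample mean and variance of a count variable."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    # Poisson assumes Var(Y) = mu. Under NB2, Var(Y) = mu + alpha * mu**2,
    # so a method-of-moments estimate is alpha = (v - m) / m**2 (floored at 0).
    alpha = max(0.0, (v - m) / m ** 2)
    return {"mean": m, "variance": v, "ratio": v / m, "alpha": alpha}

# Hypothetical numbers of mathematics courses completed by ten students
courses = [0, 0, 1, 1, 1, 2, 2, 3, 5, 9]
print(overdispersion_summary(courses))
```

A variance-to-mean ratio well above 1 suggests that the equal mean and variance assumption of the Poisson model is violated.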
Breeding value accuracy estimates for growth traits using random regression and multi-trait models in Nelore cattle.

PubMed

Boligon, A A; Baldi, F; Mercadante, M E Z; Lobo, R B; Pereira, R J; Albuquerque, L G

2011-06-28

We quantified the potential increase in accuracy of expected breeding value for weights of Nelore cattle, from birth to mature age, using multi-trait and random regression models on Legendre polynomials and B-spline functions. A total of 87,712 weight records from 8144 females were used, recorded every three months from birth to mature age from the Nelore Brazil Program. For random regression analyses, all female weight records from birth to eight years of age (data set I) were considered. From this general data set, a subset was created (data set II), which included only nine weight records: at birth, weaning, 365 and 550 days of age, and 2, 3, 4, 5, and 6 years of age. Data set II was analyzed using random regression and multi-trait models. The model of analysis included the contemporary group as fixed effects and age of dam as a linear and quadratic covariable. In the random regression analyses, average growth trends were modeled using a cubic regression on orthogonal polynomials of age. Residual variances were modeled by a step function with five classes. Legendre polynomials of fourth and sixth order were utilized to model the direct genetic and animal permanent environmental effects, respectively, while third-order Legendre polynomials were considered for maternal genetic and maternal permanent environmental effects. Quadratic polynomials were applied to model all random effects in random regression models on B-spline functions. Direct genetic and animal permanent environmental effects were modeled using three segments or five coefficients, and genetic maternal and maternal permanent environmental effects were modeled with one segment or three coefficients in the random regression models on B-spline functions. For both data sets (I and II), animals ranked differently according to expected breeding value obtained by random regression or multi-trait models. 
With random regression models, the highest gains in accuracy were obtained at ages with a low number of weight records. The results indicate that random regression models provide more accurate expected breeding values than the traditionally finite multi-trait models. Thus, higher genetic responses are expected for beef cattle growth traits by replacing a multi-trait model with random regression models for genetic evaluation. B-spline functions could be applied as an alternative to Legendre polynomials to model covariance functions for weights from birth to mature age.
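Random regression on Legendre polynomials needs age mapped onto [-1, 1] and a set of polynomial covariates per record. The sketch below shows one common way to build them, assuming normalized Legendre polynomials. The function names and the 0 to 2920 day age range are hypothetical, and `order` here means the highest polynomial degree, which may differ from the convention used in the abstract.

```python
import math

def standardize_age(age, age_min, age_max):
    """Map age linearly onto [-1, 1], the domain of Legendre polynomials."""
    return -1.0 + 2.0 * (age - age_min) / (age_max - age_min)

def legendre_covariates(x, order):
    """Normalized Legendre polynomials phi_0..phi_order evaluated at x in [-1, 1]."""
    p = [1.0, x]
    for n in range(1, order):
        # Bonnet recurrence: (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    # Normalization factor sqrt((2k + 1) / 2) gives unit norm on [-1, 1]
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]

# Hypothetical record: weight taken at 550 days, ages spanning 0 to 2920 days
x = standardize_age(550, 0, 2920)
print(legendre_covariates(x, 3))
```

Each animal's random regression coefficients multiply these covariates, so the covariance between weights at any two ages follows from the coefficient covariance matrix.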

Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

ERIC Educational Resources Information Center

Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

2015-01-01

Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
Flourishing: exploring predictors of mental health within the college environment.

PubMed

Fink, John E

2014-01-01

To explore the predictive factors of student mental health within the college environment. Students enrolled at 7 unique universities during years 2008 (n=1,161) and 2009 (n=1,459). Participants completed survey measures of mental health, consequences of alcohol use, and engagement in the college environment. In addition to replicating previous findings related to Keyes' Mental Health Continuum, multiple regression analysis revealed several predictors of college student mental health, including supportive college environments, students' sense of belonging, professional confidence, and civic engagement. However, multiple measures of engaged learning were not found to predict mental health. Results suggest that supportive college environments foster student flourishing. Implications for promoting mental health across campus are discussed. Future research should build on exploratory findings and test confirmatory models to better understand relationships between the college environment and student flourishing.
Exploring Crossing Differential Item Functioning by Gender in Mathematics Assessment

ERIC Educational Resources Information Center

Ong, Yoke Mooi; Williams, Julian; Lamprianou, Iasonas

2015-01-01

The purpose of this article is to explore crossing differential item functioning (DIF) in a test drawn from a national examination of mathematics for 11-year-old pupils in England. An empirical dataset was analyzed to explore DIF by gender in a mathematics assessment. A two-step process involving the logistic regression (LR) procedure for…
Modeling spatial patterns of wildfire susceptibility in southern California: Applications of MODIS remote sensing data and mesoscale numerical weather models

NASA Astrophysics Data System (ADS)

Schneider, Philipp

This dissertation investigates the potential of Moderate Resolution Imaging Spectroradiometer (MODIS) imagery and mesoscale numerical weather models for mapping wildfire susceptibility in general and for improving the Fire Potential Index (FPI) in southern California in particular. The dissertation explores the use of the Visible Atmospherically Resistant Index (VARI) from MODIS data for mapping relative greenness (RG) of vegetation and subsequently for computing the FPI. VARI-based RG was validated against in situ observations of live fuel moisture. The results indicate that VARI is superior to the previously used Normalized Difference Vegetation Index (NDVI) for computing RG. FPI computed using VARI-based RG was found to outperform the traditional FPI when validated against historical fire detections using logistic regression. The study further investigates the potential of using Multiple Endmember Spectral Mixture Analysis (MESMA) on MODIS data for estimating live and dead fractions of vegetation. MESMA fractions were compared against in situ measurements and fractions derived from data of a high-resolution, hyperspectral sensor. The results show that live and dead fractions obtained from MODIS using MESMA are well correlated with the reference data. Further, FPI computed using MESMA-based green vegetation fraction in lieu of RG was validated against historical fire occurrence data. MESMA-based FPI performs at a comparable level to the traditional NDVI-based FPI, but can do so using a single MODIS image rather than an extensive remote sensing time series as required for the RG approach. Finally this dissertation explores the potential of integrating gridded wind speed data obtained from the MM5 mesoscale numerical weather model in the FPI. A new fire susceptibility index, the Wind-Adjusted Fire Potential Index (WAFPI), was introduced. It modifies the FPI algorithm by integrating normalized wind speed. 
Validating WAFPI against historical wildfire events using logistic regression indicates that gridded data sets of wind speed are a valuable addition to the FPI as they can significantly increase the probability range of the fitted model and can further increase the model's discriminatory power over that of the traditional FPI.
A spatially filtered multilevel model to account for spatial dependency: application to self-rated health status in South Korea

PubMed Central

2014-01-01

Background: This study aims to suggest an approach that integrates multilevel models and eigenvector spatial filtering methods and apply it to a case study of self-rated health status in South Korea. In many previous health-related studies, multilevel models and single-level spatial regression are used separately. However, the two methods should be used in conjunction because the objectives of both approaches are important in health-related analyses. The multilevel model enables the simultaneous analysis of both individual and neighborhood factors influencing health outcomes. However, the results of conventional multilevel models are potentially misleading when spatial dependency across neighborhoods exists. Spatial dependency in health-related data indicates that health outcomes in nearby neighborhoods are more similar to each other than those in distant neighborhoods. Spatial regression models can address this problem by modeling spatial dependency. This study explores the possibility of integrating a multilevel model and eigenvector spatial filtering, an advanced spatial regression for addressing spatial dependency in datasets. Methods: In this spatially filtered multilevel model, eigenvectors function as additional explanatory variables accounting for unexplained spatial dependency within the neighborhood-level error. The specification addresses the inability of conventional multilevel models to account for spatial dependency, and thereby, generates more robust outputs. Results: The findings show that sex, employment status, monthly household income, and perceived levels of stress are significantly associated with self-rated health status. Residents living in neighborhoods with low deprivation and a high doctor-to-resident ratio tend to report higher health status. The spatially filtered multilevel model provides unbiased estimations and improves the explanatory power of the model compared to conventional multilevel models although there are no changes in the signs of parameters and the significance levels between the two models in this case study. Conclusions: The integrated approach proposed in this paper is a useful tool for understanding the geographical distribution of self-rated health status within a multilevel framework. In future research, it would be useful to apply the spatially filtered multilevel model to other datasets in order to clarify the differences between the two models. It is anticipated that this integrated method will also out-perform conventional models when it is used in other contexts. PMID:24571639
A forecasting method to reduce estimation bias in self-reported cell phone data.

PubMed

Redmayne, Mary; Smith, Euan; Abramson, Michael J

2013-01-01

There is ongoing concern that extended exposure to cell phone electromagnetic radiation could be related to an increased risk of negative health effects. Epidemiological studies seek to assess this risk, usually relying on participants' recalled use, but recall is notoriously poor. Our objectives were primarily to produce a forecast method, for use by such studies, to reduce estimation bias in the recalled extent of cell phone use. The method we developed, using Bayes' rule, is modelled with data we collected in a cross-sectional cluster survey exploring cell phone user-habits among New Zealand adolescents. Participants recalled their recent extent of SMS-texting and retrieved from their provider the current month's actual use-to-date. Actual use was taken as the gold standard in the analyses. Estimation bias arose from a large random error, as observed in all cell phone validation studies. We demonstrate that this seriously exaggerates upper-end forecasts of use when used in regression models. This means that calculations using a regression model will lead to underestimation of heavy-users' relative risk. Our Bayesian method substantially reduces estimation bias. In cases where other studies' data conforms to our method's requirements, application should reduce estimation bias, leading to a more accurate relative risk calculation for mid-to-heavy users.
Estimating Building Age with 3d GIS

NASA Astrophysics Data System (ADS)

Biljecki, F.; Sindram, M.

2017-10-01

Building datasets (e.g. footprints in OpenStreetMap and 3D city models) are becoming increasingly available worldwide. However, the thematic (attribute) aspect is not always given attention, as many of such datasets are lacking in completeness of attributes. A prominent attribute of buildings is the year of construction, which is useful for some applications, but its availability may be scarce. This paper explores the potential of estimating the year of construction (or age) of buildings from other attributes using random forest regression. The developed method has a two-fold benefit: enriching datasets and quality control (verification of existing attributes). Experiments are carried out on a semantically rich LOD1 dataset of Rotterdam in the Netherlands using 9 attributes. The results are mixed: the accuracy in the estimation of building age depends on the available information used in the regression model. In the best scenario we have achieved predictions with an RMSE of 11 years, but in more realistic situations with limited knowledge about buildings the error is much larger (RMSE = 26 years). Hence the main conclusion of the paper is that inferring building age with 3D city models is possible to a certain extent because it reveals the approximate period of construction, but precise estimations remain a difficult task.
A New Metric for Land-Atmosphere Coupling Strength: Applications on Observations and Modeling

NASA Astrophysics Data System (ADS)

Tang, Q.; Xie, S.; Zhang, Y.; Phillips, T. J.; Santanello, J. A., Jr.; Cook, D. R.; Riihimaki, L.; Gaustad, K.

2017-12-01

A new metric is proposed to quantify the land-atmosphere (LA) coupling strength and is elaborated by correlating the surface evaporative fraction and impacting land and atmosphere variables (e.g., soil moisture, vegetation, and radiation). Based upon multiple linear regression, this approach simultaneously considers multiple factors and thus represents complex LA coupling mechanisms better than existing single-variable metrics. The standardized regression coefficients quantify the relative contributions from individual drivers in a consistent manner, avoiding the potential inconsistency in relative influence of conventional metrics. Moreover, the unique expandable feature of the new method allows us to verify and explore potentially important coupling mechanisms. Our observation-based application of the new metric shows moderate coupling with large spatial variations at the U.S. Southern Great Plains. The relative importance of soil moisture vs. vegetation varies by location. We also show that LA coupling strength is generally underestimated by single-variable methods due to their incompleteness. We also apply this new metric to evaluate the representation of LA coupling in the Accelerated Climate Modeling for Energy (ACME) V1 Contiguous United States (CONUS) regionally refined model (RRM). This work is performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-734201
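For readers unfamiliar with standardized regression coefficients, the general textbook relationship can be written as follows. The notation is an assumption introduced here (EF for evaporative fraction, x_j for the j-th land or atmosphere driver, s for a sample standard deviation) and is not taken from the abstract.

```latex
\mathrm{EF} = \beta_0 + \sum_{j=1}^{p} \beta_j x_j + \varepsilon,
\qquad
\beta_j^{*} = \beta_j \, \frac{s_{x_j}}{s_{\mathrm{EF}}}
```

Each standardized coefficient gives the change in EF, in standard deviations, per one standard deviation change in driver x_j with the other drivers held fixed, which is what makes contributions from drivers with different units comparable.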
Modeling absolute differences in life expectancy with a censored skew-normal regression approach

PubMed Central

Clough-Gorr, Kerri; Zwahlen, Marcel

2015-01-01

Parameter estimates from commonly used multivariable parametric survival regression models do not directly quantify differences in years of life expectancy. Gaussian linear regression models give results in terms of absolute mean differences, but are not appropriate in modeling life expectancy, because in many situations time to death has a negative skewed distribution. A regression approach using a skew-normal distribution would be an alternative to parametric survival models in the modeling of life expectancy, because parameter estimates can be interpreted in terms of survival time differences while allowing for skewness of the distribution. In this paper we show how to use the skew-normal regression so that censored and left-truncated observations are accounted for. With this we model differences in life expectancy using data from the Swiss National Cohort Study and from official life expectancy estimates and compare the results with those derived from commonly used survival regression models. We conclude that a censored skew-normal survival regression approach for left-truncated observations can be used to model differences in life expectancy across covariates of interest. PMID:26339544
Probabilistic estimates of number of undiscovered deposits and their total tonnages in permissive tracts using deposit densities

USGS Publications Warehouse

Singer, Donald A.; Kouda, Ryoichi

2011-01-01

Empirical evidence indicates that processes affecting number and quantity of resources in geologic settings are very general across deposit types. Sizes of permissive tracts that geologically could contain the deposits are excellent predictors of numbers of deposits. In addition, total ore tonnage of mineral deposits of a particular type in a tract is proportional to the type’s median tonnage in a tract. Regressions using size of permissive tracts and median tonnage allow estimation of number of deposits and of total tonnage of mineralization. These powerful estimators, based on 10 different deposit types from 109 permissive worldwide control tracts, generalize across deposit types. Estimates of number of deposits and of total tonnage of mineral deposits are made by regressing permissive area, and mean (in logs) tons in deposits of the type, against number of deposits and total tonnage of deposits in the tract for the 50th percentile estimates. The regression equations (R² = 0.91 and 0.95) can be used for all deposit types just by inserting logarithmic values of permissive area in square kilometers, and mean tons in deposits in millions of metric tons. The regression equations provide estimates at the 50th percentile, and other equations are provided for 90% confidence limits for lower estimates and 10% confidence limits for upper estimates of number of deposits and total tonnage. Equations for these percentile estimates along with expected value estimates are presented here along with comparisons with independent expert estimates. Also provided are the equations for correcting for the known well-explored deposits in a tract. These deposit-density models require internally consistent grade and tonnage models and delineations for arriving at unbiased estimates.
Marital status and survival of patients with oral cavity squamous cell carcinoma: a population-based study.

PubMed

Shi, Xiao; Zhang, Ting-Ting; Hu, Wei-Ping; Ji, Qing-Hai

2017-04-25

The relationship between marital status and oral cavity squamous cell carcinoma (OCSCC) survival has not been explored. The objective of our study was to evaluate the impact of marital status on OCSCC survival and investigate the potential mechanisms. Married patients had better 5-year cancer-specific survival (CSS) (66.7% vs 54.9%) and 5-year overall survival (OS) (56.0% vs 41.1%). In multivariate Cox regression models, unmarried patients also showed higher mortality risk for both CSS (Hazard Ratio [HR]: 1.260, 95% confidence interval (CI): 1.187-1.339, P < 0.001) and OS (HR: 1.328, 95% CI: 1.266-1.392, P < 0.001). Multivariate logistic regression showed married patients were more likely to be diagnosed at earlier stage (P < 0.001) and receive surgery (P < 0.001). Married patients still demonstrated better prognosis in the 1:1 matched group analysis (CSS: 62.9% vs 60.8%, OS: 52.3% vs 46.5%). 11022 eligible OCSCC patients were identified from Surveillance, Epidemiology, and End Results (SEER) database, including 5902 married and 5120 unmarried individuals. Kaplan-Meier analysis, Log-rank test and Cox proportional hazards regression model were used to analyze survival and mortality risk. Influence of marital status on stage, age at diagnosis and selection of treatment was determined by binomial and multinomial logistic regression. Propensity score matching method was adopted to perform a 1:1 matched cohort. Marriage has an independently protective effect on OCSCC survival. Earlier diagnosis and more sufficient treatment are possible explanations. Besides, even after 1:1 matching, survival advantage of married group still exists, indicating that spousal support from other aspects may also play an important role.
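The hazard ratios and confidence intervals reported above are tied together on the log scale: a Cox coefficient β gives HR = exp(β), and a Wald-type 95% interval is exp(β ± 1.96·SE). The sketch below back-calculates the standard error from the published CSS figures, assuming the interval was of this Wald form (the abstract does not say how it was computed). The function name is hypothetical.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Recover the log hazard ratio and its standard error from a Wald-type CI."""
    beta = math.log(hr)
    # The interval is symmetric on the log scale with half-width z * se
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se

# Cancer-specific survival, unmarried vs married: HR 1.260 (95% CI 1.187-1.339)
beta, se = log_hr_and_se(1.260, 1.187, 1.339)
low = math.exp(beta - 1.96 * se)
high = math.exp(beta + 1.96 * se)
print(round(beta, 4), round(se, 4), round(low, 3), round(high, 3))
```

The rebuilt interval matches the published one only up to the rounding of the reported figures.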
Marital status and survival of patients with oral cavity squamous cell carcinoma: a population-based study

PubMed Central

Shi, Xiao; Zhang, Ting-ting; Hu, Wei-ping; Ji, Qing-hai

2017-01-01

Background: The relationship between marital status and oral cavity squamous cell carcinoma (OCSCC) survival has not been explored. The objective of our study was to evaluate the impact of marital status on OCSCC survival and investigate the potential mechanisms. Results: Married patients had better 5-year cancer-specific survival (CSS) (66.7% vs 54.9%) and 5-year overall survival (OS) (56.0% vs 41.1%). In multivariate Cox regression models, unmarried patients also showed higher mortality risk for both CSS (Hazard Ratio [HR]: 1.260, 95% confidence interval (CI): 1.187–1.339, P < 0.001) and OS (HR: 1.328, 95% CI: 1.266–1.392, P < 0.001). Multivariate logistic regression showed married patients were more likely to be diagnosed at earlier stage (P < 0.001) and receive surgery (P < 0.001). Married patients still demonstrated better prognosis in the 1:1 matched group analysis (CSS: 62.9% vs 60.8%, OS: 52.3% vs 46.5%). Materials and Methods: 11022 eligible OCSCC patients were identified from Surveillance, Epidemiology, and End Results (SEER) database, including 5902 married and 5120 unmarried individuals. Kaplan-Meier analysis, Log-rank test and Cox proportional hazards regression model were used to analyze survival and mortality risk. Influence of marital status on stage, age at diagnosis and selection of treatment was determined by binomial and multinomial logistic regression. Propensity score matching method was adopted to perform a 1:1 matched cohort. Conclusions: Marriage has an independently protective effect on OCSCC survival. Earlier diagnosis and more sufficient treatment are possible explanations. Besides, even after 1:1 matching, survival advantage of married group still exists, indicating that spousal support from other aspects may also play an important role. PMID:28415710
Estimating and Predicting Metal Concentration Using Online Turbidity Values and Water Quality Models in Two Rivers of the Taihu Basin, Eastern China

PubMed Central

Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin

2016-01-01

Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R² = 0.86–0.93 for 72 data sets collected in the industrial river and R² = 0.60–0.85 for 60 data sets collected in the cleaner river. All the regressions showed a good linear relationship, leading to the conclusion that the occurrence of the five metals is directly related to suspended solids and that these metal concentrations could be approximated using the regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fate of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process for metal concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density. PMID:27028017
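The workflow described here (fit a line of metal concentration against turbidity, check R², then judge estimates by relative error) can be sketched in a few lines. The paired observations below are hypothetical and the function names are introduced only for illustration. This is ordinary least squares with a single predictor, not the authors' code.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, with R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

def relative_error(estimated, measured):
    """Absolute relative error of an estimate against a measurement."""
    return abs(estimated - measured) / measured

# Hypothetical paired observations: turbidity (NTU) and a metal concentration (ug/L)
turbidity = [10.0, 20.0, 30.0, 40.0, 50.0]
metal = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r2 = fit_line(turbidity, metal)
estimate = intercept + slope * 35.0  # estimate from a new online turbidity reading
print(slope, intercept, r2, relative_error(estimate, 7.0))
```

Once fitted, the equation turns every online turbidity reading into a concentration estimate, which is what gives the approach its high temporal density.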
Estimating and Predicting Metal Concentration Using Online Turbidity Values and Water Quality Models in Two Rivers of the Taihu Basin, Eastern China.

PubMed

Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin

2016-01-01

Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R² = 0.86-0.93 for 72 data sets collected in the industrial river and R² = 0.60-0.85 for 60 data sets collected in the cleaner river. All the regressions showed a good linear relationship, leading to the conclusion that the occurrence of the five metals is directly related to suspended solids and that these metal concentrations could be approximated using the regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fate of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process for metal concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density.
GWAS with longitudinal phenotypes: performance of approximate procedures

PubMed Central

Sikorska, Karolina; Montazeri, Nahid Mostafavi; Uitterlinden, André; Rivadeneira, Fernando; Eilers, Paul HC; Lesaffre, Emmanuel

2015-01-01

Analysis of genome-wide association studies with longitudinal data using standard procedures, such as linear mixed model (LMM) fitting, leads to discouragingly long computation times. There is a need to speed up the computations significantly. In our previous work (Sikorska et al: Fast linear mixed model computations for genome-wide association studies with longitudinal data. Stat Med 2012; 32.1: 165–180), we proposed the conditional two-step (CTS) approach as a fast method providing an approximation to the P-value for the longitudinal single-nucleotide polymorphism (SNP) effect. In the first step a reduced conditional LMM is fit, omitting all the SNP terms. In the second step, the estimated random slopes are regressed on SNPs. The CTS has been applied to the bone mineral density data from the Rotterdam Study and proved to work very well even in unbalanced situations. In another article (Sikorska et al: GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies. BMC Bioinformatics 2013; 14: 166), we suggested semi-parallel computations, greatly speeding up fitting many linear regressions. Combining CTS with fast linear regression reduces the computation time from several weeks to a few minutes on a single computer. Here, we explore further the properties of the CTS both analytically and by simulations. We investigate the performance of our proposal in comparison with a related but different approach, the two-step procedure. It is analytically shown that for the balanced case, under mild assumptions, the P-value provided by the CTS is the same as from the LMM. For unbalanced data and in realistic situations, simulations show that the CTS method does not inflate the type I error rate and implies only a minimal loss of power. PMID:25712081
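The second step of the two-step idea described above (regressing each subject's estimated random slope on a SNP) reduces to one simple linear regression per SNP. The sketch below covers only that step, on hypothetical data, and assumes the slopes were already estimated from a reduced mixed model. It does not implement the conditional transformation or the semi-parallel computations of the cited articles, and the function name is introduced here for illustration.

```python
import math

def slope_on_snp(slopes, genotypes):
    """Regress estimated random slopes on one SNP (coded 0/1/2).
    Returns the SNP effect and its t statistic (n - 2 degrees of freedom)."""
    n = len(slopes)
    mg = sum(genotypes) / n
    ms = sum(slopes) / n
    sgg = sum((g - mg) ** 2 for g in genotypes)
    sgs = sum((g - mg) * (s - ms) for g, s in zip(genotypes, slopes))
    beta = sgs / sgg
    resid_ss = sum((s - ms - beta * (g - mg)) ** 2 for g, s in zip(genotypes, slopes))
    se = math.sqrt(resid_ss / (n - 2) / sgg)
    return beta, beta / se

# Hypothetical per-subject slopes from a reduced mixed model, and one SNP
slopes = [0.10, 0.20, 0.35, 0.05, 0.25, 0.30]
genotypes = [0, 1, 2, 0, 1, 2]
print(slope_on_snp(slopes, genotypes))
```

Because the expensive mixed model is fitted only once, the per-SNP work is just these closed-form sums, which is why a genome-wide scan becomes cheap.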
Land Use Regression Modeling of Outdoor Noise Exposure in Informal Settlements in Western Cape, South Africa

PubMed Central

Sieber, Chloé; Ragettli, Martina S.; Toyib, Olaniyan; Baatjies, Roslyn; Saucy, Apolline; Probst-Hensch, Nicole; Dalvie, Mohamed Aqiel; Röösli, Martin

2017-01-01

In low- and middle-income countries, noise exposure and its negative health effects have been little explored. The present study aimed to assess the noise exposure situation in adults living in informal settings in the Western Cape Province, South Africa. We conducted continuous one-week outdoor noise measurements at 134 homes in four different areas. These data were used to develop a land use regression (LUR) model to predict A-weighted day-evening-night equivalent sound levels (Lden) from geographic information system (GIS) variables. Mean noise exposure during day (6:00–18:00) was 60.0 A-weighted decibels (dB(A)) (interquartile range 56.9–62.9 dB(A)), during night (22:00–6:00) 52.9 dB(A) (49.3–55.8 dB(A)) and average Lden was 63.0 dB(A) (60.1–66.5 dB(A)). Main predictors of the LUR model were related to road traffic and household density. Model performance was low (adjusted R² = 0.130), suggesting that influences other than those represented in the geographic predictors are relevant for noise exposure. This is one of the few studies on the noise exposure situation in low- and middle-income countries. It demonstrates that noise exposure levels are high in these settings. PMID:29053590
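Lden is an energy-weighted average of day, evening and night levels, with 5 dB added to the evening level and 10 dB to the night level. The sketch below uses the standard 12/4/8 hour weighting, which is assumed to match the day (6:00–18:00) and night (22:00–6:00) windows given above. The evening level is hypothetical because the abstract does not report it, so the output is not expected to reproduce the reported average of 63.0 dB(A), which was presumably averaged over homes.

```python
import math

def lden(l_day, l_evening, l_night, hours=(12, 4, 8)):
    """Day-evening-night level in dB(A), with +5 dB evening and +10 dB night penalties."""
    h_d, h_e, h_n = hours
    # Convert each level to relative sound energy, weight by duration, then average
    energy = (h_d * 10 ** (l_day / 10)
              + h_e * 10 ** ((l_evening + 5) / 10)
              + h_n * 10 ** ((l_night + 10) / 10))
    return 10 * math.log10(energy / (h_d + h_e + h_n))

# Day and night means from the abstract; the evening level of 58.0 is a hypothetical value
print(round(lden(60.0, 58.0, 52.9), 1))
```

Because of the penalties, a night level 10 dB below the day level contributes as much per hour as the day level itself.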
On comparison of net survival curves.

PubMed

Pavlič, Klemen; Perme, Maja Pohar

2017-05-02

Relative survival analysis is a subfield of survival analysis where competing risks data are observed, but the causes of death are unknown. A first step in the analysis of such data is usually the estimation of a net survival curve, possibly followed by regression modelling. Recently, a log-rank type test for comparison of net survival curves has been introduced and the goal of this paper is to explore its properties and put this methodological advance into the context of the field. We build on the association between the log-rank test and the univariate or stratified Cox model and show the analogy in the relative survival setting. We study the properties of the methods using both theoretical arguments and simulations. We provide an R function to enable practical usage of the log-rank type test. Both the log-rank type test and its model alternatives perform satisfactorily under the null, even if the correlation between their p-values is rather low, implying that both approaches cannot be used simultaneously. The stratified version has a higher power in case of non-homogeneous hazards, but also carries a different interpretation. The log-rank type test and its stratified version can be interpreted in the same way as the results of an analogous semi-parametric additive regression model despite the fact that no direct theoretical link can be established between the test statistics.
Observing Consistency in Online Communication Patterns for User Re-Identification

PubMed Central

Venter, Hein S.

2016-01-01

Comprehension of the statistical and structural mechanisms governing human dynamics in online interaction plays a pivotal role in online user identification, online profile development, and recommender systems. However, building a characteristic model of human dynamics on the Internet involves a complete analysis of the variations in human activity patterns, which is a complex process. This complexity is inherent in human dynamics and has not been extensively studied to reveal the structural composition of human behavior. A typical method of anatomizing such a complex system is viewing all independent interconnectivity that constitutes the complexity. An examination of the various dimensions of human communication pattern in online interactions is presented in this paper. The study employed reliable server-side web data from 31 known users to explore characteristics of human-driven communications. Various machine-learning techniques were explored. The results revealed that each individual exhibited a relatively consistent, unique behavioral signature and that the logistic regression model and model tree can be used to accurately distinguish online users. These results are applicable to one-to-one online user identification processes, insider misuse investigation processes, and online profiling in various areas. PMID:27918593
Influence of visual clutter on the effect of navigated safety inspection: a case study on elevator installation.

PubMed

Liao, Pin-Chao; Sun, Xinlu; Liu, Mei; Shih, Yu-Nien

2018-01-11

Navigated safety inspection based on task-specific checklists can increase the hazard detection rate, theoretically with interference from scene complexity. Visual clutter, a proxy of scene complexity, can theoretically impair visual search performance, but its impact on the effect of safety inspection performance remains to be explored for the optimization of navigated inspection. This research aims to explore whether the relationship between working memory and hazard detection rate is moderated by visual clutter. Based on a perceptive model of hazard detection, we: (a) developed a mathematical influence model for construction hazard detection; (b) designed an experiment to observe the performance of hazard detection rate with adjusted working memory under different levels of visual clutter, while using an eye-tracking device to observe participants' visual search processes; (c) utilized logistic regression to analyze the developed model under various visual clutter. The effect of a strengthened working memory on the detection rate through increased search efficiency is more apparent in high visual clutter. This study confirms the role of visual clutter in construction-navigated inspections, thus serving as a foundation for the optimization of inspection planning.
Error Covariance Penalized Regression: A novel multivariate model combining penalized regression with multivariate error structure.

PubMed

Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C

2018-06-29

A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.

Analyzing Student Learning Outcomes: Usefulness of Logistic and Cox Regression Models. IR Applications, Volume 5

ERIC Educational Resources Information Center

Chen, Chau-Kuang

2005-01-01

Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…
Job satisfaction amongst aged care staff: exploring the influence of person-centered care provision.

PubMed

Edvardsson, David; Fetherstonhaugh, Deirdre; McAuliffe, Linda; Nay, Rhonda; Chenco, Carol

2011-10-01

There are challenges in attracting and sustaining a competent and stable workforce in aged care, and key issues of concern such as low staff job satisfaction and feelings of not being able to provide high quality care have been described. This study aimed to explore the association between person-centered care provision and job satisfaction in aged care staff. Residential aged care staff (n = 297) in Australia completed the measure of job satisfaction and the person-centered care assessment tool. Univariate analyses examined relationships between variables, and multiple linear regression analysis explored the extent to which perceived person-centredness could predict job satisfaction of staff. Perceived person-centred care provision was significantly associated with job satisfaction, and person-centred care provision could explain nearly half of the variation in job satisfaction. The regression model with the three person-centered care subscales as predictor variables accounted for 40% of the variance in job satisfaction. Personalizing care had the largest independent influence on job satisfaction, followed by amount of organizational support and degree of environmental accessibility. Personalizing care and amount of organizational support had a statistically significant unique influence. As person-centered care positively correlated with staff job satisfaction, supporting staff in providing person-centered care can enhance job satisfaction and might facilitate attracting and retaining staff in residential aged care. The findings reiterate a need to shift focus from merely completing care tasks and following organizational routines to providing high quality person-centered care that promotes the good life of residents in aged care.
Bayesian Unimodal Density Regression for Causal Inference

ERIC Educational Resources Information Center

Karabatsos, George; Walker, Stephen G.

2011-01-01

Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

ERIC Educational Resources Information Center

Culpepper, Steven Andrew; Park, Trevor

2017-01-01

A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
A simple approach to power and sample size calculations in logistic regression and Cox regression models.

PubMed

Vaeth, Michael; Skovlund, Eva

2004-06-15

For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
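The equivalent two-sample construction described in this abstract can be sketched for logistic regression as follows. This is a minimal illustration, not the authors' code: it assumes the "parameters" are log-odds, that the two groups are equally sized so the overall event probability is the mean of the two group probabilities, and that the final sample size comes from the usual two-proportion formula. The function names `equivalent_two_sample` and `total_sample_size` are hypothetical.

```python
import math
from statistics import NormalDist


def expit(x):
    """Inverse logit: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def equivalent_two_sample(p_overall, beta, sd_x):
    """Find (p1, p2) whose log-odds differ by 2 * beta * sd_x
    and whose mean equals the overall response probability.
    (Assumed reading of the abstract; see lead-in.)"""
    delta = 2.0 * beta * sd_x
    if delta == 0.0:
        raise ValueError("slope times SD must be non-zero")
    lo, hi = -30.0, 30.0  # bracket for the log-odds of group 1
    for _ in range(200):  # bisection: mean probability is increasing in the log-odds
        mid = (lo + hi) / 2.0
        mean_p = (expit(mid) + expit(mid + delta)) / 2.0
        if mean_p < p_overall:
            lo = mid
        else:
            hi = mid
    a = (lo + hi) / 2.0
    return expit(a), expit(a + delta)


def total_sample_size(p_overall, beta, sd_x, alpha=0.05, power=0.80):
    """Total n from the standard two-proportion formula (an assumed choice)."""
    p1, p2 = equivalent_two_sample(p_overall, beta, sd_x)
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2.0
    numerator = (z_a * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
                 + z_b * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))) ** 2
    n_per_group = numerator / (p1 - p2) ** 2
    return 2 * math.ceil(n_per_group)
```

For example, `total_sample_size(0.3, 0.5, 1.0)` would give the total sample size for a log-odds slope of 0.5 per unit of a covariate with standard deviation 1 and an overall response probability of 0.3, under the assumptions stated above.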
Impact of weather factors on hand, foot and mouth disease, and its role in short-term incidence trend forecast in Huainan City, Anhui Province.

PubMed

Zhao, Desheng; Wang, Lulu; Cheng, Jian; Xu, Jun; Xu, Zhiwei; Xie, Mingyu; Yang, Huihui; Li, Kesheng; Wen, Lingying; Wang, Xu; Zhang, Heng; Wang, Shusi; Su, Hong

2017-03-01

Hand, foot, and mouth disease (HFMD) is one of the most common communicable diseases in China, and current climate change has been recognized as a significant contributor. Nevertheless, no reliable models have been put forward to predict the dynamics of HFMD cases based on short-term weather variations. The present study aimed to examine the association between weather factors and HFMD, and to explore the accuracy of seasonal auto-regressive integrated moving average (SARIMA) model with local weather conditions in forecasting HFMD. Weather and HFMD data from 2009 to 2014 in Huainan, China, were used. Poisson regression model combined with a distributed lag non-linear model (DLNM) was applied to examine the relationship between weather factors and HFMD. Forecasting of HFMD was performed using the SARIMA model. The results showed that temperature rise was significantly associated with an elevated risk of HFMD. Yet, no correlations between relative humidity, barometric pressure and rainfall, and HFMD were observed. SARIMA models with temperature variable fitted HFMD data better than the model without it (sR² increased, while the BIC decreased), and the SARIMA (0, 1, 1)(0, 1, 0)52 offered the best fit for HFMD data. In addition, compared with females and nursery children, males and scattered children may be more suitable for using SARIMA model to predict the number of HFMD cases and it has high precision. In conclusion, high temperature could increase the risk of contracting HFMD. SARIMA model with temperature variable can effectively improve its forecast accuracy, which can provide valuable information for the policy makers and public health to construct a best-fitting model and optimize HFMD prevention.
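To make the SARIMA (0, 1, 1)(0, 1, 0)52 notation concrete, the sketch below applies one ordinary difference and one seasonal difference of period 52, then forms a one-step forecast from a first-order moving-average term. It is illustrative only: it assumes weekly data (so that 52 is one year), uses the sign convention w_t = e_t + theta * e_(t-1), omits the temperature covariate, and takes `theta` as given rather than estimated. The function names are hypothetical.

```python
def double_difference(y, season=52):
    """Apply (1 - B)(1 - B^season) to the series y.

    Returns w[t] = y[t] - y[t-1] - y[t-season] + y[t-season-1]
    for every t where all four terms exist.
    """
    if len(y) < season + 2:
        raise ValueError("need at least season + 2 observations")
    return [y[t] - y[t - 1] - y[t - season] + y[t - season - 1]
            for t in range(season + 1, len(y))]


def one_step_forecast(y, theta, season=52):
    """One-step-ahead forecast for SARIMA(0,1,1)(0,1,0)[season].

    Residuals are recovered recursively from the differenced series,
    starting from an assumed initial residual of zero.
    """
    w = double_difference(y, season)
    e_prev = 0.0
    for w_t in w:
        e_prev = w_t - theta * e_prev  # e_t = w_t - theta * e_(t-1)
    # Undo both differences: y_(t+1) = y_t + y_(t+1-season) - y_(t-season) + w_(t+1)
    return y[-1] + y[-season] - y[-season - 1] + theta * e_prev
```

A series made only of a linear trend plus a fixed 52-week pattern differences to zero, so the forecast reproduces the next value exactly; real incidence data leave non-zero residuals that the moving-average term then smooths.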
Time Series Analysis for Forecasting Hospital Census: Application to the Neonatal Intensive Care Unit

PubMed Central

Hoover, Stephen; Jackson, Eric V.; Paul, David; Locke, Robert

2016-01-01

Background: Accurate prediction of future patient census in hospital units is essential for patient safety, health outcomes, and resource planning. Forecasting census in the Neonatal Intensive Care Unit (NICU) is particularly challenging due to limited ability to control the census and clinical trajectories. The fixed average census approach, using average census from previous year, is a forecasting alternative used in clinical practice, but has limitations due to census variations. Objective: Our objectives are to: (i) analyze the daily NICU census at a single health care facility and develop census forecasting models, (ii) explore models with and without patient data characteristics obtained at the time of admission, and (iii) evaluate accuracy of the models compared with the fixed average census approach. Methods: We used five years of retrospective daily NICU census data for model development (January 2008 – December 2012, N=1827 observations) and one year of data for validation (January – December 2013, N=365 observations). Best-fitting models of ARIMA and linear regression were applied to various 7-day prediction periods and compared using error statistics. Results: The census showed a slightly increasing linear trend. Best fitting models included a non-seasonal model, ARIMA(1,0,0), seasonal ARIMA models, ARIMA(1,0,0)x(1,1,2)7 and ARIMA(2,1,4)x(1,1,2)14, as well as a seasonal linear regression model. Proposed forecasting models resulted on average in 36.49% improvement in forecasting accuracy compared with the fixed average census approach. Conclusions: Time series models provide higher prediction accuracy under different census conditions compared with the fixed average census approach. Presented methodology is easily applicable in clinical practice, can be generalized to other care settings, support short- and long-term census forecasting, and inform staff resource planning. PMID:27437040
Impact of weather factors on hand, foot and mouth disease, and its role in short-term incidence trend forecast in Huainan City, Anhui Province

NASA Astrophysics Data System (ADS)

Zhao, Desheng; Wang, Lulu; Cheng, Jian; Xu, Jun; Xu, Zhiwei; Xie, Mingyu; Yang, Huihui; Li, Kesheng; Wen, Lingying; Wang, Xu; Zhang, Heng; Wang, Shusi; Su, Hong

2017-03-01

Hand, foot, and mouth disease (HFMD) is one of the most common communicable diseases in China, and current climate change has been recognized as a significant contributor. Nevertheless, no reliable models have been put forward to predict the dynamics of HFMD cases based on short-term weather variations. The present study aimed to examine the association between weather factors and HFMD, and to explore the accuracy of seasonal auto-regressive integrated moving average (SARIMA) model with local weather conditions in forecasting HFMD. Weather and HFMD data from 2009 to 2014 in Huainan, China, were used. Poisson regression model combined with a distributed lag non-linear model (DLNM) was applied to examine the relationship between weather factors and HFMD. Forecasting of HFMD was performed using the SARIMA model. The results showed that temperature rise was significantly associated with an elevated risk of HFMD. Yet, no correlations between relative humidity, barometric pressure and rainfall, and HFMD were observed. SARIMA models with temperature variable fitted HFMD data better than the model without it (sR² increased, while the BIC decreased), and the SARIMA (0, 1, 1)(0, 1, 0)52 offered the best fit for HFMD data. In addition, compared with females and nursery children, males and scattered children may be more suitable for using SARIMA model to predict the number of HFMD cases and it has high precision. In conclusion, high temperature could increase the risk of contracting HFMD. SARIMA model with temperature variable can effectively improve its forecast accuracy, which can provide valuable information for the policy makers and public health to construct a best-fitting model and optimize HFMD prevention.
Time Series Analysis for Forecasting Hospital Census: Application to the Neonatal Intensive Care Unit.

PubMed

Capan, Muge; Hoover, Stephen; Jackson, Eric V; Paul, David; Locke, Robert

2016-01-01

Accurate prediction of future patient census in hospital units is essential for patient safety, health outcomes, and resource planning. Forecasting census in the Neonatal Intensive Care Unit (NICU) is particularly challenging due to limited ability to control the census and clinical trajectories. The fixed average census approach, using average census from previous year, is a forecasting alternative used in clinical practice, but has limitations due to census variations. Our objectives are to: (i) analyze the daily NICU census at a single health care facility and develop census forecasting models, (ii) explore models with and without patient data characteristics obtained at the time of admission, and (iii) evaluate accuracy of the models compared with the fixed average census approach. We used five years of retrospective daily NICU census data for model development (January 2008 - December 2012, N=1827 observations) and one year of data for validation (January - December 2013, N=365 observations). Best-fitting models of ARIMA and linear regression were applied to various 7-day prediction periods and compared using error statistics. The census showed a slightly increasing linear trend. Best fitting models included a non-seasonal model, ARIMA(1,0,0), seasonal ARIMA models, ARIMA(1,0,0)x(1,1,2)7 and ARIMA(2,1,4)x(1,1,2)14, as well as a seasonal linear regression model. Proposed forecasting models resulted on average in 36.49% improvement in forecasting accuracy compared with the fixed average census approach. Time series models provide higher prediction accuracy under different census conditions compared with the fixed average census approach. Presented methodology is easily applicable in clinical practice, can be generalized to other care settings, support short- and long-term census forecasting, and inform staff resource planning.
Estimation of elimination half-lives of organic chemicals in humans using gradient boosting machine.

PubMed

Lu, Jing; Lu, Dong; Zhang, Xiaochen; Bi, Yi; Cheng, Keguang; Zheng, Mingyue; Luo, Xiaomin

2016-11-01

Elimination half-life is an important pharmacokinetic parameter that determines exposure duration to approach steady state of drugs and regulates drug administration. The experimental evaluation of half-life is time-consuming and costly. Thus, it is attractive to build an accurate prediction model for half-life. In this study, several machine learning methods, including gradient boosting machine (GBM), support vector regressions (RBF-SVR and Linear-SVR), local lazy regression (LLR), SA, SR, and GP, were employed to build high-quality prediction models. Two strategies of building consensus models were explored to improve the accuracy of prediction. Moreover, the applicability domains (ADs) of the models were determined by using the distance-based threshold. Among seven individual models, GBM showed the best performance (R² = 0.820 and RMSE = 0.555 for the test set), and Linear-SVR produced the inferior prediction accuracy (R² = 0.738 and RMSE = 0.672). The use of distance-based ADs effectively determined the scope of QSAR models. However, the consensus models obtained by combining the individual models could not improve the prediction performance. Some essential descriptors relevant to half-life were identified and analyzed. An accurate prediction model for elimination half-life was built by GBM, which was superior to the reference model (R² = 0.723 and RMSE = 0.698). Encouraged by the promising results, we expect that the GBM model for elimination half-life would have potential applications for the early pharmacokinetic evaluations, and provide guidance for designing drug candidates with favorable in vivo exposure profile. This article is part of a Special Issue entitled "System Genetics" Guest Editor: Dr. Yudong Cai and Dr. Tao Huang. Copyright © 2016 Elsevier B.V. All rights reserved.
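The two test-set statistics quoted in this abstract can be computed as in the sketch below. It assumes the common definitions (RMSE as the root of the mean squared residual, R² as one minus the ratio of residual to total sum of squares); the abstract does not say which R² variant was used. The averaging consensus shown is only one hypothetical strategy, since the abstract does not describe the two that were tried.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_consensus(*model_predictions):
    """Hypothetical consensus: average the predictions of several models."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]
```

Scoring each individual model and the consensus with the same two functions on the same held-out compounds is what allows a statement such as "the consensus did not improve on the best single model" to be checked directly.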
Modeling Of In-Vehicle Human Exposure to Ambient Fine Particulate Matter

PubMed Central

Liu, Xiaozhen; Frey, H. Christopher

2012-01-01

A method for estimating in-vehicle PM2.5 exposure as part of a scenario-based population simulation model is developed and assessed. In existing models, such as the Stochastic Exposure and Dose Simulation model for Particulate Matter (SHEDS-PM), in-vehicle exposure is estimated using linear regression based on area-wide ambient PM2.5 concentration. An alternative modeling approach is explored based on estimation of near-road PM2.5 concentration and an in-vehicle mass balance. Near-road PM2.5 concentration is estimated using a dispersion model and fixed site monitor (FSM) data. In-vehicle concentration is estimated based on air exchange rate and filter efficiency. In-vehicle concentration varies with road type, traffic flow, windspeed, stability class, and ventilation. Average in-vehicle exposure is estimated to contribute 10 to 20 percent of average daily exposure. The contribution of in-vehicle exposure to total daily exposure can be higher for some individuals. Recommendations are made for updating exposure models and implementation of the alternative approach. PMID:23101000
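As an illustration of an in-vehicle mass balance driven by air exchange rate and filter efficiency, the sketch below uses a single well-mixed compartment. The equation form, the optional deposition term, and all parameter names are assumptions made for this example; the abstract does not give the model's actual equations.

```python
import math


def steady_state_in_vehicle(c_near_road, air_exchange_per_h,
                            filter_efficiency, deposition_per_h=0.0):
    """Steady-state cabin PM2.5 for an assumed well-mixed cabin.

    Assumed balance: dC/dt = a * (1 - f) * C_out - (a + k) * C
    where a = air exchange rate, f = filter efficiency, k = deposition rate.
    """
    if air_exchange_per_h <= 0.0:
        raise ValueError("air exchange rate must be positive")
    if not 0.0 <= filter_efficiency <= 1.0:
        raise ValueError("filter efficiency must lie in [0, 1]")
    source = air_exchange_per_h * (1.0 - filter_efficiency) * c_near_road
    loss = air_exchange_per_h + deposition_per_h
    return source / loss


def in_vehicle_at_time(c_start, t_hours, c_near_road, air_exchange_per_h,
                       filter_efficiency, deposition_per_h=0.0):
    """Closed-form solution of the same balance from an initial concentration."""
    c_ss = steady_state_in_vehicle(c_near_road, air_exchange_per_h,
                                   filter_efficiency, deposition_per_h)
    loss = air_exchange_per_h + deposition_per_h
    return c_ss + (c_start - c_ss) * math.exp(-loss * t_hours)
```

Under these assumptions a higher filter efficiency lowers the cabin concentration in direct proportion, while the air exchange rate mainly sets how quickly the cabin tracks changes in the near-road concentration.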
Comparative evaluation of urban storm water quality models

NASA Astrophysics Data System (ADS)

Vaze, J.; Chiew, Francis H. S.

2003-10-01

The estimation of urban storm water pollutant loads is required for the development of mitigation and management strategies to minimize impacts to receiving environments. Event pollutant loads are typically estimated using either regression equations or "process-based" water quality models. The relative merit of using regression models compared to process-based models is not clear. A modeling study is carried out here to evaluate the comparative ability of the regression equations and process-based water quality models to estimate event diffuse pollutant loads from impervious surfaces. The results indicate that, once calibrated, both the regression equations and the process-based model can estimate event pollutant loads satisfactorily. In fact, the loads estimated using the regression equation as a function of rainfall intensity and runoff rate are better than the loads estimated using the process-based model. Therefore, if only estimates of event loads are required, regression models should be used because they are simpler and require less data compared to process-based models.
Can Job Control Ameliorate Work-family Conflict and Enhance Job Satisfaction among Chinese Registered Nurses? A Mediation Model.

PubMed

Ding, Xiaotong; Yang, Yajuan; Su, Dan; Zhang, Ting; Li, Lunlan; Li, Huiping

2018-04-01

Low job satisfaction is the most common cause of nurses' turnover and influences the quality of nursing service. Moreover, it is unclear whether job control, as an individual factor, plays a role in the relationship between work-family conflict and job satisfaction. This study aimed to explore the relationship between work-family conflict and job satisfaction among Chinese registered nurses and the mediating role of job control in this relationship. From August 2015 to November 2016, 487 Chinese registered nurses completed a survey. The study used a work-family conflict scale, a job control scale, a job satisfaction scale, and general information. Multiple regression analysis was used to explore the independent factors of job satisfaction. A structural equation model was used to explore the mediating role of job control. Work-family conflict was negatively correlated with job satisfaction (r = -0.432, p < 0.01). In addition, job control was positively related to job satisfaction (r = 0.567, p < 0.01). Work-family conflict and job control had significant predictive effects on job satisfaction. Job control partially mediated the relationship between work-family conflict and job satisfaction. Work-family conflict affected job satisfaction and job control was a mediator in this relationship among Chinese registered nurses. Job control could potentially improve nurses' job satisfaction.
Spatial asymmetry in tactile sensor skin deformation aids perception of edge orientation during haptic exploration.

PubMed

Ponce Wong, Ruben D; Hellman, Randall B; Santos, Veronica J

2014-01-01

Upper-limb amputees rely primarily on visual feedback when using their prostheses to interact with others or objects in their environment. A constant reliance upon visual feedback can be mentally exhausting and does not suffice for many activities when line-of-sight is unavailable. Upper-limb amputees could greatly benefit from the ability to perceive edges, one of the most salient features of 3D shape, through touch alone. We present an approach for estimating edge orientation with respect to an artificial fingertip through haptic exploration using a multimodal tactile sensor on a robot hand. Key parameters from the tactile signals for each of four exploratory procedures were used as inputs to a support vector regression model. Edge orientation angles ranging from -90 to 90 degrees were estimated with an 85-input model having an R² of 0.99 and RMS error of 5.08 degrees. Electrode impedance signals provided the most useful inputs by encoding spatially asymmetric skin deformation across the entire fingertip. Interestingly, sensor regions that were not in direct contact with the stimulus provided particularly useful information. Methods described here could pave the way for semi-autonomous capabilities in prosthetic or robotic hands during haptic exploration, especially when visual feedback is unavailable.
Pharmaceutical pricing: an empirical study of market competition in Chinese hospitals.

PubMed

Wu, Jing; Xu, Judy; Liu, Gordon; Wu, Jiuhong

2014-03-01

High pharmaceutical prices and over-prescribing of high-priced pharmaceuticals in Chinese hospitals have long been criticized. Although policy makers have tried to address these issues, they have not yet found an effective balance between government regulation and market forces. Our objective was to explore the impact of market competition on pharmaceutical pricing under Chinese government regulation. Data from 11 public tertiary hospitals in three cities in China from 2002 to 2005 were used to explore the effect of generic and therapeutic competition on prices of antibiotics and cardiovascular products. A quasi-hedonic regression model was employed to estimate the impact of competition. The inputs to our model were specific attributes of the products and manufacturers, with the exception of competition variables. Our results suggest that pharmaceutical prices are inversely related to the number of generic and therapeutic competitors, but positively related to the number of therapeutic classes. In addition, the product prices of leading local manufacturers are not only significantly lower than those of global manufacturers, but are also lower than their non-leading counterparts when other product attributes are controlled for. Under the highly price-regulated market in China, competition from generic and therapeutic competitors did decrease pharmaceutical prices. Further research is needed to explore whether this competition increases consumer welfare in China's healthcare setting.
Complex messages regarding a thin ideal appearing in teenage girls' magazines from 1956 to 2005.

PubMed

Luff, Gina M; Gray, James J

2009-03-01

Seventeen and YM were assessed from 1956 through 2005 (n=312) to examine changes in the messages about thinness sent to teenage women. Trends were analyzed through an investigation of written, internal content focused on dieting, exercise, or both, while cover models were examined to explore fluctuations in body size. Pearson's Product correlations and weighted-least squares linear regression models were used to demonstrate changes over time. The frequency of written content related to exercise and combined plans increased in Seventeen, while a curvilinear relationship between time and content relating to dieting appeared. YM showed a linear increase in content related to dieting, exercise, and combined plans. Average cover model body size increased over time in YM while demonstrating no significant changes in Seventeen. Overall, more written messages about dieting and exercise appeared in teen's magazines in 2005 than before while the average cover model body size increased.
Analysis of spreadable cheese by Raman spectroscopy and chemometric tools.

PubMed

Oliveira, Kamila de Sá; Callegaro, Layce de Souza; Stephani, Rodrigo; Almeida, Mariana Ramos; de Oliveira, Luiz Fernando Cappa

2016-03-01

In this work, FT-Raman spectroscopy was explored to evaluate spreadable cheese samples. A partial least squares discriminant analysis was employed to identify the spreadable cheese samples containing starch. To build the models, two types of samples were used: commercial samples and samples manufactured in local industries. The method of supervised classification PLS-DA was employed to classify the samples as adulterated or without starch. Multivariate regression was performed using the partial least squares method to quantify the starch in the spreadable cheese. The limit of detection obtained for the model was 0.34% (w/w) and the limit of quantification was 1.14% (w/w). The reliability of the models was evaluated by determining the confidence interval, which was calculated using the bootstrap re-sampling technique. The results show that the classification models can be used to complement classical analysis and as screening methods. Copyright © 2015 Elsevier Ltd. All rights reserved.
Determinants of Perceived Stress in Individuals with Obesity: Exploring the Relationship of Potentially Obesity-Related Factors and Perceived Stress.

PubMed

Junne, Florian; Ziser, Katrin; Giel, Katrin Elisabeth; Schag, Kathrin; Skoda, Eva; Mack, Isabelle; Niess, Andreas; Zipfel, Stephan; Teufel, Martin

2017-01-01

Associations of specific types of stress with increased food intake and subsequent weight gain have been demonstrated in animal models as well as in experimental and epidemiological studies on humans. This study explores the research question of to what extent potentially obesity-related factors determine perceived stress in individuals with obesity. N = 547 individuals with obesity participated in a cross-sectional study assessing perceived stress as the outcome variable and potential determinants of stress related to obesity. Based on the available evidence, a five factorial model of 'obesity-related obesogenic stressors' was hypothesized, including the dimensions, 'drive for thinness', 'impulse regulation', 'ineffectiveness', 'social insecurity', and 'body dissatisfaction'. The model was tested using multiple linear regression analyses. The five factorial model of 'potentially obesity-related stressors' resulted in a total variance explanation of adjusted R² = 0.616 for males and adjusted R² = 0.595 for females for perceived stress. The relative variance contribution of the five included factors differed substantially for the two sexes. The findings of this cross-sectional study support the hypothesized, potentially obesity-related factors: 'drive for thinness', 'impulse regulation', 'ineffectiveness', 'social insecurity', and 'body dissatisfaction' as relevant determinants of perceived stress in individuals with obesity. © 2017 The Author(s) Published by S. Karger GmbH, Freiburg.
Statistical design and analysis for plant cover studies with multiple sources of observation errors

USGS Publications Warehouse

Wright, Wilson; Irvine, Kathryn M.; Warren, Jeffrey M.; Barnett, Jenny K.

2017-01-01

Effective wildlife habitat management and conservation requires understanding the factors influencing distribution and abundance of plant species. Field studies, however, have documented observation errors in visually estimated plant cover including measurements which differ from the true value (measurement error) and not observing a species that is present within a plot (detection error). Unlike the rapid expansion of occupancy and N-mixture models for analysing wildlife surveys, development of statistical models accounting for observation error in plants has not progressed quickly. Our work informs development of a monitoring protocol for managed wetlands within the National Wildlife Refuge System. Zero-augmented beta (ZAB) regression is the most suitable method for analysing areal plant cover recorded as a continuous proportion but assumes no observation errors. We present a model extension that explicitly includes the observation process thereby accounting for both measurement and detection errors. Using simulations, we compare our approach to a ZAB regression that ignores observation errors (naïve model) and an “ad hoc” approach using a composite of multiple observations per plot within the naïve model. We explore how sample size and within-season revisit design affect the ability to detect a change in mean plant cover between 2 years using our model. Explicitly modelling the observation process within our framework produced unbiased estimates and nominal coverage of model parameters. The naïve and “ad hoc” approaches resulted in underestimation of occurrence and overestimation of mean cover. The degree of bias was primarily driven by imperfect detection and its relationship with cover within a plot. Conversely, measurement error had minimal impacts on inferences.
We found >30 plots with at least three within-season revisits achieved reasonable posterior probabilities for assessing change in mean plant cover. For rapid adoption and application, code for Bayesian estimation of our single-species ZAB with errors model is included. Practitioners utilizing our R-based simulation code can explore trade-offs among different survey efforts and parameter values, as we did, but tuned to their own investigation. Less abundant plant species of high ecological interest may warrant the additional cost of gathering multiple independent observations in order to guard against erroneous conclusions.
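As a rough illustration of the naïve zero-augmented beta likelihood mentioned in this abstract (the version that ignores observation errors), the sketch below assumes the common mean-precision parameterization of the beta distribution. The names `psi` (probability that the species is present), `mu` (mean cover when present) and `phi` (precision) are illustrative assumptions and are not taken from the authors' R code; the cover values are made up.

```python
import math

def zab_loglik(y, psi, mu, phi):
    """Log-likelihood of cover proportions y under a zero-augmented beta model.

    Assumed parameterization (not taken from the paper):
      y == 0       with probability 1 - psi
      y in (0, 1)  with probability psi, distributed Beta(mu*phi, (1-mu)*phi)
    """
    a, b = mu * phi, (1.0 - mu) * phi
    # Log of the beta normalizing constant 1 / B(a, b).
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    total = 0.0
    for yi in y:
        if yi == 0.0:
            total += math.log(1.0 - psi)
        else:
            total += (math.log(psi) + log_norm
                      + (a - 1.0) * math.log(yi)
                      + (b - 1.0) * math.log(1.0 - yi))
    return total

# Made-up plot-level cover proportions, for illustration only.
cover = [0.0, 0.0, 0.12, 0.30, 0.05, 0.0, 0.22]
print(zab_loglik(cover, psi=0.6, mu=0.2, phi=10.0))
```

The model extension described in the abstract would add a detection and measurement layer on top of this likelihood; that layer is not sketched here.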
Partial Least Squares Regression Can Aid in Detecting Differential Abundance of Multiple Features in Sets of Metagenomic Samples

PubMed Central

Libiger, Ondrej; Schork, Nicholas J.

2015-01-01

It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. 
Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained on a small to moderate number of samples. PMID:26734061

Evaluating the performance of different predictor strategies in regression-based downscaling with a focus on glacierized mountain environments

NASA Astrophysics Data System (ADS)

Hofer, Marlis; Nemec, Johanna

2016-04-01

This study presents first steps towards verifying the hypothesis that uncertainty in global and regional glacier mass simulations can be reduced considerably by reducing the uncertainty in the high-resolution atmospheric input data. To this aim, we systematically explore the potential of different predictor strategies for improving the performance of regression-based downscaling approaches. The investigated local-scale target variables are precipitation, air temperature, wind speed, relative humidity and global radiation, all at a daily time scale. Observations of these target variables are assessed from three sites in geo-environmentally and climatologically very distinct settings, all within highly complex topography and in close proximity to mountain glaciers: (1) the Vernagtbach station in the Northern European Alps (VERNAGT), (2) the Artesonraju measuring site in the tropical South American Andes (ARTESON), and (3) the Brewster measuring site in the Southern Alps of New Zealand (BREWSTER). As the large-scale predictors, ERA-Interim reanalysis data are used. In the applied downscaling model training and evaluation procedures, particular emphasis is put on appropriately accounting for the pitfalls of limited and/or patchy observation records that are usually the only (if at all) available data from the glacierized mountain sites. Generalized linear models and beta regression are investigated as alternatives to ordinary least squares regression for the non-Gaussian target variables.
By analyzing results for the three different sites, five predictands and for different times of the year, we look for systematic improvements in the downscaling models' skill specifically obtained by (i) using predictor data at the optimum scale rather than the minimum scale of the reanalysis data, (ii) identifying the optimum predictor allocation in the vertical, and (iii) considering multiple (variable, level and/or grid point) predictor options combined with state-of-the-art empirical feature selection tools. First results show that, in particular for air temperature, downscaling models based on direct predictor selection show skill comparable to that of models based on multiple predictors. For all other target variables, however, multiple predictor approaches can considerably outperform those models based on single predictors. Including multiple variable types emerges as the most promising predictor option (in particular for wind speed at all sites), even if the same predictor set is used across the different cases.
A generalized right truncated bivariate Poisson regression model with applications to health data.

PubMed

Islam, M Ataharul; Chowdhury, Rafiqul I

2017-01-01

A generalized right truncated bivariate Poisson regression model is proposed in this paper. Estimation and tests for goodness of fit and over- or underdispersion are illustrated for both untruncated and right truncated bivariate Poisson regression models using a marginal-conditional approach. Estimation and test procedures are illustrated for bivariate Poisson regression models with applications to Health and Retirement Study data on number of health conditions and the number of health care services utilized. The proposed test statistics are easy to compute and it is evident from the results that the models fit the data very well. A comparison between the right truncated and untruncated bivariate Poisson regression models using the test for nonnested models clearly shows that the truncated model performs significantly better than the untruncated model.
A generalized right truncated bivariate Poisson regression model with applications to health data

PubMed Central

Islam, M. Ataharul; Chowdhury, Rafiqul I.

2017-01-01

A generalized right truncated bivariate Poisson regression model is proposed in this paper. Estimation and tests for goodness of fit and over- or underdispersion are illustrated for both untruncated and right truncated bivariate Poisson regression models using a marginal-conditional approach. Estimation and test procedures are illustrated for bivariate Poisson regression models with applications to Health and Retirement Study data on number of health conditions and the number of health care services utilized. The proposed test statistics are easy to compute and it is evident from the results that the models fit the data very well. A comparison between the right truncated and untruncated bivariate Poisson regression models using the test for nonnested models clearly shows that the truncated model performs significantly better than the untruncated model. PMID:28586344
Spatial regression models of park and land-use impacts on the urban heat island in central Beijing.

PubMed

Dai, Zhaoxin; Guldmann, Jean-Michel; Hu, Yunfeng

2018-06-01

Understanding the relationship between urban land structure and land surface temperatures (LST) is important for mitigating the urban heat island (UHI). This paper explores this relationship within central Beijing, an area located within the 2nd Ring Road. The urban variables include the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Built-up Index (NDBI), the area of building footprints, the area of main roads, the area of water bodies and a gravity index for parks that accounts for both park size and distance. The data are captured over 8 grids of square cells (30 m, 60 m, 90 m, 120 m, 150 m, 180 m, 210 m, 240 m). The research involves: (1) estimating land surface temperatures using Landsat 8 satellite imagery, (2) building the database of urban variables, and (3) conducting regression analyses. The results show that (1) all the variables impact surface temperatures, (2) spatial regressions are necessary to capture neighboring effects, and (3) higher-order polynomial functions are more suitable for capturing the effects of NDVI and NDBI. Copyright © 2018 Elsevier B.V. All rights reserved.
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

NASA Astrophysics Data System (ADS)

Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

2018-04-01

In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research examined the relationship between paddy yield and 20 variates of the top soil that were analyzed prior to planting at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model and the fuzzy c-means method. Analyses of normality and multicollinearity indicate that the data are normally distributed, with no multicollinearity among the independent variables. Fuzzy c-means analysis groups the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model alone, with a lower mean square error.
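The clustering step of such a hybrid can be sketched as follows. This is a minimal one-dimensional fuzzy c-means on made-up yield values, assuming the usual fuzzifier m = 2 and a min/max initialization; the function name and data are hypothetical, and this is not the authors' implementation.

```python
def fuzzy_c_means_1d(x, centers, m=2.0, iters=100, tol=1e-9):
    """Fuzzy c-means on scalar data; returns (centers, memberships).

    memberships[k][i] is the degree to which x[k] belongs to cluster i.
    """
    centers = list(centers)
    u = []
    for _ in range(iters):
        u = []
        for xk in x:
            d = [abs(xk - c) for c in centers]
            if 0.0 in d:
                # Point coincides with a center: full membership there.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
                s = sum(row)
                row = [r / s for r in row]
            else:
                # Standard update: u_i proportional to d_i^(-2/(m-1)).
                inv = [di ** (-2.0 / (m - 1.0)) for di in d]
                s = sum(inv)
                row = [v / s for v in inv]
            u.append(row)
        # Centers are membership-weighted means of the data.
        new_centers = [
            sum((u[k][i] ** m) * x[k] for k in range(len(x)))
            / sum(u[k][i] ** m for k in range(len(x)))
            for i in range(len(centers))
        ]
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u

# Made-up paddy yields (t/ha), for illustration only.
yields = [3.1, 3.4, 3.0, 3.3, 6.8, 7.1, 7.4, 6.9]
centers, memberships = fuzzy_c_means_1d(yields, [min(yields), max(yields)])
labels = [row.index(max(row)) for row in memberships]
print(centers, labels)
```

In a hybrid of the kind described, a separate multiple linear regression would then be fitted to the observations assigned to each cluster.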
Spatial Assessment of Model Errors from Four Regression Techniques

Treesearch

Lianjun Zhang; Jeffrey H. Gove

2005-01-01

Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...
[Regression on order statistics and its application in estimating nondetects for food exposure assessment].

PubMed

Yu, Xiaojin; Liu, Pei; Min, Jie; Chen, Qiguang

2009-01-01

To explore the application of regression on order statistics (ROS) in estimating nondetects for food exposure assessment. Regression on order statistics was applied to a cadmium residue data set from global food contaminant monitoring; the mean residue was estimated using SAS programming and compared with the results from substitution methods. The results show that the ROS method clearly performs better than substitution methods, being robust and convenient for subsequent analysis. Regression on order statistics is worth adopting, but more effort should be devoted to the details of applying this method.
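The idea behind ROS can be sketched in a few lines. The version below is a deliberate simplification: it assumes lognormal data, a single detection limit, and Blom plotting positions, and it is written in Python rather than the SAS used in the study. The function name and the concentrations are hypothetical.

```python
import math
from statistics import NormalDist

def simple_ros_mean(detected, n_censored):
    """Estimate the mean of a left-censored sample with one detection limit.

    Simplified regression on order statistics, assuming lognormal data:
    nondetects take the lowest ranks, log(detected) is regressed on normal
    quantiles of Blom plotting positions, and the fitted line imputes the
    nondetects. Requires at least two distinct detected values.
    """
    n = len(detected) + n_censored
    # Normal quantile for each rank 1..n (Blom plotting position).
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    z_cens, z_det = z[:n_censored], z[n_censored:]
    y = [math.log(v) for v in sorted(detected)]
    # Ordinary least squares of log-concentration on normal quantile.
    zbar, ybar = sum(z_det) / len(z_det), sum(y) / len(y)
    slope = (sum((a - zbar) * (b - ybar) for a, b in zip(z_det, y))
             / sum((a - zbar) ** 2 for a in z_det))
    intercept = ybar - slope * zbar
    # Impute each nondetect from the fitted line, then average everything.
    imputed = [math.exp(intercept + slope * zc) for zc in z_cens]
    return (sum(imputed) + sum(detected)) / n

# Made-up concentrations above the detection limit, plus three nondetects.
detected = [0.5, 0.8, 1.2, 2.0, 3.5]
print(simple_ros_mean(detected, n_censored=3))
```

Unlike substitution (replacing each nondetect with, say, half the detection limit), the imputed values here follow the distribution implied by the detected observations.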
Environmental Determinants of the Distribution of Chagas Disease Vector Triatoma dimidiata in Colombia.

PubMed

Parra-Henao, Gabriel; Quirós-Gómez, Oscar; Jaramillo-O, Nicolas; Cardona, Ángela Segura

2016-04-01

Triatoma dimidiata (Hemiptera: Reduviidae) is a secondary vector of Trypanosoma cruzi in Colombia and represents an important epidemiological risk mainly in the central and oriental regions of the country where it occupies sylvatic, peridomestic, and intradomestic ecotopes, and because of this complex distribution, its distribution and abundance could be conditioned by environmental factors. In this work, we explored the relationship between T. dimidiata distribution and environmental factors in the northwest, northeast, and central zones of Colombia and developed predictive models of infestation in the country. The associations between the presence of T. dimidiata and environmental variables were studied using logistic regression models and ecological niche modeling for a sample of villages in Colombia. The analysis was based on the information collected in field about the presence of T. dimidiata and the environmental data for each village extracted from remote sensing images. The presence of Triatoma dimidiata (Latreille, 1811) was found to be significantly associated with the maximum vegetation index, minimum land surface temperature (LST), and the digital elevation for the statistical model. Temperature seasonality, annual precipitation, and vegetation index were the variables that most influenced the ecological niche model of T. dimidiata distribution. The logistic regression model showed a good fit and predicted suitable habitats in the Andean and Caribbean regions, which agrees with the known distribution of the species, but predicted suitable habitats in the Pacific and Orinoco regions proposing new areas of research. Improved models to predict suitable habitats for T. dimidiata hold promise for spatial targeting of integrated vector management. © The American Society of Tropical Medicine and Hygiene.
Environmental Determinants of the Distribution of Chagas Disease Vector Triatoma dimidiata in Colombia

PubMed Central

Parra-Henao, Gabriel; Quirós-Gómez, Oscar; Jaramillo-O, Nicolas; Cardona, Ángela Segura

2016-01-01

Triatoma dimidiata (Hemiptera: Reduviidae) is a secondary vector of Trypanosoma cruzi in Colombia and represents an important epidemiological risk mainly in the central and oriental regions of the country where it occupies sylvatic, peridomestic, and intradomestic ecotopes, and because of this complex distribution, its distribution and abundance could be conditioned by environmental factors. In this work, we explored the relationship between T. dimidiata distribution and environmental factors in the northwest, northeast, and central zones of Colombia and developed predictive models of infestation in the country. The associations between the presence of T. dimidiata and environmental variables were studied using logistic regression models and ecological niche modeling for a sample of villages in Colombia. The analysis was based on the information collected in field about the presence of T. dimidiata and the environmental data for each village extracted from remote sensing images. The presence of Triatoma dimidiata (Latreille, 1811) was found to be significantly associated with the maximum vegetation index, minimum land surface temperature (LST), and the digital elevation for the statistical model. Temperature seasonality, annual precipitation, and vegetation index were the variables that most influenced the ecological niche model of T. dimidiata distribution. The logistic regression model showed a good fit and predicted suitable habitats in the Andean and Caribbean regions, which agrees with the known distribution of the species, but predicted suitable habitats in the Pacific and Orinoco regions proposing new areas of research. Improved models to predict suitable habitats for T. dimidiata hold promise for spatial targeting of integrated vector management. PMID:26856910
Sensitivity to gaze-contingent contrast increments in naturalistic movies: An exploratory report and model comparison

PubMed Central

Wallis, Thomas S. A.; Dorr, Michael; Bex, Peter J.

2015-01-01

Sensitivity to luminance contrast is a prerequisite for all but the simplest visual systems. To examine contrast increment detection performance in a way that approximates the natural environmental input of the human visual system, we presented contrast increments gaze-contingently within naturalistic video freely viewed by observers. A band-limited contrast increment was applied to a local region of the video relative to the observer's current gaze point, and the observer made a forced-choice response to the location of the target (≈25,000 trials across five observers). We present exploratory analyses showing that performance improved as a function of the magnitude of the increment and depended on the direction of eye movements relative to the target location, the timing of eye movements relative to target presentation, and the spatiotemporal image structure at the target location. Contrast discrimination performance can be modeled by assuming that the underlying contrast response is an accelerating nonlinearity (arising from a nonlinear transducer or gain control). We implemented one such model and examined the posterior over model parameters, estimated using Markov-chain Monte Carlo methods. The parameters were poorly constrained by our data; parameters constrained using strong priors taken from previous research showed poor cross-validated prediction performance. Atheoretical logistic regression models were better constrained and provided similar prediction performance to the nonlinear transducer model. Finally, we explored the properties of an extended logistic regression that incorporates both eye movement and image content features. Models of contrast transduction may be better constrained by incorporating data from both artificial and natural contrast perception settings. PMID:26057546
LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS

PubMed Central

Almquist, Zack W.; Butts, Carter T.

2015-01-01

Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. Although showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach. PMID:26120218
Disability weights for infectious diseases in four European countries: comparison between countries and across respondent characteristics

PubMed Central

Maertens de Noordhout, Charline; Devleesschauwer, Brecht; Salomon, Joshua A; Turner, Heather; Cassini, Alessandro; Colzani, Edoardo; Speybroeck, Niko; Polinder, Suzanne; Kretzschmar, Mirjam E; Havelaar, Arie H; Haagsma, Juanita A

2018-01-01

Background: In 2015, new disability weights (DWs) for infectious diseases were constructed based on data from four European countries. In this paper, we evaluated whether country, age, sex, disease experience status, income and educational levels have an impact on these DWs. Methods: We analyzed paired comparison responses of the European DW study by participants’ characteristics with separate probit regression models. To evaluate the effect of participants’ characteristics, we performed correlation analyses between countries and within country by respondent characteristics and constructed seven probit regression models, including a null model and six models containing participants’ characteristics. We compared these seven models using Akaike Information Criterion (AIC). Results: According to AIC, the probit model including country as covariate was the best model. We found a lower correlation of the probit coefficients between countries and income levels (range rs: 0.97–0.99, P < 0.01) than between age groups (range rs: 0.98–0.99, P < 0.01), educational level (range rs: 0.98–0.99, P < 0.01), sex (rs = 0.99, P < 0.01) and disease status (rs = 0.99, P < 0.01). Within country the lowest correlations of the probit coefficients were between low and high income level (range rs = 0.89–0.94, P < 0.01). Conclusions: We observed variations in health valuation across countries and within country between income levels. These observations should be further explored in a systematic way, also in non-European countries. We recommend that future research study the effect of other respondent characteristics on health assessment. PMID:29020343
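Model selection by AIC, as used in this abstract, amounts to computing 2k − 2 ln L for each fitted model and keeping the smallest value. The sketch below uses hypothetical log-likelihoods and parameter counts with neutral model names; none of the numbers come from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical (log-likelihood, number of parameters) pairs, for illustration only.
candidates = {
    "null": (-5230.0, 30),
    "plus_covariate_a": (-5150.0, 33),
    "plus_covariate_b": (-5226.0, 32),
}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores[best])
```

Because the penalty term grows with the number of parameters, a covariate is retained only if it improves the log-likelihood by more than one unit per added parameter.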
LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS.

PubMed

Almquist, Zack W; Butts, Carter T

2014-08-01

Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. Although showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach.
Simultaneous determination of estrogens (ethinylestradiol and norgestimate) concentrations in human and bovine serum albumin by use of fluorescence spectroscopy and multivariate regression analysis.

PubMed

Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O

2016-05-15

The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of a steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA, EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in an increase in the emission characteristics of HSA and BSA and a significant blue spectral shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10⁻⁸ M for EE and 2.4×10⁻⁷ M for NOR and good linearity (R² > 0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological conditions. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions.
High accuracy, low detection limits, simplicity, and low cost, with no prior analyte extraction or separation required, make this method a promising, compelling, and attractive alternative for the rapid determination of estrogen concentrations in biomedical and biological specimens, pharmaceuticals, or environmental samples. Published by Elsevier B.V.
A regression tree for identifying combinations of fall risk factors associated to recurrent falling: a cross-sectional elderly population-based study.

PubMed

Kabeshova, A; Annweiler, C; Fantino, B; Philip, T; Gromov, V A; Launay, C P; Beauchet, O

2014-06-01

Regression tree (RT) analyses are particularly well suited to exploring the risk of recurrent falling according to various combinations of fall risk factors compared to logistic regression models. The aims of this study were (1) to determine which combinations of fall risk factors were associated with the occurrence of recurrent falls in older community-dwellers, and (2) to compare the efficacy of RT and multiple logistic regression model for the identification of recurrent falls. A total of 1,760 community-dwelling volunteers (mean age ± standard deviation, 71.0 ± 5.1 years; 49.4 % female) were recruited prospectively in this cross-sectional study. Age, gender, polypharmacy, use of psychoactive drugs, fear of falling (FOF), cognitive disorders and sad mood were recorded. In addition, the history of falls within the past year was recorded using a standardized questionnaire. Among 1,760 participants, 19.7 % (n = 346) were recurrent fallers. The RT identified 14 node groups and 8 end nodes with FOF as the first major split. Among participants with FOF, those who had sad mood and polypharmacy formed the end node with the greatest OR for recurrent falls (OR = 6.06 with p < 0.001). Among participants without FOF, those who were male and not sad had the lowest OR for recurrent falls (OR = 0.25 with p < 0.001). The RT correctly classified 1,356 of 1,414 non-recurrent fallers (specificity = 95.6 %), and 65 of 346 recurrent fallers (sensitivity = 18.8 %). The overall classification accuracy was 81.0 %. The multiple logistic regression correctly classified 1,372 of 1,414 non-recurrent fallers (specificity = 97.0 %), and 61 of 346 recurrent fallers (sensitivity = 17.6 %). The overall classification accuracy was 81.4 %. Our results show that RT may identify specific combinations of risk factors for recurrent falls, the combination most associated with recurrent falls involving FOF, sad mood and polypharmacy.
The FOF emerged as the risk factor strongly associated with recurrent falls. In addition, RT and multiple logistic regression were not sensitive enough to identify the majority of recurrent fallers but appeared efficient in detecting individuals not at risk of recurrent falls.
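The specificity, sensitivity and accuracy figures in this abstract follow directly from the classification counts it quotes. The sketch below recomputes them from those counts; the function name is hypothetical, and the recomputed values may not match the quoted, rounded percentages exactly.

```python
def classification_rates(true_neg, n_neg, true_pos, n_pos):
    """Return (specificity, sensitivity, accuracy) as fractions."""
    specificity = true_neg / n_neg
    sensitivity = true_pos / n_pos
    accuracy = (true_neg + true_pos) / (n_neg + n_pos)
    return specificity, sensitivity, accuracy

# Counts quoted in the abstract: 1,414 non-recurrent and 346 recurrent fallers.
reported = {
    "regression tree": (1356, 1414, 65, 346),
    "logistic regression": (1372, 1414, 61, 346),
}
for name, counts in reported.items():
    spec, sens, acc = classification_rates(*counts)
    print(f"{name}: specificity={spec:.1%}, sensitivity={sens:.1%}, accuracy={acc:.1%}")
```

With only about one in five participants being recurrent fallers, a high overall accuracy can coexist with low sensitivity, which is the pattern the abstract reports for both methods.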
Can We Use Regression Modeling to Quantify Mean Annual Streamflow at a Global-Scale?

NASA Astrophysics Data System (ADS)

Barbarossa, V.; Huijbregts, M. A. J.; Hendriks, J. A.; Beusen, A.; Clavreul, J.; King, H.; Schipper, A.

2016-12-01

Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF using observations of discharge and catchment characteristics from 1,885 catchments worldwide, ranging from 2 to 10⁶ km² in size. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB [van Beek et al., 2011] by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area, mean annual precipitation and air temperature, average slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error values were lower (0.29 - 0.38 compared to 0.49 - 0.57) and the modified index of agreement was higher (0.80 - 0.83 compared to 0.72 - 0.75). Our regression model can be applied globally at any point of the river network, provided that the input parameters are within the range of values employed in the calibration of the model. The performance is reduced for water scarce regions and further research should focus on improving such an aspect for regression-based global hydrological models.
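The two skill scores quoted in this abstract can be computed as below. The sketch assumes the modified index of agreement is Willmott's d1, which uses absolute rather than squared differences; whether the authors applied the scores to transformed flows is not stated here, and the sample values are made up.

```python
import math

def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def modified_index_of_agreement(obs, pred):
    """Willmott's modified index d1 = 1 - sum|O-P| / sum(|P-Obar| + |O-Obar|)."""
    obar = sum(obs) / len(obs)
    num = sum(abs(o - p) for o, p in zip(obs, pred))
    den = sum(abs(p - obar) + abs(o - obar) for o, p in zip(obs, pred))
    return 1.0 - num / den

# Made-up observed and predicted values, for illustration only.
observed = [1.2, 2.5, 3.1, 0.8, 4.0]
predicted = [1.0, 2.9, 2.8, 1.1, 3.6]
print(rmse(observed, predicted), modified_index_of_agreement(observed, predicted))
```

A lower RMSE and a d1 closer to 1 both indicate better agreement, which is the direction of the comparison reported between the regression model and PCR-GLOBWB.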
Neurophysiological correlates of depressive symptoms in young adults: A quantitative EEG study.

PubMed

Lee, Poh Foong; Kan, Donica Pei Xin; Croarkin, Paul; Phang, Cheng Kar; Doruk, Deniz

2018-01-01

There is an unmet need for practical and reliable biomarkers for mood disorders in young adults. Identifying the brain activity associated with the early signs of depressive disorders could have important diagnostic and therapeutic implications. In this study we sought to investigate the EEG characteristics in young adults with newly identified depressive symptoms. Based on the initial screening, a total of 100 participants (n = 50 euthymic, n = 50 depressive) underwent 32-channel EEG acquisition. Simple logistic regression and C-statistic were used to explore whether EEG power could be used to discriminate between the groups. The strongest EEG predictors of mood were identified using multivariate logistic regression models. Simple logistic regression analysis with subsequent C-statistics revealed that only high-alpha and beta power originating from the left central cortex (C3) have a reliable discriminative value (area under the ROC curve > 0.7 (70%)) for differentiating the depressive group from the euthymic group. Multivariate regression analysis showed that the single most significant predictor of group (depressive vs. euthymic) is the high-alpha power over C3 (p = 0.03). The present findings suggest that EEG is a useful tool in the identification of neurophysiological correlates of depressive symptoms in young adults with no previous psychiatric history. Our results could guide future studies investigating the early neurophysiological changes and surrogate outcomes in depression. Copyright © 2017 Elsevier Ltd. All rights reserved.
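The C-statistic used in this abstract equals the area under the ROC curve: the probability that a randomly chosen member of one group receives a higher predicted score than a randomly chosen member of the other. A minimal pairwise sketch, with hypothetical scores standing in for the predicted probabilities of a fitted logistic regression:

```python
def c_statistic(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, which makes this equal to the area under
    the ROC curve.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted probabilities, for illustration only.
depressive = [0.81, 0.64, 0.58, 0.47, 0.72]
euthymic = [0.35, 0.52, 0.28, 0.61, 0.40]
print(c_statistic(depressive, euthymic))
```

A value of 0.5 corresponds to chance-level discrimination, and the abstract treats values above 0.7 as reliable.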
Comparison of transgressive and regressive clastic reservoirs, late Albian Viking Formation, Alberta basin

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reinson, G.E.

1996-06-01

Detailed stratigraphic analysis of hydrocarbon reservoirs from the Basal Colorado upwards through the Viking/Bow Island and Cardium formations indicates that the distributional trends, overall size and geometry, internal heterogeneity, and hydrocarbon productivity of the sand bodies are related directly to a transgressive-regressive (T-R) sequence stratigraphic model. The Viking Formation (equivalent to the Muddy Sandstone of Wyoming) contains examples of both transgressive and regressive reservoirs. Viking reservoirs can be divided into progradational shoreface bars associated with the regressive systems tract, and bar/sheet sands and estuary/channel deposits associated with the transgressive systems tract. Shoreface bars, usually consisting of fine- to medium-grained sandstones, are tens of kilometers long, kilometers in width, and in the order of five to ten meters thick. Transgressive bar and sheet sandstones range from coarse-grained to conglomeratic, and occur in deposits that are tens of kilometers long, several kilometers wide, and from less than one to four meters in thickness. Estuary and valley-fill reservoir sandstones vary from fine-grained to conglomeratic, occur as isolated bodies that have channel-like geometries, and are usually greater than 10 meters thick. From an exploration viewpoint the most prospective reservoir trends in the Viking Formation are those associated with transgressive systems tracts. In particular, bounding discontinuities between T-R systems tracts are the principal sites of the most productive hydrocarbon-bearing sandstones.
Developing a predictive tropospheric ozone model for Tabriz

NASA Astrophysics Data System (ADS)

Khatibi, Rahman; Naghipour, Leila; Ghorbani, Mohammad A.; Smith, Michael S.; Karimi, Vahid; Farhoudi, Reza; Delafrouz, Hadi; Arvanaghi, Hadi

2013-04-01

Predictive ozone models are becoming indispensable tools by providing a capability for pollution alerts to serve people who are vulnerable to the risks. We have developed a tropospheric ozone prediction capability for Tabriz, Iran, by using the following five modeling strategies: three regression-type methods: Multiple Linear Regression (MLR), Artificial Neural Networks (ANNs), and Gene Expression Programming (GEP); and two auto-regression-type models: Nonlinear Local Prediction (NLP) to implement chaos theory and Auto-Regressive Integrated Moving Average (ARIMA) models. The regression-type modeling strategies explain the data in terms of: temperature, solar radiation, dew point temperature, and wind speed, by regressing present ozone values to their past values. The ozone time series are available at various time intervals, including hourly intervals, from August 2010 to March 2011. The results for MLR, ANN and GEP models are not overly good but those produced by NLP and ARIMA are promising for establishing a forecasting capability.
Temporal Drivers of Liking Based on Functional Data Analysis and Non-Additive Models for Multi-Attribute Time-Intensity Data of Fruit Chews.

PubMed

Kuesten, Carla; Bi, Jian

2018-06-03

Conventional drivers of liking analysis was extended with a time dimension into temporal drivers of liking (TDOL) based on functional data analysis methodology and non-additive models for multiple-attribute time-intensity (MATI) data. The non-additive models, which consider both direct effects and interaction effects of attributes to consumer overall liking, include Choquet integral and fuzzy measure in the multi-criteria decision-making, and linear regression based on variance decomposition. Dynamics of TDOL, i.e., the derivatives of the relative importance functional curves were also explored. Well-established R packages 'fda', 'kappalab' and 'relaimpo' were used in the paper for developing TDOL. Applied use of these methods shows that the relative importance of MATI curves offers insights for understanding the temporal aspects of consumer liking for fruit chews.

Unitary Response Regression Models

ERIC Educational Resources Information Center

Lipovetsky, S.

2007-01-01

The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Results of the 2012 AORN salary and compensation survey.

PubMed

Bacon, Donald R

2012-12-01

AORN conducted its 10th annual compensation survey for perioperative nurses in June 2012. A multiple regression model was used to examine how a number of variables, including job title, education level, certification, experience, and geographic region, affect nurse compensation. Comparisons between the 2012 data and previous years' data are presented. The effects of other forms of compensation, such as on-call compensation, overtime, bonuses, and shift differentials on base compensation rates, also are examined. Additional analyses explore the effect of the current economic downturn on the perioperative work environment. Copyright © 2012 AORN, Inc. Published by Elsevier Inc. All rights reserved.
Results of the 2016 AORN Salary and Compensation Survey.

PubMed

Bacon, Donald R; Stewart, Kim A

2016-12-01

AORN conducted its 14th annual compensation survey for perioperative nurses in June 2016. A multiple regression model was used to examine how several variables, including job title, education level, certification, experience, and geographic region, affect nurse compensation. Comparisons between the 2016 data and data from previous years are presented. The effects of other forms of compensation (eg, on-call compensation, overtime, bonuses, shift differentials, benefits) on base compensation rates also are examined. Additional analyses explore the effect of the economic downturn on the perioperative work environment. Copyright © 2016 AORN, Inc. Published by Elsevier Inc. All rights reserved.
Results of the 2013 AORN Salary and Compensation Survey.

PubMed

Bacon, Donald R; Stewart, Kim A

2013-12-01

AORN conducted its 11th annual compensation survey for perioperative nurses in June 2013. A multiple regression model was used to examine how a number of variables, including job title, education level, certification, experience, and geographic region affect nurse compensation. Comparisons among the 2013 data and previous years' data are presented. The effects of other forms of compensation, such as on-call compensation, overtime, bonuses, and shift differentials on base compensation rates are also examined. Additional analyses explore the effect of the current economic downturn on the perioperative work environment. Copyright © 2013 AORN, Inc. Published by Elsevier Inc. All rights reserved.
Results of the 2017 AORN Salary and Compensation Survey.

PubMed

Bacon, Donald R; Stewart, Kim A

2017-12-01

AORN conducted its 15th annual compensation survey for perioperative nurses in June 2017. A multiple regression model was used to examine how several variables, including job title, educational level, certification, experience, and geographic region, affect nurse compensation. Comparisons between the 2017 data and data from previous years are presented. The effects of other forms of compensation (eg, on-call compensation, overtime, bonuses, shift differentials, benefits) on base compensation rates are examined. Additional analyses explore the current state of the nursing shortage and the sources of job satisfaction and dissatisfaction. Copyright © 2017 AORN, Inc. Published by Elsevier Inc. All rights reserved.
Results of the 2010 AORN Salary and Compensation Survey.

PubMed

Bacon, Donald

2010-12-01

AORN conducted its eighth annual compensation survey for perioperative nurses in June and July 2010. A multiple regression model was used to examine how a number of variables, including job title, education level, certification, experience, and geographic region, affect nurse compensation. Comparisons between the 2010 data and data from previous years are presented. The effects of other forms of compensation, such as on-call compensation, overtime, bonuses, and shift differentials, on base compensation rates are also examined. Additional analyses explore the effect of the current economic downturn on the perioperative work environment. Published by Elsevier Inc. All rights reserved.
Modeling when and where a secondary accident occurs.

PubMed

Wang, Junhua; Liu, Boya; Fu, Ting; Liu, Shuo; Stipancic, Joshua

2018-01-31

The occurrence of secondary accidents leads to traffic congestion and road safety issues. Secondary accident prevention has become a major consideration in traffic incident management. This paper investigates the location and time of a potential secondary accident after the occurrence of an initial traffic accident. With accident data and traffic loop data collected over three years from California interstate freeways, a shock wave-based method was introduced to identify secondary accidents. A linear regression model and two machine learning algorithms, including a back-propagation neural network (BPNN) and a least squares support vector machine (LSSVM), were implemented to explore the distance and time gap between the initial and secondary accidents using inputs of crash severity, violation category, weather condition, tow away, road surface condition, lighting, parties involved, traffic volume, duration, and shock wave speed generated by the primary accident. From the results, the linear regression model was inadequate in describing the effect of most variables and its goodness-of-fit and accuracy in prediction was relatively poor. In the training programs, the BPNN and LSSVM demonstrated adequate goodness-of-fit, though the BPNN was superior with a higher CORR and lower MSE. The BPNN model also outperformed the LSSVM in time prediction, while both failed to provide adequate distance prediction. Therefore, the BPNN model could be used to forecast the time gap between initial and secondary accidents, which could be used by decision makers and incident management agencies to prevent or reduce secondary collisions. Copyright © 2018 Elsevier Ltd. All rights reserved.
Evaluating the spatial variation of total mercury in young-of-year yellow perch (Perca flavescens), surface water and upland soil for watershed-lake systems within the southern Boreal Shield.

PubMed

Gabriel, Mark C; Kolka, Randy; Wickman, Trent; Nater, Ed; Woodruff, Laurel

2009-06-15

The primary objective of this research is to investigate relationships between mercury in upland soil, lake water and fish tissue and explore the cause for the observed spatial variation of THg in age one yellow perch (Perca flavescens) for ten lakes within the Superior National Forest. Spatial relationships between yellow perch THg tissue concentration and a total of 45 watershed and water chemistry parameters were evaluated for two separate years: 2005 and 2006. Results show agreement with other studies where watershed area, lake water pH, nutrient levels (specifically dissolved NO3⁻-N) and dissolved iron are important factors controlling and/or predicting fish THg level. Exceeding all was the strong dependence of yellow perch THg level on soil A-horizon THg and, in particular, soil O-horizon THg concentrations (Spearman rho = 0.81). Soil B-horizon THg concentration was significantly correlated (Pearson r = 0.75) with lake water THg concentration. Lakes surrounded by a greater percentage of shrub wetlands (peatlands) had higher fish tissue THg levels, thus it is highly possible that these wetlands are main locations for mercury methylation. Stepwise regression was used to develop empirical models for the purpose of predicting the spatial variation in yellow perch THg over the studied region. The 2005 regression model demonstrates it is possible to obtain good prediction (up to 60% variance description) of resident yellow perch THg level using upland soil O-horizon THg as the only independent variable. The 2006 model shows even greater prediction (r² = 0.73, with an overall 10 ng/g [tissue, wet weight] margin of error), using lake water dissolved iron and watershed area as the only model independent variables. The developed regression models in this study can help with interpreting THg concentrations in low trophic level fish species for untested lakes of the greater Superior National Forest and surrounding Boreal ecosystem.
Modelling infant mortality rate in Central Java, Indonesia use generalized poisson regression method

NASA Astrophysics Data System (ADS)

Prahutama, Alan; Sudarno

2018-05-01

The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of the given geographical area during the same year. This problem needs to be addressed because it is an important element of a country’s economic development. A high infant mortality rate will disrupt the stability of a country as it relates to the sustainability of the population in the country. One regression model that can be used to analyze the relationship between a discrete dependent variable Y and an independent variable X is the Poisson regression model. Regression models recently used for data with a discrete dependent variable include, among others, Poisson regression, negative binomial regression and generalized Poisson regression. In this research, generalized Poisson regression modeling gives a better AIC value than Poisson regression. The most significant variable is the number of health facilities (X1), while the variable with the most influence on the infant mortality rate is the average breastfeeding (X9).
Marital status integration and suicide: A meta-analysis and meta-regression.

PubMed

Kyung-Sook, Woo; SangSoo, Shin; Sangjin, Shin; Young-Jeon, Shin

2018-01-01

Marital status is an index of the phenomenon of social integration within social structures and has long been identified as an important predictor of suicide. However, previous meta-analyses have focused only on a particular marital status, or have not sufficiently explored moderators. A meta-analysis of observational studies was conducted to explore the relationships between marital status and suicide and to understand the important moderating factors in this association. Electronic databases were searched to identify studies conducted between January 1, 2000 and June 30, 2016. We performed a meta-analysis, subgroup analysis, and meta-regression of 170 suicide risk estimates from 36 publications. Using a random effects model with adjustment for covariates, the study found that the suicide risk for non-married versus married was OR = 1.92 (95% CI: 1.75-2.12). The suicide risk was higher for non-married individuals aged <65 years than for those aged ≥65 years, and higher for men than for women. According to the results of stratified analysis by gender, non-married men exhibited a greater risk of suicide than their married counterparts in all sub-analyses, but women aged 65 years or older showed no significant association between marital status and suicide. The suicide risk in divorced individuals was higher than for non-married individuals in both men and women. The meta-regression showed that gender, age, and sample size affected between-study variation. The results of the study indicated that non-married individuals have an aggregate higher suicide risk than married ones. In addition, gender and age were confirmed as important moderating factors in the relationship between marital status and suicide. Copyright © 2017 Elsevier Ltd. All rights reserved.
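A pooled odds ratio from a random effects model, as reported in this abstract, can be illustrated with a short sketch. The abstract does not name the estimator, so the DerSimonian-Laird method is assumed here purely for illustration. The covariate adjustment mentioned in the abstract is not reproduced, and the study-level log odds ratios and variances are hypothetical.

```python
import math

def pool_random_effects(log_ors, variances):
    """DerSimonian-Laird random effects pooling (assumed estimator).

    Returns (pooled log OR, standard error, between-study variance tau^2).
    """
    k = len(log_ors)
    w = [1.0 / v for v in variances]           # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical study estimates: odds ratios and variances of the log OR.
log_ors = [math.log(1.5), math.log(2.0), math.log(2.5)]
variances = [0.04, 0.02, 0.05]

pooled, se, tau2 = pool_random_effects(log_ors, variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(math.exp(pooled), math.exp(low), math.exp(high))
```

Pooling is done on the log scale and exponentiated at the end, which is why a confidence interval such as 1.75-2.12 is not symmetric around the odds ratio itself.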
An NCME Instructional Module on Data Mining Methods for Classification and Regression

ERIC Educational Resources Information Center

Sinharay, Sandip

2016-01-01

Data mining methods for classification and regression are becoming increasingly popular in various scientific fields. However, these methods have not been explored much in educational measurement. This module first provides a review, which should be accessible to a wide audience in education measurement, of some of these methods. The module then…
Progressive and Regressive Aspects of Information Technology in Society: A Third Sector Perspective

ERIC Educational Resources Information Center

Miller, Kandace R.

2009-01-01

This dissertation explores the impact of information technology on progressive and regressive values in society from the perspective of one international foundation and four of its technology-related programs. Through a critical interpretive approach employing an instrumental multiple-case method, a framework to help explain the influence of…
Deriving the Regression Line with Algebra

ERIC Educational Resources Information Center

Quintanilla, John A.

2017-01-01

Exploration with spreadsheets and reliance on previous skills can lead students to determine the line of best fit. To perform linear regression on a set of data, students in Algebra 2 (or, in principle, Algebra 1) do not have to settle for using the mysterious "black box" of their graphing calculators (or other classroom technologies).…
Variables Associated with Communicative Participation in People with Multiple Sclerosis: A Regression Analysis

ERIC Educational Resources Information Center

Baylor, Carolyn; Yorkston, Kathryn; Bamer, Alyssa; Britton, Deanna; Amtmann, Dagmar

2010-01-01

Purpose: To explore variables associated with self-reported communicative participation in a sample (n = 498) of community-dwelling adults with multiple sclerosis (MS). Method: A battery of questionnaires was administered online or on paper per participant preference. Data were analyzed using multiple linear backward stepwise regression. The…
[From clinical judgment to linear regression model.]

PubMed

Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

2013-01-01

When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and is equivalent to the value of "Y" when "x" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of independent variables in the outcome.
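The quantities described in this abstract (intercept a, slope b, and the coefficient of determination) can be computed directly from the least-squares formulas. The following is a minimal sketch using only the Python standard library. The function name fit_line and the data are hypothetical and chosen so that the fitted line is exactly Y = 1 + 2x.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                  # slope: change in Y per unit change in x
    a = mean_y - b * mean_x        # intercept: value of Y when x equals 0
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return a, b, r_squared

# Hypothetical data lying exactly on the line Y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b, r_squared = fit_line(xs, ys)
print(a, b, r_squared)
```

With real clinical data the points scatter around the line, and the coefficient of determination falls below 1 in proportion to the unexplained variance.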
Impact of multicollinearity on small sample hydrologic regression models

NASA Astrophysics Data System (ADS)

Kroll, Charles N.; Song, Peter

2013-06-01

Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, it is recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
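The variance inflation factor mentioned in this abstract has a simple closed form when there are only two explanatory variables: VIF = 1 / (1 - r^2), where r is their Pearson correlation. The sketch below illustrates that special case with hypothetical data. It is not the screening procedure used in the study, and the general case requires regressing each predictor on all the others.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical, strongly correlated explanatory variables.
area = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
precip = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

# Hypothetical uncorrelated pair for comparison.
u1 = [1.0, 2.0, 3.0, 4.0]
u2 = [1.0, -1.0, -1.0, 1.0]

print(vif_two_predictors(area, precip))  # large: variances are inflated
print(vif_two_predictors(u1, u2))        # no inflation
```

A VIF of 1 means no inflation. A common rule of thumb treats values above about 10 as a sign of problematic multicollinearity, although the threshold used in the study is not stated in the abstract.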
Meteorological influence on predicting surface SO2 concentration from satellite remote sensing in Shanghai, China.

PubMed

Xue, Dan; Yin, Jingyuan

2014-05-01

In this study, we explored the potential applications of the Ozone Monitoring Instrument (OMI) satellite sensor in air pollution research. The OMI planetary boundary layer sulfur dioxide (SO2_PBL) column density and daily average surface SO2 concentration of Shanghai from 2004 to 2012 were analyzed. After several consecutive years of increase, the surface SO2 concentration finally declined in 2007. It was higher in winter than in other seasons. The coefficient between daily average surface SO2 concentration and SO2_PBL was only 0.316. But SO2_PBL was found to be a highly significant predictor of the surface SO2 concentration using the simple regression model. Five meteorological factors were considered in this study; among them, temperature, dew point, relative humidity, and wind speed were negatively correlated with surface SO2 concentration, while pressure was positively correlated. Furthermore, it was found that dew point was a more effective predictor than temperature. When these meteorological factors were used in multiple regression, the determination coefficient reached 0.379. The relationship of the surface SO2 concentration and meteorological factors was seasonally dependent. In summer and autumn, the regression model performed better than in spring and winter. The surface SO2 concentration predicting method proposed in this study can be easily adapted for other regions, and is especially useful for those having no operational air pollution forecasting services or having sparse ground monitoring networks.
Gene-environment interaction between adiponectin gene polymorphisms and environmental factors on the risk of diabetic retinopathy.

PubMed

Li, Yuan; Wu, Qun Hong; Jiao, Ming Li; Fan, Xiao Hong; Hu, Quan; Hao, Yan Hua; Liu, Ruo Hong; Zhang, Wei; Cui, Yu; Han, Li Yuan

2015-01-01

To evaluate whether the adiponectin gene is associated with diabetic retinopathy (DR) risk and interaction with environmental factors modifies the DR risk, and to investigate the relationship between serum adiponectin levels and DR. Four adiponectin polymorphisms were evaluated in 372 DR cases and 145 controls. Differences in environmental factors between cases and controls were evaluated by unconditional logistic regression analysis. The model-free multifactor dimensionality reduction method and traditional multiple regression models were applied to explore interactions between the polymorphisms and environmental factors. Using the Bonferroni method, we found no significant associations between four adiponectin polymorphisms and DR susceptibility. Multivariate logistic regression found that physical activity played a protective role in the progress of DR, whereas family history of diabetes (odds ratio 1.75) and insulin therapy (odds ratio 1.78) were associated with an increased risk for DR. The interaction between the C-11377 G (rs266729) polymorphism and insulin therapy might be associated with DR risk. Family history of diabetes combined with insulin therapy also increased the risk of DR. No adiponectin gene polymorphisms influenced the serum adiponectin levels. Serum adiponectin levels did not differ between the DR group and non-DR group. No significant association was identified between four adiponectin polymorphisms and DR susceptibility after stringent Bonferroni correction. The interaction between C-11377G (rs266729) polymorphism and insulin therapy, as well as the interaction between family history of diabetes and insulin therapy, might be associated with DR susceptibility.
[Trend in mortality from external causes in pregnant and postpartum women and its relationship to socioeconomic factors in Colombia, 1998-2010].

PubMed

Salazar, Edwin; Buitrago, Carolina; Molina, Federico; Alzate, Catalina Arango

2015-05-01

Determine the trend in mortality from external causes in pregnant and postpartum women and its relationship to socioeconomic factors. Descriptive study, based on the official registries of deaths reported by the National Statistics Agency, 1998-2010. The trend was analyzed using Poisson regressions. Bivariate correlations and multiple linear regression models were constructed to explore the relationship between mortality and socioeconomic factors: human development index, Gini index, gross domestic product, unsatisfied basic needs, unemployment rate, poverty, extreme poverty, quality of life index, illiteracy rate, and percentage of affiliation to the Social Security System. A total of 2 223 female deaths from external causes were recorded, of which 1 429 occurred during pregnancy and 794 in the postpartum period. The gross mortality rate dropped from 30.7 per 100 000 live births plus fetal deaths in 1998 to 16.7 in 2010. A downward curve with no significant inflection points was shown in the risk of dying from this cause. The multiple linear regression model showed a correlation between mortality and extreme poverty and the illiteracy rate, suggesting that these indicators could explain 89.4% of the change in mortality from external causes in pregnant and postpartum women each year in Colombia. Mortality from external causes in pregnant and postpartum women showed a significant downward trend that may be explained by important socioeconomic changes in the country, including a decrease in extreme poverty and in the illiteracy rate.
Linkage effects between deposit discovery and postdiscovery exploratory drilling

USGS Publications Warehouse

Drew, Lawrence J.

1975-01-01

For the 1950-71 period of petroleum exploration in the Powder River Basin, northeastern Wyoming and southeastern Montana, three specific topics were investigated. First, the wildcat wells drilled during the ambient phases of exploration are estimated to have discovered 2.80 times as much petroleum per well as the wildcat wells drilled during the cyclical phases of exploration, periods when exploration plays were active. Second, the hypothesis was tested and verified that during ambient phases of exploration the discovery of deposits could be anticipated by a small but statistically significant rise in the ambient drilling rate during the year prior to the year of discovery. Closer examination of the data suggests that this anticipation effect decreases through time. Third, a regression model utilizing the two independent variables of (1) the volume of petroleum contained in each deposit discovered in a cell and the directly adjacent cells and (2) the respective depths of these deposits was constructed to predict the expected yearly cyclical wildcat drilling rate in four 30 by 30 min (approximately 860 mi²) sized cells. In two of these cells relatively large volumes of petroleum were discovered, whereas in the other two cells smaller volumes were discovered. The predicted and actual rates of wildcat drilling which occurred in each cell agreed rather closely.

Novice drivers' risky driving behavior, risk perception, and crash risk: findings from the DRIVE study.

PubMed

Ivers, Rebecca; Senserrick, Teresa; Boufous, Soufiane; Stevenson, Mark; Chen, Huei-Yang; Woodward, Mark; Norton, Robyn

2009-09-01

We explored the risky driving behaviors and risk perceptions of a cohort of young novice drivers and sought to determine their associations with crash risk. Provisional drivers aged 17 to 24 (n = 20 822) completed a detailed questionnaire that included measures of risk perception and behaviors; 2 years following recruitment, survey data were linked to licensing and police-reported crash data. Poisson regression models that adjusted for multiple confounders were created to explore crash risk. High scores on questionnaire items for risky driving were associated with a 50% increased crash risk (adjusted relative risk = 1.51; 95% confidence interval = 1.25, 1.81). High scores for risk perception (poorer perceptions of safety) were also associated with increased crash risk in univariate and multivariate models; however, significance was not sustained after adjustment for risky driving. The overrepresentation of youths in crashes involving casualties is a significant public health issue. Risky driving behavior is strongly linked to crash risk among young drivers and overrides the importance of risk perceptions. Systemwide intervention, including licensing reform, is warranted.
Exploring the impact of mentoring functions on job satisfaction and organizational commitment of new staff nurses

PubMed Central

2010-01-01

Background Although previous studies proved that implementing a mentoring program is beneficial for enhancing nursing skills and attitudes, few researchers have explored the impact of mentoring functions on job satisfaction and organizational commitment of new nurses. In this research we aimed at examining the effects of mentoring functions on the job satisfaction and organizational commitment of new nurses in Taiwan's hospitals. Methods We employed self-administered questionnaires to collect research data and selected new nurses from three regional hospitals in Taiwan as the sample. In all, 306 nurse samples were obtained. We adopted a multiple regression analysis to test the impact of the mentoring functions. Results Results revealed that career development and role modeling functions have positive effects on the job satisfaction and organizational commitment of new nurses; however, the psychosocial support function was incapable of providing adequate explanation for these work outcomes. Conclusion It is suggested in this study that nurse managers should improve the career development and role modeling functions of mentoring in order to enhance the job satisfaction and organizational commitment of new nurses. PMID:20712873
Exploring the relationship between stride, stature and hand size for forensic assessment.

PubMed

Guest, Richard; Miguel-Hurtado, Oscar; Stevenage, Sarah; Black, Sue

2017-11-01

Forensic evidence often relies on a combination of accurately recorded measurements, estimated measurements from landmark data such as a subject's stature given a known measurement within an image, and inferred data. In this study a novel dataset is used to explore linkages between hand measurements, stature, leg length and stride. These three measurements replicate the type of evidence found in surveillance videos with stride being extracted from an automated gait analysis system. Through correlations and regression modelling, it is possible to generate accurate predictions of stature from hand size, leg length and stride length (and vice versa), and to predict leg and stride length from hand size with, or without, stature as an intermediary variable. The study also shows improved accuracy when a subject's sex is known a-priori. Our method and models indicate the possibility of calculating or checking relationships between a suspect's physical measurements, particularly when only one component is captured as an accurately recorded measurement. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
School-Based Racial and Gender Discrimination among African American Adolescents: Exploring Gender Variation in Frequency and Implications for Adjustment

PubMed Central

Chavous, Tabbye M.; Griffin, Tiffany M.

2012-01-01

The present study examined school-based racial and gender discrimination experiences among African American adolescents in Grade 8 (n = 204 girls; n = 209 boys). A primary goal was exploring gender variation in frequency of both types of discrimination and associations of discrimination with academic and psychological functioning among girls and boys. Girls and boys did not vary in reported racial discrimination frequency, but boys reported more gender discrimination experiences. Multiple regression analyses within gender groups indicated that among girls and boys, racial discrimination and gender discrimination predicted higher depressive symptoms and school importance and racial discrimination predicted self-esteem. Racial and gender discrimination were also negatively associated with grade point average among boys but were not significantly associated in girls’ analyses. Significant gender discrimination X racial discrimination interactions resulted in the girls’ models predicting psychological outcomes and in boys’ models predicting academic achievement. Taken together, findings suggest the importance of considering gender- and race-related experiences in understanding academic and psychological adjustment among African American adolescents. PMID:22837794
Understanding the Role of the Professional Practice Environment on Quality of Care in Magnet® and Non-Magnet Hospitals

PubMed Central

Stimpfel, Amy Witkoski; Rosen, Jennifer E.; McHugh, Matthew D.

2017-01-01

OBJECTIVE The aim of this study was to explore the relationship between Magnet Recognition® and nurse-reported quality of care. BACKGROUND Magnet® hospitals are recognized for nursing excellence and quality patient outcomes; however, few studies have explored contributing factors for these superior outcomes. METHODS This was a secondary analysis of linked nurse survey data, hospital administrative data, and a listing of American Nurses Credentialing Center Magnet hospitals. Multivariate regressions were modeled before and after propensity score matching to assess the relationship between Magnet status and quality of care. A mediation model assessed the indirect effect of the professional practice environment on quality of care. RESULTS Nurse-reported quality of care was significantly associated with Magnet Recognition after matching. The professional practice environment mediates the relationship between Magnet status and quality of care. CONCLUSION A prominent feature of Magnet hospitals, a professional practice environment that is supportive of nursing, plays a role in explaining why Magnet hospitals have better nurse-reported quality of care. PMID:26426138
Understanding the Role of the Professional Practice Environment on Quality of Care in Magnet® and Non-Magnet Hospitals

PubMed Central

Stimpfel, Amy Witkoski; Rosen, Jennifer E.; McHugh, Matthew D.

2014-01-01

OBJECTIVE The aim of this study was to explore the relationship between Magnet Recognition® and nurse-reported quality of care. BACKGROUND Magnet® hospitals are recognized for nursing excellence and quality patient outcomes; however, few studies have explored contributing factors for these superior outcomes. METHODS This was a secondary analysis of linked nurse survey data, hospital administrative data, and a listing of American Nurses Credentialing Center Magnet hospitals. Multivariate regressions were modeled before and after propensity score matching to assess the relationship between Magnet status and quality of care. A mediation model assessed the indirect effect of the professional practice environment on quality of care. RESULTS Nurse-reported quality of care was significantly associated with Magnet Recognition after matching. The professional practice environment mediates the relationship between Magnet status and quality of care. CONCLUSION A prominent feature of Magnet hospitals, a professional practice environment that is supportive of nursing, plays a role in explaining why Magnet hospitals have better nurse-reported quality of care. PMID:24316613
The evaluation of rainfall influence on combined sewer overflows characteristics: the Berlin case study.

PubMed

Sandoval, S; Torres, A; Pawlowsky-Reusing, E; Riechel, M; Caradot, N

2013-01-01

The present study aims to explore the relationship between rainfall variables and water quality/quantity characteristics of combined sewer overflows (CSOs), by the use of multivariate statistical methods and online measurements at a principal CSO outlet in Berlin (Germany). Canonical correlation results showed that the maximum and average rainfall intensities are the most influential variables to describe CSO water quantity and pollutant loads whereas the duration of the rainfall event and the rain depth seem to be the most influential variables to describe CSO pollutant concentrations. The analysis of partial least squares (PLS) regression models confirms the findings of the canonical correlation and highlights three main influences of rainfall on CSO characteristics: (i) CSO water quantity characteristics are mainly influenced by the maximal rainfall intensities, (ii) CSO pollutant concentrations were found to be mostly associated with duration of the rainfall and (iii) pollutant loads seemed to be principally influenced by dry weather duration before the rainfall event. The prediction quality of PLS models is rather low (R² < 0.6) but results can be useful to explore qualitatively the influence of rainfall on CSO characteristics.
School-Based Racial and Gender Discrimination among African American Adolescents: Exploring Gender Variation in Frequency and Implications for Adjustment.

PubMed

Cogburn, Courtney D; Chavous, Tabbye M; Griffin, Tiffany M

2011-01-03

The present study examined school-based racial and gender discrimination experiences among African American adolescents in Grade 8 (n = 204 girls; n = 209 boys). A primary goal was exploring gender variation in frequency of both types of discrimination and associations of discrimination with academic and psychological functioning among girls and boys. Girls and boys did not vary in reported racial discrimination frequency, but boys reported more gender discrimination experiences. Multiple regression analyses within gender groups indicated that among girls and boys, racial discrimination and gender discrimination predicted higher depressive symptoms and school importance and racial discrimination predicted self-esteem. Racial and gender discrimination were also negatively associated with grade point average among boys but were not significantly associated in girls' analyses. Significant gender discrimination X racial discrimination interactions resulted in the girls' models predicting psychological outcomes and in boys' models predicting academic achievement. Taken together, findings suggest the importance of considering gender- and race-related experiences in understanding academic and psychological adjustment among African American adolescents.
Real estate value prediction using multivariate regression models

NASA Astrophysics Data System (ADS)

Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

2017-11-01

The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly based on many factors; hence it is a prime field in which to apply machine learning concepts to optimize and predict prices with high accuracy. Therefore, in this paper we present various important features to use when predicting housing prices with good accuracy. We describe regression models that use various features to achieve a lower residual sum of squares error. When using features in a regression model, some feature engineering is required for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
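The abstract above names polynomial regression and ridge regression but gives no formulas or data. The following is a minimal sketch, not the authors' method: it fits polynomial features by minimising the residual sum of squares plus a ridge penalty on the non-intercept coefficients. The function names, the choice to leave the intercept unpenalised, and the toy numbers are all assumptions made for illustration.

```python
# Illustrative sketch only: ridge-penalised polynomial regression in pure Python.
# All names and the toy data below are hypothetical, not taken from the paper.

def poly_features(x, degree):
    """Return [1, x, x^2, ..., x^degree] for a scalar x."""
    return [x ** p for p in range(degree + 1)]

def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w

def ridge_fit(xs, ys, degree, lam):
    """Minimise RSS + lam * sum(w_j^2 for j >= 1); intercept w_0 is unpenalised."""
    rows = [poly_features(x, degree) for x in xs]
    k = degree + 1
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    for j in range(1, k):
        gram[j][j] += lam  # ridge term added to the normal equations
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(gram, rhs)

def rss(xs, ys, w):
    """Residual sum of squares of the fitted polynomial with coefficients w."""
    degree = len(w) - 1
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(c * f for c, f in zip(w, poly_features(x, degree)))
        total += (y - pred) ** 2
    return total

if __name__ == "__main__":
    # Made-up toy data purely for demonstration.
    xs = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
    ys = [1.2, 0.1, -0.3, 0.4, 0.2, 1.9]
    for lam in (0.0, 1.0):
        w = ridge_fit(xs, ys, degree=3, lam=lam)
        print(lam, [round(c, 3) for c in w], round(rss(xs, ys, w), 3))
```

Increasing `lam` trades a somewhat higher training RSS for smaller coefficients, which is the sense in which ridge regression is usually said to reduce overfitting.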
School bullying and traumatic dental injuries in East London adolescents.

PubMed

Agel, M; Marcenes, W; Stansfeld, S A; Bernabé, E

2014-12-01

To explore the association between school bullying and traumatic dental injuries (TDI) among 15-16-year-old school children from East London. Data from phase III of the Research with East London Adolescents Community Health Survey (RELACHS), a school-based prospective study of a representative sample of adolescents, were analysed. Adolescents provided information on demographic characteristics, socioeconomic measures and frequency of bullying in school through self-administered questionnaires and were clinically examined for overjet, lip coverage and TDI. The association between school bullying and TDI was assessed using binary logistic regression models. The prevalence of TDI was 17%, while lifetime and current prevalence of bullying was 32% and 11%, respectively. The prevalence of TDI increased with a growing frequency of bullying; from 16% among adolescents who had never been bullied at school, to 21% among those who were bullied in the past but not this school term, to 22% for those who were bullied this school term. However, this association was not statistically significant either in crude or adjusted regression models. There was no evidence of an association between frequency of school bullying and TDI in this sample of 15-16-year-old adolescents in East London.
Nutrition knowledge, attitudes, behaviours and the influencing factors among non-parent caregivers of rural left-behind children under 7 years old in China.

PubMed

Tan, Cai; Luo, Jiayou; Zong, Rong; Fu, Chuhui; Zhang, Lingli; Mou, Jinsong; Duan, Danhui

2010-10-01

To explore and compare nutrition knowledge, attitudes and behaviours (KAB) between non-parent and parent caregivers of children under 7 years old in Chinese rural areas, and to identify the factors influencing their nutrition KAB. Face-to-face interviews were carried out with 1691 non-parent caregivers and 1670 parent caregivers in the selected study areas; multivariate logistic regression models were used to identify the factors influencing nutrition KAB in caregivers. The awareness rate of nutrition knowledge, the rate of positive attitudes and the rate of optimal behaviours in non-parent caregivers (52.2 %, 56.9 % and 37.7 %, respectively) were significantly lower than in the parent group (63.8 %, 62.1 % and 42.8 %, respectively). Multivariate logistic regression modelling showed that caregivers' family income and care will, and children's age and gender, were associated with caregivers' nutrition KAB after controlling the possible confounding variables (caregivers' age, gender, education and occupation). Non-parent caregivers had relatively poor nutrition KAB. Extra efforts and targeted education programmes aimed to improve rural non-parent caregivers' nutrition KAB are wanted and need to be emphasized.
Use of multiple regression models in the study of sandhopper orientation under natural conditions

NASA Astrophysics Data System (ADS)

Marchetti, Giovanni M.; Scapini, Felicita

2003-10-01

In sandhoppers (Amphipoda; Talitridae), typical dwellers of the supralittoral zone of sandy beaches, orientation with respect to the sun and landscape vision is adapted to the local direction of the shoreline. Variation of this behavioural adaptation can be related to the characteristics of the beach. Measures of orientation with respect to the shoreline direction can thus be made as a tool to assess beach stability versus changeability, once the sources of variation are correctly interpreted. Orientation of animals can be studied by statistical analysis of directions taken after release in nature. In this paper some new tools for exploring directional data are reviewed, with special emphasis on non-parametric smoothers and regression models. Results from a large study concerning one species of sandhoppers, Talitrus saltator (Montagu), from an exposed sandy beach in northeastern Tunisia are presented. Seasonal differences in orientation behaviour were shown with a higher scatter in autumn with respect to spring. The higher scatter shown in autumn depended both on intrinsic (sex) and external (climatic conditions and landscape visibility) factors and was related to the tendency of this species to migrate towards the dune anticipating winter conditions.
Application of Multiple Regression and Design of Experiments for Modelling the Effect of Monoethylene Glycol in the Calcium Carbonate Scaling Process.

PubMed

Kartnaller, Vinicius; Venâncio, Fabrício; F do Rosário, Francisca; Cajaiba, João

2018-04-10

To avoid gas hydrate formation during oil and gas production, companies usually employ thermodynamic inhibitors consisting of hydroxyl compounds, such as monoethylene glycol (MEG). However, these inhibitors may cause other types of fouling during production such as inorganic salt deposits (scale). Calcium carbonate is one of the main scaling salts and is a great concern, especially for the new pre-salt wells being explored in Brazil. Hence, it is important to understand how using inhibitors to control gas hydrate formation may be interacting with the scale formation process. Multiple regression and design of experiments were used to mathematically model the calcium carbonate scaling process and its evolution in the presence of MEG. It was seen that MEG, although inducing the precipitation by increasing the supersaturation ratio, actually works as a scale inhibitor for calcium carbonate in concentrations over 40%. This effect was not due to changes in the viscosity, as suggested in the literature, but possibly to the binding of MEG to the CaCO₃ particles' surface. The interaction of the MEG inhibition effect with the system's variables was also assessed, when temperature and calcium concentration were more relevant.
Predictive sparse modeling of fMRI data for improved classification, regression, and visualization using the k-support norm.

PubMed

Belilovsky, Eugene; Gkirtzou, Katerina; Misyrlis, Michail; Konova, Anna B; Honorio, Jean; Alia-Klein, Nelly; Goldstein, Rita Z; Samaras, Dimitris; Blaschko, Matthew B

2015-12-01

We explore various sparse regularization techniques for analyzing fMRI data, such as the ℓ1 norm (often called LASSO in the context of a squared loss function), elastic net, and the recently introduced k-support norm. Employing sparsity regularization allows us to handle the curse of dimensionality, a problem commonly found in fMRI analysis. In this work we consider sparse regularization in both the regression and classification settings. We perform experiments on fMRI scans from cocaine-addicted as well as healthy control subjects. We show that in many cases, use of the k-support norm leads to better predictive performance, solution stability, and interpretability as compared to other standard approaches. We additionally analyze the advantages of using the absolute loss function versus the standard squared loss which leads to significantly better predictive performance for the regularization methods tested in almost all cases. Our results support the use of the k-support norm for fMRI analysis and on the clinical side, the generalizability of the I-RISA model of cocaine addiction. Copyright © 2015 Elsevier Ltd. All rights reserved.
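As a rough illustration of the sparsity penalties named in the abstract above, the sketch below writes out the ℓ1 penalty, one common parameterisation of the elastic net penalty, and the soft-thresholding operator that the ℓ1 penalty induces. It does not implement the k-support norm or the authors' experiments; the function names and the `lam1`/`lam2` parameterisation are assumptions.

```python
# Illustrative sketch only; names and parameterisation are hypothetical.

def l1_penalty(w, lam):
    """lam * ||w||_1, the penalty used by LASSO with a squared loss."""
    return lam * sum(abs(v) for v in w)

def elastic_net_penalty(w, lam1, lam2):
    """One common form: lam1 * ||w||_1 + lam2 * ||w||_2^2."""
    return lam1 * sum(abs(v) for v in w) + lam2 * sum(v * v for v in w)

def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1: shrink each entry toward zero by t."""
    shrunk = []
    for v in w:
        magnitude = max(abs(v) - t, 0.0)
        shrunk.append(magnitude if v >= 0 else -magnitude)
    return shrunk

if __name__ == "__main__":
    weights = [3.0, -0.5, 1.0]
    print(l1_penalty(weights, 0.1))
    print(elastic_net_penalty(weights, 0.1, 0.01))
    print(soft_threshold(weights, 1.0))
```

Soft-thresholding sets small coefficients exactly to zero, which is why ℓ1-type penalties yield sparse solutions in high-dimensional settings such as fMRI analysis.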
Time series trends of the safety effects of pavement resurfacing.

PubMed

Park, Juneyoung; Abdel-Aty, Mohamed; Wang, Jung-Han

2017-04-01

This study evaluated the safety performance of pavement resurfacing projects on urban arterials in Florida using the observational before and after approaches. The safety effects of pavement resurfacing were quantified in the crash modification factors (CMFs) and estimated based on different ranges of heavy vehicle traffic volume and time changes for different severity levels. In order to evaluate the variation of CMFs over time, crash modification functions (CMFunctions) were developed using nonlinear regression and time series models. The results showed that pavement resurfacing projects decrease crash frequency and are generally more effective at reducing severe crashes. Moreover, the results of the general relationship between the safety effects and time changes indicated that the CMFs increase over time after the resurfacing treatment. It was also found that pavement resurfacing projects for the urban roadways with higher heavy vehicle volume rate are more safety effective than the roadways with lower heavy vehicle volume rate. Based on the exploration and comparison of the developed CMFunctions, the seasonal autoregressive integrated moving average (SARIMA) and exponential functional form of the nonlinear regression models can be utilized to identify the trend of CMFs over time. Copyright © 2017 Elsevier Ltd. All rights reserved.
Changes in profile of lipids and adipokines in patients with newly diagnosed hypothyroidism and hyperthyroidism

PubMed Central

Chen, Yanyan; Wu, Xiafang; Wu, Ruirui; Sun, Xiance; Yang, Boyi; Wang, Yi; Xu, Yuanyuan

2016-01-01

Changes in profile of lipids and adipokines have been reported in patients with thyroid dysfunction. But the evidence is controversial. The present study aimed to explore the relationships between thyroid function and the profile of lipids and adipokines. A cross-sectional study was conducted in 197 newly diagnosed hypothyroid patients, 230 newly diagnosed hyperthyroid patients and 355 control subjects. Hypothyroid patients presented with significantly higher serum levels of total cholesterol, triglycerides, low-density lipoprotein cholesterol (LDLC), fasting insulin, resistin and leptin than control (p < 0.05). Hyperthyroid patients presented with significantly lower serum levels of high-density lipoprotein cholesterol, LDLC and leptin, as well as higher levels of fasting insulin, resistin, adiponectin and homeostasis model insulin resistance index (HOMA-IR) than control (p < 0.05). Nonlinear regression and multivariable linear regression models all showed significant associations of resistin or adiponectin with free thyroxine and association of leptin with thyroid-stimulating hormone (p < 0.001). Furthermore, significant correlation between resistin and HOMA-IR was observed in the patients (p < 0.001). Thus, thyroid dysfunction affects the profile of lipids and adipokines. Resistin may serve as a link between thyroid dysfunction and insulin resistance. PMID:27193069
Relationship between caregivers' nutritional knowledge and children's dietary behavior in Chinese rural areas.

PubMed

Zeng, Rong; Luo, Jiayou; Tan, Cai; DU, Qiyun; Zhang, Weimin; Li, Yanping

2012-11-01

To explore the relationship between caregivers' nutritional knowledge and children's dietary behavior in rural areas of China. A cross-sectional study was conducted. 3361 rural caregivers and their children, aged 2 to 7 years old, were selected randomly and surveyed by questionnaire. Logistic regression models were used to identify the relationship between caregivers' nutritional knowledge and the children's dietary behaviors. The awareness level of nutritional knowledge among rural caregivers was 57.9%; among the children surveyed, 79.6% did not like to drink milk, 66.0% were considered choosy about food, 84.1% regularly snacked, 24.4% frequently skipped breakfast, and 13.7% did not come to meals on time. Logistic regression models indicated that a caregiver with a low level of nutritional knowledge is a risk factor for a child's unhealthy dietary behaviors (snacking excepted): the odds ratios (OR) of not liking to drink milk, being choosy about food, skipping breakfast or not having meals on time are 1.665, 1.338, 1.330 and 1.582, respectively. Caregivers' nutritional knowledge is strongly associated with children's dietary behavior. Nutrition education programs are urgently needed to improve caregivers' knowledge and thus to improve children's dietary behavior in rural areas of China.
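To make the odds ratios reported above easier to read, the sketch below converts each OR into the corresponding logistic regression coefficient (the natural log of the OR) and shows how an OR shifts a probability. The baseline probability of 0.20 is a hypothetical value chosen only for illustration; it is not reported in the abstract, and the function names are likewise assumptions.

```python
import math

# Illustrative sketch only; helper names and the baseline are hypothetical.

def coefficient_from_odds_ratio(odds_ratio):
    """A logistic regression coefficient is the natural log of its odds ratio."""
    return math.log(odds_ratio)

def apply_odds_ratio(baseline_probability, odds_ratio):
    """Scale the baseline odds by the odds ratio and convert back to a probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

# Odds ratios as reported in the abstract (low vs. higher caregiver knowledge).
REPORTED_ODDS_RATIOS = {
    "not liking to drink milk": 1.665,
    "being choosy about food": 1.338,
    "skipping breakfast": 1.330,
    "not having meals on time": 1.582,
}

HYPOTHETICAL_BASELINE = 0.20  # assumed for illustration, not from the study

if __name__ == "__main__":
    for behaviour, odds_ratio in REPORTED_ODDS_RATIOS.items():
        beta = coefficient_from_odds_ratio(odds_ratio)
        shifted = apply_odds_ratio(HYPOTHETICAL_BASELINE, odds_ratio)
        print(f"{behaviour}: beta={beta:.3f}, probability {shifted:.3f}")
```

Note that an odds ratio multiplies odds, not probabilities, so the change in probability depends on the baseline that is assumed.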
Multivariate prediction of upper limb prosthesis acceptance or rejection.

PubMed

Biddiss, Elaine A; Chau, Tom T

2008-07-01

To develop a model for prediction of upper limb prosthesis use or rejection. A questionnaire exploring factors in prosthesis acceptance was distributed internationally to individuals with upper limb absence through community-based support groups and rehabilitation hospitals. A total of 191 participants (59 prosthesis rejecters and 132 prosthesis wearers) were included in this study. A logistic regression model, a C5.0 decision tree, and a radial basis function neural network were developed and compared in terms of sensitivity (prediction of prosthesis rejecters), specificity (prediction of prosthesis wearers), and overall cross-validation accuracy. The logistic regression and neural network provided comparable overall accuracies of approximately 84 +/- 3%, specificity of 93%, and sensitivity of 61%. Fitting time-frame emerged as the predominant predictor. Individuals fitted within two years of birth (congenital) or six months of amputation (acquired) were 16 times more likely to continue prosthesis use. To increase rates of prosthesis acceptance, clinical directives should focus on timely, client-centred fitting strategies and the development of improved prostheses and healthcare for individuals with high-level or bilateral limb absence. Multivariate analyses are useful in determining the relative importance of the many factors involved in prosthesis acceptance and rejection.
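The figures reported above can be checked for internal consistency: overall accuracy is the class-size-weighted combination of sensitivity (rejecters correctly predicted) and specificity (wearers correctly predicted). The sketch below performs that arithmetic with the rounded values from the abstract; the function name is an assumption, and because the published rates are rounded cross-validation estimates, the result is only an approximate check.

```python
# Illustrative consistency check using the rounded figures from the abstract.

def overall_accuracy(sensitivity, specificity, n_positive, n_negative):
    """Combine per-class rates into overall accuracy, weighted by class size."""
    true_positives = sensitivity * n_positive
    true_negatives = specificity * n_negative
    return (true_positives + true_negatives) / (n_positive + n_negative)

if __name__ == "__main__":
    # 59 prosthesis rejecters (positive class), 132 prosthesis wearers.
    accuracy = overall_accuracy(0.61, 0.93, 59, 132)
    print(f"{accuracy:.3f}")  # prints 0.831
```

The result of roughly 83% falls within the reported overall accuracy of approximately 84 +/- 3%.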
Predictive modeling studies for the ecotoxicity of ionic liquids towards the green algae Scenedesmus vacuolatus.

PubMed

Das, Rudra Narayan; Roy, Kunal

2014-06-01

Hazardous potential of ionic liquids is becoming an issue of high concern with increasing application of these compounds in various industrial processes. Predictive toxicological modeling on ionic liquids provides a rational assessment strategy and aids in developing suitable guidance for designing novel analogues. The present study attempts to explore the chemical features of ionic liquids responsible for their ecotoxicity towards the green algae Scenedesmus vacuolatus by developing mathematical models using extended topochemical atom (ETA) indices along with other categories of chemical descriptors. The entire study has been conducted with reference to the OECD guidelines for QSAR model development using predictive classification and regression modeling strategies. The best models from both the analyses showed that ecotoxicity of ionic liquids can be decreased by reducing chain length of cationic substituents and increasing hydrogen bond donor feature in cations, and replacing bulky unsaturated anions with simple saturated moiety having less lipophilic heteroatoms. Copyright © 2013 Elsevier Ltd. All rights reserved.
Applying intersectionality to explore the relations between gendered racism and health among Black women.

PubMed

Lewis, Jioni A; Williams, Marlene G; Peppers, Erica J; Gadson, Cecile A

2017-10-01

The purpose of this study was to apply an intersectionality framework to explore the influence of gendered racism (i.e., intersection of racism and sexism) on health outcomes. Specifically, we applied intersectionality to extend a biopsychosocial model of racism to highlight the psychosocial variables that mediate and moderate the influence of gendered racial microaggressions (i.e., subtle gendered racism) on health outcomes. In addition, we tested aspects of this conceptual model by exploring the influence of gendered racial microaggressions on the mental and physical health of Black women. In addition, we explored the mediating role of coping strategies and the moderating role of gendered racial identity centrality. Participants were 231 Black women who completed an online survey. Results from regression analyses indicated that gendered racial microaggressions significantly predicted both self-reported mental and physical health outcomes. In addition, results from mediation analyses indicated that disengagement coping significantly mediated the link between gendered racial microaggressions and negative mental and physical health. In addition, a moderated mediation effect was found, such that individuals who reported a greater frequency of gendered racial microaggressions and reported lower levels of gendered racial identity centrality tended to use greater disengagement coping, which in turn, was negatively associated with mental and physical health outcomes. Findings of this study suggest that gendered racial identity centrality can serve a buffering role against the negative mental and physical health effects of gendered racism for Black women. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
