regression models conclusions: Topics by Science.gov

Sample records for regression models conclusions

Analyzing Multilevel Data: Comparing Findings from Hierarchical Linear Modeling and Ordinary Least Squares Regression

ERIC Educational Resources Information Center

Rocconi, Louis M.

2013-01-01

This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
A new approach to correct the QT interval for changes in heart rate using a nonparametric regression model in beagle dogs.

PubMed

Watanabe, Hiroyuki; Miyazaki, Hiroyasu

2006-01-01

Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correlation models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
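The two fixed-formula corrections named in this abstract can be sketched as follows, assuming the standard textbook definitions (Bazett divides QT by the square root of RR, Fridericia by the cube root, with RR in seconds). The function names are illustrative, and the sketch does not reproduce the authors' Loess-based correction.

```python
import math


def qtc_bazett(qt_ms: float, rr_s: float) -> float:
    """Bazett correction: QT divided by the square root of RR (RR in seconds)."""
    return qt_ms / math.sqrt(rr_s)


def qtc_fridericia(qt_ms: float, rr_s: float) -> float:
    """Fridericia correction: QT divided by the cube root of RR (RR in seconds)."""
    return qt_ms / rr_s ** (1.0 / 3.0)


# At RR = 1 s (60 bpm) both formulas leave QT unchanged; at shorter RR
# intervals (faster heart rates) they diverge, Bazett correcting more strongly.
for rr in (1.0, 0.5, 0.25):
    print(rr, round(qtc_bazett(300.0, rr), 1), round(qtc_fridericia(300.0, rr), 1))
```

The divergence between the two formulas at short RR intervals illustrates why the choice of correction can matter at fast heart rates.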
An improved portmanteau test for autocorrelated errors in interrupted time-series regression models.

PubMed

Huitema, Bradley E; McKean, Joseph W

2007-08-01

A new portmanteau test for autocorrelation among the errors of interrupted time-series regression models is proposed. Simulation results demonstrate that the inferential properties of the proposed Q(H-M) test statistic are considerably more satisfactory than those of the well known Ljung-Box test and moderately better than those of the Box-Pierce test. These conclusions generally hold for a wide variety of autoregressive (AR), moving averages (MA), and ARMA error processes that are associated with time-series regression models of the form described in Huitema and McKean (2000a, 2000b).
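For orientation, the two reference statistics mentioned in this abstract can be computed as in the sketch below, which assumes their usual definitions (Box-Pierce: n times the sum of squared residual autocorrelations; Ljung-Box: the same sum with each term reweighted by (n + 2)/(n - k)). The proposed Q(H-M) statistic is not defined in the abstract and is not reproduced here; function names are illustrative.

```python
def autocorrelation(resid, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(resid)
    mean = sum(resid) / n
    centered = [e - mean for e in resid]
    denom = sum(c * c for c in centered)
    num = sum(centered[t] * centered[t - lag] for t in range(lag, n))
    return num / denom


def box_pierce(resid, max_lag):
    """Box-Pierce Q: n * sum of squared autocorrelations up to max_lag."""
    n = len(resid)
    return n * sum(autocorrelation(resid, k) ** 2 for k in range(1, max_lag + 1))


def ljung_box(resid, max_lag):
    """Ljung-Box Q: n(n+2) * sum of r_k^2 / (n - k) up to max_lag."""
    n = len(resid)
    return n * (n + 2) * sum(
        autocorrelation(resid, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Strongly alternating residuals give a large lag-1 autocorrelation.
residuals = [1.0, -1.0] * 10
print(box_pierce(residuals, 1), ljung_box(residuals, 1))
```

Either statistic is normally referred to a chi-square distribution; the degrees of freedom depend on the fitted model and are left out of this sketch.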
Non-ignorable missingness in logistic regression.

PubMed

Wang, Joanna J J; Bartlett, Mark; Ryan, Louise

2017-08-30

Nonresponses and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimation, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non-ignorable missingness where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in regression estimates and show the observed likelihood is non-identifiable under non-ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis to study changes in estimated parameters and the robustness of study conclusions against different assumptions. A Bayesian framework for model estimation is used as it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to follow-up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality-of-life. Copyright © 2017 John Wiley & Sons, Ltd.
Development of an anaerobic threshold (HRLT, HRVT) estimation equation using the heart rate threshold (HRT) during the treadmill incremental exercise test

PubMed Central

Ham, Joo-ho; Park, Hun-Young; Kim, Youn-ho; Bae, Sang-kon; Ko, Byung-hoon

2017-01-01

[Purpose] The purpose of this study was to develop a regression model to estimate the heart rate at the lactate threshold (HRLT) and the heart rate at the ventilatory threshold (HRVT) using the heart rate threshold (HRT), and to test the validity of the regression model. [Methods] We performed a graded exercise test with a treadmill in 220 normal individuals (men: 112, women: 108) aged 20–59 years. HRT, HRLT, and HRVT were measured in all subjects. A regression model was developed to estimate HRLT and HRVT using HRT with 70% of the data (men: 79, women: 76) through randomization (7:3), with the Bernoulli trial. The validity of the regression model developed with the remaining 30% of the data (men: 33, women: 32) was also examined. [Results] Based on the regression coefficient, we found that the independent variable HRT was a significant variable in all regression models. The adjusted R2 of the developed regression models averaged about 70%, and the standard error of estimation of the validity test results was 11 bpm, which is similar to that of the developed model. [Conclusion] These results suggest that HRT is a useful parameter for predicting HRLT and HRVT. PMID:29036765
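The 7:3 randomization "with the Bernoulli trial" described in this abstract can be read as assigning each subject to the development set independently with probability 0.7. The sketch below implements that reading; the function name, seed, and integer subject IDs are assumptions for illustration, not details from the study.

```python
import random


def bernoulli_split(records, train_prob=0.7, seed=0):
    """Assign each record to the development set with probability train_prob.

    Each record gets its own independent Bernoulli draw, so the realized
    split is close to, but usually not exactly, the nominal ratio.
    """
    rng = random.Random(seed)
    development, validation = [], []
    for record in records:
        if rng.random() < train_prob:
            development.append(record)
        else:
            validation.append(record)
    return development, validation


subjects = list(range(220))  # hypothetical subject IDs
dev, val = bernoulli_split(subjects)
print(len(dev), len(val))
```

Because the draws are independent, a realized split such as the 155 and 65 subjects reported in the abstract is compatible with a nominal 7:3 ratio even though it is not exactly 154 and 66.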
OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

PubMed

Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

2012-01-01

The objective of the present study was to assess the comparative applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in investigating the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were first analyzed using simple linear regression and then considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis

PubMed Central

Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

2015-01-01

Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
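The comparison indicators listed in this abstract (sensitivity, specificity, area under the ROC curve) can be computed from predicted labels and scores as in the sketch below. It uses the rank-based (Mann-Whitney) form of the AUC; the toy labels and scores are invented for illustration and are not the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve as the probability that a random positive
    case scores higher than a random negative case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]            # toy outcomes
scores = [0.9, 0.4, 0.6, 0.1]    # toy predicted probabilities
predicted = [1 if s >= 0.5 else 0 for s in scores]
print(sensitivity_specificity(labels, predicted), auc(labels, scores))
```

The AUC does not depend on the classification threshold, whereas sensitivity and specificity do, which is one reason a model comparison like the one above reports all three.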
The intermediate endpoint effect in logistic and probit regression

PubMed Central

MacKinnon, DP; Lockwood, CM; Brown, CH; Wang, W; Hoffman, JM

2010-01-01

Background An intermediate endpoint is hypothesized to be in the middle of the causal sequence relating an independent variable to a dependent variable. The intermediate variable is also called a surrogate or mediating variable and the corresponding effect is called the mediated, surrogate endpoint, or intermediate endpoint effect. Clinical studies are often designed to change an intermediate or surrogate endpoint and through this intermediate change influence the ultimate endpoint. In many intermediate endpoint clinical studies the dependent variable is binary, and logistic or probit regression is used. Purpose The purpose of this study is to describe a limitation of a widely used approach to assessing intermediate endpoint effects and to propose an alternative method, based on products of coefficients, that yields more accurate results. Methods The intermediate endpoint model for a binary outcome is described for a true binary outcome and for a dichotomization of a latent continuous outcome. Plots of true values and a simulation study are used to evaluate the different methods. Results Distorted estimates of the intermediate endpoint effect and incorrect conclusions can result from the application of widely used methods to assess the intermediate endpoint effect. The same problem occurs for the proportion of an effect explained by an intermediate endpoint, which has been suggested as a useful measure for identifying intermediate endpoints. A solution to this problem is given based on the relationship between latent variable modeling and logistic or probit regression. Limitations More complicated intermediate variable models are not addressed in the study, although the methods described in the article can be extended to these more complicated models. Conclusions Researchers are encouraged to use an intermediate endpoint method based on the product of regression coefficients. 
A common method based on difference in coefficient methods can lead to distorted conclusions regarding the intermediate effect. PMID:17942466
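As background for the two approaches contrasted in this abstract: in ordinary least squares with a continuous outcome, the difference in coefficients (total effect minus direct effect) and the product of coefficients (a times b) are algebraically identical. The sketch below checks that baseline numerically with closed-form OLS; the abstract reports that this agreement can break down with logistic or probit regression. The data and names are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Sample covariance (the variance when u is v)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def mediation_effects(x, m, y):
    """Closed-form OLS estimates for the single-mediator model.

    a:        slope of M on X
    b:        slope of Y on M, adjusting for X
    c_total:  slope of Y on X alone
    c_direct: slope of Y on X, adjusting for M
    """
    vx, vm = cov(x, x), cov(m, m)
    cxm, cxy, cmy = cov(x, m), cov(x, y), cov(m, y)
    denom = vx * vm - cxm ** 2
    a = cxm / vx
    b = (cmy * vx - cxy * cxm) / denom
    c_total = cxy / vx
    c_direct = (cxy * vm - cmy * cxm) / denom
    return {"product": a * b, "difference": c_total - c_direct}


x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # toy independent variable
m = [1.0, 0.0, 3.0, 2.0, 5.0, 3.0]   # toy intermediate variable
y = [2.0, 1.0, 4.0, 6.0, 7.0, 9.0]   # toy continuous outcome
print(mediation_effects(x, m, y))
```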
Using a Regression Discontinuity Design to Estimate the Impact of Placement Decisions in Developmental Math

ERIC Educational Resources Information Center

Melguizo, Tatiana; Bos, Johannes M.; Ngo, Federick; Mills, Nicholas; Prather, George

2016-01-01

This study evaluates the effectiveness of math placement policies for entering community college students on these students' academic success in math. We estimate the impact of placement decisions by using a discrete-time survival model within a regression discontinuity framework. The primary conclusion that emerges is that initial placement in a…
Detection of outliers in the response and explanatory variables of the simple circular regression model

NASA Astrophysics Data System (ADS)

Mahmood, Ehab A.; Rana, Sohel; Hussin, Abdul Ghapor; Midi, Habshah

2016-06-01

The circular regression model may contain one or more data points which appear to be peculiar or inconsistent with the main part of the model. This may occur due to recording errors, sudden short events, sampling under abnormal conditions, etc. The existence of these data points ("outliers") in the data set causes many problems in the research results and conclusions. Therefore, we should identify them before applying statistical analysis. In this article, we aim to propose a statistic to identify outliers in both the response and explanatory variables of the simple circular regression model. Our proposed statistic is the robust circular distance RCDxy, and it is justified by three robust measures: the proportion of detected outliers, the masking rate, and the swamping rate.
Differentially private distributed logistic regression using private and public data

PubMed Central

2014-01-01

Background Privacy protecting is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology In this paper, we modify the update step in Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models that trained on private or public datasets alone without sacrificing the rigorous privacy guarantee. PMID:25079786
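For readers unfamiliar with the update step that this abstract says is modified, the sketch below shows a plain, non-private Newton-Raphson fit of a logistic regression with one intercept and one slope. It is only the textbook baseline: the authors' differentially private, distributed modification is not described in enough detail here to reproduce, and the data and function names are invented.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, y, b0, b1):
    """Gradient of the log-likelihood with respect to (b0, b1)."""
    p = [sigmoid(b0 + b1 * xi) for xi in x]
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
    return g0, g1


def newton_logistic(x, y, iterations=25):
    """Fit intercept b0 and slope b1 by repeated Newton-Raphson updates."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        g0, g1 = score(x, y, b0, b1)
        # Entries of the 2x2 information matrix (negative Hessian).
        h00 = sum(pi * (1 - pi) for pi in p)
        h01 = sum(xi * pi * (1 - pi) for xi, pi in zip(x, p))
        h11 = sum(xi * xi * pi * (1 - pi) for xi, pi in zip(x, p))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, with the 2x2 inverse written out.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # toy predictor
y = [0, 0, 1, 0, 1, 1]               # toy binary outcome (not separable)
print(newton_logistic(x, y))
```

A private variant would have to perturb or otherwise protect the quantities exchanged in each update; how that is done is specific to the paper and is deliberately left out.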
Spatiotemporal variability of urban growth factors: A global and local perspective on the megacity of Mumbai

NASA Astrophysics Data System (ADS)

Shafizadeh-Moghadam, Hossein; Helbich, Marco

2015-03-01

The rapid growth of megacities requires special attention among urban planners worldwide, and particularly in Mumbai, India, where growth is very pronounced. To cope with the planning challenges this will bring, developing a retrospective understanding of urban land-use dynamics and the underlying driving-forces behind urban growth is a key prerequisite. This research uses regression-based land-use change models - and in particular non-spatial logistic regression models (LR) and auto-logistic regression models (ALR) - for the Mumbai region over the period 1973-2010, in order to determine the drivers behind spatiotemporal urban expansion. Both global models are complemented by a local, spatial model, the so-called geographically weighted logistic regression (GWLR) model, one that explicitly permits variations in driving-forces across space. The study comes to two main conclusions. First, both global models suggest similar driving-forces behind urban growth over time, revealing that LRs and ALRs result in estimated coefficients with comparable magnitudes. Second, all the local coefficients show distinctive temporal and spatial variations. It is therefore concluded that GWLR aids our understanding of urban growth processes, and so can assist context-related planning and policymaking activities when seeking to secure a sustainable urban future.
Methods for calculating confidence and credible intervals for the residual between-study variance in random effects meta-regression models

PubMed Central

2014-01-01

Background Meta-regression is becoming increasingly used to model study level covariate effects. However this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
Mixed conditional logistic regression for habitat selection studies.

PubMed

Duchesne, Thierry; Fortin, Daniel; Courbin, Nicolas

2010-05-01

1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies. 2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison. 3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions. 4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands. 5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. 
Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.
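The IIA property described in this abstract can be illustrated with the choice probabilities of a fixed-effects conditional logit, in which each option's probability is proportional to the exponential of its utility. In the sketch below, with invented habitat utilities, the ratio of the probabilities for habitats A and B stays the same when a third habitat is added; mixed-effects models are one way to relax this.

```python
import math


def choice_probabilities(utilities):
    """Conditional-logit (softmax) probabilities for a set of available options."""
    top = max(utilities.values())
    weights = {name: math.exp(u - top) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical utilities for two habitat types, then with a third added.
two = choice_probabilities({"A": 1.0, "B": 0.2})
three = choice_probabilities({"A": 1.0, "B": 0.2, "C": 0.7})

# Under IIA the A-to-B preference ratio ignores the other options on offer.
print(two["A"] / two["B"], three["A"] / three["B"])
```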
Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

PubMed Central

2011-01-01

Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. 
Conclusions HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852
A controlled experiment in ground water flow model calibration

USGS Publications Warehouse

Hill, M.C.; Cooley, R.L.; Pollock, D.W.

1998-01-01

Nonlinear regression was introduced to ground water modeling in the 1970s, but has been used very little to calibrate numerical models of complicated ground water systems. Apparently, nonlinear regression is thought by many to be incapable of addressing such complex problems. With what we believe to be the most complicated synthetic test case used for such a study, this work investigates using nonlinear regression in ground water model calibration. Results of the study fall into two categories. First, the study demonstrates how systematic use of a well designed nonlinear regression method can indicate the importance of different types of data and can lead to successive improvement of models and their parameterizations. Our method differs from previous methods presented in the ground water literature in that (1) weighting is more closely related to expected data errors than is usually the case; (2) defined diagnostic statistics allow for more effective evaluation of the available data, the model, and their interaction; and (3) prior information is used more cautiously. Second, our results challenge some commonly held beliefs about model calibration. For the test case considered, we show that (1) field measured values of hydraulic conductivity are not as directly applicable to models as their use in some geostatistical methods imply; (2) a unique model does not necessarily need to be identified to obtain accurate predictions; and (3) in the absence of obvious model bias, model error was normally distributed. The complexity of the test case involved implies that the methods used and conclusions drawn are likely to be powerful in practice.
Prediction models for clustered data: comparison of a random intercept and standard regression model

PubMed Central

2013-01-01

Background When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Methods Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. Results The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. 
Conclusion The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters. PMID:23414436
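The abstract does not say how the ICC values of 5%, 15%, and 30% were defined. One common convention for random intercept logistic models, assumed here, is the latent-variable ICC, in which the level-1 residual variance is fixed at pi squared over 3. Under that assumption, the sketch below converts between a target ICC and the random-intercept variance that would produce it; names are illustrative.

```python
import math

# Variance of the standard logistic distribution (latent-variable convention).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def latent_icc(intercept_variance):
    """ICC implied by a random-intercept variance on the latent (logit) scale."""
    return intercept_variance / (intercept_variance + LOGISTIC_RESIDUAL_VARIANCE)


def intercept_variance_for_icc(icc):
    """Random-intercept variance needed to reach a target latent-scale ICC."""
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1.0 - icc)


for target in (0.05, 0.15, 0.30):
    print(target, round(intercept_variance_for_icc(target), 3))
```

If the study used a different ICC definition (for example, one on the observed probability scale), these variances would not apply.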
United States Marine Corps Basic Reconnaissance Course: Predictors of Success

DTIC Science & Technology

2017-03-01

The objective of my research is to provide quantitative ... percent over the last three years, illustrating there is room for improvement. This study conducts a quantitative and qualitative analysis of the ... criteria used to select candidates for the BRC. The research uses multi-variate logistic regression models and survival analysis to determine to what
Does transport time help explain the high trauma mortality rates in rural areas? New and traditional predictors assessed by new and traditional statistical methods

PubMed Central

Røislien, Jo; Lossius, Hans Morten; Kristiansen, Thomas

2015-01-01

Background Trauma is a leading global cause of death. Trauma mortality rates are higher in rural areas, constituting a challenge for quality and equality in trauma care. The aim of the study was to explore population density and transport time to hospital care as possible predictors of geographical differences in mortality rates, and to what extent choice of statistical method might affect the analytical results and accompanying clinical conclusions. Methods Using data from the Norwegian Cause of Death registry, deaths from external causes 1998–2007 were analysed. Norway consists of 434 municipalities, and municipality population density and travel time to hospital care were entered as predictors of municipality mortality rates in univariate and multiple regression models of increasing model complexity. We fitted linear regression models with continuous and categorised predictors, as well as piecewise linear and generalised additive models (GAMs). Models were compared using Akaike's information criterion (AIC). Results Population density was an independent predictor of trauma mortality rates, while the contribution of transport time to hospital care was highly dependent on choice of statistical model. A multiple GAM or piecewise linear model was superior, and similar, in terms of AIC. However, while transport time was statistically significant in multiple models with piecewise linear or categorised predictors, it was not in GAM or standard linear regression. Conclusions Population density is an independent predictor of trauma mortality rates. The added explanatory value of transport time to hospital care is marginal and model-dependent, highlighting the importance of exploring several statistical models when studying complex associations in observational data. PMID:25972600
Comparison of Survival Models for Analyzing Prognostic Factors in Gastric Cancer Patients

PubMed

Habibi, Danial; Rafiei, Mohammad; Chehrei, Ali; Shayan, Zahra; Tafaqodi, Soheil

2018-03-27

Objective: There are a number of models for determining risk factors for survival of patients with gastric cancer. This study was conducted to select the model showing the best fit with available data. Methods: Cox regression and parametric models (Exponential, Weibull, Gompertz, Log normal, Log logistic and Generalized Gamma) were utilized in unadjusted and adjusted forms to detect factors influencing mortality of patients. Comparisons were made with the Akaike Information Criterion (AIC) using STATA 13 and R 3.1.3 software. Results: The results of this study indicated that all parametric models outperform the Cox regression model. The Log normal, Log logistic and Generalized Gamma provided the best performance in terms of AIC values (179.2, 179.4 and 181.1, respectively). On unadjusted analysis, the results of the Cox regression and parametric models indicated stage, grade, largest diameter of metastatic nest, largest diameter of LM, number of involved lymph nodes and the largest ratio of metastatic nests to lymph nodes to be variables influencing the survival of patients with gastric cancer. On adjusted analysis, according to the best model (log normal), grade was found to be the significant variable. Conclusion: The results suggested that all parametric models outperform the Cox model. The log normal model provides the best fit and is a good substitute for Cox regression. Creative Commons Attribution License
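The AIC ranking described above can be illustrated with a short sketch. It uses only the three AIC values quoted in the abstract and computes each model's difference from the best model plus Akaike weights. The function and variable names are illustrative assumptions, and the weights are relative to these three models only, not to every model the study fitted.

```python
import math

# AIC values quoted in the abstract for the three best parametric models.
aic_values = {
    "log-normal": 179.2,
    "log-logistic": 179.4,
    "generalized gamma": 181.1,
}


def aic_differences(aic_by_model):
    """Return each model's AIC minus the smallest AIC in the set."""
    best = min(aic_by_model.values())
    return {name: value - best for name, value in aic_by_model.items()}


def akaike_weights(aic_by_model):
    """Relative support for each model among those compared (sums to 1)."""
    deltas = aic_differences(aic_by_model)
    raw = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(raw.values())
    return {name: r / total for name, r in raw.items()}


deltas = aic_differences(aic_values)
weights = akaike_weights(aic_values)
for name in aic_values:
    print(f"{name}: delta AIC = {deltas[name]:.1f}, weight = {weights[name]:.2f}")
```

A difference of only 0.2 between the log-normal and log-logistic models suggests the two are hard to separate on AIC alone.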

POWER PRIOR DISTRIBUTIONS FOR REGRESSION MODELS. (R824757)

EPA Science Inventory

The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...
Access disparities to Magnet hospitals for patients undergoing neurosurgical operations

PubMed Central

Missios, Symeon; Bekelis, Kimon

2017-01-01

Background Centers of excellence focusing on quality improvement have demonstrated superior outcomes for a variety of surgical interventions. We investigated the presence of access disparities to hospitals recognized by the Magnet Recognition Program of the American Nurses Credentialing Center (ANCC) for patients undergoing neurosurgical operations. Methods We performed a cohort study of all neurosurgery patients who were registered in the New York Statewide Planning and Research Cooperative System (SPARCS) database from 2009–2013. We examined the association of African-American race and lack of insurance with Magnet status hospitalization for neurosurgical procedures. A mixed effects propensity adjusted multivariable regression analysis was used to control for confounding. Results During the study period, 190,535 neurosurgical patients met the inclusion criteria. Using a multivariable logistic regression, we demonstrate that African-Americans had lower admission rates to Magnet institutions (OR 0.62; 95% CI, 0.58–0.67). This persisted in a mixed effects logistic regression model (OR 0.77; 95% CI, 0.70–0.83) to adjust for clustering at the patient county level, and a propensity score adjusted logistic regression model (OR 0.75; 95% CI, 0.69–0.82). Additionally, lack of insurance was associated with lower admission rates to Magnet institutions (OR 0.71; 95% CI, 0.68–0.73), in a multivariable logistic regression model. This persisted in a mixed effects logistic regression model (OR 0.72; 95% CI, 0.69–0.74), and a propensity score adjusted logistic regression model (OR 0.72; 95% CI, 0.69–0.75). Conclusions Using a comprehensive all-payer cohort of neurosurgery patients in New York State we identified an association of African-American race and lack of insurance with lower rates of admission to Magnet hospitals. PMID:28684152
Evaluating the utility of companion animal tick surveillance practices for monitoring spread and occurrence of human Lyme disease in West Virginia, 2014-2016.

PubMed

Hendricks, Brian; Mark-Carew, Miguella; Conley, Jamison

2017-11-13

Domestic dogs and cats are potentially effective sentinel populations for monitoring occurrence and spread of Lyme disease. Few studies have evaluated the public health utility of sentinel programmes using geo-analytic approaches. Confirmed Lyme disease cases diagnosed by physicians and ticks submitted by veterinarians to the West Virginia State Health Department were obtained for 2014-2016. Ticks were identified to species, and only Ixodes scapularis were incorporated in the analysis. Separate ordinary least squares (OLS) and spatial lag regression models were fitted to estimate the association between average numbers of Ix. scapularis collected on pets and human Lyme disease incidence. Regression residuals were visualised using Local Moran's I as a diagnostic tool to identify spatial dependence. Statistically significant associations were identified between average numbers of Ix. scapularis collected from dogs and human Lyme disease in the OLS (β=20.7, P<0.001) and spatial lag (β=12.0, P=0.002) regression. No significant associations were identified for cats in either regression model. Statistically significant (P≤0.05) spatial dependence was identified in all regression models. Local Moran's I maps produced for spatial lag regression residuals indicated a decrease in model over- and under-estimation, but identified a higher number of statistically significant outliers than OLS regression. Results support previous conclusions that dogs are effective sentinel populations for monitoring risk of human exposure to Lyme disease. Findings reinforce the utility of spatial analysis of surveillance data, and highlight West Virginia's unique position within the eastern United States with regard to Lyme disease occurrence.
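The residual diagnostic mentioned above can be sketched in simplified form. The code below computes the global Moran's I, a single-number relative of the Local Moran's I maps used in the study, for residuals on units linked by a binary neighbour map. The residual values and the row-shaped neighbour layout are hypothetical, and the function name is an assumption for illustration.

```python
def morans_i(values, neighbours):
    """Global Moran's I for values on units with binary neighbour weights.

    neighbours maps each unit index to the indices of its neighbours,
    with every link listed in both directions.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all weights (each listed link has weight 1).
    weight_sum = sum(len(js) for js in neighbours.values())
    # Cross-products of deviations over all neighbouring pairs.
    cross = sum(dev[i] * dev[j] for i, js in neighbours.items() for j in js)
    denom = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / denom)


# Hypothetical regression residuals for four counties arranged in a row.
residuals = [1.0, 2.0, 3.0, 4.0]
row_neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(morans_i(residuals, row_neighbours))
```

Values well above zero indicate that neighbouring units have similar residuals, which is the kind of spatial dependence that motivates a spatial lag model over plain OLS.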
Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bramer, Lisa M.; Rounds, J.; Burleyson, C. D.

Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions were examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and combinations of predictive variables were examined. A penalized logistic regression model which was fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at various time scales was examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. In conclusion, the methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.
Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days

DOE PAGES

Bramer, Lisa M.; Rounds, J.; Burleyson, C. D.; ...

2017-09-22

Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions were examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and combinations of predictive variables were examined. A penalized logistic regression model which was fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at various time scales was examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. In conclusion, the methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.
An empirical study using permutation-based resampling in meta-regression

PubMed Central

2012-01-01

Background In meta-regression, as the number of trials in the analyses decreases, the risk of false positives or false negatives increases. This is partly due to the assumption of normality that may not hold in small samples. Creation of a distribution from the observed trials using permutation methods to calculate P values may allow for less spurious findings. Permutation has not been empirically tested in meta-regression. The objective of this study was to perform an empirical investigation to explore the differences in results for meta-analyses on a small number of trials using standard large sample approaches versus permutation-based methods for meta-regression. Methods We isolated a sample of randomized controlled clinical trials (RCTs) for interventions that have a small number of trials (herbal medicine trials). Trials were then grouped by herbal species and condition and assessed for methodological quality using the Jadad scale, and data were extracted for each outcome. Finally, we performed meta-analyses on the primary outcome of each group of trials and meta-regression for methodological quality subgroups within each meta-analysis. We used large sample methods and permutation methods in our meta-regression modeling. We then compared final models and final P values between methods. Results We collected 110 trials across 5 intervention/outcome pairings and 5 to 10 trials per covariate. When applying large sample methods and permutation-based methods in our backwards stepwise regression the covariates in the final models were identical in all cases. The P values for the covariates in the final model were larger in 78% (7/9) of the cases for permutation and identical for 22% (2/9) of the cases. Conclusions We present empirical evidence that permutation-based resampling may not change final models when using backwards stepwise regression, but may increase P values in meta-regression of multiple covariates for a relatively small number of trials. PMID:22587815
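The idea of building a reference distribution from the observed trials can be sketched as follows. This is a deliberately simplified, unweighted illustration with one covariate: it enumerates every reordering of the outcomes and reports the share of reorderings whose slope is at least as extreme as the observed one. Real meta-regression weights trials by precision, which this sketch omits; the data values and function names are hypothetical.

```python
from itertools import permutations


def slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def exact_permutation_p(x, y, tol=1e-12):
    """Two-sided exact permutation P value for the slope.

    Enumerates all orderings of y, so it is only practical for the
    handful of trials typical of a small meta-regression.
    """
    observed = abs(slope(x, y))
    extreme = 0
    total = 0
    for shuffled in permutations(y):
        total += 1
        if abs(slope(x, shuffled)) >= observed - tol:
            extreme += 1
    return extreme / total


# Hypothetical quality scores and effect sizes for six trials.
quality = [1, 2, 2, 3, 4, 5]
effect = [0.10, 0.25, 0.22, 0.41, 0.38, 0.60]
print(exact_permutation_p(quality, effect))
```

With n trials there are only n! orderings, so the smallest attainable P value is bounded below by 1/n!. That bound is one reason permutation P values in very small samples can be larger than their large-sample counterparts.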
A new model for estimating total body water from bioelectrical resistance

NASA Technical Reports Server (NTRS)

Siconolfi, S. F.; Kear, K. T.

1992-01-01

Estimation of total body water (T) from bioelectrical resistance (R) is commonly done by stepwise regression models with height squared over R (H²/R), age, sex, and weight (W). Polynomials of H²/R have not been included in these models. We examined the validity of a model with third order polynomials and W. Methods: T was measured with oxygen-18 labeled water in 27 subjects. R at 50 kHz was obtained from electrodes placed on the hand and foot while subjects were in the supine position. A stepwise regression equation was developed with 13 subjects (age 31.5 ± 6.2 years, T 38.2 ± 6.6 L, W 65.2 ± 12.0 kg). Correlations, standard error of estimates and mean differences were computed between T and estimated T's from the new (N) model and other models. Evaluations were completed with the remaining 14 subjects (age 32.4 ± 6.3 years, T 40.3 ± 8 L, W 70.2 ± 12.3 kg) and two of its subgroups (high and low). Results: A regression equation was developed from the model. The only significant mean difference was between T and one of the earlier models. Conclusion: Third order polynomials in regression models may increase the accuracy of estimating total body water. Evaluating the model with a larger population is needed.
Regression dilution bias: tools for correction methods and sample size calculation.

PubMed

Berglund, Lars

2012-08-01

Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
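The slope correction described above can be sketched under the classical measurement error model, in which each reading is the true value plus independent random error. The sketch estimates the reliability ratio from duplicate measurements and divides the observed slope by it. This is not the software the article provides; the function names and the data are hypothetical, and the estimator assumes no systematic shift between the two measurement rounds.

```python
from statistics import variance


def reliability_ratio(first, second):
    """Estimate the reliability ratio from duplicate measurements.

    Under the classical error model the squared difference between two
    readings on the same person has expectation 2 * (error variance),
    so the within-person variance is the mean squared difference / 2.
    """
    squared_diffs = [(a - b) ** 2 for a, b in zip(first, second)]
    within = sum(squared_diffs) / (2 * len(squared_diffs))
    # Variance of a single reading, averaged over the two rounds.
    total = (variance(first) + variance(second)) / 2
    return 1 - within / total


def corrected_slope(observed_slope, ratio):
    """Undo the attenuation of a simple linear regression slope."""
    return observed_slope / ratio


# Hypothetical reliability study: two readings of the risk factor per person.
reading_1 = [10, 20, 30, 40]
reading_2 = [12, 18, 32, 38]
ratio = reliability_ratio(reading_1, reading_2)
print(ratio, corrected_slope(0.5, ratio))
```

Because the ratio is below 1 whenever there is measurement error, the corrected slope is larger in magnitude than the observed one, which is the downward bias the abstract describes.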
Comparison of Logistic Regression and Artificial Neural Network in Low Back Pain Prediction: Second National Health Survey

PubMed Central

Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

2012-01-01

Background: The purpose of this investigation was to compare empirically the predictive ability of an artificial neural network with that of a logistic regression in prediction of low back pain. Methods: Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 data and they were validated in a test set of 17295 data. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 inputs, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 log-likelihood criteria. Results: The area under the ROC curve (SE), root mean square and -2 log-likelihood of the logistic regression were 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2 log-likelihood of the artificial neural network were 0.754 (0.004), 0.3770 and 14757.6, respectively. Conclusions: Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant. PMID:23113198
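The area under the ROC curve used in the comparison above has a simple pairwise interpretation that can be sketched directly: it is the share of (case, non-case) pairs in which the case receives the higher predicted score, with ties counted as half. The labels and scores below are hypothetical, and the function name is an assumption for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels are 1 for cases and 0 for non-cases; scores are the model's
    predicted probabilities (or any ranking score).
    """
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for four survey respondents.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))
```

On this reading, the reported values of 0.752 and 0.754 mean the two models rank a randomly chosen case above a randomly chosen non-case almost equally often.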
Comparison of multinomial logistic regression and logistic regression: which is more efficient in allocating land use?

NASA Astrophysics Data System (ADS)

Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun

2014-12-01

Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure

PubMed Central

Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith

2017-01-01

Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
[Research of prevalence of schistosomiasis in Hunan province, 1984-2015].

PubMed

Li, F Y; Tan, H Z; Ren, G H; Jiang, Q; Wang, H L

2017-03-10

Objective: To analyze the prevalence of schistosomiasis in Hunan province, and provide scientific evidence for the control and elimination of schistosomiasis. Methods: The changes in infection rates of Schistosoma (S.) japonicum among residents and cattle in Hunan from 1984 to 2015 were analyzed using a dynamic trend diagram, and a time regression model was used to fit the infection rates of S. japonicum and predict the near-term infection rate. Results: The overall infection rates of S. japonicum in Hunan from 1984 to 2015 showed a downward trend (95.29% in residents and 95.16% in cattle). Using the linear regression model, the actual values of infection rates in residents and cattle were all within the 95% confidence intervals of the predicted values, and the prediction showed that the infection rates in residents and cattle would continue to decrease from 2016 to 2020. Conclusion: The prevalence of schistosomiasis was in decline in Hunan. The regression model performs well in the short-term prediction of schistosomiasis prevalence.
Evaluation of the Bitterness of Traditional Chinese Medicines using an E-Tongue Coupled with a Robust Partial Least Squares Regression Method.

PubMed

Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin

2016-01-25

To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using the robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by Grubbs' test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and plain partial least squares regression. Both R² and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916, with bitterness values ranging from 0.63 to 4.78, was obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV calculated using the models constructed by other methods was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS models constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data.
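The RMSECV criterion used above can be sketched with leave-one-out cross-validation. To keep the example self-contained it uses a simple one-predictor least squares line as a stand-in for the RPLS model, which is a substitution made here for illustration and not the study's method. The signal and bitterness values are hypothetical.

```python
import math


def fit_line(x, y):
    """Least squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmsecv_leave_one_out(x, y):
    """Root-mean-square error of leave-one-out cross-validation."""
    squared_errors = []
    for i in range(len(x)):
        # Refit the model without sample i, then predict sample i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        prediction = intercept + slope * x[i]
        squared_errors.append((y[i] - prediction) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical e-tongue sensor signal and panel bitterness scores.
signal = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
bitterness = [0.8, 1.4, 2.3, 2.9, 3.6, 4.5]
print(rmsecv_leave_one_out(signal, bitterness))
```

Because each sample is predicted by a model that never saw it, RMSECV is sensitive to outliers in a way that a plain fit error is not, which is why comparing it before and after outlier exclusion is informative.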
Regression models for analyzing costs and their determinants in health care: an introductory review.

PubMed

Gregori, Dario; Petrinco, Michele; Bo, Simona; Desideri, Alessandro; Merletti, Franco; Pagano, Eva

2011-06-01

This article aims to describe the various approaches in multivariable modelling of healthcare costs data and to synthesize the respective criticisms as proposed in the literature. We present regression methods suitable for the analysis of healthcare costs and then apply them to an experimental setting in cardiovascular treatment (COSTAMI study) and an observational setting in diabetes hospital care. We show how methods can produce different results depending on the degree of matching between the underlying assumptions of each method and the specific characteristics of the healthcare problem. The matching of healthcare cost models to the analytic objectives and characteristics of the data available to a study requires caution. The study results and interpretation can be heavily dependent on the choice of model with a real risk of spurious results and conclusions.
Developing a dengue forecast model using machine learning: A case study in China

PubMed Central

Zhang, Qin; Wang, Li; Xiao, Jianpeng; Zhang, Qingying; Luo, Ganfeng; Li, Zhihao; He, Jianfeng; Zhang, Yonghui; Ma, Wenjun

2017-01-01

Background In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue. Methodology/Principal findings Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011–2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China. Conclusion and significance The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics. PMID:29036169
Can Predictive Modeling Identify Head and Neck Oncology Patients at Risk for Readmission?

PubMed

Manning, Amy M; Casper, Keith A; Peter, Kay St; Wilson, Keith M; Mark, Jonathan R; Collar, Ryan M

2018-05-01

Objective Unplanned readmission within 30 days is a contributor to health care costs in the United States. The use of predictive modeling during hospitalization to identify patients at risk for readmission offers a novel approach to quality improvement and cost reduction. Study Design Two-phase study including retrospective analysis of prospectively collected data followed by prospective longitudinal study. Setting Tertiary academic medical center. Subjects and Methods Prospectively collected data for patients undergoing surgical treatment for head and neck cancer from January 2013 to January 2015 were used to build predictive models for readmission within 30 days of discharge using logistic regression, classification and regression tree (CART) analysis, and random forests. One model (logistic regression) was then placed prospectively into the discharge workflow from March 2016 to May 2016 to determine the model's ability to predict which patients would be readmitted within 30 days. Results In total, 174 admissions had descriptive data. Thirty-two were excluded due to incomplete data. Logistic regression, CART, and random forest predictive models were constructed using the remaining 142 admissions. When applied to 106 consecutive prospective head and neck oncology patients at the time of discharge, the logistic regression model predicted readmissions with a specificity of 94%, a sensitivity of 47%, a negative predictive value of 90%, and a positive predictive value of 62% (odds ratio, 14.9; 95% confidence interval, 4.02-55.45). Conclusion Prospectively collected head and neck cancer databases can be used to develop predictive models that can accurately predict which patients will be readmitted. This offers valuable support for quality improvement initiatives and readmission-related cost reduction in head and neck cancer care.
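The accuracy measures reported above all derive from a single 2x2 table of predicted versus actual readmissions, which the following sketch makes explicit. The abstract does not give the cell counts, so the counts used here are an assumption: they were chosen to sum to 106 patients and to be consistent with the percentages quoted, and should not be read as the study's data. The odds ratio interval uses the standard Wald approximation on the log scale, which may differ from the method the authors used.

```python
import math


def diagnostic_summary(tp, fp, fn, tn, z=1.96):
    """Accuracy measures and odds ratio from a 2x2 prediction table.

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives; z is the normal quantile for the interval.
    """
    odds_ratio = (tp * tn) / (fp * fn)
    # Wald standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    log_or = math.log(odds_ratio)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "odds_ratio": odds_ratio,
        "ci_low": math.exp(log_or - z * se_log_or),
        "ci_high": math.exp(log_or + z * se_log_or),
    }


# Assumed counts (not from the abstract): 106 patients in total.
summary = diagnostic_summary(tp=8, fp=5, fn=9, tn=84)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Laid out this way, the trade-off in the abstract is easier to see: high specificity and negative predictive value, but a sensitivity below one half, meaning most readmitted patients would not have been flagged.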
Selecting risk factors: a comparison of discriminant analysis, logistic regression and Cox's regression model using data from the Tromsø Heart Study.

PubMed

Brenn, T; Arnesen, E

1985-01-01

For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from the logistic and Cox methods which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis, otherwise Cox's method should be used.
Robust ridge regression estimators for nonlinear models with applications to high throughput screening assay data.

PubMed

Lim, Changwon

2015-03-30

Nonlinear regression is often used to evaluate the toxicity of a chemical or a drug by fitting data from a dose-response study. Toxicologists and pharmacologists may draw a conclusion about whether a chemical is toxic by testing the significance of the estimated parameters. However, sometimes the null hypothesis cannot be rejected even though the fit is quite good. One possible reason for such cases is that the estimated standard errors of the parameter estimates are extremely large. In this paper, we propose robust ridge regression estimation procedures for nonlinear models to solve this problem. The asymptotic properties of the proposed estimators are investigated; in particular, their mean squared errors are derived. The performances of the proposed estimators are compared with several standard estimators using simulation studies. The proposed methodology is also illustrated using high throughput screening assay data obtained from the National Toxicology Program. Copyright © 2014 John Wiley & Sons, Ltd.
Forecasting urban water demand: A meta-regression analysis.

PubMed

Sebri, Maamar

2016-12-01

Water managers and planners require accurate water demand forecasts over the short-, medium- and long-term for many purposes. These range from assessing water supply needs over spatial and temporal patterns to optimizing future investments and planning future allocations across competing sectors. This study surveys the empirical literature on the urban water demand forecasting using the meta-analytical approach. Specifically, using more than 600 estimates, a meta-regression analysis is conducted to identify explanations of cross-studies variation in accuracy of urban water demand forecasting. Our study finds that accuracy depends significantly on study characteristics, including demand periodicity, modeling method, forecasting horizon, model specification and sample size. The meta-regression results remain robust to different estimators employed as well as to a series of sensitivity checks performed. The importance of these findings lies in the conclusions and implications drawn out for regulators and policymakers and for academics alike. Copyright © 2016. Published by Elsevier Ltd.
Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

PubMed Central

Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

2012-01-01

Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

Comparison of Cox’s Regression Model and Parametric Models in Evaluating the Prognostic Factors for Survival after Liver Transplantation in Shiraz during 2000–2012

PubMed Central

Adelian, R.; Jamali, J.; Zare, N.; Ayatollahi, S. M. T.; Pooladfar, G. R.; Roustaei, N.

2015-01-01

Background: Identification of the prognostic factors for survival in patients with liver transplantation is challenging. Various methods of survival analysis have provided different, sometimes contradictory, results from the same data. Objective: To compare Cox’s regression model with parametric models for determining the independent factors for predicting survival of adult and pediatric patients after liver transplantation. Method: This study was conducted on 183 pediatric patients and 346 adults who underwent liver transplantation in Namazi Hospital, Shiraz, southern Iran. The study population included all patients undergoing liver transplantation from 2000 to 2012. The prognostic factors sex, age, Child class, initial diagnosis of the liver disease, PELD/MELD score, and pre-operative laboratory markers were selected for survival analysis. Result: Among 529 patients, 346 (65.4%) were adult and 183 (34.6%) were pediatric cases. Overall, the lognormal distribution was the best-fitting model for adult and pediatric patients. Age in adults (HR=1.16, p<0.05) and weight (HR=2.68, p<0.01) and Child class B (HR=2.12, p<0.05) in pediatric patients were the most important factors for prediction of survival after liver transplantation. Adult patients younger than the mean age and pediatric patients weighing above the mean and Child class A (compared to those with classes B or C) had better survival. Conclusion: The parametric regression model is a good alternative to Cox’s regression model. PMID:26306158
A novel model incorporating two variability sources for describing motor evoked potentials

PubMed Central

Goetz, Stefan M.; Luber, Bruce; Lisanby, Sarah H.; Peterchev, Angel V.

2014-01-01

Objective Motor evoked potentials (MEPs) play a pivotal role in transcranial magnetic stimulation (TMS), e.g., for determining the motor threshold and probing cortical excitability. Sampled across the range of stimulation strengths, MEPs outline an input–output (IO) curve, which is often used to characterize the corticospinal tract. More detailed understanding of the signal generation and variability of MEPs would provide insight into the underlying physiology and aid correct statistical treatment of MEP data. Methods A novel regression model is tested using measured IO data of twelve subjects. The model splits MEP variability into two independent contributions, acting on both sides of a strong sigmoidal nonlinearity that represents neural recruitment. Traditional sigmoidal regression with a single variability source after the nonlinearity is used for comparison. Results The distribution of MEP amplitudes varied across different stimulation strengths, violating statistical assumptions in traditional regression models. In contrast to the conventional regression model, the dual variability source model better described the IO characteristics including phenomena such as changing distribution spread and skewness along the IO curve. Conclusions MEP variability is best described by two sources that most likely separate variability in the initial excitation process from effects occurring later on. The new model enables more accurate and sensitive estimation of the IO curve characteristics, enhancing its power as a detection tool, and may apply to other brain stimulation modalities. Furthermore, it extracts new information from the IO data concerning the neural variability—information that has previously been treated as noise. PMID:24794287
Regression to the Mean and Changes in Risk Behavior following Study Enrollment in a Cohort of US Women at Risk for HIV

PubMed Central

Hughes, James P.; Haley, Danielle F.; Frew, Paula M.; Golin, Carol E.; Adimora, Adaora A; Kuo, Irene; Justman, Jessica; Soto-Torres, Lydia; Wang, Jing; Hodder, Sally

2015-01-01

Purpose Reductions in risk behaviors are common following enrollment in HIV prevention studies. We develop methods to quantify the proportion of change in risk behaviors that can be attributed to regression to the mean versus study participation and other factors. Methods A novel model that incorporates both regression to the mean and study participation effects is developed for binary measures. The model is used to estimate the proportion of change in the prevalence of “unprotected sex in the past 6 months” that can be attributed to study participation versus regression to the mean in a longitudinal cohort of women at risk for HIV infection who were recruited from ten US communities with high rates of HIV and poverty. HIV risk behaviors were evaluated using audio computer-assisted self-interviews at baseline and every 6 months for up to 12 months. Results The prevalence of “unprotected sex in the past 6 months” declined from 96% at baseline to 77% at 12 months. However, this change could be almost completely explained by regression to the mean. Conclusions Analyses that examine changes over time in cohorts selected for high or low risk behaviors should account for regression to the mean effects. PMID:25883065
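Regression to the mean in a cohort selected on a binary behavior can be shown with a small simulation. This is a sketch under stated assumptions, not the authors' model: each person has a stable, hypothetical propensity drawn uniformly from 0.5 to 1.0, there is no intervention effect at all, and enrollment requires reporting the behavior at baseline.

```python
import random

random.seed(0)

# Hypothetical population: each person has a stable propensity for the behavior.
population = [random.uniform(0.5, 1.0) for _ in range(20000)]

baseline_reports = []
followup_reports = []
for p in population:
    baseline = random.random() < p
    followup = random.random() < p   # same propensity, no study effect
    if baseline:                     # enrolled only if the behavior is reported at baseline
        baseline_reports.append(baseline)
        followup_reports.append(followup)

prev_baseline = sum(baseline_reports) / len(baseline_reports)
prev_followup = sum(followup_reports) / len(followup_reports)
print(prev_baseline, round(prev_followup, 3))
```

Prevalence among the enrolled drops at follow-up even though nobody's underlying propensity changed, which is the pattern the abstract warns can be mistaken for a study participation effect.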
Comparison of regression models for estimation of isometric wrist joint torques using surface electromyography

PubMed Central

2011-01-01

Background Several regression models have been proposed for estimation of isometric joint torque using surface electromyography (SEMG) signals. Common issues related to torque estimation models are degradation of model accuracy with passage of time, electrode displacement, and alteration of limb posture. This work compares the performance of the most commonly used regression models under these circumstances, in order to assist researchers with identifying the most appropriate model for a specific biomedical application. Methods Eleven healthy volunteers participated in this study. A custom-built rig, equipped with a torque sensor, was used to measure isometric torque as each volunteer flexed and extended his wrist. SEMG signals from eight forearm muscles, in addition to wrist joint torque data, were gathered during the experiment. Additional data were gathered one hour and twenty-four hours following the completion of the first data gathering session, for the purpose of evaluating the effects of passage of time and electrode displacement on accuracy of models. Acquired SEMG signals were filtered, rectified, normalized and then fed to models for training. Results Mean adjusted coefficient of determination (R_a²) values decreased by 20% to 35% for different models after one hour, while altering arm posture decreased mean R_a² values by 64% to 74% for different models. Conclusions Model estimation accuracy drops significantly with passage of time, electrode displacement, and alteration of limb posture. Therefore model retraining is crucial for preserving estimation accuracy. Data resampling can significantly reduce model training time without losing estimation accuracy. Among the models compared, the ordinary least squares linear regression model (OLS) was shown to have high isometric torque estimation accuracy combined with very short training times. PMID:21943179
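For readers unfamiliar with the adjusted coefficient of determination used above, the standard definition penalizes R² for the number of predictors. The sketch below assumes eight predictors (one per SEMG channel) and a made-up sample size and R²; the function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Hypothetical values: R^2 = 0.90 from 101 samples and 8 predictors.
value = adjusted_r2(0.90, n_samples=101, n_predictors=8)
print(round(value, 4))
```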
Learning accurate and interpretable models based on regularized random forests regression

PubMed Central

2014-01-01

Background Many biology-related studies combine data from multiple sources in an effort to understand the underlying problems. It is important to find and interpret the most important information from these sources. Thus it will be beneficial to have an effective algorithm that can simultaneously extract decision rules and select critical features for good interpretation while preserving the prediction performance. Methods In this study, we focus on regression problems for biological data where target outcomes are continuous. In general, models constructed from linear regression approaches are relatively easy to interpret. However, many practical biological applications are nonlinear in essence, where we can hardly find a direct linear relationship between input and output. Nonlinear regression techniques can reveal nonlinear relationships in data, but are generally hard for humans to interpret. We propose a rule-based regression algorithm that uses 1-norm regularized random forests. The proposed approach simultaneously extracts a small number of rules from generated random forests and eliminates unimportant features. Results We tested the approach on some biological data sets. The proposed approach is able to construct a significantly smaller set of regression rules using a subset of attributes while achieving prediction performance comparable to that of random forests regression. Conclusion The approach demonstrates high potential in aiding prediction and interpretation of nonlinear relationships of the subject being studied. PMID:25350120
Bias correction by use of errors-in-variables regression models in studies with K-X-ray fluorescence bone lead measurements.

PubMed

Lamadrid-Figueroa, Héctor; Téllez-Rojo, Martha M; Angeles, Gustavo; Hernández-Ávila, Mauricio; Hu, Howard

2011-01-01

In-vivo measurement of bone lead by means of K-X-ray fluorescence (KXRF) is the preferred biological marker of chronic exposure to lead. Unfortunately, considerable measurement error associated with KXRF estimations can introduce bias in estimates of the effect of bone lead when this variable is included as the exposure in a regression model. Estimates of uncertainty reported by the KXRF instrument reflect the variance of the measurement error and, although they can be used to correct the measurement error bias, they are seldom used in epidemiological statistical analyses. Errors-in-variables regression (EIV) allows for correction of bias caused by measurement error in predictor variables, based on the knowledge of the reliability of such variables. The authors propose a way to obtain reliability coefficients for bone lead measurements from uncertainty data reported by the KXRF instrument and compare, by the use of Monte Carlo simulations, results obtained using EIV regression models vs. those obtained by the standard procedures. Results of the simulations show that ordinary least squares (OLS) regression models provide severely biased estimates of effect, and that EIV provides nearly unbiased estimates. Although EIV effect estimates are more imprecise, their mean squared error is much smaller than that of OLS estimates. In conclusion, EIV is a better alternative than OLS to estimate the effect of bone lead when measured by KXRF. Copyright © 2010 Elsevier Inc. All rights reserved.
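The bias that measurement error introduces, and the reliability-based correction, can be sketched with a simple simulation. This is an illustration under simplifying assumptions rather than the authors' procedure: a single predictor, a constant measurement-error standard deviation that is treated as known (standing in for instrument-reported uncertainty), and made-up parameter values.

```python
import random

random.seed(1)

n = 20000
true_slope = 2.0
error_sd = 1.0   # assumed known measurement-error SD (hypothetical value)

x_true = [random.gauss(0, 1) for _ in range(n)]
x_obs = [x + random.gauss(0, error_sd) for x in x_true]   # error-prone exposure
y = [true_slope * x + random.gauss(0, 0.5) for x in x_true]

def mean(v):
    return sum(v) / len(v)

def variance(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)

def ols_slope(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

naive_slope = ols_slope(x_obs, y)                      # attenuated toward zero
reliability = 1 - error_sd ** 2 / variance(x_obs)      # share of observed variance that is signal
corrected_slope = naive_slope / reliability            # errors-in-variables style correction
print(round(naive_slope, 2), round(reliability, 2), round(corrected_slope, 2))
```

With these settings the naive slope is roughly half the true value, and dividing by the estimated reliability brings it back close to 2, at the cost of a larger sampling variance.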
Modeling energy expenditure in children and adolescents using quantile regression

PubMed Central

Yang, Yunwen; Adolph, Anne L.; Puyau, Maurice R.; Vohra, Firoz A.; Zakeri, Issa F.

2013-01-01

Advanced mathematical models have the potential to capture the complex metabolic and physiological processes that result in energy expenditure (EE). Study objective is to apply quantile regression (QR) to predict EE and determine quantile-dependent variation in covariate effects in nonobese and obese children. First, QR models will be developed to predict minute-by-minute awake EE at different quantile levels based on heart rate (HR) and physical activity (PA) accelerometry counts, and child characteristics of age, sex, weight, and height. Second, the QR models will be used to evaluate the covariate effects of weight, PA, and HR across the conditional EE distribution. QR and ordinary least squares (OLS) regressions are estimated in 109 children, aged 5–18 yr. QR modeling of EE outperformed OLS regression for both nonobese and obese populations. Average prediction errors for QR compared with OLS were not only smaller at the median τ = 0.5 (18.6 vs. 21.4%), but also substantially smaller at the tails of the distribution (10.2 vs. 39.2% at τ = 0.1 and 8.7 vs. 19.8% at τ = 0.9). Covariate effects of weight, PA, and HR on EE for the nonobese and obese children differed across quantiles (P < 0.05). The associations (linear and quadratic) between PA and HR with EE were stronger for the obese than nonobese population (P < 0.05). In conclusion, QR provided more accurate predictions of EE compared with conventional OLS regression, especially at the tails of the distribution, and revealed substantially different covariate effects of weight, PA, and HR on EE in nonobese and obese children. PMID:23640591
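Quantile regression replaces the squared-error loss of OLS with an asymmetric "pinball" loss indexed by the quantile level τ. The sketch below shows only the simplest case, fitting a constant rather than a model with covariates, to make the point that the minimizer is the sample quantile and is insensitive to an outlier. The data and the function names are made up for illustration.

```python
def pinball_loss(residual, tau):
    """Asymmetric loss: weight tau on under-prediction, (1 - tau) on over-prediction."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def best_constant(values, tau):
    """Brute-force search over the observed values for the constant with the lowest total loss."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))

values = list(range(1, 11)) + [100]   # ten ordinary values and one outlier
median_fit = best_constant(values, 0.5)
upper_fit = best_constant(values, 0.9)
mean_fit = sum(values) / len(values)  # what a squared-error fit of a constant would give
print(median_fit, upper_fit, round(mean_fit, 2))
```

A full quantile regression minimizes the same loss with the constant replaced by a linear predictor, which is what allows covariate effects to differ across τ.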
Application of seemingly unrelated regression in medical data with intermittently observed time-dependent covariates.

PubMed

Keshavarzi, Sareh; Ayatollahi, Seyyed Mohammad Taghi; Zare, Najaf; Pakfetrat, Maryam

2012-01-01

BACKGROUND. In many studies with longitudinal data, time-dependent covariates can only be measured intermittently (not at all observation times), and this presents difficulties for standard statistical analyses. This situation is common in medical studies, and methods that deal with this challenge would be useful. METHODS. In this study, we fitted seemingly unrelated regression (SUR) based models with respect to each observation time in longitudinal data with intermittently observed time-dependent covariates, and further compared these models with mixed-effect regression models (MRMs) under three classic imputation procedures. Simulation studies were performed to compare the sample size properties of the estimated coefficients for different modeling choices. RESULTS. In general, the proposed models performed well in the presence of intermittently observed time-dependent covariates. However, when we considered only the observed values of the covariate without any imputations, the resulting biases were greater. The performance of the proposed SUR-based models was similar to that of MRM using classic imputation methods, with approximately equal amounts of bias and MSE. CONCLUSION. The simulation study suggests that the SUR-based models work as efficiently as MRM in the case of intermittently observed time-dependent covariates. Thus, they can be used as an alternative to MRM.
Hybrid ABC Optimized MARS-Based Modeling of the Milling Tool Wear from Milling Run Experimental Data

PubMed Central

García Nieto, Paulino José; García-Gonzalo, Esperanza; Ordóñez Galán, Celestino; Bernardo Sánchez, Antonio

2016-01-01

Milling cutters are important cutting tools used in milling machines to perform milling operations, and they are prone to wear and subsequent failure. In this paper, a practical new hybrid model to predict the milling tool wear in a regular cut, as well as entry cut and exit cut, of a milling tool is proposed. The model was based on the optimization tool termed artificial bee colony (ABC) in combination with the multivariate adaptive regression splines (MARS) technique. This optimization mechanism involved the parameter setting in the MARS training procedure, which significantly influences the regression accuracy. Therefore, an ABC–MARS-based model was successfully used here to predict the milling tool flank wear (output variable) as a function of the following input variables: the time duration of experiment, depth of cut, feed, type of material, etc. Regression with optimal hyperparameters was performed and a determination coefficient of 0.94 was obtained. The ABC–MARS-based model's goodness of fit to experimental data confirmed the good performance of this model. This new model also allowed us to ascertain the most influential parameters on the milling tool flank wear with a view to proposing improvements to the milling machine. Finally, conclusions of this study are presented. PMID:28787882
Hybrid ABC Optimized MARS-Based Modeling of the Milling Tool Wear from Milling Run Experimental Data.

PubMed

García Nieto, Paulino José; García-Gonzalo, Esperanza; Ordóñez Galán, Celestino; Bernardo Sánchez, Antonio

2016-01-28

Milling cutters are important cutting tools used in milling machines to perform milling operations, and they are prone to wear and subsequent failure. In this paper, a practical new hybrid model to predict the milling tool wear in a regular cut, as well as entry cut and exit cut, of a milling tool is proposed. The model was based on the optimization tool termed artificial bee colony (ABC) in combination with the multivariate adaptive regression splines (MARS) technique. This optimization mechanism involved the parameter setting in the MARS training procedure, which significantly influences the regression accuracy. Therefore, an ABC-MARS-based model was successfully used here to predict the milling tool flank wear (output variable) as a function of the following input variables: the time duration of experiment, depth of cut, feed, type of material, etc. Regression with optimal hyperparameters was performed and a determination coefficient of 0.94 was obtained. The ABC-MARS-based model's goodness of fit to experimental data confirmed the good performance of this model. This new model also allowed us to ascertain the most influential parameters on the milling tool flank wear with a view to proposing improvements to the milling machine. Finally, conclusions of this study are presented.
The Effect of Latent Binary Variables on the Uncertainty of the Prediction of a Dichotomous Outcome Using Logistic Regression Based Propensity Score Matching.

PubMed

Szekér, Szabolcs; Vathy-Fogarassy, Ágnes

2018-01-01

Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist as well, which are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon draws the attention of analysts to an important point, which must be taken into account when drawing conclusions.
Casemix funding for a specialist paediatrics hospital: a hedonic regression approach.

PubMed

Bridges, J F; Hanson, R M

2000-01-01

This paper inquires into the effects that Diagnosis Related Groups (DRGs) have had on the ability to explain patient-level costs in a specialist paediatrics hospital. Two hedonic models are estimated using 1996/97 New Children's Hospital (NCH) patient level cost data, one with and one without a casemix index (CMI). The results show that the inclusion of a casemix index as an explanatory variable leads to a better accounting of cost. The full hedonic model is then used to simulate a funding model for the 1997/98 NCH cost data. These costs are highly correlated with the actual costs reported for that year. In addition, univariate regression indicates that there has been inflation in costs in the order of 4.8% between the two years. In conclusion, hedonic analysis can provide valuable evidence for the design of funding models that account for casemix.
Comment on "Cosmic-ray-driven reaction and greenhouse effect of halogenated molecules: Culprits for atmospheric ozone depletion and global climate change"

NASA Astrophysics Data System (ADS)

Nuccitelli, Dana; Cowtan, Kevin; Jacobs, Peter; Richardson, Mark; Way, Robert G.; Blackburn, Anne-Marie; Stolpe, Martin B.; Cook, John

2014-04-01

Lu (2013) (L13) argued that solar effects and anthropogenic halogenated gases can explain most of the observed warming of global mean surface air temperatures since 1850, with virtually no contribution from atmospheric carbon dioxide (CO2) concentrations. Here we show that this conclusion is based on assumptions about the saturation of the CO2-induced greenhouse effect that have been experimentally falsified. L13 also confuses equilibrium and transient response, and relies on data sources that have been superseded due to known inaccuracies. Furthermore, the statistical approach of sequential linear regression artificially shifts variance onto the first predictor. L13's artificial choice of regression order and neglect of other relevant data is the fundamental cause of the incorrect main conclusion. Consideration of more modern data and a more parsimonious multiple regression model leads to contradiction with L13's statistical results. Finally, the correlation arguments in L13 are falsified by considering either the more appropriate metric of global heat accumulation, or data on longer timescales.
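The statement that sequential linear regression shifts variance onto the first predictor can be illustrated with synthetic data. In this sketch, which makes no use of climate data, the response depends only on a second predictor, while the first predictor is merely correlated with it; all variable names and noise levels are assumptions chosen for illustration.

```python
import random

random.seed(2)

n = 5000
x2 = [random.gauss(0, 1) for _ in range(n)]
x1 = [v + random.gauss(0, 0.5) for v in x2]   # correlated with x2, but has no effect on y
y = [v + random.gauss(0, 0.1) for v in x2]    # y depends on x2 only

def centered(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

c1, c2, cy = centered(x1), centered(x2), centered(y)

# Sequential approach: regress y on x1 first and credit x1 with everything it explains.
slope_first = dot(c1, cy) / dot(c1, c1)
r2_first = slope_first * dot(c1, cy) / dot(cy, cy)

# Multiple regression: solve the 2x2 normal equations for both predictors at once.
s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
s1y, s2y = dot(c1, cy), dot(c2, cy)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
print(round(r2_first, 2), round(b1, 3), round(b2, 3))
```

Entered first, the irrelevant predictor appears to explain most of the variance; fitted jointly, its coefficient is close to zero and the effect is assigned to the predictor that actually drives the response.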
Evaluation of the Bitterness of Traditional Chinese Medicines using an E-Tongue Coupled with a Robust Partial Least Squares Regression Method

PubMed Central

Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin

2016-01-01

To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using the robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by Grubbs’ test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and plain partial least squares regression. Both R² and root-mean-square error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916, with bitterness values ranging from 0.63 to 4.78, was obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV calculated using the models constructed by other methods was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data. PMID:26821026
Predictors of the number of under-five malnourished children in Bangladesh: application of the generalized poisson regression model

PubMed Central

2013-01-01

Background Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. To our knowledge, most of the available studies that addressed the issue of malnutrition among under-five children considered categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study the malnutrition variable (i.e. outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative to other statistical methods and (ii) to find some predictors of this outcome variable. Methods The data are extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample which is based on a two-stage stratified sample of households. A total of 4,460 under-five children are analysed using various statistical techniques, namely the Chi-square test and the GPR model. Results The GPR model (as compared to the standard Poisson regression and negative binomial regression) is found to be justified to study the above-mentioned outcome variable because of its under-dispersion (variance < mean) property. Our study also identifies several significant predictors of the outcome variable, namely mother’s education, father’s education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Conclusions Consistency of our findings with many other studies suggests that the GPR model is an ideal alternative to other statistical models for analysing the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh. PMID:23297699
Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data

PubMed Central

Ying, Gui-shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

2017-01-01

Purpose To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. Methods We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field data in the elderly. Results When refractive error from both eyes was analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI −0.03 to 0.32D, P=0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, P=0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller P-values, while analysis of the worse eye provided larger P-values than mixed effects models and marginal models. Conclusion In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision. PMID:28102741
Assessing NARCCAP climate model effects using spatial confidence regions.

PubMed

French, Joshua P; McGinnis, Seth; Schwartzman, Armin

2017-01-01

We assess similarities and differences between model effects for the North American Regional Climate Change Assessment Program (NARCCAP) climate models using varying classes of linear regression models. Specifically, we consider how the average temperature effect differs for the various global and regional climate model combinations, including assessment of possible interaction between the effects of global and regional climate models. We use both pointwise and simultaneous inference procedures to identify regions where global and regional climate model effects differ. We also show conclusively that results from pointwise inference are misleading, and that accounting for multiple comparisons is important for making proper inference.
Characterizing multivariate decoding models based on correlated EEG spectral features

PubMed Central

McFarland, Dennis J.

2013-01-01

Objective Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. PMID:23466267
Artificial neural networks predict the incidence of portosplenomesenteric venous thrombosis in patients with acute pancreatitis.

PubMed

Fei, Y; Hu, J; Li, W-Q; Wang, W; Zong, G-Q

2017-03-01

Essentials Predicting the occurrence of portosplenomesenteric vein thrombosis (PSMVT) is difficult. We studied 72 patients with acute pancreatitis. Artificial neural networks modeling was more accurate than logistic regression in predicting PSMVT. Additional predictive factors may be incorporated into artificial neural networks. Objective To construct and validate artificial neural networks (ANNs) for predicting the occurrence of portosplenomesenteric venous thrombosis (PSMVT) and compare the predictive ability of the ANNs with that of logistic regression. Methods The ANNs and logistic regression modeling were constructed using simple clinical and laboratory data of 72 acute pancreatitis (AP) patients. The ANNs and logistic modeling were first trained on 48 randomly chosen patients and validated on the remaining 24 patients. The accuracy and the performance characteristics were compared between these two approaches by SPSS17.0 software. Results The training set and validation set did not differ on any of the 11 variables. After training, the back propagation network training error converged to 1 × 10^-20, and it retained excellent pattern recognition ability. When the ANNs model was applied to the validation set, it revealed a sensitivity of 80%, specificity of 85.7%, a positive predictive value of 77.6% and negative predictive value of 90.7%. The accuracy was 83.3%. Differences could be found between ANNs modeling and logistic regression modeling in these parameters (10.0% [95% CI, -14.3 to 34.3%], 14.3% [95% CI, -8.6 to 37.2%], 15.7% [95% CI, -9.9 to 41.3%], 11.8% [95% CI, -8.2 to 31.8%], 22.6% [95% CI, -1.9 to 47.1%], respectively). When ANNs modeling was used to identify PSMVT, the area under receiver operating characteristic curve was 0.849 (95% CI, 0.807-0.901), which demonstrated better overall properties than logistic regression modeling (AUC = 0.716) (95% CI, 0.679-0.761). Conclusions ANNs modeling was a more accurate tool than logistic regression in predicting the occurrence of PSMVT following AP. More clinical factors or biomarkers may be incorporated into ANNs modeling to improve its predictive ability. © 2016 International Society on Thrombosis and Haemostasis.
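The performance measures quoted above are all derived from a 2 × 2 confusion matrix. The following sketch shows the standard definitions; the counts are hypothetical values for a 24-patient validation set, chosen only for illustration and not taken from the study, and the function name is an assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts (not from the study): 10 patients with the event, 14 without.
metrics = diagnostic_metrics(tp=8, fp=2, tn=12, fn=2)
for name, value in metrics.items():
    print(name, round(value, 3))
```

With a validation set this small, each metric moves by several percentage points when a single patient is reclassified, which is consistent with the wide confidence intervals reported for the differences between the two models.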
Environmental, Spatial, and Sociodemographic Factors Associated with Nonfatal Injuries in Indonesia.

PubMed

Irianti, Sri; Prasetyoputra, Puguh

2017-01-01

Background. The determinants of injuries and their reoccurrence in Indonesia are not well understood, despite their importance in the prevention of injuries. Therefore, this study seeks to investigate the environmental, spatial, and sociodemographic factors associated with the reoccurrence of injuries among Indonesian people. Methods. Data from the 2013 round of the Indonesia Baseline Health Research (IBHR 2013) were analysed using a two-part hurdle regression model. A logit regression model was chosen for the zero-hurdle part, while a zero-truncated negative binomial regression model was selected for the counts part. Odds ratio (OR) and incidence rate ratio (IRR) were the measures of association, respectively. Results. The results suggest that living in a household with a distant drinking water source, residing in slum areas, residing in Eastern Indonesia, having low educational attainment, being male, and being poorer are positively related to the likelihood of experiencing injury. Moreover, being a farmer or fisherman, having low educational attainment, and being male are positively associated with the frequency of injuries. Conclusion. This study would be useful to prioritise injury prevention programs in Indonesia based on the environmental, spatial, and sociodemographic characteristics.

Vesicular stomatitis forecasting based on Google Trends

PubMed Central

Lu, Yi; Zhou, GuangYa; Chen, Qin

2018-01-01

Background Vesicular stomatitis (VS) is an important viral disease of livestock. The main feature of VS is irregular blisters that occur on the lips, tongue, oral mucosa, hoof crown and nipple. Humans can also be infected with vesicular stomatitis and develop meningitis. This study analyses 2014 American VS outbreaks in order to accurately predict vesicular stomatitis outbreak trends. Methods Data on American VS outbreaks were collected from OIE. The data for VS keywords were obtained by inputting 24 disease-related keywords into Google Trends. After calculating the Pearson and Spearman correlation coefficients, it was found that there was a relationship between outbreaks and keywords derived from Google Trends. Finally, the prediction model was constructed based on qualitative classification and quantitative regression. Results For the regression model, the Pearson correlation coefficients between the predicted outbreaks and actual outbreaks are 0.953 and 0.948, respectively. For the qualitative classification model, we constructed five classification predictive models and chose the best classification predictive model as the result. The SN (sensitivity), SP (specificity) and ACC (prediction accuracy) values of the best classification predictive model are 78.52%, 72.5% and 77.14%, respectively. Conclusion This study applied Google search data to construct a qualitative classification model and a quantitative regression model. The results show that the method is effective and that these two models obtain more accurate forecasts. PMID:29385198
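The screening step in this abstract relies on Pearson and Spearman correlation coefficients between a keyword's search volume and outbreak counts. As a rough sketch of how the two differ, the code below computes both from scratch; the two series are hypothetical, not Google Trends or OIE data. Spearman is simply Pearson applied to ranks, so it reaches 1 for any strictly increasing relationship even when the relationship is far from linear.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical weekly search index and outbreak counts (illustrative only).
search_index = [12, 15, 21, 30, 48, 75]
outbreaks = [0, 1, 2, 4, 9, 30]
print(pearson(search_index, outbreaks), spearman(search_index, outbreaks))
```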
The Colorectal Cancer Mortality-to-Incidence Ratio as an Indicator of Global Cancer Screening and Care

PubMed Central

Sunkara, Vasu; Hébert, James R.

2015-01-01

BACKGROUND Disparities in cancer screening, incidence, treatment, and survival are worsening globally. The mortality-to-incidence ratio (MIR) has been used previously to evaluate such disparities. METHODS The MIR for colorectal cancer is calculated for all Organisation for Economic Cooperation and Development (OECD) countries using the 2012 GLOBOCAN incidence and mortality statistics. Health system rankings were obtained from the World Health Organization. Two linear regression models were fit with the MIR as the dependent variable and health system ranking as the independent variable; one included all countries and one model had the “divergents” removed. RESULTS The regression model for all countries explained 24% of the total variance in the MIR. Nine countries were found to have regression-calculated MIRs that differed from the actual MIR by >20%. Countries with lower-than-expected MIRs were found to have strong national health systems characterized by formal colorectal cancer screening programs. Conversely, countries with higher-than-expected MIRs lack screening programs. When these divergent points were removed from the data set, the recalculated regression model explained 60% of the total variance in the MIR. CONCLUSIONS The MIR proved useful for identifying disparities in cancer screening and treatment internationally. It has potential as an indicator of the long-term success of cancer surveillance programs and may be extended to other cancer types for these purposes. PMID:25572676
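The analysis described here fits a line, flags countries whose observed MIR departs from the fitted value by more than 20%, and refits without them. The sketch below walks through that sequence with ordinary least squares. All numbers are hypothetical, and measuring the 20% gap relative to the observed MIR is an assumption, since the abstract does not say which value serves as the denominator.

```python
def fit_line(x, y):
    """Ordinary least squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical health-system ranks and MIR values (not GLOBOCAN or WHO data).
rank = [1, 2, 3, 4, 5, 6, 7, 8]
mir = [0.32, 0.34, 0.36, 0.60, 0.40, 0.42, 0.44, 0.46]

slope, intercept = fit_line(rank, mir)
r2_all = r_squared(rank, mir, slope, intercept)

# Flag points whose fitted MIR differs from the observed MIR by more than 20%.
divergent = [i for i, (a, b) in enumerate(zip(rank, mir))
             if abs((intercept + slope * a) - b) / b > 0.20]

kept_x = [a for i, a in enumerate(rank) if i not in divergent]
kept_y = [b for i, b in enumerate(mir) if i not in divergent]
slope2, intercept2 = fit_line(kept_x, kept_y)
r2_trimmed = r_squared(kept_x, kept_y, slope2, intercept2)
print(divergent, round(r2_all, 3), round(r2_trimmed, 3))
```

Removing points selected by their distance from the fitted line will raise the explained variance almost by construction, which is worth keeping in mind when reading the jump from 24% to 60%.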
Genetic prediction of type 2 diabetes using deep neural network.

PubMed

Kim, J; Kim, J; Kwak, M J; Bajaj, M

2018-04-01

Type 2 diabetes (T2DM) has strong heritability but genetic models to explain heritability have been challenging. We tested a deep neural network (DNN) to predict T2DM using the nested case-control study of Nurses' Health Study (3326 females, 45.6% T2DM) and Health Professionals Follow-up Study (2502 males, 46.5% T2DM). We selected 96, 214, 399, and 678 single-nucleotide polymorphisms (SNPs) through Fisher's exact test and L1-penalized logistic regression. We split each dataset randomly in 4:1 to train prediction models and test their performance. DNN and logistic regressions showed better area under the curve (AUC) of ROC curves than the clinical model when 399 or more SNPs were included. DNN was superior to logistic regressions in AUC with 399 or more SNPs in males and 678 SNPs in females. Addition of clinical factors consistently increased AUC of DNN but failed to improve logistic regressions with 214 or more SNPs. In conclusion, we show that DNN can be a versatile tool to predict T2DM incorporating large numbers of SNPs and clinical information. Limitations include a relatively small number of subjects, mostly of European ethnicity. Further studies are warranted to confirm and improve performance of genetic prediction models using DNN in different ethnic groups. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Test anxiety and academic performance in chiropractic students.

PubMed

Zhang, Niu; Henderson, Charles N R

2014-01-01

Objective: We assessed the level of students' test anxiety, and the relationship between test anxiety and academic performance. Methods: We recruited 166 third-quarter students. The Test Anxiety Inventory (TAI) was administered to all participants. Total scores from written examinations and objective structured clinical examinations (OSCEs) were used as response variables. Results: Multiple regression analysis shows that there was a modest, but statistically significant negative correlation between TAI scores and written exam scores, but not OSCE scores. Worry and emotionality were the best predictive models for written exam scores. Mean total anxiety and emotionality scores for females were significantly higher than those for males, but not worry scores. Conclusion: Moderate-to-high test anxiety was observed in 85% of the chiropractic students examined. However, total test anxiety, as measured by the TAI score, was a very weak predictive model for written exam performance. Multiple regression analysis demonstrated that replacing total anxiety (TAI) with worry and emotionality (TAI subscales) produces a much more effective predictive model of written exam performance. Sex, age, highest current academic degree, and ethnicity contributed little additional predictive power in either regression model. Moreover, TAI scores were not found to be statistically significant predictors of physical exam skill performance, as measured by OSCEs.
Identification of Extremely Premature Infants at High Risk of Rehospitalization

PubMed Central

Carlo, Waldemar A.; McDonald, Scott A.; Yao, Qing; Das, Abhik; Higgins, Rosemary D.

2011-01-01

OBJECTIVE: Extremely low birth weight infants often require rehospitalization during infancy. Our objective was to identify at the time of discharge which extremely low birth weight infants are at higher risk for rehospitalization. METHODS: Data from extremely low birth weight infants in Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network centers from 2002–2005 were analyzed. The primary outcome was rehospitalization by the 18- to 22-month follow-up, and secondary outcome was rehospitalization for respiratory causes in the first year. Using variables and odds ratios identified by stepwise logistic regression, scoring systems were developed with scores proportional to odds ratios. Classification and regression-tree analysis was performed by recursive partitioning and automatic selection of optimal cutoff points of variables. RESULTS: A total of 3787 infants were evaluated (mean ± SD birth weight: 787 ± 136 g; gestational age: 26 ± 2 weeks; 48% male, 42% black). Forty-five percent of the infants were rehospitalized by 18 to 22 months; 14.7% were rehospitalized for respiratory causes in the first year. Both regression models (area under the curve: 0.63) and classification and regression-tree models (mean misclassification rate: 40%–42%) were moderately accurate. Predictors for the primary outcome by regression were shunt surgery for hydrocephalus, hospital stay of >120 days for pulmonary reasons, necrotizing enterocolitis stage II or higher or spontaneous gastrointestinal perforation, higher fraction of inspired oxygen at 36 weeks, and male gender. By classification and regression-tree analysis, infants with hospital stays of >120 days for pulmonary reasons had a 66% rehospitalization rate compared with 42% without such a stay. 
CONCLUSIONS: The scoring systems and classification and regression-tree analysis models identified infants at higher risk of rehospitalization and might assist planning for care after discharge. PMID:22007016
Incorporation of prior information on parameters into nonlinear regression groundwater flow models: 2. Applications

USGS Publications Warehouse

Cooley, Richard L.

1983-01-01

This paper investigates factors influencing the degree of improvement in estimates of parameters of a nonlinear regression groundwater flow model by incorporating prior information of unknown reliability. Consideration of expected behavior of the regression solutions and results of a hypothetical modeling problem lead to several general conclusions. First, if the parameters are properly scaled, linearized expressions for the mean square error (MSE) in parameter estimates of a nonlinear model will often behave very nearly as if the model were linear. Second, by using prior information, the MSE in properly scaled parameters can be reduced greatly over the MSE of ordinary least squares estimates of parameters. Third, plots of estimated MSE and the estimated standard deviation of MSE versus an auxiliary parameter (the ridge parameter) specifying the degree of influence of the prior information on regression results can help determine the potential for improvement of parameter estimates. Fourth, proposed criteria can be used to make appropriate choices for the ridge parameter and another parameter expressing degree of overall bias in the prior information. Results of a case study of Truckee Meadows, Reno-Sparks area, Washoe County, Nevada, conform closely to the results of the hypothetical problem. In the Truckee Meadows case, incorporation of prior information did not greatly change the parameter estimates from those obtained by ordinary least squares. However, the analysis showed that both sets of estimates are more reliable than suggested by the standard errors from ordinary least squares.
A Comparison between Multiple Regression Models and CUN-BAE Equation to Predict Body Fat in Adults

PubMed Central

Fuster-Parra, Pilar; Bennasar-Veny, Miquel; Tauler, Pedro; Yañez, Aina; López-González, Angel A.; Aguiló, Antoni

2015-01-01

Background Because the accurate measure of body fat (BF) is difficult, several prediction equations have been proposed. The aim of this study was to compare different multiple regression models to predict BF, including the recently reported CUN-BAE equation. Methods Multiple regression models using body mass index (BMI) and body adiposity index (BAI) as predictors of BF will be compared. These models will be also compared with the CUN-BAE equation. For all the analysis a sample including all the participants and another one including only the overweight and obese subjects will be considered. The BF reference measure was made using Bioelectrical Impedance Analysis. Results The simplest models including only BMI or BAI as independent variables showed that BAI is a better predictor of BF. However, adding the variable sex to both models made BMI a better predictor than the BAI. For both the whole group of participants and the group of overweight and obese participants, using simple models (BMI, age and sex as variables) allowed obtaining similar correlations with BF as when the more complex CUN-BAE was used (ρ = 0.87 vs. ρ = 0.86 for the whole sample and ρ = 0.88 vs. ρ = 0.89 for overweight and obese subjects, the second value in each pair being the one for CUN-BAE). Conclusions There are simpler models than the CUN-BAE equation that fit BF as well as CUN-BAE does. Therefore, it could be considered that CUN-BAE overfits. Using a simple linear regression model, the BAI, as the only variable, predicts BF better than BMI. However, when the sex variable is introduced, BMI becomes the indicator of choice to predict BF. PMID:25821960
The Necessity-Concerns-Framework: A Multidimensional Theory Benefits from Multidimensional Analysis

PubMed Central

Phillips, L. Alison; Diefenbach, Michael; Kronish, Ian M.; Negron, Rennie M.; Horowitz, Carol R.

2014-01-01

Background Patients’ medication-related concerns and necessity-beliefs predict adherence. Evaluation of the potentially complex interplay of these two dimensions has been limited because of methods that reduce them to a single dimension (difference scores). Purpose We use polynomial regression to assess the multidimensional effect of stroke-event survivors’ medication-related concerns and necessity-beliefs on their adherence to stroke-prevention medication. Methods Survivors (n=600) rated their concerns, necessity-beliefs, and adherence to medication. Confirmatory and exploratory polynomial regression determined the best-fitting multidimensional model. Results As posited by the Necessity-Concerns Framework (NCF), the greatest and lowest adherence was reported by those with strong necessity-beliefs/weak concerns and strong concerns/weak necessity-beliefs, respectively. However, as could not be assessed using a difference-score model, patients with ambivalent beliefs were less adherent than those exhibiting indifference. Conclusions Polynomial regression allows for assessment of the multidimensional nature of the NCF. Clinicians/Researchers should be aware that concerns and necessity dimensions are not polar opposites. PMID:24500078
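The contrast between a difference-score model and polynomial regression can be made explicit. The formulation below is the common textbook one and is offered as an assumption about the general approach, not as the authors' exact specification; $A$ denotes adherence, $N$ necessity-beliefs and $C$ concerns.

```latex
% Difference-score model: one slope on the single dimension (N - C)
A = b_0 + b_1\,(N - C) + e

% Second-order polynomial (response surface) model in both dimensions
A = b_0 + b_1 N + b_2 C + b_3 N^2 + b_4 N C + b_5 C^2 + e

% The difference-score model is the special case obtained under the constraints
b_2 = -b_1, \qquad b_3 = b_4 = b_5 = 0
```

Under those constraints any two patients with the same $N - C$ receive the same prediction, so a patient with high necessity and high concerns (ambivalent) cannot be distinguished from one with low necessity and low concerns (indifferent). Relaxing the constraints is what lets the polynomial model separate the two groups.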
PM10 modeling in the Oviedo urban area (Northern Spain) by using multivariate adaptive regression splines

NASA Astrophysics Data System (ADS)

Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza

2014-10-01

The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. 
Finally, on the basis of these numerical calculations using the multivariate adaptive regression splines (MARS) technique, the conclusions of this research work are presented.
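MARS builds its model from hinge functions, which is why the result stays easy to read: each term is zero on one side of a knot and linear on the other. The sketch below evaluates such a model for a single input. The knot, coefficients and intercept are hypothetical, and a real MARS fit would also choose the knots from the data and could include products of hinges across several pollutants.

```python
def hinge(x, knot, direction=+1):
    """MARS basis function: max(0, x - knot), or max(0, knot - x) when direction is -1."""
    return max(0.0, direction * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model; terms are (coefficient, knot, direction)."""
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)

# Hypothetical model: slope 0.5 above a knot at 40, slope 0.25 below it.
terms = [(0.5, 40.0, +1), (-0.25, 40.0, -1)]
for x in (30.0, 40.0, 50.0):
    print(x, mars_predict(x, intercept=20.0, terms=terms))
```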
The natural outcome of melamine-induced bladder stones with bladder epithelial hyperplasia after the withdrawal of melamine in mice.

PubMed

Ren, Shu-Ting; Xu, Chang-Fu; Du, Yun-Xia; Gao, Xiao-Li; Sun, Ying; Jiang, Yi-Na

2012-07-01

The natural outcome of melamine-induced bladder stones (cystoliths) with bladder epithelial hyperplasia (BEH) after melamine withdrawal is unclear. Using an ideal dual-model system, three experiments were conducted in BALB/c mice. Each experiment included control, model 1 and model 2 groups. The mice were fed a regular diet in controls or a 9373 ppm melamine diet in models, and the first day was designated as dosing day 1. The melamine diet was then replaced by the regular diet in the model 2 groups, and the first day was designated as post-dosing day 1. On dosing days 12, 35 and 49, the incidence of cystoliths and diffusely active BEH was 8/8 in the mice of three model 1 groups. On post-dosing days 1, 4 and 8, in the mice of three model 2 groups, the incidence of cystoliths was 2/8, 0/8 and 1/8, respectively, and the progressive regression of BEH was observed. In conclusion, both the stones and BEH have the natural property of rapid development and rapid regression, and melamine withdrawal plays a key role in the stone dissolution-discharge necessary for BEH regression. BEH may be reversible after the discharge of the stones. The conventionally conservative therapy is thus reasonable. Copyright © 2012 Elsevier Ltd. All rights reserved.
Quality Reporting of Multivariable Regression Models in Observational Studies: Review of a Representative Sample of Articles Published in Biomedical Journals.

PubMed

Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M

2016-05-01

Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. Review of a representative sample of articles indexed in MEDLINE (n = 428) with observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimate, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.
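The confidence intervals around the reported percentages can be approximated with the normal (Wald) interval for a proportion. The sketch below is a worked check under assumptions: the abstract gives only percentages and n = 428, so the counts 112 and 79 are back-calculated guesses, and the authors may have used a different interval method.

```python
import math

def wald_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Assumed counts out of 428 articles (inferred from 26.2% and 18.5%, not stated in the source).
for label, count in (("assumptions or goodness-of-fit", 112), ("interaction analysis", 79)):
    low, high = wald_ci(count, 428)
    print(f"{label}: {count / 428:.1%} (95% CI: {100 * low:.1f}-{100 * high:.1f})")
```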
A novel hybrid method of beta-turn identification in protein using binary logistic regression and neural network

PubMed Central

Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz

2012-01-01

From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was initially used for the first time to select significant sequence parameters in identification of β-turns due to a re-substitution test procedure. Sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were percentages of Gly, Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed by the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins. PMID:27418910
Neuropsychological tests for predicting cognitive decline in older adults

PubMed Central

Baerresen, Kimberly M; Miller, Karen J; Hanson, Eric R; Miller, Justin S; Dye, Richelin V; Hartman, Richard E; Vermeersch, David; Small, Gary W

2015-01-01

Summary Aim To determine neuropsychological tests likely to predict cognitive decline. Methods A sample of nonconverters (n = 106) was compared with those who declined in cognitive status (n = 24). Significant univariate logistic regression prediction models were used to create multivariate logistic regression models to predict decline based on initial neuropsychological testing. Results Rey–Osterrieth Complex Figure Test (RCFT) Retention predicted conversion to mild cognitive impairment (MCI) while baseline Buschke Delay predicted conversion to Alzheimer’s disease (AD). Due to group sample size differences, additional analyses were conducted using a subsample of demographically matched nonconverters. Analyses indicated RCFT Retention predicted conversion to MCI and AD, and Buschke Delay predicted conversion to AD. Conclusion Results suggest RCFT Retention and Buschke Delay may be useful in predicting cognitive decline. PMID:26107318
Assessing NARCCAP climate model effects using spatial confidence regions

PubMed Central

French, Joshua P.; McGinnis, Seth; Schwartzman, Armin

2017-01-01

We assess similarities and differences between model effects for the North American Regional Climate Change Assessment Program (NARCCAP) climate models using varying classes of linear regression models. Specifically, we consider how the average temperature effect differs for the various global and regional climate model combinations, including assessment of possible interaction between the effects of global and regional climate models. We use both pointwise and simultaneous inference procedures to identify regions where global and regional climate model effects differ. We also show conclusively that results from pointwise inference are misleading, and that accounting for multiple comparisons is important for making proper inference. PMID:28936474
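The warning about pointwise inference comes down to how quickly false positives accumulate when many grid cells are tested at once. The sketch below shows the familywise error rate for m tests and the Bonferroni per-test threshold. It assumes independent tests, which spatial grid cells are not, and it is not the simultaneous procedure used by the authors; it only illustrates why some adjustment is needed.

```python
def familywise_error(alpha, m):
    """Chance of at least one false positive among m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

def bonferroni_threshold(alpha, m):
    """Per-test level that keeps the familywise error rate at or below alpha."""
    return alpha / m

for m in (1, 10, 100, 1000):
    print(m, round(familywise_error(0.05, m), 4), bonferroni_threshold(0.05, m))
```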
A Study of the Effect of the Front-End Styling of Sport Utility Vehicles on Pedestrian Head Injuries

PubMed Central

Qin, Qin; Chen, Zheng; Bai, Zhonghao; Cao, Libo

2018-01-01

Background The number of sport utility vehicles (SUVs) on the Chinese market is continuously increasing. It is necessary to investigate the relationships between the front-end styling features of SUVs and head injuries at the styling design stage for improving the pedestrian protection performance and product development efficiency. Methods Styling feature parameters were extracted from the SUV side contour line. Simplified finite element models were then established based on the 78 SUV side contour lines. Pedestrian headform impact simulations were performed and validated. The head injury criterion of 15 ms (HIC15) at four wrap-around distances was obtained. A multiple linear regression analysis method was employed to describe the relationships between the styling feature parameters and the HIC15 at each impact point. Results The relationship between the selected styling features and the HIC15 showed reasonable correlations, and the regression models and the selected independent variables showed statistical significance. Conclusions The regression equations obtained by multiple linear regression can be used to assess the performance of SUV styling in protecting pedestrians' heads and provide styling designers with technical guidance regarding their artistic creations.
A Heckman selection model for the safety analysis of signalized intersections

PubMed Central

Wong, S. C.; Zhu, Feng; Pei, Xin; Huang, Helai; Liu, Youjun

2017-01-01

Purpose The objective of this paper is to provide a new method for estimating crash rate and severity simultaneously. Methods This study explores a Heckman selection model of the crash rate and severity simultaneously at different levels and a two-step procedure is used to investigate the crash rate and severity levels. The first step uses a probit regression model to determine the sample selection process, and the second step develops a multiple regression model to simultaneously evaluate the crash rate and severity for slight injury/kill or serious injury (KSI), respectively. The model uses 555 observations from 262 signalized intersections in the Hong Kong metropolitan area, integrated with information on the traffic flow, geometric road design, road environment, traffic control and any crashes that occurred during two years. Results The results of the proposed two-step Heckman selection model illustrate the necessity of different crash rates for different crash severity levels. Conclusions A comparison with the existing approaches suggests that the Heckman selection model offers an efficient and convenient alternative method for evaluating the safety performance at signalized intersections. PMID:28732050
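In the textbook two-step Heckman procedure, the link between the probit selection step and the outcome regression is the inverse Mills ratio, which is added to the second-step regressors to correct for selection. The sketch below computes that ratio only. It reflects the standard formulation, not necessarily the exact specification used in this study, and the index values are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """lambda(z) = pdf(z) / cdf(z), evaluated at the fitted probit index z."""
    return norm_pdf(z) / norm_cdf(z)

# Hypothetical fitted probit indices for three observations.
for index in (-1.0, 0.0, 1.0):
    print(index, round(inverse_mills(index), 4))
```

Observations with a low selection probability receive a large ratio, so the correction term matters most for the cases that were least likely to enter the sample.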
Prediction equations of forced oscillation technique: the insidious role of collinearity.

PubMed

Narchi, Hassib; AlBlooshi, Afaf

2018-03-27

Many studies have reported reference data for forced oscillation technique (FOT) in healthy children. The prediction equations for FOT parameters were derived from a multivariable regression model examining the effect of age, gender, weight and height on each parameter. As many of these variables are likely to be correlated, collinearity might have affected the accuracy of the model, potentially resulting in misleading, erroneous or difficult to interpret conclusions. The aims of this work were to review all FOT publications in children since 2005 to analyze whether collinearity was considered in the construction of the published prediction equations; to compare these prediction equations with our own study; and to analyse, in our study, how collinearity between the explanatory variables might affect the predicted equations if it was not considered in the model. The results showed that none of the ten reviewed studies had stated whether collinearity was checked for. Half of the reports had also included in their equations variables which are physiologically correlated, such as age, weight and height. The predicted resistance varied by up to 28% amongst these studies. In our study, multicollinearity was identified between the explanatory variables initially considered for the regression model (age, weight and height). Ignoring it would have resulted in inaccuracies in the coefficients of the equation, their signs (positive or negative), their 95% confidence intervals, their significance level and the model goodness of fit. In conclusion, with inaccurately constructed and improperly reported models, understanding the results and reproducing the models for future research might be compromised.
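One common way to check for the collinearity discussed here is the variance inflation factor (VIF). With only two predictors it reduces to 1 / (1 - r^2), where r is their correlation. The sketch below uses that two-predictor case with hypothetical ages and heights; the abstract does not say which diagnostic the authors applied, and with more predictors r^2 would be replaced by the R^2 from regressing each predictor on all the others.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical children: age in years and height in cm (illustrative only).
age = [4, 6, 8, 10, 12]
height = [102, 115, 128, 138, 150]
print(round(vif_two_predictors(age, height), 1))
```

A frequently quoted rule of thumb treats a VIF above about 10 as a sign of serious collinearity; growing children's age and height will usually exceed it by a wide margin.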
Functional Data Analysis Applied to Modeling of Severe Acute Mucositis and Dysphagia Resulting From Head and Neck Radiation Therapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dean, Jamie A., E-mail: jamie.dean@icr.ac.uk; Wong, Kee H.; Gay, Hiram

Purpose: Current normal tissue complication probability modeling using logistic regression suffers from bias and high uncertainty in the presence of highly correlated radiation therapy (RT) dose data. This hinders robust estimates of dose-response associations and, hence, optimal normal tissue–sparing strategies from being elucidated. Using functional data analysis (FDA) to reduce the dimensionality of the dose data could overcome this limitation. Methods and Materials: FDA was applied to modeling of severe acute mucositis and dysphagia resulting from head and neck RT. Functional partial least squares regression (FPLS) and functional principal component analysis were used for dimensionality reduction of the dose-volume histogram data. The reduced dose data were input into functional logistic regression models (functional partial least squares–logistic regression [FPLS-LR] and functional principal component–logistic regression [FPC-LR]) along with clinical data. This approach was compared with penalized logistic regression (PLR) in terms of predictive performance and the significance of treatment covariate–response associations, assessed using bootstrapping. Results: The area under the receiver operating characteristic curve for the PLR, FPC-LR, and FPLS-LR models was 0.65, 0.69, and 0.67, respectively, for mucositis (internal validation) and 0.81, 0.83, and 0.83, respectively, for dysphagia (external validation). The calibration slopes/intercepts for the PLR, FPC-LR, and FPLS-LR models were 1.6/−0.67, 0.45/0.47, and 0.40/0.49, respectively, for mucositis (internal validation) and 2.5/−0.96, 0.79/−0.04, and 0.79/0.00, respectively, for dysphagia (external validation). The bootstrapped odds ratios indicated significant associations between RT dose and severe toxicity in the mucositis and dysphagia FDA models. Cisplatin was significantly associated with severe dysphagia in the FDA models. 
None of the covariates was significantly associated with severe toxicity in the PLR models. Dose levels greater than approximately 1.0 Gy/fraction were most strongly associated with severe acute mucositis and dysphagia in the FDA models. Conclusions: FPLS and functional principal component analysis marginally improved predictive performance compared with PLR and provided robust dose-response associations. FDA is recommended for use in normal tissue complication probability modeling.
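The area under the receiver operating characteristic curve used to compare these models has a simple interpretation: it is the probability that a randomly chosen patient with the toxicity receives a higher predicted risk than a randomly chosen patient without it. The sketch below computes it by direct pairwise comparison; the predicted risks are hypothetical.

```python
def auc(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical predicted risks for patients with and without severe toxicity.
with_toxicity = [0.9, 0.8, 0.4]
without_toxicity = [0.7, 0.3, 0.2]
print(round(auc(with_toxicity, without_toxicity), 3))
```

On this scale 0.5 corresponds to chance and 1.0 to perfect separation, so the reported differences between 0.65, 0.67 and 0.69 are small, in line with the authors' description of a marginal improvement.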
PREDICTION OF MALIGNANT BREAST LESIONS FROM MRI FEATURES: A COMPARISON OF ARTIFICIAL NEURAL NETWORK AND LOGISTIC REGRESSION TECHNIQUES

PubMed Central

McLaren, Christine E.; Chen, Wen-Pin; Nie, Ke; Su, Min-Ying

2009-01-01

Rationale and Objectives Dynamic contrast enhanced MRI (DCE-MRI) is a clinical imaging modality for detection and diagnosis of breast lesions. Analytical methods were compared for diagnostic feature selection and performance of lesion classification to differentiate between malignant and benign lesions in patients. Materials and Methods The study included 43 malignant and 28 benign histologically-proven lesions. Eight morphological parameters, ten gray level co-occurrence matrices (GLCM) texture features, and fourteen Laws’ texture features were obtained using automated lesion segmentation and quantitative feature extraction. Artificial neural network (ANN) and logistic regression analysis were compared for selection of the best predictors of malignant lesions among the normalized features. Results Using ANN, the final four selected features were compactness, energy, homogeneity, and Law_LS, with area under the receiver operating characteristic curve (AUC) = 0.82, and accuracy = 0.76. The diagnostic performance of these 4-features computed on the basis of logistic regression yielded AUC = 0.80 (95% CI, 0.688 to 0.905), similar to that of ANN. The analysis also shows that the odds of a malignant lesion decreased by 48% (95% CI, 25% to 92%) for every increase of 1 SD in the Law_LS feature, adjusted for differences in compactness, energy, and homogeneity. Using logistic regression with z-score transformation, a model comprised of compactness, NRL entropy, and gray level sum average was selected, and it had the highest overall accuracy of 0.75 among all models, with AUC = 0.77 (95% CI, 0.660 to 0.880). When logistic modeling of transformations using the Box-Cox method was performed, the most parsimonious model with predictors, compactness and Law_LS, had an AUC of 0.79 (95% CI, 0.672 to 0.898). Conclusion The diagnostic performance of models selected by ANN and logistic regression was similar. 
The analytic methods were found to be roughly equivalent in terms of predictive ability when a small number of variables were chosen. The robust ANN methodology utilizes a sophisticated non-linear model, while logistic regression analysis provides insightful information to enhance interpretation of the model features. PMID:19409817
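The 48% decrease in odds per 1 SD of Law_LS reported above is a restatement of a logistic regression coefficient. The following is a minimal sketch of that conversion, not the authors' code: the function name is illustrative, and the coefficient is back-calculated from the reported percentage rather than taken from the paper.

```python
import math

def percent_change_in_odds(beta):
    """Percent change in the odds for a one-unit increase in a predictor.

    For a z-scored predictor, one unit is one standard deviation.
    """
    return (math.exp(beta) - 1.0) * 100.0

# Assumption: a 48% decrease in odds corresponds to an odds ratio of 0.52,
# so the implied coefficient on the standardized feature is ln(0.52).
beta_law_ls = math.log(0.52)
print(round(percent_change_in_odds(beta_law_ls), 1))
```

A negative coefficient gives an odds ratio below 1 and therefore a percent decrease; a coefficient of zero leaves the odds unchanged.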

Using Structured Additive Regression Models to Estimate Risk Factors of Malaria: Analysis of 2010 Malawi Malaria Indicator Survey Data

PubMed Central

Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

2014-01-01

Background After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. Methods We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Results Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. Conclusions The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities. PMID:24991915
Using a Guided Machine Learning Ensemble Model to Predict Discharge Disposition following Meningioma Resection.

PubMed

Muhlestein, Whitney E; Akagi, Dallin S; Kallos, Justiss A; Morone, Peter J; Weaver, Kyle D; Thompson, Reid C; Chambless, Lola B

2018-04-01

Objective Machine learning (ML) algorithms are powerful tools for predicting patient outcomes. This study pilots a novel approach to algorithm selection and model creation using prediction of discharge disposition following meningioma resection as a proof of concept. Materials and Methods A diversity of ML algorithms were trained on a single-institution database of meningioma patients to predict discharge disposition. Algorithms were ranked by predictive power and top performers were combined to create an ensemble model. The final ensemble was internally validated on never-before-seen data to demonstrate generalizability. The predictive power of the ensemble was compared with a logistic regression. Further analyses were performed to identify how important variables impact the ensemble. Results Our ensemble model predicted disposition significantly better than a logistic regression (area under the curve of 0.78 and 0.71, respectively, p = 0.01). Tumor size, presentation at the emergency department, body mass index, convexity location, and preoperative motor deficit most strongly influence the model, though the independent impact of individual variables is nuanced. Conclusion Using a novel ML technique, we built a guided ML ensemble model that predicts discharge destination following meningioma resection with greater predictive power than a logistic regression, and that provides greater clinical insight than a univariate analysis. These techniques can be extended to predict many other patient outcomes of interest.
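The comparison above rests on the area under the ROC curve (AUC). As a rough sketch of what that number measures, the AUC can be computed as the share of positive/negative case pairs in which the positive case receives the higher score. The function name and the toy labels and scores below are hypothetical and are not data from the study.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical toy example: three of the four positive/negative pairs are ordered correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; rank-based formulas give the same value more efficiently on large data sets.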
A spatially explicit approach to the study of socio-demographic inequality in the spatial distribution of trees across Boston neighborhoods.

PubMed

Duncan, Dustin T; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A; Arbia, Giuseppe; Castro, Marcia C; White, Kellee; Williams, David R

2014-04-01

The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tract level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran's I for all study variables and in the ordinary least squares (OLS) regression residuals, and computed Spearman correlations, both unadjusted and adjusted for spatial autocorrelation, between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran's I range from 0.24 to 0.86, all P = 0.001), for tree density (Global Moran's I = 0.452, P = 0.001), and in the OLS regression residuals (Global Moran's I range from 0.32 to 0.38, all P < 0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (rS = -0.19; conventional P-value = 0.016; spatially adjusted P-value = 0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (rS = -0.18; conventional P-value = 0.019; spatially adjusted P-value = 0.180).
While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial regression models. Methodologically, our study suggests the need to take into account spatial autocorrelation as findings/conclusions can change when the spatial autocorrelation is ignored. Substantively, our findings suggest no need for policy intervention vis-à-vis trees in Boston, though we hasten to add that replication studies, and more nuanced data on tree quality, age and diversity are needed.
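The Global Moran's I values quoted above summarize how similar neighboring tracts are. A minimal sketch of the statistic follows, assuming a user-supplied spatial weights matrix; the function name, the four-tract chain, and the values are hypothetical, and the sketch does not reproduce the significance testing used in the study.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix.

    weights[i][j] is the weight linking unit i to unit j (0 when not neighbors).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products of deviations over all ordered pairs.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    sum_sq = sum(d * d for d in dev)
    return (n / total_weight) * (cross / sum_sq)

# Hypothetical example: four tracts in a row, each adjacent to its immediate neighbors.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))    # smooth gradient: positive autocorrelation
print(morans_i([1, -1, 1, -1], chain))  # alternating values: negative autocorrelation
```

Applying the same function to regression residuals is how one checks whether an OLS fit has left spatial structure unexplained, which is the motivation the abstract gives for moving to spatial regression.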
Chemokine receptors CXCR2 and CX3CR1 differentially regulate functional responses of bone-marrow endothelial progenitors during atherosclerotic plaque regression

PubMed Central

Herlea-Pana, Oana; Yao, Longbiao; Heuser-Baker, Janet; Wang, Qiongxin; Wang, Qilong; Georgescu, Constantin; Zou, Ming-Hui; Barlic-Dicen, Jana

2015-01-01

Aims Atherosclerosis manifests itself as arterial plaques, which lead to heart attacks or stroke. Treatments supporting plaque regression are therefore aggressively pursued. Studies conducted in models in which hypercholesterolaemia is reversible, such as the Reversa mouse model we have employed in the current studies, will be instrumental for the development of such interventions. Using this model, we have shown that advanced atherosclerosis regression occurs when lipid lowering is used in combination with bone-marrow endothelial progenitor cell (EPC) treatment. However, it remains unclear how EPCs home to regressing plaques and how they augment atherosclerosis reversal. Here we identify molecules that support functional responses of EPCs during plaque resolution. Methods and results Chemokines CXCL1 and CX3CL1 were detected in the vascular wall of atheroregressing Reversa mice, and their cognate receptors CXCR2 and CX3CR1 were observed on adoptively transferred EPCs in circulation. We tested whether CXCL1–CXCR2 and CX3CL1–CX3CR1 axes regulate functional responses of EPCs during plaque reversal. We show that pharmacological inhibition of CXCR2 or CX3CR1, or genetic inactivation of these two chemokine receptors interfered with EPC-mediated advanced atherosclerosis regression. We also demonstrate that CXCR2 directs EPCs to regressing plaques while CX3CR1 controls a paracrine function(s) of these cells. Conclusion CXCR2 and CX3CR1 differentially regulate EPC functional responses during atheroregression. Our study improves understanding of how chemokines and chemokine receptors regulate plaque resolution, which could determine the effectiveness of interventions reducing complications of atherosclerosis. PMID:25765938
Technology diffusion in hospitals: a log odds random effects regression model.

PubMed

Blank, Jos L T; Valdmanis, Vivian G

2015-01-01

This study identifies the factors that affect the diffusion of hospital innovations. We apply a log odds random effects regression model on hospital micro data. We introduce the concept of clustering innovations and the application of a log odds random effects regression model to describe the diffusion of technologies. We distinguish a number of determinants, such as service, physician, and environmental, financial and organizational characteristics of the 60 Dutch hospitals in our sample. On the basis of this data set on Dutch general hospitals over the period 1995-2002, we conclude that there is a relation between a number of determinants and the diffusion of innovations underlining conclusions from earlier research. Positive effects were found on the basis of the size of the hospitals, competition and a hospital's commitment to innovation. It appears that if a policy is developed to further diffuse innovations, the external effects of demand and market competition need to be examined, which would de facto lead to an efficient use of technology. For the individual hospital, instituting an innovations office appears to be the most prudent course of action. © 2013 The Authors. International Journal of Health Planning and Management published by John Wiley & Sons, Ltd.
Bayesian structured additive regression modeling of epidemic data: application to cholera

PubMed Central

2012-01-01

Background A significant interest in spatial epidemiology lies in identifying associated risk factors that enhance the risk of infection. Most studies, however, make no, or limited, use of the spatial structure of the data, as well as possible nonlinear effects of the risk factors. Methods We develop a Bayesian Structured Additive Regression model for cholera epidemic data. Model estimation and inference are based on a fully Bayesian approach via Markov Chain Monte Carlo (MCMC) simulations. The model is applied to cholera epidemic data in the Kumasi Metropolis, Ghana. Proximity to refuse dumps, density of refuse dumps, and proximity to potential cholera reservoirs were modeled as continuous functions; presence of slum settlers and population density were modeled as fixed effects, whereas spatial references to the communities were modeled as structured and unstructured spatial effects. Results We observe that the risk of cholera is associated with slum settlements and high population density. The risk of cholera is equal and lower for communities with fewer refuse dumps, but variable and higher for communities with more refuse dumps. The risk is also lower for communities distant from refuse dumps and potential cholera reservoirs. The results also indicate distinct spatial variation in the risk of cholera infection. Conclusion The study highlights the usefulness of Bayesian semi-parametric regression models in analyzing public health data. These findings could serve as novel information to help health planners and policy makers in making effective decisions to control or prevent cholera epidemics. PMID:22866662
Development of a Bayesian model to estimate health care outcomes in the severely wounded

PubMed Central

Stojadinovic, Alexander; Eberhardt, John; Brown, Trevor S; Hawksworth, Jason S; Gage, Frederick; Tadaki, Douglas K; Forsberg, Jonathan A; Davis, Thomas A; Potter, Benjamin K; Dunne, James R; Elster, E A

2010-01-01

Background: Graphical probabilistic models have the ability to provide insights as to how clinical factors are conditionally related. These models can be used to help us understand factors influencing health care outcomes and resource utilization, and to estimate morbidity and clinical outcomes in trauma patient populations. Study design: Thirty-two combat casualties with severe extremity injuries enrolled in a prospective observational study were analyzed using step-wise machine-learned Bayesian belief network (BBN) and step-wise logistic regression (LR). Models were evaluated using 10-fold cross-validation to calculate area-under-the-curve (AUC) from receiver operating characteristics (ROC) curves. Results: Our BBN showed important associations between various factors in our data set that could not be developed using standard regression methods. Cross-validated ROC curve analysis showed that our BBN model was a robust representation of our data domain and that LR models trained on these findings were also robust: hospital-acquired infection (AUC: LR, 0.81; BBN, 0.79), intensive care unit length of stay (AUC: LR, 0.97; BBN, 0.81), and wound healing (AUC: LR, 0.91; BBN, 0.72) showed strong AUC. Conclusions: A BBN model can effectively represent clinical outcomes and biomarkers in patients hospitalized after severe wounding, and is confirmed by 10-fold cross-validation and further confirmed through logistic regression modeling. The method warrants further development and independent validation in other, more diverse patient populations. PMID:21197361
Flexible Meta-Regression to Assess the Shape of the Benzene–Leukemia Exposure–Response Curve

PubMed Central

Vlaanderen, Jelle; Portengen, Lützen; Rothman, Nathaniel; Lan, Qing; Kromhout, Hans; Vermeulen, Roel

2010-01-01

Background Previous evaluations of the shape of the benzene–leukemia exposure–response curve (ERC) were based on a single set or on small sets of human occupational studies. Integrating evidence from all available studies that are of sufficient quality combined with flexible meta-regression models is likely to provide better insight into the functional relation between benzene exposure and risk of leukemia. Objectives We used natural splines in a flexible meta-regression method to assess the shape of the benzene–leukemia ERC. Methods We fitted meta-regression models to 30 aggregated risk estimates extracted from nine human observational studies and performed sensitivity analyses to assess the impact of a priori assessed study characteristics on the predicted ERC. Results The natural spline showed a supralinear shape at cumulative exposures less than 100 ppm-years, although this model fitted the data only marginally better than a linear model (p = 0.06). Stratification based on study design and jackknifing indicated that the cohort studies had a considerable impact on the shape of the ERC at high exposure levels (> 100 ppm-years) but that predicted risks for the low exposure range (< 50 ppm-years) were robust. Conclusions Although limited by the small number of studies and the large heterogeneity between studies, the inclusion of all studies of sufficient quality combined with a flexible meta-regression method provides the most comprehensive evaluation of the benzene–leukemia ERC to date. The natural spline based on all data indicates a significantly increased risk of leukemia [relative risk (RR) = 1.14; 95% confidence interval (CI), 1.04–1.26] at an exposure level as low as 10 ppm-years. PMID:20064779
Sensitivity Analysis of Mechanical Parameters of Different Rock Layers to the Stability of Coal Roadway in Soft Rock Strata

PubMed Central

Zhao, Zeng-hui; Wang, Wei-ming; Gao, Xin; Yan, Ji-xing

2013-01-01

According to the geological characteristics of Xinjiang Ili mine in the western area of China, a physical model of interstratified strata composed of soft rock and hard coal seam was established. Selecting the tunnel position, deformation modulus, and strength parameters of each layer as influencing factors, the sensitivity coefficient of roadway deformation to each parameter was first analyzed based on a Mohr-Coulomb strain softening model and nonlinear elastic-plastic finite element analysis. Then the effect laws of influencing factors which showed high sensitivity were further discussed. Finally, a regression model for the relationship between roadway displacements and multifactors was obtained by equivalent linear regression under multiple factors. The results show that the roadway deformation is highly sensitive to the depth of coal seam under the floor which should be considered in the layout of coal roadway; deformation modulus and strength of coal seam and floor have a great influence on the global stability of tunnel; on the contrary, roadway deformation is not sensitive to the mechanical parameters of soft roof; roadway deformation under random combinations of multi-factors can be deduced by the regression model. These conclusions provide theoretical significance to the arrangement and stability maintenance of coal roadway. PMID:24459447
Finding gene clusters for a replicated time course study

PubMed Central

2014-01-01

Background Finding genes that share similar expression patterns across samples is an important question that is frequently asked in high-throughput microarray studies. Traditional clustering algorithms such as K-means clustering and hierarchical clustering base gene clustering directly on the observed measurements and do not take into account the specific experimental design under which the microarray data were collected. A new model-based clustering method, the clustering of regression models method, takes into account the specific design of the microarray study and bases the clustering on how genes are related to sample covariates. It can find useful gene clusters for studies from complicated study designs such as replicated time course studies. Findings In this paper, we applied the clustering of regression models method to data from a time course study of yeast on two genotypes, wild type and YOX1 mutant, each with two technical replicates, and compared the clustering results with K-means clustering. We identified gene clusters that have similar expression patterns in wild type yeast, two of which were missed by K-means clustering. We further identified gene clusters whose expression patterns were changed in YOX1 mutant yeast compared to wild type yeast. Conclusions The clustering of regression models method can be a valuable tool for identifying genes that are coordinately transcribed by a common mechanism. PMID:24460656
Estimation of stature from the foot and its segments in a sub-adult female population of North India

PubMed Central

2011-01-01

Background Establishing personal identity is one of the main concerns in forensic investigations. Estimation of stature forms a basic domain of the investigation process in unknown and co-mingled human remains in forensic anthropology case work. The objective of the present study was to set up standards for estimation of stature from the foot and its segments in a sub-adult female population. Methods The sample for the study constituted 149 young females from the Northern part of India. The participants were aged between 13 and 18 years. Besides stature, seven anthropometric measurements that included length of the foot from each toe (T1, T2, T3, T4, and T5 respectively), foot breadth at ball (BBAL) and foot breadth at heel (BHEL) were measured on both feet in each participant using standard methods and techniques. Results The results indicated that statistically significant differences (p < 0.05) between left and right feet occur in both the foot breadth measurements (BBAL and BHEL). Foot length measurements (T1 to T5 lengths) did not show any statistically significant bilateral asymmetry. The correlation between stature and all the foot measurements was found to be positive and statistically significant (p-value < 0.001). Linear regression models and multiple regression models were derived for estimation of stature from the measurements of the foot. The present study indicates that anthropometric measurements of foot and its segments are valuable in the estimation of stature. Foot length measurements estimate stature with greater accuracy when compared to foot breadth measurements. Conclusions The present study concluded that foot measurements have a strong relationship with stature in the sub-adult female population of North India. Hence, the stature of an individual can be successfully estimated from the foot and its segments using different regression models derived in the study. 
The regression models derived in the study may be applied successfully for the estimation of stature in sub-adult females, whenever foot remains are brought for forensic examination. Stepwise multiple regression models tend to estimate stature more accurately than linear regression models in female sub-adults. PMID:22104433
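The stature models described above are linear regressions of stature on foot measurements. As a minimal sketch of the single-predictor case, an ordinary least squares line can be fitted and then used for prediction as shown below. The function names are illustrative and the numbers are made-up placeholders, not measurements or coefficients from the study.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x_new):
    """Predicted response for a new predictor value."""
    return intercept + slope * x_new

# Made-up placeholder data (foot length in cm, stature in cm), for illustration only.
foot_length = [22.0, 23.0, 24.0, 25.0]
stature = [150.0, 154.0, 157.0, 162.0]
intercept, slope = fit_simple_linear_regression(foot_length, stature)
print(predict(intercept, slope, 23.5))
```

A multiple regression extends the same idea to several foot measurements at once, which is consistent with the abstract's remark that the stepwise multiple models tended to be more accurate.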
Forecasting Daily Patient Outflow From a Ward Having No Real-Time Clinical Data

PubMed Central

Tran, Truyen; Luo, Wei; Phung, Dinh; Venkatesh, Svetha

2016-01-01

Background: Modeling patient flow is crucial in understanding resource demand and prioritization. We study patient outflow from an open ward in an Australian hospital, where currently bed allocation is carried out by a manager relying on past experiences and looking at demand. Automatic methods that provide a reasonable estimate of total next-day discharges can aid in efficient bed management. The challenges in building such methods lie in dealing with large amounts of discharge noise introduced by the nonlinear nature of hospital procedures, and the nonavailability of real-time clinical information in wards. Objective Our study investigates different models to forecast the total number of next-day discharges from an open ward having no real-time clinical data. Methods We compared 5 popular regression algorithms to model total next-day discharges: (1) autoregressive integrated moving average (ARIMA), (2) the autoregressive moving average with exogenous variables (ARMAX), (3) k-nearest neighbor regression, (4) random forest regression, and (5) support vector regression. Although the autoregressive integrated moving average model relied on past 3-month discharges, nearest neighbor forecasting used median of similar discharges in the past in estimating next-day discharge. In addition, the ARMAX model used the day of the week and number of patients currently in ward as exogenous variables. For the random forest and support vector regression models, we designed a predictor set of 20 patient features and 88 ward-level features. Results Our data consisted of 12,141 patient visits over 1826 days. Forecasting quality was measured using mean forecast error, mean absolute error, symmetric mean absolute percentage error, and root mean square error. When compared with a moving average prediction model, all 5 models demonstrated superior performance with the random forests achieving 22.7% improvement in mean absolute error, for all days in the year 2014. 
Conclusions In the absence of clinical information, our study recommends using patient-level and ward-level data in predicting next-day discharges. Random forest and support vector regression models are able to use all available features from such data, resulting in superior performance over traditional autoregressive methods. An intelligent estimate of available beds in wards plays a crucial role in relieving access block in emergency departments. PMID:27444059
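The 22.7% figure above expresses a model's mean absolute error relative to a moving average baseline. The sketch below shows one way such a comparison can be set up; the function names, window length, and numbers are hypothetical, and the abstract does not specify the exact baseline configuration the authors used.

```python
def moving_average_forecast(series, window):
    """Forecast each day as the mean of the previous `window` observations.

    Returns forecasts for positions window, window + 1, ..., len(series) - 1.
    """
    return [sum(series[t - window:t]) / window for t in range(window, len(series))]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between observations and forecasts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def relative_improvement(baseline_error, model_error):
    """Percent reduction in error relative to the baseline."""
    return 100.0 * (baseline_error - model_error) / baseline_error

# Hypothetical daily discharge counts and hypothetical model forecasts.
discharges = [10, 12, 14, 16, 18]
window = 2
baseline = moving_average_forecast(discharges, window)
actual = discharges[window:]
model_forecast = [13, 16, 19]  # stand-in for a random forest's output

baseline_mae = mean_absolute_error(actual, baseline)
model_mae = mean_absolute_error(actual, model_forecast)
print(relative_improvement(baseline_mae, model_mae))
```

Both forecasts are scored on the same days (those after the first `window` observations), so the comparison is like for like.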
Automated time series forecasting for biosurveillance.

PubMed

Burkom, Howard S; Murphy, Sean Patrick; Shmueli, Galit

2007-09-30

For robust detection performance, traditional control chart monitoring for biosurveillance is based on input data free of trends, day-of-week effects, and other systematic behaviour. Time series forecasting methods may be used to remove this behaviour by subtracting forecasts from observations to form residuals for algorithmic input. We describe three forecast methods and compare their predictive accuracy on each of 16 authentic syndromic data streams. The methods are (1) a non-adaptive regression model using a long historical baseline, (2) an adaptive regression model with a shorter, sliding baseline, and (3) the Holt-Winters method for generalized exponential smoothing. Criteria for comparing the forecasts were the root-mean-square error, the median absolute per cent error (MedAPE), and the median absolute deviation. The median-based criteria showed best overall performance for the Holt-Winters method. The MedAPE measures over the 16 test series averaged 16.5, 11.6, and 9.7 for the non-adaptive regression, adaptive regression, and Holt-Winters methods, respectively. The non-adaptive regression forecasts were degraded by changes in the data behaviour in the fixed baseline period used to compute model coefficients. The mean-based criterion was less conclusive because of the effects of poor forecasts on a small number of calendar holidays. The Holt-Winters method was also most effective at removing serial autocorrelation, with most 1-day-lag autocorrelation coefficients below 0.15. The forecast methods were compared without tuning them to the behaviour of individual series. We achieved improved predictions with such tuning of the Holt-Winters method, but practical use of such improvements for routine surveillance will require reliable data classification methods.
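The residual step and two of the comparison criteria named above can be sketched as follows. This is an illustration under assumptions: the function names and the toy counts are hypothetical, and MedAPE is taken here to be the median of the absolute errors expressed as a percentage of the observed value, which may differ in detail from the authors' definition.

```python
import math
import statistics

def residuals(observed, forecast):
    """Observation minus forecast: the detrended input for a control chart."""
    return [o - f for o, f in zip(observed, forecast)]

def rmse(observed, forecast):
    """Root-mean-square error; sensitive to a few very poor forecasts."""
    errors = residuals(observed, forecast)
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def medape(observed, forecast):
    """Median absolute percent error; requires nonzero observations."""
    return statistics.median(
        100.0 * abs(o - f) / abs(o) for o, f in zip(observed, forecast)
    )

# Hypothetical daily counts and forecasts.
observed = [100, 200, 50]
forecast = [90, 220, 50]
print(residuals(observed, forecast))
print(rmse(observed, forecast))
print(medape(observed, forecast))
```

The contrast between the two criteria matches the abstract's observation: a handful of badly missed holidays can inflate a mean-based measure such as RMSE while barely moving a median-based one.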
Forecasting space weather: Can new econometric methods improve accuracy?

NASA Astrophysics Data System (ADS)

Reikard, Gordon

2011-06-01

Space weather forecasts are currently used in areas ranging from navigation and communication to electric power system operations. The relevant forecast horizons can range from as little as 24 h to several days. This paper analyzes the predictability of two major space weather measures using new time series methods, many of them derived from econometrics. The data sets are the Ap geomagnetic index and the solar radio flux at 10.7 cm. The methods tested include nonlinear regressions, neural networks, frequency domain algorithms, GARCH models (which utilize the residual variance), state transition models, and models that combine elements of several techniques. While combined models are complex, they can be programmed using modern statistical software. The data frequency is daily, and forecasting experiments are run over horizons ranging from 1 to 7 days. Two major conclusions stand out. First, the frequency domain method forecasts the Ap index more accurately than any time domain model, including both regressions and neural networks. This finding is very robust, and holds for all forecast horizons. Combining the frequency domain method with other techniques yields a further small improvement in accuracy. Second, the neural network forecasts the solar flux more accurately than any other method, although at short horizons (2 days or less) the regression and net yield similar results. The neural net does best when it includes measures of the long-term component in the data.
Disconcordance in Statistical Models of Bisphenol A and Chronic Disease Outcomes in NHANES 2003-08

PubMed Central

Casey, Martin F.; Neidell, Matthew

2013-01-01

Background Bisphenol A (BPA), a high production chemical commonly found in plastics, has drawn great attention from researchers due to the substance’s potential toxicity. Using data from three National Health and Nutrition Examination Survey (NHANES) cycles, we explored the consistency and robustness of BPA’s reported effects on coronary heart disease and diabetes. Methods And Findings We report the use of three different statistical models in the analysis of BPA: (1) logistic regression, (2) log-linear regression, and (3) dose-response logistic regression. In each variation, confounders were added in six blocks to account for demographics, urinary creatinine, source of BPA exposure, healthy behaviours, and phthalate exposure. Results were sensitive to the variations in functional form of our statistical models, but no single model yielded consistent results across NHANES cycles. Reported ORs were also found to be sensitive to inclusion/exclusion criteria. Further, observed effects, which were most pronounced in NHANES 2003-04, could not be explained away by confounding. Conclusions Limitations in the NHANES data and a poor understanding of the mode of action of BPA have made it difficult to develop informative statistical models. Given the sensitivity of effect estimates to functional form, researchers should report results using multiple specifications with different assumptions about BPA measurement, thus allowing for the identification of potential discrepancies in the data. PMID:24223205
A spatial-temporal regression model to predict daily outdoor residential PAH concentrations in an epidemiologic study in Fresno, CA

NASA Astrophysics Data System (ADS)

Noth, Elizabeth M.; Hammond, S. Katharine; Biging, Gregory S.; Tager, Ira B.

2011-05-01

Background Polycyclic aromatic hydrocarbons (PAHs) are generated as a byproduct of combustion, and are associated with respiratory symptoms and increased risk of asthma attacks. Objectives To assign daily, outdoor exposures to participants in the Fresno Asthmatic Children's Environment Study (FACES) using land use regression models for the sum of 4-, 5- and 6-ring PAHs (PAH456). Methods PAH data were collected daily at the EPA Supersite in Fresno, CA from 10/2000 through 2/2007. From 2/2002 to 2/2003, intensive air pollution sampling was conducted at 83 homes of participants in the FACES study. These measurement data were combined with meteorological data, source data, and other spatial variables to form a land use regression model to assign daily exposure at all FACES homes for all years of the study (2001-2008). Results The model for daily, outdoor residential PAH456 concentrations accounted for 80% of the between-home variability and 18% of the within-home variability. Both temporal and spatial variables were significant in the model. Traffic characteristics and home heating fuel were the main spatial explanatory variables. Conclusions Because spatial and temporal distributions of PAHs vary on an intra-urban scale, the location of the child's home within the urban setting plays an important role in the level of exposure that each child has to PAHs.
Multiple Regression Analysis of mRNA-miRNA Associations in Colorectal Cancer Pathway

PubMed Central

Wang, Fengfeng; Wong, S. C. Cesar; Chan, Lawrence W. C.; Cho, William C. S.; Yip, S. P.; Yung, Benjamin Y. M.

2014-01-01

Background. MicroRNA (miRNA) is a short and endogenous RNA molecule that regulates posttranscriptional gene expression. It is an important factor for tumorigenesis of colorectal cancer (CRC), and a potential biomarker for diagnosis, prognosis, and therapy of CRC. Our objective is to identify the related miRNAs and their associations with genes frequently involved in CRC microsatellite instability (MSI) and chromosomal instability (CIN) signaling pathways. Results. A regression model was adopted to identify the significantly associated miRNAs targeting a set of candidate genes frequently involved in colorectal cancer MSI and CIN pathways. Multiple linear regression analysis was used to construct the model and find the significant mRNA-miRNA associations. We identified three significantly associated mRNA-miRNA pairs: BCL2 was positively associated with miR-16 and SMAD4 was positively associated with miR-567 in the CRC tissue, while MSH6 was positively associated with miR-142-5p in the normal tissue. As for the whole model, BCL2 and SMAD4 models were not significant, and MSH6 model was significant. The significant associations were different in the normal and the CRC tissues. Conclusion. Our results have laid down a solid foundation in exploration of novel CRC mechanisms, and identification of miRNA roles as oncomirs or tumor suppressor mirs in CRC. PMID:24895601
Assessing the Impact of Drug Use on Hospital Costs

PubMed Central

Stuart, Bruce C; Doshi, Jalpa A; Terza, Joseph V

2009-01-01

Objective To assess whether outpatient prescription drug utilization produces offsets in the cost of hospitalization for Medicare beneficiaries. Data Sources/Study Setting The study analyzed a sample (N=3,101) of community-dwelling fee-for-service U.S. Medicare beneficiaries drawn from the 1999 and 2000 Medicare Current Beneficiary Surveys. Study Design Using a two-part model specification, we regressed any hospital admission (part 1: probit) and hospital spending by those with one or more admissions (part 2: nonlinear least squares regression) on drug use in a standard model with strong covariate controls and a residual inclusion instrumental variable (IV) model using an exogenous measure of drug coverage as the instrument. Principal Findings The covariate control model predicted that each additional prescription drug used (mean=30) raised hospital spending by $16 (p<.001). The residual inclusion IV model prediction was that each additional prescription fill reduced hospital spending by $104 (p<.001). Conclusions The findings indicate that drug use is associated with cost offsets in hospitalization among Medicare beneficiaries, once omitted variable bias is corrected using an IV technique appropriate for nonlinear applications. PMID:18783453
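A two-part specification of the kind described above combines its parts multiplicatively: expected spending equals the probability of any admission times expected spending among those admitted. The Python sketch below illustrates that arithmetic only. The function names, the coefficient values, and the exponential form assumed for the part-2 conditional mean are hypothetical and are not taken from the study.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, used as the probit link for part 1."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_spending(x, beta, gamma):
    """Unconditional expected spending from a two-part model.

    Part 1 (probit):  P(any admission | x) = Phi(x . beta)
    Part 2 (assumed): E[spending | admitted, x] = exp(x . gamma)
    """
    index_admit = sum(b * xi for b, xi in zip(beta, x))
    index_spend = sum(g * xi for g, xi in zip(gamma, x))
    return normal_cdf(index_admit) * math.exp(index_spend)


# Hypothetical covariates: an intercept and the number of prescription fills.
x = [1.0, 30.0]
beta = [-0.6, 0.02]   # made-up probit coefficients
gamma = [9.0, 0.0]    # made-up part-2 coefficients
spending = expected_spending(x, beta, gamma)
print(round(spending, 2))
```

With these made-up coefficients the probit index is zero, so the admission probability is one half and the result is half of the assumed conditional mean.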
Assessing alternative measures of wealth in health research.

PubMed

Cubbin, Catherine; Pollack, Craig; Flaherty, Brian; Hayward, Mark; Sania, Ayesha; Vallone, Donna; Braveman, Paula

2011-05-01

We assessed whether it would be feasible to replace the standard measure of net worth with simpler measures of wealth in population-based studies examining associations between wealth and health. We used data from the 2004 Survey of Consumer Finances (respondents aged 25-64 years) and the 2004 Health and Retirement Survey (respondents aged 50 years or older) to construct logistic regression models relating wealth to health status and smoking. For our wealth measure, we used the standard measure of net worth as well as 9 simpler measures of wealth, and we compared results among the 10 models. In both data sets and for both health indicators, models using simpler wealth measures generated conclusions about the association between wealth and health that were similar to the conclusions generated by models using net worth. The magnitude and significance of the odds ratios were similar for the covariates in multivariate models, and the model-fit statistics for models using these simpler measures were similar to those for models using net worth. Our findings suggest that simpler measures of wealth may be acceptable in population-based studies of health.
Validation of Statistical Models for Estimating Hospitalization Associated with Influenza and Other Respiratory Viruses

PubMed Central

Chan, King-Pan; Chan, Kwok-Hung; Wong, Wilfred Hing-Sang; Peiris, J. S. Malik; Wong, Chit-Ming

2011-01-01

Background Reliable estimates of disease burden associated with respiratory viruses are keys to deployment of preventive strategies such as vaccination and resource allocation. Such estimates are particularly needed in tropical and subtropical regions where some methods commonly used in temperate regions are not applicable. While a number of alternative approaches to assess the influenza associated disease burden have been recently reported, none of these models have been validated with virologically confirmed data. Even fewer methods have been developed for other common respiratory viruses such as respiratory syncytial virus (RSV), parainfluenza and adenovirus. Methods and Findings We had recently conducted a prospective population-based study of virologically confirmed hospitalization for acute respiratory illnesses in persons <18 years residing in Hong Kong Island. Here we used this dataset to validate two commonly used models for estimation of influenza disease burden, namely the rate difference model and Poisson regression model, and also explored the applicability of these models to estimate the disease burden of other respiratory viruses. The Poisson regression models with different link functions all yielded estimates well correlated with the virologically confirmed influenza associated hospitalization, especially in children older than two years. The disease burden estimates for RSV, parainfluenza and adenovirus were less reliable with wide confidence intervals. The rate difference model was not applicable to RSV, parainfluenza and adenovirus and grossly underestimated the true burden of influenza associated hospitalization. Conclusion The Poisson regression model generally produced satisfactory estimates in calculating the disease burden of respiratory viruses in a subtropical region such as Hong Kong. PMID:21412433

Modeling the milling tool wear by using an evolutionary SVM-based model from milling runs experimental data

NASA Astrophysics Data System (ADS)

Nieto, Paulino José García; García-Gonzalo, Esperanza; Vilán, José Antonio Vilán; Robleda, Abraham Segade

2015-12-01

The main aim of this research work is to build a new practical hybrid regression model to predict the milling tool wear in a regular cut as well as entry cut and exit cut of a milling tool. The model was based on Particle Swarm Optimization (PSO) in combination with support vector machines (SVMs). This optimization mechanism involved kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. Bearing this in mind, a PSO-SVM-based model, which is based on the statistical learning theory, was successfully used here to predict the milling tool flank wear (output variable) as a function of the following input variables: the time duration of experiment, depth of cut, feed, type of material, etc. To accomplish the objective of this study, the experimental dataset represents experiments from runs on a milling machine under various operating conditions. In this way, data sampled by three different types of sensors (acoustic emission sensor, vibration sensor and current sensor) were acquired at several positions. A second aim is to determine the factors with the greatest bearing on the milling tool flank wear with a view to proposing improvements to the milling machine. Firstly, this hybrid PSO-SVM-based regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the flank wear (output variable) and input variables (time, depth of cut, feed, etc.). Indeed, regression with optimal hyperparameters was performed and a determination coefficient of 0.95 was obtained. The agreement of this model with experimental data confirmed its good performance. Secondly, the main advantages of this PSO-SVM-based model are its capacity to produce a simple, easy-to-interpret model, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, the main conclusions of this study are presented.
A Time Series Analysis: Weather Factors, Human Migration and Malaria Cases in Endemic Area of Purworejo, Indonesia, 2005–2014

PubMed Central

REJEKI, Dwi Sarwani Sri; NURHAYATI, Nunung; AJI, Budi; MURHANDARWATI, E. Elsa Herdiana; KUSNANTO, Hari

2018-01-01

Background: Climatic and weather factors are important determinants of the transmission of vector-borne diseases such as malaria. This study aimed to examine the relationships between weather factors and malaria cases, taking into account human migration and previous case findings, in endemic areas of Purworejo during 2005–2014. Methods: This study employed ecological time series analysis using monthly data. The independent variables were the maximum temperature, minimum temperature, maximum humidity, minimum humidity, precipitation, human migration, and previous malaria cases, while the dependent variable was positive malaria cases. Three count data regression models, i.e. the Poisson, quasi-Poisson, and negative binomial models, were applied to measure the relationship. The lowest Akaike Information Criterion (AIC) value was used to identify the best model, and negative binomial regression was considered the best model. Results: The model showed that humidity (lag 2), precipitation (lag 3), precipitation (lag 12), migration (lag 1) and previous malaria cases (lag 12) had a significant relationship with malaria cases. Conclusion: Weather, migration and previous malaria cases need to be considered as prominent indicators when projecting increases in malaria cases. PMID:29900134
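Selecting a model by AIC, as described in the Methods above, amounts to computing AIC = 2k − 2 ln L for each candidate and keeping the smallest value. The Python sketch below shows that step only. The log-likelihoods and parameter counts are invented for illustration and are not results from the study, and only the two candidates with a full likelihood are included.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical fitted models: name -> (maximized log-likelihood, parameter count).
candidates = {
    "poisson": (-512.4, 8),
    "negative_binomial": (-430.1, 9),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # model with the lowest AIC

for name, value in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {value:.1f}")
print("selected:", best)
```

The extra dispersion parameter of the negative binomial model costs two AIC points here, which the made-up gain in log-likelihood easily outweighs.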
Secure Logistic Regression Based on Homomorphic Encryption: Design and Evaluation

PubMed Central

Song, Yongsoo; Wang, Shuang; Xia, Yuhou; Jiang, Xiaoqian

2018-01-01

Background Learning a model without accessing raw data has been an intriguing idea to security and machine learning researchers for years. In an ideal setting, we want to encrypt sensitive data to store them on a commercial cloud and run certain analyses without ever decrypting the data to preserve privacy. Homomorphic encryption technique is a promising candidate for secure data outsourcing, but it is a very challenging task to support real-world machine learning tasks. Existing frameworks can only handle simplified cases with low-degree polynomials such as linear means classifier and linear discriminative analysis. Objective The goal of this study is to provide a practical support to the mainstream learning models (eg, logistic regression). Methods We adapted a novel homomorphic encryption scheme optimized for real numbers computation. We devised (1) the least squares approximation of the logistic function for accuracy and efficiency (ie, reduce computation cost) and (2) new packing and parallelization techniques. Results Using real-world datasets, we evaluated the performance of our model and demonstrated its feasibility in speed and memory consumption. For example, it took approximately 116 minutes to obtain the training model from the homomorphically encrypted Edinburgh dataset. In addition, it gives fairly accurate predictions on the testing dataset. Conclusions We present the first homomorphically encrypted logistic regression outsourcing model based on the critical observation that the precision loss of classification models is sufficiently small so that the decision plan stays still. PMID:29666041
Genetic analysis of milk production traits of Tunisian Holsteins using random regression test-day model with Legendre polynomials

PubMed Central

2018-01-01

Objective The objective of this study was to estimate genetic parameters of milk, fat, and protein yields within and across lactations in Tunisian Holsteins using a random regression test-day (TD) model. Methods A random regression multiple trait multiple lactation TD model was used to estimate genetic parameters in the Tunisian dairy cattle population. Data were TD yields of milk, fat, and protein from the first three lactations. Random regressions were modeled with third-order Legendre polynomials for the additive genetic, and permanent environment effects. Heritabilities, and genetic correlations were estimated by Bayesian techniques using the Gibbs sampler. Results All variance components tended to be high in the beginning and the end of lactations. Additive genetic variances for milk, fat, and protein yields were the lowest and were the least variable compared to permanent variances. Heritability values tended to increase with parity. Estimates of heritabilities for 305-d yield-traits were low to moderate, 0.14 to 0.2, 0.12 to 0.17, and 0.13 to 0.18 for milk, fat, and protein yields, respectively. Within-parity, genetic correlations among traits were up to 0.74. Genetic correlations among lactations for the yield traits were relatively high and ranged from 0.78±0.01 to 0.82±0.03, between the first and second parities, from 0.73±0.03 to 0.8±0.04 between the first and third parities, and from 0.82±0.02 to 0.84±0.04 between the second and third parities. Conclusion These results are comparable to previously reported estimates on the same population, indicating that the adoption of a random regression TD model as the official genetic evaluation for production traits in Tunisia, as developed by most Interbull countries, is possible in the Tunisian Holsteins. PMID:28823122
Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time‐to‐Event Analysis

PubMed Central

Gong, Xiajing; Hu, Meng

2018-01-01

Abstract Additional value can be potentially created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time‐to‐event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high‐dimensional data featured by a large number of predictor variables. Our results showed that ML‐based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high‐dimensional data. The prediction performances of ML‐based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML‐based methods provide a powerful tool for time‐to‐event analysis, with a built‐in capacity for high‐dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. PMID:29536640
Transmission Risks of Schistosomiasis Japonica: Extraction from Back-propagation Artificial Neural Network and Logistic Regression Model

PubMed Central

Xu, Jun-Fang; Xu, Jing; Li, Shi-Zhu; Jia, Tia-Wu; Huang, Xi-Bao; Zhang, Hua-Ming; Chen, Mei; Yang, Guo-Jing; Gao, Shu-Jing; Wang, Qing-Yun; Zhou, Xiao-Nong

2013-01-01

Background The transmission of schistosomiasis japonica in a local setting is still poorly understood in the lake regions of the People's Republic of China (P. R. China), and its transmission patterns are closely related to human, social and economic factors. Methodology/Principal Findings We aimed to apply the integrated approach of artificial neural network (ANN) and logistic regression model in assessment of transmission risks of Schistosoma japonicum with epidemiological data collected from 2339 villagers from 1247 households in six villages of Jiangling County, P.R. China. By using the back-propagation (BP) of the ANN model, 16 factors out of 27 factors were screened, and the top five factors ranked by the absolute value of mean impact value (MIV) were mainly related to human behavior, i.e. integration of water contact history and infection history, family with past infection, history of water contact, infection history, and infection times. The top five factors screened by the logistic regression model were mainly related to the social economics, i.e. village level, economic conditions of family, age group, education level, and infection times. The risk of human infection with S. japonicum is higher in the population who are at age 15 or younger, or with lower education, or with the higher infection rate of the village, or with poor family, and in the population with more than one time to be infected. Conclusion/Significance Both BP artificial neural network and logistic regression model established in a small scale suggested that individual behavior and socioeconomic status are the most important risk factors in the transmission of schistosomiasis japonica. It was reviewed that the young population (≤15) in higher-risk areas was the main target to be intervened for the disease transmission control. PMID:23556015
Pattern Recognition Analysis of Age-Related Retinal Ganglion Cell Signatures in the Human Eye

PubMed Central

Yoshioka, Nayuta; Zangerl, Barbara; Nivison-Smith, Lisa; Khuu, Sieu K.; Jones, Bryan W.; Pfeiffer, Rebecca L.; Marc, Robert E.; Kalloniatis, Michael

2017-01-01

Purpose To characterize macular ganglion cell layer (GCL) changes with age and provide a framework to assess changes in ocular disease. This study used data clustering to analyze macular GCL patterns from optical coherence tomography (OCT) in a large cohort of subjects without ocular disease. Methods Single eyes of 201 patients evaluated at the Centre for Eye Health (Sydney, Australia) were retrospectively enrolled (age range, 20–85); 8 × 8 grid locations obtained from Spectralis OCT macular scans were analyzed with unsupervised classification into statistically separable classes sharing common GCL thickness and change with age. The resulting classes and gridwise data were fitted with linear and segmented linear regression curves. Additionally, normalized data were analyzed to determine regression as a percentage. Accuracy of each model was examined through comparison of predicted 50-year-old equivalent macular GCL thickness for the entire cohort to a true 50-year-old reference cohort. Results Pattern recognition clustered GCL thickness across the macula into five to eight spatially concentric classes. F-test demonstrated segmented linear regression to be the most appropriate model for macular GCL change. The pattern recognition–derived and normalized model revealed less difference between the predicted macular GCL thickness and the reference cohort (average ± SD 0.19 ± 0.92 and −0.30 ± 0.61 μm) than a gridwise model (average ± SD 0.62 ± 1.43 μm). Conclusions Pattern recognition successfully identified statistically separable macular areas that undergo a segmented linear reduction with age. This regression model better predicted macular GCL thickness. The various unique spatial patterns revealed by pattern recognition combined with core GCL thickness data provide a framework to analyze GCL loss in ocular disease. PMID:28632847
Hazard Regression Models of Early Mortality in Trauma Centers

PubMed Central

Clark, David E; Qian, Jing; Winchell, Robert J; Betensky, Rebecca A

2013-01-01

Background Factors affecting early hospital deaths after trauma may be different from factors affecting later hospital deaths, and the distribution of short and long prehospital times may vary among hospitals. Hazard regression (HR) models may therefore be more useful than logistic regression (LR) models for analysis of trauma mortality, especially when treatment effects at different time points are of interest. Study Design We obtained data for trauma center patients from the 2008–9 National Trauma Data Bank (NTDB). Cases were included if they had complete data for prehospital times, hospital times, survival outcome, age, vital signs, and severity scores. Cases were excluded if pulseless on admission, transferred in or out, or ISS<9. Using covariates proposed for the Trauma Quality Improvement Program and an indicator for each hospital, we compared LR models predicting survival at 8 hours after injury to HR models with survival censored at 8 hours. HR models were then modified to allow time-varying hospital effects. Results 85,327 patients in 161 hospitals met inclusion criteria. Crude hazards peaked initially, then steadily declined. When hazard ratios were assumed constant in HR models, they were similar to odds ratios in LR models associating increased mortality with increased age, firearm mechanism, increased severity, more deranged physiology, and estimated hospital-specific effects. However, when hospital effects were allowed to vary by time, HR models demonstrated that hospital outliers were not the same at different times after injury. Conclusions HR models with time-varying hazard ratios reveal inconsistencies in treatment effects, data quality, and/or timing of early death among trauma centers. HR models are generally more flexible than LR models, can be adapted for censored data, and potentially offer a better tool for analysis of factors affecting early death after injury. PMID:23036828
How is the weather? Forecasting inpatient glycemic control

PubMed Central

Saulnier, George E; Castro, Janna C; Cook, Curtiss B; Thompson, Bithika M

2017-01-01

Aim: Apply methods of damped trend analysis to forecast inpatient glycemic control. Method: Observed and calculated point-of-care blood glucose data trends were determined over 62 weeks. Mean absolute percent error was used to calculate differences between observed and forecasted values. Comparisons were drawn between model results and linear regression forecasting. Results: The forecasted mean glucose trends observed during the first 24 and 48 weeks of projections compared favorably to the results provided by linear regression forecasting. However, in some scenarios, the damped trend method changed inferences compared with linear regression. In all scenarios, mean absolute percent error values remained below the 10% accepted by demand industries. Conclusion: Results indicate that forecasting methods historically applied within demand industries can project future inpatient glycemic control. Additional study is needed to determine if forecasting is useful in the analyses of other glucometric parameters and, if so, how to apply the techniques to quality improvement. PMID:29134125
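The accuracy measure named in the Method above, mean absolute percent error (MAPE), averages the absolute forecast errors expressed as a percentage of the observed values. A minimal Python sketch follows. The glucose values and the function name are hypothetical, and the 10% figure is the threshold quoted in the abstract.

```python
def mape(observed, forecast):
    """Mean absolute percent error between observed and forecast values."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("observed values must be non-zero")
    total = sum(abs((o - f) / o) for o, f in zip(observed, forecast))
    return 100.0 * total / len(observed)


# Hypothetical weekly mean glucose values (not data from the study).
observed = [200.0, 100.0, 160.0, 150.0]
forecast = [190.0, 105.0, 160.0, 156.0]

error_pct = mape(observed, forecast)
print(f"MAPE = {error_pct:.1f}%")
print("within 10% threshold:", error_pct < 10.0)
```

Because each error is scaled by its observed value, MAPE is undefined when an observation is zero, which is why the sketch rejects such input.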
Sentinel node status prediction by four statistical models: results from a large bi-institutional series (n = 1132).

PubMed

Mocellin, Simone; Thompson, John F; Pasquali, Sandro; Montesco, Maria C; Pilati, Pierluigi; Nitti, Donato; Saw, Robyn P; Scolyer, Richard A; Stretch, Jonathan R; Rossi, Carlo R

2009-12-01

To improve selection for sentinel node (SN) biopsy (SNB) in patients with cutaneous melanoma using statistical models predicting SN status. About 80% of patients currently undergoing SNB are node negative. In the absence of conclusive evidence of a SNB-associated survival benefit, these patients may be over-treated. Here, we tested the efficiency of 4 different models in predicting SN status. The clinicopathologic data (age, gender, tumor thickness, Clark level, regression, ulceration, histologic subtype, and mitotic index) of 1132 melanoma patients who had undergone SNB at institutions in Italy and Australia were analyzed. Logistic regression, classification tree, random forest, and support vector machine models were fitted to the data. The predictive models were built with the aim of maximizing the negative predictive value (NPV) and reducing the rate of SNB procedures though minimizing the error rate. After cross-validation logistic regression, classification tree, random forest, and support vector machine predictive models obtained clinically relevant NPV (93.6%, 94.0%, 97.1%, and 93.0%, respectively), SNB reduction (27.5%, 29.8%, 18.2%, and 30.1%, respectively), and error rates (1.8%, 1.8%, 0.5%, and 2.1%, respectively). Using commonly available clinicopathologic variables, predictive models can preoperatively identify a proportion of patients (approximately 25%) who might be spared SNB, with an acceptable (1%-2%) error. If validated in large prospective series, these models might be implemented in the clinical setting for improved patient selection, which ultimately would lead to better quality of life for patients and optimization of resource allocation for the health care system.
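The three reported metrics appear to be linked arithmetically. Assuming that SNB reduction is the share of all patients predicted node negative, and that the error rate is the share of all patients who are predicted negative but are actually node positive, the error rate should equal SNB reduction × (1 − NPV). These definitions are an inference from the figures and are not stated in the abstract. The Python sketch below checks the reported numbers against that relation.

```python
# Reported values from the abstract: (NPV %, SNB reduction %, error rate %).
reported = {
    "logistic regression": (93.6, 27.5, 1.8),
    "classification tree": (94.0, 29.8, 1.8),
    "random forest": (97.1, 18.2, 0.5),
    "support vector machine": (93.0, 30.1, 2.1),
}


def implied_error_rate(npv_pct: float, reduction_pct: float) -> float:
    """Error rate implied by the assumed definitions, in percent of all patients."""
    return reduction_pct * (100.0 - npv_pct) / 100.0


implied = {
    model: implied_error_rate(npv, reduction)
    for model, (npv, reduction, _) in reported.items()
}

for model, (_, _, error) in reported.items():
    print(f"{model}: implied {implied[model]:.2f}%, reported {error:.1f}%")
```

Under these assumed definitions, each implied value agrees with the reported error rate to within rounding.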
Machine Learning Algorithms Outperform Conventional Regression Models in Predicting Development of Hepatocellular Carcinoma

PubMed Central

Singal, Amit G.; Mukherjee, Ashin; Elmunzer, B. Joseph; Higgins, Peter DR; Lok, Anna S.; Zhu, Ji; Marrero, Jorge A; Waljee, Akbar K

2015-01-01

Background Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine learning algorithms. Methods We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared to the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics. Results After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95%CI 0.56-0.67), whereas the machine learning algorithm had a c-statistic of 0.64 (95%CI 0.60–0.69) in the validation cohort. The machine learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (p<0.001) and integrated discrimination improvement (p=0.04). The HALT-C model had a c-statistic of 0.60 (95%CI 0.50-0.70) in the validation cohort and was outperformed by the machine learning algorithm (p=0.047). Conclusion Machine learning algorithms improve the accuracy of risk stratifying patients with cirrhosis and can be used to accurately identify patients at high-risk for developing HCC. PMID:24169273
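The c-statistic used above to compare models is, for a binary outcome, the proportion of (case, non-case) pairs in which the case received the higher predicted risk, with ties counted as one half. The Python sketch below computes it by direct pair counting. The scores and outcomes are invented for illustration, and the sketch ignores follow-up time and censoring, which a full analysis of time-to-HCC data would need to handle.

```python
def c_statistic(scores, outcomes):
    """Concordance (area under the ROC curve) for a binary outcome."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0   # case ranked above non-case
            elif case_score == control_score:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and observed outcomes (1 = developed the event).
risk = [0.9, 0.4, 0.6, 0.4, 0.2]
event = [1, 1, 0, 0, 0]

print(c_statistic(risk, event))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which puts the reported values of 0.60 to 0.64 in context.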
Breast Arterial Calcification Is Associated with Reproductive Factors in Asymptomatic Postmenopausal Women

PubMed Central

Whaley, Dana H.; Sheedy, Patrick F.; Peyser, Patricia A.

2010-01-01

Abstract Objective The etiology of breast arterial calcification (BAC) is not well understood. We examined reproductive history and cardiovascular disease (CVD) risk factor associations with the presence of detectable BAC in asymptomatic postmenopausal women. Methods Reproductive history and CVD risk factors were obtained in 240 asymptomatic postmenopausal women from a community-based research study who had a screening mammogram within 2 years of their participation in the study. The mammograms were reviewed for the presence of detectable BAC. Age-adjusted logistic regression models were fit to assess the association between each risk factor and the presence of BAC. Multiple variable logistic regression models were used to identify the most parsimonious model for the presence of BAC. Results The prevalence of BAC increased with increased age (p < 0.0001). The most parsimonious logistic regression model for BAC presence included age at time of examination, increased parity (p = 0.01), earlier age at first birth (p = 0.002), weight, and an age-by-weight interaction term (p = 0.004). Older women with a smaller body size had a higher probability of having BAC than women of the same age with a larger body size. Conclusions The presence or absence of BAC at mammography may provide an assessment of a postmenopausal woman's lifetime estrogen exposure and indicate women who could be at risk for hormonally related conditions. PMID:20629578
Representational change and strategy use in children's number line estimation during the first years of primary school

PubMed Central

2012-01-01

Background The objective of this study was to scrutinize number line estimation behaviors displayed by children in mathematics classrooms during the first three years of schooling. We extend existing research by not only mapping potential logarithmic-linear shifts but also providing a new perspective by studying in detail the estimation strategies for individual target digits within a number range familiar to children. Methods Typically developing children (n = 67) from Years 1-3 completed a number-to-position numerical estimation task (0-20 number line). Estimation behaviors were first analyzed via logarithmic and linear regression modeling. Subsequently, using an analysis of variance we compared the estimation accuracy of each digit, thus identifying target digits that were estimated with the assistance of arithmetic strategy. Results Our results further confirm a developmental logarithmic-linear shift when utilizing regression modeling; however, uniquely we have identified that children employ variable strategies when completing numerical estimation, with levels of strategy advancing with development. Conclusion In terms of the existing cognitive research, this strategy factor highlights the limitations of any regression modeling approach, or alternatively, it could underpin the developmental time course of the logarithmic-linear shift. Future studies need to systematically investigate this relationship and also consider the implications for educational practice. PMID:22217191
Prediction of pulmonary hypertension in idiopathic pulmonary fibrosis

PubMed Central

Zisman, David A.; Ross, David J.; Belperio, John A.; Saggar, Rajan; Lynch, Joseph P.; Ardehali, Abbas; Karlamangla, Arun S.

2007-01-01

Summary Background Reliable, noninvasive approaches to the diagnosis of pulmonary hypertension in idiopathic pulmonary fibrosis are needed. We tested the hypothesis that the forced vital capacity to diffusing capacity ratio and room air resting pulse oximetry may be combined to predict mean pulmonary artery pressure (MPAP) in idiopathic pulmonary fibrosis. Methods Sixty-one idiopathic pulmonary fibrosis patients with available right-heart catheterization were studied. We regressed measured MPAP as a continuous variable on pulse oximetry (SpO2) and percent predicted forced vital capacity (FVC) to percent-predicted diffusing capacity ratio (% FVC/% DLco) in a multivariable linear regression model. Results Linear regression generated the following equation: $\mathrm{MPAP} = -11.9 + 0.272 \times \mathrm{SpO_2} + 0.0659 \times (100 - \mathrm{SpO_2})^2 + 3.06 \times (\%\,\mathrm{FVC}/\%\,\mathrm{DLco})$; adjusted $R^2 = 0.55$, p<0.0001. The sensitivity, specificity, positive predictive and negative predictive value of model-predicted pulmonary hypertension were 71% (95% confidence interval (CI): 50–89%), 81% (95% CI: 68–92%), 71% (95% CI: 51–87%) and 81% (95% CI: 68–94%). Conclusions A pulmonary hypertension predictor based on room air resting pulse oximetry and FVC to diffusing capacity ratio has a relatively high negative predictive value. However, this model will require external validation before it can be used in clinical practice. PMID:17604151
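To make the reported regression equation concrete, the Python sketch below evaluates it for one set of inputs. It reads the term after the parenthesis as (100 − SpO2) squared. The function name and the example inputs are hypothetical, and the abstract itself notes that the model requires external validation.

```python
def predict_mpap(spo2: float, fvc_pct: float, dlco_pct: float) -> float:
    """Evaluate the reported equation for mean pulmonary artery pressure.

    spo2     : room air resting pulse oximetry, in percent
    fvc_pct  : percent-predicted forced vital capacity
    dlco_pct : percent-predicted diffusing capacity
    """
    if dlco_pct <= 0:
        raise ValueError("dlco_pct must be positive")
    ratio = fvc_pct / dlco_pct
    return (
        -11.9
        + 0.272 * spo2
        + 0.0659 * (100.0 - spo2) ** 2
        + 3.06 * ratio
    )


# Hypothetical inputs: SpO2 of 90%, FVC 60% predicted, DLco 30% predicted.
example = predict_mpap(spo2=90.0, fvc_pct=60.0, dlco_pct=30.0)
print(round(example, 2))
```

For these inputs the four terms are −11.9, 24.48, 6.59 and 6.12, so the quadratic oxygen-desaturation term and the FVC/DLco ratio contribute about equally.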
The Use of Linear Instrumental Variables Methods in Health Services Research and Health Economics: A Cautionary Note

PubMed Central

Terza, Joseph V; Bradford, W David; Dismuke, Clara E

2008-01-01

Objective To investigate potential bias in the use of the conventional linear instrumental variables (IV) method for the estimation of causal effects in inherently nonlinear regression settings. Data Sources Smoking Supplement to the 1979 National Health Interview Survey, National Longitudinal Alcohol Epidemiologic Survey, and simulated data. Study Design Potential bias from the use of the linear IV method in nonlinear models is assessed via simulation studies and real world data analyses in two commonly encountered regression settings: (1) models with a nonnegative outcome (e.g., a count) and a continuous endogenous regressor; and (2) models with a binary outcome and a binary endogenous regressor. Principal Findings The simulation analyses show that substantial bias in the estimation of causal effects can result from applying the conventional IV method in inherently nonlinear regression settings. Moreover, the bias is not attenuated as the sample size increases. This point is further illustrated in the survey data analyses in which IV-based estimates of the relevant causal effects diverge substantially from those obtained with appropriate nonlinear estimation methods. Conclusions We offer this research as a cautionary note to those who would opt for the use of linear specifications in inherently nonlinear settings involving endogeneity. PMID:18546544
Socioeconomic Disparities in Telephone-Based Treatment of Tobacco Dependence

PubMed Central

Varghese, Merilyn; Stitzer, Maxine; Landes, Reid; Brackman, S. Laney; Munn, Tiffany

2014-01-01

Objectives. We examined socioeconomic disparities in tobacco dependence treatment outcomes from a free, proactive telephone counseling quitline. Methods. We delivered cognitive–behavioral treatment and nicotine patches to 6626 smokers and examined socioeconomic differences in demographic, clinical, environmental, and treatment use factors. We used logistic regressions and generalized estimating equations (GEE) to model abstinence and account for socioeconomic differences in the models. Results. The odds of achieving long-term abstinence differed by socioeconomic status (SES). In the GEE model, the odds of abstinence for the highest SES participants were 1.75 times those of the lowest SES participants. Logistic regression models revealed no treatment outcome disparity at the end of treatment, but significant disparities 3 and 6 months after treatment. Conclusions. Although quitlines often increase access to treatment for some lower SES smokers, significant socioeconomic disparities in treatment outcomes raise questions about whether current approaches are contributing to tobacco-related socioeconomic health disparities. Strategies to improve treatment outcomes for lower SES smokers might include novel methods to address multiple factors associated with socioeconomic disparities. PMID:24922165
Weight Fluctuation and Postmenopausal Breast Cancer in the National Health and Nutrition Examination Survey I Epidemiologic Follow-Up Study

PubMed Central

Komaroff, Marina

2016-01-01

Objective. The aim of this study is to investigate if weight fluctuation is an independent risk factor for postmenopausal breast cancer (PBC) among women who gained weight in adult years. Methods. NHANES I Epidemiologic Follow-Up Study (NHEFS) database was used in the study. Women who were cancer-free at enrollment and diagnosed for the first time with breast cancer at age 50 or greater were considered cases. Controls were chosen from the subset of cancer-free women and matched to cases by years of follow-up and status of body mass index (BMI) at 25 years of age. Weight fluctuation was measured by the root-mean-square-error (RMSE) from a simple linear regression model for each woman with their body mass index (BMI) regressed on age (started at 25 years) while women with the positive slope from this regression were defined as weight gainers. Data were analyzed using conditional logistic regression models. Results. A total of 158 women were included in the study. The conditional logistic regression adjusted for weight gain demonstrated a positive association between weight fluctuation in adult years and postmenopausal breast cancer (odds ratio/OR = 1.67; 95% confidence interval/CI: 1.06–2.66). Conclusions. The data suggested that long-term weight fluctuation was a significant risk factor for PBC among women who gained weight in adult years. This finding underscores the importance of maintaining lost weight and avoiding weight fluctuation. PMID:26953120
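The per-woman fluctuation measure described above (RMSE of BMI regressed on age, with a positive slope marking a weight gainer) can be sketched as follows. The sketch assumes the RMSE uses the usual residual degrees of freedom, n − 2, which the abstract does not specify. The function name and the BMI series are hypothetical.

```python
import math

def bmi_trend(ages, bmis):
    """Fit bmi ~ intercept + slope * age by least squares.

    Returns (slope, rmse). The RMSE divides the residual sum of squares
    by n - 2, an assumption about the definition used in the study.
    """
    n = len(ages)
    mean_age = sum(ages) / n
    mean_bmi = sum(bmis) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (b - mean_bmi) for a, b in zip(ages, bmis))
    slope = sxy / sxx
    intercept = mean_bmi - slope * mean_age
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(ages, bmis))
    return slope, math.sqrt(sse / (n - 2))

# Hypothetical woman measured at four ages starting at 25.
ages = [25, 35, 45, 55]
slope, fluctuation = bmi_trend(ages, [22, 24, 23, 27])
is_weight_gainer = slope > 0

# A steadily rising BMI has the same kind of slope but essentially no fluctuation.
steady_slope, steady_fluctuation = bmi_trend(ages, [22, 23, 24, 25])
```

Separating slope from RMSE is what lets the analysis treat overall gain and fluctuation around the trend as distinct exposures.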
Using an autologistic regression model to identify spatial risk factors and spatial risk patterns of hand, foot and mouth disease (HFMD) in Mainland China

PubMed Central

2014-01-01

Background There have been large-scale outbreaks of hand, foot and mouth disease (HFMD) in Mainland China over the last decade. These events varied greatly across the country. It is necessary to identify the spatial risk factors and spatial distribution patterns of HFMD for public health control and prevention. Climate risk factors associated with HFMD occurrence have been recognized. However, few studies discussed the socio-economic determinants of HFMD risk at a spatial scale. Methods HFMD records in Mainland China in May 2008 were collected. Both climate and socio-economic factors were selected as potential risk exposures of HFMD. Odds ratio (OR) was used to identify the spatial risk factors. A spatial autologistic regression model was employed to get OR values of each exposure and model the spatial distribution patterns of HFMD risk. Results Results showed that both climate and socio-economic variables were spatial risk factors for HFMD transmission in Mainland China. The statistically significant risk factors are monthly average precipitation (OR = 1.4354), monthly average temperature (OR = 1.379), monthly average wind speed (OR = 1.186), the number of industrial enterprises above designated size (OR = 17.699), the population density (OR = 1.953), and the proportion of student population (OR = 1.286). The spatial autologistic regression model shows good goodness of fit (ROC = 0.817) and prediction accuracy (Correct ratio = 78.45%) of HFMD occurrence. The autologistic regression model also reduces the contribution of the residual term in the ordinary logistic regression model significantly, from 17.25 to 1.25 for the odds ratio. Based on the prediction results of the spatial model, we obtained a map of the probability of HFMD occurrence that shows the spatial distribution pattern and local epidemic risk over Mainland China. Conclusions The autologistic regression model was used to identify spatial risk factors and model spatial risk patterns of HFMD. HFMD occurrences were found to be spatially heterogeneous over Mainland China, which is related to both the climate and socio-economic variables. The combination of socio-economic and climate exposures can explain the HFMD occurrences more comprehensively and objectively than those with only climate exposures. The modeled probability of HFMD occurrence at the county level reveals not only the spatial trends, but also the local details of epidemic risk, even in the regions where there were no HFMD case records. PMID:24731248
Utility of an Abbreviated Dizziness Questionnaire to Differentiate between Causes of Vertigo and Guide Appropriate Referral: A Multicenter Prospective Blinded Study

PubMed Central

Roland, Lauren T.; Kallogjeri, Dorina; Sinks, Belinda C.; Rauch, Steven D.; Shepard, Neil T.; White, Judith A.; Goebel, Joel A.

2015-01-01

Objective Test performance of a focused dizziness questionnaire’s ability to discriminate between peripheral and non-peripheral causes of vertigo. Study Design Prospective multi-center Setting Four academic centers with experienced balance specialists Patients New dizzy patients Interventions A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Main outcomes Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and non-peripheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. Results 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and non-peripheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central and other causes were considered good as measured by c-indices of 0.75, 0.7 and 0.78, respectively. Conclusions This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from non-peripheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. 
Clinical utility of this questionnaire to guide specialty referral is discussed. PMID:26485598
Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

PubMed Central

Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

2013-01-01

Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

Prevention and treatment of atherosclerosis with flaxseed-derived compound secoisolariciresinol diglucoside.

PubMed

Prasad, Kailash; Jadhav, Ashok

2016-01-01

Atherosclerosis is the primary cause of coronary artery disease, heart attack, strokes, and peripheral vascular disease. Alternative/complementary medicines, although not accepted by the medical community, may be of great help in suppression, slowing of progression and regression of atherosclerosis. Numerous natural products are in use for therapy in spite of lack of evidence. This paper discusses the basic mechanism of atherosclerosis, risk factors for atherosclerosis, and prevention, slowing of progression and regression of atherosclerosis with flaxseed-derived secoisolariciresinol diglucoside (SDG). SDG content of flaxseed varies from 6 mg/g to 18 mg/g. Flaxseed is the richest source of SDG. SDG possesses antioxidant, antihypertensive, antidiabetic, hypolipidemic, anti-inflammatory and antiatherogenic activities. SDG content of some commonly used food has been described. SDG in very low dose (15 mg/kg) suppressed the development of hypercholesterolemic atherosclerosis by 73% and this effect was associated with reduction in serum total cholesterol, LDL-C, and oxidative stress, and an increase in the levels of HDL-C. A summary of the effects of flaxseed and its components on hypercholesterolemic atherosclerosis has been provided. Reduction in hypercholesterolemic atherosclerosis by flaxseed, CDC-flaxseed, flaxseed oil, flax lignan complex and SDG are 46%, 69%, 0%, 34% and 73%, respectively, in the dietary cholesterol-induced rabbit model of atherosclerosis. SDG slows the progression of atherosclerosis in animal model. Long-term use of SDG regresses hypercholesterolemic atherosclerosis. It is interesting that regular diet following high cholesterol diet accelerates in this animal model of atherosclerosis. In conclusion, SDG suppresses, slows the progression of, and regresses atherosclerosis. It could serve as an alternative medicine for the prevention, slowing of progression and regression of atherosclerosis and hence for the treatment of coronary artery disease, stroke and peripheral arterial vascular diseases.
Use of generalized ordered logistic regression for the analysis of multidrug resistance data.

PubMed

Agga, Getahun E; Scott, H Morgan

2015-10-01

Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.
A spatially explicit approach to the study of socio-demographic inequality in the spatial distribution of trees across Boston neighborhoods

PubMed Central

Duncan, Dustin T.; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A.; Arbia, Giuseppe; Castro, Marcia C.; White, Kellee; Williams, David R.

2017-01-01

The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tract level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran’s I for all study variables and in the ordinary least squares (OLS) regression residuals as well as computed Spearman correlations non-adjusted and adjusted for spatial autocorrelation between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran’s I range from 0.24 to 0.86, all P=0.001), for tree density (Global Moran’s I=0.452, P=0.001), and in the OLS regression residuals (Global Moran’s I range from 0.32 to 0.38, all P<0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (rS=−0.19; conventional P-value=0.016; spatially adjusted P-value=0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (rS=−0.18; conventional P-value=0.019; spatially adjusted P-value=0.180). 
While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial regression models. Methodologically, our study suggests the need to take into account spatial autocorrelation as findings/conclusions can change when the spatial autocorrelation is ignored. Substantively, our findings suggest no need for policy intervention vis-à-vis trees in Boston, though we hasten to add that replication studies, and more nuanced data on tree quality, age and diversity are needed. PMID:29354668
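The Global Moran's I statistic used in this study follows a standard formula, I = (n / W) × Σᵢⱼ wᵢⱼ zᵢ zⱼ / Σᵢ zᵢ², where z are deviations from the mean and W is the sum of all spatial weights. The sketch below is a minimal pure-Python version of that formula; the four-tract layout, the binary adjacency weights and the values are hypothetical, and the study's own weights specification and significance testing are not reproduced here.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (zero on the diagonal). Positive results indicate that similar
    values cluster together; negative results indicate alternation.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / w_total) * (cross / sum(d * d for d in z))

# Hypothetical example: four tracts in a row, each adjacent to its neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
gradient_i = morans_i([1, 2, 3, 4], adjacency)       # smooth gradient
alternating_i = morans_i([1, -1, 1, -1], adjacency)  # checkerboard pattern
```

Applying the same function to OLS residuals is how one would check whether a non-spatial model has left spatial structure unexplained, which is the motivation the abstract gives for moving to spatial regression.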
The relationship between the c-statistic of a risk-adjustment model and the accuracy of hospital report cards: A Monte Carlo study

PubMed Central

Austin, Peter C.; Reeves, Mathew J.

2015-01-01

Background Hospital report cards, in which outcomes following the provision of medical or surgical care are compared across health care providers, are being published with increasing frequency. Essential to the production of these reports is risk-adjustment, which allows investigators to account for differences in the distribution of patient illness severity across different hospitals. Logistic regression models are frequently used for risk-adjustment in hospital report cards. Many applied researchers use the c-statistic (equivalent to the area under the receiver operating characteristic curve) of the logistic regression model as a measure of the credibility and accuracy of hospital report cards. Objectives To determine the relationship between the c-statistic of a risk-adjustment model and the accuracy of hospital report cards. Research Design Monte Carlo simulations were used to examine this issue. We examined the influence of three factors on the accuracy of hospital report cards: the c-statistic of the logistic regression model used for risk-adjustment, the number of hospitals, and the number of patients treated at each hospital. The parameters used to generate the simulated datasets came from analyses of patients hospitalized with a diagnosis of acute myocardial infarction in Ontario, Canada. Results The c-statistic of the risk-adjustment model had, at most, a very modest impact on the accuracy of hospital report cards, whereas the number of patients treated at each hospital had a much greater impact. Conclusions The c-statistic of a risk-adjustment model should not be used to assess the accuracy of a hospital report card. PMID:23295579
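For readers unfamiliar with the c-statistic discussed above, the sketch below computes it as the proportion of (event, non-event) patient pairs in which the patient with the event received the higher predicted risk, counting ties as one half. This pairwise definition is equivalent to the area under the ROC curve. The predicted risks and outcomes are hypothetical.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance (c) statistic for a binary outcome.

    outcomes uses 1 for an event and 0 for no event. Every pair made of
    one event and one non-event is scored 1 if the event has the higher
    predicted risk, 0.5 if the risks tie, and 0 otherwise.
    """
    events = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    score = 0.0
    for e in events:
        for ne in non_events:
            if e > ne:
                score += 1.0
            elif e == ne:
                score += 0.5
    return score / (len(events) * len(non_events))

# Hypothetical predicted risks for four patients, two of whom had the event.
c_value = c_statistic([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
uninformative = c_statistic([0.5, 0.5, 0.5, 0.5], [0, 0, 1, 1])
```

The statistic measures how well individual patients are ranked, which helps explain why it need not track the accuracy of hospital-level comparisons.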
Effect of temperature and precipitation on salmonellosis cases in South-East Queensland, Australia: an observational study

PubMed Central

Barnett, Adrian Gerard

2016-01-01

Objective Foodborne illnesses in Australia, including salmonellosis, are estimated to cost over $A1.25 billion annually. The weather has been identified as being influential on salmonellosis incidence, as cases increase during summer, however time series modelling of salmonellosis is challenging because outbreaks cause strong autocorrelation. This study assesses whether switching models is an improved method of estimating weather–salmonellosis associations. Design We analysed weather and salmonellosis in South-East Queensland between 2004 and 2013 using 2 common regression models and a switching model, each with 21-day lags for temperature and precipitation. Results The switching model best fit the data, as judged by its substantial improvement in deviance information criterion over the regression models, less autocorrelated residuals and control of seasonality. The switching model estimated a 5°C increase in mean temperature and 10 mm precipitation were associated with increases in salmonellosis cases of 45.4% (95% CrI 40.4%, 50.5%) and 24.1% (95% CrI 17.0%, 31.6%), respectively. Conclusions Switching models improve on traditional time series models in quantifying weather–salmonellosis associations. A better understanding of how temperature and precipitation influence salmonellosis may identify where interventions can be made to lower the health and economic costs of salmonellosis. PMID:26916693
Bitterness intensity prediction of berberine hydrochloride using an electronic tongue and a GA-BP neural network.

PubMed

Liu, Ruixin; Zhang, Xiaodong; Zhang, Lu; Gao, Xiaojie; Li, Huiling; Shi, Junhan; Li, Xuelin

2014-06-01

The aim of this study was to predict the bitterness intensity of a drug using an electronic tongue (e-tongue). The model drug of berberine hydrochloride was used to establish a bitterness prediction model (BPM), based on the taste evaluation of bitterness intensity by a taste panel, the data provided by the e-tongue and a genetic algorithm-back-propagation neural network (GA-BP) modeling method. The modeling characteristics of the GA-BP were compared with those of multiple linear regression, partial least square regression and BP methods. The determination coefficient of the BPM was 0.99965±0.00004, the root mean square error of cross-validation was 0.1398±0.0488 and the correlation coefficient of the cross-validation between the true and predicted values was 0.9959±0.0027. The model is superior to the other three models based on these indicators. In conclusion, the model established in this study has a high fitting degree and may be used for the bitterness prediction modeling of berberine hydrochloride of different concentrations. The model also provides a reference for the generation of BPMs of other drugs. Additionally, the algorithm of the study is able to conduct a rapid and accurate quantitative analysis of the data provided by the e-tongue.
Bitterness intensity prediction of berberine hydrochloride using an electronic tongue and a GA-BP neural network

PubMed Central

LIU, RUIXIN; ZHANG, XIAODONG; ZHANG, LU; GAO, XIAOJIE; LI, HUILING; SHI, JUNHAN; LI, XUELIN

2014-01-01

The aim of this study was to predict the bitterness intensity of a drug using an electronic tongue (e-tongue). The model drug of berberine hydrochloride was used to establish a bitterness prediction model (BPM), based on the taste evaluation of bitterness intensity by a taste panel, the data provided by the e-tongue and a genetic algorithm-back-propagation neural network (GA-BP) modeling method. The modeling characteristics of the GA-BP were compared with those of multiple linear regression, partial least square regression and BP methods. The determination coefficient of the BPM was 0.99965±0.00004, the root mean square error of cross-validation was 0.1398±0.0488 and the correlation coefficient of the cross-validation between the true and predicted values was 0.9959±0.0027. The model is superior to the other three models based on these indicators. In conclusion, the model established in this study has a high fitting degree and may be used for the bitterness prediction modeling of berberine hydrochloride of different concentrations. The model also provides a reference for the generation of BPMs of other drugs. Additionally, the algorithm of the study is able to conduct a rapid and accurate quantitative analysis of the data provided by the e-tongue. PMID:24926369
Genetic analyses of partial egg production in Japanese quail using multi-trait random regression models.

PubMed

Karami, K; Zerehdaran, S; Barzanooni, B; Lotfi, E

2017-12-01

1. The aim of the present study was to estimate genetic parameters for average egg weight (EW) and egg number (EN) at different ages in Japanese quail using multi-trait random regression (MTRR) models. 2. A total of 8534 records from 900 quail, hatched between 2014 and 2015, were used in the study. Average weekly egg weights and egg numbers were measured from second until sixth week of egg production. 3. Nine random regression models were compared to identify the best order of the Legendre polynomials (LP). The optimal model was identified by the Bayesian Information Criterion. A model with second order of LP for fixed effects, second order of LP for additive genetic effects and third order of LP for permanent environmental effects (MTRR23) was found to be the best. 4. According to the MTRR23 model, direct heritability for EW increased from 0.26 in the second week to 0.53 in the sixth week of egg production, whereas the ratio of permanent environment to phenotypic variance decreased from 0.48 to 0.1. Direct heritability for EN was low, whereas the ratio of permanent environment to phenotypic variance decreased from 0.57 to 0.15 during the production period. 5. For each trait, estimated genetic correlations among weeks of egg production were high (from 0.85 to 0.98). Genetic correlations between EW and EN were low and negative for the first two weeks, but they were low and positive for the rest of the egg production period. 6. In conclusion, random regression models can be used effectively for analysing egg production traits in Japanese quail. Response to selection for increased egg weight would be higher at older ages because of its higher heritability and such a breeding program would have no negative genetic impact on egg production.
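Random regression models of this kind use Legendre polynomials of a standardized time variable as covariates. The sketch below shows how week of egg production (weeks 2 to 6 in this study) could be mapped to [−1, 1] and expanded into the first three Legendre polynomials. It uses the unnormalized polynomials and counts "order" from the constant term upward; both are assumptions, since conventions differ between software packages and the abstract does not say which was used. The function names are hypothetical.

```python
def standardize_week(week, first_week=2, last_week=6):
    """Map week of production linearly onto the interval [-1, 1]."""
    return -1.0 + 2.0 * (week - first_week) / (last_week - first_week)

def legendre_covariates(x):
    """Unnormalized Legendre polynomials P0, P1 and P2 evaluated at x."""
    return [1.0, x, 0.5 * (3.0 * x * x - 1.0)]

# One covariate row per week of egg production.
design_rows = {week: legendre_covariates(standardize_week(week)) for week in range(2, 7)}
```

Each animal's additive genetic and permanent environmental effects are then modelled as its own set of coefficients on these covariates, which is what allows heritability to change from week to week.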
Nitrogen dioxide concentrations in neighborhoods adjacent to a commercial airport: a land use regression modeling study

PubMed Central

2010-01-01

Background There is growing concern in communities surrounding airports regarding the contribution of various emission sources (such as aircraft and ground support equipment) to nearby ambient concentrations. We used extensive monitoring of nitrogen dioxide (NO2) in neighborhoods surrounding T.F. Green Airport in Warwick, RI, and land-use regression (LUR) modeling techniques to determine the impact of proximity to the airport and local traffic on these concentrations. Methods Palmes diffusion tube samplers were deployed along the airport's fence line and within surrounding neighborhoods for one to two weeks. In total, 644 measurements were collected over three sampling campaigns (October 2007, March 2008 and June 2008) and each sampling location was geocoded. GIS-based variables were created as proxies for local traffic and airport activity. A forward stepwise regression methodology was employed to create general linear models (GLMs) of NO2 variability near the airport. The effect of local meteorology on associations with GIS-based variables was also explored. Results Higher concentrations of NO2 were seen near the airport terminal, entrance roads to the terminal, and near major roads, with qualitatively consistent spatial patterns between seasons. In our final multivariate model (R2 = 0.32), the local influences of highways and arterial/collector roads were statistically significant, as were local traffic density and distance to the airport terminal (all p < 0.001). Local meteorology did not significantly affect associations with principal GIS variables, and the regression model structure was robust to various model-building approaches. Conclusion Our study has shown that there are clear local variations in NO2 in the neighborhoods that surround an urban airport, which are spatially consistent across seasons. 
LUR modeling demonstrated a strong influence of local traffic, except the smallest roads that predominate in residential areas, as well as proximity to the airport terminal. PMID:21083910
Comparison of Prediction Model for Cardiovascular Autonomic Dysfunction Using Artificial Neural Network and Logistic Regression Analysis

PubMed Central

Zeng, Fangfang; Li, Zhongtao; Yu, Xiaoling; Zhou, Linuo

2013-01-01

Background This study aimed to develop the artificial neural network (ANN) and multivariable logistic regression (LR) analyses for prediction modeling of cardiovascular autonomic (CA) dysfunction in the general population, and compare the prediction models using the two approaches. Methods and Materials We analyzed a previous dataset based on a Chinese population sample consisting of 2,092 individuals aged 30–80 years. The prediction models were derived from an exploratory set using ANN and LR analysis, and were tested in the validation set. Performances of these prediction models were then compared. Results Univariate analysis indicated that 14 risk factors showed statistically significant association with the prevalence of CA dysfunction (P<0.05). The mean area under the receiver-operating curve was 0.758 (95% CI 0.724–0.793) for LR and 0.762 (95% CI 0.732–0.793) for ANN analysis, and a noninferiority result was found (P<0.001). Similar results were found in comparisons of sensitivity, specificity, and predictive values in the prediction models between the LR and ANN analyses. Conclusion The prediction models for CA dysfunction were developed using ANN and LR. ANN and LR are two effective tools for developing prediction models based on our dataset. PMID:23940593
Inferring gene regression networks with model trees

PubMed Central

2010-01-01

Background Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful in order to determine whether two genes have a strong global similarity but do not detect local similarities. Results We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph from all the relationships among output and input genes is built taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as control to compare the results to that of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they fit different linear regressions to separate areas of the search space, favoring the inference of localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. PMID:20950452
The process and utility of classification and regression tree methodology in nursing research

PubMed Central

Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

2014-01-01

Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048
Sources of Biased Inference in Alcohol and Drug Services Research: An Instrumental Variable Approach

PubMed Central

Schmidt, Laura A.; Tam, Tammy W.; Larson, Mary Jo

2012-01-01

Objective: This study examined the potential for biased inference due to endogeneity when using standard approaches for modeling the utilization of alcohol and drug treatment. Method: Results from standard regression analysis were compared with those that controlled for endogeneity using instrumental variables estimation. Comparable models predicted the likelihood of receiving alcohol treatment based on the widely used Aday and Andersen medical care–seeking model. Data were from the National Epidemiologic Survey on Alcohol and Related Conditions and included a representative sample of adults in households and group quarters throughout the contiguous United States. Results: Findings suggested that standard approaches for modeling treatment utilization are prone to bias because of uncontrolled reverse causation and omitted variables. Compared with instrumental variables estimation, standard regression analyses produced downwardly biased estimates of the impact of alcohol problem severity on the likelihood of receiving care. Conclusions: Standard approaches for modeling service utilization are prone to underestimating the true effects of problem severity on service use. Biased inference could lead to inaccurate policy recommendations, for example, by suggesting that people with milder forms of substance use disorder are more likely to receive care than is actually the case. PMID:22152672
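The downward bias described in this abstract can be illustrated with a minimal sketch of a single-instrument estimator. Everything here is a hypothetical toy construction: the variable names, the data, and the true effect of 2 are assumptions chosen so that an omitted confounder `u` pulls the ordinary regression slope below the true value, while the instrument `z` (uncorrelated with `u` by construction) recovers it. The study's actual instruments and models are not described in the abstract.

```python
def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Instrumental variables (Wald) slope with one instrument z."""
    return cov(z, y) / cov(z, x)


# Hypothetical toy data: u is an unobserved confounder, z is the instrument.
z = [1, -1, 1, -1]
u = [1, 1, -1, -1]                            # uncorrelated with z by construction
x = [zi + ui for zi, ui in zip(z, u)]         # "problem severity"
y = [2 * xi - 3 * ui for xi, ui in zip(x, u)]  # true effect of x on y is 2

print(ols_slope(x, y))    # biased downward by the omitted confounder
print(iv_slope(z, x, y))  # recovers the true effect in this toy example
```

The sketch omits covariates and standard errors; it only shows why the two estimators can disagree in the direction the authors report.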
Prediction of Return-to-original-work after an Industrial Accident Using Machine Learning and Comparison of Techniques

PubMed Central

2018-01-01

Background Many studies have tried to develop predictors for return-to-work (RTW). However, since complex factors have been demonstrated to predict RTW, it is difficult to use them practically. This study investigated whether factors used in previous studies could predict whether an individual had returned to his/her original work by four years after termination of the worker's recovery period. Methods An initial logistic regression analysis of 1,567 participants of the fourth Panel Study of Worker's Compensation Insurance yielded odds ratios. The participants were divided into two subsets, a training dataset and a test dataset. Using the training dataset, logistic regression, decision tree, random forest, and support vector machine models were established, and important variables of each model were identified. The predictive abilities of the different models were compared. Results The analysis showed that only earned income and company-related factors significantly affected return-to-original-work (RTOW). The random forest model showed the best accuracy among the tested machine learning models; however, the difference was not prominent. Conclusion It is possible to predict a worker's probability of RTOW using machine learning techniques with moderate accuracy. PMID:29736160
Analysis of Market Opportunities for Chinese Private Express Delivery Industry

NASA Astrophysics Data System (ADS)

Jiang, Changbing; Bai, Lijun; Tong, Xiaoqing

China's express delivery market has become an arena in which express enterprises compete intensely because of its huge potential demand and highly profitable prospects. Qualitative and quantitative forecasts of future changes in China's express delivery market will therefore help enterprises understand market conditions and changes in social demand, and adjust their business activities in time to enhance their competitiveness. The development of China's express delivery industry is first introduced in this chapter. Then the theoretical basis of the regression model is reviewed. We also predict the demand trends of China's express delivery market by using Pearson correlation analysis and regression analysis from qualitative and quantitative aspects, respectively. Finally, we draw some conclusions and recommendations for China's express delivery industry.
Reconstruction of the Foot and Ankle Using Pedicled or Free Flaps: Perioperative Flap Survival Analysis

PubMed Central

Li, Xiucun; Cui, Jianli; Maharjan, Suraj; Lu, Laijin; Gong, Xu

2016-01-01

Objective The purpose of this study is to determine the correlation between non-technical risk factors and the perioperative flap survival rate and to evaluate the choice of skin flap for the reconstruction of the foot and ankle. Methods This was a clinical retrospective study. Nine variables were identified. The Kaplan-Meier method coupled with a log-rank test and a Cox regression model was used to predict the risk factors that influence the perioperative flap survival rate. The relationship between postoperative wound infection and risk factors was also analyzed using a logistic regression model. Results The overall flap survival rate was 85.42%. The necrosis rates of free flaps and pedicled flaps were 5.26% and 20.69%, respectively. According to the Cox regression model, flap type (hazard ratio [HR] = 2.592; 95% confidence interval [CI] (1.606, 4.184); P < 0.001) and postoperative wound infection (HR = 0.266; 95% CI (0.134, 0.529); P < 0.001) were found to be statistically significant risk factors associated with flap necrosis. Based on the logistic regression model, preoperative wound bed inflammation (odds ratio [OR] = 11.371; 95% CI (3.117, 41.478); P < 0.001) was a statistically significant risk factor for postoperative wound infection. Conclusion Flap type and postoperative wound infection were both independent risk factors influencing the flap survival rate in the foot and ankle. However, postoperative wound infection was a risk factor for the pedicled flap but not for the free flap. Microvascular anastomosis is a major cause of free flap necrosis. To reconstruct complex or wide soft tissue defects of the foot or ankle, free flaps are safer and more reliable than pedicled flaps and should thus be the primary choice. PMID:27930679
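The hazard ratio and odds ratio intervals quoted in this abstract have the shape of Wald intervals, which are symmetric on the log scale. The sketch below assumes that construction (the abstract does not say how the intervals were computed) and uses two hypothetical helpers: `se_from_ci` back-calculates the standard error of the log ratio from a reported interval, and `ratio_ci` rebuilds the interval from a point estimate and that standard error.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(ratio) implied by a Wald 95% interval (assumed form)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def ratio_ci(ratio, se_log, z=1.96):
    """Wald interval for an odds or hazard ratio: exp(log(ratio) +/- z * SE)."""
    log_r = math.log(ratio)
    return math.exp(log_r - z * se_log), math.exp(log_r + z * se_log)


# Reported values: OR = 11.371, 95% CI (3.117, 41.478).
se = se_from_ci(3.117, 41.478)
low, high = ratio_ci(11.371, se)
print(round(se, 3), round(low, 3), round(high, 3))
```

Under this assumption the rebuilt interval agrees with the reported one to within rounding, which is a quick consistency check a reader can apply to ratio estimates in other abstracts as well.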
Household Debt and Relation to Intimate Partner Violence and Husbands' Attitudes Toward Gender Norms: A Study Among Young Married Couples in Rural Maharashtra, India

PubMed Central

Donta, Balaiah; Dasgupta, Anindita; Ghule, Mohan; Battala, Madhusudana; Nair, Saritha; Silverman, Jay G.; Jadhav, Arun; Palaye, Prajakta; Saggurti, Niranjan; Raj, Anita

2015-01-01

Objective Evidence has linked economic hardship with increased intimate partner violence (IPV) perpetration among males. However, less is known about how economic debt or gender norms related to men's roles in relationships or the household, which often underlie IPV perpetration, intersect in or may explain these associations. We assessed the intersection of economic debt, attitudes toward gender norms, and IPV perpetration among married men in India. Methods Data were from the evaluation of a family planning intervention among young married couples (n=1,081) in rural Maharashtra, India. Crude and adjusted logistic regression models for dichotomous outcome variables and linear regression models for continuous outcomes were used to examine debt in relation to husbands' attitudes toward gender-based norms (i.e., beliefs supporting IPV and beliefs regarding male dominance in relationships and the household), as well as sexual and physical IPV perpetration. Results Twenty percent of husbands reported debt. In adjusted linear regression models, debt was associated with husbands' attitudes supportive of IPV (b=0.015, p=0.004) and norms supporting male dominance in relationships and the household (b=0.006, p=0.003). In logistic regression models adjusted for relevant demographics, debt was associated with perpetration of physical IPV (adjusted odds ratio [AOR] = 1.4, 95% confidence interval [CI] 1.1, 1.9) and sexual IPV (AOR=1.6, 95% CI 1.1, 2.1) from husbands. These findings related to debt and relation to IPV were slightly attenuated when further adjusted for men's attitudes toward gender norms. Conclusion Findings suggest the need for combined gender equity and economic promotion interventions to address high levels of debt and related IPV reported among married couples in rural India. PMID:26556938
The Detection and Quantification of Adulteration in Ground Roasted Asian Palm Civet Coffee Using Near-Infrared Spectroscopy in Tandem with Chemometrics

NASA Astrophysics Data System (ADS)

Suhandy, D.; Yulia, M.; Ogawa, Y.; Kondo, N.

2018-05-01

In the present research, near infrared (NIR) spectroscopy in tandem with full spectrum partial least squares (FS-PLS) regression was evaluated for quantifying the degree of adulteration in civet coffee. A total of 126 ground roasted coffee samples with degrees of adulteration of 0-51% were prepared. Spectral data were acquired using a NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement in the range of 1300-2500 nm. The samples were divided into two groups: a calibration sample set (84 samples) and a prediction sample set (42 samples). The calibration model was developed on original spectra using FS-PLS regression with the full cross-validation method. The calibration model exhibited a determination coefficient of R2=0.96 for calibration and R2=0.92 for validation. The prediction resulted in a low root mean square error of prediction (RMSEP) (4.67%) and a high ratio prediction to deviation (RPD) (3.75). In conclusion, the degree of adulteration in civet coffee has been quantified successfully by using NIR spectroscopy and FS-PLS regression, a non-destructive, economical, precise, and highly sensitive method that requires very simple sample preparation.
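The two prediction statistics quoted in this abstract can be sketched as follows. The formula for RPD is an assumption: it is taken here as the sample standard deviation of the reference values in the prediction set divided by the RMSEP, which is a common convention in chemometrics but is not spelled out in the abstract. The function names and the toy adulteration values are hypothetical.

```python
import math


def rmsep(actual, predicted):
    """Root mean square error of prediction."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def rpd(actual, predicted):
    """Assumed definition: SD of reference values divided by RMSEP."""
    return sample_sd(actual) / rmsep(actual, predicted)


# Hypothetical degrees of adulteration (%) and model predictions.
actual = [0, 10, 20, 30, 40, 50]
predicted = [2, 8, 22, 28, 42, 48]
print(rmsep(actual, predicted), rpd(actual, predicted))
```

Under the same assumed definition, the reported RMSEP of 4.67% and RPD of 3.75 would imply a reference standard deviation of roughly 17.5% in the prediction set.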
Beyond Reading Alone: The Relationship Between Aural Literacy And Asthma Management

PubMed Central

Rosenfeld, Lindsay; Rudd, Rima; Emmons, Karen M.; Acevedo-García, Dolores; Martin, Laurie; Buka, Stephen

2010-01-01

Objectives To examine the relationship between literacy and asthma management with a focus on the oral exchange. Methods Study participants, all of whom reported asthma, were drawn from the New England Family Study (NEFS), an examination of links between education and health. NEFS data included reading, oral (speaking), and aural (listening) literacy measures. An additional survey was conducted with this group of study participants related to asthma issues, particularly asthma management. Data analysis focused on bivariate and multivariable logistic regression. Results In bivariate logistic regression models exploring aural literacy, there was a statistically significant association between those participants with lower aural literacy skills and less successful asthma management (OR:4.37, 95%CI:1.11, 17.32). In multivariable logistic regression analyses, controlling for gender, income, and race in separate models (one-at-a-time), there remained a statistically significant association between those participants with lower aural literacy skills and less successful asthma management. Conclusion Lower aural literacy skills seem to complicate asthma management capabilities. Practice Implications Greater attention to the oral exchange, in particular the listening skills highlighted by aural literacy, as well as other related literacy skills may help us develop strategies for clear communication related to asthma management. PMID:20399060
Assessment of triglyceride and cholesterol in overweight people based on multiple linear regression and artificial intelligence model.

PubMed

Ma, Jing; Yu, Jiong; Hao, Guangshu; Wang, Dan; Sun, Yanni; Lu, Jianxin; Cao, Hongcui; Lin, Feiyan

2017-02-20

The prevalence of hyperlipemia is increasing around the world. Our aims are to analyze the relationship of triglyceride (TG) and cholesterol (TC) with indexes of liver function and kidney function, and to develop a prediction model of TG and TC in overweight people. A total of 302 healthy adult subjects and 273 overweight subjects were enrolled in this study. The fasting levels of TG (fs-TG), TC (fs-TC), blood glucose, liver function indexes, and kidney function indexes were measured and analyzed by correlation analysis and multiple linear regression (MRL). The back propagation artificial neural network (BP-ANN) was applied to develop prediction models of fs-TG and fs-TC. The results showed there were significant differences in biochemical indexes between healthy people and overweight people. The correlation analysis showed fs-TG was related to weight, height, blood glucose, and indexes of liver and kidney function, while fs-TC was correlated with age and indexes of liver function (P < 0.01). The MRL analysis indicated that the regression equations of fs-TG and fs-TC were both statistically significant (P < 0.01) when the independent indexes were included. The BP-ANN model of fs-TG reached its training goal at 59 epochs, while the fs-TC model achieved high prediction accuracy after training for 1000 epochs. In conclusion, fs-TG and fs-TC were strongly related to weight, height, age, blood glucose, and indexes of liver function and kidney function. Based on related variables, fs-TG and fs-TC can be predicted by BP-ANN models in overweight people.

Stata Modules for Calculating Novel Predictive Performance Indices for Logistic Models

PubMed Central

Barkhordari, Mahnaz; Padyab, Mojgan; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza

2016-01-01

Background Prediction is a fundamental part of prevention of cardiovascular diseases (CVD). The development of prediction algorithms based on multivariate regression models began several decades ago. In parallel with the development of predictive models, biomarker research has emerged on an impressively large scale. The key question is how best to assess and quantify the improvement in risk prediction offered by new biomarkers or, more basically, how to assess the performance of a risk prediction model. Discrimination, calibration, and added predictive value have recently been suggested for comparing the predictive performance of models with and without novel biomarkers. Objectives Lack of user-friendly statistical software has restricted implementation of novel model assessment methods while examining novel biomarkers. We therefore intended to develop user-friendly software that could be used by researchers with few programming skills. Materials and Methods We have written a Stata command that is intended to help researchers obtain the cut point-free and cut point-based net reclassification improvement index (NRI) and the relative and absolute integrated discriminatory improvement index (IDI) for logistic-based regression analyses. We applied the commands to real data on women participating in the Tehran lipid and glucose study (TLGS) to examine whether information on family history of premature CVD, waist circumference, and fasting plasma glucose can improve the predictive performance of the Framingham “general CVD risk” algorithm. Results The command is addpred for logistic regression models. Conclusions The Stata package provided herein can encourage the use of novel methods in examining the predictive capacity of the ever-emerging plethora of novel biomarkers. PMID:27279830
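To make the indices named in this abstract concrete, here is a minimal Python sketch of the cut point-free NRI and the absolute and relative IDI, following their usual textbook definitions. It is not the authors' addpred command and does not reproduce its options or output; the function names and the toy risks are hypothetical.

```python
def continuous_nri(old, new, event):
    """Cut point-free NRI: net upward moves among events
    plus net downward moves among non-events."""
    n_event = sum(event)
    n_nonevent = len(event) - n_event
    net_event = 0
    net_nonevent = 0
    for o, n, e in zip(old, new, event):
        direction = (n > o) - (n < o)  # +1 up, -1 down, 0 unchanged
        if e:
            net_event += direction
        else:
            net_nonevent -= direction
    return net_event / n_event + net_nonevent / n_nonevent


def discrimination_slope(risk, event):
    """Mean predicted risk in events minus mean predicted risk in non-events."""
    ev = [r for r, e in zip(risk, event) if e]
    non = [r for r, e in zip(risk, event) if not e]
    return sum(ev) / len(ev) - sum(non) / len(non)


def idi(old, new, event):
    """Return (absolute IDI, relative IDI)."""
    s_old = discrimination_slope(old, event)
    s_new = discrimination_slope(new, event)
    return s_new - s_old, s_new / s_old - 1.0


# Hypothetical predicted risks without and with a new biomarker.
event = [1, 1, 1, 0, 0, 0]
old_risk = [0.5, 0.4, 0.6, 0.3, 0.5, 0.2]
new_risk = [0.7, 0.5, 0.5, 0.2, 0.4, 0.3]
print(continuous_nri(old_risk, new_risk, event))
print(idi(old_risk, new_risk, event))
```

Positive values of both indices indicate that the extended model moves predicted risks in the right direction on balance; confidence intervals and tests are left out of this sketch.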
Predicting Grade 3 Acute Diarrhea During Radiation Therapy for Rectal Cancer Using a Cutoff-Dose Logistic Regression Normal Tissue Complication Probability Model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Robertson, John M.; Soehn, Matthias; Yan Di

Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p < 0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
Predicting location of recurrence using FDG, FLT, and Cu-ATSM PET in canine sinonasal tumors treated with radiotherapy

NASA Astrophysics Data System (ADS)

Bradshaw, Tyler; Fu, Rau; Bowen, Stephen; Zhu, Jun; Forrest, Lisa; Jeraj, Robert

2015-07-01

Dose painting relies on the ability of functional imaging to identify resistant tumor subvolumes to be targeted for additional boosting. This work assessed the ability of FDG, FLT, and Cu-ATSM PET imaging to predict the locations of residual FDG PET in canine tumors following radiotherapy. Nineteen canines with spontaneous sinonasal tumors underwent PET/CT imaging with radiotracers FDG, FLT, and Cu-ATSM prior to hypofractionated radiotherapy. Therapy consisted of 10 fractions of 4.2 Gy to the sinonasal cavity with or without an integrated boost of 0.8 Gy to the GTV. Patients had an additional FLT PET/CT scan after fraction 2, a Cu-ATSM PET/CT scan after fraction 3, and follow-up FDG PET/CT scans after radiotherapy. Following image registration, simple and multiple linear and logistic voxel regressions were performed to assess how well pre- and mid-treatment PET imaging predicted post-treatment FDG uptake. R2 and pseudo R2 were used to assess the goodness of fits. For simple linear regression models, regression coefficients for all pre- and mid-treatment PET images were significantly positive across the population (P < 0.05). However, there was large variability among patients in goodness of fits: R2 ranged from 0.00 to 0.85, with a median of 0.12. Results for logistic regression models were similar. Multiple linear regression models resulted in better fits (median R2 = 0.31), but there was still large variability between patients in R2. The R2 from regression models for different predictor variables were highly correlated across patients (R ≈ 0.8), indicating tumors that were poorly predicted with one tracer were also poorly predicted by other tracers. In conclusion, the high inter-patient variability in goodness of fits indicates that PET was able to predict locations of residual tumor in some patients, but not others. This suggests not all patients would be good candidates for dose painting based on a single biological target.
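The per-patient goodness of fit summarized in this abstract can be sketched for the simple linear case, where R2 equals the squared Pearson correlation between the predictor and the response. The patient labels and voxel values below are hypothetical toy numbers, and `r_squared` is an illustrative helper rather than the authors' analysis code.

```python
from statistics import median


def r_squared(x, y):
    """R2 of a simple linear regression of y on x (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


# Hypothetical voxel values: pre-treatment uptake (x) and post-treatment FDG uptake (y).
patients = {
    "A": ([1, 2, 3, 4], [2, 4, 6, 8]),    # perfectly predicted
    "B": ([1, 2, 3, 4], [1, -1, -1, 1]),  # not predicted at all
    "C": ([1, 2, 3, 4], [1, 3, 2, 4]),    # partly predicted
}
fits = {name: r_squared(x, y) for name, (x, y) in patients.items()}
print(fits, median(fits.values()))
```

Reporting the range and median of such per-patient values, as the abstract does, shows how a population-level positive coefficient can coexist with poor fits in individual patients.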
Predicting location of recurrence using FDG, FLT, and Cu-ATSM PET in canine sinonasal tumors treated with radiotherapy.

PubMed

Bradshaw, Tyler; Fu, Rau; Bowen, Stephen; Zhu, Jun; Forrest, Lisa; Jeraj, Robert

2015-07-07

Dose painting relies on the ability of functional imaging to identify resistant tumor subvolumes to be targeted for additional boosting. This work assessed the ability of FDG, FLT, and Cu-ATSM PET imaging to predict the locations of residual FDG PET in canine tumors following radiotherapy. Nineteen canines with spontaneous sinonasal tumors underwent PET/CT imaging with radiotracers FDG, FLT, and Cu-ATSM prior to hypofractionated radiotherapy. Therapy consisted of 10 fractions of 4.2 Gy to the sinonasal cavity with or without an integrated boost of 0.8 Gy to the GTV. Patients had an additional FLT PET/CT scan after fraction 2, a Cu-ATSM PET/CT scan after fraction 3, and follow-up FDG PET/CT scans after radiotherapy. Following image registration, simple and multiple linear and logistic voxel regressions were performed to assess how well pre- and mid-treatment PET imaging predicted post-treatment FDG uptake. R(2) and pseudo R(2) were used to assess the goodness of fits. For simple linear regression models, regression coefficients for all pre- and mid-treatment PET images were significantly positive across the population (P < 0.05). However, there was large variability among patients in goodness of fits: R(2) ranged from 0.00 to 0.85, with a median of 0.12. Results for logistic regression models were similar. Multiple linear regression models resulted in better fits (median R(2) = 0.31), but there was still large variability between patients in R(2). The R(2) from regression models for different predictor variables were highly correlated across patients (R ≈ 0.8), indicating tumors that were poorly predicted with one tracer were also poorly predicted by other tracers. In conclusion, the high inter-patient variability in goodness of fits indicates that PET was able to predict locations of residual tumor in some patients, but not others. This suggests not all patients would be good candidates for dose painting based on a single biological target.
Weibull mixture regression for marginal inference in zero-heavy continuous outcomes.

PubMed

Gebregziabher, Mulugeta; Voronca, Delia; Teklehaimanot, Abeba; Santa Ana, Elizabeth J

2017-06-01

Continuous outcomes with a preponderance of zero values are ubiquitous in data that arise from biomedical studies, for example studies of addictive disorders. This is known to lead to violation of standard assumptions in parametric inference and enhances the risk of misleading conclusions unless managed properly. Two-part models are commonly used to deal with this problem. However, standard two-part models have limitations with respect to obtaining parameter estimates that have a marginal interpretation of covariate effects, which is important in many biomedical applications. Recently, marginalized two-part models have been proposed, but their development is limited to log-normal and log-skew-normal distributions. Thus, in this paper, we propose a finite mixture approach, with Weibull mixture regression as a special case, to deal with the problem. We use an extensive simulation study to assess the performance of the proposed model in finite samples and to make comparisons with other families of models via statistical information and mean squared error criteria. We demonstrate its application on real data from a randomized controlled trial of addictive disorders. Our results show that a two-component Weibull mixture model is preferred for modeling zero-heavy continuous data when the non-zero part is simulated from Weibull or similar distributions such as Gamma or truncated Gauss.
Genetic parameters for stayability to consecutive calvings in Zebu cattle.

PubMed

Silva, D O; Santana, M L; Ayres, D R; Menezes, G R O; Silva, L O C; Nobre, P R C; Pereira, R J

2017-12-22

Longer-lived cows tend to be more profitable and the stayability trait is a selection criterion correlated to longevity. An alternative to the traditional approach to evaluate stayability is its definition based on consecutive calvings, whose main advantage is the more accurate evaluation of young bulls. However, no study using this alternative approach has been conducted for Zebu breeds. Therefore, the objective of this study was to compare linear random regression models to fit stayability to consecutive calvings of Guzerá, Nelore and Tabapuã cows and to estimate genetic parameters for this trait in the respective breeds. Data up to the eighth calving were used. The models included the fixed effects of age at first calving and year-season of birth of the cow and the random effects of contemporary group, additive genetic, permanent environmental and residual. Random regressions were modeled by orthogonal Legendre polynomials of order 1 to 4 (2 to 5 coefficients) for contemporary group, additive genetic and permanent environmental effects. Using Deviance Information Criterion as the selection criterion, the model with 4 regression coefficients for each effect was the most adequate for the Nelore and Tabapuã breeds and the model with 5 coefficients is recommended for the Guzerá breed. For Guzerá, heritabilities ranged from 0.05 to 0.08, showing a quadratic trend with a peak between the fourth and sixth calving. For the Nelore and Tabapuã breeds, the estimates ranged from 0.03 to 0.07 and from 0.03 to 0.08, respectively, and increased with increasing calving number. The additive genetic correlations exhibited a similar trend among breeds and were higher for stayability between closer calvings. Even between more distant calvings (second v. eighth), stayability showed a moderate to high genetic correlation, which was 0.77, 0.57 and 0.79 for the Guzerá, Nelore and Tabapuã breeds, respectively. For Guzerá, when the models with 4 or 5 regression coefficients were compared, the rank correlations between predicted breeding values for the intercept were always higher than 0.99, indicating the possibility of practical application of the least parameterized model. In conclusion, the model with 4 random regression coefficients is recommended for the genetic evaluation of stayability to consecutive calvings in Zebu cattle.
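As a sketch of the covariates behind a random regression on Legendre polynomials, the code below maps a calving number onto the interval [-1, 1] and evaluates the first few polynomials by the standard three-term recurrence. Several details are assumptions made for illustration: the calving range of 2 to 8 used for standardization, the use of unnormalized polynomials (animal breeding software often applies a normalizing constant), and the function names.

```python
def standardize(calving, first=2, last=8):
    """Map a calving number onto [-1, 1]; the range 2..8 is an assumed example."""
    return -1.0 + 2.0 * (calving - first) / (last - first)


def legendre(t, n_coef):
    """Unnormalized Legendre polynomials P0..P(n_coef - 1) evaluated at t."""
    values = [1.0, t][:n_coef]
    for k in range(1, n_coef - 1):
        # Recurrence: (k + 1) P(k+1) = (2k + 1) t P(k) - k P(k-1)
        values.append(((2 * k + 1) * t * values[k] - k * values[k - 1]) / (k + 1))
    return values


# Covariates for a model with 4 regression coefficients (polynomial order 3).
for calving in (2, 5, 8):
    print(calving, legendre(standardize(calving), 4))
```

Each animal's random effect at a given calving is then a weighted sum of these covariates, with the weights being the animal's random regression coefficients.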
Resilient Brain Aging: Characterization of Discordance between Alzheimer’s Disease Pathology and Cognition

PubMed Central

Negash, Selam; Wilson, Robert S.; Leurgans, Sue E.; Wolk, David A.; Schneider, Julie A.; Buchman, Aron S.; Bennett, David A.; Arnold, Steven. E.

2014-01-01

Background Although it is now evident that normal cognition can occur despite significant AD pathology, few studies have attempted to characterize this discordance, or examine factors that may contribute to resilient brain aging in the setting of AD pathology. Methods More than 2,000 older persons underwent annual evaluation as part of participation in the Religious Orders Study or Rush Memory Aging Project. A total of 966 subjects who had brain autopsy and comprehensive cognitive testing proximate to death were analyzed. Resilience was quantified as a continuous measure using linear regression modeling, where global cognition was entered as a dependent variable and global pathology was an independent variable. Studentized residuals generated from the model represented the discordance between cognition and pathology, and served as measure of resilience. The relation of resilience index to known risk factors for AD and related variables was examined. Results Multivariate regression models that adjusted for demographic variables revealed significant associations for early life socioeconomic status, reading ability, APOE-ε4 status, and past cognitive activity. A stepwise regression model retained reading level (estimate = 0.10, SE = 0.02; p < 0.0001) and past cognitive activity (estimate = 0.27, SE = 0.09; p = 0.002), suggesting the potential mediating role of these variables for resilience. Conclusions The construct of resilient brain aging can provide a framework for quantifying the discordance between cognition and pathology, and help identify factors that may mediate this relationship. PMID:23919768
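The resilience index described in this abstract can be sketched with a simple regression of cognition on pathology followed by studentization of the residuals. The abstract does not say whether internally or externally studentized residuals were used, so this sketch assumes the internal form, residual divided by s * sqrt(1 - leverage). The data are hypothetical toy values and the function name is illustrative.

```python
import math


def studentized_residuals(x, y):
    """Internally studentized residuals from a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)          # residual variance
    leverage = [1.0 / n + (xi - mx) ** 2 / sxx for xi in x]
    return [e / math.sqrt(s2 * (1.0 - h)) for e, h in zip(resid, leverage)]


# Hypothetical global pathology and global cognition scores for five persons.
pathology = [0, 1, 2, 3, 4]
cognition = [2.0, 1.0, 1.5, -0.5, -1.0]
index = studentized_residuals(pathology, cognition)
print([round(v, 2) for v in index])
```

A large positive value marks a person whose cognition is better than the pathology burden predicts, which is the sense of resilience used in the abstract; the third person in this toy example has the highest index.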
Explicit criteria for prioritization of cataract surgery

PubMed Central

Ma Quintana, José; Escobar, Antonio; Bilbao, Amaia

2006-01-01

Background Consensus techniques have been used previously to create explicit criteria to prioritize cataract extraction; however, the appropriateness of the intervention was not included explicitly in previous studies. We developed a prioritization tool for cataract extraction according to the RAND method. Methods Criteria were developed using a modified Delphi panel judgment process. A panel of 11 ophthalmologists was assembled. Ratings were analyzed regarding the level of agreement among panelists. We studied the effect of all variables on the final panel score using general linear and logistic regression models. Priority scoring systems were developed by means of optimal scaling and general linear models. The explicit criteria developed were summarized by means of regression tree analysis. Results Eight variables were considered to create the indications. Of the 310 indications that the panel evaluated, 22.6% were considered high priority, 52.3% intermediate priority, and 25.2% low priority. Agreement was reached for 31.9% of the indications and disagreement for 0.3%. Logistic regression and general linear models showed that the preoperative visual acuity of the cataractous eye, visual function, and anticipated visual acuity postoperatively were the most influential variables. Alternative and simple scoring systems were obtained by optimal scaling and general linear models where the previous variables were also the most important. The decision tree also shows the importance of the previous variables and the appropriateness of the intervention. Conclusion Our results showed acceptable validity as an evaluation and management tool for prioritizing cataract extraction. It also provides easy algorithms for use in clinical practice. PMID:16512893
Novel Analog For Muscle Deconditioning

NASA Technical Reports Server (NTRS)

Ploutz-Snyder, Lori; Ryder, Jeff; Buxton, Roxanne; Redd, Elizabeth; Scott-Pandorf, Melissa; Hackney, Kyle; Fiedler, James; Bloomberg, Jacob

2010-01-01

Existing models of muscle deconditioning are cumbersome and expensive (e.g., bed rest). We propose a new model utilizing a weighted suit to manipulate strength, power or endurance (function) relative to body weight (BW). Methods: 20 subjects performed 7 occupational astronaut tasks while wearing a suit weighted with 0-120% of BW. Models of the full relationship between muscle function/BW and task completion time were developed using fractional polynomial regression and verified by the addition of pre- and post-flight astronaut performance data using the same tasks. Spline regression was used to identify muscle function thresholds below which task performance was impaired. Results: Thresholds of performance decline were identified for each task. Seated egress & walk (most difficult task) showed thresholds of: leg press (LP) isometric peak force/BW of 18 N/kg, LP power/BW of 18 W/kg, LP work/BW of 79 J/kg, knee extension (KE) isokinetic/BW of 6 Nm/kg and KE torque/BW of 1.9 Nm/kg. Conclusions: Laboratory manipulation of strength/BW has promise as an appropriate analog for spaceflight-induced loss of muscle function for predicting occupational task performance and establishing operationally relevant exercise targets.
Local modelling of land consumption in Germany with RegioClust

NASA Astrophysics Data System (ADS)

Hagenauer, Julian; Helbich, Marco

2018-03-01

Germany is experiencing extensive land consumption. This necessitates local models to understand actual and future land consumption patterns. This research examined land consumption rates on a municipality level in Germany for the period 2000-10 and predicted rates for 2010-20. For this purpose, RegioClust, an algorithm that combines hierarchical clustering and regression analysis to identify regions with similar relationships between land consumption and its drivers, was developed. The performance of RegioClust was compared against geographically weighted regression (GWR). Distinct spatially varying relationships across regions emerged, whereas population density is suggested as the central driver. Although both RegioClust and GWR predicted an increase in land consumption rates for east Germany for 2010-20, only RegioClust forecasts a decline for west Germany. In conclusion, both models predict for 2010-20 a rate of land consumption that suggests that the policy objective of reducing land consumption to 30 ha per day in 2020 will not be achieved. Policymakers are advised to take action and revise existing planning strategies to counteract this development.
Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time-to-Event Analysis.

PubMed

Gong, Xiajing; Hu, Meng; Zhao, Liang

2018-05-01

Additional value can be potentially created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time-to-event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high-dimensional data featured by a large number of predictor variables. Our results showed that ML-based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high-dimensional data. The prediction performances of ML-based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML-based methods provide a powerful tool for time-to-event analysis, with a built-in capacity for high-dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. © 2018 The Authors. Clinical and Translational Science published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
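The concordance index used above to compare prediction performance can be illustrated with a minimal sketch. This is one common formulation (Harrell's C for right-censored data), not necessarily the exact variant used in the study; the function name and the toy data are hypothetical, and pairs with tied times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ordered correctly by risk.

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk (earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed or censored time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: the second subject is censored.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
risk = [0.9, 0.7, 0.2, 0.5]
c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs.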
Estimating severity of sideways fall using a generic multi linear regression model based on kinematic input variables.

PubMed

van der Zijden, A M; Groen, B E; Tanck, E; Nienhuis, B; Verdonschot, N; Weerdesteyn, V

2017-03-21

Many research groups have studied fall impact mechanics to understand how fall severity can be reduced to prevent hip fractures. Yet, direct impact force measurements with force plates are restricted to a very limited repertoire of experimental falls. The purpose of this study was to develop a generic model for estimating hip impact forces (i.e. fall severity) in in vivo sideways falls without the use of force plates. Twelve experienced judokas performed sideways Martial Arts (MA) and Block ('natural') falls on a force plate, both with and without a mat on top. Data were analyzed to determine the hip impact force and to derive 11 selected (subject-specific and kinematic) variables. Falls from kneeling height were used to perform a stepwise regression procedure to assess the effects of these input variables and build the model. The final model includes four input variables, involving one subject-specific measure and three kinematic variables: maximum upper body deceleration, body mass, shoulder angle at the instant of 'maximum impact' and maximum hip deceleration. The results showed that estimated and measured hip impact forces were linearly related (explained variances ranging from 46 to 63%). Hip impact forces of MA falls onto the mat from a standing position (3650±916N) estimated by the final model were comparable with measured values (3698±689N), even though these data were not used for training the model. In conclusion, a generic linear regression model was developed that enables the assessment of fall severity through kinematic measures of sideways falls, without using force plates. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dynamic spatiotemporal analysis of indigenous dengue fever at street-level in Guangzhou city, China

PubMed Central

Xia, Yao; Zhang, Yingtao; Huang, Xiaodong; Huang, Jiawei; Nie, Enqiong; Jing, Qinlong; Wang, Guoling; Yang, Zhicong; Hu, Wenbiao

2018-01-01

Background This study aimed to investigate the spatiotemporal clustering and socio-environmental factors associated with dengue fever (DF) incidence rates at street level in Guangzhou city, China. Methods A spatiotemporal scan technique was applied to identify the high-risk region of DF. A multiple regression model was used to identify the socio-environmental factors associated with DF infection. A Poisson regression model was employed to examine the spatiotemporal patterns in the spread of DF. Results Spatial clusters of DF were primarily concentrated in the southwest part of Guangzhou city. Age group (65+ years) (Odds Ratio (OR) = 1.49, 95% Confidence Interval (CI) = 1.13 to 2.03), floating population (OR = 1.09, 95% CI = 1.05 to 1.15), low education (OR = 1.08, 95% CI = 1.01 to 1.16) and non-agriculture (OR = 1.07, 95% CI = 1.03 to 1.11) were associated with DF transmission. Poisson regression results indicated that changes in DF incidence rates were significantly associated with longitude (β = -5.08, P<0.01) and latitude (β = -1.99, P<0.01). Conclusions The study demonstrated that socio-environmental factors may play an important role in DF transmission in Guangzhou. As the geographic range of notified DF has significantly expanded over recent years, an early warning system based on a spatiotemporal model with socio-environmental factors is urgently needed to improve the effectiveness and efficiency of dengue control and prevention. PMID:29561835
Complexities and potential pitfalls of clinical study design and data analysis in assisted reproduction.

PubMed

Patounakis, George; Hill, Micah J

2018-06-01

The purpose of the current review is to describe the common pitfalls in design and statistical analysis of reproductive medicine studies. It serves to guide both authors and reviewers toward reducing the incidence of spurious statistical results and erroneous conclusions. The large amount of data gathered in IVF cycles leads to problems with multiplicity, multicollinearity, and over fitting of regression models. Furthermore, the use of the word 'trend' to describe nonsignificant results has increased in recent years. Finally, methods to accurately account for female age in infertility research models are becoming more common and necessary. The pitfalls of study design and analysis reviewed provide a framework for authors and reviewers to approach clinical research in the field of reproductive medicine. By providing a more rigorous approach to study design and analysis, the literature in reproductive medicine will have more reliable conclusions that can stand the test of time.
The relative roles of environment, history and local dispersal in controlling the distributions of common tree and shrub species in a tropical forest landscape, Panama

USGS Publications Warehouse

Svenning, J.-C.; Engelbrecht, B.M.J.; Kinner, D.A.; Kursar, T.A.; Stallard, R.F.; Wright, S.J.

2006-01-01

We used regression models and information-theoretic model selection to assess the relative importance of environment, local dispersal and historical contingency as controls of the distributions of 26 common plant species in tropical forest on Barro Colorado Island (BCI), Panama. We censused eighty-eight 0.09-ha plots scattered across the landscape. Environmental control, local dispersal and historical contingency were represented by environmental variables (soil moisture, slope, soil type, distance to shore, old-forest presence), a spatial autoregressive parameter (ρ), and four spatial trend variables, respectively. We built regression models, representing all combinations of the three hypotheses, for each species. The probability that the best model included the environmental variables, spatial trend variables and ρ averaged 33%, 64% and 50% across the study species, respectively. The environmental variables, spatial trend variables, ρ, and a simple intercept model received the strongest support for 4, 15, 5 and 2 species, respectively. Comparing the model results to information on species traits showed that species with strong spatial trends produced few and heavy diaspores, while species with strong soil moisture relationships were particularly drought-sensitive. In conclusion, history and local dispersal appeared to be the dominant controls of the distributions of common plant species on BCI. Copyright © 2006 Cambridge University Press.
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws.

PubMed

Xiao, Xiao; White, Ethan P; Hooten, Mevin B; Durham, Susan L

2011-10-01

Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
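The two fitting strategies compared above can be sketched as follows. This is an illustrative sketch only, not the authors' published code: the function names, the grid-search approach to the nonlinear fit, and the synthetic data (generated with multiplicative lognormal error) are all assumptions made for the example.

```python
import math
import random


def fit_loglinear(x, y):
    """LR: fit y = a * x**b by ordinary least squares on (log x, log y)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


def fit_nonlinear(x, y, b_lo=0.0, b_hi=3.0, steps=3001):
    """NLR: fit y = a * x**b by least squares on the original scale.

    For a fixed exponent b the best prefactor a has a closed form,
    so a one-dimensional grid search over b is enough for a sketch.
    """
    best = None
    for i in range(steps):
        b = b_lo + (b_hi - b_lo) * i / (steps - 1)
        xb = [v ** b for v in x]
        a = sum(yi * xi for yi, xi in zip(y, xb)) / sum(xi * xi for xi in xb)
        sse = sum((yi - a * xi) ** 2 for yi, xi in zip(y, xb))
        if best is None or sse < best[0]:
            best = (sse, a, b)
    return best[1], best[2]


# Synthetic data with multiplicative lognormal error (hypothetical values).
random.seed(0)
true_a, true_b = 2.0, 0.75
x = [1.0 + 0.5 * i for i in range(200)]
y = [true_a * xi ** true_b * math.exp(random.gauss(0.0, 0.3)) for xi in x]

a_lr, b_lr = fit_loglinear(x, y)
a_nlr, b_nlr = fit_nonlinear(x, y)
print("LR  on log-transformed data:", a_lr, b_lr)
print("NLR on original scale:      ", a_nlr, b_nlr)
```

Replacing the multiplicative error with additive normal error (for example `true_a * xi ** true_b + random.gauss(0.0, 1.0)`, keeping `y` positive) lets one explore the opposite case described in the abstract.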
Spatiotemporally restricted arenavirus replication induces immune surveillance and type I interferon-dependent tumour regression

PubMed Central

Kalkavan, Halime; Sharma, Piyush; Kasper, Stefan; Helfrich, Iris; Pandyra, Aleksandra A.; Gassa, Asmae; Virchow, Isabel; Flatz, Lukas; Brandenburg, Tim; Namineni, Sukumar; Heikenwalder, Mathias; Höchst, Bastian; Knolle, Percy A.; Wollmann, Guido; von Laer, Dorothee; Drexler, Ingo; Rathbun, Jessica; Cannon, Paula M.; Scheu, Stefanie; Bauer, Jens; Chauhan, Jagat; Häussinger, Dieter; Willimsky, Gerald; Löhning, Max; Schadendorf, Dirk; Brandau, Sven; Schuler, Martin; Lang, Philipp A.; Lang, Karl S.

2017-01-01

Immune-mediated effector molecules can limit cancer growth, but lack of sustained immune activation in the tumour microenvironment restricts antitumour immunity. New therapeutic approaches that induce a strong and prolonged immune activation would represent a major immunotherapeutic advance. Here we show that the arenaviruses lymphocytic choriomeningitis virus (LCMV) and the clinically used Junin virus vaccine (Candid#1) preferentially replicate in tumour cells in a variety of murine and human cancer models. Viral replication leads to prolonged local immune activation, rapid regression of localized and metastatic cancers, and long-term disease control. Mechanistically, LCMV induces antitumour immunity, which depends on the recruitment of interferon-producing Ly6C+ monocytes and additionally enhances tumour-specific CD8+ T cells. In comparison with other clinically evaluated oncolytic viruses and to PD-1 blockade, LCMV treatment shows promising antitumoural benefits. In conclusion, therapeutically administered arenavirus replicates in cancer cells and induces tumour regression by enhancing local immune responses. PMID:28248314
Is It the Intervention or the Students? Using Linear Regression to Control for Student Characteristics in Undergraduate STEM Education Research

PubMed Central

Theobald, Roddy; Freeman, Scott

2014-01-01

Although researchers in undergraduate science, technology, engineering, and mathematics education are currently using several methods to analyze learning gains from pre- and posttest data, the most commonly used approaches have significant shortcomings. Chief among these is the inability to distinguish whether differences in learning gains are due to the effect of an instructional intervention or to differences in student characteristics when students cannot be assigned to control and treatment groups at random. Using pre- and posttest scores from an introductory biology course, we illustrate how the methods currently in wide use can lead to erroneous conclusions, and how multiple linear regression offers an effective framework for distinguishing the impact of an instructional intervention from the impact of student characteristics on test score gains. In general, we recommend that researchers always use student-level regression models that control for possible differences in student ability and preparation to estimate the effect of any nonrandomized instructional intervention on student performance. PMID:24591502
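The recommendation above can be illustrated with a minimal sketch in which the posttest score is regressed on a treatment indicator and the pretest score. The data are fabricated for illustration (noise-free, with a built-in treatment effect of 5 points), and the helper names `solve` and `ols` are hypothetical; this is not the authors' analysis.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


# Hypothetical students: the treated section starts with higher pretest scores.
pretest = [50, 55, 60, 65, 70, 75, 80, 85]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
posttest = [10 + 5 * t + 0.8 * p for t, p in zip(treated, pretest)]

# Naive comparison: difference in mean posttest scores between sections.
mean_treated = sum(s for s, t in zip(posttest, treated) if t == 1) / 4
mean_control = sum(s for s, t in zip(posttest, treated) if t == 0) / 4
naive_effect = mean_treated - mean_control

# Regression adjustment: columns are intercept, treatment, pretest.
X = [[1.0, float(t), float(p)] for t, p in zip(treated, pretest)]
intercept, adjusted_effect, pretest_slope = ols(X, posttest)

print("naive difference in means:", naive_effect)
print("effect after controlling for pretest:", adjusted_effect)
```

In this constructed example the naive difference mixes the intervention effect with the sections' different starting points, while the coefficient on the treatment indicator isolates the built-in effect.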
Predictors of quality of life: A quantitative investigation of the stress-coping model in children with asthma

PubMed Central

Peeters, Yvette; Boersma, Sandra N; Koopman, Hendrik M

2008-01-01

Background The aim of this study is to further explore predictors of health related quality of life in children with asthma using factors derived from the extended stress-coping model. While the stress-coping model has often been used as a frame of reference in studying health related quality of life in chronic illness, few have actually tested the model in children with asthma. Method In this survey study, data were obtained by means of self-report questionnaires from seventy-eight children with asthma and their parents. Based on data derived from these questionnaires, the constructs of the extended stress-coping model were assessed using regression analysis and path analysis. Results The results of both regression analysis and path analysis reveal tentative support for the proposed relationships between predictors and health related quality of life in the stress-coping model. Moreover, as indicated in the stress-coping model, HRQoL is only directly predicted by coping. Both coping strategies 'emotional reaction' (significantly) and 'avoidance' are directly related to HRQoL. Conclusion In children with asthma, the extended stress-coping model appears to be a useful theoretical framework for understanding the impact of the illness on their quality of life. Consequently, the factors suggested by this model should be taken into account when designing optimal psychosocial-care interventions. PMID:18366753

Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.

PubMed

Trninić, Marko; Jeličić, Mario; Papić, Vladan

2015-07-01

In kinesiology, medicine, biology and psychology, in which research focus is on dynamical self-organized systems, complex connections exist between variables. Non-linear nature of complex systems has been discussed and explained by the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models, and by (b) including all basketball athletes in the same sample of participants regardless of their playing position. In this paper the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players divided in three groups according to their playing time (8 minutes and more per game) and playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear (general model) and non-linear (general model) regression models were calculated simultaneously and separately for each group. The conclusion is viable: non-linear regressions are frequently superior to linear correlations when interpreting actual association logic among research variables.
Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree

PubMed Central

Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid

2016-01-01

Introduction: Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process of detecting hidden patterns. In the case of ESD, data mining helps us to predict the diseases. Different algorithms have been developed for this purpose. Objective: We aimed to use the Classification and Regression Tree (CART) to predict the differential diagnosis of ESD. Methods: We used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set was obtained from the UCI machine learning repository. The Clementine 12.0 software from IBM was used for modelling. To evaluate the model, we calculated its accuracy, sensitivity and specificity. Results: The proposed model had an accuracy of 94.84% (Standard Deviation: 24.42) in correctly predicting the ESD disease. Conclusions: The results indicated that this classifier could be useful. However, combining machine learning methods is strongly recommended, as it could be more useful for predicting ESD. PMID:28077889
A Comparison Between Two OLS-Based Approaches to Estimating Urban Multifractal Parameters

NASA Astrophysics Data System (ADS)

Huang, Lin-Shan; Chen, Yan-Guang

Multifractal theory provides a new spatial analytical tool for urban studies, but many basic problems remain to be solved. Among various pending issues, the most significant one is how to obtain proper multifractal dimension spectrums. If an algorithm is improperly used, the parameter spectrums will be abnormal. This paper is devoted to investigating two ordinary least squares (OLS)-based approaches for estimating urban multifractal parameters. Using empirical study and comparative analysis, we demonstrate how to utilize the adequate linear regression to calculate multifractal parameters. The OLS regression analysis has two different approaches. One is that the intercept is fixed to zero, and the other is that the intercept is not limited. The results of comparative study show that the zero-intercept regression yields proper multifractal parameter spectrums within certain scale range of moment order, while the common regression method often leads to abnormal multifractal parameter values. A conclusion can be reached that fixing the intercept to zero is a more advisable regression method for multifractal parameters estimation, and the shapes of spectral curves and value ranges of fractal parameters can be employed to diagnose urban problems. This research is helpful for scientists to understand multifractal models and apply a more reasonable technique to multifractal parameter calculations.
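The difference between the two OLS variants discussed above can be sketched generically. The sketch below is not the authors' multifractal procedure: in that setting the regressors would be log-scale quantities, whereas the numbers here are made up purely to show how the two estimators differ. The function names are hypothetical.

```python
def ols_with_intercept(x, y):
    """Common OLS: y = b0 + b1 * x, intercept estimated freely."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def ols_through_origin(x, y):
    """Zero-intercept OLS: y = b1 * x, line forced through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Made-up observations for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

b0, b1_free = ols_with_intercept(x, y)
b1_zero = ols_through_origin(x, y)
print("free intercept:", b0, "slope:", b1_free)
print("zero-intercept slope:", b1_zero)
```

The slope estimates differ whenever the freely estimated intercept is nonzero, which is why the choice between the two variants changes the resulting parameter values.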
Ethical climate as a moderator between organizational trust and whistle-blowing among nurses and secretaries

PubMed Central

Aydan, Seda; Kaya, Sidika

2018-01-01

Objectives: To reveal the effect of nurses' and secretaries' perception of ethical climate and their level of organizational trust on their whistleblowing intention. Methods: Nurses and secretaries working in a University Hospital in Ankara, Turkey, were enrolled in the study conducted in 2016. Responses were received from 369 nurses and secretaries working at Clinics and Polyclinics. Path analysis based on structural equation modeling was used, and multiple regression analysis was also applied. Results: According to the regression model, ethical climate dimensions, profession, gender, and work place had a significant impact on the whistleblowing intention. According to path analysis, ethical climate had a direct impact of 69% on whistleblowing intention. Organizational trust had an indirect impact of 27% on the whistleblowing score when ethical climate had a moderator role. Conclusion: In order to promote whistleblowing in organizations, it is important to keep the ethical climate perception of employees and the level of their organizational trust at high levels. PMID:29805421
Teacher Consultation and Coaching within Mental Health Practice: Classroom and Child Effects in Urban Elementary Schools

PubMed Central

Cappella, Elise; Hamre, Bridget K.; Kim, Ha Yeon; Henry, David B.; Frazier, Stacy L.; Atkins, Marc S.; Schoenwald, Sonja K.

2012-01-01

Objective To examine effects of a teacher consultation and coaching program delivered by school and community mental health professionals on change in observed classroom interactions and child functioning across one school year. Method Thirty-six classrooms within five urban elementary schools (87% Latino, 11% Black) were randomly assigned to intervention (training + consultation/coaching) and control (training only) conditions. Classroom and child outcomes (n = 364; 43% girls) were assessed in the fall and spring. Results Random effects regression models showed main effects of intervention on teacher-student relationship closeness, academic self-concept, and peer victimization. Results of multiple regression models showed levels of observed teacher emotional support in the fall moderated intervention impact on emotional support at the end of the school year. Conclusions Results suggest teacher consultation and coaching can be integrated within existing mental health activities in urban schools and impact classroom effectiveness and child adaptation across multiple domains. PMID:22428941
Face-Referenced Measurement of Perioral Stiffness and Speech Kinematics in Parkinson's Disease

PubMed Central

Barlow, Steven M.; Lee, Jaehoon

2015-01-01

Purpose Perioral biomechanics, labial kinematics, and associated electromyographic signals were sampled and characterized in individuals with Parkinson's disease (PD) as a function of medication state. Method Passive perioral stiffness was sampled using the OroSTIFF system in 10 individuals with PD in a medication ON and a medication OFF state and compared to 10 matched controls. Perioral stiffness, derived as the quotient of resultant force and interoral angle span, was modeled with regression techniques. Labial movement amplitudes and integrated electromyograms from select lip muscles were evaluated during syllable production using a 4-D computerized motion capture system. Results Multilevel regression modeling showed greater perioral stiffness in patients with PD, consistent with the clinical correlate of rigidity. In the medication-OFF state, individuals with PD manifested greater integrated electromyogram levels for the orbicularis oris inferior compared to controls, which increased further after consumption of levodopa. Conclusions This study illustrates the application of biomechanical, electrophysiological, and kinematic methods to better understand the pathophysiology of speech motor control in PD. PMID:25629806
Intimate partner violence and anxiety disorders in pregnancy: the importance of vocational training of the nursing staff in facing them

PubMed Central

Fonseca-Machado, Mariana de Oliveira; Monteiro, Juliana Cristina dos Santos; Haas, Vanderlei José; Abrão, Ana Cristina Freitas de Vilhena; Gomes-Sponholz, Flávia

2015-01-01

Objective: to identify the relationship between posttraumatic stress disorder, trait and state anxiety, and intimate partner violence during pregnancy. Method: observational, cross-sectional study developed with 358 pregnant women. The Posttraumatic Stress Disorder Checklist - Civilian Version was used, as well as the State-Trait Anxiety Inventory and an adapted version of the instrument used in the World Health Organization Multi-country Study on Women's Health and Domestic Violence. Results: after adjusting to the multiple logistic regression model, intimate partner violence, occurred during pregnancy, was associated with the indication of posttraumatic stress disorder. The adjusted multiple linear regression models showed that the victims of violence, in the current pregnancy, had higher symptom scores of trait and state anxiety than non-victims. Conclusion: recognizing the intimate partner violence as a clinically relevant and identifiable risk factor for the occurrence of anxiety disorders during pregnancy can be a first step in the prevention thereof. PMID:26487135
Statistical-learning strategies generate only modestly performing predictive models for urinary symptoms following external beam radiotherapy of the prostate: A comparison of conventional and machine-learning methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yahya, Noorazrul, E-mail: noorazrul.yahya@research.uwa.edu.au; Ebert, Martin A.; Bulsara, Max

Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication-intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2 and longitudinal) with event rate between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized in endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance using sample size to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1 with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC>0.6 while all haematuria endpoints and longitudinal incontinence models produced AUROC<0.6. Conclusions: Logistic regression and MARS were most likely to be the best-performing strategy for the prediction of urinary symptoms with elastic-net and random forest producing competitive results. The predictive power of the models was modest and endpoint-dependent. New features, including spatial dose maps, may be necessary to achieve better models.
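The AUROC metric used above to compare strategies can be computed directly from its pairwise definition. The following is a minimal sketch with a hypothetical function name and toy data, not the study's evaluation pipeline: AUROC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: two negatives and two positives with predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
area = auroc(labels, scores)
print(area)
```

Values near 0.5 indicate discrimination no better than chance, which gives context for reported values such as 0.649.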
Heterogeneity in hedonic modelling of house prices: looking at buyers' household profiles

NASA Astrophysics Data System (ADS)

Kestens, Yan; Thériault, Marius; Des Rosiers, François

2006-03-01

This paper introduces household-level data into hedonic models in order to measure the heterogeneity of implicit prices regarding household type, age, educational attainment, income, and the previous tenure status of the buyers. Two methods are used for this purpose: a first series of models uses expansion terms, whereas a second series applies Geographically Weighted Regressions. Both methods yield conclusive results, showing that the marginal value given to certain property specifics and location attributes do vary regarding the characteristics of the buyer’s household. Particularly, major findings concern the significant effect of income on the location rent as well as the premium paid by highly-educated households in order to fulfil social homogeneity.
Spatiotemporal Modeling of Ozone Levels in Quebec (Canada): A Comparison of Kriging, Land-Use Regression (LUR), and Combined Bayesian Maximum Entropy–LUR Approaches

PubMed Central

Adam-Poupart, Ariane; Brand, Allan; Fournier, Michel; Jerrett, Michael

2014-01-01

Background: Ambient air ozone (O3) is a pulmonary irritant that has been associated with respiratory health effects including increased lung inflammation and permeability, airway hyperreactivity, respiratory symptoms, and decreased lung function. Estimation of O3 exposure is a complex task because the pollutant exhibits complex spatiotemporal patterns. To refine the quality of exposure estimation, various spatiotemporal methods have been developed worldwide. Objectives: We sought to compare the accuracy of three spatiotemporal models to predict summer ground-level O3 in Quebec, Canada. Methods: We developed a land-use mixed-effects regression (LUR) model based on readily available data (air quality and meteorological monitoring data, road networks information, latitude), a Bayesian maximum entropy (BME) model incorporating both O3 monitoring station data and the land-use mixed model outputs (BME-LUR), and a kriging method model based only on available O3 monitoring station data (BME kriging). We performed leave-one-station-out cross-validation and visually assessed the predictive capability of each model by examining the mean temporal and spatial distributions of the average estimated errors. Results: The BME-LUR was the best predictive model (R2 = 0.653) with the lowest root mean-square error (RMSE = 7.06 ppb), followed by the LUR model (R2 = 0.466, RMSE = 8.747) and the BME kriging model (R2 = 0.414, RMSE = 9.164). Conclusions: Our findings suggest that errors of estimation in the interpolation of O3 concentrations with BME can be greatly reduced by incorporating outputs from a LUR model developed with readily available data. Citation: Adam-Poupart A, Brand A, Fournier M, Jerrett M, Smargiassi A. 2014. Spatiotemporal modeling of ozone levels in Quebec (Canada): a comparison of kriging, land-use regression (LUR), and combined Bayesian maximum entropy–LUR approaches. Environ Health Perspect 122:970–976; http://dx.doi.org/10.1289/ehp.1306566 PMID:24879650
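The leave-one-out validation and the two reported metrics can be sketched with a deliberately simple stand-in model. The sketch assumes a one-predictor linear regression in place of the LUR and BME models, uses made-up station values, and defines R2 as one minus the ratio of residual to total sum of squares, which may differ from the definition used in the study. All function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Simple linear regression y = b0 + b1 * x by least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope


def leave_one_out_predictions(x, y):
    """Predict each observation from a model fitted to all the others."""
    preds = []
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(b0 + b1 * x[i])
    return preds


def rmse(observed, predicted):
    """Root mean-square error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """One minus residual sum of squares over total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up station data: one predictor value and one ozone reading (ppb) each.
predictor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ozone_ppb = [21.0, 24.5, 26.0, 30.5, 31.0, 35.5]

held_out = leave_one_out_predictions(predictor, ozone_ppb)
print("cross-validated RMSE:", rmse(ozone_ppb, held_out))
print("cross-validated R2:  ", r_squared(ozone_ppb, held_out))
```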
The comparison between several robust ridge regression estimators in the presence of multicollinearity and multiple outliers

NASA Astrophysics Data System (ADS)

Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said

2014-09-01

In the presence of multicollinearity and multiple outliers, statistical inference for a linear regression model using ordinary least squares (OLS) estimators is severely affected and produces misleading results. To overcome this, many approaches have been investigated. These include robust methods, which have been reported to be less sensitive to the presence of outliers. In addition, the ridge regression technique has been employed to tackle the multicollinearity problem. In order to mitigate both problems, a combination of ridge regression and robust methods was discussed in this study. The superiority of this approach was examined when multicollinearity and multiple outliers occurred simultaneously in multiple linear regression. This study aimed to look at the performance of several well-known robust estimators (M, MM, RIDGE) and robust ridge regression estimators, namely the Weighted Ridge M-estimator (WRM), Weighted Ridge MM (WRMM) and Ridge MM (RMM), in such a situation. Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both the x- and y-directions), the RMM and RIDGE estimators are more or less similar in terms of superiority over the other estimators, regardless of the number of observations, level of collinearity and percentage of outliers used. However, when outliers occurred in only a single direction (y-direction), the WRMM estimator was the most superior among the robust ridge regression estimators, producing the least variance. In conclusion, robust ridge regression is the best alternative compared with robust and conventional least squares estimators when dealing with the simultaneous presence of multicollinearity and outliers.
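To illustrate why ridge regression helps under multicollinearity, here is a minimal sketch of the ordinary (non-robust) ridge estimator, which solves (X'X + kI)b = X'y. It does not implement the WRM, WRMM or RMM estimators compared in the abstract above; the function name, the data, and the choice of ridge parameter k are hypothetical.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Solve (X'X + k I) b = X'y for two centered predictors (no intercept).

    k = 0 gives ordinary least squares; k > 0 shrinks the coefficients.
    """
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    g1 = sum(a * c for a, c in zip(x1, y))
    g2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # near zero when x1 and x2 are collinear and k = 0
    return ((s22 * g1 - s12 * g2) / det, (s11 * g2 - s12 * g1) / det)

# Hypothetical, nearly collinear centered predictors and a centered response.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 1.1, 1.9]
y = [-4.0, -2.2, 0.1, 1.9, 4.2]

for k in (0.0, 1.0):
    print(k, ridge_two_predictors(x1, x2, y, k))
```

Adding k to the diagonal moves the determinant away from zero, which is what stabilizes the coefficient estimates when the predictors are strongly correlated.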
Classical Statistics and Statistical Learning in Imaging Neuroscience

PubMed Central

Bzdok, Danilo

2017-01-01

Brain-imaging research has predominantly generated insight by means of classical statistics, including regression-type analyses and null-hypothesis testing using t-test and ANOVA. Throughout recent years, statistical learning methods enjoy increasing popularity especially for applications in rich and complex data, including cross-validated out-of-sample prediction using pattern classification and sparsity-inducing regression. This concept paper discusses the implications of inferential justifications and algorithmic methodologies in common data analysis scenarios in neuroimaging. It is retraced how classical statistics and statistical learning originated from different historical contexts, build on different theoretical foundations, make different assumptions, and evaluate different outcome metrics to permit differently nuanced conclusions. The present considerations should help reduce current confusion between model-driven classical hypothesis testing and data-driven learning algorithms for investigating the brain with imaging techniques. PMID:29056896
Isolating the cow-specific part of residual energy intake in lactating dairy cows using random regressions.

PubMed

Fischer, A; Friggens, N C; Berry, D P; Faverdin, P

2018-07-01

The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include both model fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted. In one, the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept using fortnight repeated measures for the variables. This method split the predicted NEI into two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows, all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. The variance of REI estimated with the lactation average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation average model or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation average model may therefore reflect model fitting errors or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
Relationship between chemical structure and the occupational asthma hazard of low molecular weight organic compounds

PubMed Central

Jarvis, J; Seed, M; Elton, R; Sawyer, L; Agius, R

2005-01-01

Aims: To investigate quantitatively the relationships between chemical structure and reported occupational asthma hazard for low molecular weight (LMW) organic compounds; to develop and validate a model linking asthma hazard with chemical substructure; and to generate mechanistic hypotheses that might explain the relationships. Methods: A learning dataset used 78 LMW chemical asthmagens reported in the literature before 1995, and 301 control compounds with recognised occupational exposures and hazards other than respiratory sensitisation. The chemical structures of the asthmagens and control compounds were characterised by the presence of chemical substructure fragments. Odds ratios were calculated for these fragments to determine which were associated with a likelihood of being reported as an occupational asthmagen. Logistic regression modelling was used to identify the independent contribution of these substructures. A post-1995 set of 21 asthmagens and 77 controls was selected to externally validate the model. Results: Nitrogen or oxygen containing functional groups such as isocyanate, amine, acid anhydride, and carbonyl were associated with an occupational asthma hazard, particularly when the functional group was present twice or more in the same molecule. A logistic regression model using only statistically significant independent variables for occupational asthma hazard correctly assigned 90% of the model development set. The external validation showed a sensitivity of 86% and specificity of 99%. Conclusions: Although a wide variety of chemical structures are associated with occupational asthma, bifunctional reactivity is strongly associated with occupational asthma hazard across a range of chemical substructures. This suggests that chemical cross-linking is an important molecular mechanism leading to the development of occupational asthma. The logistic regression model is freely available on the internet and may offer a useful but inexpensive adjunct to the prediction of occupational asthma hazard. PMID:15778257
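The sensitivity and specificity reported for the external validation set (21 asthmagens, 77 controls) can be related to classification counts as sketched below. The abstract gives only percentages, so the counts used here (18 of 21 asthmagens and 76 of 77 controls correctly classified) are an assumption chosen because they round to the reported 86% and 99%; they are not taken from the paper.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives (asthmagens) classified as positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of actual negatives (controls) classified as negative."""
    return true_neg / (true_neg + false_pos)

# Assumed counts, consistent with the reported percentages (not from the paper).
sens = sensitivity(true_pos=18, false_neg=3)   # 21 asthmagens in total
spec = specificity(true_neg=76, false_pos=1)   # 77 controls in total
print(round(100 * sens), round(100 * spec))
```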
Clinical Decision Support Model to Predict Occlusal Force in Bruxism Patients

PubMed Central

Thanathornwong, Bhornsawan

2017-01-01

Objectives The aim of this study was to develop a decision support model for the prediction of occlusal force from the size and color of articulating paper markings in bruxism patients. Methods We used the information from the datasets of 30 bruxism patients in which digital measurements of the size and color of articulating paper markings (12-µm Hanel; Coltene/Whaledent GmbH, Langenau, Germany) on canine protected hard stabilization splints were measured in pixels (P) and in red (R), green (G), and blue (B) values using Adobe Photoshop software (Adobe Systems, San Jose, CA, USA). The occlusal force (F) was measured using T-Scan III (Tekscan Inc., South Boston, MA, USA). The multiple regression equation was applied to predict F from the P and RGB. Model evaluation was performed using the datasets from 10 new patients. The patient's occlusal force measured by T-Scan III was used as a ‘gold standard’ to compare with the occlusal force predicted by the multiple regression model. Results The results demonstrate that the correlation between the occlusal force and the pixels and RGB of the articulating paper markings was positive (F = 1.62×P + 0.07×R –0.08×G + 0.08×B + 4.74; R2 = 0.34). There was a high degree of agreement between the occlusal force of the patient measured using T-Scan III and the occlusal force predicted by the model (kappa value = 0.82). Conclusions The results obtained demonstrate that the multiple regression model can predict the occlusal force using the digital values for the size and color of the articulating paper markings in bruxism patients. PMID:29181234
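The fitted equation reported in the abstract above, F = 1.62×P + 0.07×R - 0.08×G + 0.08×B + 4.74, can be evaluated directly. The sketch below only transcribes those coefficients; the function name and the example inputs are hypothetical, and the units of F are not stated in the abstract.

```python
def predict_occlusal_force(pixels, red, green, blue):
    """Evaluate the multiple regression equation reported in the abstract.

    pixels: size of the articulating paper marking (P)
    red, green, blue: colour values of the marking (R, G, B)
    """
    return 1.62 * pixels + 0.07 * red - 0.08 * green + 0.08 * blue + 4.74

# Hypothetical marking: 10 pixels with R = G = B = 100.
print(predict_occlusal_force(10, 100, 100, 100))
```

With the reported R2 of 0.34, the equation explains about a third of the variance in measured force, so individual predictions of this kind should be read as rough estimates.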
Calibration Model for Apnea-Hypopnea Indices: Impact of Alternative Criteria for Hypopneas

PubMed Central

Ho, Vu; Crainiceanu, Ciprian M.; Punjabi, Naresh M.; Redline, Susan; Gottlieb, Daniel J.

2015-01-01

Study Objective: To characterize the association among apnea-hypopnea indices (AHIs) determined using three common metrics for defining hypopnea, and to develop a model to calibrate between these AHIs. Design: Cross-sectional analysis of Sleep Heart Health Study Data. Setting: Community-based. Participants: There were 6,441 men and women age 40 y or older. Measurement and Results: Three separate AHIs have been calculated, using all apneas (defined as a decrease in airflow greater than 90% from baseline for ≥ 10 sec) plus hypopneas (defined as a decrease in airflow or chest wall or abdominal excursion greater than 30% from baseline, but not meeting apnea definitions) associated with either: (1) a 4% or greater fall in oxyhemoglobin saturation—AHI4; (2) a 3% or greater fall in oxyhemoglobin saturation—AHI3; or (3) a 3% or greater fall in oxyhemoglobin saturation or an event-related arousal—AHI3a. Median values were 5.4, 9.7, and 13.4 for AHI4, AHI3, and AHI3a, respectively (P < 0.0001). Penalized spline regression models were used to compare AHI values across the three metrics and to calculate prediction intervals. Comparison of regression models demonstrates divergence in AHI scores among the three methods at low AHI values and gradual convergence at higher levels of AHI. Conclusions: The three methods of scoring hypopneas yielded significantly different estimates of the apnea-hypopnea index (AHI), although the relative difference is reduced in severe disease. The regression models presented will enable clinicians and researchers to more appropriately compare AHI values obtained using differing metrics for hypopnea. Citation: Ho V, Crainiceanu CM, Punjabi NM, Redline S, Gottlieb DJ. Calibration model for apnea-hypopnea indices: impact of alternative criteria for hypopneas. SLEEP 2015;38(12):1887–1892. PMID:26564122
Poisson Mixture Regression Models for Heart Disease Prediction.

PubMed

Mufudza, Chipo; Erol, Hamza

2016-01-01

Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into high or low risk categories and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using a Poisson mixture regression model.
Poisson Mixture Regression Models for Heart Disease Prediction

PubMed Central

Erol, Hamza

2016-01-01

Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into high or low risk categories and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using a Poisson mixture regression model. PMID:27999611
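The model comparison described above rests on the Bayesian Information Criterion, BIC = k ln(n) - 2 ln(L), where a lower value is preferred. The sketch below computes a Poisson log-likelihood and the BIC for two simple, non-mixture Poisson models. The counts and the two candidate models are hypothetical and much simpler than the mixture regressions in the paper.

```python
import math

def poisson_loglik(counts, means):
    """Log-likelihood of counts under independent Poisson distributions."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(counts, means))

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a preferred model."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Hypothetical counts from two groups of four individuals each.
counts = [0, 1, 0, 2, 5, 7, 6, 4]
n = len(counts)

# Model 1: one common mean (1 parameter).
common = sum(counts) / n
bic_one = bic(poisson_loglik(counts, [common] * n), 1, n)

# Model 2: a separate mean for each group (2 parameters).
low, high = sum(counts[:4]) / 4, sum(counts[4:]) / 4
bic_two = bic(poisson_loglik(counts, [low] * 4 + [high] * 4), 2, n)

print(bic_one, bic_two)
```

The k ln(n) term penalizes the extra parameter, so the two-mean model is preferred only if its gain in log-likelihood outweighs that penalty.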
Imaging Analysis of Hepatoblastoma Resectability Across Neoadjuvant Chemotherapy

PubMed Central

Murphy, Andrew J.; Ayers, Gregory D.; Hilmes, Melissa A.; Mukherjee, Kaushik; Wilson, Kevin J.; Allen, Wade M.; Fernandez-Pineda, Israel; Shinall, Myrick C.; Zhao, Zhiguo; Furman, Wayne L.; McCarville, Mary Beth; Davidoff, Andrew M.; Lovvorn, Harold N.

2013-01-01

Purpose Hepatoblastomas often require neoadjuvant chemotherapy to facilitate partial hepatectomy, which necessitates freedom of tumor borders from the confluence of hepatic veins (COHV), portal vein bifurcation (PVB), and retrohepatic inferior vena cava (IVC). This study aimed to clarify the effect of incremental neoadjuvant cycles on the AHEP0731 protocol criteria of hepatoblastoma resectability. Methods Hepatoblastoma responses to neoadjuvant chemotherapy were analyzed among patients (n=23) treated at two children’s hospitals between 1996 and 2010. Using digital imaging data, ellipsoid and point-based models were created to measure tumor volume regression and respective distances from tumor borders nearest to the COHV, PVB, and IVC. Results Hepatoblastoma volumes regressed with incremental neoadjuvant chemotherapy cycles (p<0.001). Although tumor borders regressed away from the COHV (p=0.008), on average only 1.1mm was gained. No change from tumor borders to the PVB was detected (p=0.102). Distances from tumor borders to the IVC remained stable at one hospital (p=0.612), but increased only 0.15mm every 10 days of therapy at the other (p=0.002). Neoadjuvant chemotherapy induced slightly more tumors to meet the threshold vascular margin of 1cm (baseline to completion): COHV, 11 (47.8%) to 17 (73.9%; p=0.058); PVB, 11 (47.8%) to 15 (65.2%; p=0.157); IVC, 4 (17.4%) to 10 (43.5%; p=0.034). No differences were detected in demographic or disease-specific characteristics between patients who did or did not achieve this 1cm margin after conclusion of chemotherapy. Conclusion Hepatoblastoma volumes regress significantly with increasing neoadjuvant chemotherapy cycles. However, tumors often remain anchored to the major hepatic vasculature, showing marginal improvement in resectability criteria. PMID:23845613
The novel application of artificial neural network on bioelectrical impedance analysis to assess the body composition in elderly

PubMed Central

2013-01-01

Background This study aims to improve the accuracy of Bioelectrical Impedance Analysis (BIA) prediction equations for estimating fat free mass (FFM) of the elderly by using a non-linear Back Propagation Artificial Neural Network (BP-ANN) model, and to compare its predictive accuracy with that of the linear regression model, using dual-energy X-ray absorptiometry (DXA) as the reference method. Methods A total of 88 Taiwanese elderly adults were recruited in this study as subjects. Linear regression equations and a BP-ANN prediction equation were developed using impedances and other anthropometrics for predicting the reference FFM measured by DXA (FFMDXA) in 36 male and 26 female Taiwanese elderly adults. The FFM estimated by BIA prediction equations using the traditional linear regression model (FFMLR) and the BP-ANN model (FFMANN) were compared to the FFMDXA. The measurements of an additional 26 elderly adults were used to validate the accuracy of the predictive models. Results The results showed that the significant predictors were impedance, gender, age, height and weight in the developed FFMLR linear model (LR) for predicting FFM (coefficient of determination, r2 = 0.940; standard error of estimate (SEE) = 2.729 kg; root mean square error (RMSE) = 2.571 kg, P < 0.001). The above predictors were set as the variables of the input layer, using five neurons, in the BP-ANN model (r2 = 0.987 with a SD = 1.192 kg and relatively lower RMSE = 1.183 kg), which had greater (improved) accuracy for estimating FFM when compared with the linear model. The results showed that better agreement existed between FFMANN and FFMDXA than between FFMLR and FFMDXA. Conclusion When the performance of the developed prediction equations for estimating the reference FFMDXA was compared, the linear model had a lower r2 and a larger SD in its predictive results than the BP-ANN model, which indicates that the ANN model is more suitable for estimating FFM. PMID:23388042

Application of neural networks and sensitivity analysis to improved prediction of trauma survival.

PubMed

Hunter, A; Kennedy, L; Henry, J; Ferguson, I

2000-05-01

The performance of trauma departments is widely audited by applying predictive models that assess probability of survival, and examining the rate of unexpected survivals and deaths. Although the TRISS methodology, a logistic regression modelling technique, is still the de facto standard, it is known that neural network models perform better. A key issue when applying neural network models is the selection of input variables. This paper proposes a novel form of sensitivity analysis, which is simpler to apply than existing techniques, and can be used for both numeric and nominal input variables. The technique is applied to the audit survival problem, and used to analyse the TRISS variables. The conclusions discuss the implications for the design of further improved scoring schemes and predictive models.
Female married illiteracy as the most important continual determinant of total fertility rate among districts of Empowered Action Group States of India: Evidence from Annual Health Survey 2011–12

PubMed Central

Kumar, Rajesh; Dogra, Vishal; Rani, Khushbu; Sahu, Kanti

2017-01-01

Background: District level determinants of total fertility rate in Empowered Action Group states of India can help in ongoing population stabilization programs in India. Objective: The present study intends to assess the role of district level determinants in predicting total fertility rate among districts of the Empowered Action Group states of India. Material and Methods: Data from the Annual Health Survey (2011-12) were analysed using STATA and R software packages. Multiple linear regression models were built and evaluated using the Akaike Information Criterion. For further understanding, recursive partitioning was used to prepare a regression tree. Results: Female married illiteracy was positively associated with total fertility rate and explained more than half (53%) of the variance. Under the multiple linear regression model, married illiteracy, infant mortality rate, antenatal care registration, household size, median age of live birth and sex ratio explained 70% of the total variance in total fertility rate. In the regression tree, female married illiteracy was the root node, and a split at 42% determined TFR <= 2.7. The next left side branch was again married illiteracy, with a split at 23% determining TFR <= 2.1. Conclusion: We conclude that female married illiteracy is one of the most important determinants explaining total fertility rate among the districts of the Empowered Action Group states. Focus on female literacy is required to stabilize population growth in the long run. PMID:29416999
Predicting fundamental and realized distributions based on thermal niche: A case study of a freshwater turtle

NASA Astrophysics Data System (ADS)

Rodrigues, João Fabrício Mota; Coelho, Marco Túlio Pacheco; Ribeiro, Bruno R.

2018-04-01

Species distribution models (SDM) have been broadly used in ecology to address theoretical and practical problems. Currently, there are two main approaches to generate SDMs: (i) correlative models, which are based on species occurrences and environmental predictor layers, and (ii) process-based models, which are constructed based on species' functional traits and physiological tolerances. The distributions estimated by each approach are based on different components of the species niche. Predictions of correlative models approach species' realized niches, while predictions of process-based models are more akin to species' fundamental niches. Here, we integrated the predictions of fundamental and realized distributions of the freshwater turtle Trachemys dorbigni. Fundamental distribution was estimated using data on T. dorbigni's egg incubation temperature, and realized distribution was estimated using species occurrence records. Both types of distributions were estimated using the same regression approaches (logistic regression and support vector machines), both considering macroclimatic and microclimatic temperatures. The realized distribution of T. dorbigni was generally nested in its fundamental distribution, reinforcing theoretical assumptions that the species' realized niche is a subset of its fundamental niche. Both modelling algorithms produced similar results, but microtemperature generated better results than macrotemperature for the incubation model. Finally, our results reinforce the conclusion that species' realized distributions are constrained by factors other than just thermal tolerances.
Impact of Colic Pain as a Significant Factor for Predicting the Stone Free Rate of One-Session Shock Wave Lithotripsy for Treating Ureter Stones: A Bayesian Logistic Regression Model Analysis

PubMed Central

Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong

2015-01-01

Purpose This study was conducted to evaluate colic pain as a prognostic pretreatment factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. Colic pain as the chief complaint was defined as subjective flank pain reported during history taking and physical examination. Propensity scores for colic pain were calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as a predictor of stone-free status by Bayesian and non-Bayesian logistic regression models. Results After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in the variables used in propensity-score matching. One-session success and stone-free rates were higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions Colic pain in patients with ureter calculi was one of the significant predictive factors, along with MSL and MSD, for one-session stone-free status after SWL. PMID:25902059
Religiosity and decreased risk of substance use disorders: is the effect mediated by social support or mental health status?

PubMed Central

Harris, Katherine M.; Koenig, Harold G.; Han, Xiaotong; Sullivan, Greer; Mattox, Rhonda; Tang, Lingqi

2009-01-01

Objective The negative association between religiosity (religious beliefs and church attendance) and the likelihood of substance use disorders is well established, but the mechanism(s) remain poorly understood. We investigated whether this association was mediated by social support or mental health status. Method We utilized cross-sectional data from the 2002 National Survey on Drug Use and Health (n = 36,370). We first used logistic regression to regress any alcohol use in the past year on sociodemographic and religiosity variables. Then, among individuals who drank in the past year, we regressed past year alcohol abuse/dependence on sociodemographic and religiosity variables. To investigate whether social support mediated the association between religiosity and alcohol use and alcohol abuse/dependence we repeated the above models, adding the social support variables. To the extent that these added predictors modified the magnitude of the effect of the religiosity variables, we interpreted social support as a possible mediator. We also formally tested for mediation using path analysis. We investigated the possible mediating role of mental health status analogously. Parallel sets of analyses were conducted for any drug use, and drug abuse/dependence among those using any drugs as the dependent variables. Results The addition of social support and mental health status variables to logistic regression models had little effect on the magnitude of the religiosity coefficients in any of the models. While some of the tests of mediation were significant in the path analyses, the results were not always in the expected direction, and the magnitude of the effects was small. Conclusions The association between religiosity and decreased likelihood of a substance use disorder does not appear to be substantively mediated by either social support or mental health status. PMID:19714282
Alterations of papilla dimensions after orthodontic closure of the maxillary midline diastema: a retrospective longitudinal study

PubMed Central

2016-01-01

Purpose The aim of this study was to evaluate alterations of papilla dimensions after orthodontic closure of the diastema between maxillary central incisors. Methods Sixty patients who had a visible diastema between maxillary central incisors that had been closed by orthodontic approximation were selected for this study. Various papilla dimensions were assessed on clinical photographs and study models before the orthodontic treatment and at the follow-up examination after closure of the diastema. Influences of the variables assessed before orthodontic treatment on the alterations of papilla height (PH) and papilla base thickness (PBT) were evaluated by univariate regression analysis. To analyze potential influences of the 3-dimensional papilla dimensions before orthodontic treatment on the alterations of PH and PBT, a multiple regression model was formulated including the 3-dimensional papilla dimensions as predictor variables. Results On average, PH decreased by 0.80 mm and PBT increased after orthodontic closure of the diastema (P<0.01). Univariate regression analysis revealed that the PH (P=0.002) and PBT (P=0.047) before orthodontic treatment influenced the alteration of PH. With respect to the alteration of PBT, the diastema width (P=0.045) and PBT (P=0.000) were found to be influential factors. PBT before the orthodontic treatment significantly influenced the alteration of PBT in the multiple regression model. Conclusions PH decreased but PBT increased after orthodontic closure of the diastema. The papilla dimensions before orthodontic treatment influenced the alterations of PH and PBT after closure of the diastema. The PBT increased more when the diastema width before the orthodontic treatment was larger. PMID:27382507
Asthma exacerbation and proximity of residence to major roads: a population-based matched case-control study among the pediatric Medicaid population in Detroit, Michigan

PubMed Central

2011-01-01

Background The relationship between asthma and traffic-related pollutants has received considerable attention. The use of individual-level exposure measures, such as residence location or proximity to emission sources, may avoid ecological biases. Method This study focused on the pediatric Medicaid population in Detroit, MI, a high-risk population for asthma-related events. A population-based matched case-control analysis was used to investigate associations between acute asthma outcomes and proximity of residence to major roads, including freeways. Asthma cases were identified as all children who made at least one asthma claim, including inpatient and emergency department visits, during the three-year study period, 2004-06. Individually matched controls were randomly selected from the rest of the Medicaid population on the basis of non-respiratory related illness. We used conditional logistic regression with distance as both categorical and continuous variables, and examined non-linear relationships with distance using polynomial splines. The conditional logistic regression models were then extended by considering multiple asthma states (based on the frequency of acute asthma outcomes) using polychotomous conditional logistic regression. Results Asthma events were associated with proximity to primary roads with an odds ratio of 0.97 (95% CI: 0.94, 0.99) for a 1 km increase in distance using conditional logistic regression, implying that asthma events are less likely as the distance between the residence and a primary road increases. Similar relationships and effect sizes were found using polychotomous conditional logistic regression. Another plausible exposure metric, a reduced form response surface model that represents atmospheric dispersion of pollutants from roads, was not associated with asthma events. Conclusions There is moderately strong evidence of elevated risk of asthma close to major roads based on the results obtained in this population-based matched case-control study. PMID:21513554
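An odds ratio reported per 1 km of distance, as in the abstract above, can be rescaled to other distances if the log-odds are assumed to be linear in distance, which is what a continuous-distance logistic model implies. The sketch below applies that assumption to the reported estimate of 0.97 and its confidence limits; the function name and the 2 km example are hypothetical.

```python
import math

def odds_ratio_over_distance(or_per_km, km):
    """Rescale a per-km odds ratio to `km` kilometres, assuming log-linearity."""
    return math.exp(math.log(or_per_km) * km)

# Reported estimate and 95% confidence limits per 1 km increase in distance.
for label, value in (("estimate", 0.97), ("lower", 0.94), ("upper", 0.99)):
    print(label, odds_ratio_over_distance(value, 2.0))
```

The rescaling does not apply to the spline-based non-linear analyses mentioned in the abstract, where the effect of distance is not constant per kilometre.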
Multicollinearity in spatial genetics: separating the wheat from the chaff using commonality analyses.

PubMed

Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C

2015-01-01

Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the object of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA allows identifying the location and magnitude of multicollinearity, revealing spurious correlations and thus thoroughly improving the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.
Parametric regression model for survival data: Weibull regression model as an example

PubMed Central

2016-01-01

The Weibull regression model is one of the most popular parametric regression models because it provides an estimate of the baseline hazard function as well as coefficients for covariates. Because of technical difficulties, the Weibull regression model is seldom used in the medical literature compared with the semi-parametric proportional hazards model. To familiarize clinical investigators with the Weibull regression model, this article introduces some basic knowledge of the model and then illustrates how to fit it with R software. The SurvRegCensCov package is useful for converting estimated coefficients to clinically relevant statistics such as the hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by a categorical variable. The eha package provides an alternative method for fitting the Weibull regression model. The check.dist() function helps to assess the goodness of fit of the model. Variable selection is based on the importance of a covariate, which can be tested using the anova() function. Alternatively, backward elimination starting from a full model is an efficient approach to model development. Visualizing the Weibull regression model after model development is worthwhile because it provides another way to report findings. PMID:28149846
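The conversion from estimated coefficients to HR and ETR mentioned above can be sketched as follows. This is a minimal Python illustration, not the R code from the article: it assumes the model was fitted in the accelerated failure time form log T = mu + alpha*x + sigma*W (W following the standard minimum extreme value distribution), and all function and argument names are hypothetical.

```python
import math

def convert_weibull_aft(intercept, aft_coefs, scale):
    """Convert Weibull AFT estimates to proportional-hazards quantities.

    intercept : AFT intercept (mu)
    aft_coefs : dict mapping covariate name -> AFT coefficient (alpha)
    scale     : AFT scale parameter (sigma > 0)

    With S(t | x) = exp(-lambda * t**gamma * exp(beta * x)):
      gamma = 1 / sigma, lambda = exp(-mu / sigma), beta = -alpha / sigma.
    """
    result = {
        "shape": 1.0 / scale,
        "lambda": math.exp(-intercept / scale),
        "covariates": {},
    }
    for name, alpha in aft_coefs.items():
        result["covariates"][name] = {
            "HR": math.exp(-alpha / scale),  # hazard ratio per unit of covariate
            "ETR": math.exp(alpha),          # event time ratio per unit of covariate
        }
    return result

# Illustrative (made-up) estimates: event times double per unit of "treatment".
converted = convert_weibull_aft(1.0, {"treatment": math.log(2.0)}, scale=0.5)
print(converted)
```

A covariate that lengthens event times (ETR above 1) necessarily lowers the hazard (HR below 1) in this parameterization, which is a useful sanity check when reading fitted output.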
Introduction to the use of regression models in epidemiology.

PubMed

Bender, Ralf

2009-01-01

Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Exposure time independent summary statistics for assessment of drug dependent cell line growth inhibition

PubMed Central

2014-01-01

Background In vitro generated dose-response curves of human cancer cell lines are widely used to develop new therapeutics. The curves are summarised by simplified statistics that ignore the conventionally used dose-response curves’ dependency on drug exposure time and growth kinetics. This may lead to suboptimal exploitation of data and biased conclusions on the potential of the drug in question. Therefore we set out to improve the dose-response assessments by eliminating the impact of time dependency. Results First, a mathematical model for drug induced cell growth inhibition was formulated and used to derive novel dose-response curves and improved summary statistics that are independent of time under the proposed model. Next, a statistical analysis workflow for estimating the improved statistics was suggested consisting of 1) nonlinear regression models for estimation of cell counts and doubling times, 2) isotonic regression for modelling the suggested dose-response curves, and 3) resampling based method for assessing variation of the novel summary statistics. We document that conventionally used summary statistics for dose-response experiments depend on time so that fast growing cell lines compared to slowly growing ones are considered overly sensitive. The adequacy of the mathematical model is tested for doxorubicin and found to fit real data to an acceptable degree. Dose-response data from the NCI60 drug screen were used to illustrate the time dependency and demonstrate an adjustment correcting for it. The applicability of the workflow was illustrated by simulation and application on a doxorubicin growth inhibition screen. The simulations show that under the proposed mathematical model the suggested statistical workflow results in unbiased estimates of the time independent summary statistics. 
Variance estimates of the novel summary statistics are used to conclude that the doxorubicin screen covers a significantly diverse range of responses, ensuring it is useful for biological interpretations. Conclusion Time independent summary statistics may aid the understanding of drugs’ action mechanism on tumour cells and potentially renew previous drug sensitivity evaluation studies. PMID:24902483
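Step 2 of the workflow above relies on isotonic regression. A minimal sketch of the standard pool-adjacent-violators algorithm is given below; it is a generic implementation, not the authors' code, and the example response values are invented for illustration.

```python
def pava_nondecreasing(y, w=None):
    """Weighted least-squares fit of a nondecreasing sequence to y (PAVA)."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block holds [weighted mean, total weight, number of points]
    for value, weight in zip(y, w):
        blocks.append([float(value), float(weight), 1])
        # Pool adjacent blocks while they violate the ordering constraint.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

def pava_nonincreasing(y, w=None):
    """Nonincreasing fit, e.g. a response that should fall as dose rises."""
    return [-v for v in pava_nondecreasing([-v for v in y], w)]

# Hypothetical relative growth at increasing doses; the third point breaks monotonicity.
fit = pava_nonincreasing([1.0, 0.8, 0.9, 0.3])
print(fit)
```

The fit replaces each run of order-violating points with their weighted mean, so the curve is monotone without assuming any parametric dose-response shape.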
Risk estimation using probability machines

PubMed Central

2014-01-01

Background Logistic regression has been the de facto, and often the only, model used in the description and analysis of relationships between a binary outcome and observed features. It is widely used to obtain the conditional probabilities of the outcome given predictors, as well as predictor effect size estimates using conditional odds ratios. Results We show how statistical learning machines for binary outcomes, provably consistent for the nonparametric regression problem, can be used to provide both consistent conditional probability estimation and conditional effect size estimates. Effect size estimates from learning machines leverage our understanding of counterfactual arguments central to the interpretation of such estimates. We show that, if the data generating model is logistic, we can recover accurate probability predictions and effect size estimates with nearly the same efficiency as a correct logistic model, both for main effects and interactions. We also propose a method using learning machines to scan for possible interaction effects quickly and efficiently. Simulations using random forest probability machines are presented. Conclusions The models we propose make no assumptions about the data structure, and capture the patterns in the data by just specifying the predictors involved and not any particular model structure. So they do not run the same risks of model mis-specification and the resultant estimation biases as a logistic model. This methodology, which we call a “risk machine”, will share properties from the statistical machine that it is derived from. PMID:24581306
Funding source, conflict of interest and positive conclusions in neuro-oncology clinical trials.

PubMed

Moraes, Fabio Y; Mendez, Lucas C; Taunk, Neil K; Raman, Srinivas; Suh, John H; Souhami, Luis; Slotman, Ben; Weltman, Eduardo; Spratt, Daniel E; Berlin, Alejandro; Marta, Gustavo N

2018-02-01

We aimed to test any association between authors' conclusions and self-reported COI or funding sources in central nervous system (CNS) studies. A review was performed for CNS malignancy clinical trials published in the last 5 years. Two investigators independently classified study conclusions according to authors' endorsement of the experimental therapy. Statistical models were used to test for associations between positive conclusions and trial characteristics. From February 2010 to February 2015, 1256 articles were retrieved; 319 were considered eligible trials. Positive conclusions were reported in 56.8% of trials with industry-only, 55.6% with academia-only, 44.1% with academia and industry, 77.8% with none, and 76.4% with not described funding source (p = 0.011). Positive conclusions were reported in 60.4% of trials with unrelated COI, 60% with related COI, and 60% with no COI reported (p = 0.997). Factors that were significantly associated with the presence of a positive conclusion included trial design (phase 1) [OR 11.64 (95% CI 4.66-29.09), p < 0.001], geographic location (outside North America or Europe) [OR 1.96 (95% CI 1.05-3.79), p = 0.025], primary outcomes (non-overall or progression free survival) [OR 3.74 (95% CI 2.27-6.18), p < 0.001], and failure to disclose funding source [OR 2.45 (95% CI 1.22-5.22), p = 0.011]. In a multivariable regression model, all these factors remained significantly associated with a trial's positive conclusion. Funding source and self-reported COI did not appear to influence the conclusions of CNS trials. Funding source information and COI disclosure were under-reported in 14.1% and 17.2% of the CNS trials, respectively. Continued efforts are needed to increase rates of both COI and funding source reporting.
Interpretation of commonly used statistical regression models.

PubMed

Kasza, Jessica; Wolfe, Rory

2014-01-01

A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Private Drinking Water Wells as a Source of Exposure to Perfluorooctanoic Acid (PFOA) in Communities Surrounding a Fluoropolymer Production Facility

PubMed Central

Hoffman, Kate; Webster, Thomas F.; Bartell, Scott M.; Weisskopf, Marc G.; Fletcher, Tony; Vieira, Verónica M.

2011-01-01

Background The C8 Health Project was established in 2005 to collect data on perfluorooctanoic acid (PFOA, or C8) and human health in Ohio and West Virginia communities contaminated by a fluoropolymer production facility. Objective We assessed PFOA exposure via contaminated drinking water in a subset of C8 Health Project participants who drank water from private wells. Methods Participants provided demographic information and residential, occupational, and medical histories. Laboratory analyses were conducted to determine serum-PFOA concentrations. PFOA data were collected from 2001 through 2005 from 62 private drinking water wells. We examined the relationship between drinking water and PFOA levels in serum using robust regression methods. As a comparison with regression models, we used a first-order, single-compartment pharmacokinetic model to estimate the serum:drinking-water concentration ratio at steady state. Results The median serum PFOA concentration in 108 study participants who used private wells was 75.7 μg/L, approximately 20 times greater than the levels in the U.S. general population but similar to those of local residents who drank public water. Each 1 μg/L increase in PFOA levels in drinking water was associated with an increase in serum concentrations of 141.5 μg/L (95% confidence interval, 134.9–148.1). The serum:drinking-water concentration ratio for the steady-state pharmacokinetic model was 114. Conclusions PFOA-contaminated drinking water is a significant contributor to PFOA levels in serum in the study population. Regression methods and pharmacokinetic modeling produced similar estimates of the relationship. PMID:20920951
[Bibliometrics and visualization analysis of land use regression models in ambient air pollution research].

PubMed

Zhang, Y J; Zhou, D H; Bai, Z P; Xue, F X

2018-02-10

Objective: To quantitatively analyze the current status and development trends of land use regression (LUR) models in ambient air pollution studies. Methods: Relevant literature from the PubMed database before June 30, 2017 was analyzed using the Bibliographic Items Co-occurrence Matrix Builder (BICOMB 2.0). Keyword co-occurrence networks, cluster mapping and timeline mapping were generated using the CiteSpace 5.1.R5 software. Relevant literature identified in three Chinese databases was also reviewed. Results: Four hundred sixty four relevant papers were retrieved from the PubMed database. The number of papers published showed an annual increase, in line with the growing trend of the index. Most papers were published in the journal Environmental Health Perspectives. Results from the co-word cluster analysis identified five clusters: cluster #0 consisted of birth cohort studies related to the health effects of prenatal exposure to air pollution; cluster #1 referred to land use regression modeling and exposure assessment; cluster #2 was related to the epidemiology of traffic exposure; cluster #3 dealt with exposure to ultrafine particles and related health effects; cluster #4 described exposure to black carbon and related health effects. Data from timeline mapping indicated that clusters #0 and #1 were the main research areas while clusters #3 and #4 were the emerging hot areas of research. Ninety four relevant papers were retrieved from the Chinese databases, most of them related to studies on modeling. Conclusion: In order to better assess the health-related risks of ambient air pollution, and to best inform preventative public health intervention policies, application of LUR models to environmental epidemiology studies in China should be encouraged.
Collaborative Chronic Care Models for Mental Health Conditions: Cumulative Meta-Analysis and Meta-Regression to Guide Future Research and Implementation

PubMed Central

Grogan-Kaylor, Andrew; Perron, Brian E.; Kilbourne, Amy M.; Woltmann, Emily; Bauer, Mark S.

2013-01-01

Objective Prior meta-analysis indicates that collaborative chronic care models (CCMs) improve mental and physical health outcomes for individuals with mental disorders. This study aimed to investigate the stability of evidence over time and identify patient and intervention factors associated with CCM effects in order to facilitate implementation and sustainability of CCMs in clinical practice. Method We reviewed 53 CCM trials that analyzed depression, mental quality of life (QOL), or physical QOL outcomes. Cumulative meta-analysis and meta-regression were supplemented by descriptive investigations across and within trials. Results Most trials targeted depression in the primary care setting, and cumulative meta-analysis indicated that effect sizes favoring CCM quickly achieved significance for depression outcomes, and more recently achieved significance for mental and physical QOL. Four of six CCM elements (patient self-management support, clinical information systems, system redesign, and provider decision support) were common among reviewed trials, while two elements (healthcare organization support and linkages to community resources) were rare. No single CCM element was statistically associated with the success of the model. Similarly, meta-regression did not identify specific factors associated with CCM effectiveness. Nonetheless, results within individual trials suggest that increased illness severity predicts CCM outcomes. Conclusions Significant CCM trials have been derived primarily from four original CCM elements. Nonetheless, implementing and sustaining this established model will require healthcare organization support. While CCMs have typically been tested as population-based interventions, evidence supports stepped care application to more severely ill individuals. 
Future priorities include developing implementation strategies to support adoption and sustainability of the model in clinical settings while maximizing fit of this multi-component framework to local contextual factors. PMID:23938600
Influenza vaccine coverage, influenza-associated morbidity and all-cause mortality in Catalonia (Spain).

PubMed

Muñoz, M Pilar; Soldevila, Núria; Martínez, Anna; Carmona, Glòria; Batalla, Joan; Acosta, Lesly M; Domínguez, Angela

2011-07-12

The objective of this work was to study the behaviour of influenza with respect to morbidity and all-cause mortality in Catalonia, and their association with influenza vaccination coverage. The study was carried out over 13 influenza seasons, from epidemiological week 40 of 1994 to week 20 of 2007, and included confirmed cases of influenza and all-cause mortality. Two generalized linear models were fitted: influenza-associated morbidity was modelled by Poisson regression and all-cause mortality by negative binomial regression. The seasonal component was modelled with a periodic function formed by a sum of sines and cosines. Expected influenza mortality during periods of influenza virus circulation was estimated by Poisson regression and its confidence intervals by the bootstrap approach. Vaccination coverage was associated with a reduction in influenza-associated morbidity (p<0.001), but not with a reduction in all-cause mortality (p=0.149). In the case of influenza-associated morbidity, an increase of 5% in vaccination coverage represented a reduction of 3% in the incidence rate of influenza. There was a positive association between influenza-associated morbidity and all-cause mortality. Excess mortality attributable to influenza epidemics was estimated as 34.4 (95% CI: 28.4-40.8) weekly deaths. In conclusion, all-cause mortality is a good indicator for influenza surveillance, and vaccination coverage is associated with a reduction in influenza-associated morbidity but not in all-cause mortality. Copyright © 2011 Elsevier Ltd. All rights reserved.
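The seasonal component described above can be sketched as sine and cosine regressors inside a log-link Poisson model. The following is an illustration only: the 52-week period, the single harmonic, the coefficient values, and the reading of "5%" as five percentage points of coverage are all assumptions, and no model is actually fitted here.

```python
import math

def harmonic_terms(week, period=52.0, harmonics=1):
    """Sine and cosine regressors for a weekly seasonal pattern."""
    terms = []
    for k in range(1, harmonics + 1):
        angle = 2.0 * math.pi * k * week / period
        terms.extend([math.sin(angle), math.cos(angle)])
    return terms

def expected_count(week, intercept, seasonal_coefs, coverage_coef=0.0, coverage=0.0):
    """Expected weekly count: exp(intercept + seasonal terms + coverage effect)."""
    terms = harmonic_terms(week, harmonics=len(seasonal_coefs) // 2)
    linear = intercept + sum(c * t for c, t in zip(seasonal_coefs, terms))
    return math.exp(linear + coverage_coef * coverage)

# A 3% lower incidence rate per 5 points of coverage is a rate ratio of 0.97,
# i.e. a log-rate coefficient of log(0.97) / 5 per point (assumed interpretation).
coef_per_point = math.log(0.97) / 5.0
baseline = expected_count(10, 3.0, [0.5, -0.2], coef_per_point, coverage=0.0)
with_more = expected_count(10, 3.0, [0.5, -0.2], coef_per_point, coverage=5.0)
print(with_more / baseline)
```

Because the link is logarithmic, covariate effects act multiplicatively on the expected count, so the rate ratio is the same in every week of the season.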
The Joint Effects of Lifestyle Factors and Comorbidities on the Risk of Colorectal Cancer: A Large Chinese Retrospective Case-Control Study

PubMed Central

Hu, Hai; Zhou, Yangyang; Ren, Shujuan; Wu, Jiajin; Zhu, Meiying; Chen, Donghui; Yang, Haiyan; Wang, Liwei

2015-01-01

Background Colorectal cancer (CRC) is a major cause of cancer morbidity and mortality. In previous epidemiologic studies, the respective correlations of lifestyle factors and comorbidity with CRC have been extensively studied. However, little is known about their joint effects on CRC. Methods We conducted a retrospective case-control study of 1,144 diagnosed CRC patients and 60,549 community controls. A structured questionnaire was administered to the participants about their socio-demographic factors, anthropometric measures, comorbidity history and lifestyle factors. A logistic regression model was used to calculate odds ratios (ORs) and 95% confidence intervals (95% CIs) for each factor. Based on the results of the logistic regression model, we further developed a healthy lifestyle index (HLI) and a comorbidity history index (CHI) to investigate their independent and joint effects on CRC risk. Results Four lifestyle factors (physical activities, sleep, red meat and vegetable consumption) and four types of comorbidity (diabetes, hyperlipidemia, history of inflammatory bowel disease and polyps) were found to be independently associated with the risk of CRC in a multivariable logistic regression model. Intriguingly, their combined patterns, HLI and CHI, were significantly associated with CRC risk both independently (OR_HLI: 3.91, 95% CI: 3.13–4.88; OR_CHI: 2.49, 95% CI: 2.11–2.93) and jointly (OR: 10.33, 95% CI: 6.59–16.18). Conclusions There are synergistic effects of lifestyle factors and comorbidity on the risk of colorectal cancer in the Chinese population. PMID:26710070
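One way to read the joint odds ratio above is to compare it with what the two separate odds ratios would imply if there were no interaction. The sketch below does this arithmetic on the reported point estimates only. It ignores the confidence intervals, treats the ORs as approximations to risk ratios for the additive measure, and assumes the three ORs share a common reference group, which the abstract does not state; the function name is illustrative.

```python
def interaction_summary(or_a, or_b, or_joint):
    """Compare a joint odds ratio with no-interaction expectations."""
    expected_multiplicative = or_a * or_b
    return {
        "expected_multiplicative": expected_multiplicative,
        "ratio_observed_to_multiplicative": or_joint / expected_multiplicative,
        "expected_additive": or_a + or_b - 1.0,
        # Relative excess risk due to interaction (additive scale).
        "reri": or_joint - or_a - or_b + 1.0,
    }

# Point estimates from the abstract: HLI 3.91, CHI 2.49, joint 10.33.
summary = interaction_summary(3.91, 2.49, 10.33)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

On these point estimates the joint OR is close to the product of the separate ORs and well above their additive expectation, so whether the effect is called synergistic depends on the scale used.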
Locoregional Control of Non-Small Cell Lung Cancer in Relation to Automated Early Assessment of Tumor Regression on Cone Beam Computed Tomography

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brink, Carsten, E-mail: carsten.brink@rsyd.dk; Laboratory of Radiation Physics, Odense University Hospital; Bernchou, Uffe

2014-07-15

Purpose: Large interindividual variations in volume regression of non-small cell lung cancer (NSCLC) are observable on standard cone beam computed tomography (CBCT) during fractionated radiation therapy. Here, a method for automated assessment of tumor volume regression is presented and its potential use in response adapted personalized radiation therapy is evaluated empirically. Methods and Materials: Automated deformable registration with calculation of the Jacobian determinant was applied to serial CBCT scans in a series of 99 patients with NSCLC. Tumor volume at the end of treatment was estimated on the basis of the first one third and two thirds of the scans. The concordance between estimated and actual relative volume at the end of radiation therapy was quantified by Pearson's correlation coefficient. On the basis of the estimated relative volume, the patients were stratified into 2 groups having volume regressions below or above the population median value. Kaplan-Meier plots of locoregional disease-free rate and overall survival in the 2 groups were used to evaluate the predictive value of tumor regression during treatment. A Cox proportional hazards model was used to adjust for other clinical characteristics. Results: Automatic measurement of the tumor regression from standard CBCT images was feasible. Pearson's correlation coefficient between manual and automatic measurement was 0.86 in a sample of 9 patients. Most patients experienced tumor volume regression, and this could be quantified early into the treatment course. Interestingly, patients with pronounced volume regression had worse locoregional tumor control and overall survival. This was significant in patients with non-adenocarcinoma histology. Conclusions: Evaluation of routinely acquired CBCT images during radiation therapy provides biological information on the specific tumor. This could potentially form the basis for personalized response adaptive therapy.
Study of relationship between clinical factors and velopharyngeal closure in cleft palate patients

PubMed Central

Chen, Qi; Zheng, Qian; Shi, Bing; Yin, Heng; Meng, Tian; Zheng, Guang-ning

2011-01-01

BACKGROUND: This study was carried out to analyze the relationship between clinical factors and velopharyngeal closure (VPC) in cleft palate patients. METHODS: The chi-square test was used to compare postoperative velopharyngeal closure rates. A logistic regression model was used to analyze independent variables associated with velopharyngeal closure. RESULTS: Postoperative VPC rates differed significantly across cleft types, operative ages and surgical techniques (P=0.000). Logistic regression analysis suggested that patients had a higher rate of velopharyngeal insufficiency after primary palatal repair when the operative age was beyond the deciduous dentition stage, when the cleft palate was complete, or when they had undergone only a simple palatoplasty without levator veli palatini retropositioning. CONCLUSIONS: Cleft type, operative age and surgical technique were contributing factors influencing the VPC rate after primary palatal repair in cleft palate patients. PMID:22279464
The Relationships between Weather-Related Factors and Daily Outdoor Physical Activity Counts on an Urban Greenway

PubMed Central

Wolff, Dana; Fitzhugh, Eugene C.

2011-01-01

The purpose of this study was to examine relationships between weather and outdoor physical activity (PA). An online weather source was used to obtain daily max temperature [DMT], precipitation, and wind speed. An infra-red trail counter provided data on daily trail use along a greenway, over a 2-year period. Multiple regression analysis was used to examine associations between PA and weather, while controlling for day of the week and month of the year. The overall regression model explained 77.0% of the variance in daily PA (p < 0.001). DMT (b = 10.5), max temp-squared (b = −4.0), precipitation (b = −70.0), and max wind speed (b = 1.9) contributed significantly. Conclusion: Aggregated daily data can detect relationships between weather and outdoor PA. PMID:21556205
Can We Predict Individual Combined Benefit and Harm of Therapy? Warfarin Therapy for Atrial Fibrillation as a Test Case

PubMed Central

Li, Guowei; Thabane, Lehana; Delate, Thomas; Witt, Daniel M.; Levine, Mitchell A. H.; Cheng, Ji; Holbrook, Anne

2016-01-01

Objectives To construct and validate a prediction model for individual combined benefit and harm outcomes (stroke with no major bleeding, major bleeding with no stroke, neither event, or both) in patients with atrial fibrillation (AF) with and without warfarin therapy. Methods Using the Kaiser Permanente Colorado databases, we included patients newly diagnosed with AF between January 1, 2005 and December 31, 2012 for model construction and validation. The primary outcome was a prediction model of the composite of stroke or major bleeding using polytomous logistic regression (PLR) modelling. The secondary outcome was a prediction model of all-cause mortality using Cox regression modelling. Results We included 9074 patients: 4537 warfarin users and 4537 non-users. In the derivation cohort (n = 4632), there were 136 strokes (2.94%), 280 major bleedings (6.04%) and 1194 deaths (25.78%). In the prediction models, warfarin use was not significantly associated with risk of stroke, but increased the risk of major bleeding and decreased the risk of death. Both the PLR and Cox models were robust, internally and externally validated, and showed acceptable model performance. Conclusions In this study, we introduce a new methodology for predicting individual combined benefit and harm outcomes associated with warfarin therapy for patients with AF. Should this approach be validated in other patient populations, it has potential advantages over existing risk stratification approaches as a patient-physician aid for shared decision-making. PMID:27513986
Modified Regression Correlation Coefficient for Poisson Regression Model

NASA Astrophysics Data System (ADS)

Kaengthong, Nattacha; Domthong, Uthumporn

2017-09-01

This study examines widely used indicators of the predictive power of the Generalized Linear Model (GLM), which often have some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among them. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
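For orientation, the traditional regression correlation coefficient described above can be computed as the Pearson correlation between Y and the fitted mean E(Y|X). The sketch below assumes a log-link Poisson model whose coefficients are already estimated. It does not implement the authors' modified coefficient, which the abstract does not specify, and the data and coefficients are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def regression_correlation(y, rows, beta):
    """corr(Y, E(Y|X)) with E(Y|X) = exp(beta0 + beta1*x1 + ...)."""
    fitted = [
        math.exp(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
        for row in rows
    ]
    return pearson(y, fitted)

# Hypothetical counts and a hypothetical fitted model with one covariate.
counts = [1, 3, 3, 9]
covariates = [[0.0], [1.0], [2.0], [3.0]]
r = regression_correlation(counts, covariates, beta=[0.0, math.log(2.0)])
print(round(r, 3))
```

Values near 1 indicate that the fitted means track the observed counts closely; unlike R-squared in linear regression, the measure is not tied to a variance decomposition.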
A secure distributed logistic regression protocol for the detection of rare adverse drug events

PubMed Central

El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat

2013-01-01

Background There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. 
We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. PMID:22871397
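The claim that the distributed model parameters match those from pooled raw data follows from the fact that the logistic log-likelihood gradient and information matrix are sums over records, so site-level summaries can be added at the analysis center. The sketch below illustrates only that additivity with a plain Newton iteration. It deliberately omits every cryptographic and privacy-preserving step of the published protocol, and the data, function names, and single-covariate model are invented for illustration.

```python
import math

def local_summaries(rows, beta):
    """Gradient and information-matrix entries for one site's records.

    rows: list of (x, y) with y in {0, 1}; model: logit P(y=1) = beta[0] + beta[1]*x.
    """
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        g0 += y - p
        g1 += (y - p) * x
        w = p * (1.0 - p)
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)

def fit_from_sites(sites, iterations=25):
    """Newton iterations using only summed per-site summaries."""
    beta = [0.0, 0.0]
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for rows in sites:
            (a, b), (c, d, e) = local_summaries(rows, beta)
            g0, g1 = g0 + a, g1 + b
            h00, h01, h11 = h00 + c, h01 + d, h11 + e
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * step = g and update beta.
        beta[0] += (h11 * g0 - h01 * g1) / det
        beta[1] += (h00 * g1 - h01 * g0) / det
    return beta

site_a = [(0, 0), (1, 0), (2, 1), (3, 1), (1, 1)]
site_b = [(0, 0), (2, 0), (3, 1), (4, 1), (1, 0), (2, 1)]
distributed = fit_from_sites([site_a, site_b])
pooled = fit_from_sites([site_a + site_b])
print(distributed, pooled)
```

In a real deployment the per-site summaries themselves can leak information, which is the gap the secure protocol is designed to close.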
Statistical approach to the analysis of olive long-term pollen season trends in southern Spain.

PubMed

García-Mozo, H; Yaezel, L; Oteros, J; Galán, C

2014-03-01

Analysis of long-term airborne pollen counts makes it possible not only to chart pollen-season trends but also to track changing patterns in flowering phenology. Changes in higher plant response over a long interval are considered among the most valuable bioindicators of climate change impact. Phenological-trend models can also provide information regarding crop production and pollen-allergen emission. The value of this information makes the choice of statistical analysis for time series study essential. We analysed trends and variations in the olive flowering season over a 30-year period (1982-2011) in southern Europe (Córdoba, Spain), focussing on: annual Pollen Index (PI); Pollen Season Start (PSS), Peak Date (PD), Pollen Season End (PSE) and Pollen Season Duration (PSD). Apart from the traditional Linear Regression analysis, a Seasonal-Trend Decomposition procedure based on Loess (STL) and an ARIMA model were performed. Linear regression results indicated a trend toward delayed PSE and earlier PSS and PD, probably influenced by the rise in temperature. These changes are producing longer flowering periods in the study area. The use of the STL technique provided a clearer picture of phenological behaviour. Data decomposition on pollination dynamics enabled the trend toward an alternate bearing cycle to be distinguished from the influence of other stochastic fluctuations. Results pointed to a rising trend in pollen production. With a view toward forecasting future phenological trends, ARIMA models were constructed to predict PSD, PSS and PI until 2016. Projections displayed a better goodness of fit than those derived from linear regression. Findings suggest that the olive reproductive cycle has changed considerably over the last 30 years due to climate change. 
Further conclusions are that STL improves the effectiveness of traditional linear regression in trend analysis, and ARIMA models can provide reliable trend projections for future years taking into account the internal fluctuations in time series. Copyright © 2013 Elsevier B.V. All rights reserved.
Interrelationship of Cytokines, Hypothalamic-Pituitary-Adrenal Axis Hormones, and Psychosocial Variables in the Prediction of Preterm Birth

PubMed Central

Pearce, B.D.; Grove, J.; Bonney, E.A.; Bliwise, N.; Dudley, D.J.; Schendel, D.E.; Thorsen, P.

2010-01-01

Background/Aims To examine the relationship of biological mediators (cytokines, stress hormones), psychosocial, obstetric history, and demographic factors in the early prediction of preterm birth (PTB) using a comprehensive logistic regression model incorporating diverse risk factors. Methods In this prospective case-control study, maternal serum biomarkers were quantified at 9–23 weeks’ gestation in 60 women delivering at <37 weeks compared to 123 women delivering at term. Biomarker data were combined with maternal sociodemographic factors and stress data into regression models encompassing 22 preterm risk factors and 1st-order interactions. Results Among individual biomarkers, we found that macrophage migration inhibitory factor (MIF), interleukin-10, C-reactive protein (CRP), and tumor necrosis factor-α were statistically significant predictors of PTB at all cutoff levels tested (75th, 85th, and 90th percentiles). We fit multifactor models for PTB prediction at each biomarker cutoff. Our best models revealed that MIF, CRP, risk-taking behavior, and low educational attainment were consistent predictors of PTB at all biomarker cutoffs. The 75th percentile cutoff yielded the best predicting model with an area under the ROC curve of 0.808 (95% CI 0.743–0.874). Conclusion Our comprehensive models highlight the prominence of behavioral risk factors for PTB and point to MIF as a possible psychobiological mediator. PMID:20160447
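The area under the ROC curve reported above has a simple rank interpretation: it is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes it that way. The predicted risks are invented for illustration, and the confidence interval calculation used in the study is not reproduced.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical predicted risks for preterm (case) and term (control) deliveries.
preterm = [0.9, 0.8, 0.4]
term = [0.5, 0.3]
print(auc(preterm, term))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value near 0.8 indicates useful but imperfect discrimination.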
Comparison of Weibull and Lognormal Cure Models with Cox in the Survival Analysis Of Breast Cancer Patients in Rafsanjan.

PubMed

Hoseini, Mina; Bahrampour, Abbas; Mirzaee, Moghaddameh

2017-02-16

Breast cancer is the most common cancer after lung cancer and the second leading cause of death. In this study we compared Weibull and lognormal cure models with Cox regression for the survival of breast cancer patients. This retrospective cohort study was conducted on 140 patients with breast cancer referred to Ali Ibn Abitaleb Hospital, Rafsanjan, southeastern Iran, from 2001 to 2015. We determined and analyzed the factors affecting survival with the different models using STATA14. According to AIC, the lognormal model was more consistent with the data than the Weibull model. The multivariable lognormal model included factors such as smoking, second-hand smoking, drinking herbal tea, and the last breast-feeding period. In addition, the significant factors in Cox regression were disease grade, tumor size, and metastasis (p-value<0.05). As Rafsanjan is surrounded by pistachio orchards where farmers apply pesticides, people of this city are exposed to agricultural pesticides and their harmful consequences. The effect of pesticides on breast cancer was studied, and the results showed that it was not in agreement with the models used in this study. By comparing different methods for survival analysis, researchers can decide how to reach a better conclusion. This comparison indicates that the results of the semi-parametric Cox method are closer to clinical experience and evidence.
Fusing Data Mining, Machine Learning and Traditional Statistics to Detect Biomarkers Associated with Depression

PubMed Central

Dipnall, Joanna F.

2016-01-01

Background Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. Methods The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009–2010). Depression was measured using the Patient Health Questionnaire-9 and 67 biomarkers were analysed. Covariates in this study included gender, age, race, smoking, food security, Poverty Income Ratio, Body Mass Index, physical activity, alcohol use, medical conditions and medications. The final imputed weighted multiple logistic regression model included possible confounders and moderators. Results After the creation of 20 imputation data sets from multiple chained regression sequences, machine learning boosted regression initially identified 21 biomarkers associated with depression. Using traditional logistic regression methods, including controlling for possible confounders and moderators, a final set of three biomarkers were selected. The final three biomarkers from the novel hybrid variable selection methodology were red cell distribution width (OR 1.15; 95% CI 1.01, 1.30), serum glucose (OR 1.01; 95% CI 1.00, 1.01) and total bilirubin (OR 0.12; 95% CI 0.05, 0.28). Significant interactions were found between total bilirubin with Mexican American/Hispanic group (p = 0.016), and current smokers (p<0.001). 
Conclusion The systematic use of a hybrid methodology for variable selection, fusing data mining techniques using a machine learning algorithm with traditional statistical modelling, accounted for missing data and complex survey sampling methodology and was demonstrated to be a useful tool for detecting three biomarkers associated with depression for future hypothesis generation: red cell distribution width, serum glucose and total bilirubin. PMID:26848571
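The odds ratios and confidence intervals quoted above follow the usual logistic regression convention, in which a coefficient and its standard error on the log-odds scale are exponentiated. The sketch below illustrates that conversion only; the function name `odds_ratio_ci` and the example coefficient and standard error are hypothetical, and the study's own pooled, survey-weighted standard errors are not reproduced here.

```python
import math

def odds_ratio_ci(beta, std_err, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    beta and std_err are on the log-odds scale; z=1.96 gives a 95% Wald interval.
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * std_err)
    upper = math.exp(beta + z * std_err)
    return odds_ratio, lower, upper

# Hypothetical coefficient and standard error, chosen only for illustration.
example_or, example_lo, example_hi = odds_ratio_ci(beta=0.14, std_err=0.065)
print(f"OR {example_or:.2f} (95% CI {example_lo:.2f}, {example_hi:.2f})")
```

Because the interval is symmetric on the log scale, it is asymmetric around the odds ratio itself, which is why reported bounds such as (0.05, 0.28) around 0.12 are not equidistant from the point estimate.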
Radiation Dose-Response Model for Locally Advanced Rectal Cancer After Preoperative Chemoradiation Therapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Appelt, Ane L., E-mail: ane.lindegaard.appelt@slb.regionsyddanmark.dk; University of Southern Denmark, Odense; Ploen, John

2013-01-01

Purpose: Preoperative chemoradiation therapy (CRT) is part of the standard treatment of locally advanced rectal cancers. Tumor regression at the time of operation is desirable, but not much is known about the relationship between radiation dose and tumor regression. In the present study we estimated radiation dose-response curves for various grades of tumor regression after preoperative CRT. Methods and Materials: A total of 222 patients, treated with consistent chemotherapy and radiation therapy techniques, were considered for the analysis. Radiation therapy consisted of a combination of external-beam radiation therapy and brachytherapy. Response at the time of operation was evaluated from the histopathologic specimen and graded on a 5-point scale (TRG1-5). The probability of achieving complete, major, and partial response was analyzed by ordinal logistic regression, and the effect of including clinical parameters in the model was examined. The radiation dose-response relationship for a specific grade of histopathologic tumor regression was parameterized in terms of the dose required for 50% response, $D_{50,i}$, and the normalized dose-response gradient, $\gamma_{50,i}$. Results: A highly significant dose-response relationship was found (P=.002). For complete response (TRG1), the dose-response parameters were $D_{50,\mathrm{TRG1}}$ = 92.0 Gy (95% confidence interval [CI] 79.3-144.9 Gy), $\gamma_{50,\mathrm{TRG1}}$ = 0.982 (CI 0.533-1.429), and for major response (TRG1-2) $D_{50,\mathrm{TRG1\&2}}$ = 72.1 Gy (CI 65.3-94.0 Gy), $\gamma_{50,\mathrm{TRG1\&2}}$ = 0.770 (CI 0.338-1.201). Tumor size and N category both had a significant effect on the dose-response relationships. Conclusions: This study demonstrated a significant dose-response relationship for tumor regression after preoperative CRT for locally advanced rectal cancer for tumor dose levels in the range of 50.4-70 Gy, which is higher than the dose range usually considered.
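To make the two dose-response parameters concrete, the sketch below evaluates a logistic curve written in terms of the 50% response dose and the normalized gradient at that dose. The abstract does not state the exact functional form, so the logistic shape is an assumption here (chosen because the analysis used logistic regression), and the function name `response_probability` is hypothetical. The parameter values are the point estimates quoted for complete response.

```python
import math

def response_probability(dose_gy, d50_gy, gamma50):
    """Logistic dose-response curve parameterized by D50 and gamma50.

    Assumed form: P(D) = 1 / (1 + exp(4 * gamma50 * (1 - D / D50))),
    so that P(D50) = 0.5 and D50 * dP/dD at D50 equals gamma50.
    """
    return 1.0 / (1.0 + math.exp(4.0 * gamma50 * (1.0 - dose_gy / d50_gy)))

# Point estimates quoted in the abstract for complete response (TRG1).
D50_TRG1 = 92.0       # Gy
GAMMA50_TRG1 = 0.982  # dimensionless normalized gradient

for dose in (50.4, 60.0, 70.0):
    p = response_probability(dose, D50_TRG1, GAMMA50_TRG1)
    print(f"{dose:5.1f} Gy -> P(complete response) ~ {p:.3f}")
```

The loop covers the 50.4-70 Gy range studied; values outside that range would be extrapolations of the assumed curve rather than results of the study.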
The recovery of bladder epithelial hyperplasia caused by a melamine diet-induced bladder calculus in mice.

PubMed

Sun, Ying; Jiang, Yi-Na; Xu, Chang-Fu; Du, Yun-Xia; Zhang, Jiao-Jiao; Yan, Yang; Gao, Xiao-Li

2014-02-01

Applying a model of bladder epithelial hyperplasia (BEH) caused by melamine-induced bladder calculus (BC), the recovery of BEH after melamine withdrawal was investigated. One experiment, comprising untreated, melamine and recovery groups, was conducted in Balb/c mice. Each group included 4 subgroups. Mice were fed a normal diet in the untreated group or a melamine diet in the other groups. The melamine diet was then replaced with a normal diet in the recovery group. Both BC and BEH were observed after 14 and 56 days of melamine diet. The BC was relatively uniform at a given melamine-diet duration. The BEH was diffuse with many mitotic figures, 4-7 rows of nuclei, and well-defined umbrella/intermediate cells. No marked differences in the degree of BEH were observed between the two melamine-diet durations. At 4-42 days after melamine withdrawal, BC was not found, and progressive regression of BEH, ending in complete regression, was observed, along with well-defined ageing/apoptotic cells in the superficial regions of the regressing BEH tissue. In conclusion, the melamine-induced BEH is relatively uniform, may be self-limiting in rows of nuclei, and can return to normal. Melamine withdrawal duration is critical for BEH regression. Tissue from BEH and its regression is ideal for exploring the renewal as well as growth biology of mammalian urothelium. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
Drug use, mental health and problems related to crime and violence: cross-sectional study

PubMed Central

Claro, Heloísa Garcia; de Oliveira, Márcia Aparecida Ferreira; Bourdreaux, Janet Titus; Fernandes, Ivan Filipe de Almeida Lopes; Pinho, Paula Hayasi; Tarifa, Rosana Ribeiro

2015-01-01

Objective: to investigate the correlation between disorders related to the use of alcohol and other drugs and symptoms of mental disorders, problems related to crime and violence and to age and gender. Methods: cross-sectional descriptive study carried out with 128 users of a Psychosocial Care Center for Alcohol and other Drugs, in the city of São Paulo, interviewed by means of the instrument entitled Global Appraisal of Individual Needs - Short Screener. Univariate and multiple linear regression models were used to verify the correlation between the variables. Results: using univariate regression models, internalizing and externalizing symptoms and problems related to crime/violence proved significant and were included in the multiple model, in which only the internalizing symptoms and problems related to crime and violence remained significant. Conclusions: there is a correlation between the severity of problems related to alcohol use and severity of mental health symptoms and crime and violence in the study sample. The results emphasize the need for an interdisciplinary and intersectional character of attention to users of alcohol and other drugs, since they live in a socially vulnerable environment. PMID:26626010
An Exploratory Analysis of Personality, Attitudes, and Study Skills on the Learning Curve within a Team-based Learning Environment

PubMed Central

Henry, Teague; Campbell, Ashley

2015-01-01

Objective. To examine factors that determine the interindividual variability of learning within a team-based learning environment. Methods. Students in a pharmacokinetics course were given 4 interim, low-stakes cumulative assessments throughout the semester and a cumulative final examination. Students’ Myers-Briggs personality type was assessed, as well as their study skills, motivations, and attitudes towards team-learning. A latent curve model (LCM) was applied and various covariates were assessed to improve the regression model. Results. A quadratic LCM was applied for the first 4 assessments to predict final examination performance. None of the covariates examined significantly impacted the regression model fit except metacognitive self-regulation, which explained some of the variability in the rate of learning. There were some correlations between personality type and attitudes towards team learning, with introverts having a lower opinion of team-learning than extroverts. Conclusion. The LCM could readily describe the learning curve. Extroverted and introverted personality types had the same learning performance even though preference for team-learning was lower in introverts. Other personality traits, study skills, or practice did not significantly contribute to the learning variability in this course. PMID:25861101
Study of cyanotoxins presence from experimental cyanobacteria concentrations using a new data mining methodology based on multivariate adaptive regression splines in Trasona reservoir (Northern Spain).

PubMed

Garcia Nieto, P J; Sánchez Lasheras, F; de Cos Juez, F J; Alonso Fernández, J R

2011-11-15

There is an increasing need to describe cyanobacteria blooms since some cyanobacteria produce toxins, termed cyanotoxins. These can be toxic and dangerous to humans as well as other animals and life in general. Cyanobacteria reproduce explosively under certain conditions. This results in algae blooms, which can become harmful to other species if the cyanobacteria involved produce cyanotoxins. In this research work, the evolution of cyanotoxins in Trasona reservoir (Principality of Asturias, Northern Spain) was studied successfully using a data mining methodology based on the multivariate adaptive regression splines (MARS) technique. The results of the present study are two-fold. On one hand, the importance of the different kinds of cyanobacteria for the presence of cyanotoxins in the reservoir is presented through the MARS model; on the other hand, a predictive model able to forecast the possible presence of cyanotoxins in the short term was obtained. The agreement of the MARS model with experimental data confirmed its good performance. Finally, the conclusions of this innovative research are presented. Copyright © 2011 Elsevier B.V. All rights reserved.
Contrast Enhanced Maximum Intensity Projection Ultrasound Imaging for Assessing Angiogenesis in Murine Glioma and Breast Tumor Models: A Comparative Study

PubMed Central

Forsberg, Flemming; Ro, Raymond J.; Fox, Traci B; Liu, Ji-Bin; Chiou, See-Ying; Potoczek, Magdalena; Goldberg, Barry B

2010-01-01

The purpose of this study was to prospectively compare noninvasive, quantitative measures of vascularity obtained from 4 contrast enhanced ultrasound (US) techniques to 4 invasive immunohistochemical markers of tumor angiogenesis in a large group of murine xenografts. Glioma (C6) or breast cancer (NMU) cells were implanted in 144 rats. The contrast agent Optison (GE Healthcare, Princeton, NJ) was injected in a tail vein (dose: 0.4ml/kg). Power Doppler imaging (PDI), pulse-subtraction harmonic imaging (PSHI), flash-echo imaging (FEI), and Microflow imaging (MFI; a technique creating maximum intensity projection images over time) was performed with an Aplio scanner (Toshiba America Medical Systems, Tustin, CA) and a 7.5 MHz linear array. Fractional tumor neovascularity was calculated from digital clips of contrast US, while the relative area stained was calculated from specimens. Results were compared using a factorial, repeated measures ANOVA, linear regression and z-tests. The tortuous morphology of tumor neovessels was visualized better with MFI than with the other US modes. Cell line, implantation method and contrast US imaging technique were significant parameters in the ANOVA model (p<0.05). The strongest correlation determined by linear regression in the C6 model was between PSHI and percent area stained with CD31 (r=0.37, p<0.0001). In the NMU model the strongest correlation was between FEI and COX-2 (r=0.46, p<0.0001). There were no statistically significant differences between correlations obtained with the various US methods (p>0.05). In conclusion, the largest study of contrast US of murine xenografts to date has been conducted and quantitative contrast enhanced US measures of tumor neovascularity in glioma and breast cancer xenograft models appear to provide a noninvasive marker for angiogenesis; although the best method for monitoring angiogenesis was not conclusively established. PMID:21144542
Brain necrosis after fractionated radiation therapy: Is the halftime for repair longer than we thought?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bender, Edward T.

Purpose: To derive a radiobiological model that enables the estimation of brain necrosis and spinal cord myelopathy rates for a variety of fractionation schemes, and to compare repair effects between brain and spinal cord. Methods: Sigmoidal dose response relationships for brain radiation necrosis and spinal cord myelopathy are derived from clinical data using nonlinear regression. Three different repair models are considered and the repair halftimes are included as regression parameters. Results: For radiation necrosis, a repair halftime of 38.1 (range 6.9-76) h is found with monoexponential repair, while for spinal cord myelopathy, a repair halftime of 4.1 (range 0-8) h is found. The best-fit alpha beta ratio is 0.96 (range 0.24-1.73). Conclusions: A radiobiological model that includes repair corrections can describe the clinical data for a variety of fraction sizes, fractionation schedules, and total doses. Modeling suggests a relatively long repair halftime for brain necrosis. This study suggests that the repair halftime for late radiation effects in the brain may be longer than is currently thought. If confirmed in future studies, this may lead to a re-evaluation of radiation fractionation schedules for some CNS diseases, particularly for those diseases where fractionated stereotactic radiation therapy is used.
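A rough way to see what the two quoted halftimes imply is to compute how much sublethal damage would remain unrepaired between fractions under monoexponential repair. The sketch below assumes the simplest form, in which the unrepaired fraction decays as 2^(-t/T½); it is not the full model fitted in the study, and the function name `unrepaired_fraction` and the 24 h interval are illustrative assumptions.

```python
import math

def unrepaired_fraction(interval_h, halftime_h):
    """Fraction of damage still unrepaired after interval_h hours.

    Assumes monoexponential repair: exp(-ln(2) * t / T_half).
    halftime_h must be positive.
    """
    return math.exp(-math.log(2.0) * interval_h / halftime_h)

# Best-fit halftimes quoted in the abstract (hours).
HALFTIME_BRAIN_H = 38.1
HALFTIME_CORD_H = 4.1

interval = 24.0  # hypothetical once-daily fractionation interval
for label, halftime in (("brain", HALFTIME_BRAIN_H), ("spinal cord", HALFTIME_CORD_H)):
    frac = unrepaired_fraction(interval, halftime)
    print(f"{label}: {frac:.3f} of damage unrepaired after {interval:.0f} h")
```

Under this assumed form, a halftime longer than the interfraction interval leaves most of the damage unrepaired at the next fraction, which is the qualitative reason a long brain halftime would matter for fractionation schedules.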
Nomogram Prediction of Overall Survival After Curative Irradiation for Uterine Cervical Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Seo, YoungSeok; Yoo, Seong Yul; Kim, Mi-Sook

Purpose: The purpose of this study was to develop a nomogram capable of predicting the probability of 5-year survival after radical radiotherapy (RT) without chemotherapy for uterine cervical cancer. Methods and Materials: We retrospectively analyzed 549 patients that underwent radical RT for uterine cervical cancer between March 1994 and April 2002 at our institution. Multivariate analysis using Cox proportional hazards regression was performed and this Cox model was used as the basis for the devised nomogram. The model was internally validated for discrimination and calibration by bootstrap resampling. Results: By multivariate regression analysis, the model showed that age, hemoglobin level before RT, Federation Internationale de Gynecologie Obstetrique (FIGO) stage, maximal tumor diameter, lymph node status, and RT dose at Point A significantly predicted overall survival. The survival prediction model demonstrated good calibration and discrimination. The bootstrap-corrected concordance index was 0.67. The predictive ability of the nomogram proved to be superior to FIGO stage (p = 0.01). Conclusions: The devised nomogram offers a significantly better level of discrimination than the FIGO staging system. In particular, it improves predictions of survival probability and could be useful for counseling patients, choosing treatment modalities and schedules, and designing clinical trials. However, before this nomogram is used clinically, it should be externally validated.
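The concordance index reported above measures how often a model ranks patients correctly by risk. The sketch below is a minimal, uncorrected version of Harrell's C for right-censored data, written only to show what the statistic counts; it does not include the bootstrap optimism correction used in the study, it ignores tied survival times, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data (simplified sketch).

    A pair (i, j) is comparable when subject i had an observed event and a
    shorter follow-up time than subject j. The pair is concordant when the
    model assigns i the higher risk score; tied scores count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: follow-up in months, event indicator (1 = death), model risk score.
toy_times = [12, 30, 45, 60]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.67 in context.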
To Control False Positives in Gene-Gene Interaction Analysis: Two Novel Conditional Entropy-Based Approaches

PubMed Central

Lin, Meihua; Li, Haoli; Zhao, Xiaolei; Qin, Jiheng

2013-01-01

Genome-wide analysis of gene-gene interactions has been recognized as a powerful avenue to identify the missing genetic components that cannot be detected by using current single-point association analysis. Recently, several model-free methods (e.g. the commonly used information based metrics and several logistic regression-based metrics) were developed for detecting non-linear dependence between genetic loci, but they are potentially at risk of inflated false positive error, in particular when the main effects at one or both loci are salient. In this study, we proposed two conditional entropy-based metrics to address this limitation. Extensive simulations demonstrated that the two proposed metrics, provided the disease is rare, could maintain a consistently correct false positive rate. In the scenarios for a common disease, our proposed metrics achieved better or comparable control of false positive error, compared to four previously proposed model-free metrics. In terms of power, our methods outperformed several competing metrics in a range of common disease models. Furthermore, in real data analyses, both metrics succeeded in detecting interactions and were competitive with the originally reported results or the logistic regression approaches. In conclusion, the proposed conditional entropy-based metrics are promising as alternatives to current model-based approaches for detecting genuine epistatic effects. PMID:24339984
Relationship between FEV1 and Cardiovascular Risk Factors in General Population without Airflow Limitation.

PubMed

Lee, Jeong Hyeon; Kang, Yun-Seong; Jeong, Yun-Jeong; Yoon, Young-Soon; Kwack, Won Gun; Oh, Jin Young

2016-01-01

Purpose. We aimed to determine the value of lung function measurement for predicting cardiovascular (CV) disease by evaluating the association between FEV1 (%) and CV risk factors in the general population. Materials and Methods. This was a cross-sectional, retrospective study of subjects above 18 years of age who underwent health examinations. The relationship between FEV1 (%) and presence of carotid plaque and thickened carotid IMT (≥0.8 mm) was analyzed by multiple logistic regression, and the relationship between FEV1 (%) and PWV (%) and serum uric acid was analyzed by multiple linear regression. Various factors were adjusted by using Model 1 and Model 2. Results. 1,003 subjects were enrolled in this study and 96.7% (n = 970) of the subjects were men. In both models, the odds ratios for the presence of carotid plaque and thickened carotid IMT showed no consistent trend or statistical significance. In the analysis of PWV (%) and uric acid, there was no significant relationship with FEV1 (%) in either model. Conclusion. FEV1 had no significant relationship with CV risk factors. The result suggests that FEV1 may have no association with CV risk factors or may be insensitive for detecting the association in the general population without airflow limitation.
High-risk regions and outbreak modelling of tularemia in humans.

PubMed

Desvars-Larrive, A; Liu, X; Hjertqvist, M; Sjöstedt, A; Johansson, A; Rydén, P

2017-02-01

Sweden reports large and variable numbers of human tularemia cases, but the high-risk regions are anecdotally defined and factors explaining annual variations are poorly understood. Here, high-risk regions were identified by spatial cluster analysis on disease surveillance data for 1984-2012. Negative binomial regression with five previously validated predictors (including predicted mosquito abundance and predictors based on local weather data) was used to model the annual number of tularemia cases within the high-risk regions. Seven high-risk regions were identified with annual incidences of 3·8-44 cases/100 000 inhabitants, accounting for 56·4% of the tularemia cases but only 9·3% of Sweden's population. For all high-risk regions, most cases occurred between July and September. The regression models explained the annual variation of tularemia cases within most high-risk regions and discriminated between years with and without outbreaks. In conclusion, tularemia in Sweden is concentrated in a few high-risk regions and shows high annual and seasonal variations. We present reproducible methods for identifying tularemia high-risk regions and modelling tularemia cases within these regions. The results may help health authorities to target populations at risk and lay the foundation for developing an early warning system for outbreaks.

Regression modeling of ground-water flow

USGS Publications Warehouse

Cooley, R.L.; Naff, R.L.

1985-01-01

Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Using decision trees to understand structure in missing data

PubMed Central

Tierney, Nicholas J; Harden, Fiona A; Harden, Maurice J; Mengersen, Kerrie L

2015-01-01

Objectives Demonstrate the application of decision trees—classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs)—to understand structure in missing data. Setting Data taken from employees at 3 different industrial sites in Australia. Participants 7915 observations were included. Materials and methods The approach was evaluated using an occupational health data set comprising results of questionnaires, medical tests and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits, and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured as compared to structured missingness. Discussion Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusions Researchers are encouraged to use CART and BRT models to explore and understand missing data. PMID:26124509
Validation of Statistical Predictive Models Meant to Select Melanoma Patients for Sentinel Lymph Node Biopsy

PubMed Central

Sabel, Michael S.; Rice, John D.; Griffith, Kent A.; Lowe, Lori; Wong, Sandra L.; Chang, Alfred E.; Johnson, Timothy M.; Taylor, Jeremy M.G.

2013-01-01

Introduction To identify melanoma patients at sufficiently low risk of nodal metastases who could avoid SLN biopsy (SLNB). Several statistical models have been proposed based upon patient/tumor characteristics, including logistic regression, classification trees, random forests and support vector machines. We sought to validate recently published models meant to predict sentinel node status. Methods We queried our comprehensive, prospectively-collected melanoma database for consecutive melanoma patients undergoing SLNB. Prediction values were estimated based upon 4 published models, calculating the same reported metrics: negative predictive value (NPV), rate of negative predictions (RNP), and false negative rate (FNR). Results Logistic regression performed comparably with our data when considering NPV (89.4% vs. 93.6%); however the model’s specificity was not high enough to significantly reduce the rate of biopsies (SLN reduction rate of 2.9%). When applied to our data, the classification tree produced NPV and reduction in biopsies rates that were lower 87.7% vs. 94.1% and 29.8% vs. 14.3%, respectively. Two published models could not be applied to our data due to model complexity and the use of proprietary software. Conclusions Published models meant to reduce the SLNB rate among patients with melanoma either underperformed when applied to our larger dataset, or could not be validated. Differences in selection criteria and histopathologic interpretation likely resulted in underperformance. Development of statistical predictive models must be created in a clinically applicable manner to allow for both validation and ultimately clinical utility. PMID:21822550
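The three validation metrics named above can all be read off a 2x2 table of predicted versus observed sentinel node status. The sketch below uses textbook definitions, which are an assumption here since the abstract does not spell out its formulas; the function name `screening_metrics` and the counts are hypothetical and are not data from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute NPV, rate of negative predictions, and false negative rate.

    tp/fp/tn/fn are counts from a 2x2 table where "positive" means the model
    predicts (or the biopsy finds) nodal metastasis.
    """
    total = tp + fp + tn + fn
    predicted_negative = tn + fn
    return {
        # Share of predicted-negative patients who are truly node-negative.
        "npv": tn / predicted_negative,
        # Share of all patients the model would spare from biopsy.
        "rnp": predicted_negative / total,
        # Share of truly node-positive patients the model misses.
        "fnr": fn / (fn + tp),
    }

# Hypothetical counts for illustration only.
metrics = screening_metrics(tp=80, fp=300, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Read together, the metrics show the trade-off the abstract describes: a model can reach a high NPV while still predicting "negative" so rarely that few biopsies are avoided.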
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws

USGS Publications Warehouse

Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L.

2011-01-01

Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain. © 2011 by the Ecological Society of America.
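The LR approach discussed above amounts to ordinary least squares on log-transformed data. The sketch below shows that step in plain Python for a power law y = a * x^b; it is not the authors' published code, the function name `fit_power_law_loglog` is hypothetical, and the NLR alternative (least squares on the original scale) is not shown.

```python
import math

def fit_power_law_loglog(xs, ys):
    """Fit y = a * x**b by OLS on (log x, log y); returns (a, b).

    All xs and ys must be positive. This implicitly assumes multiplicative,
    lognormal error on the original scale.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                         # slope on the log-log scale is the exponent
    a = math.exp(mean_y - b * mean_x)     # back-transformed intercept
    return a, b

# Noise-free synthetic data from y = 2 * x**1.5, used only to check the fit.
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [2.0 * x ** 1.5 for x in xs]
a_hat, b_hat = fit_power_law_loglog(xs, ys)
print(a_hat, b_hat)
```

With noisy data the two approaches generally give different estimates, because minimizing squared error on the log scale weights relative deviations while minimizing it on the original scale weights absolute deviations.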
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

ERIC Educational Resources Information Center

Haberman, Shelby J.; Sinharay, Sandip

2010-01-01

Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
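For readers unfamiliar with the cumulative logit model mentioned above, the sketch below shows how it turns a single linear predictor into a probability for each ordered score category. It assumes the common proportional-odds form P(Y <= k) = logistic(theta_k - eta); the function names, thresholds, and predictor value are hypothetical and are not taken from the article.

```python
import math

def _logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(linear_predictor, thresholds):
    """Return P(Y = k) for k = 0..K under a cumulative logit model.

    thresholds must be strictly increasing (theta_0 < theta_1 < ...);
    K + 1 categories result from K thresholds.
    """
    cumulative = [_logistic(t - linear_predictor) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probabilities

# Hypothetical thresholds for a 4-point essay scale and one essay's predictor value.
probs = category_probabilities(linear_predictor=0.8, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

Unlike a linear regression score, the output is a proper probability distribution over the discrete score points, so predictions never fall outside the score scale.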
A model of the evaporation of binary-fuel clusters of drops

NASA Technical Reports Server (NTRS)

Harstad, K.; Bellan, J.

1991-01-01

A formulation has been developed to describe the evaporation of dense or dilute clusters of binary-fuel drops. The binary fuel is assumed to be made of a solute and a solvent whose volatility is much lower than that of the solute. Convective flow effects, inducing a circulatory motion inside the drops, are taken into account, as well as turbulence external to the cluster volume. Results obtained with this model show that, similar to the conclusions for single isolated drops, the evaporation of the volatile is controlled by liquid mass diffusion when the cluster is dilute. In contrast, when the cluster is dense, the evaporation of the volatile is controlled by surface layer stripping, that is, by the regression rate of the drop, which is in fact controlled by the evaporation rate of the solvent. These conclusions are in agreement with existing experimental observations. Parametric studies show that these conclusions remain valid with changes in ambient temperature, initial slip velocity between drops and gas, initial drop size, initial cluster size, initial liquid mass fraction of the solute, and various combinations of solvent and solute. The implications of these results for computationally intensive combustor calculations are discussed.
Learning-based computing techniques in geoid modeling for precise height transformation

NASA Astrophysics Data System (ADS)

Erol, B.; Erol, S.

2013-03-01

Precise determination of the local geoid is of particular importance for establishing height control in geodetic GNSS applications, since the classical leveling technique is too laborious. A geoid model can be accurately obtained by applying an appropriate computing algorithm to properly distributed benchmarks having GNSS and leveling observations. Besides the classical multivariable polynomial regression equations (MPRE), this study attempts an evaluation of learning-based computing algorithms: artificial neural networks (ANNs), adaptive network-based fuzzy inference system (ANFIS) and especially the wavelet neural networks (WNNs) approach in geoid surface approximation. These algorithms were developed in parallel with advances in computer technologies and recently have been used for solving complex nonlinear problems in many applications. However, they are rather new in dealing with the precise modeling problem of the Earth's gravity field. In the scope of the study, these methods were applied to Istanbul GPS Triangulation Network data. The performances of the methods were assessed considering the validation results of the geoid models at the observation points. In conclusion, the ANFIS and WNN revealed higher prediction accuracies compared to the ANN and MPRE methods. Besides their prediction capabilities, these methods were also compared and discussed from the practical point of view in the conclusions.
Impact of External Price Referencing on Medicine Prices – A Price Comparison Among 14 European Countries

PubMed Central

Leopold, Christine; Mantel-Teeuwisse, Aukje Katja; Seyfang, Leonhard; Vogler, Sabine; de Joncheere, Kees; Laing, Richard Ogilvie; Leufkens, Hubert

2012-01-01

Objectives: This study aims to examine the impact of external price referencing (EPR) on on-patent medicine prices, adjusting for other factors that may affect price levels such as sales volume, exchange rates, gross domestic product (GDP) per capita, total pharmaceutical expenditure (TPE), and size of the pharmaceutical industry. Methods: Price data of 14 on-patent products, in 14 European countries in 2007 and 2008 were obtained from the Pharmaceutical Price Information Service of the Austrian Health Institute. Based on the unit ex-factory prices in EURO, scaled ranks per country and per product were calculated. For the regression analysis the scaled ranks per country and product were weighted; each country had the same sum of weights but within a country the weights were proportional to its sales volume in the year (data obtained from IMS Health). Taking the scaled ranks, several statistical analyses were performed by using the program “R”, including a multiple regression analysis (including variables such as GDP per capita and national industry size). Results: This study showed that on average EPR as a pricing policy leads to lower prices. However, the large variation in price levels among countries using EPR confirmed that the price level is not only driven by EPR. The unadjusted linear regression model confirms that applying EPR in a country is associated with a lower scaled weighted rank (p=0.002). This interaction persisted after inclusion of total pharmaceutical expenditure per capita and GDP per capita in the final model. Conclusions: The study showed that for patented products, prices are in general lower in case the country applied EPR. Nevertheless substantial price differences among countries that apply EPR could be identified. Possible explanations could be found through a correlation between pharmaceutical industry and the scaled price ranks. In conclusion, we found that implementing external reference pricing could lead to lower prices. 
PMID:23532710
The Calibration of AVHRR/3 Visible Dual Gain Using Meteosat-8 as a MODIS Calibration Transfer Medium

NASA Technical Reports Server (NTRS)

Avey, Lance; Garber, Donald; Nguyen, Louis; Minnis, Patrick

2007-01-01

This viewgraph presentation reviews the NOAA-17 AVHRR visible channels calibrated against MET-8/MODIS using dual gain regression methods. The topics include: 1) Motivation; 2) Methodology; 3) Dual Gain Regression Methods; 4) Examples of Regression methods; 5) AVHRR/3 Regression Strategy; 6) Cross-Calibration Method; 7) Spectral Response Functions; 8) MET8/NOAA-17; 9) Example of gain ratio adjustment; 10) Effect of mixed low/high count FOV; 11) Monitor dual gains over time; and 12) Conclusions.
The relationship between quality of work life and turnover intention of primary health care nurses in Saudi Arabia

PubMed Central

2012-01-01

Background Quality of work life (QWL) has been found to influence the commitment of health professionals, including nurses. However, reliable information on QWL and turnover intention of primary health care (PHC) nurses is limited. The aim of this study was to examine the relationship between QWL and turnover intention of PHC nurses in Saudi Arabia. Methods A cross-sectional survey was used in this study. Data were collected using Brooks’ survey of Quality of Nursing Work Life, the Anticipated Turnover Scale and demographic data questions. A total of 508 PHC nurses in the Jazan Region, Saudi Arabia, completed the questionnaire (RR = 87%). Descriptive statistics, t-test, ANOVA, General Linear Model (GLM) univariate analysis, standard multiple regression, and hierarchical multiple regression were applied for analysis using SPSS v17 for Windows. Results Findings suggested that the respondents were dissatisfied with their work life, with almost 40% indicating a turnover intention from their current PHC centres. Turnover intention was significantly related to QWL. Using standard multiple regression, 26% of the variance in turnover intention was explained by QWL, p < 0.001, with R2 = .263. Further analysis using hierarchical multiple regression found that the total variance explained by the model as a whole (demographics and QWL) was 32.1%, p < 0.001. QWL explained an additional 19% of the variance in turnover intention, after controlling for demographic variables. Conclusions Creating and maintaining a healthy work life for PHC nurses is very important to improve their work satisfaction, reduce turnover, enhance productivity and improve nursing care outcomes. PMID:22970764
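The hierarchical multiple regression reported in the abstract above amounts to comparing R2 between two nested models: one with the control block only, and one with the control block plus the predictor of interest. The sketch below illustrates that comparison on synthetic data. The variable names, effect sizes, and sample are assumptions made for illustration and do not reproduce the study's data or its SPSS output.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

# Synthetic data: every name and coefficient below is hypothetical.
rng = np.random.default_rng(0)
n = 500
demographics = rng.normal(size=(n, 3))   # step 1: control block
qwl = rng.normal(size=(n, 1))            # step 2: predictor of interest
turnover = 0.3 * demographics[:, 0] - 0.6 * qwl[:, 0] + rng.normal(size=n)

r2_step1 = r_squared(demographics, turnover)
r2_step2 = r_squared(np.column_stack([demographics, qwl]), turnover)
delta_r2 = r2_step2 - r2_step1           # variance added by the second block

print(f"R2 step 1 = {r2_step1:.3f}, R2 step 2 = {r2_step2:.3f}, "
      f"delta R2 = {delta_r2:.3f}")
```

The quantity `delta_r2` plays the role of the "additional variance explained after controlling for demographic variables" in the abstract.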
Moderation analysis using a two-level regression model.

PubMed

Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

2014-10-01

Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
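As a point of reference for the moderated multiple regression (MMR) baseline mentioned above, the sketch below fits a model with a product term by least squares. It is not the two-level NML estimator proposed in the paper, and it is not the authors' R package. The data are synthetic and the coefficient values are assumptions.

```python
import numpy as np

# Synthetic data: x is a predictor, z a binary moderator (both hypothetical).
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
z = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.5 * x + 0.2 * z + 0.8 * x * z + rng.normal(scale=0.5, size=n)

# MMR design matrix: intercept, x, z, and the product term x * z.
X = np.column_stack([np.ones(n), x, z, x * z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

slope_when_z0 = beta[1]              # slope of y on x when z = 0
slope_when_z1 = beta[1] + beta[3]    # slope of y on x when z = 1

print("coefficients:", np.round(beta, 3))
print("slope at z=0:", round(slope_when_z0, 3),
      "slope at z=1:", round(slope_when_z1, 3))
```

The coefficient on the product term (`beta[3]`) is the moderation effect: it is the change in the slope of y on x per unit change in z.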
The microcomputer scientific software series 2: general linear model--regression.

Treesearch

Harold M. Rauscher

1983-01-01

The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
KRAS polymorphisms are associated with survival of CRC in Chinese population.

PubMed

Dai, Qiong; Wei, Hui Lian; Huang, Juan; Zhou, Tie Jun; Chai, Li; Yang, Zhi-Hui

2016-04-01

rs12245, rs12587, rs9266, rs1137282, rs61764370, and rs712 of the KRAS oncogene are characterized in the 3'UTR. This study highlights the important role these polymorphisms play in the susceptibility, oxaliplatin-based chemotherapy sensitivity, progression, and prognosis of CRC. An improved multiplex ligation detection reaction (iMLDR) technique was used for genotyping. An unconditional logistic regression model was used to estimate the association between each polymorphism and CRC risk. The Kaplan-Meier method, log-rank test, and Cox regression model were used to evaluate the effects of the polymorphisms on survival. Results demonstrated that the TT genotype and T allele of rs712 were associated with an increased risk of CRC; patients with the GG genotype and G allele of rs61764370 had shorter survival and a higher risk of relapse or metastasis of CRC. Our study supports the conclusion that the rs61764370 and rs712 polymorphisms of KRAS are functional and may play an important role in the development of CRC, in oxaliplatin-based chemotherapy efficiency, and in the prognosis of CRC.
Developing and Testing a Model to Predict Outcomes of Organizational Change

PubMed Central

Gustafson, David H; Sainfort, François; Eichler, Mary; Adams, Laura; Bisognano, Maureen; Steudel, Harold

2003-01-01

Objective To test the effectiveness of a Bayesian model employing subjective probability estimates for predicting success and failure of health care improvement projects. Data Sources Experts' subjective assessment data for model development and independent retrospective data on 221 healthcare improvement projects in the United States, Canada, and the Netherlands collected between 1996 and 2000 for validation. Methods A panel of theoretical and practical experts and literature in organizational change were used to identify factors predicting the outcome of improvement efforts. A Bayesian model was developed to estimate probability of successful change using subjective estimates of likelihood ratios and prior odds elicited from the panel of experts. A subsequent retrospective empirical analysis of change efforts in 198 health care organizations was performed to validate the model. Logistic regression and ROC analysis were used to evaluate the model's performance using three alternative definitions of success. Data Collection For the model development, experts' subjective assessments were elicited using an integrative group process. For the validation study, a staff person intimately involved in each improvement project responded to a written survey asking questions about model factors and project outcomes. Results Logistic regression chi-square statistics and areas under the ROC curve demonstrated a high level of model performance in predicting success. Chi-square statistics were significant at the 0.001 level and areas under the ROC curve were greater than 0.84. Conclusions A subjective Bayesian model was effective in predicting the outcome of actual improvement projects. Additional prospective evaluations as well as testing the impact of this model as an intervention are warranted. PMID:12785571
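The abstract above describes a Bayesian model built from prior odds and likelihood ratios. A common way to combine such quantities is the odds form of Bayes' theorem, sketched below. The sketch assumes that the factors are conditionally independent, so their likelihood ratios multiply. That assumption, the function name, and the numbers are illustrative and are not taken from the paper.

```python
from math import prod

def posterior_probability(prior_odds, likelihood_ratios):
    """Odds form of Bayes' theorem.

    posterior odds = prior odds * product of likelihood ratios,
    assuming the factors are conditionally independent (an assumption
    of this sketch). The odds are then converted to a probability.
    """
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical example: even prior odds and three elicited likelihood ratios.
p_success = posterior_probability(1.0, [2.0, 0.5, 3.0])
print(p_success)
```

A likelihood ratio above 1 moves the estimate toward success, and one below 1 moves it toward failure.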
Climate variations and salmonellosis transmission in Adelaide, South Australia: a comparison between regression models

NASA Astrophysics Data System (ADS)

Zhang, Ying; Bi, Peng; Hiller, Janet

2008-01-01

This is the first study to identify appropriate regression models for the association between climate variation and salmonellosis transmission. A comparison between different regression models was conducted using surveillance data in Adelaide, South Australia. By using notified salmonellosis cases and climatic variables from the Adelaide metropolitan area over the period 1990-2003, four regression methods were examined: standard Poisson regression, autoregressive adjusted Poisson regression, multiple linear regression, and a seasonal autoregressive integrated moving average (SARIMA) model. Notified salmonellosis cases in 2004 were used to test the forecasting ability of the four models. Parameter estimation, goodness-of-fit and forecasting ability of the four regression models were compared. Temperatures occurring 2 weeks prior to cases were positively associated with cases of salmonellosis. Rainfall was also inversely related to the number of cases. The comparison of the goodness-of-fit and forecasting ability suggest that the SARIMA model is better than the other three regression models. Temperature and rainfall may be used as climatic predictors of salmonellosis cases in regions with climatic characteristics similar to those of Adelaide. The SARIMA model could, thus, be adopted to quantify the relationship between climate variations and salmonellosis transmission.
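Of the four methods compared above, standard Poisson regression with a lagged climate predictor is the simplest to sketch. The example below fits a log-link Poisson model by Newton's method (iteratively reweighted least squares) on synthetic weekly data. The 2-week lag mirrors the abstract, but the data, coefficients, and function name are assumptions. Autocorrelation, which the study's other models address, is ignored here.

```python
import numpy as np

def fit_poisson(X, y, n_iter=25):
    """Poisson regression with log link, fitted by Newton's method (IRLS).

    Assumes the first column of X is the intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())                 # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                  # fitted means
        grad = X.T @ (y - mu)                  # score vector
        hess = X.T @ (X * mu[:, None])         # Fisher information X' diag(mu) X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Synthetic weekly series (hypothetical): seasonal temperature and case counts.
rng = np.random.default_rng(2)
weeks = 400
temp = 20 + 8 * np.sin(2 * np.pi * np.arange(weeks) / 52) + rng.normal(size=weeks)

lag = 2
temp_lag = temp[:-lag]                         # temperature 2 weeks before week t
temp_c = temp_lag - temp_lag.mean()            # centered for numerical stability
cases = rng.poisson(np.exp(1.0 + 0.05 * temp_c))   # counts for weeks lag..end

X = np.column_stack([np.ones(weeks - lag), temp_c])
beta = fit_poisson(X, cases)
rate_ratio = np.exp(beta[1])                   # multiplicative change per degree

print("coefficients:", np.round(beta, 4), "rate ratio per degree:", round(rate_ratio, 4))
```

A rate ratio above 1 corresponds to the positive association between lagged temperature and case counts described in the abstract.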
Analyzing musculoskeletal neck pain, measured as present pain and periods of pain, with three different regression models: a cohort study

PubMed Central

Grimby-Ekman, Anna; Andersson, Eva M; Hagberg, Mats

2009-01-01

Background In the literature there are discussions on the choice of outcome and the need for more longitudinal studies of musculoskeletal disorders. The general aim of this longitudinal study was to analyze musculoskeletal neck pain, in a group of young adults. Specific aims were to determine whether psychosocial factors, computer use, high work/study demands, and lifestyle are long-term or short-term factors for musculoskeletal neck pain, and whether these factors are important for developing or ongoing musculoskeletal neck pain. Methods Three regression models were used to analyze the different outcomes. Pain at present was analyzed with a marginal logistic model, for number of years with pain a Poisson regression model was used and for developing and ongoing pain a logistic model was used. Presented results are odds ratios and proportion ratios (logistic models) and rate ratios (Poisson model). The material consisted of web-based questionnaires answered by 1204 Swedish university students from a prospective cohort recruited in 2002. Results Perceived stress was a risk factor for pain at present (PR = 1.6), for developing pain (PR = 1.7) and for number of years with pain (RR = 1.3). High work/study demands was associated with pain at present (PR = 1.6); and with number of years with pain when the demands negatively affect home life (RR = 1.3). Computer use pattern (number of times/week with a computer session ≥ 4 h, without break) was a risk factor for developing pain (PR = 1.7), but also associated with pain at present (PR = 1.4) and number of years with pain (RR = 1.2). Among life style factors smoking (PR = 1.8) was found to be associated to pain at present. The difference between men and women in prevalence of musculoskeletal pain was confirmed in this study. It was smallest for the outcome ongoing pain (PR = 1.4) compared to pain at present (PR = 2.4) and developing pain (PR = 2.5). 
Conclusion By using different regression models different aspects of neck pain pattern could be addressed and the risk factors impact on pain pattern was identified. Short-term risk factors were perceived stress, high work/study demands and computer use pattern (break pattern). Those were also long-term risk factors. For developing pain perceived stress and computer use pattern were risk factors. PMID:19545386
A multilateral modelling of Youth Soccer Performance Index (YSPI)

NASA Astrophysics Data System (ADS)

Bisyri Husin Musawi Maliki, Ahmad; Razali Abdullah, Mohamad; Juahir, Hafizan; Abdullah, Farhana; Ain Shahirah Abdullah, Nurul; Muazu Musa, Rabiu; Musliha Mat-Rasid, Siti; Adnan, Aleesha; Azura Kosni, Norlaila; Muhamad, Wan Siti Amalina Wan; Afiqah Mohamad Nasir, Nur

2018-04-01

This study aims to identify the most dominant factors influencing the performance of soccer players and to predict group performance for soccer players. A total of 184 youth soccer players from Malaysia sport school and six soccer academies were the respondents of the study. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were computed to identify the most dominant factors while reducing the initial 26 parameters, using a recommended factor loading of >0.5. Soccer performance was then predicted with a regression model. CFA revealed that sit and reach, vertical jump, VO2max, age, weight, height, sitting height, calf circumference (cc), medial upper arm circumference (muac), maturation, bicep, triceps, subscapular, suprailiac, and 5M, 10M, and 20M speed were the most dominant factors. Further index analysis formed the Youth Soccer Performance Index (YSPI), which categorizes players into three groups, namely high, moderate, and low. The regression model was significant at p < 0.001 with R2 = 0.8222, indicating that the model contributed about 82% of the ability to predict the whole set of variables. The parameters that contribute significantly to the prediction of YSPI are discussed. In conclusion, prediction models that integrate multilateral factors offer precision in predicting potential soccer players and, hopefully, can help create competitive soccer games.
Applicability of the polynomial chaos expansion method for personalization of a cardiovascular pulse wave propagation model.

PubMed

Huberts, W; Donders, W P; Delhaas, T; van de Vosse, F N

2014-12-01

Patient-specific modeling requires model personalization, which can be achieved in an efficient manner by parameter fixing and parameter prioritization. An efficient variance-based method is using generalized polynomial chaos expansion (gPCE), but it has not been applied in the context of model personalization, nor has it ever been compared with standard variance-based methods for models with many parameters. In this work, we apply the gPCE method to a previously reported pulse wave propagation model and compare the conclusions for model personalization with that of a reference analysis performed with Saltelli's efficient Monte Carlo method. We furthermore differentiate two approaches for obtaining the expansion coefficients: one based on spectral projection (gPCE-P) and one based on least squares regression (gPCE-R). It was found that in general the gPCE yields similar conclusions as the reference analysis but at much lower cost, as long as the polynomial metamodel does not contain unnecessary high order terms. Furthermore, the gPCE-R approach generally yielded better results than gPCE-P. The weak performance of the gPCE-P can be attributed to the assessment of the expansion coefficients using the Smolyak algorithm, which might be hampered by the high number of model parameters and/or by possible non-smoothness in the output space. Copyright © 2014 John Wiley & Sons, Ltd.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].

PubMed

Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L

2017-03-10

To evaluate the estimation of the prevalence ratio (PR) using a Bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence in relation to caregivers' recognition of risk signs of diarrhea in their infants, using a Bayesian log-binomial regression model in OpenBUGS software. The results showed that caregivers' recognition of their infant's risk signs of diarrhea was significantly associated with a 13% increase in medical care-seeking. We also compared the point and interval estimates of this PR, and the convergence of three models (model 1: not adjusting for covariates; model 2: adjusting for duration of caregivers' education; model 3: adjusting for distance between village and township and child month-age, in addition to model 2), between the Bayesian and the conventional log-binomial regression models. All three Bayesian log-binomial regression models converged, with estimated PRs of 1.130 (95% CI: 1.005-1.265), 1.128 (95% CI: 1.001-1.264) and 1.132 (95% CI: 1.004-1.267), respectively. Conventional log-binomial regression models 1 and 2 converged, with PRs of 1.130 (95% CI: 1.055-1.206) and 1.126 (95% CI: 1.051-1.203), respectively, but model 3 failed to converge, so the COPY method was used to estimate the PR, which was 1.125 (95% CI: 1.051-1.200). The point and interval estimates of the PRs from the three Bayesian log-binomial regression models differed slightly from those from the conventional log-binomial regression models, but the two approaches were highly consistent in estimating the PR. Therefore, the Bayesian log-binomial regression model can estimate the PR effectively with fewer convergence failures, and has advantages in application over the conventional log-binomial regression model.
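For orientation, an unadjusted prevalence ratio and its Wald-type confidence interval on the log scale can be computed directly from a two-by-two table, as sketched below. This is the crude estimator only. It is neither the Bayesian log-binomial model nor the COPY method discussed above, and the counts are hypothetical values chosen to give a PR of 1.13.

```python
from math import exp, log, sqrt

def prevalence_ratio(a, n1, c, n0, z=1.96):
    """Crude prevalence ratio with a Wald confidence interval on the log scale.

    a / n1: cases / total in the exposed group
    c / n0: cases / total in the unexposed group
    """
    pr = (a / n1) / (c / n0)
    se_log_pr = sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = exp(log(pr) - z * se_log_pr)
    upper = exp(log(pr) + z * se_log_pr)
    return pr, lower, upper

# Hypothetical counts, not taken from the study.
pr, lower, upper = prevalence_ratio(a=113, n1=200, c=100, n0=200)
print(f"PR = {pr:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

A log-binomial regression with a single binary exposure and no covariates estimates this same ratio. The regression framework becomes necessary once covariates are adjusted for, as in models 2 and 3 of the abstract.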
On the potential of models for location and scale for genome-wide DNA methylation data

PubMed Central

2014-01-01

Background With the help of epigenome-wide association studies (EWAS), increasing knowledge on the role of epigenetic mechanisms such as DNA methylation in disease processes is obtained. In addition, EWAS aid the understanding of behavioral and environmental effects on DNA methylation. In terms of statistical analysis, specific challenges arise from the characteristics of methylation data. First, methylation β-values represent proportions with skewed and heteroscedastic distributions. Thus, traditional modeling strategies assuming a normally distributed response might not be appropriate. Second, recent evidence suggests that not only mean differences but also variability in site-specific DNA methylation associates with diseases, including cancer. The purpose of this study was to compare different modeling strategies for methylation data in terms of model performance and performance of downstream hypothesis tests. Specifically, we used the generalized additive models for location, scale and shape (GAMLSS) framework to compare beta regression with Gaussian regression on raw, binary logit and arcsine square root transformed methylation data, with and without modeling a covariate effect on the scale parameter. Results Using simulated and real data from a large population-based study and an independent sample of cancer patients and healthy controls, we show that beta regression does not outperform competing strategies in terms of model performance. In addition, Gaussian models for location and scale showed an improved performance as compared to models for location only. The best performance was observed for the Gaussian model on binary logit transformed β-values, referred to as M-values. Our results further suggest that models for location and scale are specifically sensitive towards violations of the distribution assumption and towards outliers in the methylation data. 
Therefore, a resampling procedure is proposed as a mode of inference and shown to diminish type I error rate in practically relevant settings. We apply the proposed method in an EWAS of BMI and age and reveal strong associations of age with methylation variability that are validated in an independent sample. Conclusions Models for location and scale are promising tools for EWAS that may help to understand the influence of environmental factors and disease-related phenotypes on methylation variability and its role during disease development. PMID:24994026
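The abstract above refers to M-values as binary logit transformed β-values. A minimal sketch of that transformation and its inverse follows. The clipping constant `eps` is an assumption added here to avoid infinite values at β = 0 or β = 1, and the function names are illustrative.

```python
import numpy as np

def beta_to_m(beta, eps=1e-6):
    """Binary logit transform: M = log2(beta / (1 - beta)).

    Values are clipped to [eps, 1 - eps] first (an assumption of this
    sketch) so that beta = 0 or 1 does not produce infinities.
    """
    b = np.clip(np.asarray(beta, dtype=float), eps, 1.0 - eps)
    return np.log2(b / (1.0 - b))

def m_to_beta(m):
    """Inverse transform: beta = 2^M / (2^M + 1)."""
    two_m = np.power(2.0, np.asarray(m, dtype=float))
    return two_m / (two_m + 1.0)

beta_values = np.array([0.1, 0.5, 0.8])
m_values = beta_to_m(beta_values)
print(m_values)             # symmetric around 0 at beta = 0.5
print(m_to_beta(m_values))  # recovers the original beta-values
```

The transform maps the bounded interval (0, 1) onto the whole real line, which is why a Gaussian model is more plausible for M-values than for raw β-values.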

Evaluation of weighted regression and sample size in developing a taper model for loblolly pine

Treesearch

Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold

1992-01-01

A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeast. The weighted regression increased model fit marginally, but did not substantially increase model performance. In all cases, the unweighted regression models performed as well as the...
Application of a Combined Model with Autoregressive Integrated Moving Average (ARIMA) and Generalized Regression Neural Network (GRNN) in Forecasting Hepatitis Incidence in Heng County, China

PubMed Central

Liang, Hao; Gao, Lian; Liang, Bingyu; Huang, Jiegang; Zang, Ning; Liao, Yanyan; Yu, Jun; Lai, Jingzhen; Qin, Fengxiang; Su, Jinming; Ye, Li; Chen, Hui

2016-01-01

Background Hepatitis is a serious public health problem with increasing cases and property damage in Heng County. It is necessary to develop a model to predict the hepatitis epidemic that could be useful for preventing this disease. Methods The autoregressive integrated moving average (ARIMA) model and the generalized regression neural network (GRNN) model were used to fit the incidence data from the Heng County CDC (Center for Disease Control and Prevention) from January 2005 to December 2012. Then, the ARIMA-GRNN hybrid model was developed. The incidence data from January 2013 to December 2013 were used to validate the models. Several parameters, including mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE) and mean square error (MSE), were used to compare the performance among the three models. Results The morbidity of hepatitis from Jan 2005 to Dec 2012 showed seasonal variation and a slightly rising trend. The ARIMA(0,1,2)(1,1,1)12 model was the most appropriate one, with the residual test showing a white noise sequence. The smoothing factors of the basic GRNN model and the combined model were 1.8 and 0.07, respectively. In the validation, the four parameters of the hybrid model were lower than those of the two single models. In the fitting, the parameter values of the GRNN model were the lowest of the three models. Conclusions The hybrid ARIMA-GRNN model showed better hepatitis incidence forecasting in Heng County than the single ARIMA model and the basic GRNN model. It is a potential decision-supportive tool for controlling hepatitis in Heng County. PMID:27258555
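The four comparison measures named above (MAE, MSE, RMSE, MAPE) have standard definitions, sketched below for a pair of observed and forecast series. The function name and the example numbers are hypothetical. MAPE is undefined when an observed value is zero, which this sketch does not handle.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for two equal-length series."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    mse = np.mean(err ** 2)
    return {
        "MAE": np.mean(np.abs(err)),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAPE": np.mean(np.abs(err / actual)) * 100.0,  # requires actual != 0
    }

# Hypothetical observed and forecast incidence values.
metrics = forecast_errors([10, 20, 40], [12, 18, 40])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Lower values indicate a closer fit on all four measures, which is the sense in which the abstract ranks the three models.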
Plasma and serum L-selectin and clinical and subclinical cardiovascular disease: the Multi-Ethnic Study of Atherosclerosis (MESA)

PubMed Central

BERARDI, CECILIA; DECKER, PAUL A.; KIRSCH, PHILLIP S.; DE ANDRADE, MARIZA; TSAI, MICHAEL Y.; PANKOW, JAMES S.; SALE, MICHELE M.; SICOTTE, HUGUES; TANG, WEIHONG; HANSON, NAOMI; POLAK, JOSEPH F.; BIELINSKI, SUZETTE J.

2014-01-01

L-selectin has been suggested to play a role in atherosclerosis. Previous studies on cardiovascular disease (CVD) and serum or plasma L-selectin are inconsistent. The association of serum L-selectin (sL-selectin) with carotid intima-media thickness, coronary artery calcium, ankle-brachial index (subclinical CVD) and incident CVD was assessed in 2403 participants in the Multi-Ethnic Study of Atherosclerosis (MESA). Regression analysis and the Tobit model were used to study subclinical disease; Cox proportional hazards regression was used for incident CVD. Mean age was 63 ± 10 years, and 47% were male; mean sL-selectin was significantly different across ethnicities. Within each race/ethnicity, sL-selectin was associated with age and sex; among Caucasians and African Americans, it was associated with smoking status and current alcohol use. sL-selectin levels did not predict subclinical or clinical CVD after correction for multiple comparisons. Conditional logistic regression models were used to study plasma L-selectin and CVD in 154 incident CVD cases, which occurred over a median follow-up of 8.5 years, and 306 age-, sex-, and ethnicity-matched controls. L-selectin levels in plasma were significantly lower than in serum and the overall concordance was low. Plasma levels were not associated with CVD. In conclusion, in this large multi-ethnic population, soluble L-selectin levels did not predict clinical or subclinical CVD. PMID:24631064
Wolf population regulation revisited: again

USGS Publications Warehouse

McRoberts, Ronald E.; Mech, L. David

2014-01-01

The long-accepted conclusion that wolf density is regulated by nutrition was recently challenged, and the conclusion was reached that, at greater levels of prey biomass, social factors such as intraspecific strife and territoriality tend to regulate wolf density. We reanalyzed the data used in that study for 2 reasons: 1) we disputed the use of 2 data points, and 2) because of recognized heteroscedasticity, we used weighted-regression analysis instead of the unweighted regressions used in the original study. We concluded that the data do not support the hypothesis that wolf densities are regulated by social factors.
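The reanalysis above replaces unweighted regression with weighted regression because of heteroscedasticity. The sketch below shows the general technique on synthetic data whose spread grows with the predictor. The variable names echo the abstract, but the data, the weights (inverse variance, assumed known), and the coefficients are all hypothetical and are not the authors' values.

```python
import numpy as np

def weighted_least_squares(x, y, weights):
    """Fit y = b0 + b1 * x by minimizing sum(weights * residual^2)."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(weights)
    # Scaling each row by sqrt(weight) turns WLS into an ordinary LS problem.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Synthetic heteroscedastic data (hypothetical): noise grows with the predictor.
rng = np.random.default_rng(3)
prey_biomass = np.linspace(1.0, 10.0, 40)
noise_sd = 0.5 * prey_biomass
wolf_density = 2.0 + 3.0 * prey_biomass + rng.normal(scale=noise_sd)

b_unweighted = weighted_least_squares(prey_biomass, wolf_density, np.ones(40))
b_weighted = weighted_least_squares(prey_biomass, wolf_density, 1.0 / noise_sd ** 2)

print("unweighted:", np.round(b_unweighted, 3))
print("weighted:  ", np.round(b_weighted, 3))
```

With inverse-variance weights, the noisier observations at high predictor values pull less on the fitted line, which is the usual motivation for weighting when the error variance is not constant.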
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

NASA Astrophysics Data System (ADS)

Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

2017-06-01

A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate the odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model that is influenced by the geographical location of the observation site. Parameter estimation in the model is needed to determine population values based on a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probabilities of the dengue fever patient-count categories.
Modelling Fourier regression for time series data - a case study: modelling inflation in foods sector in Indonesia

NASA Astrophysics Data System (ADS)

Prahutama, Alan; Suparti; Wahyu Utami, Tiani

2018-03-01

Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression relies on strict assumptions, whereas nonparametric regression does not require an assumed model form. Time series data are observations of a variable ordered in time, so to model time series data by regression, the response and predictor variables must be determined first. The response variable in a time series is the variable at time t (yt), while the predictor variables are its significant lags. In nonparametric regression modeling, one developing approach is the Fourier series approach. One advantage of nonparametric regression using Fourier series is its ability to handle data with a trigonometric pattern. Modeling with Fourier series requires a parameter K, whose value can be determined with the Generalized Cross Validation method. In inflation modeling for the transportation, communication and financial services sector, the Fourier series yields an optimal K of 120 parameters with an R-square of 99%, whereas multiple linear regression yields an R-square of 90%.
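The selection of K by Generalized Cross Validation described above can be sketched as follows. The basis used here (intercept, linear term, and K cosine terms) is an assumption of this sketch and may differ from the authors' exact estimator. The data are synthetic and the function names are illustrative. GCV is computed as the mean squared residual divided by (1 - trace(H)/n)^2, where H is the hat matrix.

```python
import numpy as np

def fourier_design(t, K):
    """Design matrix with columns 1, t, cos(t), cos(2t), ..., cos(Kt)."""
    cols = [np.ones_like(t), t] + [np.cos(k * t) for k in range(1, K + 1)]
    return np.column_stack(cols)

def gcv_score(t, y, K):
    """GCV(K) = (RSS / n) / (1 - trace(H) / n)^2 for the least squares fit."""
    X = fourier_design(t, K)
    H = X @ np.linalg.pinv(X)            # hat matrix of the linear smoother
    resid = y - H @ y
    n = len(y)
    return (resid @ resid / n) / (1.0 - np.trace(H) / n) ** 2

# Synthetic series (hypothetical): linear trend plus one cosine component.
rng = np.random.default_rng(4)
n = 200
t = np.linspace(0.0, np.pi, n)
y = 0.5 * t + 2.0 * np.cos(3 * t) + rng.normal(scale=0.3, size=n)

scores = {K: gcv_score(t, y, K) for K in range(1, 11)}
best_K = min(scores, key=scores.get)
print("selected K:", best_K, "GCV:", round(scores[best_K], 4))
```

The denominator penalizes model size, so GCV stops rewarding additional cosine terms once they only fit noise.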
Air Pollution and Lung Function in Dutch Children: A Comparison of Exposure Estimates and Associations Based on Land Use Regression and Dispersion Exposure Modeling Approaches

PubMed Central

Gehring, Ulrike; Hoek, Gerard; Keuken, Menno; Jonkers, Sander; Beelen, Rob; Eeftens, Marloes; Postma, Dirkje S.; Brunekreef, Bert

2015-01-01

Background There is limited knowledge about the extent to which estimates of air pollution effects on health are affected by the choice for a specific exposure model. Objectives We aimed to evaluate the correlation between long-term air pollution exposure estimates using two commonly used exposure modeling techniques [dispersion and land use regression (LUR) models] and, in addition, to compare the estimates of the association between long-term exposure to air pollution and lung function in children using these exposure modeling techniques. Methods We used data of 1,058 participants of a Dutch birth cohort study with measured forced expiratory volume in 1 sec (FEV1), forced vital capacity (FVC), and peak expiratory flow (PEF) measurements at 8 years of age. For each child, annual average outdoor air pollution exposure [nitrogen dioxide (NO2), mass concentration of particulate matter with diameters ≤ 2.5 and ≤ 10 μm (PM2.5, PM10), and PM2.5 soot] was estimated for the current addresses of the participants by a dispersion and a LUR model. Associations between exposures to air pollution and lung function parameters were estimated using linear regression analysis with confounder adjustment. Results Correlations between LUR- and dispersion-modeled pollution concentrations were high for NO2, PM2.5, and PM2.5 soot (R = 0.86–0.90) but low for PM10 (R = 0.57). Associations with lung function were similar for air pollutant exposures estimated using LUR and dispersion modeling, except for associations of PM2.5 with FEV1 and FVC, which were stronger but less precise for exposures based on LUR compared with dispersion model. Conclusions Predictions from LUR and dispersion models correlated very well for PM2.5, NO2, and PM2.5 soot but not for PM10. Health effect estimates did not depend on the type of model used to estimate exposure in a population of Dutch children. Citation Wang M, Gehring U, Hoek G, Keuken M, Jonkers S, Beelen R, Eeftens M, Postma DS, Brunekreef B. 2015. Air pollution and lung function in Dutch children: a comparison of exposure estimates and associations based on land use regression and dispersion exposure modeling approaches. Environ Health Perspect 123:847–851; http://dx.doi.org/10.1289/ehp.1408541 PMID:25839747
Secure and Efficient Regression Analysis Using a Hybrid Cryptographic Framework: Development and Evaluation

PubMed Central

Jiang, Xiaoqian; Aziz, Md Momin Al; Wang, Shuang; Mohammed, Noman

2018-01-01

Background Machine learning is an effective data-driven tool that is being widely used to extract valuable patterns and insights from data. Specifically, predictive machine learning models are very important in health care for clinical data analysis. The machine learning algorithms that generate predictive models often require pooling data from different sources to discover statistical patterns or correlations among different attributes of the input data. The primary challenge is to fulfill one major objective: preserving the privacy of individuals while discovering knowledge from data. Objective Our objective was to develop a hybrid cryptographic framework for performing regression analysis over distributed data in a secure and efficient way. Methods Existing secure computation schemes are not suitable for processing the large-scale data that are used in cutting-edge machine learning applications. We designed, developed, and evaluated a hybrid cryptographic framework that can securely perform regression analysis, a fundamental machine learning algorithm, using somewhat homomorphic encryption and a newly introduced secure hardware component, Intel Software Guard Extensions (Intel SGX), to ensure both privacy and efficiency at the same time. Results Experimental results demonstrate that our proposed method provides a better trade-off in terms of security and efficiency than solely secure hardware-based methods. Moreover, there is no approximation error: computed model parameters are identical to the plaintext results. Conclusions To the best of our knowledge, this kind of secure computation model using a hybrid cryptographic framework, which leverages both somewhat homomorphic encryption and Intel SGX, has not been proposed or evaluated to date. Our proposed framework ensures data security and computational efficiency at the same time. PMID:29506966
Improved performance of epidemiologic and genetic risk models for rheumatoid arthritis serologic phenotypes using family history

PubMed Central

Sparks, Jeffrey A.; Chen, Chia-Yen; Jiang, Xia; Askling, Johan; Hiraki, Linda T.; Malspeis, Susan; Klareskog, Lars; Alfredsson, Lars; Costenbader, Karen H.; Karlson, Elizabeth W.

2014-01-01

Objective To develop and validate rheumatoid arthritis (RA) risk models based on family history, epidemiologic factors, and known genetic risk factors. Methods We developed and validated models for RA based on known RA risk factors, among women in two cohorts: the Nurses’ Health Study (NHS, 381 RA cases and 410 controls) and the Epidemiological Investigation of RA (EIRA, 1244 RA cases and 971 controls). Model discrimination was evaluated using the area under the receiver operating characteristic curve (AUC) in logistic regression models for the study population and for those with positive family history. The joint effect of family history with genetics, smoking, and body mass index (BMI) was evaluated using logistic regression models to estimate odds ratios (OR) for RA. Results The complete model including family history, epidemiologic risk factors, and genetics demonstrated AUCs of 0.74 for seropositive RA in NHS and 0.77 for anti-citrullinated protein antibody (ACPA)-positive RA in EIRA. Among women with positive family history, discrimination was excellent for complete models for seropositive RA in NHS (AUC 0.82) and ACPA-positive RA in EIRA (AUC 0.83). Positive family history, high genetic susceptibility, smoking, and increased BMI had an OR of 21.73 for ACPA-positive RA. Conclusions We developed models for seropositive and seronegative RA phenotypes based on family history, epidemiologic and genetic factors. Among those with positive family history, models utilizing epidemiologic and genetic factors were highly discriminatory for seropositive and seronegative RA. Assessing epidemiological and genetic factors among those with positive family history may identify individuals suitable for RA prevention strategies. PMID:24685909
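The discrimination measure used in the record above, the area under the ROC curve, can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The following is a minimal sketch of that pairwise definition, not the study's code; the function name `auc` and the predicted risks are made up for illustration.

```python
def auc(case_scores, control_scores):
    """Share of case/control pairs in which the case scores higher.

    Ties count as half a win, which matches the Mann-Whitney form of the AUC.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks from some fitted logistic model.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
print(round(auc(cases, controls), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.74 to 0.83 values sit.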
Models for forecasting hospital bed requirements in the acute sector.

PubMed Central

Farmer, R D; Emami, J

1990-01-01

STUDY OBJECTIVE--The aim was to evaluate the current approach to forecasting hospital bed requirements. DESIGN--The study was a time series and regression analysis. The time series for mean duration of stay for general surgery in the age group 15-44 years (1969-1982) was used in the evaluation of different methods of forecasting future values of mean duration of stay and its subsequent use in the formation of hospital bed requirements. RESULTS--It has been suggested that the simple trend fitting approach suffers from model specification error and imposes unjustified restrictions on the data. Time series approach (Box-Jenkins method) was shown to be a more appropriate way of modelling the data. CONCLUSION--The simple trend fitting approach is inferior to the time series approach in modelling hospital bed requirements. PMID:2277253
Epigenome-wide cross-tissue predictive modeling and comparison of cord blood and placental methylation in a birth cohort

PubMed Central

De Carli, Margherita M; Baccarelli, Andrea A; Trevisi, Letizia; Pantic, Ivan; Brennan, Kasey JM; Hacker, Michele R; Loudon, Holly; Brunst, Kelly J; Wright, Robert O; Wright, Rosalind J; Just, Allan C

2017-01-01

Aim: We compared predictive modeling approaches to estimate placental methylation using cord blood methylation. Materials & methods: We performed locus-specific methylation prediction using both linear regression and support vector machine models with 174 matched pairs of 450k arrays. Results: At most CpG sites, both approaches gave poor predictions in spite of a misleading improvement in array-wide correlation. CpG islands and gene promoters, but not enhancers, were the genomic contexts where the correlation between measured and predicted placental methylation levels achieved higher values. We provide a list of 714 sites where both models achieved an R2 ≥0.75. Conclusion: The present study indicates the need for caution in interpreting cross-tissue predictions. Few methylation sites can be predicted between cord blood and placenta. PMID:28234020
Complex Environmental Data Modelling Using Adaptive General Regression Neural Networks

NASA Astrophysics Data System (ADS)

Kanevski, Mikhail

2015-04-01

The research deals with an adaptation and application of Adaptive General Regression Neural Networks (GRNN) to high dimensional environmental data. GRNN [1,2,3] are efficient modelling tools both for spatial and temporal data and are based on nonparametric kernel methods closely related to the classical Nadaraya-Watson estimator. Adaptive GRNN, using anisotropic kernels, can also be applied to feature selection tasks when working with high dimensional data [1,3]. In the present research Adaptive GRNN are used to study geospatial data predictability and relevant feature selection using both simulated and real data case studies. The original raw data were either three dimensional monthly precipitation data or monthly wind speeds embedded into 13 dimensional space constructed by geographical coordinates and geo-features calculated from digital elevation model. GRNN were applied in two different ways: 1) adaptive GRNN with the resulting list of features ordered according to their relevancy; and 2) adaptive GRNN applied to evaluate all possible models N [in case of wind fields N=(2^13 -1)=8191] and rank them according to the cross-validation error. In both cases training was carried out using a leave-one-out procedure. An important result of the study is that the set of the most relevant features depends on the month (strong seasonal effect) and year. The predictabilities of precipitation and wind field patterns, estimated using the cross-validation and testing errors of raw and shuffled data, were studied in detail. The results of both approaches were qualitatively and quantitatively compared. In conclusion, Adaptive GRNN with their ability to select features and efficient modelling of complex high dimensional data can be widely used in automatic/on-line mapping and as an integrated part of environmental decision support systems. 1. Kanevski M., Pozdnoukhov A., Timonin V. Machine Learning for Spatial Environmental Data. Theory, applications and software. EPFL Press. With a CD: data, software, guides. (2009). 2. Kanevski M. Spatial Predictions of Soil Contamination Using General Regression Neural Networks. Systems Research and Information Systems, Volume 8, number 4, 1999. 3. Robert S., Foresti L., Kanevski M. Spatial prediction of monthly wind speeds in complex terrain with adaptive general regression neural networks. International Journal of Climatology, 33 pp. 1793-1804, 2013.
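The kernel idea behind the record above can be sketched with a plain Nadaraya-Watson estimator that uses one Gaussian bandwidth per input dimension and is scored by leave-one-out error. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the toy data, and the bandwidth values are all hypothetical. A very large bandwidth on a dimension makes the kernel nearly flat along it, which is one way an anisotropic kernel can signal that a feature is irrelevant.

```python
import math


def nadaraya_watson(x_query, xs, ys, bandwidths):
    """Kernel-weighted average of ys with one Gaussian bandwidth per dimension."""
    num = 0.0
    den = 0.0
    for x, y in zip(xs, ys):
        d2 = sum(((q - v) / h) ** 2 for q, v, h in zip(x_query, x, bandwidths))
        w = math.exp(-0.5 * d2)
        num += w * y
        den += w
    return num / den


def loo_error(xs, ys, bandwidths):
    """Leave-one-out mean squared error: predict each point from all the others."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        pred = nadaraya_watson(xs[i], rest_x, rest_y, bandwidths)
        total += (pred - ys[i]) ** 2
    return total / len(xs)


# Toy data (made up): y depends on the first coordinate only,
# the second coordinate is an irrelevant feature.
xs = [(0.0, 5.0), (1.0, -3.0), (2.0, 4.0), (3.0, -1.0)]
ys = [0.0, 1.0, 2.0, 3.0]

print(loo_error(xs, ys, (0.5, 0.5)))   # both features treated as relevant
print(loo_error(xs, ys, (0.5, 1e6)))   # second feature effectively switched off

# Number of non-empty feature subsets of 13 inputs, as quoted in the abstract.
print(2 ** 13 - 1)
```

On this toy set the leave-one-out error is lower when the irrelevant feature is switched off, which is the kind of comparison the exhaustive search over 8191 feature subsets ranks.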
Hydrological modeling of geophysical parameters of arboviral and protozoan disease vectors in Internally Displaced People camps in Gulu, Uganda

PubMed Central

Jacob, Benjamin G; Muturi, Ephantus J; Caamano, Erick X; Gunter, James T; Mpanga, Enoch; Ayine, Robert; Okelloonen, Joseph; Nyeko, Jack Pen-Mogi; Shililu, Josephat I; Githure, John I; Regens, James L; Novak, Robert J; Kakoma, Ibulaimu

2008-01-01

Background The aim of this study was to determine if remotely sensed data and Digital Elevation Model (DEM) can test relationships between Culex quinquefasciatus and Anopheles gambiae s.l. larval habitats and environmental parameters within Internally Displaced People (IDP) campgrounds in Gulu, Uganda. A total of 65 georeferenced aquatic habitats in various IDP camps were studied to compare the larval abundance of Cx. quinquefasciatus and An. gambiae s.l. The aquatic habitat dataset was overlaid onto Land Use Land Cover (LULC) maps retrieved from Landsat imagery with 150 m × 150 m grid cells stratified by levels of drainage. The LULC change was estimated over a period of 14 years. Poisson regression analyses and Moran's I statistics were used to model relationships between larval abundance and environmental predictors. Individual larval habitat data were further evaluated in terms of their covariations with spatial autocorrelation by regressing them on candidate spatial filter eigenvectors. Multispectral QuickBird imagery classification and DEM-based GIS methods were generated to evaluate stream flow direction and accumulation for identification of immature Cx. quinquefasciatus and An. gambiae s.l. and abundance. Results The main LULC change in urban Gulu IDP camps was non-urban to urban, which included about 71.5% of the land cover. The regression models indicate that counts of An. gambiae s.l. larvae were associated with shade while Cx. quinquefasciatus were associated with floating vegetation. Moran's I and the General G statistics for mosquito density by species and instar identified significant clusters of high densities of Anopheles larvae; Culex larvae, however, were not consistently clustered. A stepwise negative binomial regression decomposed the immature An. gambiae s.l. data into empirical orthogonal bases. The data suggest the presence of roughly 11% to 28% redundant information in the larval count samples. The DEM suggests a positive correlation for Culex (0.24) while for Anopheles there was a negative correlation (-0.23) for a local model distance to stream. Conclusion These data demonstrate that optical remote sensing, geostatistics, and DEMs can be used to identify parameters associated with Culex and Anopheles aquatic habitats. PMID:18341699
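Moran's I, used in the record above to detect clustering of larval counts, compares each site's deviation from the mean with the deviations of its neighbours. The following is a minimal sketch of the global statistic for a user-supplied weights matrix; the helper `line_neighbours` and the count vectors are hypothetical and unrelated to the Gulu data.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / sum of weights) * (weighted cross-products / sum of squares)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    ss = sum(d * d for d in dev)
    return (n / w_total) * (cross / ss)


def line_neighbours(n):
    """Binary weights for n sites along a transect; adjacent sites are neighbours."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


w = line_neighbours(4)
print(morans_i([1, 2, 3, 4], w))  # smooth gradient: positive autocorrelation
print(morans_i([1, 0, 1, 0], w))  # alternating pattern: negative autocorrelation
```

Values above the expectation under no autocorrelation, which is -1/(n-1), suggest that similar counts cluster in space; significance testing (for example by permutation) is not shown here.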
Assessment of Communications-related Admissions Criteria in a Three-year Pharmacy Program

PubMed Central

Tejada, Frederick R.; Lang, Lynn A.; Purnell, Miriam; Acedera, Lisa; Ngonga, Ferdinand

2015-01-01

Objective. To determine if there is a correlation between TOEFL and other admissions criteria that assess communications skills (ie, PCAT variables: verbal, reading, essay, and composite), interview, and observational scores and to evaluate TOEFL and these admissions criteria as predictors of academic performance. Methods. Statistical analyses included two sample t tests, multiple regression and Pearson’s correlations for parametric variables, and Mann-Whitney U for nonparametric variables, which were conducted on the retrospective data of 162 students, 57 of whom were foreign-born. Results. The multiple regression model of the other admissions criteria on TOEFL was significant. There was no significant correlation between TOEFL scores and academic performance. However, significant correlations were found between the other admissions criteria and academic performance. Conclusion. Since TOEFL is not a significant predictor of either communication skills or academic success of foreign-born PharmD students in the program, it may be eliminated as an admissions criterion. PMID:26430273
NITPICK: peak identification for mass spectrometry data

PubMed Central

Renard, Bernhard Y; Kirchner, Marc; Steen, Hanno; Steen, Judith AJ; Hamprecht, Fred A

2008-01-01

Background The reliable extraction of features from mass spectra is a fundamental step in the automated analysis of proteomic mass spectrometry (MS) experiments. Results This contribution proposes a sparse template regression approach to peak picking called NITPICK. NITPICK is a Non-greedy, Iterative Template-based peak PICKer that deconvolves complex overlapping isotope distributions in multicomponent mass spectra. NITPICK is based on fractional averagine, a novel extension to Senko's well-known averagine model, and on a modified version of sparse, non-negative least angle regression, for which a suitable, statistically motivated early stopping criterion has been derived. The strength of NITPICK is the deconvolution of overlapping mixture mass spectra. Conclusion Extensive comparative evaluation has been carried out and results are provided for simulated and real-world data sets. NITPICK outperforms pepex, to date the only alternate, publicly available, non-greedy feature extraction routine. NITPICK is available as software package for the R programming language and can be downloaded from . PMID:18755032
The Association between Toxic Exposures and Chronic Multisymptom Illness in Veterans of the Wars of Iraq and Afghanistan

PubMed Central

DeBeer, Bryann B.; Davidson, Dena; Meyer, Eric C.; Kimbrel, Nathan A.; Gulliver, Suzy B.; Morissette, Sandra B.

2017-01-01

Objective The purpose of this study was to determine if post-9/11 veterans deployed to the Iraq and Afghanistan conflicts experienced toxic exposures and whether they are related to symptoms of Chronic Multisymptom Illness (CMI). Methods Data from 224 post-9/11 veterans who self-reported exposure to hazards in theater were analyzed using hierarchical regression. Results Of the sample, 97.2% endorsed experiencing one or more potentially toxic exposure. In a regression model, toxic exposures and CMI symptoms were significantly associated above and beyond covariates. Follow-up analyses revealed that pesticide exposures, but not smoke inhalation was associated with CMI symptoms. Conclusions These findings suggest that toxic exposures were common among military personnel deployed to the most recent conflicts, and appear to be associated with CMI symptoms. Additional research on the impact of toxic exposures on returning Iraq and Afghanistan Veterans’ health is needed. PMID:28045798
The Association between Environmental Factors and Scarlet Fever Incidence in Beijing Region: Using GIS and Spatial Regression Models

PubMed Central

Mahara, Gehendra; Wang, Chao; Yang, Kun; Chen, Sipeng; Guo, Jin; Gao, Qi; Wang, Wei; Wang, Quanyi; Guo, Xiuhua

2016-01-01

(1) Background: Evidence regarding scarlet fever and its relationship with meteorological, including air pollution factors, is not very available. This study aimed to examine the relationship between ambient air pollutants and meteorological factors with scarlet fever occurrence in Beijing, China. (2) Methods: A retrospective ecological study was carried out to distinguish the epidemic characteristics of scarlet fever incidence in Beijing districts from 2013 to 2014. Daily incidence and corresponding air pollutant and meteorological data were used to develop the model. Global Moran’s I statistic and Anselin’s local Moran’s I (LISA) were applied to detect the spatial autocorrelation (spatial dependency) and clusters of scarlet fever incidence. The spatial lag model (SLM) and spatial error model (SEM) including ordinary least squares (OLS) models were then applied to probe the association between scarlet fever incidence and meteorological including air pollution factors. (3) Results: Among the 5491 cases, more than half (62%) were male, and more than one-third (37.8%) were female, with the annual average incidence rate 14.64 per 100,000 population. Spatial autocorrelation analysis exhibited the existence of spatial dependence; therefore, we applied spatial regression models. After comparing the values of R-square, log-likelihood and the Akaike information criterion (AIC) among the three models, the OLS model (R² = 0.0741, log likelihood = −1819.69, AIC = 3665.38), SLM (R² = 0.0786, log likelihood = −1819.04, AIC = 3665.08) and SEM (R² = 0.0743, log likelihood = −1819.67, AIC = 3665.36), identified that the spatial lag model (SLM) was best for model fit for the regression model. There was a positive significant association between nitrogen oxide (p = 0.027), rainfall (p = 0.036) and sunshine hour (p = 0.048), while the relative humidity (p = 0.034) had an adverse association with scarlet fever incidence in SLM.
(4) Conclusions: Our findings indicated that meteorological, as well as air pollutant factors may increase the incidence of scarlet fever; these findings may help to guide scarlet fever control programs and targeting the intervention. PMID:27827946
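The model comparison in the record above follows the usual rule of preferring the model with the smallest AIC, where AIC = 2k - 2 ln L for k estimated parameters and maximized likelihood L. The sketch below ranks the three AIC values exactly as reported in the abstract; the function names are hypothetical and nothing is refitted.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def rank_by_aic(aics):
    """Return (model, delta AIC) pairs, best model first."""
    best = min(aics.values())
    deltas = [(name, round(value - best, 2)) for name, value in aics.items()]
    return sorted(deltas, key=lambda pair: pair[1])


# AIC values as reported in the abstract.
reported = {"OLS": 3665.38, "SLM": 3665.08, "SEM": 3665.36}
for name, delta in rank_by_aic(reported):
    print(name, delta)
```

The spatial lag model ranks first, in line with the abstract, although the reported AIC values differ by no more than 0.3, so the three fits are very close on this criterion.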
The Association between Environmental Factors and Scarlet Fever Incidence in Beijing Region: Using GIS and Spatial Regression Models.

PubMed

Mahara, Gehendra; Wang, Chao; Yang, Kun; Chen, Sipeng; Guo, Jin; Gao, Qi; Wang, Wei; Wang, Quanyi; Guo, Xiuhua

2016-11-04

(1) Background: Evidence regarding scarlet fever and its relationship with meteorological, including air pollution factors, is not very available. This study aimed to examine the relationship between ambient air pollutants and meteorological factors with scarlet fever occurrence in Beijing, China. (2) Methods: A retrospective ecological study was carried out to distinguish the epidemic characteristics of scarlet fever incidence in Beijing districts from 2013 to 2014. Daily incidence and corresponding air pollutant and meteorological data were used to develop the model. Global Moran's I statistic and Anselin's local Moran's I (LISA) were applied to detect the spatial autocorrelation (spatial dependency) and clusters of scarlet fever incidence. The spatial lag model (SLM) and spatial error model (SEM) including ordinary least squares (OLS) models were then applied to probe the association between scarlet fever incidence and meteorological including air pollution factors. (3) Results: Among the 5491 cases, more than half (62%) were male, and more than one-third (37.8%) were female, with the annual average incidence rate 14.64 per 100,000 population. Spatial autocorrelation analysis exhibited the existence of spatial dependence; therefore, we applied spatial regression models. After comparing the values of R-square, log-likelihood and the Akaike information criterion (AIC) among the three models, the OLS model (R² = 0.0741, log likelihood = -1819.69, AIC = 3665.38), SLM (R² = 0.0786, log likelihood = -1819.04, AIC = 3665.08) and SEM (R² = 0.0743, log likelihood = -1819.67, AIC = 3665.36), identified that the spatial lag model (SLM) was best for model fit for the regression model. There was a positive significant association between nitrogen oxide ( p = 0.027), rainfall ( p = 0.036) and sunshine hour ( p = 0.048), while the relative humidity ( p = 0.034) had an adverse association with scarlet fever incidence in SLM. 
(4) Conclusions: Our findings indicated that meteorological, as well as air pollutant factors may increase the incidence of scarlet fever; these findings may help to guide scarlet fever control programs and targeting the intervention.
Association between Stereotactic Radiotherapy and Death from Brain Metastases of Epithelial Ovarian Cancer: a Gliwice Data Re-Analysis with Penalization

PubMed

Tukiendorf, Andrzej; Mansournia, Mohammad Ali; Wydmański, Jerzy; Wolny-Rokicka, Edyta

2017-04-01

Background: Clinical datasets for epithelial ovarian cancer brain metastatic patients are usually small in size. When adequate case numbers are lacking, resulting estimates of regression coefficients may demonstrate bias. One of the direct approaches to reduce such sparse-data bias is based on penalized estimation. Methods: A re-analysis of formerly reported hazard ratios in diagnosed patients was performed using penalized Cox regression with a popular SAS package providing additional software codes for a statistical computational procedure. Results: It was found that the penalized approach can readily diminish sparse data artefacts and radically reduce the magnitude of estimated regression coefficients. Conclusions: It was confirmed that classical statistical approaches may exaggerate regression estimates or distort study interpretations and conclusions. The results support the thesis that penalization via weak informative priors and data augmentation are the safest approaches to shrink sparse data artefacts frequently occurring in epidemiological research.
Design and Development of a Model to Simulate 0-G Treadmill Running Using the European Space Agency's Subject Loading System

NASA Technical Reports Server (NTRS)

Caldwell, E. C.; Cowley, M. S.; Scott-Pandorf, M. M.

2010-01-01

Develop a model that simulates a human running in 0 G using the European Space Agency's (ESA) Subject Loading System (SLS). The model provides ground reaction forces (GRF) based on speed and pull-down forces (PDF). DESIGN The theoretical basis for the Running Model was based on a simple spring-mass model. The dynamic properties of the spring-mass model express theoretical vertical GRF (GRFv) and shear GRF in the posterior-anterior direction (GRFsh) during running gait. ADAMs VIEW software was used to build the model, which has a pelvis, thigh segment, shank segment, and a spring foot (see Figure 1). The model's movement simulates the joint kinematics of a human running at Earth gravity with the aim of generating GRF data. DEVELOPMENT & VERIFICATION ESA provided parabolic flight data of subjects running while using the SLS, for further characterization of the model's GRF. Peak GRF data were fit to a linear regression line dependent on PDF and speed. Interpolation and extrapolation of the regression equation provided a theoretical data matrix, which is used to drive the model's motion equations. Verification of the model was conducted by running the model at 4 different speeds, with each speed accounting for 3 different PDF. The model's GRF data fell within a 1-standard-deviation boundary derived from the empirical ESA data. CONCLUSION The Running Model aids in conducting various simulations (potential scenarios include a fatigued runner or a powerful runner generating high loads at a fast cadence) to determine limitations for the T2 vibration isolation system (VIS) aboard the International Space Station. This model can predict how running with the ESA SLS affects the T2 VIS and may be used for other exercise analyses in the future.
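The regression step described in the record above, fitting peak GRF as a linear function of PDF and speed and then evaluating the fitted equation on a grid, can be sketched as follows. Everything numeric here is made up for illustration: the coefficients, the units, and the 3 x 4 grid of PDF and speed values are assumptions, and the helper names `fit_plane` and `predict` are hypothetical rather than taken from the report.

```python
def fit_plane(rows):
    """Least-squares fit of grf = b0 + b1*pdf + b2*speed via the normal equations.

    rows is a list of (pdf, speed, grf) tuples.
    """
    X = [(1.0, p, s) for p, s, _ in rows]
    y = [g for _, _, g in rows]
    # Augmented 3x4 system (X'X | X'y).
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(3)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][3] / A[i][i] for i in range(3)]


def predict(coeffs, pdf, speed):
    """Evaluate the fitted plane at one (pdf, speed) point."""
    b0, b1, b2 = coeffs
    return b0 + b1 * pdf + b2 * speed


# Synthetic, noise-free observations generated from a known plane (not ESA data).
pdfs = [400.0, 600.0, 800.0]
speeds = [2.0, 3.0, 4.0, 5.0]
rows = [(p, s, 100.0 + 2.0 * p + 50.0 * s) for p in pdfs for s in speeds]

coeffs = fit_plane(rows)
# Interpolated and extrapolated grid of predicted peak GRF values.
grid = [[predict(coeffs, p, s) for s in (1.0, 3.5, 6.0)] for p in (500.0, 900.0)]
print(coeffs)
print(grid)
```

Because the synthetic data lie exactly on a plane, the fit recovers the generating coefficients; with real measurements the residual spread would indicate how far extrapolated grid values can be trusted.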

A Multi-Study Analysis of Conceptual and Measurement Issues Related to Health Research on Acculturation in Latinos

PubMed Central

Andrews, Arthur R.; Bridges, Ana J.; Gomez, Debbie

2014-01-01

Purpose The aims of the study were to evaluate the orthogonality of acculturation for Latinos. Design Regression analyses were used to examine acculturation in two Latino samples (N = 77; N = 40). In a third study (N = 673), confirmatory factor analyses compared unidimensional and bidimensional models. Method Acculturation was assessed with the ARSMA-II (Studies 1 and 2), and language proficiency items from the Children of Immigrants Longitudinal Study (Study 3). Results In Studies 1 and 2, the bidimensional model accounted for slightly more variance (R² = .11 in Study 1; R² = .21 in Study 2) than the unidimensional model (R² = .10 in Study 1; R² = .19 in Study 2). In Study 3, the bidimensional model evidenced better fit (Akaike information criterion = 167.36) than the unidimensional model (Akaike information criterion = 1204.92). Discussion/Conclusions Acculturation is multidimensional. Implications for Practice Care providers should examine acculturation as a bidimensional construct. PMID:23361579
Applying Kaplan-Meier to Item Response Data

ERIC Educational Resources Information Center

McNeish, Daniel

2018-01-01

Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…
The Association Between Internet Use and Ambulatory Care-Seeking Behaviors in Taiwan: A Cross-Sectional Study

PubMed Central

Chen, Tsung-Fu; Liang, Jyh-Chong; Lin, Tzu-Bin; Tsai, Chin-Chung

2016-01-01

Background Compared with the traditional ways of gaining health-related information from newspapers, magazines, radio, and television, the Internet is inexpensive, accessible, and conveys diverse opinions. Several studies on how increasing Internet use affected outpatient clinic visits were inconclusive. Objective The objective of this study was to examine the role of Internet use on ambulatory care-seeking behaviors as indicated by the number of outpatient clinic visits after adjusting for confounding variables. Methods We conducted this study using a sample randomly selected from the general population in Taiwan. To handle the missing data, we built a multivariate logistic regression model for propensity score matching using age and sex as the independent variables. The questionnaires with no missing data were then included in a multivariate linear regression model for examining the association between Internet use and outpatient clinic visits. Results We included a sample of 293 participants who answered the questionnaire with no missing data in the multivariate linear regression model. We found that Internet use was significantly associated with more outpatient clinic visits (P=.04). The participants with chronic diseases tended to make more outpatient clinic visits (P<.01). Conclusions The inconsistent quality of health-related information obtained from the Internet may be associated with patients’ increasing need for interpreting and discussing the information with health care professionals, thus resulting in an increasing number of outpatient clinic visits. In addition, the media literacy of Web-based health-related information seekers may also affect their ambulatory care-seeking behaviors, such as outpatient clinic visits. PMID:27927606
Multiple balance tests improve the assessment of postural stability in subjects with Parkinson's disease

PubMed Central

Jacobs, J V; Horak, F B; Tran, V K; Nutt, J G

2006-01-01

Objectives Clinicians often base the implementation of therapies on the presence of postural instability in subjects with Parkinson's disease (PD). These decisions are frequently based on the pull test from the Unified Parkinson's Disease Rating Scale (UPDRS). We sought to determine whether combining the pull test, the one‐leg stance test, the functional reach test, and UPDRS items 27–29 (arise from chair, posture, and gait) predicts balance confidence and falling better than any test alone. Methods The study included 67 subjects with PD. Subjects performed the one‐leg stance test, the functional reach test, and the UPDRS motor exam. Subjects also responded to the Activities‐specific Balance Confidence (ABC) scale and reported how many times they fell during the previous year. Regression models determined the combination of tests that optimally predicted mean ABC scores or categorised fall frequency. Results When all tests were included in a stepwise linear regression, only gait (UPDRS item 29), the pull test (UPDRS item 30), and the one‐leg stance test, in combination, represented significant predictor variables for mean ABC scores (r2 = 0.51). A multinomial logistic regression model including the one‐leg stance test and gait represented the model with the fewest significant predictor variables that correctly identified the most subjects as fallers or non‐fallers (85% of subjects were correctly identified). Conclusions Multiple balance tests (including the one‐leg stance test, and the gait and pull test items of the UPDRS) that assess different types of postural stress provide an optimal assessment of postural stability in subjects with PD. PMID:16484639
Periodic limb movements of sleep: empirical and theoretical evidence supporting objective at-home monitoring

PubMed Central

Moro, Marilyn; Goparaju, Balaji; Castillo, Jelina; Alameddine, Yvonne; Bianchi, Matt T

2016-01-01

Introduction Periodic limb movements of sleep (PLMS) may increase cardiovascular and cerebrovascular morbidity. However, most people with PLMS are either asymptomatic or have nonspecific symptoms. Therefore, predicting elevated PLMS in the absence of restless legs syndrome remains an important clinical challenge. Methods We undertook a retrospective analysis of demographic data, subjective symptoms, and objective polysomnography (PSG) findings in a clinical cohort with or without obstructive sleep apnea (OSA) from our laboratory (n=443 with OSA, n=209 without OSA). Correlation analysis and regression modeling were performed to determine predictors of periodic limb movement index (PLMI). Markov decision analysis with TreeAge software compared strategies to detect PLMS: in-laboratory PSG, at-home testing, and a clinical prediction tool based on the regression analysis. Results Elevated PLMI values (>15 per hour) were observed in >25% of patients. PLMI values in No-OSA patients correlated with age, sex, self-reported nocturnal leg jerks, restless legs syndrome symptoms, and hypertension. In OSA patients, PLMI correlated only with age and self-reported psychiatric medications. Regression models indicated only a modest predictive value of demographics, symptoms, and clinical history. Decision modeling suggests that at-home testing is favored as the pretest probability of PLMS increases, given plausible assumptions regarding PLMS morbidity, costs, and assumed benefits of pharmacological therapy. Conclusion Although elevated PLMI values were commonly observed, routinely acquired clinical information had only weak predictive utility. As the clinical importance of elevated PLMI continues to evolve, it is likely that objective measures such as PSG or at-home PLMS monitors will prove increasingly important for clinical and research endeavors. PMID:27540316
Genetic Variants in the Hedgehog Interacting Protein Gene Are Associated with the FEV1/FVC Ratio in Southern Han Chinese Subjects with Chronic Obstructive Pulmonary Disease

PubMed Central

Zhang, Zili; Wang, Jian; Zheng, Zeguang; Chen, Xindong; Zeng, Xiansheng; Zhang, Yi; Li, Defu; Shu, Jiaze; Yang, Kai; Lai, Ning; Dong, Lian

2017-01-01

Background Convincing evidence has demonstrated associations between HHIP and FAM13a polymorphisms and COPD in non-Asian populations. Here genetic variants in HHIP and FAM13a were investigated in Southern Han Chinese subjects with COPD. Methods A case-control study was conducted, including 989 cases and 999 controls. The associations between SNP genotypes and COPD were assessed with a logistic regression model; for SNPs and COPD-related phenotypes such as lung function, COPD severity, pack-years of smoking, and smoking status, a linear regression model was employed. Effects of risk alleles, genotypes, and haplotypes of the 3 significant SNPs in the HHIP gene on FEV1/FVC were also assessed in a linear regression model in COPD. Results The mean FEV1/FVC% value was 46.8 in the combined COPD population. None of the 8 selected SNPs was apparently related to COPD susceptibility. However, three SNPs (rs12509311, rs13118928, and rs182859) in HHIP were associated significantly with the FEV1/FVC% (Pmax = 4.1 × 10⁻⁴) in COPD, adjusting for gender, age, and smoking pack-years. Moreover, statistically significant associations of risk alleles with the FEV1/FVC% (P = 2.3 × 10⁻⁴) and of risk genotypes with the FEV1/FVC% (P = 3.5 × 10⁻⁴) were also observed in COPD. Conclusions Genetic variants in HHIP were related to FEV1/FVC in COPD. Significant relationships between risk alleles and risk genotypes and FEV1/FVC in COPD were also identified. PMID:28929109
Characterizing mammographic images by using generic texture features

PubMed Central

2012-01-01

Introduction Although mammographic density is an established risk factor for breast cancer, its use is limited in clinical practice because of a lack of automated and standardized measurement methods. The aims of this study were to evaluate a variety of automated texture features in mammograms as risk factors for breast cancer and to compare them with the percentage mammographic density (PMD) by using a case-control study design. Methods A case-control study including 864 cases and 418 controls was analyzed automatically. Four hundred seventy features were explored as possible risk factors for breast cancer. These included statistical features, moment-based features, spectral-energy features, and form-based features. An elaborate variable selection process using logistic regression analyses was performed to identify those features that were associated with case-control status. In addition, PMD was assessed and included in the regression model. Results Of the 470 image-analysis features explored, 46 remained in the final logistic regression model. An area under the curve of 0.79, with an odds ratio per standard deviation change of 2.88 (95% CI, 2.28 to 3.65), was obtained with validation data. Adding the PMD did not improve the final model. Conclusions Using texture features to predict the risk of breast cancer appears feasible. PMD did not show any additional value in this study. With regard to the features assessed, most of the analysis tools appeared to reflect mammographic density, although some features did not correlate with PMD. It remains to be investigated in larger case-control studies whether these features can contribute to increased prediction accuracy. PMID:22490545
Accounting for individual differences and timing of events: estimating the effect of treatment on criminal convictions in heroin users

PubMed Central

2014-01-01

Background The reduction of crime is an important outcome of opioid maintenance treatment (OMT). Criminal intensity and treatment regimes vary among OMT patients, but this is rarely adjusted for in statistical analyses, which tend to focus on cohort incidence rates and rate ratios. The purpose of this work was to estimate the relationship between treatment and criminal convictions among OMT patients, adjusting for individual covariate information and timing of events, fitting time-to-event regression models of increasing complexity. Methods National criminal records were cross linked with treatment data on 3221 patients starting OMT in Norway 1997–2003. In addition to calculating cohort incidence rates, criminal convictions was modelled as a recurrent event dependent variable, and treatment a time-dependent covariate, in Cox proportional hazards, Aalen’s additive hazards, and semi-parametric additive hazards regression models. Both fixed and dynamic covariates were included. Results During OMT, the number of days with criminal convictions for the cohort as a whole was 61% lower than when not in treatment. OMT was associated with reduced number of days with criminal convictions in all time-to-event regression models, but the hazard ratio (95% CI) was strongly attenuated when adjusting for covariates; from 0.40 (0.35, 0.45) in a univariate model to 0.79 (0.72, 0.87) in a fully adjusted model. The hazard was lower for females and decreasing with older age, while increasing with high numbers of criminal convictions prior to application to OMT (all p < 0.001). The strongest predictors were level of criminal activity prior to entering into OMT, and having a recent criminal conviction (both p < 0.001). The effect of several predictors was significantly time-varying with their effects diminishing over time. 
Conclusions Analyzing complex observational data with regard to fixed factors only overlooks important temporal information, and naïve cohort-level incidence rates might result in biased estimates of the effect of interventions. Applying time-to-event regression models, properly adjusted for individual covariate information and the timing of various events, allows for more precise and reliable effect estimates and paints a more nuanced picture that can aid health care professionals and policy makers. PMID:24886472
A Comparison of a Machine Learning Model with EuroSCORE II in Predicting Mortality after Elective Cardiac Surgery: A Decision Curve Analysis

PubMed Central

Allyn, Jérôme; Allou, Nicolas; Augustin, Pascal; Philip, Ivan; Martinet, Olivier; Belghiti, Myriem; Provenchere, Sophie; Montravers, Philippe; Ferdynus, Cyril

2017-01-01

Background The benefits of cardiac surgery are sometimes difficult to predict and the decision to operate on a given individual is complex. Machine Learning and Decision Curve Analysis (DCA) are recent methods developed to create and evaluate prediction models. Methods and findings We conducted a retrospective cohort study using a prospectively collected database from December 2005 to December 2012, from a cardiac surgical center at University Hospital. The different models for predicting in-hospital mortality after elective cardiac surgery, including EuroSCORE II, a logistic regression model and a machine learning model, were compared by ROC and DCA. Of the 6,520 patients having elective cardiac surgery with cardiopulmonary bypass, 6.3% died. Mean age was 63.4 years (standard deviation 14.4), and mean EuroSCORE II was 3.7 (4.8) %. The area under the ROC curve (95% CI) for the machine learning model (0.795 (0.755–0.834)) was significantly higher than that of EuroSCORE II or the logistic regression model (respectively, 0.737 (0.691–0.783) and 0.742 (0.698–0.785), p < 0.0001). Decision Curve Analysis showed that the machine learning model, in this monocentric study, has a greater benefit whatever the probability threshold. Conclusions According to ROC and DCA, the machine learning model is more accurate in predicting mortality after elective cardiac surgery than EuroSCORE II. These results confirm the use of machine learning methods in the field of medical prediction. PMID:28060903
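The quantity that a decision curve plots can be sketched in a few lines. The example below uses the standard net-benefit definition, TP/n − (FP/n) × pt/(1 − pt), where pt is the probability threshold; the function name and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged patients who had the event
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged patients who did not have the event
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalised by the odds of the threshold
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = died, 0 = survived), for illustration only
outcomes = [1, 0, 0, 1, 0, 0, 0, 1]
risks = [0.80, 0.10, 0.30, 0.60, 0.05, 0.40, 0.20, 0.15]

for pt in (0.10, 0.25, 0.50):
    print(pt, round(net_benefit(outcomes, risks, pt), 4))
```

Evaluating this function over a grid of thresholds for each competing model, and plotting the results together, gives curves of the kind compared in a decision curve analysis.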
Structural vascular disease in Africans: Performance of ethnic-specific waist circumference cut points using logistic regression and neural network analyses: The SABPA study.

PubMed

Botha, J; de Ridder, J H; Potgieter, J C; Steyn, H S; Malan, L

2013-10-01

A recently proposed model for waist circumference cut points (RPWC), driven by increased blood pressure, was demonstrated in an African population. We therefore aimed to validate the RPWC by comparing the RPWC and the Joint Statement Consensus (JSC) models via Logistic Regression (LR) and Neural Networks (NN) analyses. Urban African gender groups (N=171) were stratified according to the JSC and RPWC cut point models. Ultrasound carotid intima media thickness (CIMT), blood pressure (BP) and fasting bloods (glucose, high density lipoprotein (HDL) and triglycerides) were obtained in a well-controlled setting. The RPWC male model (LR ROC AUC: 0.71, NN ROC AUC: 0.71) was practically equal to the JSC model (LR ROC AUC: 0.71, NN ROC AUC: 0.69) to predict structural vascular disease. Similarly, the female RPWC model (LR ROC AUC: 0.84, NN ROC AUC: 0.82) and JSC model (LR ROC AUC: 0.82, NN ROC AUC: 0.81) equally predicted CIMT as surrogate marker for structural vascular disease. Odds ratios supported validity where prediction of CIMT revealed clinical significance, well over 1, for both the JSC and RPWC models in African males and females (OR 3.75-13.98). In conclusion, the proposed RPWC model was substantially validated utilizing linear and non-linear analyses. We therefore propose ethnic-specific WC cut points (African males, ≥90 cm; females, ≥98 cm) to predict a surrogate marker for structural vascular disease. © J. A. Barth Verlag in Georg Thieme Verlag KG Stuttgart · New York.
[Risk factor analysis of the patients with solitary pulmonary nodules and establishment of a prediction model for the probability of malignancy].

PubMed

Wang, X; Xu, Y H; Du, Z Y; Qian, Y J; Xu, Z H; Chen, R; Shi, M H

2018-02-23

Objective: This study aims to analyze the relationship among the clinical features, radiologic characteristics and pathological diagnosis in patients with solitary pulmonary nodules (SPNs), and to establish a prediction model for the probability of malignancy. Methods: Clinical data of 372 patients with solitary pulmonary nodules who underwent surgical resection with definite postoperative pathological diagnosis were retrospectively analyzed. In these cases, we collected clinical and radiologic features including gender, age, smoking history, history of tumor, family history of cancer, the location of lesion, ground-glass opacity, maximum diameter, calcification, vessel convergence sign, vacuole sign, pleural indentation, spiculation and lobulation. The cases were divided into a modeling group (268 cases) and a validation group (104 cases). A new prediction model was established by logistic regression analysis of the data from the modeling group. The data of the validation group were then used to validate the new model, which was compared with three classical models (Mayo model, VA model and LiYun model). With the probability values calculated by each model for the validation group, SPSS 22.0 was used to draw the receiver operating characteristic curve and assess the predictive value of the new model. Results: 112 benign SPNs and 156 malignant SPNs were included in the modeling group. Multivariable logistic regression analysis showed that gender, age, history of tumor, ground-glass opacity, maximum diameter, and spiculation were independent predictors of malignancy in patients with SPN (P < 0.05). The prediction model for the probability of malignancy was as follows: p = e^x / (1 + e^x), where x = −4.8029 − 0.743 × gender + 0.057 × age + 1.306 × history of tumor + 1.305 × ground-glass opacity + 0.051 × maximum diameter + 1.043 × spiculation.
When the data of the validation group were entered into the four mathematical prediction models, the area under the curve of our prediction model was 0.742, which is greater than that of the other models (Mayo 0.696, VA 0.634, LiYun 0.681), although the differences between any two of the four models were not significant (P > 0.05). Conclusions: Age, gender, history of tumor, ground-glass opacity, maximum diameter and spiculation are independent predictors of malignancy in patients with a solitary pulmonary nodule. This logistic regression prediction model is not inferior to the classical models in estimating the probability of malignancy of SPNs.
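The printed equation can be evaluated directly. The sketch below is only an illustration: the coefficients are copied from the abstract, but the variable coding is an assumption (binary indicators as 0/1, age in years, maximum diameter in millimetres), because the abstract does not state it. The function and key names are hypothetical, with "spiculation" standing for the final term of the equation.

```python
import math

# Coefficients as printed in the abstract
INTERCEPT = -4.8029
COEFFS = {
    "gender": -0.743,               # coding (which sex is 1) is NOT given in the abstract
    "age": 0.057,                   # assumed: years
    "history_of_tumor": 1.306,      # assumed: 0 = no, 1 = yes
    "ground_glass_opacity": 1.305,  # assumed: 0 = no, 1 = yes
    "max_diameter": 0.051,          # assumed: millimetres
    "spiculation": 1.043,           # assumed: 0 = no, 1 = yes
}


def malignancy_probability(features):
    """Return p = e^x / (1 + e^x) for the linear predictor x."""
    x = INTERCEPT + sum(COEFFS[name] * features[name] for name in COEFFS)
    # 1 / (1 + e^-x) is algebraically identical to e^x / (1 + e^x)
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical patient, for illustration only
patient = {
    "gender": 0,
    "age": 60,
    "history_of_tumor": 0,
    "ground_glass_opacity": 1,
    "max_diameter": 20,
    "spiculation": 1,
}
print(round(malignancy_probability(patient), 3))
```

Because the coding of the inputs is assumed, the printed probability is not a clinically meaningful figure; the sketch only shows how the linear predictor and the logistic transform fit together.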
Predonation Volume of Future Remnant Cortical Kidney Helps Predict Postdonation Renal Function in Live Kidney Donors.

PubMed

Fananapazir, Ghaneh; Benzl, Robert; Corwin, Michael T; Chen, Ling-Xin; Sageshima, Junichiro; Stewart, Susan L; Troppmann, Christoph

2018-07-01

Purpose To determine whether the predonation computed tomography (CT)-based volume of the future remnant kidney is predictive of postdonation renal function in living kidney donors. Materials and Methods This institutional review board-approved, retrospective, HIPAA-compliant study included 126 live kidney donors who had undergone predonation renal CT between January 2007 and December 2014 as well as 2-year postdonation measurement of estimated glomerular filtration rate (eGFR). The whole kidney volume and cortical volume of the future remnant kidney were measured and standardized for body surface area (BSA). Bivariate linear associations between the ratios of whole kidney volume to BSA and cortical volume to BSA were obtained. A linear regression model for 2-year postdonation eGFR that incorporated donor age, sex, and either whole kidney volume-to-BSA ratio or cortical volume-to-BSA ratio was created, and the coefficient of determination (R²) for the model was calculated. Factors not statistically additive in assessing 2-year eGFR were removed by using backward elimination, and the coefficient of determination for this parsimonious model was calculated. Results Correlation was slightly better for cortical volume-to-BSA ratio than for whole kidney volume-to-BSA ratio (r = 0.48 vs r = 0.44, respectively). The linear regression model incorporating all donor factors had an R² of 0.66. The only factors that were significantly additive to the equation were cortical volume-to-BSA ratio and predonation eGFR (P = .01 and P < .01, respectively), and the final parsimonious linear regression model incorporating these two variables explained almost the same amount of variance (R² = 0.65) as did the full model. Conclusion The cortical volume of the future remnant kidney helped predict postdonation eGFR at 2 years. The cortical volume-to-BSA ratio should thus be considered for addition as an important variable to living kidney donor evaluation and selection guidelines.
© RSNA, 2018.
Prediction of Response to Neoadjuvant Chemotherapy and Radiation Therapy with Baseline and Restaging 18F-FDG PET Imaging Biomarkers in Patients with Esophageal Cancer.

PubMed

Beukinga, Roelof J; Hulshoff, Jan Binne; Mul, Véronique E M; Noordzij, Walter; Kats-Ugurlu, Gursah; Slart, Riemer H J A; Plukker, John T M

2018-06-01

Purpose To assess the value of baseline and restaging fluorine 18 (18F) fluorodeoxyglucose (FDG) positron emission tomography (PET) radiomics in predicting pathologic complete response to neoadjuvant chemotherapy and radiation therapy (NCRT) in patients with locally advanced esophageal cancer. Materials and Methods In this retrospective study, 73 patients with histologic analysis-confirmed T1/N1-3/M0 or T2-4a/N0-3/M0 esophageal cancer were treated with NCRT followed by surgery (Chemoradiotherapy for Esophageal Cancer followed by Surgery Study regimen) between October 2014 and August 2017. Clinical variables and radiomic features from baseline and restaging 18F-FDG PET were selected by univariable logistic regression and least absolute shrinkage and selection operator. The selected variables were used to fit a multivariable logistic regression model, which was internally validated by using bootstrap resampling with 20 000 replicates. The performance of this model was compared with reference prediction models composed of maximum standardized uptake value metrics, clinical variables, and maximum standardized uptake value at baseline NCRT radiomic features. Outcome was defined as complete versus incomplete pathologic response (tumor regression grade 1 vs 2-5 according to the Mandard classification). Results Pathologic response was complete in 16 patients (21.9%) and incomplete in 57 patients (78.1%). A prediction model combining clinical T-stage and restaging NCRT (post-NCRT) joint maximum (quantifying image orderliness) yielded an optimism-corrected area under the receiver operating characteristics curve of 0.81. Post-NCRT joint maximum was replaceable with five other redundant post-NCRT radiomic features that provided equal model performance. All reference prediction models exhibited substantially lower discriminatory accuracy.
Conclusion The combination of clinical T-staging and quantitative assessment of post-NCRT 18F-FDG PET orderliness (joint maximum) provided high discriminatory accuracy in predicting pathologic complete response in patients with esophageal cancer. © RSNA, 2018 Online supplemental material is available for this article.
Weak interspecific interactions in a sagebrush steppe? Conflicting evidence from observations and experiments.

PubMed

Adler, Peter B; Kleinhesselink, Andrew; Giles, Hooker; Taylor, J Bret; Teller, Brittany; Ellner, Stephen P

2018-04-28

Stable coexistence requires intraspecific limitations to be stronger than interspecific limitations. The greater the difference between intra- and interspecific limitations, the more stable the coexistence, and the weaker the competitive release any species should experience following removal of competitors. We conducted a removal experiment to test whether a previously estimated model, showing surprisingly weak interspecific competition for four dominant species in a sagebrush steppe, accurately predicts competitive release. Our treatments were 1) removal of all perennial grasses and 2) removal of the dominant shrub, Artemisia tripartita. We regressed survival, growth and recruitment on the locations, sizes, and species identities of neighboring plants, along with an indicator variable for removal treatment. If our "baseline" regression model, which accounts for local plant-plant interactions, accurately explains the observed responses to removals, then the removal coefficient should be non-significant. For survival, the removal coefficients were never significantly different from zero, and only A. tripartita showed a (negative) response to removals at the recruitment stage. For growth, the removal treatment effect was significant and positive for two species, Poa secunda and Pseudoroegneria spicata, indicating that the baseline model underestimated interspecific competition. For all three grass species, population models based on the vital rate regressions that included removal effects projected 1.4 to 3-fold increases in equilibrium population size relative to the baseline model (no removal effects). However, we found no evidence of higher response to removal in quadrats with higher pretreatment cover of A. tripartita, or by plants experiencing higher pre-treatment crowding by A. tripartita, raising questions about the mechanisms driving the positive response to removal. 
While our results show the value of combining observations with a simple removal experiment, more tightly controlled experiments focused on underlying mechanisms may be required to conclusively validate or reject predictions from phenomenological models. This article is protected by copyright. All rights reserved.
Physiology-Based Modeling May Predict Surgical Treatment Outcome for Obstructive Sleep Apnea

PubMed Central

Li, Yanru; Ye, Jingying; Han, Demin; Cao, Xin; Ding, Xiu; Zhang, Yuhuan; Xu, Wen; Orr, Jeremy; Jen, Rachel; Sands, Scott; Malhotra, Atul; Owens, Robert

2017-01-01

Study Objectives: To test whether the integration of both anatomical and nonanatomical parameters (ventilatory control, arousal threshold, muscle responsiveness) in a physiology-based model will improve the ability to predict outcomes after upper airway surgery for obstructive sleep apnea (OSA). Methods: In 31 patients who underwent upper airway surgery for OSA, loop gain and arousal threshold were calculated from preoperative polysomnography (PSG). Three models were compared: (1) a multiple regression based on an extensive list of PSG parameters alone; (2) a multivariate regression using PSG parameters plus PSG-derived estimates of loop gain, arousal threshold, and other trait surrogates; (3) a physiological model incorporating selected variables as surrogates of anatomical and nonanatomical traits important for OSA pathogenesis. Results: Although preoperative loop gain was positively correlated with postoperative apnea-hypopnea index (AHI) (P = .008) and arousal threshold was negatively correlated (P = .011), in both model 1 and 2, the only significant variable was preoperative AHI, which explained 42% of the variance in postoperative AHI. In contrast, the physiological model (model 3), which included AHIREM (anatomy term), fraction of events that were hypopnea (arousal term), the ratio of AHIREM and AHINREM (muscle responsiveness term), loop gain, and central/mixed apnea index (control of breathing terms), was able to explain 61% of the variance in postoperative AHI. Conclusions: Although loop gain and arousal threshold are associated with residual AHI after surgery, only preoperative AHI was predictive using multivariate regression modeling. Instead, incorporating selected surrogates of physiological traits on the basis of OSA pathophysiology created a model that has more association with actual residual AHI. Commentary: A commentary on this article appears in this issue on page 1023. 
Clinical Trial Registration: ClinicalTrials.Gov; Title: The Impact of Sleep Apnea Treatment on Physiology Traits in Chinese Patients With Obstructive Sleep Apnea; Identifier: NCT02696629; URL: https://clinicaltrials.gov/show/NCT02696629 Citation: Li Y, Ye J, Han D, Cao X, Ding X, Zhang Y, Xu W, Orr J, Jen R, Sands S, Malhotra A, Owens R. Physiology-based modeling may predict surgical treatment outcome for obstructive sleep apnea. J Clin Sleep Med. 2017;13(9):1029–1037. PMID:28818154
Modelling typhoid risk in Dhaka Metropolitan Area of Bangladesh: the role of socio-economic and environmental factors

PubMed Central

2013-01-01

Background Developing countries in South Asia, such as Bangladesh, bear a disproportionate burden of diarrhoeal diseases such as Cholera, Typhoid and Paratyphoid. These seem to be aggravated by a number of social and environmental factors such as lack of access to safe drinking water, overcrowding and poor hygiene brought about by poverty. Some socioeconomic data can be obtained from census data whilst others are more difficult to elucidate. This study considers a range of both census data and spatial data from other sources, including remote sensing, as potential predictors of typhoid risk. Typhoid data are aggregated from hospital admission records for the period from 2005 to 2009. The spatial and statistical structures of the data are analysed and Principal Axis Factoring is used to reduce the degree of co-linearity in the data. The resulting factors are combined into a Quality of Life index, which in turn is used in a regression model of typhoid occurrence and risk. Results The three Principal Factors used together explain 87% of the variance in the initial candidate predictors, which qualifies them for use as a set of uncorrelated explanatory variables in a linear regression model. Initial regression results using Ordinary Least Squares (OLS) were disappointing, which was explained by analysis of the spatial autocorrelation inherent in the Principal Factors. The use of Geographically Weighted Regression caused a considerable increase in the predictive power of regressions based on these factors. The best prediction, as determined by the Akaike Information Criterion (AIC), was found when the three factors were combined into a quality of life index, using a method previously published by others, and had a coefficient of determination of 73%.
Conclusions The typhoid occurrence/risk prediction equation was used to develop the first risk map showing areas of Dhaka Metropolitan Area whose inhabitants are at greater or lesser risk of typhoid infection. This, coupled with seasonal information on typhoid incidence also reported in this paper, has the potential to advise public health professionals on developing prevention strategies such as targeted vaccination. PMID:23497202
Is the Critical Shields Stress for Incipient Sediment Motion Dependent on Bed Slope in Natural Channels? No.

NASA Astrophysics Data System (ADS)

Phillips, C. B.; Jerolmack, D. J.

2017-12-01

Understanding when coarse sediment begins to move in a river is essential for linking rivers to the evolution of mountainous landscapes. Unfortunately, the threshold of surface particle motion is notoriously difficult to measure in the field. However, recent studies have shown that the threshold of surface motion is empirically correlated with channel slope, a property that is easy to measure and readily available from the literature. These studies have thoroughly examined the mechanistic underpinnings behind the observed correlation and produced suitably complex models. These models are difficult to implement for natural rivers using widely available data, and thus others have treated the empirical regression between slope and the threshold of motion as a predictive model. We note that none of the authors of the original studies exploring this correlation suggested their empirical regressions be used in a predictive fashion; nevertheless, these regressions between slope and the threshold of motion have found their way into numerous recent studies, engendering potentially spurious conclusions. We demonstrate that there are two significant problems with using these empirical equations for prediction: (1) the empirical regressions are based on a limited sampling of the phase space of bed-load rivers and (2) the empirical measurements of bankfull and critical shear stresses are paired. The upshot is that the first problem limits the empirical relations' predictive capacity to field sites drawn from the same region of the bed-load river phase space, and that the paired nature of the data introduces a spurious correlation when considering the ratio of bankfull to critical shear stress. Using a large compilation of bed-load river hydraulic geometry data, we demonstrate that the variation within independently measured values of the threshold of motion changes systematically with bankfull Shields stress and not channel slope.
Additionally, using several recent datasets, we highlight the potential pitfalls that one can encounter when using simplistic empirical regressions to predict the threshold of motion, showing that while these concerns could be construed as subtle, the resulting implications can be substantial.
Risk factors for displaced abomasum or ketosis in Swedish dairy herds.

PubMed

Stengärde, L; Hultgren, J; Tråvén, M; Holtenius, K; Emanuelson, U

2012-03-01

Risk factors associated with high or low long-term incidence of displaced abomasum (DA) or clinical ketosis were studied in 60 Swedish dairy herds, using multivariable logistic regression modelling. Forty high-incidence herds were included as cases and 20 low-incidence herds as controls. Incidence rates were calculated based on veterinary records of clinical diagnoses. During the 3-year period preceding the herd classification, herds with a high incidence had a disease incidence of DA or clinical ketosis above the 3rd quartile in a national database for disease recordings. Control herds had no cows with DA or clinical ketosis. All herds were visited during the housing period and herdsmen were interviewed about management routines, housing, feeding, milk yield, and herd health. Target groups were heifers in late gestation, dry cows, and cows in early lactation. Univariable logistic regression was used to screen for factors associated with being a high-incidence herd. A multivariable logistic regression model was built using stepwise regression. A higher maximum daily milk yield in multiparous cows and a large herd size (p=0.054 and p=0.066, respectively) tended to be associated with being a high-incidence herd. Not cleaning the heifer feeding platform daily increased the odds of having a high-incidence herd twelvefold (p<0.01). Keeping cows in only one group in the dry period increased the odds of having a high incidence herd eightfold (p=0.03). Herd size was confounded with housing system. Housing system was therefore added to the final logistic regression model. In conclusion, a large herd size, a high maximum daily milk yield, keeping dry cows in one group, and not cleaning the feeding platform daily appear to be important risk factors for a high incidence of DA or clinical ketosis in Swedish dairy herds. 
These results confirm the importance of housing, management and feeding in the prevention of metabolic disorders in dairy cows around parturition and in early lactation. Copyright © 2011 Elsevier B.V. All rights reserved.
Relative Contributions of Agricultural Drift, Para-Occupational, and Residential Use Exposure Pathways to House Dust Pesticide Concentrations: Meta-Regression of Published Data

PubMed Central

Deziel, Nicole C.; Freeman, Laura E. Beane; Graubard, Barry I.; Jones, Rena R.; Hoppin, Jane A.; Thomas, Kent; Hines, Cynthia J.; Blair, Aaron; Sandler, Dale P.; Chen, Honglei; Lubin, Jay H.; Andreotti, Gabriella; Alavanja, Michael C. R.; Friesen, Melissa C.

2016-01-01

Background: Increased pesticide concentrations in house dust in agricultural areas have been attributed to several exposure pathways, including agricultural drift, para-occupational, and residential use. Objective: To guide future exposure assessment efforts, we quantified relative contributions of these pathways using meta-regression models of published data on dust pesticide concentrations. Methods: From studies in North American agricultural areas published from 1995 to 2015, we abstracted dust pesticide concentrations reported as summary statistics [e.g., geometric means (GM)]. We analyzed these data using mixed-effects meta-regression models that weighted each summary statistic by its inverse variance. Dependent variables were either the log-transformed GM (drift) or the log-transformed ratio of GMs from two groups (para-occupational, residential use). Results: For the drift pathway, predicted GMs decreased sharply and nonlinearly, with GMs 64% lower in homes 250 m versus 23 m from fields (interquartile range of published data) based on 52 statistics from seven studies. For the para-occupational pathway, GMs were 2.3 times higher [95% confidence interval (CI): 1.5, 3.3; 15 statistics, five studies] in homes of farmers who applied pesticides more recently or frequently versus less recently or frequently. For the residential use pathway, GMs were 1.3 (95% CI: 1.1, 1.4) and 1.5 (95% CI: 1.2, 1.9) times higher in treated versus untreated homes, when the probability that a pesticide was used for the pest treatment was 1–19% and ≥ 20%, respectively (88 statistics, five studies). Conclusion: Our quantification of the relative contributions of pesticide exposure pathways in agricultural populations could improve exposure assessments in epidemiologic studies. The meta-regression models can be updated when additional data become available. 
Citation: Deziel NC, Beane Freeman LE, Graubard BI, Jones RR, Hoppin JA, Thomas K, Hines CJ, Blair A, Sandler DP, Chen H, Lubin JH, Andreotti G, Alavanja MC, Friesen MC. 2017. Relative contributions of agricultural drift, para-occupational, and residential use exposure pathways to house dust pesticide concentrations: meta-regression of published data. Environ Health Perspect 125:296–305; http://dx.doi.org/10.1289/EHP426 PMID:27458779
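Inverse-variance weighting on the log scale, the building block named in the Methods, can be illustrated with a deliberately simplified sketch. It pools ratios of geometric means with fixed-effect weights only; the study itself fitted mixed-effects meta-regression models, which add between-study variance and covariates. The function name and the input values are hypothetical.

```python
import math


def pooled_ratio(ratios, log_variances):
    """Pool ratios on the log scale, weighting each by 1 / variance of its log."""
    weights = [1.0 / v for v in log_variances]
    log_ratios = [math.log(r) for r in ratios]
    pooled_log = sum(w * lr for w, lr in zip(weights, log_ratios)) / sum(weights)
    # Standard error of the pooled log ratio under the fixed-effect assumption
    se = math.sqrt(1.0 / sum(weights))
    ci = (math.exp(pooled_log - 1.96 * se), math.exp(pooled_log + 1.96 * se))
    return math.exp(pooled_log), ci


# Hypothetical GM ratios and variances of their logs, not taken from the paper
estimate, (low, high) = pooled_ratio([2.0, 2.6, 1.9], [0.10, 0.25, 0.15])
print(round(estimate, 2), round(low, 2), round(high, 2))
```

More precise summary statistics (smaller variance of the log ratio) receive more weight, which is why the pooled estimate sits closer to them than a simple average would.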
A Model Comparison for Count Data with a Positively Skewed Distribution with an Application to the Number of University Mathematics Courses Completed

ERIC Educational Resources Information Center

Liou, Pey-Yan

2009-01-01

The current study examines three regression models: OLS (ordinary least square) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…

Validation of prediction models: examining temporal and geographic stability of baseline risk and estimated covariate effects

PubMed Central

Austin, Peter C.; van Klaveren, David; Vergouwe, Yvonne; Nieboer, Daan; Lee, Douglas S.; Steyerberg, Ewout W.

2018-01-01

Background Stability in baseline risk and estimated predictor effects both geographically and temporally is a desirable property of clinical prediction models. However, this issue has received little attention in the methodological literature. Our objective was to examine methods for assessing temporal and geographic heterogeneity in baseline risk and predictor effects in prediction models. Methods We studied 14,857 patients hospitalized with heart failure at 90 hospitals in Ontario, Canada, in two time periods. We focussed on geographic and temporal variation in baseline risk (intercept) and predictor effects (regression coefficients) of the EFFECT-HF mortality model for predicting 1-year mortality in patients hospitalized for heart failure. We used random effects logistic regression models for the 14,857 patients. Results The baseline risk of mortality displayed moderate geographic variation, with the hospital-specific probability of 1-year mortality for a reference patient lying between 0.168 and 0.290 for 95% of hospitals. Furthermore, the odds of death were 11% lower in the second period than in the first period. However, we found minimal geographic or temporal variation in predictor effects. Among 11 tests of differences in time for predictor variables, only one had a modestly significant P value (0.03). Conclusions This study illustrates how temporal and geographic heterogeneity of prediction models can be assessed in settings with a large sample of patients from a large number of centers at different time periods. PMID:29350215
The use of artificial neural networks and multiple linear regression to predict rate of medical waste generation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jahandideh, Sepideh; Jahandideh, Samad; Asadabadi, Ebrahim Barzegari

2009-11-15

Predicting the amount of hospital waste produced helps in planning its storage, transportation and disposal. Accordingly, two predictive models, artificial neural networks (ANNs) and multiple linear regression (MLR), were applied to predict the rate of medical waste generation, both in total and by type (sharp, infectious and general). A 5-fold cross-validation procedure on a database of 50 hospitals in Fars province (Iran) was used to verify the performance of the models. Three performance measures, MAR, RMSE and R², were used to evaluate the models. As a conventional model, MLR obtained poor performance values; however, it identified hospital capacity and bed occupancy as the more significant parameters. ANNs, a more powerful model that had not previously been applied to predicting the rate of medical waste generation, showed high performance values, notably an R² of 0.99, confirming the good fit to the data. Such satisfactory results could be attributed to the non-linear nature of ANNs, which allows independent variables to be related to dependent ones non-linearly. In conclusion, the results show that the ANN-based approach is very promising and may play a useful role in developing a more cost-effective waste management strategy in the future.
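The 5-fold cross-validation with RMSE and R² described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the single predictor (bed count) and the names `fit_line` and `k_fold_scores` are hypothetical, and a one-variable least-squares line stands in for the study's MLR and ANN models.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_scores(xs, ys, k=5, seed=0):
    """Return (RMSE, R^2) computed on pooled out-of-fold predictions."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint index sets
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in held_out:  # predict only rows the model has not seen
            preds[i] = intercept + slope * xs[i]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.sqrt(ss_res / len(ys)), 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data: 50 "hospitals", waste roughly linear in beds.
rng = random.Random(1)
beds = [rng.uniform(50, 500) for _ in range(50)]
waste = [2.0 * b + rng.gauss(0, 20) for b in beds]
rmse, r2 = k_fold_scores(beds, waste)
print(f"out-of-fold RMSE = {rmse:.1f}, R^2 = {r2:.3f}")
```

Scoring only held-out predictions is what separates a cross-validated R² from an in-sample fit, which is worth keeping in mind when reading a value as high as 0.99.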
Breeding value accuracy estimates for growth traits using random regression and multi-trait models in Nelore cattle.

PubMed

Boligon, A A; Baldi, F; Mercadante, M E Z; Lobo, R B; Pereira, R J; Albuquerque, L G

2011-06-28

We quantified the potential increase in accuracy of expected breeding value for weights of Nelore cattle, from birth to mature age, using multi-trait and random regression models on Legendre polynomials and B-spline functions. A total of 87,712 weight records from 8144 females were used, recorded every three months from birth to mature age from the Nelore Brazil Program. For random regression analyses, all female weight records from birth to eight years of age (data set I) were considered. From this general data set, a subset was created (data set II), which included only nine weight records: at birth, weaning, 365 and 550 days of age, and 2, 3, 4, 5, and 6 years of age. Data set II was analyzed using random regression and multi-trait models. The model of analysis included the contemporary group as fixed effects and age of dam as a linear and quadratic covariable. In the random regression analyses, average growth trends were modeled using a cubic regression on orthogonal polynomials of age. Residual variances were modeled by a step function with five classes. Legendre polynomials of fourth and sixth order were utilized to model the direct genetic and animal permanent environmental effects, respectively, while third-order Legendre polynomials were considered for maternal genetic and maternal permanent environmental effects. Quadratic polynomials were applied to model all random effects in random regression models on B-spline functions. Direct genetic and animal permanent environmental effects were modeled using three segments or five coefficients, and genetic maternal and maternal permanent environmental effects were modeled with one segment or three coefficients in the random regression models on B-spline functions. For both data sets (I and II), animals ranked differently according to expected breeding value obtained by random regression or multi-trait models. 
With random regression models, the highest gains in accuracy were obtained at ages with a low number of weight records. The results indicate that random regression models provide more accurate expected breeding values than the traditionally finite multi-trait models. Thus, higher genetic responses are expected for beef cattle growth traits by replacing a multi-trait model with random regression models for genetic evaluation. B-spline functions could be applied as an alternative to Legendre polynomials to model covariance functions for weights from birth to mature age.
Evaluating differential effects using regression interactions and regression mixture models

PubMed Central

Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

2015-01-01

Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results to using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

ERIC Educational Resources Information Center

Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

2015-01-01

Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
USE IT OR LOSE IT: EAT AND EXERCISE DURING RADIOTHERAPY OR CHEMORADIOTHERAPY FOR PHARYNGEAL CANCERS

PubMed Central

Hutcheson, Katherine A.; Bhayani, Mihir K.; Beadle, Beth M.; Gold, Kathryn A.; Shinn, Eileen H.; Lai, Stephen Y.; Lewin, Jan

2014-01-01

Objective Proactive swallowing therapy promotes ongoing use of the swallowing mechanism during radiotherapy through 2 goals: eat and exercise. The purpose of this study was to evaluate the independent effects of maintaining oral intake throughout treatment and preventive swallowing exercise. Design Retrospective observational study. Setting The University of Texas MD Anderson Cancer Center, Houston. Patients The study included 497 patients treated with definitive radiotherapy (RT) or chemoradiation (CRT) for pharyngeal cancer (458 oropharynx, 39 hypopharynx) between 2002 and 2008. Main Outcome Measures Swallowing-related endpoints were: final diet after RT/CRT and length of gastrostomy-dependence. Primary independent variables included per oral (PO) status at the end of RT/CRT (nothing per oral [NPO], partial PO, 100% PO) and swallowing exercise adherence. Multiple linear regression and ordered logistic regression models were analyzed. Results At the conclusion of RT/CRT, 131 (26%) were NPO, 74% were PO (167 [34%] partial, 199 [40%] full). Fifty-eight percent (286/497) reported adherence to swallowing exercises. Maintenance of PO intake during RT/CRT and swallowing exercise adherence were independently associated (p<0.05) with better long-term diet after RT/CRT and shorter length of gastrostomy dependence in models adjusted for tumor and treatment burden. Conclusions Data indicate independent, positive associations between maintenance of PO intake throughout RT/CRT and swallowing exercise adherence with long-term swallowing outcomes. Patients who either eat or exercise fare better than those who do neither. Patients who both eat and exercise have the highest return to a regular diet and shortest gastrostomy dependence. PMID:24051544
Incorrect likelihood methods were used to infer scaling laws of marine predator search behaviour.

PubMed

Edwards, Andrew M; Freeman, Mervyn P; Breed, Greg A; Jonsen, Ian D

2012-01-01

Ecologists are collecting extensive data concerning movements of animals in marine ecosystems. Such data need to be analysed with valid statistical methods to yield meaningful conclusions. We demonstrate methodological issues in two recent studies that reached similar conclusions concerning movements of marine animals (Nature 451:1098; Science 332:1551). The first study analysed vertical movement data to conclude that diverse marine predators (Atlantic cod, basking sharks, bigeye tuna, leatherback turtles and Magellanic penguins) exhibited "Lévy-walk-like behaviour", close to a hypothesised optimal foraging strategy. By reproducing the original results for the bigeye tuna data, we show that the likelihood of tested models was calculated from residuals of regression fits (an incorrect method), rather than from the likelihood equations of the actual probability distributions being tested. This resulted in erroneous Akaike Information Criteria, and the testing of models that do not correspond to valid probability distributions. We demonstrate how this led to overwhelming support for a model that has no biological justification and that is statistically spurious because its probability density function goes negative. Re-analysis of the bigeye tuna data, using standard likelihood methods, overturns the original result and conclusion for that data set. The second study observed Lévy walk movement patterns by mussels. We demonstrate several issues concerning the likelihood calculations (including the aforementioned residuals issue). Re-analysis of the data rejects the original Lévy walk conclusion. We consequently question the claimed existence of scaling laws of the search behaviour of marine predators and mussels, since such conclusions were reached using incorrect methods. 
We discourage the suggested potential use of "Lévy-like walks" when modelling consequences of fishing and climate change, and caution that any resulting advice to managers of marine ecosystems would be problematic. For reproducibility and future work we provide R source code for all calculations.
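To illustrate the point about computing likelihoods from the probability distributions themselves rather than from regression residuals, here is a minimal sketch. It is not the authors' R code: the two candidate models (a shifted exponential and an unbounded power law), the function names, and the synthetic step lengths are all assumptions made for this example. Each model has one fitted parameter, estimated by its closed-form maximum-likelihood formula.

```python
import math
import random


def aic_exponential(data, x_min):
    """AIC of a shifted exponential on [x_min, inf), rate fitted by MLE."""
    n = len(data)
    excess = sum(x - x_min for x in data)
    rate = n / excess                       # MLE of the rate
    log_lik = n * math.log(rate) - rate * excess
    return 2 * 1 - 2 * log_lik              # one free parameter


def aic_power_law(data, x_min):
    """AIC of a power law f(x) ~ x^(-mu) on [x_min, inf), mu fitted by MLE."""
    n = len(data)
    log_ratio_sum = sum(math.log(x / x_min) for x in data)
    mu = 1 + n / log_ratio_sum              # MLE of the exponent
    log_lik = (n * math.log(mu - 1)
               + n * (mu - 1) * math.log(x_min)
               - mu * sum(math.log(x) for x in data))
    return 2 * 1 - 2 * log_lik              # one free parameter


# Synthetic step lengths drawn from an exponential, so that model should win.
rng = random.Random(42)
steps = [1.0 + rng.expovariate(0.5) for _ in range(2000)]
aic_exp = aic_exponential(steps, 1.0)
aic_pl = aic_power_law(steps, 1.0)
print(f"AIC exponential = {aic_exp:.1f}, AIC power law = {aic_pl:.1f}")
```

Because both log-likelihoods come from properly normalised densities on the same support, the AIC values are directly comparable; that comparability is what is lost when a "likelihood" is derived from the residuals of a regression fit.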
Modeling absolute differences in life expectancy with a censored skew-normal regression approach

PubMed Central

Clough-Gorr, Kerri; Zwahlen, Marcel

2015-01-01

Parameter estimates from commonly used multivariable parametric survival regression models do not directly quantify differences in years of life expectancy. Gaussian linear regression models give results in terms of absolute mean differences, but are not appropriate for modeling life expectancy, because in many situations time to death has a negatively skewed distribution. A regression approach using a skew-normal distribution would be an alternative to parametric survival models in the modeling of life expectancy, because parameter estimates can be interpreted in terms of survival time differences while allowing for skewness of the distribution. In this paper we show how to use skew-normal regression so that censored and left-truncated observations are accounted for. With this we model differences in life expectancy using data from the Swiss National Cohort Study and from official life expectancy estimates and compare the results with those derived from commonly used survival regression models. We conclude that a censored skew-normal survival regression approach for left-truncated observations can be used to model differences in life expectancy across covariates of interest. PMID:26339544
Prenatal Phthalate, Perfluoroalkyl Acid, and Organochlorine Exposures and Term Birth Weight in Three Birth Cohorts: Multi-Pollutant Models Based on Elastic Net Regression

PubMed Central

Lenters, Virissa; Portengen, Lützen; Rignell-Hydbom, Anna; Jönsson, Bo A.G.; Lindh, Christian H.; Piersma, Aldert H.; Toft, Gunnar; Bonde, Jens Peter; Heederik, Dick; Rylander, Lars; Vermeulen, Roel

2015-01-01

Background Some legacy and emerging environmental contaminants are suspected risk factors for intrauterine growth restriction. However, the evidence is equivocal, in part due to difficulties in disentangling the effects of mixtures. Objectives We assessed associations between multiple correlated biomarkers of environmental exposure and birth weight. Methods We evaluated a cohort of 1,250 term (≥ 37 weeks gestation) singleton infants, born to 513 mothers from Greenland, 180 from Poland, and 557 from Ukraine, who were recruited during antenatal care visits in 2002‒2004. Secondary metabolites of diethylhexyl and diisononyl phthalates (DEHP, DiNP), eight perfluoroalkyl acids, and organochlorines (PCB-153 and p,p´-DDE) were quantifiable in 72‒100% of maternal serum samples. We assessed associations between exposures and term birth weight, adjusting for co-exposures and covariates, including prepregnancy body mass index. To identify independent associations, we applied the elastic net penalty to linear regression models. Results Two phthalate metabolites (MEHHP, MOiNP), perfluorooctanoic acid (PFOA), and p,p´-DDE were most consistently predictive of term birth weight based on elastic net penalty regression. In an adjusted, unpenalized regression model of the four exposures, 2-SD increases in natural log–transformed MEHHP, PFOA, and p,p´-DDE were associated with lower birth weight: –87 g (95% CI: –137, –36 per 1.70 ng/mL), –43 g (95% CI: –108, 23 per 1.18 ng/mL), and –135 g (95% CI: –192, –78 per 1.82 ng/g lipid), respectively; and MOiNP was associated with higher birth weight (46 g; 95% CI: –5, 97 per 2.22 ng/mL). Conclusions This study suggests that several of the environmental contaminants, belonging to three chemical classes, may be independently associated with impaired fetal growth. These results warrant follow-up in other cohorts.
Citation Lenters V, Portengen L, Rignell-Hydbom A, Jönsson BA, Lindh CH, Piersma AH, Toft G, Bonde JP, Heederik D, Rylander L, Vermeulen R. 2016. Prenatal phthalate, perfluoroalkyl acid, and organochlorine exposures and term birth weight in three birth cohorts: multi-pollutant models based on elastic net regression. Environ Health Perspect 124:365–372; http://dx.doi.org/10.1289/ehp.1408933 PMID:26115335
Alcohol Misuse and Psychological Resilience among U.S. Iraq and Afghanistan Era Veteran Military Personnel

PubMed Central

Green, Kimberly T.; Beckham, Jean C.; Youssef, Nagy; Elbogen, Eric B.

2013-01-01

Objective The present study sought to investigate the longitudinal effects of psychological resilience against alcohol misuse adjusting for socio-demographic factors, trauma-related variables, and self-reported history of alcohol abuse. Methodology Data were from National Post-Deployment Adjustment Study (NPDAS) participants who completed both a baseline and one-year follow-up survey (N=1090). Survey questionnaires measured combat exposure, probable posttraumatic stress disorder (PTSD), psychological resilience, and alcohol misuse, all of which were measured at two discrete time periods (baseline and one-year follow-up). Baseline resilience and change in resilience (increased or decreased) were utilized as independent variables in separate models evaluating alcohol misuse at the one-year follow-up. Results Multiple linear regression analyses controlled for age, gender, level of educational attainment, combat exposure, PTSD symptom severity, and self-reported alcohol abuse. Accounting for these covariates, findings revealed that lower baseline resilience, younger age, male gender, and self-reported alcohol abuse were related to alcohol misuse at the one-year follow-up. A separate regression analysis, adjusting for the same covariates, revealed a relationship between change in resilience (from baseline to the one-year follow-up) and alcohol misuse at the one-year follow-up. The regression model evaluating these variables in a subset of the sample in which all the participants had been deployed to Iraq and/or Afghanistan was consistent with findings involving the overall era sample. Finally, logistic regression analyses of the one-year follow-up data yielded similar results to the baseline and resilience change models. Conclusions These findings suggest that increased psychological resilience is inversely related to alcohol misuse and is protective against alcohol misuse over time. 
Additionally, it supports the conceptualization of resilience as a process which evolves over time. Moreover, our results underscore the importance of assessing resilience as part of alcohol use screening for preventing alcohol misuse in Iraq and Afghanistan era military veterans. PMID:24090625
SU-E-J-256: Predicting Metastasis-Free Survival of Rectal Cancer Patients Treated with Neoadjuvant Chemo-Radiotherapy by Data-Mining of CT Texture Features of Primary Lesions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhong, H; Wang, J; Shen, L

Purpose: To investigate the relationship between computed tomographic (CT) texture features of primary lesions and metastasis-free survival in rectal cancer patients, and to develop a data-mining prediction model using texture features. Methods: A total of 220 rectal cancer patients treated with neoadjuvant chemo-radiotherapy (CRT) were enrolled in this study. All patients underwent CT scans before CRT. The primary lesions on the CT images were delineated by two experienced oncologists. The CT images were filtered by Laplacian of Gaussian (LoG) filters with different filter values (1.0–2.5: from fine to coarse). Both filtered and unfiltered images were analyzed using Gray-level Co-occurrence Matrix (GLCM) texture analysis with different directions (transversal, sagittal, and coronal). In total, 270 texture features with different species, directions and filter values were extracted. Texture features were examined with Student’s t-test to select predictive features. Principal Component Analysis (PCA) was performed on the selected features to reduce feature collinearity. Artificial neural network (ANN) and logistic regression were applied to establish metastasis prediction models. Results: Forty-six of 220 patients developed metastasis with a follow-up time of more than 2 years. Sixty-seven texture features were significantly different in the t-test (p<0.05) between patients with and without metastasis, and 12 of them were extremely significant (p<0.001). The area under the curve (AUC) of the ANN was 0.72, and the concordance index (CI) of logistic regression was 0.71. The predictability of ANN was slightly better than that of logistic regression. Conclusion: CT texture features of primary lesions are related to metastasis-free survival of rectal cancer patients. Both ANN and logistic regression based models can be developed for prediction.
[Associations of the Employment Status during the First 2 Years Following Medical Rehabilitation and Long Term Occupational Trajectories: Implications for Outcome Measurement].

PubMed

Holstiege, J; Kaluscha, R; Jankowiak, S; Krischak, G

2017-02-01

Study Objectives: The aim was to investigate the predictive value of the employment status measured in the 6th, 12th, 18th and 24th month after medical rehabilitation for long-term employment trajectories during 4 years. Methods: A retrospective study was conducted based on a 20%-sample of all patients receiving inpatient rehabilitation funded by the German pension fund. Patients aged <62 years who were treated due to musculoskeletal, cardiovascular or psychosomatic disorders during the years 2002-2005 were included and followed for 4 consecutive years. The predictive value of the employment status in 4 predefined months after discharge (6th, 12th, 18th and 24th month), for the total number of months in employment in 4 years following rehabilitative treatment was analyzed using multiple linear regression. Per time point, separate regression analyses were conducted, including the employment status (employed vs. unemployed) at the respective point in time as explanatory variable, besides a standard set of additional prognostic variables. Results: A total of 252 591 patients were eligible for study inclusion. The level of explained variance of the regression models increased with the point in time used to measure the employment status, included as explanatory variable. Overall the R²-measure increased by 30% from the regression model that included the employment status in the 6th month (R²=0.60) to the model that included the work status in the 24th month (R²=0.78). Conclusion: The degree of accuracy in the prognosis of long-term employment biographies increases with the point in time used to measure employment in the first 2 years following rehabilitation. These findings should be taken into consideration for the predefinition of time points used to measure the employment status in future studies. © Georg Thieme Verlag KG Stuttgart · New York.
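The 30% figure in the abstract above appears to be a relative increase rather than a difference in percentage points. A quick check with the two R² values quoted there (the subscripts 6 and 24 are labels introduced here for the month of measurement):

```latex
\[
\frac{R^2_{24} - R^2_{6}}{R^2_{6}} = \frac{0.78 - 0.60}{0.60} = 0.30,
\qquad
R^2_{24} - R^2_{6} = 0.18 .
\]
```

In absolute terms, the model using the 24th-month status explains 18 percentage points more variance than the one using the 6th-month status.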
Evaluating risk factors for endemic human Salmonella Enteritidis infections with different phage types in Ontario, Canada using multinomial logistic regression and a case-case study approach

PubMed Central

2012-01-01

Background Identifying risk factors for Salmonella Enteritidis (SE) infections in Ontario will assist public health authorities to design effective control and prevention programs to reduce the burden of SE infections. Our research objective was to identify risk factors for acquiring SE infections with various phage types (PT) in Ontario, Canada. We hypothesized that certain PTs (e.g., PT8 and PT13a) have specific risk factors for infection. Methods Our study included endemic SE cases with various PTs whose isolates were submitted to the Public Health Laboratory-Toronto from January 20th to August 12th, 2011. Cases were interviewed using a standardized questionnaire that included questions pertaining to demographics, travel history, clinical symptoms, contact with animals, and food exposures. A multinomial logistic regression method using the Generalized Linear Latent and Mixed Model procedure and a case-case study design were used to identify risk factors for acquiring SE infections with various PTs in Ontario, Canada. In the multinomial logistic regression model, the outcome variable had three categories representing human infections caused by SE PT8, PT13a, and all other SE PTs (i.e., non-PT8/non-PT13a) as a referent category to which the other two categories were compared. Results In the multivariable model, SE PT8 was positively associated with contact with dogs (OR=2.17, 95% CI 1.01-4.68) and negatively associated with pepper consumption (OR=0.35, 95% CI 0.13-0.94), after adjusting for age categories and gender, and using exposure periods and health regions as random effects to account for clustering. Conclusions Our study findings offer interesting hypotheses about the role of phage type-specific risk factors. Multinomial logistic regression analysis and the case-case study approach are novel methodologies to evaluate associations among SE infections with different PTs and various risk factors. PMID:23057531
Construction and analysis of a modular model of caspase activation in apoptosis

PubMed Central

Harrington, Heather A; Ho, Kenneth L; Ghosh, Samik; Tung, KC

2008-01-01

Background A key physiological mechanism employed by multicellular organisms is apoptosis, or programmed cell death. Apoptosis is triggered by the activation of caspases in response to both extracellular (extrinsic) and intracellular (intrinsic) signals. The extrinsic and intrinsic pathways are characterized by the formation of the death-inducing signaling complex (DISC) and the apoptosome, respectively; both the DISC and the apoptosome are oligomers with complex formation dynamics. Additionally, the extrinsic and intrinsic pathways are coupled through the mitochondrial apoptosis-induced channel via the Bcl-2 family of proteins. Results A model of caspase activation is constructed and analyzed. The apoptosis signaling network is simplified through modularization methodologies and equilibrium abstractions for three functional modules. The mathematical model is composed of a system of ordinary differential equations which is numerically solved. Multiple linear regression analysis investigates the role of each module and reduced models are constructed to identify key contributions of the extrinsic and intrinsic pathways in triggering apoptosis for different cell lines. Conclusion Through linear regression techniques, we identified the feedbacks, dissociation of complexes, and negative regulators as the key components in apoptosis. The analysis and reduced models for our model formulation reveal that the chosen cell lines predominately exhibit strong extrinsic caspase, typical of type I cell, behavior. Furthermore, under the simplified model framework, the selected cell lines exhibit different modes by which caspase activation may occur. Finally the proposed modularized model of apoptosis may generalize behavior for additional cells and tissues, specifically identifying and predicting components responsible for the transition from type I to type II cell behavior. PMID:19077196
Error Covariance Penalized Regression: A novel multivariate model combining penalized regression with multivariate error structure.

PubMed

Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C

2018-06-29

A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
Analyzing Student Learning Outcomes: Usefulness of Logistic and Cox Regression Models. IR Applications, Volume 5

ERIC Educational Resources Information Center

Chen, Chau-Kuang

2005-01-01

Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…
A spatially filtered multilevel model to account for spatial dependency: application to self-rated health status in South Korea

PubMed Central

2014-01-01

Background This study aims to suggest an approach that integrates multilevel models and eigenvector spatial filtering methods and apply it to a case study of self-rated health status in South Korea. In many previous health-related studies, multilevel models and single-level spatial regression are used separately. However, the two methods should be used in conjunction because the objectives of both approaches are important in health-related analyses. The multilevel model enables the simultaneous analysis of both individual and neighborhood factors influencing health outcomes. However, the results of conventional multilevel models are potentially misleading when spatial dependency across neighborhoods exists. Spatial dependency in health-related data indicates that health outcomes in nearby neighborhoods are more similar to each other than those in distant neighborhoods. Spatial regression models can address this problem by modeling spatial dependency. This study explores the possibility of integrating a multilevel model and eigenvector spatial filtering, an advanced spatial regression for addressing spatial dependency in datasets. Methods In this spatially filtered multilevel model, eigenvectors function as additional explanatory variables accounting for unexplained spatial dependency within the neighborhood-level error. The specification addresses the inability of conventional multilevel models to account for spatial dependency, and thereby, generates more robust outputs. Results The findings show that sex, employment status, monthly household income, and perceived levels of stress are significantly associated with self-rated health status. Residents living in neighborhoods with low deprivation and a high doctor-to-resident ratio tend to report higher health status. 
The spatially filtered multilevel model provides unbiased estimations and improves the explanatory power of the model compared to conventional multilevel models although there are no changes in the signs of parameters and the significance levels between the two models in this case study. Conclusions The integrated approach proposed in this paper is a useful tool for understanding the geographical distribution of self-rated health status within a multilevel framework. In future research, it would be useful to apply the spatially filtered multilevel model to other datasets in order to clarify the differences between the two models. It is anticipated that this integrated method will also out-perform conventional models when it is used in other contexts. PMID:24571639
Robust geographically weighted regression of modeling the Air Polluter Standard Index (APSI)

NASA Astrophysics Data System (ADS)

Warsito, Budi; Yasin, Hasbi; Ispriyanti, Dwi; Hoyyi, Abdul

2018-05-01

The Geographically Weighted Regression (GWR) model has been widely applied in many practical fields to explore the spatial heterogeneity of a regression model. However, this method is inherently not robust to outliers. Outliers commonly exist in data sets and may lead to a distorted estimate of the underlying regression model. One solution for handling outliers in the regression model is to use robust models; the resulting model is called Robust Geographically Weighted Regression (RGWR). This research aims to aid the government in the policy-making process related to air pollution mitigation by developing a standard index model for air polluter (Air Polluter Standard Index - APSI) based on the RGWR approach. In this research, we also consider seven variables that are directly related to the air pollution level: the traffic velocity, the population density, the business center aspect, the air humidity, the wind velocity, the air temperature, and the area size of the urban forest. The best model is determined by the smallest AIC value. There are significant differences between the regression and RGWR models in this case, but basic GWR using the Gaussian kernel is the best model for APSI because it has the smallest AIC.
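The basic (non-robust) GWR with a Gaussian kernel mentioned in the abstract above amounts to a separate weighted least-squares fit at each location, with weights that decay with distance. The sketch below is an assumption-laden illustration, not the study's model: it uses one predictor instead of seven, a fixed hypothetical bandwidth, and made-up data in which the true slope grows from west to east. The function names are introduced here.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: weight 1 at the target, decaying with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(target, coords, xs, ys, bandwidth):
    """Weighted least squares of y on x centred on `target`.

    Returns the local (intercept, slope) at that location.
    """
    weights = [gaussian_weight(math.dist(target, c), bandwidth) for c in coords]
    w_sum = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / w_sum
    mean_y = sum(w * y for w, y in zip(weights, ys)) / w_sum
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical transect of 20 sites; the true slope rises from 1.0 to 2.9.
coords = [(float(i), 0.0) for i in range(20)]
xs = [float((3 * i) % 7 + 1) for i in range(20)]
ys = [(1.0 + 0.1 * i) * x for i, x in enumerate(xs)]

_, west_slope = local_fit(coords[0], coords, xs, ys, bandwidth=2.0)
_, east_slope = local_fit(coords[-1], coords, xs, ys, bandwidth=2.0)
print(f"local slope: west = {west_slope:.2f}, east = {east_slope:.2f}")
```

A single global regression would report one slope for the whole transect; the local fits recover the spatial drift. A robust variant such as RGWR would additionally down-weight observations with large residuals, which this sketch does not attempt.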
Effects of Climate Change on Salmonella Infections

PubMed Central

Akil, Luma; Reddy, Remata S.

2014-01-01

Background: Climate change and global warming have been reported to increase the spread of foodborne pathogens. To understand these effects on Salmonella infections, modeling approaches such as regression analysis and neural network (NN) were used. Methods: Monthly data for Salmonella outbreaks in Mississippi (MS), Tennessee (TN), and Alabama (AL) were analyzed from 2002 to 2011 using analysis of variance and time series analysis. Meteorological data were collected and the correlation with salmonellosis was examined using regression analysis and NN. Results: A seasonal trend in Salmonella infections was observed (p<0.001). Strong positive correlation was found between high temperature and Salmonella infections in MS and for the combined states (MS, TN, AL) models (R²=0.554; R²=0.415, respectively). NN models showed a strong effect of rise in temperature on the Salmonella outbreaks. In this study, an increase of 1°F was shown to result in an increase of four Salmonella cases in MS. However, no correlation between monthly average precipitation rate and Salmonella infections was observed. Conclusion: There is consistent evidence that gastrointestinal infection with bacterial pathogens is positively correlated with ambient temperature, as warmer temperatures enable more rapid replication. Warming trends in the United States and specifically in the southern states may increase rates of Salmonella infections. PMID:25496072
Estimating Building Age with 3D GIS

NASA Astrophysics Data System (ADS)

Biljecki, F.; Sindram, M.

2017-10-01

Building datasets (e.g. footprints in OpenStreetMap and 3D city models) are becoming increasingly available worldwide. However, the thematic (attribute) aspect is not always given attention, as many of such datasets are lacking in completeness of attributes. A prominent attribute of buildings is the year of construction, which is useful for some applications, but its availability may be scarce. This paper explores the potential of estimating the year of construction (or age) of buildings from other attributes using random forest regression. The developed method has a two-fold benefit: enriching datasets and quality control (verification of existing attributes). Experiments are carried out on a semantically rich LOD1 dataset of Rotterdam in the Netherlands using 9 attributes. The results are mixed: the accuracy in the estimation of building age depends on the available information used in the regression model. In the best scenario we have achieved predictions with an RMSE of 11 years, but in more realistic situations with limited knowledge about buildings the error is much larger (RMSE = 26 years). Hence the main conclusion of the paper is that inferring building age with 3D city models is possible to a certain extent because it reveals the approximate period of construction, but precise estimations remain a difficult task.

Trend analysis of salt load and evaluation of the frequency of water-quality measurements for the Gunnison, the Colorado, and the Dolores rivers in Colorado and Utah

USGS Publications Warehouse

Kircher, J.E.; Dinicola, Richard S.; Middelburg, R.F.

1984-01-01

Monthly values were computed for water-quality constituents at four streamflow gaging stations in the Upper Colorado River basin for the determination of trends. Seasonal regression and seasonal Kendall trend analysis techniques were applied to two monthly data sets at each station site for four different time periods. A recently developed method for determining optimal water-discharge data-collection frequency was also applied to the monthly water-quality data. Trend analysis results varied with each monthly load computational method, period of record, and trend detection model used. No conclusions could be reached regarding which computational method was best to use in trend analysis. Time-period selection for analysis was found to be important with regard to intended use of the results. Seasonal Kendall procedures were found to be applicable to most data sets. Seasonal regression models were more difficult to apply and were sometimes of questionable validity; however, those results were more informative than seasonal Kendall results. The best model to use depends upon the characteristics of the data and the amount of trend information needed. The measurement-frequency optimization method had potential for application to water-quality data, but refinements are needed. (USGS)
Bayesian Unimodal Density Regression for Causal Inference

ERIC Educational Resources Information Center

Karabatsos, George; Walker, Stephen G.

2011-01-01

Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

ERIC Educational Resources Information Center

Culpepper, Steven Andrew; Park, Trevor

2017-01-01

A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
A simple approach to power and sample size calculations in logistic regression and Cox regression models.

PubMed

Vaeth, Michael; Skovlund, Eva

2004-06-15

For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
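For logistic regression, the construction described in this abstract might be sketched as below. The sketch takes two equal groups whose log-odds differ by the slope times twice the standard deviation of the covariate, and keeps the average event probability equal to the overall response probability. The function names, the bisection solver, and the particular two-proportion sample size formula are assumptions made for illustration, not the authors' implementation.

```python
import math
from statistics import NormalDist

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def equivalent_two_sample(slope, sd_x, overall_p):
    """Return (p1, p2) for two equal groups whose logits differ by 2*slope*sd_x
    and whose mean equals overall_p (so the expected event count is unchanged)."""
    delta = 2.0 * slope * sd_x
    lo, hi = -30.0, 30.0
    for _ in range(200):  # bisection on the midpoint logit
        mid = (lo + hi) / 2.0
        mean_p = 0.5 * (expit(mid - delta / 2) + expit(mid + delta / 2))
        if mean_p < overall_p:
            lo = mid
        else:
            hi = mid
    centre = (lo + hi) / 2.0
    return expit(centre - delta / 2), expit(centre + delta / 2)

def total_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Total n from a standard two-proportion formula (assumed choice)."""
    if p1 == p2:
        raise ValueError("no difference to detect")
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2.0
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    n_per_group = (num / (p1 - p2)) ** 2
    return 2 * math.ceil(n_per_group)

# Hypothetical inputs: log-odds slope 0.5 per unit, covariate SD 1.0,
# overall response probability 0.3.
p1, p2 = equivalent_two_sample(slope=0.5, sd_x=1.0, overall_p=0.3)
print(p1, p2, total_sample_size(p1, p2))
```

The same idea would carry over to Cox regression with hazard parameters in place of log-odds, which is not shown here.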
Coping Styles in Heart Failure Patients with Depressive Symptoms

PubMed Central

Trivedi, Ranak B.; Blumenthal, James A.; O'Connor, Christopher; Adams, Kirkwood; Hinderliter, Alan; Sueta-Dupree, Carla; Johnson, Kristy; Sherwood, Andrew

2009-01-01

Objective Elevated depressive symptoms have been linked to poorer prognosis in heart failure (HF) patients. Our objective was to identify coping styles associated with depressive symptoms in HF patients. Methods 222 stable HF patients (32.75% female, 45.4% non-Hispanic Black) completed multiple questionnaires. Beck Depression Inventory (BDI) assessed depressive symptoms, Life Orientation Test (LOT-R) assessed optimism, ENRICHD Social Support Inventory (ESSI) and Perceived Social Support Scale (PSSS) assessed social support, and COPE assessed coping styles. Linear regression analyses were employed to assess the association of coping styles with continuous BDI scores. Logistic regression analyses were performed using BDI scores dichotomized into BDI<10 versus BDI≥10, to identify coping styles accompanying clinically significant depressive symptoms. Results In linear regression models, higher BDI scores were associated with lower scores on the acceptance (β=-.14), humor (β=-.15), planning (β=-.15), and emotional support (β=-.14) subscales of the COPE, and higher scores on the behavioral disengagement (β=.41), denial (β=.33), venting (β=.25), and mental disengagement (β=.22) subscales. Higher PSSS and ESSI scores were associated with lower BDI scores (β=-.32 and -.25, respectively). Higher LOT-R scores were associated with higher BDI scores (β=.39, p<.001). In logistic regression models, BDI≥10 was associated with greater likelihood of behavioral disengagement (OR=1.3), denial (OR=1.2), mental disengagement (OR=1.3), venting (OR=1.2), and pessimism (OR=1.2), and lower perceived social support measured by PSSS (OR=.92) and ESSI (OR=.92). Conclusion Depressive symptoms in HF patients are associated with avoidant coping, lower perceived social support, and pessimism. Results raise the possibility that interventions designed to improve coping may reduce depressive symptoms. PMID:19773027
Marital status and survival of patients with oral cavity squamous cell carcinoma: a population-based study

PubMed Central

Shi, Xiao; Zhang, Ting-ting; Hu, Wei-ping; Ji, Qing-hai

2017-01-01

Background The relationship between marital status and oral cavity squamous cell carcinoma (OCSCC) survival has not been explored. The objective of our study was to evaluate the impact of marital status on OCSCC survival and investigate the potential mechanisms. Results Married patients had better 5-year cancer-specific survival (CSS) (66.7% vs 54.9%) and 5-year overall survival (OS) (56.0% vs 41.1%). In multivariate Cox regression models, unmarried patients also showed higher mortality risk for both CSS (Hazard Ratio [HR]: 1.260, 95% confidence interval (CI): 1.187–1.339, P < 0.001) and OS (HR: 1.328, 95% CI: 1.266–1.392, P < 0.001). Multivariate logistic regression showed married patients were more likely to be diagnosed at earlier stage (P < 0.001) and receive surgery (P < 0.001). Married patients still demonstrated better prognosis in the 1:1 matched group analysis (CSS: 62.9% vs 60.8%, OS: 52.3% vs 46.5%). Materials and Methods 11022 eligible OCSCC patients were identified from Surveillance, Epidemiology, and End Results (SEER) database, including 5902 married and 5120 unmarried individuals. Kaplan-Meier analysis, Log-rank test and Cox proportional hazards regression model were used to analyze survival and mortality risk. Influence of marital status on stage, age at diagnosis and selection of treatment was determined by binomial and multinomial logistic regression. Propensity score matching method was adopted to perform a 1:1 matched cohort. Conclusions Marriage has an independently protective effect on OCSCC survival. Earlier diagnosis and more sufficient treatment are possible explanations. Besides, even after 1:1 matching, survival advantage of married group still exists, indicating that spousal support from other aspects may also play an important role. PMID:28415710
Estimating and Predicting Metal Concentration Using Online Turbidity Values and Water Quality Models in Two Rivers of the Taihu Basin, Eastern China

PubMed Central

Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin

2016-01-01

Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R² = 0.86–0.93 for 72 data sets collected in the industrial river and R² = 0.60–0.85 for 60 data sets collected in the cleaner river. All the regressions showed a good linear relationship, leading to the conclusion that the occurrence of the five metals is directly related to suspended solids, and that these metal concentrations could be approximated using these regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fate of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process of metals’ concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density. PMID:27028017
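The workflow in this abstract (fit a line of metal concentration against turbidity, report R², then estimate concentration from a new turbidity reading and check the relative error) can be sketched in a few lines. The data values, variable names, and units below are hypothetical and serve only to illustrate the calculation; they are not from the study.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

def relative_error(estimated, measured):
    """Absolute relative error of an estimate against a measured value."""
    return abs(estimated - measured) / measured

# Hypothetical calibration data: turbidity readings and metal concentrations.
turbidity = [10.0, 20.0, 30.0, 40.0, 50.0]
metal = [1.1, 1.9, 3.2, 3.9, 5.1]

slope, intercept, r2 = fit_line(turbidity, metal)

# Estimate concentration from a new online turbidity reading (hypothetical),
# then compare with a hypothetical laboratory measurement.
estimate = intercept + slope * 35.0
print(slope, intercept, r2, relative_error(estimate, measured=3.3))
```

In practice one such regression would be fitted per metal and per river, since the abstract reports separate R² ranges for the two rivers.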
The effectiveness of manual and mechanical instrumentation for the retreatment of three different root canal filling materials.

PubMed

Somma, Francesco; Cammarota, Giuseppe; Plotino, Gianluca; Grande, Nicola M; Pameijer, Cornelis H

2008-04-01

The aim of this study was to compare the effectiveness of the Mtwo R (Sweden & Martina, Padova, Italy), ProTaper retreatment files (Dentsply-Maillefer, Ballaigues, Switzerland), and a Hedström manual technique in the removal of three different filling materials (gutta-percha, Resilon [Resilon Research LLC, Madison, CT], and EndoRez [Ultradent Products Inc, South Jordan, UT]) during retreatment. Ninety single-rooted straight premolars were instrumented and randomly divided into 9 groups of 10 teeth each (n = 10) with regards to filling material and instrument used. For all roots, the following data were recorded: procedural errors, time of retreatment, apically extruded material, canal wall cleanliness through optical stereomicroscopy (OSM), and scanning electron microscopy (SEM). A linear regression analysis and three logistic regression analyses were performed to assess the level of significance set at p = 0.05. The results indicated that the overall regression models were statistically significant. The Mtwo R, ProTaper retreatment files, and Resilon filling material had a positive impact in reducing the time for retreatment. Both ProTaper retreatment files and Mtwo R showed a greater extrusion of debris. For both OSM and SEM logistic regression models, the root canal apical third had the greatest impact on the score values. EndoRez filling material resulted in cleaner root canal walls using OSM analysis, whereas Resilon filling material and both engine-driven NiTi rotary techniques resulted in less clean root canal walls according to SEM analysis. In conclusion, all instruments left remnants of filling material and debris on the root canal walls irrespective of the root filling material used. Both the engine-driven NiTi rotary systems proved to be safe and fast devices for the removal of endodontic filling material.
Estimating and Predicting Metal Concentration Using Online Turbidity Values and Water Quality Models in Two Rivers of the Taihu Basin, Eastern China.

PubMed

Yao, Hong; Zhuang, Wei; Qian, Yu; Xia, Bisheng; Yang, Yang; Qian, Xin

2016-01-01

Turbidity (T) has been widely used to detect the occurrence of pollutants in surface water. Using data collected from January 2013 to June 2014 at eleven sites along two rivers feeding the Taihu Basin, China, the relationship between the concentration of five metals (aluminum (Al), titanium (Ti), nickel (Ni), vanadium (V), lead (Pb)) and turbidity was investigated. Metal concentration was determined using inductively coupled plasma mass spectrometry (ICP-MS). The linear regression of metal concentration and turbidity provided a good fit, with R² = 0.86-0.93 for 72 data sets collected in the industrial river and R² = 0.60-0.85 for 60 data sets collected in the cleaner river. All the regressions showed a good linear relationship, leading to the conclusion that the occurrence of the five metals is directly related to suspended solids, and that these metal concentrations could be approximated using these regression equations. Thus, the linear regression equations were applied to estimate the metal concentration using online turbidity data from January 1 to June 30 in 2014. In the prediction, the WASP 7.5.2 (Water Quality Analysis Simulation Program) model was introduced to interpret the transport and fate of total suspended solids; in addition, metal concentration downstream of the two rivers was predicted. All the relative errors between the estimated and measured metal concentration were within 30%, and those between the predicted and measured values were within 40%. The estimation and prediction process of metals' concentration indicated that exploring the relationship between metals and turbidity values might be one effective technique for efficient estimation and prediction of metal concentration to facilitate better long-term monitoring with high temporal and spatial density.
Comparative evaluation of urban storm water quality models

NASA Astrophysics Data System (ADS)

Vaze, J.; Chiew, Francis H. S.

2003-10-01

The estimation of urban storm water pollutant loads is required for the development of mitigation and management strategies to minimize impacts to receiving environments. Event pollutant loads are typically estimated using either regression equations or "process-based" water quality models. The relative merit of using regression models compared to process-based models is not clear. A modeling study is carried out here to evaluate the comparative ability of the regression equations and process-based water quality models to estimate event diffuse pollutant loads from impervious surfaces. The results indicate that, once calibrated, both the regression equations and the process-based model can estimate event pollutant loads satisfactorily. In fact, the loads estimated using the regression equation as a function of rainfall intensity and runoff rate are better than the loads estimated using the process-based model. Therefore, if only estimates of event loads are required, regression models should be used because they are simpler and require less data compared to process-based models.
Recurrence risk model for esophageal cancer after radical surgery

PubMed Central

Tao, Hua; Song, Dan; Chen, Cheng

2013-01-01

Objective The aim of the present study was to construct a risk assessment model which was tested by disease-free survival (DFS) of esophageal cancer after radical surgery. Methods A total of 164 consecutive esophageal cancer patients who had undergone radical surgery between January 2005 and December 2006 were retrospectively analyzed. The cutpoint of value at risk (VaR) was inferred by stem-and-leaf plot, as well as by independent-samples t-test for recurrence-free time, further confirmed by crosstab chi-square test, univariate analysis and Cox regression analysis for DFS. Results The cutpoint of VaR was 0.3 on the basis of our model. The rate of recurrence was 30.3% (30/99) and 52.3% (34/65) in VaR <0.3 and VaR ≥0.3 (chi-square test, χ2 =7.984, P=0.005), respectively. The 1-, 3-, and 5-year DFS of esophageal cancer after radical surgery was 70.4%, 48.7%, and 45.3%, respectively in VaR ≥0.3, whereas 91.5%, 75.8%, and 67.3%, respectively in VaR <0.3 (Log-rank test, χ2 =9.59, P=0.0020), and further confirmed by Cox regression analysis [hazard ratio =2.10, 95% confidence interval (CI): 1.2649-3.4751; P=0.0041]. Conclusions The model could be applied for integrated assessment of recurrence risk after radical surgery for esophageal cancer. PMID:24255579
Exploring and accounting for publication bias in mental health: a brief overview of methods.

PubMed

Mavridis, Dimitris; Salanti, Georgia

2014-02-01

OBJECTIVE Publication bias undermines the integrity of published research. The aim of this paper is to present a synopsis of methods for exploring and accounting for publication bias. METHODS We discussed the main features of the following methods to assess publication bias: funnel plot analysis; trim-and-fill methods; regression techniques and selection models. We applied these methods to a well-known example of antidepressants trials that compared trials submitted to the Food and Drug Administration (FDA) for regulatory approval. RESULTS The funnel plot-related methods (visual inspection, trim-and-fill, regression models) revealed an association between effect size and SE. Contours of statistical significance showed that asymmetry in the funnel plot is probably due to publication bias. Selection model found a significant correlation between effect size and propensity for publication. CONCLUSIONS Researchers should always consider the possible impact of publication bias. Funnel plot-related methods should be seen as a means of examining for small-study effects and not be directly equated with publication bias. Possible causes for funnel plot asymmetry should be explored. Contours of statistical significance may help disentangle whether asymmetry in a funnel plot is caused by publication bias or not. Selection models, although underused, could be useful resource when publication bias and heterogeneity are suspected because they address directly the problem of publication bias and not that of small-study effects.
Impact of weather factors on hand, foot and mouth disease, and its role in short-term incidence trend forecast in Huainan City, Anhui Province.

PubMed

Zhao, Desheng; Wang, Lulu; Cheng, Jian; Xu, Jun; Xu, Zhiwei; Xie, Mingyu; Yang, Huihui; Li, Kesheng; Wen, Lingying; Wang, Xu; Zhang, Heng; Wang, Shusi; Su, Hong

2017-03-01

Hand, foot, and mouth disease (HFMD) is one of the most common communicable diseases in China, and current climate change has been recognized as a significant contributor. Nevertheless, no reliable models have been put forward to predict the dynamics of HFMD cases based on short-term weather variations. The present study aimed to examine the association between weather factors and HFMD, and to explore the accuracy of seasonal auto-regressive integrated moving average (SARIMA) model with local weather conditions in forecasting HFMD. Weather and HFMD data from 2009 to 2014 in Huainan, China, were used. Poisson regression model combined with a distributed lag non-linear model (DLNM) was applied to examine the relationship between weather factors and HFMD. The forecasting model for HFMD was performed by using the SARIMA model. The results showed that temperature rise was significantly associated with an elevated risk of HFMD. Yet, no correlations between relative humidity, barometric pressure and rainfall, and HFMD were observed. SARIMA models with temperature variable fitted HFMD data better than the model without it (sR² increased, while the BIC decreased), and the SARIMA (0, 1, 1)(0, 1, 0)₅₂ offered the best fit for HFMD data. In addition, compared with females and nursery children, males and scattered children may be more suitable for using SARIMA model to predict the number of HFMD cases and it has high precision. In conclusion, high temperature could increase the risk of contracting HFMD. SARIMA model with temperature variable can effectively improve its forecast accuracy, which can provide valuable information for the policy makers and public health to construct a best-fitting model and optimize HFMD prevention.
Time Series Analysis for Forecasting Hospital Census: Application to the Neonatal Intensive Care Unit

PubMed Central

Hoover, Stephen; Jackson, Eric V.; Paul, David; Locke, Robert

2016-01-01

Summary Background Accurate prediction of future patient census in hospital units is essential for patient safety, health outcomes, and resource planning. Forecasting census in the Neonatal Intensive Care Unit (NICU) is particularly challenging due to limited ability to control the census and clinical trajectories. The fixed average census approach, using average census from previous year, is a forecasting alternative used in clinical practice, but has limitations due to census variations. Objective Our objectives are to: (i) analyze the daily NICU census at a single health care facility and develop census forecasting models, (ii) explore models with and without patient data characteristics obtained at the time of admission, and (iii) evaluate accuracy of the models compared with the fixed average census approach. Methods We used five years of retrospective daily NICU census data for model development (January 2008 – December 2012, N=1827 observations) and one year of data for validation (January – December 2013, N=365 observations). Best-fitting models of ARIMA and linear regression were applied to various 7-day prediction periods and compared using error statistics. Results The census showed a slightly increasing linear trend. Best fitting models included a non-seasonal model, ARIMA(1,0,0), seasonal ARIMA models, ARIMA(1,0,0)×(1,1,2)₇ and ARIMA(2,1,4)×(1,1,2)₁₄, as well as a seasonal linear regression model. Proposed forecasting models resulted on average in 36.49% improvement in forecasting accuracy compared with the fixed average census approach. Conclusions Time series models provide higher prediction accuracy under different census conditions compared with the fixed average census approach. Presented methodology is easily applicable in clinical practice, can be generalized to other care settings, support short- and long-term census forecasting, and inform staff resource planning. PMID:27437040
Impact of weather factors on hand, foot and mouth disease, and its role in short-term incidence trend forecast in Huainan City, Anhui Province

NASA Astrophysics Data System (ADS)

Zhao, Desheng; Wang, Lulu; Cheng, Jian; Xu, Jun; Xu, Zhiwei; Xie, Mingyu; Yang, Huihui; Li, Kesheng; Wen, Lingying; Wang, Xu; Zhang, Heng; Wang, Shusi; Su, Hong

2017-03-01

Hand, foot, and mouth disease (HFMD) is one of the most common communicable diseases in China, and current climate change has been recognized as a significant contributor. Nevertheless, no reliable models have been put forward to predict the dynamics of HFMD cases based on short-term weather variations. The present study aimed to examine the association between weather factors and HFMD, and to explore the accuracy of seasonal auto-regressive integrated moving average (SARIMA) model with local weather conditions in forecasting HFMD. Weather and HFMD data from 2009 to 2014 in Huainan, China, were used. Poisson regression model combined with a distributed lag non-linear model (DLNM) was applied to examine the relationship between weather factors and HFMD. The forecasting model for HFMD was performed by using the SARIMA model. The results showed that temperature rise was significantly associated with an elevated risk of HFMD. Yet, no correlations between relative humidity, barometric pressure and rainfall, and HFMD were observed. SARIMA models with temperature variable fitted HFMD data better than the model without it (sR² increased, while the BIC decreased), and the SARIMA (0, 1, 1)(0, 1, 0)₅₂ offered the best fit for HFMD data. In addition, compared with females and nursery children, males and scattered children may be more suitable for using SARIMA model to predict the number of HFMD cases and it has high precision. In conclusion, high temperature could increase the risk of contracting HFMD. SARIMA model with temperature variable can effectively improve its forecast accuracy, which can provide valuable information for the policy makers and public health to construct a best-fitting model and optimize HFMD prevention.
Does information available at admission for delivery improve prediction of vaginal birth after cesarean?

PubMed Central

Grobman, William A.; Lai, Yinglei; Landon, Mark B.; Spong, Catherine Y.; Leveno, Kenneth J.; Rouse, Dwight J.; Varner, Michael W.; Moawad, Atef H.; Simhan, Hyagriv N.; Harper, Margaret; Wapner, Ronald J.; Sorokin, Yoram; Miodovnik, Menachem; Carpenter, Marshall; O'sullivan, Mary J.; Sibai, Baha M.; Langer, Oded; Thorp, John M.; Ramin, Susan M.; Mercer, Brian M.

2010-01-01

Objective To construct a predictive model for vaginal birth after cesarean (VBAC) that combines factors that can be ascertained only as the pregnancy progresses with those known at initiation of prenatal care. Study design Using multivariable modeling, we constructed a predictive model for VBAC that included patient factors known at the initial prenatal visit as well as those that only became evident as the pregnancy progressed to the admission for delivery. Results 9616 women were analyzed. The regression equation for VBAC success included multiple factors that could not be known at the first prenatal visit. The area under the curve for this model was significantly greater (P < .001) than that of a model that included only factors available at the first prenatal visit. Conclusion A prediction model for VBAC success that incorporates factors that can be ascertained only as the pregnancy progresses adds to the predictive accuracy of a model that uses only factors available at a first prenatal visit. PMID:19813165
A generalized right truncated bivariate Poisson regression model with applications to health data.

PubMed

Islam, M Ataharul; Chowdhury, Rafiqul I

2017-01-01

A generalized right truncated bivariate Poisson regression model is proposed in this paper. Estimation and tests for goodness of fit and over or under dispersion are illustrated for both untruncated and right truncated bivariate Poisson regression models using marginal-conditional approach. Estimation and test procedures are illustrated for bivariate Poisson regression models with applications to Health and Retirement Study data on number of health conditions and the number of health care services utilized. The proposed test statistics are easy to compute and it is evident from the results that the models fit the data very well. A comparison between the right truncated and untruncated bivariate Poisson regression models using the test for nonnested models clearly shows that the truncated model performs significantly better than the untruncated model.
A generalized right truncated bivariate Poisson regression model with applications to health data

PubMed Central

Islam, M. Ataharul; Chowdhury, Rafiqul I.

2017-01-01

A generalized right truncated bivariate Poisson regression model is proposed in this paper. Estimation and tests for goodness of fit and over or under dispersion are illustrated for both untruncated and right truncated bivariate Poisson regression models using marginal-conditional approach. Estimation and test procedures are illustrated for bivariate Poisson regression models with applications to Health and Retirement Study data on number of health conditions and the number of health care services utilized. The proposed test statistics are easy to compute and it is evident from the results that the models fit the data very well. A comparison between the right truncated and untruncated bivariate Poisson regression models using the test for nonnested models clearly shows that the truncated model performs significantly better than the untruncated model. PMID:28586344
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

NASA Astrophysics Data System (ADS)

Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

2018-04-01

In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research involved the relationship between 20 variates of the top soil, analyzed prior to planting, and paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model and the fuzzy c-means method. Analyses of normality and multicollinearity indicate that the data are normally distributed without multicollinearity among the independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model, with a lower mean square error.
Cross-cultural generalizability of personality dimensions: relating indigenous and imported dimensions in two cultures.

PubMed

Katigbak, M S; Church, A T; Akamine, T X

1996-01-01

The cross-cultural generalizability of personality dimensions was investigated by (a) identifying indigenous Philippine dimensions, (b) testing the cross-cultural replicability of the NEO 5-factor model (P. T. Costa & R.R. McCrae, 1992), and (c) relating Philippine and Western dimensions in Philippine and U.S. samples of college students. Filipino self-ratings (N = 536) on indigenous items were factor analyzed, and 6 Philippine dimensions were obtained. Conclusions about the replicability of the 5-factor model in the Philippines (N = 432) depended on whether exploratory, Procrustes, or confirmatory factor methods were used. In regression and joint factor analyses, moderate to strong associations were found between the Philippine dimensions and (a) dimensions from the 5-factor model in both Philippine (N = 387) and U.S. (N = 610) samples, and (b) the Tellegen model (A. Tellegen, 1985; A. Tellegen & N.G. Waller, in press) in a U.S. sample (N = 603).

Spatial Assessment of Model Errors from Four Regression Techniques

Treesearch

Lianjun Zhang; Jeffrey H. Gove

2005-01-01

Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...
Can We Use Regression Modeling to Quantify Mean Annual Streamflow at a Global-Scale?

NASA Astrophysics Data System (ADS)

Barbarossa, V.; Huijbregts, M. A. J.; Hendriks, J. A.; Beusen, A.; Clavreul, J.; King, H.; Schipper, A.

2016-12-01

Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF using observations of discharge and catchment characteristics from 1,885 catchments worldwide, ranging from 2 to 10^6 km2 in size. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB [van Beek et al., 2011] by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area, mean annual precipitation and air temperature, average slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error values were lower (0.29 - 0.38 compared to 0.49 - 0.57) and the modified index of agreement was higher (0.80 - 0.83 compared to 0.72 - 0.75). Our regression model can be applied globally at any point of the river network, provided that the input parameters are within the range of values employed in the calibration of the model. The performance is reduced for water scarce regions and further research should focus on improving such an aspect for regression-based global hydrological models.
Developing a predictive tropospheric ozone model for Tabriz

NASA Astrophysics Data System (ADS)

Khatibi, Rahman; Naghipour, Leila; Ghorbani, Mohammad A.; Smith, Michael S.; Karimi, Vahid; Farhoudi, Reza; Delafrouz, Hadi; Arvanaghi, Hadi

2013-04-01

Predictive ozone models are becoming indispensable tools by providing a capability for pollution alerts to serve people who are vulnerable to the risks. We have developed a tropospheric ozone prediction capability for Tabriz, Iran, by using the following five modeling strategies: three regression-type methods: Multiple Linear Regression (MLR), Artificial Neural Networks (ANNs), and Gene Expression Programming (GEP); and two auto-regression-type models: Nonlinear Local Prediction (NLP) to implement chaos theory and Auto-Regressive Integrated Moving Average (ARIMA) models. The regression-type modeling strategies explain the data in terms of: temperature, solar radiation, dew point temperature, and wind speed, by regressing present ozone values to their past values. The ozone time series are available at various time intervals, including hourly intervals, from August 2010 to March 2011. The results for MLR, ANN and GEP models are not overly good but those produced by NLP and ARIMA are promising for establishing a forecasting capability.
Unitary Response Regression Models

ERIC Educational Resources Information Center

Lipovetsky, S.

2007-01-01

The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Modelling infant mortality rate in Central Java, Indonesia use generalized poisson regression method

NASA Astrophysics Data System (ADS)

Prahutama, Alan; Sudarno

2018-05-01

The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of the given geographical area during the same year. This problem needs to be addressed because it is an important element of a country’s economic development. A high infant mortality rate will disrupt the stability of a country as it relates to the sustainability of the population in the country. One regression model that can be used to analyze the relationship between a dependent variable Y in the form of discrete data and an independent variable X is the Poisson regression model. Regression models recently used for data with a discrete dependent variable include, among others, Poisson regression, negative binomial regression and generalized Poisson regression. In this research, generalized Poisson regression modeling gives a better AIC value than Poisson regression. The most significant variable is the number of health facilities (X1), while the variable that has the greatest influence on the infant mortality rate is average breastfeeding (X9).
[From clinical judgment to linear regression model.]

PubMed

Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

2013-01-01

When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R2) indicates the importance of independent variables in the outcome.
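As a minimal sketch of the regression line Y = a + bX described above, the following Python computes the least-squares intercept "a", slope "b" and the coefficient of determination. The function names and the data are made up for illustration and do not come from the article.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares estimates of a (intercept) and b (slope) for Y = a + bX."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx            # change in Y per one-unit change in X
    a = y_bar - b * x_bar    # value of Y when X equals 0
    return a, b


def r_squared(x, y, a, b):
    """Share of the variance in Y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(x, y)
r2 = r_squared(x, y, a, b)
print(f"a = {a:.2f}, b = {b:.2f}, R2 = {r2:.4f}")
```

The slope is the ratio of the covariance of X and Y to the variance of X, and the intercept follows because the fitted line passes through the point of means.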
Impact of multicollinearity on small sample hydrologic regression models

NASA Astrophysics Data System (ADS)

Kroll, Charles N.; Song, Peter

2013-06-01

Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
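The variance inflation factor mentioned above can be illustrated with a small Python sketch. It covers only the special case of two explanatory variables, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between them. The data are hypothetical, and the cut-off of 10 in the comment is a common rule of thumb rather than a value taken from the study.

```python
from math import sqrt


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors.

    With more predictors, r^2 would be replaced by the R2 obtained from
    regressing one predictor on all of the others.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors, for illustration only.
x1 = [1, 2, 3, 4]
x2_uncorrelated = [1, -1, -1, 1]
x2_correlated = [1.1, 1.9, 3.2, 3.8]

vif_low = vif_two_predictors(x1, x2_uncorrelated)
vif_high = vif_two_predictors(x1, x2_correlated)
# A VIF above about 10 is a frequently used (assumed) screening threshold.
print(f"uncorrelated: {vif_low:.2f}, correlated: {vif_high:.2f}")
```

A VIF of 1 means the predictor's coefficient variance is not inflated at all; larger values indicate how many times the variance grows because of correlation with the other predictor.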
Risk Factors Predicting Infectious Lactational Mastitis: Decision Tree Approach versus Logistic Regression Analysis.

PubMed

Fernández, Leónides; Mediano, Pilar; García, Ricardo; Rodríguez, Juan M; Marín, María

2016-09-01

Objectives Lactational mastitis frequently leads to a premature abandonment of breastfeeding; its development has been associated with several risk factors. This study aims to use a decision tree (DT) approach to establish the main risk factors involved in mastitis and to compare its performance for predicting this condition with a stepwise logistic regression (LR) model. Methods Data from 368 cases (breastfeeding women with mastitis) and 148 controls were collected by a questionnaire about risk factors related to medical history of mother and infant, pregnancy, delivery, postpartum, and breastfeeding practices. The performance of the DT and LR analyses was compared using the area under the receiver operating characteristic (ROC) curve. Sensitivity, specificity and accuracy of both models were calculated. Results Cracked nipples, antibiotics and antifungal drugs during breastfeeding, infant age, breast pumps, familial history of mastitis and throat infection were significant risk factors associated with mastitis in both analyses. Bottle-feeding and milk supply were related to mastitis for certain subgroups in the DT model. The areas under the ROC curves were similar for LR and DT models (0.870 and 0.835, respectively). The LR model had better classification accuracy and sensitivity than the DT model, but the latter presented better specificity at the optimal threshold of each curve. Conclusions The DT and LR models constitute useful and complementary analytical tools to assess the risk of lactational infectious mastitis. The DT approach identifies high-risk subpopulations that need specific mastitis prevention programs and, therefore, it could be used to make the most of public health resources.
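The comparison metrics named above (area under the ROC curve, sensitivity, specificity and accuracy) can be sketched in Python as follows. The labels, scores and the 0.5 threshold are invented for illustration and are unrelated to the study data. The AUC is computed here through its rank interpretation, as the probability that a randomly chosen case scores higher than a randomly chosen control.

```python
def confusion_metrics(y_true, y_score, threshold):
    """Sensitivity, specificity and accuracy when scores >= threshold are called positive."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for t, s in pairs if t == 1 and s >= threshold)
    fn = sum(1 for t, s in pairs if t == 1 and s < threshold)
    tn = sum(1 for t, s in pairs if t == 0 and s < threshold)
    fp = sum(1 for t, s in pairs if t == 0 and s >= threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    # Each case/control pair counts 1 if the case scores higher, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = case, 0 = control) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]

metrics = confusion_metrics(y_true, y_score, threshold=0.5)
area = auc(y_true, y_score)
print(metrics, area)
```

Unlike sensitivity and specificity, the AUC does not depend on a chosen threshold, which is why two models can have similar areas yet differ at their respective optimal cut-offs.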
Disability weights for infectious diseases in four European countries: comparison between countries and across respondent characteristics

PubMed Central

Maertens de Noordhout, Charline; Devleesschauwer, Brecht; Salomon, Joshua A; Turner, Heather; Cassini, Alessandro; Colzani, Edoardo; Speybroeck, Niko; Polinder, Suzanne; Kretzschmar, Mirjam E; Havelaar, Arie H; Haagsma, Juanita A

2018-01-01

Abstract Background In 2015, new disability weights (DWs) for infectious diseases were constructed based on data from four European countries. In this paper, we evaluated if country, age, sex, disease experience status, income and educational levels have an impact on these DWs. Methods We analyzed paired comparison responses of the European DW study by participants’ characteristics with separate probit regression models. To evaluate the effect of participants’ characteristics, we performed correlation analyses between countries and within country by respondent characteristics and constructed seven probit regression models, including a null model and six models containing participants’ characteristics. We compared these seven models using Akaike Information Criterion (AIC). Results According to AIC, the probit model including country as covariate was the best model. We found a lower correlation of the probit coefficients between countries and income levels (range rs: 0.97–0.99, P < 0.01) than between age groups (range rs: 0.98–0.99, P < 0.01), educational level (range rs: 0.98–0.99, P < 0.01), sex (rs = 0.99, P < 0.01) and disease status (rs = 0.99, P < 0.01). Within country the lowest correlations of the probit coefficients were between low and high income level (range rs = 0.89–0.94, P < 0.01). Conclusions We observed variations in health valuation across countries and within country between income levels. These observations should be further explored in a systematic way, also in non-European countries. We recommend future researches studying the effect of other characteristics of respondents on health assessment. PMID:29020343
The Causal Meaning of Genomic Predictors and How It Affects Construction and Comparison of Genome-Enabled Selection Models

PubMed Central

Valente, Bruno D.; Morota, Gota; Peñagaricano, Francisco; Gianola, Daniel; Weigel, Kent; Rosa, Guilherme J. M.

2015-01-01

The term “effect” in additive genetic effect suggests a causal meaning. However, inferences of such quantities for selection purposes are typically viewed and conducted as a prediction task. Predictive ability as tested by cross-validation is currently the most acceptable criterion for comparing models and evaluating new methodologies. Nevertheless, it does not directly indicate if predictors reflect causal effects. Such evaluations would require causal inference methods that are not typical in genomic prediction for selection. This suggests that the usual approach to infer genetic effects contradicts the label of the quantity inferred. Here we investigate if genomic predictors for selection should be treated as standard predictors or if they must reflect a causal effect to be useful, requiring causal inference methods. Conducting the analysis as a prediction or as a causal inference task affects, for example, how covariates of the regression model are chosen, which may heavily affect the magnitude of genomic predictors and therefore selection decisions. We demonstrate that selection requires learning causal genetic effects. However, genomic predictors from some models might capture noncausal signal, providing good predictive ability but poorly representing true genetic effects. Simulated examples are used to show that aiming for predictive ability may lead to poor modeling decisions, while causal inference approaches may guide the construction of regression models that better infer the target genetic effect even when they underperform in cross-validation tests. In conclusion, genomic selection models should be constructed to aim primarily for identifiability of causal genetic effects, not for predictive ability. PMID:25908318
The Cost of Unintended Pregnancies for Employer-Sponsored Health Insurance Plans

PubMed Central

Dieguez, Gabriela; Pyenson, Bruce S.; Law, Amy W.; Lynen, Richard; Trussell, James

2015-01-01

Background Pregnancy is associated with a significant cost for employers providing health insurance benefits to their employees. The latest study on the topic was published in 2002, estimating the unintended pregnancy rate for women covered by employer-sponsored insurance benefits to be approximately 29%. Objectives The primary objective of this study was to update the cost of unintended pregnancy to employer-sponsored health insurance plans with current data. The secondary objective was to develop a regression model to identify the factors and associated magnitude that contribute to unintended pregnancies in the employee benefits population. Methods We developed stepwise multinomial logistic regression models using data from a national survey on maternal attitudes about pregnancy before and shortly after giving birth. The survey was conducted by the Centers for Disease Control and Prevention through mail and via telephone interviews between 2009 and 2011 of women who had had a live birth. The regression models were then applied to a large commercial health claims database from the Truven Health MarketScan to retrospectively assign the probability of pregnancy intention to each delivery. Results Based on the MarketScan database, we estimate that among employer-sponsored health insurance plans, 28.8% of pregnancies are unintended, which is consistent with national findings of 29% in a survey by the Centers for Disease Control and Prevention. These unintended pregnancies account for 27.4% of the annual delivery costs to employers in the United States, or approximately 1% of the typical employer's health benefits spending for 1 year. Using these findings, we present a regression model that employers could apply to their claims data to identify the risk for unintended pregnancies in their health insurance population. 
Conclusion The availability of coverage for contraception without employee cost-sharing, as was required by the Affordable Care Act in 2012, combined with the ability to identify women who are at high risk for an unintended pregnancy, can help employers address the costs of unintended pregnancies in their employee benefits population. This can also help to bring contraception efforts into the mainstream of other preventive and wellness programs, such as smoking cessation, obesity management, and diabetes control programs. PMID:26005515
Genetic evaluation and selection response for growth in meat-type quail through random regression models using B-spline functions and Legendre polynomials.

PubMed

Mota, L F M; Martins, P G M A; Littiere, T O; Abreu, L R A; Silva, M A; Bonafé, C M

2018-04-01

The objective was to estimate (co)variance functions using random regression models (RRM) with Legendre polynomials, B-spline function and multi-trait models aimed at evaluating genetic parameters of growth traits in meat-type quail. A database containing the complete pedigree information of 7000 meat-type quail was utilized. The models included the fixed effects of contemporary group and generation. Direct additive genetic and permanent environmental effects, considered as random, were modeled using B-spline functions considering quadratic and cubic polynomials for each individual segment, and Legendre polynomials for age. Residual variances were grouped in four age classes. Direct additive genetic and permanent environmental effects were modeled using 2 to 4 segments and were modeled by Legendre polynomial with orders of fit ranging from 2 to 4. The model with quadratic B-spline adjustment, using four segments for direct additive genetic and permanent environmental effects, was the most appropriate and parsimonious to describe the covariance structure of the data. The RRM using Legendre polynomials presented an underestimation of the residual variance. Lesser heritability estimates were observed for multi-trait models in comparison with RRM for the evaluated ages. In general, the genetic correlations between measures of BW from hatching to 35 days of age decreased as the range between the evaluated ages increased. Genetic trend for BW was positive and significant along the selection generations. The genetic response to selection for BW in the evaluated ages presented greater values for RRM compared with multi-trait models. In summary, RRM using B-spline functions with four residual variance classes and segments were the best fit for genetic evaluation of growth traits in meat-type quail. In conclusion, RRM should be considered in genetic evaluation of breeding programs.
Compatible Models of Carbon Content of Individual Trees on a Cunninghamia lanceolata Plantation in Fujian Province, China

PubMed Central

Zhuo, Lin; Tao, Hong; Wei, Hong; Chengzhen, Wu

2016-01-01

We tried to establish compatible carbon content models of individual trees for a Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) plantation from Fujian province in southeast China. In general, compatibility requires that the sum of components equal the whole tree, meaning that the sum of percentages calculated from component equations should equal 100%. Thus, we used multiple approaches to simulate carbon content in boles, branches, foliage leaves, roots and the whole individual trees. The approaches included (i) single optimal fitting (SOF), (ii) nonlinear adjustment in proportion (NAP) and (iii) nonlinear seemingly unrelated regression (NSUR). These approaches were used in combination with variables relating diameter at breast height (D) and tree height (H), such as D, D2H, DH and D&H (where D&H means two separate variables in bivariate model). Power, exponential and polynomial functions were tested as well as a new general function model was proposed by this study. Weighted least squares regression models were employed to eliminate heteroscedasticity. Model performances were evaluated by using mean residuals, residual variance, mean square error and the determination coefficient. The results indicated that models with two dimensional variables (DH, D2H and D&H) were always superior to those with a single variable (D). The D&H variable combination was found to be the most useful predictor. Of all the approaches, SOF could establish a single optimal model separately, but there were deviations in estimating results due to existing incompatibilities, while NAP and NSUR could ensure predictions compatibility. Simultaneously, we found that the new general model had better accuracy than others. In conclusion, we recommend that the new general model be used to estimate carbon content for Chinese fir and considered for other vegetation types as well. PMID:26982054
Serum Folate Shows an Inverse Association with Blood Pressure in a Cohort of Chinese Women of Childbearing Age: A Cross-Sectional Study

PubMed Central

Shen, Minxue; Tan, Hongzhuan; Zhou, Shujin; Retnakaran, Ravi; Smith, Graeme N.; Davidge, Sandra T.; Trasler, Jacquetta; Walker, Mark C.; Wen, Shi Wu

2016-01-01

Background It has been reported that higher folate intake from food and supplementation is associated with decreased blood pressure (BP). The association between serum folate concentration and BP has been examined in few studies. We aim to examine the association between serum folate and BP levels in a cohort of young Chinese women. Methods We used the baseline data from a pre-conception cohort of women of childbearing age in Liuyang, China, for this study. Demographic data were collected by structured interview. Serum folate concentration was measured by immunoassay, and homocysteine, blood glucose, triglyceride and total cholesterol were measured through standardized clinical procedures. Multiple linear regression and principal component regression model were applied in the analysis. Results A total of 1,532 healthy normotensive non-pregnant women were included in the final analysis. The mean concentration of serum folate was 7.5 ± 5.4 nmol/L and 55% of the women presented with folate deficiency (< 6.8 nmol/L). Multiple linear regression and principal component regression showed that serum folate levels were inversely associated with systolic and diastolic BP, after adjusting for demographic, anthropometric, and biochemical factors. Conclusions Serum folate is inversely associated with BP in non-pregnant women of childbearing age with high prevalence of folate deficiency. PMID:27182603
Live Donor Renal Anatomic Asymmetry and Post-Transplant Renal Function

PubMed Central

Tanriover, Bekir; Fernandez, Sonalis; Campenot, Eric S.; Newhouse, Jeffrey H.; Oyfe, Irina; Mohan, Prince; Sandikci, Burhaneddin; Radhakrishnan, Jai; Wexler, Jennifer J.; Carroll, Maureen A.; Sharif, Sairah; Cohen, David J.; Ratner, Lloyd E.; Hardy, Mark A.

2014-01-01

Background Relationship between live donor renal anatomic asymmetry and post-transplant recipient function has not been studied extensively. Methods We analyzed 96 live-kidney donors, who had anatomical asymmetry (>10% renal length and/or volume difference calculated from CT angiograms) and their matching recipients. Split function differences (SFD) were quantified with 99mTc-DMSA renography. Implantation biopsies at time-zero were semi-quantitatively scored. A comprehensive model utilizing donor renal volume adjusted to recipient weight (Vol/Wgt), SFD, and biopsy score was used to predict recipient estimated glomerular filtration rate (eGFR) at one-year. Primary analysis consisted of a logistic regression model of outcome (odds of developing eGFR>60ml/min/1.73 m2 at one-year), a linear regression model of outcome (predicting recipient eGFR at one-year, using the CKD-EPI formula), and a Monte Carlo simulation based on the linear regression model (N=10,000 iterations). Results In the study cohort, the mean Vol/Wgt and eGFR at one-year were 2.04 ml/kg and 60.4 ml/min/1.73m2, respectively. Volume and split ratios between two donor kidneys were strongly correlated (r=0.79, p-value<0.001). The biopsy scores among SFD categories (<5%, 5–10%, >10%) were not different (p=0.190). On multivariate models, only Vol/Wgt was significantly associated with higher odds of having eGFR>60ml/min/1.73 m2 (OR=8.94, 95% CI 2.47–32.25, p=0.001) and had a strong discriminatory power in predicting the risk of eGFR<60ml/min/1.73m2 at one-year (ROC curve=0.78, 95% CI 0.68–0.89). Conclusion In the presence of donor renal anatomic asymmetry, Vol/Wgt appears to be a major determinant of recipient renal function at one-year post-transplantation. Renography can be replaced with CT volume calculation in estimating split renal function. PMID:25719258
Messing Up Texas?: A Re-Analysis of the Effects of Executions on Homicides.

PubMed

Brandt, Patrick T; Kovandzic, Tomislav V

2015-01-01

Executions in Texas from 1994-2005 do not deter homicides, contrary to the results of Land et al. (2009). We find that using different models--based on pre-tests for unit roots that correct for earlier model misspecifications--one cannot reject the null hypothesis that executions do not lead to a change in homicides in Texas over this period. Using additional control variables, we show that variables such as the number of prisoners in Texas may drive the main drop in homicides over this period. Such conclusions however are highly sensitive to model specification decisions, calling into question the assumptions about fixed parameters and constant structural relationships. This means that using dynamic regressions to account for policy changes that may affect homicides needs to be done with significant care and attention.
A Pre-Screening Questionnaire to Predict Non-24-Hour Sleep-Wake Rhythm Disorder (N24HSWD) among the Blind

PubMed Central

Flynn-Evans, Erin E.; Lockley, Steven W.

2016-01-01

Study Objectives: There is currently no questionnaire-based pre-screening tool available to detect non-24-hour sleep-wake rhythm disorder (N24HSWD) among blind patients. Our goal was to develop such a tool, derived from gold standard, objective hormonal measures of circadian entrainment status, for the detection of N24HSWD among those with visual impairment. Methods: We evaluated the contribution of 40 variables in their ability to predict N24HSWD among 127 blind women, classified using urinary 6-sulfatoxymelatonin period, an objective marker of circadian entrainment status in this population. We subjected the 40 candidate predictors to 1,000 bootstrapped iterations of a logistic regression forward selection model to predict N24HSWD, with model inclusion set at the p < 0.05 level. We removed any predictors that were not selected at least 1% of the time in the 1,000 bootstrapped models and applied a second round of 1,000 bootstrapped logistic regression forward selection models to the remaining 23 candidate predictors. We included all questions that were selected at least 10% of the time in the final model. We subjected the selected predictors to a final logistic regression model to predict N24SWD over 1,000 bootstrapped models to calculate the concordance statistic and adjusted optimism of the final model. We used this information to generate a predictive model and determined the sensitivity and specificity of the model. Finally, we applied the model to a cohort of 1,262 blind women who completed the survey, but did not collect urine samples. Results: The final model consisted of eight questions. The concordance statistic, adjusted for bootstrapping, was 0.85. The positive predictive value was 88%, the negative predictive value was 79%. Applying this model to our larger dataset of women, we found that 61% of those without light perception, and 27% with some degree of light perception, would be referred for further screening for N24HSWD. 
Conclusions: Our model has predictive utility sufficient to serve as a pre-screening questionnaire for N24HSWD among the blind. Citation: Flynn-Evans EE, Lockley SW. A pre-screening questionnaire to predict non-24-hour sleep-wake rhythm disorder (N24HSWD) among the blind. J Clin Sleep Med 2016;12(5):703–710. PMID:26951421
Real estate value prediction using multivariate regression models

NASA Astrophysics Data System (ADS)

Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

2017-11-01

The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly with many factors; hence it is a prime field in which to apply machine learning concepts to optimize and predict prices with high accuracy. In this paper, we therefore present various important features to use when predicting housing prices with good accuracy. We describe regression models that use various features to achieve a lower Residual Sum of Squares error. When using features in a regression model, some feature engineering is required for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
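The combination of polynomial features and ridge regression mentioned above can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: it solves the ridge normal equations (X'X + lam I) w = X'y with a small Gaussian elimination routine, has no intercept term, and penalizes every coefficient. The names `ridge_fit`, `poly_features` and `lam` are hypothetical.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Minimise the residual sum of squares plus lam times the squared coefficient norm."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for i in range(p):
        XtX[i][i] += lam  # the ridge penalty is added to the diagonal
    return solve(XtX, Xty)


def poly_features(x, degree):
    """Expand a single feature into its powers x, x^2, ..., x^degree."""
    return [[xi ** d for d in range(1, degree + 1)] for xi in x]


# Hypothetical data: the same cubic features fitted without and with a penalty.
x = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [0.1, 0.4, 1.1, 2.0, 4.2]
X = poly_features(x, degree=3)
w_ols = ridge_fit(X, y, lam=0.0)
w_ridge = ridge_fit(X, y, lam=1.0)
print(w_ols, w_ridge)
```

Setting `lam` to zero recovers ordinary least squares; increasing it shrinks the coefficients toward zero, which is how ridge regression trades a little bias for lower variance in an over-flexible polynomial fit.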
Exploring bikeability in a metropolitan setting: stimulating and hindering factors in commuting route environments

PubMed Central

2012-01-01

Background Route environments may influence people's active commuting positively and thereby contribute to public health. Assessments of route environments are, however, needed in order to better understand the possible relationship between active commuting and the route environment. The aim of this study was, therefore, to assess the potential associations between perceptions of whether the route environment on the whole hinders or stimulates bicycle commuting and perceptions of environmental factors. Methods The Active Commuting Route Environment Scale (ACRES) was used for the assessment of bicycle commuters' perceptions of their route environments in the inner urban parts of Greater Stockholm, Sweden. Bicycle commuters (n = 827) were recruited by advertisements in newspapers. Simultaneous multiple regression analyses were used to assess the relation between predictor variables (such as levels of exhaust fumes, noise, traffic speed, traffic congestion and greenery) and the outcome variable (hindering - stimulating route environments). Two models were run, (Model 1) without and (Model 2) with the item traffic: unsafe or safe included as a predictor. Results Overall, about 40% of the variance of hindering - stimulating route environments was explained by the environmental predictors in our models (Model 1, R² = 0.415, and Model 2, R² = 0.435). The regression equation for Model 1 was: y = 8.53 + 0.33 ugly or beautiful + 0.14 greenery + (-0.14) course of the route + (-0.13) exhaust fumes + (-0.09) congestion: all types of vehicles (p ≤ 0.019). The regression equation for Model 2 was y = 6.55 + 0.31 ugly or beautiful + 0.16 traffic: unsafe or safe + (-0.13) exhaust fumes + 0.12 greenery + (-0.12) course of the route (p ≤ 0.001). Conclusions The main results indicate that beautiful, green and safe route environments seem to be, independently of each other, stimulating factors for bicycle commuting in inner urban areas. On the other hand, exhaust fumes, traffic congestion and low 'directness' of the route seem to be hindering factors. Furthermore, the overall results illustrate the complexity of a research area at the beginning of exploration. PMID:22401492
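As a minimal sketch, the Model 1 equation reported in this abstract can be evaluated directly. The sketch takes the reported coefficients at face value as unstandardized weights, which the abstract does not confirm; the function name, the dictionary keys and the example ratings are all hypothetical and are not data from the study.

```python
# Intercept and coefficients of Model 1 as printed in the abstract.
MODEL_1_INTERCEPT = 8.53
MODEL_1_COEFFICIENTS = {
    "ugly_or_beautiful": 0.33,
    "greenery": 0.14,
    "course_of_the_route": -0.14,
    "exhaust_fumes": -0.13,
    "congestion_all_vehicles": -0.09,
}


def predict_model_1(scores):
    """Return the predicted hindering-stimulating score for one set of item ratings."""
    return MODEL_1_INTERCEPT + sum(
        coefficient * scores[name]
        for name, coefficient in MODEL_1_COEFFICIENTS.items()
    )


# Hypothetical item ratings (illustrative only; the rating scale is an assumption).
example_scores = {
    "ugly_or_beautiful": 10,
    "greenery": 10,
    "course_of_the_route": 5,
    "exhaust_fumes": 5,
    "congestion_all_vehicles": 5,
}
print(round(predict_model_1(example_scores), 2))
```

Positive coefficients raise the predicted score (more stimulating) and negative ones lower it (more hindering), which mirrors the conclusions drawn in the abstract.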
Teenage smoking, attempts to quit, and school performance.

PubMed Central

Hu, T W; Lin, Z; Keeler, T E

1998-01-01

OBJECTIVES: This study examined the relationship between school performance, smoking, and quitting attempts among teenagers. METHODS: A logistic regression model was used to predict the probability of being a current smoker or a former smoker. Data were derived from the 1990 California Youth Tobacco Survey. RESULTS: Students' school performance was a key factor in predicting smoking and quitting attempts when other sociodemographic and family income factors were controlled. CONCLUSIONS: Developing academic or remedial classes designed to improve students' school performance may lead to a reduction in smoking rates among teenagers while simultaneously providing a human capital investment in their futures. PMID:9618625

Pathological grief and the activation of latent self-images.

PubMed

Horowitz, M J; Wilner, N; Marmar, C; Krupnick, J

1980-10-01

The authors studied the case material for patients treated with either psychoanalysis or brief therapy to examine the basis for the various states of pathological grief after bereavement. They view these states as intensifications or unusual prolongations of states found in normal grief and describe them in terms of the reemergence of self-images and role relationship models that had been held in check by the existence of the deceased person. This conclusion concerning preexisting mental schemata leads to an elaboration and partial revision of theories of regression, ambivalence, and introjection as causes of pathological grief.
THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE

PubMed Central

Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan

2015-01-01

Background: The purpose of this study was to develop a regression model of the organizational climate of the central libraries of Iran’s universities. Methods: This is an applied study. The sample consisted of 96 employees of the central libraries of Iran’s public universities, selected by stratified sampling from the 117 universities affiliated to the Ministry of Health (510 people). A localized version of the ClimateQUAL questionnaire was used as the research instrument. Multivariate linear regression and a track diagram were used to predict the organizational climate pattern of the libraries. Results: Of the 9 variables affecting organizational climate, 5 (innovation, teamwork, customer service, psychological safety and deep diversity) play a major role in predicting the organizational climate of Iran’s libraries. Each of these variables predicts organizational climate with a different coefficient, but the climate score of psychological safety (0.94) plays a particularly important role. The track diagram showed that teamwork, customer service, psychological safety, deep diversity and innovation directly affect the organizational climate variable, with teamwork contributing more than any other variable. Conclusions: Among the ClimateQUAL indicators of organizational climate, teamwork contributes the most, so reinforcing teamwork in academic libraries may be the more effective way to improve their organizational climate. PMID:26622203
Mental Health Correlates of Cigarette Use in LGBT Individuals in the Southeastern United States.

PubMed

Drescher, Christopher F; Lopez, Eliot J; Griffin, James A; Toomey, Thomas M; Eldridge, Elizabeth D; Stepleman, Lara M

2018-05-12

Smoking prevalence for lesbian, gay, bisexual, and transgender (LGBT) individuals is higher than for heterosexual, cisgender individuals. Elevated smoking rates have been linked to psychiatric comorbidities, substance use, poverty, low education levels, and stress. This study examined mental health (MH) correlates of cigarette use in LGBT individuals residing in a metropolitan area in the southeastern United States. Participants were 335 individuals from an LGBT health needs assessment (mean age 34.7; SD = 13.5; 63% gay/lesbian; 66% Caucasian; 81% cisgender). Demographics, current/past psychiatric diagnoses, number of poor MH days in the last 30, the Patient Health Questionnaire (PHQ) 2 depression screener, the Three-Item Loneliness Scale, and frequency of cigarette use were included. Analyses included bivariate correlations, analysis of variance (ANOVA), and regression. Multiple demographic and MH factors were associated with smoker status and frequency of smoking. A logistic regression indicated that lower education and bipolar disorder were most strongly associated with being a smoker. For smokers, a hierarchical regression model including demographic and MH variables accounted for 17.6% of the variance in frequency of cigarette use. Only education, bipolar disorder, and the number of poor MH days were significant contributors in the overall model. Conclusions/Importance: Less education, bipolar disorder, and recurrent poor MH increase LGBT vulnerability to cigarette use. Access to LGBT-competent MH providers who can address culturally specific factors in tobacco cessation is crucial to reducing these health disparities.
Early Home Activities and Oral Language Skills in Middle Childhood: A Quantile Analysis

ERIC Educational Resources Information Center

Law, James; Rush, Robert; King, Tom; Westrupp, Elizabeth; Reilly, Sheena

2018-01-01

Oral language development is a key outcome of elementary school, and it is important to identify factors that predict it most effectively. Commonly researchers use ordinary least squares regression with conclusions restricted to average performance conditional on relevant covariates. Quantile regression offers a more sophisticated alternative.…
Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

PubMed Central

Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

2012-01-01

In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

PubMed

Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

2012-12-01

In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Regression-Based Norms for a Bi-factor Model for Scoring the Brief Test of Adult Cognition by Telephone (BTACT).

PubMed

Gurnani, Ashita S; John, Samantha E; Gavett, Brandon E

2015-05-01

The current study developed regression-based normative adjustments for a bi-factor model of the Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation, alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data

NASA Technical Reports Server (NTRS)

Ulbrich, Norbert Manfred; Volden, Thomas R.

2010-01-01

The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
Panel regressions to estimate low-flow response to rainfall variability in ungaged basins

USGS Publications Warehouse

Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.

2016-01-01

Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
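To illustrate why a fixed-effects panel structure can recover a rainfall response that a pooled regression misses, here is a minimal sketch on a tiny synthetic panel. The two streams, their rainfall and flow values, and the helper names are hypothetical; the study's actual model specification is not given in the abstract. Each synthetic stream follows flow = a_i + 2 × rain with a stream-specific intercept a_i standing in for unobserved heterogeneity.

```python
def ols_slope(x, y):
    """Slope of a one-predictor ordinary least squares fit."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def demean(values):
    """Subtract the group mean (the 'within' transformation)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Synthetic panel: stream_a has intercept 10, stream_b has intercept 0,
# and both share the same true rainfall slope of 2.
panel = {
    "stream_a": {"rain": [1.0, 2.0, 3.0], "flow": [12.0, 14.0, 16.0]},
    "stream_b": {"rain": [4.0, 5.0, 6.0], "flow": [8.0, 10.0, 12.0]},
}

# Pooled regression ignores which stream each observation came from.
pooled_x = [v for s in panel.values() for v in s["rain"]]
pooled_y = [v for s in panel.values() for v in s["flow"]]
pooled_slope = ols_slope(pooled_x, pooled_y)

# Fixed effects: demean within each stream, then fit one common slope.
within_x = [v for s in panel.values() for v in demean(s["rain"])]
within_y = [v for s in panel.values() for v in demean(s["flow"])]
fixed_effects_slope = ols_slope(within_x, within_y)

print(pooled_slope, fixed_effects_slope)
```

In this constructed case the pooled slope is negative because the wetter stream happens to have the lower intercept, while the within-stream estimate returns the shared slope.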
Panel regressions to estimate low-flow response to rainfall variability in ungaged basins

NASA Astrophysics Data System (ADS)

Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.

2016-12-01

Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
[RS estimation of inventory parameters and carbon storage of moso bamboo forest based on synergistic use of object-based image analysis and decision tree].

PubMed

Du, Hua Qiang; Sun, Xiao Yan; Han, Ning; Mao, Fang Jie

2017-10-01

By synergistically using the object-based image analysis (OBIA) and the classification and regression tree (CART) methods, the distribution information, the indexes (including diameter at breast height, tree height, and crown closure), and the aboveground carbon storage (AGC) of moso bamboo forest in Shanchuan Town, Anji County, Zhejiang Province were investigated. The results showed that the moso bamboo forest could be accurately delineated by integrating the multi-scale image segmentation in the OBIA technique and CART, which connected the image objects at various scales, with a good producer's accuracy of 89.1%. The indexes estimated by a regression tree model constructed from the features extracted from the image objects reached normal or better accuracy, with the crown closure model achieving the best estimating accuracy of 67.9%. The estimating accuracy of diameter at breast height and tree height was relatively low, which was consistent with the conclusion that estimating diameter at breast height and tree height using optical remote sensing could not achieve satisfactory results. Estimation of AGC reached relatively high accuracy, and accuracy in the high-value region exceeded 80%.
Novel spectrophotometric determination of chloramphenicol and dexamethasone in the presence of non labeled interfering substances using univariate methods and multivariate regression model updating

NASA Astrophysics Data System (ADS)

Hegazy, Maha A.; Lotfy, Hayam M.; Rezk, Mamdouh R.; Omran, Yasmin Rostom

2015-04-01

Smart and novel spectrophotometric and chemometric methods have been developed and validated for the simultaneous determination of a binary mixture of chloramphenicol (CPL) and dexamethasone sodium phosphate (DSP) in presence of interfering substances without prior separation. The first method depends upon derivative subtraction coupled with constant multiplication. The second one is ratio difference method at optimum wavelengths which were selected after applying derivative transformation method via multiplying by a decoding spectrum in order to cancel the contribution of non labeled interfering substances. The third method relies on partial least squares with regression model updating. They are so simple that they do not require any preliminary separation steps. Accuracy, precision and linearity ranges of these methods were determined. Moreover, specificity was assessed by analyzing synthetic mixtures of both drugs. The proposed methods were successfully applied for analysis of both drugs in their pharmaceutical formulation. The obtained results have been statistically compared to that of an official spectrophotometric method to give a conclusion that there is no significant difference between the proposed methods and the official ones with respect to accuracy and precision.
Social Determinants of Chronic Prostatitis/Chronic Pelvic Pain Syndrome Related Lifestyle and Behaviors among Urban Men in China: A Case-Control Study

PubMed Central

Chen, Chen; Chen, Liang; Han, Qingrong; Ye, Huarong

2016-01-01

Purpose. In order to find key risk factors of chronic prostatitis/chronic pelvic pain syndrome (CP/CPPS) among urban men in China, an age-matched case-control study was performed from September 2012 to May 2013 in Yichang, Hubei Province, China. Methodology. A total of 279 patients and 558 controls were recruited in this study. Data were collected by a self-administered questionnaire, including demographics, diet and lifestyle, psychological status, and a physical exam. A conditional logistic regression model was used to analyze the collected data. Results. Exposure to chemical factors, night shift, severity of mood, and poor self-health cognition were entered into the regression model, and the results showed that these four factors had odds ratios of 1.929 (95% CI, 1.321–2.819), 1.456 (95% CI, 1.087–1.949), 1.619 (95% CI, 1.280–2.046), and 1.304 (95% CI, 1.094–1.555), respectively, which suggested that these four factors could significantly affect CP/CPPS. Conclusion. These results suggest that many factors affect CP/CPPS, including biological, social, and psychological factors. PMID:27579305
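The reported odds ratios and confidence intervals can be turned back into log-odds coefficients and approximate standard errors, which shows why an interval that excludes 1 corresponds to a significant effect. This is a minimal sketch that assumes Wald-type intervals symmetric on the log scale (the abstract does not say how the intervals were computed); the dictionary keys and function name are illustrative.

```python
import math

# Odds ratios and 95% confidence limits as reported in the abstract.
REPORTED = {
    "chemical_exposure": (1.929, 1.321, 2.819),
    "night_shift": (1.456, 1.087, 1.949),
    "severity_of_mood": (1.619, 1.280, 2.046),
    "poor_self_health_cognition": (1.304, 1.094, 1.555),
}

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def log_odds_summary(odds_ratio, lower, upper):
    """Return (coefficient, standard error, z statistic) on the log-odds scale."""
    beta = math.log(odds_ratio)
    # Assumes the interval is exp(beta -/+ 1.96 * SE).
    standard_error = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return beta, standard_error, beta / standard_error


for name, (odds_ratio, lower, upper) in REPORTED.items():
    beta, se, z = log_odds_summary(odds_ratio, lower, upper)
    print(f"{name}: beta={beta:.3f} se={se:.3f} z={z:.2f}")
```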
A generalized regression model of arsenic variations in the shallow groundwater of Bangladesh

PubMed Central

Taylor, Richard G.; Chandler, Richard E.

2015-01-01

Abstract Localized studies of arsenic (As) in Bangladesh have reached disparate conclusions regarding the impact of irrigation‐induced recharge on As concentrations in shallow (≤50 m below ground level) groundwater. We construct generalized regression models (GRMs) to describe observed spatial variations in As concentrations in shallow groundwater both (i) nationally, and (ii) regionally within Holocene deposits where As concentrations in groundwater are generally high (>10 μg L−1). At these scales, the GRMs reveal statistically significant inverse associations between observed As concentrations and two covariates: (1) hydraulic conductivity of the shallow aquifer and (2) net increase in mean recharge between predeveloped and developed groundwater‐fed irrigation periods. Further, the GRMs show that the spatial variation of groundwater As concentrations is well explained by not only surface geology but also statistical interactions (i.e., combined effects) between surface geology and mean groundwater recharge, thickness of surficial silt and clay, and well depth. Net increases in recharge result from intensive groundwater abstraction for irrigation, which induces additional recharge where it is enabled by a permeable surface geology. Collectively, these statistical associations indicate that irrigation‐induced recharge serves to flush mobile As from shallow groundwater. PMID:27524841
Differences by Sexual Orientation in Expectations About Future Long-Term Care Needs Among Adults 40 to 65 Years Old

PubMed Central

Gonzales, Gilbert; Shippee, Tetyana P.

2015-01-01

Objectives. We examined whether and how lesbian, gay, and bisexual (LGB) adults between 40 and 65 years of age differ from heterosexual adults in long-term care (LTC) expectations. Methods. Our data were derived from the 2013 National Health Interview Survey. We used ordered logistic regression to compare the odds of expected future use of LTC among LGB (n = 297) and heterosexual (n = 13 120) adults. We also used logistic regression models to assess the odds of expecting to use specific sources of care. All models controlled for key socioeconomic characteristics. Results. Although LGB adults had greater expectations of needing LTC in the future than their heterosexual counterparts, that association was largely explained by sociodemographic and health differences. After control for these differentials, LGB adults were less likely to expect care from family and more likely to expect to use institutional care in old age. Conclusions. LGB adults may rely more heavily than heterosexual adults on formal systems of care. As the older population continues to diversify, nursing homes and assisted living facilities should work to ensure safety and culturally sensitive best practices for older LGB groups. PMID:26378822
The Impact of Green Stormwater Infrastructure Installation on Surrounding Health and Safety

PubMed Central

Low, Sarah C.; Henning, Jason; Branas, Charles C.

2015-01-01

Objectives. We investigated the health and safety effects of urban green stormwater infrastructure (GSI) installments. Methods. We conducted a difference-in-differences analysis of the effects of GSI installments on health (e.g., blood pressure, cholesterol and stress levels) and safety (e.g., felonies, nuisance and property crimes, narcotics crimes) outcomes from 2000 to 2012 in Philadelphia, Pennsylvania. We used mixed-effects regression models to compare differences in pre- and posttreatment measures of outcomes for treatment sites (n = 52) and randomly chosen, matched control sites (n = 186) within multiple geographic extents surrounding GSI sites. Results. Regression-adjusted models showed consistent and statistically significant reductions in narcotics possession (18%–27% less) within 16th-mile, quarter-mile, half-mile (P < .001), and eighth-mile (P < .01) distances from treatment sites and at the census tract level (P < .01). Narcotics manufacture and burglaries were also significantly reduced at multiple scales. Nonsignificant reductions in homicides, assaults, thefts, public drunkenness, and narcotics sales were associated with GSI installation in at least 1 geographic extent. Conclusions. Health and safety considerations should be included in future assessments of GSI programs. Subsequent studies should assess mechanisms of this association. PMID:25602887
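The core contrast behind a difference-in-differences analysis can be sketched in a few lines. The counts below are hypothetical and only illustrate the arithmetic; the study itself estimated the effect with mixed-effects regression models on matched sites, which this sketch does not reproduce.

```python
def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical mean incident counts per site (illustrative values only).
effect = difference_in_differences(
    treated_pre=40.0,
    treated_post=28.0,
    control_pre=38.0,
    control_post=36.0,
)
print(effect)
```

Subtracting the control-site change removes trends shared by both groups, so the remaining difference is attributed to the installation under the usual parallel-trends assumption.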
Low Survival Rates of Oral and Oropharyngeal Squamous Cell Carcinoma

PubMed Central

da Silva Júnior, Francisco Feliciano; dos Santos, Karine de Cássia Batista; Ferreira, Stefania Jeronimo

2017-01-01

Aim To assess the epidemiological and clinical factors that influence the prognosis of oral and oropharyngeal squamous cell carcinoma (SCC). Methods One hundred and twenty-one cases of oral and oropharyngeal SCC were selected. The survival curves for each variable were estimated using the Kaplan-Meier method. The Cox regression model was applied to assess the effect of the variables on survival. Results Cancers at an advanced stage were observed in 103 patients (85.1%). Cancers on the tongue were more frequent (23.1%). The survival rate was 59.9% at one year, 40.7% at two years, and 27.8% at five years. Significantly lower survival was linked to alcohol intake (p = 0.038), advanced cancer staging (p = 0.003), and treatment without surgery (p < 0.001). When these variables were included in the Cox regression model, only surgery (p = 0.005) demonstrated a significant effect on survival. Conclusion The findings suggest that patients who underwent surgery had a greater survival rate compared with those who did not. The low survival rates and the high percentage of patients diagnosed at advanced stages demonstrate that oral and oropharyngeal cancer patients should receive more attention. PMID:28638410
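For readers unfamiliar with the Kaplan-Meier method named in this abstract, the following is a minimal sketch of the product-limit estimator. The follow-up times and event flags are hypothetical, and the function name is illustrative; it is not the authors' analysis code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up durations; events: 1 if death was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical follow-up in months (illustrative values only).
example_curve = kaplan_meier(
    times=[6, 12, 12, 24, 36, 60],
    events=[1, 1, 0, 1, 0, 1],
)
for month, probability in example_curve:
    print(month, round(probability, 3))
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is what distinguishes this estimator from a simple proportion surviving.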
Addictive Internet Use among Korean Adolescents: A National Survey

PubMed Central

Heo, Jongho; Oh, Juhwan; Subramanian, S. V.; Kim, Yoon; Kawachi, Ichiro

2014-01-01

Background A psychological disorder called ‘Internet addiction’ has newly emerged along with a dramatic increase of worldwide Internet use. However, few studies have used population-level samples or taken into account contextual factors in Internet addiction. Methods and Findings We identified 57,857 middle and high school students (13–18 year olds) from a Korean nationally representative survey conducted in 2009. To identify factors associated with addictive Internet use, two-level multilevel regression models were fitted with individual-level responses (1st level) nested within schools (2nd level) to estimate associations of individual and school characteristics simultaneously. Gender differences in addictive Internet use were estimated with the regression model stratified by gender. Significant associations were found between addictive Internet use and school grade, parental education, alcohol use, tobacco use, and substance use. Female students in girls' schools were more likely to use the Internet addictively than those in coeducational schools. Our results also revealed significant gender differences in the individual- and school-level factors associated with addictive Internet use. Conclusions Our results suggest that multilevel risk factors along with gender differences should be considered to protect adolescents from addictive Internet use. PMID:24505318
Multiresponse semiparametric regression for modelling the effect of regional socio-economic variables on the use of information technology

NASA Astrophysics Data System (ADS)

Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania

2017-03-01

Multiresponse semiparametric regression is a simultaneous-equation regression model that fuses parametric and nonparametric models. The regression model comprises several models, each with two components, parametric and nonparametric. The model used here has a linear function as the parametric component and a truncated polynomial spline as the nonparametric component. It can handle both linear and nonlinear relationships between the responses and the sets of predictor variables. The aim of this paper is to demonstrate the application of the regression model to the effect of regional socio-economic variables on the use of information technology. More specifically, the response variables are the percentage of households with access to the internet and the percentage of households with a personal computer. The predictor variables are the percentage of literate people, the percentage of electrification and the percentage of economic growth. Based on the identified relationship between the response and predictor variables, economic growth is treated as a nonparametric predictor and the others as parametric predictors. The results show that multiresponse semiparametric regression fits well, as indicated by the high coefficient of determination, 90 percent.
Linear regression metamodeling as a tool to summarize and present simulation model results.

PubMed

Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M

2013-10-01

Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
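The statement that the regression intercept represents the base-case outcome follows from standardizing the inputs, and a one-parameter sketch makes this concrete. The toy decision model, its parameter values and all names below are hypothetical stand-ins (the study used a cancer cure model with 10,000 simulated cohorts, which is not reproduced here).

```python
import statistics


def toy_net_benefit(p_cure):
    """Hypothetical linear decision model, not the model from the study."""
    return 50000.0 * p_cure - 10000.0


# Hypothetical values of one input parameter; a real PSA would draw these at random.
p_values = [0.2, 0.3, 0.4, 0.5, 0.6]
outcomes = [toy_net_benefit(p) for p in p_values]

# Standardize the input so that it has mean 0 and standard deviation 1.
p_mean = statistics.mean(p_values)
p_sd = statistics.stdev(p_values)
z = [(p - p_mean) / p_sd for p in p_values]

# One-predictor ordinary least squares of the outcome on the standardized input.
z_mean = statistics.mean(z)
y_mean = statistics.mean(outcomes)
slope = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, outcomes)) / sum(
    (zi - z_mean) ** 2 for zi in z
)
intercept = y_mean - slope * z_mean

print(intercept, slope)
```

Because the standardized input has mean zero, the intercept equals the outcome at the mean parameter value, and the slope is the change in outcome per one standard deviation of the parameter.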

Mitigating Errors in External Respiratory Surrogate-Based Models of Tumor Position

DOE Office of Scientific and Technical Information (OSTI.GOV)

Malinowski, Kathleen T.; Fischell Department of Bioengineering, University of Maryland, College Park, MD; McAvoy, Thomas J.

2012-04-01

Purpose: To investigate the effect of tumor site, measurement precision, tumor-surrogate correlation, training data selection, model design, and interpatient and interfraction variations on the accuracy of external marker-based models of tumor position. Methods and Materials: Cyberknife Synchrony system log files comprising synchronously acquired positions of external markers and the tumor from 167 treatment fractions were analyzed. The accuracy of Synchrony, ordinary-least-squares regression, and partial-least-squares regression models for predicting the tumor position from the external markers was evaluated. The quantity and timing of the data used to build the predictive model were varied. The effects of tumor-surrogate correlation and the precision in both the tumor and the external surrogate position measurements were explored by adding noise to the data. Results: The tumor position prediction errors increased during the duration of a fraction. Increasing the training data quantities did not always lead to more accurate models. Adding uncorrelated noise to the external marker-based inputs degraded the tumor-surrogate correlation models by 16% for partial-least-squares and 57% for ordinary-least-squares. External marker and tumor position measurement errors led to tumor position prediction changes 0.3-3.6 times the magnitude of the measurement errors, varying widely with model algorithm. The tumor position prediction errors were significantly associated with the patient index but not with the fraction index or tumor site. Partial-least-squares was as accurate as Synchrony and more accurate than ordinary-least-squares. Conclusions: The accuracy of surrogate-based inferential models of tumor position was affected by all the investigated factors, except for the tumor site and fraction index.
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression

PubMed Central

Weiss, Brandi A.; Dardick, William

2015-01-01

This article introduces an entropy-based measure of data–model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data–model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data–model fit to assess how well logistic regression models classify cases into observed categories. PMID:29795897
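The abstract above does not state the formula for its entropy measure. As a minimal sketch, assuming the normalized-entropy form commonly used in mixture modeling (one minus the average classification entropy divided by its maximum), the idea can be illustrated for a two-category logistic model as follows. The function name and the sample probabilities are hypothetical and are not taken from the article.

```python
import math

def normalized_entropy(probs):
    """Return 1 - (total binary entropy) / (n * ln 2).

    `probs` holds each case's predicted probability of belonging to
    category 1. Values near 1 indicate sharp classification; values
    near 0 indicate fuzzy classification (probabilities close to 0.5).
    This form is an assumption borrowed from mixture modeling.
    """
    total = 0.0
    for p in probs:
        for q in (p, 1.0 - p):
            if q > 0.0:  # treat 0 * ln(0) as 0
                total -= q * math.log(q)
    return 1.0 - total / (len(probs) * math.log(2))

# Hypothetical predicted probabilities from two fitted models
sharp = normalized_entropy([0.99, 0.01, 0.98, 0.02])
fuzzy = normalized_entropy([0.55, 0.45, 0.50, 0.50])
print(sharp > fuzzy)
```

Under this assumed definition, a model whose predicted probabilities sit near 0 or 1 scores higher than one whose predictions hover near 0.5, which is the kind of separation the abstract describes.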
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression.

PubMed

Weiss, Brandi A; Dardick, William

2016-12-01

This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data-model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data-model fit to assess how well logistic regression models classify cases into observed categories.
A potent combination of the novel PI3K inhibitor, GDC-0941, with imatinib in gastrointestinal stromal tumor xenografts: long-lasting responses after treatment withdrawal

PubMed Central

Floris, Giuseppe; Wozniak, Agnieszka; Sciot, Raf; Li, Haifu; Friedman, Lori; Van Looy, Thomas; Wellens, Jasmien; Vermaelen, Peter; Deroose, Christophe M.; Fletcher, Jonathan A.; Debiec-Rychter, Maria; Schöffski, Patrick

2015-01-01

Introduction Oncogenic signaling in gastrointestinal stromal tumors (GIST) is sustained via the PI3K/AKT pathway. We used a panel of six GIST xenograft models to assess the efficacy of GDC-0941 as a single agent or in combination with imatinib (IMA). Experimental design Nude mice (n=136) were grafted bilaterally with human GIST carrying diverse KIT mutations. Mice were orally dosed over four weeks, grouped as follows: A) control; B) GDC-0941; C) IMA and D) GDC+IMA treatments. Xenograft re-growth after treatment discontinuation was assessed in groups C and D for an additional four weeks. Tumor response was assessed by volume measurements, micro-PET imaging, histopathology and immunoblotting. Moreover, genomic alterations in the PTEN/PI3K/AKT pathway were evaluated. Results In all models, GDC-0941 caused tumor growth stabilization, inhibiting tumor cell proliferation, but did not induce apoptosis. Under GDC+IMA, profound tumor regression, superior to either treatment alone, was observed. This effect was associated with the best histologic response, a nearly complete proliferation arrest and increased apoptosis. Tumor re-growth assays confirmed superior activity of GDC+IMA over IMA; in three out of six models tumor volume remained reduced and stable even after treatment discontinuation. A positive correlation between response to GDC+IMA and PTEN loss, both on gene and protein levels, was found. Conclusion GDC+IMA has significant antitumor efficacy in GIST xenografts, inducing more substantial tumor regression, apoptosis and durable effects than IMA. Notably, after treatment withdrawal, tumor regression was sustained in tumors exposed to GDC+IMA, which was not observed under IMA. Assessment of PTEN status may represent a useful predictive biomarker for patient selection. PMID:23231951
Factors associated with interest in novel interfaces for upper limb prosthesis control

PubMed Central

Engdahl, Susannah M.; Chestek, Cynthia A.; Kelly, Brian; Davis, Alicia

2017-01-01

Background Surgically invasive interfaces for upper limb prosthesis control may allow users to operate advanced, multi-articulated devices. Given the potential medical risks of these invasive interfaces, it is important to understand what factors influence an individual’s decision to try one. Methods We conducted an anonymous online survey of individuals with upper limb loss. A total of 232 participants provided personal information (such as age, amputation level, etc.) and rated how likely they would be to try noninvasive (myoelectric) and invasive (targeted muscle reinnervation, peripheral nerve interfaces, cortical interfaces) interfaces for prosthesis control. Bivariate relationships between interest in each interface and 16 personal descriptors were examined. Significant variables from the bivariate analyses were then entered into multiple logistic regression models to predict interest in each interface. Results While many of the bivariate relationships were significant, only a few variables remained significant in the regression models. The regression models showed that participants were more likely to be interested in all interfaces if they had unilateral limb loss (p ≤ 0.001, odds ratio ≥ 2.799). Participants were more likely to be interested in the three invasive interfaces if they were younger (p < 0.001, odds ratio ≤ 0.959) and had acquired limb loss (p ≤ 0.012, odds ratio ≥ 3.287). Participants who used a myoelectric device were more likely to be interested in myoelectric control than those who did not (p = 0.003, odds ratio = 24.958). Conclusions Novel prosthesis control interfaces may be accepted most readily by individuals who are young, have unilateral limb loss, and/or have acquired limb loss. However, this analysis did not include all possible factors that may have influenced participants’ opinions on the interfaces, so additional exploration is warranted. PMID:28767716
Co-occurring risk factors for current cigarette smoking in a U.S. nationally representative sample

PubMed Central

Higgins, Stephen T.; Kurti, Allison N.; Redner, Ryan; White, Thomas J.; Keith, Diana R.; Gaalema, Diann E.; Sprague, Brian L.; Stanton, Cassandra A.; Roberts, Megan E.; Doogan, Nathan J.; Priest, Jeff S.

2016-01-01

Introduction Relatively little has been reported characterizing cumulative risk associated with co-occurring risk factors for cigarette smoking. The purpose of the present study was to address that knowledge gap in a U.S. nationally representative sample. Methods Data were obtained from 114,426 adults (≥ 18 years) in the U.S. National Survey on Drug Use and Health (years 2011–13). Multiple logistic regression and classification and regression tree (CART) modeling were used to examine risk of current smoking associated with eight co-occurring risk factors (age, gender, race/ethnicity, educational attainment, poverty, drug abuse/dependence, alcohol abuse/dependence, mental illness). Results Each of these eight risk factors was independently associated with significant increases in the odds of smoking when concurrently present in a multiple logistic regression model. Effects of risk-factor combinations were typically summative. Exceptions to that pattern were in the direction of less-than-summative effects when one of the combined risk factors was associated with generally high or low rates of smoking (e.g., drug abuse/dependence, age ≥65). CART modeling identified subpopulation risk profiles wherein smoking prevalence varied from a low of 11% to a high of 74% depending on particular risk factor combinations. Being a college graduate was the strongest independent predictor of smoking status, classifying 30% of the adult population. Conclusions These results offer strong evidence that the effects associated with common risk factors for cigarette smoking are independent, cumulative, and generally summative. The results also offer potentially useful insights into national population risk profiles around which U.S. tobacco policies can be developed or refined. PMID:26902875
SU-E-J-03: A Comprehensive Comparison Between Alpha and Beta Emitters for Cancer Radioimmunotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huang, C.Y.; Guatelli, S; Oborn, B

2014-06-01

Purpose: The purpose of this study is to perform a comprehensive comparison of the therapeutic efficacy and cytotoxicity of alpha and beta emitters for Radioimmunotherapy (RIT). For each stage of cancer development, specific models were built for the separate objectives of RIT to be addressed: a) kill isolated cancer cells in transit in the lymphatic and vascular circulation, b) regress avascular cell clusters, c) regress tumor vasculature and tumors. Methods: Because of the nature of short range, high LET alpha and long energy beta radiation and heterogeneous antigen expression among cancer cells, the microdosimetric approach is essential for the RIT assessment. Geant4-based microdosimetric models are developed for the three different stages of cancer progression: cancer cells, cell clusters and tumors. The energy deposition and specific energy resulting from different source distributions in the three models were calculated separately for 4 alpha emitting radioisotopes (²¹¹At, ²¹³Bi, ²²³Ra and ²²⁵Ac) and 6 beta emitters (³²P, ³³P, ⁶⁷Cu, ⁹⁰Y, ¹³¹I and ¹⁷⁷Lu). The cell survival, therapeutic efficacy and cytotoxicity are determined and compared between alpha and beta emitters. Results: We show that internal targeted alpha radiation has advantages over beta radiation for killing isolated cancer cells, regressing small cell clusters and also solid tumors. Alpha particles have much higher dose specificity and potency than beta particles. They can deposit 3 logs more dose than beta emitters to single cells and solid tumor. Tumor control probability relies on deep penetration of radioisotopes to cancer cell clusters and solid tumors. Conclusion: The results of this study provide a quantitative understanding of the efficacy and cytotoxicity of RIT for each stage of cancer development.
Retinal nerve fibre layer thinning is associated with drug resistance in epilepsy

PubMed Central

Balestrini, Simona; Clayton, Lisa M S; Bartmann, Ana P; Chinthapalli, Krishna; Novy, Jan; Coppola, Antonietta; Wandschneider, Britta; Stern, William M; Acheson, James; Bell, Gail S; Sander, Josemir W; Sisodiya, Sanjay M

2016-01-01

Objective Retinal nerve fibre layer (RNFL) thickness is related to the axonal anterior visual pathway and is considered a marker of overall white matter ‘integrity’. We hypothesised that RNFL changes would occur in people with epilepsy, independently of vigabatrin exposure, and be related to clinical characteristics of epilepsy. Methods Three hundred people with epilepsy attending specialist clinics and 90 healthy controls were included in this cross-sectional cohort study. RNFL imaging was performed using spectral-domain optical coherence tomography (OCT). Drug resistance was defined as failure of adequate trials of two antiepileptic drugs to achieve sustained seizure freedom. Results The average RNFL thickness and the thickness of each of the 90° quadrants were significantly thinner in people with epilepsy than healthy controls (p<0.001, t test). In a multivariate logistic regression model, drug resistance was the only significant predictor of abnormal RNFL thinning (OR=2.09, 95% CI 1.09 to 4.01, p=0.03). Duration of epilepsy (coefficient −0.16, p=0.004) and presence of intellectual disability (coefficient −4.0, p=0.044) also showed a significant relationship with RNFL thinning in a multivariate linear regression model. Conclusions Our results suggest that people with epilepsy with no previous exposure to vigabatrin have a significantly thinner RNFL than healthy participants. Drug resistance emerged as a significant independent predictor of RNFL borderline attenuation or abnormal thinning in a logistic regression model. As this is easily assessed by OCT, RNFL thickness might be used to better understand the mechanisms underlying drug resistance, and possibly severity. Longitudinal studies are needed to confirm our findings. PMID:25886782
Air Pollution and Cognitive Development at Age 7 in a Prospective Italian Birth Cohort.

PubMed

Porta, Daniela; Narduzzi, Silvia; Badaloni, Chiara; Bucci, Simone; Cesaroni, Giulia; Colelli, Valentina; Davoli, Marina; Sunyer, Jordi; Zirro, Eleonora; Schwartz, Joel; Forastiere, Francesco

2016-03-01

Early life exposure to air pollution has been linked with cognitive impairment in children, but the results have not been conclusive. We analyzed the association between traffic-related air pollution and cognitive function in a prospective birth cohort in Rome. A cohort of 719 newborns was enrolled in 2003-2004 as part of the GASPII project. At age 7 years, 474 children took the Wechsler Intelligence Scale for Children-III to assess their cognitive development in terms of IQ composite scores. Exposure to air pollutants (NO2, PMcoarse, PM2.5, PM2.5 absorbance) at birth was assessed using land use regression models. We also considered variables indicating traffic intensity. The effect of environmental pollution on IQ was evaluated performing a linear regression model for each outcome, adjusting for gender, child age at cognitive test, maternal age at delivery, parental educational level, siblings, socio-economic status, maternal smoking during pregnancy, and tester. To account for selection bias at enrollment and during follow-up, the regression models were weighted for the inverse probabilities of participation and follow-up. A 10 μg/m³ higher NO2 exposure during pregnancy was associated with 1.4 fewer points (95% confidence interval = -2.6, -0.20) of verbal IQ, and 1.4 fewer points (95% confidence interval = -2.7, -0.20) of verbal comprehension IQ. Similar associations were found for traffic intensity in a 100 m buffer around home. Other pollutants showed negative associations with larger confidence intervals. Consistent with previous evidence, this study suggests an association of exposure to NO2 and traffic intensity with the verbal area of cognitive development. See Video Abstract at http://links.lww.com/EDE/B12.
Alterations of the Tunica Vasculosa Lentis in the Rat Model of Retinopathy of Prematurity

PubMed Central

Favazza, Tara L; Tanimoto, Naoyuki; Munro, Robert J.; Beck, Susanne C.; Garrido, Marina G.; Seide, Christina; Sothilingam, Vithiyanjali; Hansen, Ronald M.; Fulton, Anne B.; Seeliger, Mathias W.; Akula, James D

2013-01-01

Purpose To study the relation between retinal and tunica vasculosa lentis (TVL) disease in ROP. Although the clinical hallmark of retinopathy of prematurity (ROP) is abnormal retinal blood vessels, the vessels of the anterior segment, including the TVL, are also altered. Methods ROP was induced in Long Evans pigmented and Sprague-Dawley albino rats; room-air-reared (RAR) rats served as controls. Then, fluorescein angiographic images of the TVL and retinal vessels were serially obtained with a scanning laser ophthalmoscope (SLO) near the height of retinal vascular disease, ∼20 days-of-age, and again at 30 and 64 days-of-age. Additionally, electroretinograms (ERGs) were obtained prior to the first imaging session. The TVL images were analyzed for percent coverage of the posterior lens. The tortuosity of the retinal arterioles was determined using Retinal Image multiScale Analysis (RISA; Gelman et al., 2005). Results In the youngest ROP rats, the TVL was dense, while in RAR rats, it was relatively sparse. By 30 days, the TVL in RAR rats had almost fully regressed, while in ROP rats it was still pronounced. By the final test age, the TVL had completely regressed in both ROP and RAR rats. In parallel, the tortuous retinal arterioles in ROP rats resolved with increasing age. ERG components indicating postreceptoral dysfunction, the b-wave and oscillatory potentials (OPs), were attenuated in ROP rats. Conclusions These findings underscore the retinal vascular abnormalities and, for the first time, show abnormal anterior segment vasculature in the rat model of ROP. There is delayed regression of the TVL in the rat model of ROP. This demonstrates that ROP is a disease of the whole eye. PMID:23748796
The Roles of IL-6, IL-10, and IL-1RA in Obesity and Insulin Resistance in African-Americans

PubMed Central

Doumatey, Ayo; Huang, Hanxia; Zhou, Jie; Chen, Guanjie; Shriner, Daniel; Adeyemo, Adebowale

2011-01-01

Objective: The aim of the study was to investigate the associations between IL-1 receptor antagonist (IL-1RA), IL-6, IL-10, measures of obesity, and insulin resistance in African-Americans. Research Design and Methods: Nondiabetic participants (n = 1025) of the Howard University Family Study were investigated for associations between serum IL (IL-1RA, IL-6, IL-10), measures of obesity, and insulin resistance, with adjustment for age and sex. Measures of obesity included body mass index, waist circumference, hip circumference, waist-to-hip ratio, and percent fat mass. Insulin resistance was assessed using the homeostasis model assessment of insulin resistance (HOMA-IR). Data were analyzed with R statistical software using linear regression and likelihood ratio tests. Results: IL-1RA and IL-6 were associated with measures of obesity and insulin resistance, explaining 4–12.7% of the variance observed (P values < 0.001). IL-1RA was bimodally distributed and therefore was analyzed based on grouping those with low vs. high IL-1RA levels. High IL-1RA explained up to 20 and 12% of the variance in measures of obesity and HOMA-IR, respectively. Among the IL, only high IL-1RA improved the fit of models regressing HOMA-IR on measures of obesity. In contrast, all measures of obesity improved the fit of models regressing HOMA-IR on IL. IL-10 was not associated with obesity measures or HOMA-IR. Conclusions: High IL-1RA levels and obesity measures are associated with HOMA-IR in this population-based sample of African-Americans. The results suggest that obesity and increased levels of IL-1RA both contribute to the development of insulin resistance. PMID:21956416
Predicting Quantitative Traits With Regression Models for Dense Molecular Markers and Pedigree

PubMed Central

de los Campos, Gustavo; Naya, Hugo; Gianola, Daniel; Crossa, José; Legarra, Andrés; Manfredi, Eduardo; Weigel, Kent; Cotes, José Miguel

2009-01-01

The availability of genomewide dense markers brings opportunities and challenges to breeding programs. An important question concerns the ways in which dense markers and pedigrees, together with phenotypic records, should be used to arrive at predictions of genetic values for complex traits. If a large number of markers are included in a regression model, marker-specific shrinkage of regression coefficients may be needed. For this reason, the Bayesian least absolute shrinkage and selection operator (LASSO) (BL) appears to be an interesting approach for fitting marker effects in a regression model. This article adapts the BL to arrive at a regression model where markers, pedigrees, and covariates other than markers are considered jointly. Connections between BL and other marker-based regression models are discussed, and the sensitivity of BL with respect to the choice of prior distributions assigned to key parameters is evaluated using simulation. The proposed model was fitted to two data sets from wheat and mouse populations, and evaluated using cross-validation methods. Results indicate that inclusion of markers in the regression further improved the predictive ability of models. An R program that implements the proposed model is freely available. PMID:19293140
Support vector methods for survival analysis: a comparison between ranking and regression approaches.

PubMed

Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K

2011-10-01

To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. In a second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategy, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparison of the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches for survival data. We compare SVM-based survival models based on ranking constraints, based on regression constraints and models based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significantly different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints above models only based on ranking constraints. This work gives empirical evidence that SVM-based models using regression constraints perform significantly better than SVM-based models based on ranking constraints.
Our experiments show a comparable performance for methods including only regression or both regression and ranking constraints on clinical data. On high dimensional data, the former model performs better. However, this approach does not have a theoretical link with standard statistical models for survival data. This link can be made by means of transformation models when ranking constraints are included. Copyright © 2011 Elsevier B.V. All rights reserved.
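The concordance index named in the abstract above can be illustrated with a short sketch. This is the standard pairwise definition for right-censored data (Harrell's C), not the authors' SVM implementation. It assumes that a higher score means a longer predicted survival, and the sample data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Fraction of comparable pairs ranked in the right order.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the case was censored
    scores : predicted values; higher is assumed to mean longer survival

    A pair (i, j) is comparable when case i had an observed event
    strictly before the time recorded for case j. Tied scores count
    as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] < scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical data: the third case is censored at time 3
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.1, 0.2, 0.3, 0.4])
reversed_order = concordance_index(times, events, [0.4, 0.3, 0.2, 0.1])
print(perfect, reversed_order)
```

Censored cases contribute only as the later member of a pair, which is why the index can be computed without knowing their true event times.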
Construction of mathematical model for measuring material concentration by colorimetric method

NASA Astrophysics Data System (ADS)

Liu, Bing; Gao, Lingceng; Yu, Kairong; Tan, Xianghua

2018-06-01

This paper uses multiple linear regression to analyze the data of Problem C of the 2017 mathematical modeling contest. First, we established regression models for the concentrations of 5 substances, but only the regression model for the concentration of urea in milk passed the significance test. The regression model established from the second set of data also passed the significance test, but this model suffers from serious multicollinearity. We improved the model by principal component analysis. The improved model is used to control the system so that it is possible to measure the concentration of material by the direct colorimetric method.
Developing and testing a global-scale regression model to quantify mean annual streamflow

NASA Astrophysics Data System (ADS)

Barbarossa, Valerio; Huijbregts, Mark A. J.; Hendriks, A. Jan; Beusen, Arthur H. W.; Clavreul, Julie; King, Henry; Schipper, Aafke M.

2017-01-01

Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF based on a dataset unprecedented in size, using observations of discharge and catchment characteristics from 1885 catchments worldwide, measuring between 2 and 10⁶ km². In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area and catchment averaged mean annual precipitation and air temperature, slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error (RMSE) values were lower (0.29-0.38 compared to 0.49-0.57) and the modified index of agreement (d) was higher (0.80-0.83 compared to 0.72-0.75). Our regression model can be applied globally to estimate MAF at any point of the river network, thus providing a feasible alternative to spatially explicit process-based global hydrological models.
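The two performance measures quoted in the abstract above can be sketched as follows. The abstract does not define them, so this assumes the usual RMSE and the absolute-difference (Willmott-style) form of the modified index of agreement. The function names and sample values are hypothetical, and any transformation the authors applied to the flows before scoring is not reproduced here.

```python
import math

def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def modified_index_of_agreement(obs, pred):
    """Assumed form: d = 1 - sum|O - P| / sum(|P - mean(O)| + |O - mean(O)|).

    d equals 1 for perfect agreement and decreases toward 0 as the
    prediction errors grow relative to the spread around the observed mean.
    """
    mean_obs = sum(obs) / len(obs)
    num = sum(abs(o - p) for o, p in zip(obs, pred))
    den = sum(abs(p - mean_obs) + abs(o - mean_obs) for o, p in zip(obs, pred))
    return 1.0 - num / den

# Hypothetical observed and predicted values
observed = [1.0, 2.0, 3.0]
print(rmse(observed, [1.0, 2.0, 3.0]),
      modified_index_of_agreement(observed, [1.0, 2.0, 3.0]))
print(rmse(observed, [2.0, 2.0, 2.0]),
      modified_index_of_agreement(observed, [2.0, 2.0, 2.0]))
```

A lower RMSE together with a higher d, as reported for the regression model, points the same way on both measures.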
Predicting U.S. Army Reserve Unit Manning Using Market Demographics

DTIC Science & Technology

2015-06-01

develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.
External Tank Liquid Hydrogen (LH2) Prepress Regression Analysis Independent Review Technical Consultation Report

NASA Technical Reports Server (NTRS)

Parsons, Vickie S.

2009-01-01

The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.
Stochastic Approximation Methods for Latent Regression Item Response Models

ERIC Educational Resources Information Center

von Davier, Matthias; Sinharay, Sandip

2010-01-01

This article presents an application of a stochastic approximation expectation maximization (EM) algorithm using a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models are extensions of item response theory (IRT) to a latent variable model with covariates…
Using Weighted Least Squares Regression for Obtaining Langmuir Sorption Constants

USDA-ARS's Scientific Manuscript database

One of the most commonly used models for describing phosphorus (P) sorption to soils is the Langmuir model. To obtain model parameters, the Langmuir model is fit to measured sorption data using least squares regression. Least squares regression is based on several assumptions including normally dist...
Regression analysis using dependent Polya trees.

PubMed

Schörgendorfer, Angela; Branscum, Adam J

2013-11-30

Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

The microbiological profile and presence of bloodstream infection influence mortality rates in necrotizing fasciitis

PubMed Central

2011-01-01

Introduction Necrotizing fasciitis (NF) is a life-threatening infectious disease with a high mortality rate. We carried out a microbiological characterization of the causative pathogens. We investigated the correlation of mortality in NF with bloodstream infection and with the presence of co-morbidities. Methods In this retrospective study, we analyzed 323 patients who presented with necrotizing fasciitis at two different institutions. Bloodstream infection (BSI) was defined as a positive blood culture result. The patients were categorized as survivors and non-survivors. Eleven clinically important variables which were statistically significant by univariate analysis were selected for multivariate regression analysis and a stepwise logistic regression model was developed to determine the association between BSI and mortality. Results Univariate logistic regression analysis showed that patients with hypotension, heart disease, liver disease, presence of Vibrio spp. in wound cultures, presence of fungus in wound cultures, and presence of Streptococcus group A, Aeromonas spp. or Vibrio spp. in blood cultures, had a significantly higher risk of in-hospital mortality. Our multivariate logistic regression analysis showed a higher risk of mortality in patients with pre-existing conditions like hypotension, heart disease, and liver disease. Multivariate logistic regression analysis also showed that presence of Vibrio spp. in wound cultures, and presence of Streptococcus group A in blood cultures were associated with a high risk of mortality while debridement ≥ 3 was associated with improved survival. Conclusions Mortality in patients with necrotizing fasciitis was significantly associated with the presence of Vibrio in wound cultures and Streptococcus group A in blood cultures. PMID:21693053
[Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

PubMed

Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

2011-11-01

To explore the application of negative binomial regression and modified Poisson regression in analyzing the influential factors for injury frequency and the risk factors leading to an increase in injury frequency. A total of 2917 primary and secondary school students were selected from Hefei by cluster random sampling and surveyed by questionnaire. The count data on injury events were used to fit modified Poisson regression and negative binomial regression models. The risk factors associated with an increase in unintentional injury frequency among juvenile students were explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model showed over-dispersion (P < 0.0001) based on the Lagrange multiplier test. Therefore, the over-dispersed data were fitted better by the modified Poisson regression and negative binomial regression models. Both showed that male gender, younger age, father working outside of the hometown, the guardian's education level being above junior high school, and smoking might be associated with higher injury frequencies. For injury event data with a tendency toward clustered frequencies, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
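Over-dispersion, the reason the abstract above moves away from the plain Poisson model, can be illustrated with a simple diagnostic. The study used a Lagrange multiplier test. The sketch below swaps in the cruder variance-to-mean ratio, which is about 1 for Poisson-distributed counts and clearly above 1 when counts are over-dispersed. The function name and the injury counts are hypothetical.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean of a list of counts.

    A Poisson model implies variance == mean, so a ratio well above 1
    hints at over-dispersion. This is a descriptive check, not the
    Lagrange multiplier test used in the study.
    """
    return variance(counts) / mean(counts)

# Hypothetical injury counts: most students report none, a few report many
injuries = [0, 0, 0, 1, 0, 5, 0, 0, 7, 0]
ratio = dispersion_index(injuries)
print(ratio > 1)
```

When the ratio is far above 1, standard errors from a plain Poisson fit tend to be too small, which is the motivation for negative binomial or modified (robust-variance) Poisson models.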
Geodesic least squares regression on information manifolds

DOE Office of Scientific and Technical Information (OSTI.GOV)

Verdoolaege, Geert, E-mail: geert.verdoolaege@ugent.be

We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.
Associations among job demands and resources, work engagement, and psychological distress: fixed-effects model analysis in Japan

PubMed Central

Oshio, Takashi; Inoue, Akiomi

2018-01-01

Objectives: We examined the associations among job demands and resources, work engagement, and psychological distress, adjusted for time-invariant individual attributes. Methods: We used data from a Japanese occupational cohort survey, which included 18,702 observations of 7,843 individuals. We investigated how work engagement, measured by the Utrecht Work Engagement Scale, was associated with key aspects of job demands and resources, using fixed-effects regression models. We further estimated the fixed-effects models to assess how work engagement moderated the association between each job characteristic and psychological distress as measured by Kessler 6 scores. Results: The fixed-effects models showed that work engagement was positively associated with job resources, as did pooled cross-sectional and prospective cohort models. Specifically, the standardized regression coefficients (β) were 0.148 and 0.120 for extrinsic reward and decision latitude, respectively, compared to -0.159 and 0.020 for role ambiguity and workload and time pressure, respectively (p < 0.001 for all associations). Work engagement modestly moderated the associations of psychological distress with workload and time pressure and extrinsic reward; a one-standard deviation increase in work engagement moderated their associations by 19.2% (p < 0.001) and 11.3% (p = 0.034), respectively. Conclusions: Work engagement was associated with job demands and resources, which is in line with the theoretical prediction of the job demands-resources model, even after controlling for time-invariant individual attributes. Work engagement moderated the association between selected aspects of job demands and resources and psychological distress. PMID:29563368
Background stratified Poisson regression analysis of cohort data.

PubMed

Richardson, David B; Langholz, Bryan

2012-03-01

Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
Procedures for adjusting regional regression models of urban-runoff quality using local data

USGS Publications Warehouse

Hoos, A.B.; Sisolak, J.K.

1993-01-01

Statistical operations termed model-adjustment procedures (MAPs) can be used to incorporate local data into existing regression models to improve the prediction of urban-runoff quality. Each MAP is a form of regression analysis in which the local data base is used as a calibration data set. Regression coefficients are determined from the local data base, and the resulting 'adjusted' regression models can then be used to predict storm-runoff quality at unmonitored sites. The response variable in the regression analyses is the observed load or mean concentration of a constituent in storm runoff for a single storm. The set of explanatory variables used in the regression analyses is different for each MAP, but always includes the predicted value of load or mean concentration from a regional regression model. The four MAPs examined in this study were: single-factor regression against the regional model prediction, P (termed MAP-1F-P); regression against P (termed MAP-R-P); regression against P and additional local variables (termed MAP-R-P+nV); and a weighted combination of P and a local-regression prediction (termed MAP-W). The procedures were tested by means of split-sample analysis, using data from three cities included in the Nationwide Urban Runoff Program: Denver, Colorado; Bellevue, Washington; and Knoxville, Tennessee. The MAP that provided the greatest predictive accuracy for the verification data set differed among the three test data bases and among model types (MAP-W for Denver and Knoxville, MAP-1F-P and MAP-R-P for Bellevue load models, and MAP-R-P+nV for Bellevue concentration models) and, in many cases, was not clearly indicated by the values of standard error of estimate for the calibration data set. A scheme to guide MAP selection, based on exploratory data analysis of the calibration data set, is presented and tested. The MAPs were tested for sensitivity to the size of a calibration data set. As expected, predictive accuracy of all MAPs for the verification data set decreased as the calibration data-set size decreased, but predictive accuracy was not as sensitive for the MAPs as it was for the local regression models.
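The general idea of regressing locally observed values on the regional model's prediction can be sketched with an ordinary least-squares fit. This is only an illustration of the concept: the abstract does not give the exact form of any MAP (for example, whether variables are transformed), and the function name and data below are hypothetical.

```python
def fit_against_regional_prediction(regional_pred, observed):
    """Least-squares fit of observed = intercept + slope * regional_pred."""
    n = len(observed)
    if n != len(regional_pred) or n < 2:
        raise ValueError("need two or more paired values")
    mean_x = sum(regional_pred) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in regional_pred)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(regional_pred, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical calibration data: regional-model predictions and
# locally observed storm loads (arbitrary units, not from the report).
regional = [1.0, 2.0, 3.0, 4.0]
local_obs = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_against_regional_prediction(regional, local_obs)

# Adjusted prediction at an unmonitored site whose regional prediction is 2.5.
adjusted = intercept + slope * 2.5
```

The fitted coefficients act as a local correction applied to the regional prediction before it is used at unmonitored sites.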
Comparing Mammography Abnormality Features and Genetic Variants in the Prediction of Breast Cancer in Women Recommended for Breast Biopsy

PubMed Central

Burnside, Elizabeth S.; Liu, Jie; Wu, Yirong; Onitilo, Adedayo A.; McCarty, Catherine; Page, C. David; Peissig, Peggy; Trentham-Dietz, Amy; Kitchner, Terrie; Fan, Jun; Yuan, Ming

2015-01-01

Rationale and Objectives The discovery of germline genetic variants associated with breast cancer has engendered interest in risk stratification for improved, targeted detection and diagnosis. However, there has yet to be a comparison of the predictive ability of these genetic variants with mammography abnormality descriptors. Materials and Methods Our IRB-approved, HIPAA-compliant study utilized a personalized medicine registry in which participants consented to provide a DNA sample and participate in longitudinal follow-up. In our retrospective, age-matched, case-controlled study of 373 cases and 395 controls who underwent breast biopsy, we collected risk factors selected a priori based on the literature including: demographic variables based on the Gail model, common germline genetic variants, and diagnostic mammography findings according to BI-RADS. We developed predictive models using logistic regression to determine the predictive ability of: 1) demographic variables, 2) 10 selected genetic variants, or 3) mammography BI-RADS features. We evaluated each model in turn by calculating a risk score for each patient using 10-fold cross validation; used this risk estimate to construct ROC curves; and compared the AUC of each using the DeLong method. Results The performance of the regression model using demographic risk factors was not statistically different from the model using genetic variants (p=0.9). The model using mammography features (AUC = 0.689) was superior to both the demographic model (AUC = .598; p<0.001) and the genetic model (AUC = .601; p<0.001). Conclusion BI-RADS features exceeded the ability of demographic and 10 selected germline genetic variants to predict breast cancer in women recommended for biopsy. PMID:26514439
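For readers unfamiliar with the AUC values quoted above, the sketch below computes an AUC directly from risk scores as the probability that a randomly chosen case scores higher than a randomly chosen control (ties count as one half). The scores are invented for illustration, and the DeLong comparison used in the study is not reproduced here.

```python
def auc_from_scores(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0      # case ranked above control
            elif c == k:
                wins += 0.5      # tie counts as half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical cross-validated risk scores (not from the study).
cases = [0.8, 0.4]
controls = [0.6, 0.2]
print(auc_from_scores(cases, controls))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.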
Accounting for measurement error in log regression models with applications to accelerated testing.

PubMed

Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M

2018-01-01

In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
Use of empirical likelihood to calibrate auxiliary information in partly linear monotone regression models.

PubMed

Chen, Baojiang; Qin, Jing

2014-05-10

In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on a covariate, it may also depend on a function of this covariate. If one has no knowledge of this functional form but expects it to be monotonically increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for the isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
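The pool-adjacent-violators algorithm named above can be sketched in a few lines. This is a minimal version of the standard algorithm for a non-decreasing fit with optional weights; it does not include the empirical likelihood step proposed in the paper, and the function name `pava` is just a label chosen here.

```python
def pava(y, weights=None):
    """Non-decreasing least-squares fit to y (pool-adjacent-violators)."""
    if weights is None:
        weights = [1.0] * len(y)
    blocks = []  # each block: [weighted mean, total weight, number of points]
    for value, weight in zip(y, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for block_mean, _, count in blocks:
        fitted.extend([block_mean] * count)
    return fitted


print(pava([1, 3, 2, 4]))
```

Adjacent values that break the ordering are replaced by their pooled mean, so the middle pair 3, 2 becomes 2.5, 2.5 in this example.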
Logistic regression for dichotomized counts.

PubMed

Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W

2016-12-01

Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
The Equivalence of Regression Models Using Difference Scores and Models Using Separate Scores for Each Informant: Implications for the Study of Informant Discrepancies

ERIC Educational Resources Information Center

Laird, Robert D.; Weems, Carl F.

2011-01-01

Research on informant discrepancies has increasingly utilized difference scores. This article demonstrates the statistical equivalence of regression models using difference scores (raw or standardized) and regression models using separate scores for each informant to show that interpretations should be consistent with both models. First,…
Gambling disorder-related illegal acts: Regression model of associated factors

PubMed Central

Gorsane, Mohamed Ali; Reynaud, Michel; Vénisse, Jean-Luc; Legauffre, Cindy; Valleur, Marc; Magalon, David; Fatséas, Mélina; Chéreau-Boudet, Isabelle; Guilleux, Alice; JEU Group; Challet-Bouju, Gaëlle; Grall-Bronnec, Marie

2017-01-01

Background and aims Gambling disorder-related illegal acts (GDRIA) are often crucial events for gamblers and/or their entourage. This study was designed to determine the predictive factors of GDRIA. Methods Participants were 372 gamblers reporting at least three DSM-IV-TR (American Psychiatric Association, 2000) criteria. They were assessed on the basis of sociodemographic characteristics, gambling-related characteristics, their personality profile, and psychiatric comorbidities. A multiple logistic regression was performed to identify the relevant predictors of GDRIA and their relative contribution to the prediction of the presence of GDRIA. Results Multivariate analysis revealed a higher South Oaks Gambling Scale score, comorbid addictive disorders, and a lower level of income as GDRIA predictors. Discussion and conclusion An original finding of this study was that the comorbid addictive disorder effect might be mediated by a disinhibiting effect of stimulant substances on GDRIA. Further studies are necessary to replicate these results, especially in a longitudinal design, and to explore specific therapeutic interventions. PMID:28198636
Gambling disorder-related illegal acts: Regression model of associated factors.

PubMed

Gorsane, Mohamed Ali; Reynaud, Michel; Vénisse, Jean-Luc; Legauffre, Cindy; Valleur, Marc; Magalon, David; Fatséas, Mélina; Chéreau-Boudet, Isabelle; Guilleux, Alice; Challet-Bouju, Gaëlle; Grall-Bronnec, Marie

2017-03-01

Background and aims Gambling disorder-related illegal acts (GDRIA) are often crucial events for gamblers and/or their entourage. This study was designed to determine the predictive factors of GDRIA. Methods Participants were 372 gamblers reporting at least three DSM-IV-TR (American Psychiatric Association, 2000) criteria. They were assessed on the basis of sociodemographic characteristics, gambling-related characteristics, their personality profile, and psychiatric comorbidities. A multiple logistic regression was performed to identify the relevant predictors of GDRIA and their relative contribution to the prediction of the presence of GDRIA. Results Multivariate analysis revealed a higher South Oaks Gambling Scale score, comorbid addictive disorders, and a lower level of income as GDRIA predictors. Discussion and conclusion An original finding of this study was that the comorbid addictive disorder effect might be mediated by a disinhibiting effect of stimulant substances on GDRIA. Further studies are necessary to replicate these results, especially in a longitudinal design, and to explore specific therapeutic interventions.
Factors associated with reporting of abuse against children and adolescents by nurses within Primary Health Care

PubMed Central

Rolim, Ana Carine Arruda; Moreira, Gracyelle Alves Remigio; Gondim, Sarah Maria Mendes; Paz, Soraya da Silva; Vieira, Luiza Jane Eyre de Souza

2014-01-01

OBJECTIVE: to analyze the factors associated with the underreporting on the part of nurses within Primary Health Care of abuse against children and adolescents. METHOD: cross-sectional study with 616 nurses. A questionnaire addressed socio-demographic data, profession, instrumentation and knowledge on the topic, identification and reporting of abuse cases. Bivariate and multivariate logistic regression was used. RESULTS: female nurses, aged between 21 and 32 years old, not married, with five or more years since graduation, with graduate studies, and working for five or more years in PHC predominated. The final regression model showed that factors such as working for five or more years, having a reporting form within the PHC unit, and believing that reporting within Primary Health Care is an advantage, facilitate reporting. CONCLUSION: the study's results may, in addition to sensitizing nurses, support management professionals in establishing strategies intended to produce compliance with reporting as a legal device that ensures the rights of children and adolescents. PMID:25591102
Work stress, sleep deficiency and predicted 10-year cardiometabolic risk in a female patient care worker population

PubMed Central

Jacobsen, Henrik Børsting; Reme, Silje Endresen; Sembajwe, Grace; Hopcia, Karen; Stiles, Tore C.; Sorensen, Glorian; Porter, James H.; Marino, Miguel; Buxton, Orfeu M.

2014-01-01

Objectives The aim of this study was to investigate the longitudinal effect of work-related stress, sleep deficiency and physical activity on 10-year cardiometabolic risk among an all-female worker population. Methods Data on patient care workers (n=99) was collected two years apart. Baseline measures included: job stress, physical activity, night work and sleep deficiency. Biomarkers and objective measurements were used to estimate 10-year cardiometabolic risk at follow-up. Significant associations (P<0.05) from baseline analyses were used to build a multivariable linear regression model. Results The participants were mostly white nurses with a mean age of 41 years. Adjusted linear regression showed that having sleep maintenance problems, a different occupation than nurse, and/or not exercising at recommended levels at baseline increased the 10-year cardiometabolic risk at follow-up. Conclusions In female workers prone to work-related stress and sleep deficiency, maintaining sleep and exercise patterns had a strong impact on modifiable 10-year cardiometabolic risk. PMID:24809311
Love, Trust, and HIV Risk Among Female Sex Workers and Their Intimate Male Partners

PubMed Central

Bazzi, Angela Robertson; Martinez, Gustavo; Rangel, M. Gudelia; Ulibarri, Monica D.; Fergus, Kirkpatrick B.; Amaro, Hortensia; Strathdee, Steffanie A.

2015-01-01

Objectives. We examined correlates of love and trust among female sex workers and their noncommercial male partners along the Mexico–US border. Methods. From 2011 to 2012, 322 partners in Tijuana and Ciudad Juárez, Mexico, completed assessments of love and trust. Cross-sectional dyadic regression analyses identified associations of relationship characteristics and HIV risk behaviors with love and trust. Results. Within 161 couples, love and trust scores were moderately high (median 70/95 and 29/40 points, respectively) and correlated with relationship satisfaction. In regression analyses of HIV risk factors, men and women who used methamphetamine reported lower love scores, whereas women who used heroin reported slightly higher love. In an alternate model, men with concurrent sexual partners had lower love scores. For both partners, relationship conflict was associated with lower trust. Conclusions. Love and trust are associated with relationship quality, sexual risk, and drug use patterns that shape intimate partners’ HIV risk. HIV interventions should consider the emotional quality of sex workers’ intimate relationships. PMID:26066947
The severity of Minamata disease declined in 25 years: temporal profile of the neurological findings analyzed by multiple logistic regression model.

PubMed

Uchino, Makoto; Hirano, Teruyuki; Satoh, Hiroshi; Arimura, Kimiyoshi; Nakagawa, Masanori; Wakamiya, Jyunji

2005-01-01

Minamata disease (MD) was caused by ingestion of seafood from the methylmercury-contaminated areas. Although 50 years have passed since the discovery of MD, there have been only a few studies on the temporal profile of neurological findings in certified MD patients. Thus, we evaluated changes in neurological symptoms and signs of MD using discriminants by multiple logistic regression analysis. The severity of predictive index declined in 25 years in most of the patients. Only a few patients showed aggravation of neurological findings, which was due to complications such as spino-cerebellar degeneration. Patients with chronic MD aged over 45 years had several concomitant diseases so that their clinical pictures were complicated. It was difficult to differentiate chronic MD using statistically established discriminants based on sensory disturbance alone. In conclusion, the severity of MD declined in 25 years along with the modification by age-related concomitant disorders.
Primary gastric mucosa associated lymphoid tissue lymphoma: Clinical data predicted treatment outcome

PubMed Central

Todorovic, Milena; Balint, Bela; Jevtic, Miodrag; Suvajdzic, Nada; Ceric, Amela; Stamatovic, Dragana; Markovic, Olivera; Perunicic, Maja; Marjanovic, Slobodan; Krstic, Miodrag

2008-01-01

AIM: To determine clinical characteristics and treatment outcome of gastric lymphoma after chemotherapy and immuno-chemotherapy. METHODS: Thirty four patients with primary gastric mucosa associated lymphoid tissue (MALT) lymphoma (Ann Arbor stages I to IV) were enrolled. All had upper gastric endoscopy, abdominal ultrasonography, CT and H pylori status assessment (histology and serology). After anti-H pylori treatment and initial chemotherapy, patients were re-examined every 4 mo. RESULTS: Histological regression of the lymphoma was complete in 22/34 (64.7%) and partial in 9 (26.5%) patients. Median follow up time for these 31 responders was 60 mo (range 48-120). No regression was noted in 3 patients. Among the 25 (73.5%) H pylori positive patients, the eradication rate was 100%. CONCLUSION: Using univariate analysis, predictive factors for overall survival were international prognostic index (IPI) score, hemoglobin level, erythrocyte sedimentation rate (ESR), and platelet numbers (P < 0.005). In addition, the Cox proportional hazards model identified IPI score, ESR, and platelets as predictors of survival. PMID:18416467
Inequity in Health Care Financing in Iran: Progressive or Regressive Mechanism?

PubMed Central

Rad, Enayatollah Homaie; Khodaparast, Marzie

2016-01-01

Objective: A progressive health financing mechanism is very important for reducing inequity in health systems. Revenue collection is one aspect of health care financing. In this study, the taxation system and the health insurance contributions of Iranians were assessed. Materials and Methods: Data from the 2012 household expenditure survey were used, and families' health insurance payments and tax payments were extracted. The Kakwani index was calculated to assess the progressivity of these payments. Finally, a model was designed to identify the influencing factors. Results: We found that the taxation mechanism was progressive, but the insurance contribution mechanism was very regressive. People living in urban regions accounted for a larger share of insurance and tax payments. Less educated families contributed less to health insurance, and families with more elderly members paid more for health insurance. Conclusion: Policy makers must pay more attention to health insurance contributions and change the laws in favour of the poor. PMID:27551174
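The Kakwani index used above is conventionally defined as the concentration index of payments (with households ranked by ability to pay) minus the Gini coefficient of the ability-to-pay measure; a positive value indicates a progressive payment and a negative value a regressive one. The sketch below assumes income as the ranking variable, ignores survey weights and ties, and uses invented figures, so it illustrates the definition rather than the study's computation.

```python
def concentration_index(values, rank_by):
    """Concentration index of `values` with units ordered by `rank_by`."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need non-empty values with a non-zero total")
    order = sorted(range(n), key=lambda i: rank_by[i])
    ranked_sum = sum((pos + 1) * values[i] for pos, i in enumerate(order))
    return 2.0 * ranked_sum / (n * total) - (n + 1) / n


def kakwani_index(payments, income):
    """Concentration index of payments minus the Gini coefficient of income."""
    gini = concentration_index(income, income)
    return concentration_index(payments, income) - gini


# Hypothetical households (not survey data).
income = [10.0, 20.0, 30.0, 40.0]
flat_premium = [1.0, 1.0, 1.0, 1.0]        # same payment for everyone
proportional_tax = [1.0, 2.0, 3.0, 4.0]    # 10% of income
top_heavy_tax = [0.0, 0.0, 0.0, 10.0]      # paid only by the richest

print(kakwani_index(flat_premium, income))
print(kakwani_index(proportional_tax, income))
print(kakwani_index(top_heavy_tax, income))
```

A flat payment gives a negative index, a payment proportional to income gives zero, and a payment concentrated on the richest household gives a positive index.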
Defining Nitrogen Kinetics for Air Break in Prebreath

NASA Technical Reports Server (NTRS)

Conkin, Johnny

2010-01-01

Actual tissue nitrogen (N2) kinetics are complex; uptake and elimination are often approximated with a single half-time compartment in statistical descriptions of denitrogenation [prebreathe (PB)] protocols. Air breaks during PB complicate N2 kinetics. A comparison of symmetrical versus asymmetrical N2 kinetics was performed using the time to onset of hypobaric decompression sickness (DCS) as a surrogate for actual venous N2 tension. METHODS: Published results of 12 tests involving 179 hypobaric exposures in altitude chambers after PB, with and without airbreaks, provide the complex protocols from which to model N2 kinetics. DCS survival time for combined control and airbreak exposures was described with an accelerated log logistic model where N2 uptake and elimination before, during, and after the airbreak was computed with a simple exponential function or a function that changed half-time depending on ambient N2 partial pressure. P1N2 - P2 = ΔP defined decompression dose for each altitude exposure, where P2 was the test altitude and P1N2 was computed N2 pressure at the beginning of the altitude exposure. RESULTS: The log likelihood (LL) without decompression dose (null model) was -155.6, and improved (best-fit) to -97.2 when dose was defined with a 240 min half-time for both N2 elimination and uptake during the PB. The description of DCS survival time was less precise with asymmetrical N2 kinetics; for example, LL was -98.9 with 240 min half-time elimination and 120 min half-time uptake. CONCLUSION: The statistical regression described survival time mechanistically linked to symmetrical N2 kinetics during PBs that also included airbreaks. The results are data-specific, and additional data may change the conclusion. The regression is useful to compute additional PB time to compensate for an airbreak in PB within the narrow range of tested conditions.
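A single half-time compartment of the kind described above is usually written as an exponential approach of tissue N2 pressure toward the ambient N2 partial pressure. The sketch below assumes that standard form, lets the half-time differ between elimination and uptake, and then forms the decompression dose ΔP = P1N2 - P2. The function names, the pressure values, and the three-segment protocol are illustrative assumptions, not the protocols analyzed in the report.

```python
import math


def tissue_n2(p_tissue, p_ambient_n2, minutes, elim_half_time, uptake_half_time):
    """Tissue N2 pressure after `minutes` at a constant ambient N2 pressure.

    Uses the elimination half-time when ambient N2 is below tissue N2,
    otherwise the uptake half-time. Equal half-times give symmetrical kinetics.
    """
    half_time = elim_half_time if p_ambient_n2 < p_tissue else uptake_half_time
    k = math.log(2.0) / half_time
    return p_ambient_n2 + (p_tissue - p_ambient_n2) * math.exp(-k * minutes)


def decompression_dose(p1_n2, p2):
    """Delta P = P1N2 - P2, as defined in the abstract."""
    return p1_n2 - p2


# Hypothetical protocol (pressures in psia, times in minutes):
# 60 min on 100% O2, a 10 min air break, then 60 min more on 100% O2.
segments = [(0.0, 60.0), (11.6, 10.0), (0.0, 60.0)]
p = 11.6  # assumed starting tissue N2 pressure
for ambient_n2, duration in segments:
    p = tissue_n2(p, ambient_n2, duration, 240.0, 240.0)

dose = decompression_dose(p, 4.3)  # 4.3 psia is an assumed test altitude pressure
```

Passing a shorter uptake half-time, such as 120 min, makes the air break reload the compartment faster than it empties, which corresponds to the asymmetrical case compared in the abstract.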

Evaluation of modeled bacteria loads along an impaired stream reach receiving discharge from a municipal separate storm sewer system in Independence, Mo.

USGS Publications Warehouse

Flickinger, Allison; Christensen, Eric D.

2017-01-01

The Little Blue River in Jackson County, Missouri, was listed as impaired in 2012 due to Escherichia coli (E. coli) from urban runoff and storm sewers. A study was initiated to characterize E. coli concentrations and loads to aid in the development of a total maximum daily load implementation plan. Longitudinal sampling along the stream revealed spatial and temporal variability in E. coli loads. Regression models were developed to better represent E. coli variability in the impaired reach using continuous hydrologic and water-quality parameters as predictive parameters. Daily loads calculated from main-stem samples were significantly higher downstream compared to upstream even though there was no significant difference between the upstream and downstream measured concentrations and no significant conclusions could be drawn from model-estimated loads due to model-associated uncertainty. Increasing sample frequency could decrease the bias and increase the accuracy of the modeled results.
Guidelines and Procedures for Computing Time-Series Suspended-Sediment Concentrations and Loads from In-Stream Turbidity-Sensor and Streamflow Data

USGS Publications Warehouse

Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.

2009-01-01

In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
Population heterogeneity in the salience of multiple risk factors for adolescent delinquency.

PubMed

Lanza, Stephanie T; Cooper, Brittany R; Bray, Bethany C

2014-03-01

To present mixture regression analysis as an alternative to more standard regression analysis for predicting adolescent delinquency. We demonstrate how mixture regression analysis allows for the identification of population subgroups defined by the salience of multiple risk factors. We identified population subgroups (i.e., latent classes) of individuals based on their coefficients in a regression model predicting adolescent delinquency from eight previously established risk indices drawn from the community, school, family, peer, and individual levels. The study included N = 37,763 10th-grade adolescents who participated in the Communities That Care Youth Survey. Standard, zero-inflated, and mixture Poisson and negative binomial regression models were considered. Standard and mixture negative binomial regression models were selected as optimal. The five-class regression model was interpreted based on the class-specific regression coefficients, indicating that risk factors had varying salience across classes of adolescents. Standard regression showed that all risk factors were significantly associated with delinquency. Mixture regression provided more nuanced information, suggesting a unique set of risk factors that were salient for different subgroups of adolescents. Implications for the design of subgroup-specific interventions are discussed. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Two levels ARIMAX and regression models for forecasting time series data with calendar variation effects

NASA Astrophysics Data System (ADS)

Suhartono, Lee, Muhammad Hisyam; Prastyo, Dedy Dwi

2015-12-01

The aim of this research is to develop a calendar variation model for forecasting retail sales data with the Eid ul-Fitr effect. The proposed model is based on two methods, namely two-level ARIMAX and regression methods. Two-level ARIMAX and regression models are built by using ARIMAX for the first level and regression for the second level. Monthly men's jeans and women's trousers sales in a retail company for the period January 2002 to September 2009 are used as a case study. In general, the two-level calendar variation model yields two models: the first reconstructs the sales pattern that has already occurred, and the second forecasts the increase in sales due to Eid ul-Fitr, which affects sales in the same month and the previous month. The results show that the proposed two-level calendar variation model based on ARIMAX and regression methods yields better forecasts than the seasonal ARIMA model and neural networks.
Regression Models for Identifying Noise Sources in Magnetic Resonance Images

PubMed Central

Zhu, Hongtu; Li, Yimei; Ibrahim, Joseph G.; Shi, Xiaoyan; An, Hongyu; Chen, Yashen; Gao, Wei; Lin, Weili; Rowe, Daniel B.; Peterson, Bradley S.

2009-01-01

Stochastic noise, susceptibility artifacts, magnetic field and radiofrequency inhomogeneities, and other noise components in magnetic resonance images (MRIs) can introduce serious bias into any measurements made with those images. We formally introduce three regression models including a Rician regression model and two associated normal models to characterize stochastic noise in various magnetic resonance imaging modalities, including diffusion-weighted imaging (DWI) and functional MRI (fMRI). Estimation algorithms are introduced to maximize the likelihood function of the three regression models. We also develop a diagnostic procedure for systematically exploring MR images to identify noise components other than simple stochastic noise, and to detect discrepancies between the fitted regression models and MRI data. The diagnostic procedure includes goodness-of-fit statistics, measures of influence, and tools for graphical display. The goodness-of-fit statistics can assess the key assumptions of the three regression models, whereas measures of influence can isolate outliers caused by certain noise components, including motion artifacts. The tools for graphical display permit graphical visualization of the values for the goodness-of-fit statistic and influence measures. Finally, we conduct simulation studies to evaluate performance of these methods, and we analyze a real dataset to illustrate how our diagnostic procedure localizes subtle image artifacts by detecting intravoxel variability that is not captured by the regression models. PMID:19890478
Regression Model for Light Weight and Crashworthiness Enhancement Design of Automotive Parts in Frontal CAR Crash

NASA Astrophysics Data System (ADS)

Bae, Gihyun; Huh, Hoon; Park, Sungho

This paper deals with a regression model for light weight and crashworthiness enhancement design of automotive parts in frontal car crash. The ULSAB-AVC model is employed for the crash analysis, and effective parts are selected based on the amount of energy absorbed during the crash. Finite element analyses are carried out for designated design cases in order to investigate the crashworthiness and weight according to the material and thickness of the main energy absorption parts. Based on the simulation results, a regression analysis is performed to construct a regression model for light weight and crashworthiness enhancement design of automotive parts. An example of weight reduction of the main energy absorption parts demonstrates the validity of the constructed regression model.
Army College Fund Cost-Effectiveness Study

DTIC Science & Technology

1990-11-01

Section A.2 presents a theory of enlistment supply to provide a basis for specifying the regression model. The model is specified in Section A.3, which... Supplementary materials are included in the final four sections. Section A.6 provides annual trends in the regression model variables. Estimates of the model... millions, A.S. ESTIMATION OF A YOUTH EARNINGS FORECASTING MODEL Civilian pay is an important explanatory variable in the regression model. Previous
RRegrs: an R package for computer-aided model selection with multiple regression models.

PubMed

Tsiliki, Georgia; Munteanu, Cristian R; Seoane, Jose A; Fernandez-Lozano, Carlos; Sarimveis, Haralambos; Willighagen, Egon L

2015-01-01

Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models and therefore raise model reproducibility and comparison issues. Cheminformatics and bioinformatics use predictive modelling extensively and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespective of their statistical knowledge, would be valuable if it tested several simple and complex regression models and validation schemes, produced unified reports, and offered the option to be integrated into more extensive studies. Additionally, such methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated, fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending the caret package. The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance as well as its adaptability in terms of parameter optimization could make RRegrs a popular framework to assist the initial exploration of predictive models, and with that, the design of more comprehensive in silico screening applications. Graphical abstract: RRegrs is a computer-aided model selection framework for R multiple regression models; this is a fully validated procedure with application to QSAR modelling.
Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

PubMed

Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

2015-01-01

Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
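The comparison indicators named above (sensitivity, specificity, and the area under the ROC curve) can be computed from predicted probabilities along the following lines. This is a minimal sketch, not the SPSS procedure used in the study: the function names, the 0.5 cut-off, and the toy labels and scores are assumptions for illustration. The AUC is computed as the proportion of positive/negative pairs ranked correctly, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Classify with a cut-off and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for truth, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy labels (1 = outcome present) and model scores, invented for this example.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

sens, spec = sensitivity_specificity(labels, scores)
auc = roc_auc(labels, scores)
```

Running the same two functions on the scores from each fitted model (logistic, probit, discriminant) would give a like-for-like comparison of the kind reported in the abstract.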
A preliminary study on postmortem interval estimation of suffocated rats by GC-MS/MS-based plasma metabolic profiling.

PubMed

Sato, Takako; Zaitsu, Kei; Tsuboi, Kento; Nomura, Masakatsu; Kusano, Maiko; Shima, Noriaki; Abe, Shuntaro; Ishii, Akira; Tsuchihashi, Hitoshi; Suzuki, Koichi

2015-05-01

Estimation of postmortem interval (PMI) is an important goal in judicial autopsy. Although many approaches can estimate PMI through physical findings and biochemical tests, accurate PMI calculation by these conventional methods remains difficult because PMI is readily affected by surrounding conditions, such as ambient temperature and humidity. In this study, Sprague-Dawley (SD) rats (10 weeks) were sacrificed by suffocation, and blood was collected by dissection at various time intervals (0, 3, 6, 12, 24, and 48 h; n = 6) after death. A total of 70 endogenous metabolites were detected in plasma by gas chromatography-tandem mass spectrometry (GC-MS/MS). Each time group was separated from each other on the principal component analysis (PCA) score plot, suggesting that the various endogenous metabolites changed with time after death. To prepare a prediction model of a PMI, a partial least squares (or projection to latent structure, PLS) regression model was constructed using the levels of significantly different metabolites determined by variable importance in the projection (VIP) score and the Kruskal-Wallis test (P < 0.05). Because the constructed PLS regression model could successfully predict each PMI, this model was validated with another validation set (n = 3). In conclusion, plasma metabolic profiling demonstrated its ability to successfully estimate PMI under a certain condition. This result can be considered to be the first step for using the metabolomics method in future forensic casework.
Serum metabolomics differentiating pancreatic cancer from new-onset diabetes

PubMed Central

He, Xiangyi; Zhong, Jie; Wang, Shuwei; Zhou, Yufen; Wang, Lei; Zhang, Yongping; Yuan, Yaozong

2017-01-01

To establish a screening strategy for pancreatic cancer (PC) based on new-onset diabetes mellitus (NO-DM), serum metabolomics analysis and a search for the metabolic pathways associated with PC-related DM were performed. Serum samples from patients with NO-DM (n = 30) and patients with pancreatic cancer and NO-DM were examined by liquid chromatography-mass spectrometry. Data were analyzed using principal components analysis (PCA) and orthogonal projection to latent structures (OPLS) of the most significant metabolites. The diagnostic model was constructed using logistic regression analysis. Metabolic pathways were analyzed using the web-based tool MetPA. PC patients with NO-DM were older and had a lower BMI and shorter duration of DM than patients with NO-DM alone. The metabolomic profiles of patients with PC and NO-DM were significantly different from those of patients with NO-DM in the PCA and OPLS models. Sixty-two differential metabolites were identified by the OPLS model. The logistic regression model using a panel of two metabolites, N-succinyl-L-diaminopimelic acid and PE (18:2), had high sensitivity (93.3%) and specificity (93.1%) for PC. The top three metabolic pathways associated with PC-related DM were valine, leucine and isoleucine biosynthesis and degradation, primary bile acid biosynthesis, and sphingolipid metabolism. In conclusion, screening for PC based on NO-DM using serum metabolomics in combination with clinical characteristics and CA19-9 is a potentially useful strategy. Several metabolic pathways differed between PC-related DM and type 2 DM. PMID:28418859
The International Classification of Functioning as an explanatory model of health after distal radius fracture: A cohort study

PubMed Central

Harris, Jocelyn E; MacDermid, Joy C; Roth, James

2005-01-01

Background Distal radius fractures are common injuries that have an increasing impact on health across the lifespan. The purpose of this study was to identify health impacts in body structure/function, activity, and participation at baseline and follow-up, to determine whether they support the ICF model of health. Methods This is a prospective cohort study of 790 individuals who were assessed at 1 week, 3 months, and 1 year post injury. The Patient Rated Wrist Evaluation (PRWE), the Wrist Outcome Measure (WOM), and the Medical Outcome Survey Short-Form (SF-36) were used to measure impairment, activity, participation, and health. Multiple regression was used to develop explanatory models of health outcome. Results Regression analysis showed that the PRWE explained between 13% (one week) and 33% (three months) of the SF-36 Physical Component Summary Scores, with pain, activities and participation subscales showing dominant effects at different stages of recovery. PRWE scores were less related to Mental Component Summary Scores, 10% (three months) and 8% (one year). Wrist impairment scores were less powerful predictors of health status than the PRWE. Conclusion The ICF is an informative model for examining distal radius fracture. Difficulties in the domains of activity and participation were able to explain a significant portion of physical health. Post-fracture rehabilitation and outcome assessments should extend beyond physical impairment to ensure comprehensive treatment of individuals with distal radius fracture. PMID:16288664
Sperm function and assisted reproduction technology

PubMed Central

MAAß, GESA; BÖDEKER, ROLF‐HASSO; SCHEIBELHUT, CHRISTINE; STALF, THOMAS; MEHNERT, CLAAS; SCHUPPE, HANS‐CHRISTIAN; JUNG, ANDREAS; SCHILL, WOLF‐BERNHARD

2005-01-01

The evaluation of different functional sperm parameters has become a tool in andrological diagnosis. These assays determine the sperm's capability to fertilize an oocyte. It also appears that sperm functions and semen parameters are interrelated and interdependent. Therefore, the question arose whether a given laboratory test or a battery of tests can predict the outcome in in vitro fertilization (IVF). One‐hundred and sixty‐one patients who underwent an IVF treatment were selected from a database of 4178 patients who had been examined for male infertility 3 months before or after IVF. Sperm concentration, motility, acrosin activity, acrosome reaction, sperm morphology, maternal age, number of transferred embryos, embryo score, fertilization rate and pregnancy rate were determined. In addition, logistic regression models to describe fertilization rate and pregnancy were developed. All the parameters in the models were dichotomized and intra‐ and interindividual variability of the parameters were assessed. Although the sperm parameters showed good correlations with IVF when correlated separately, the only essential parameter in the multivariate model was morphology. The enormous intra‐ and interindividual variability of the values was striking. In conclusion, our data indicate that the andrological status at the end of the respective treatment does not necessarily represent the status at the time of IVF. Despite a relatively low correlation coefficient in the logistic regression model, it appears that among the parameters tested, the most reliable parameter to predict fertilization is normal sperm morphology. (Reprod Med Biol 2005; 4: 7–30) PMID:29699207
Correlation and simple linear regression.

PubMed

Eberly, Lynn E

2007-01-01

This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

NASA Technical Reports Server (NTRS)

McKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

2005-01-01

Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward-averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield the best performance and avoid model discontinuity over day/night data boundaries.
Two-level structural sparsity regularization for identifying lattices and defects in noisy images

DOE PAGES

Li, Xin; Belianinov, Alex; Dyck, Ondrej E.; ...

2018-03-09

This paper presents a regularized regression model with a two-level structural sparsity penalty applied to locate individual atoms in a noisy scanning transmission electron microscopy (STEM) image. In crystals, the locations of atoms are symmetric, condensed into a few lattice groups. Therefore, by identifying the underlying lattice in a given image, individual atoms can be accurately located. We propose to formulate the identification of the lattice groups as a sparse group selection problem. Furthermore, real atomic scale images contain defects and vacancies, so atomic identification based solely on a lattice group may result in false positives and false negatives. To minimize error, the model includes an individual sparsity regularization in addition to the group sparsity for a within-group selection, which results in a regression model with a two-level sparsity regularization. We propose a modification of the group orthogonal matching pursuit (gOMP) algorithm with a thresholding step to solve the atom finding problem. The convergence and statistical analyses of the proposed algorithm are presented. The proposed algorithm is also evaluated through numerical experiments with simulated images. The applicability of the algorithm to the determination of atom structures and the identification of imaging distortions and atomic defects was demonstrated using three real STEM images. In conclusion, we believe this is an important step toward automatic phase identification and assignment with the advent of genomic databases for materials.
Factors Associated with Sexual Violence against Men Who Have Sex with Men and Transgendered Individuals in Karnataka, India

PubMed Central

Shaw, Souradet Y.; Lorway, Robert R.; Deering, Kathleen N.; Avery, Lisa; Mohan, H. L.; Bhattacharjee, Parinita; Reza-Paul, Sushena; Isac, Shajy; Ramesh, Banadakoppa M.; Washington, Reynold; Moses, Stephen; Blanchard, James F.

2012-01-01

Objectives There is a lack of information on sexual violence (SV) among men who have sex with men and transgendered individuals (MSM-T) in southern India. As SV has been associated with HIV vulnerability, this study examined health related behaviours and practices associated with SV among MSM-T. Design Data were from cross-sectional surveys from four districts in Karnataka, India. Methods Multivariable logistic regression models were constructed to examine factors related to SV. Multivariable negative binomial regression models examined the association between physician visits and SV. Results A total of 543 MSM-T were included in the study. Prevalence of SV was 18% in the past year. HIV prevalence among those reporting SV was 20%, compared to 12% among those not reporting SV (p = .104). In multivariable models, and among sex workers, those reporting SV were more likely to report anal sex with 5+ casual sex partners in the past week (AOR: 4.1; 95%CI: 1.2–14.3, p = .029). Increased physician visits among those reporting SV was reported only for those involved in sex work (ARR: 1.7; 95%CI: 1.1–2.7, p = .012). Conclusions These results demonstrate high levels of SV among MSM-T populations, highlighting the importance of integrating interventions to reduce violence as part of HIV prevention programs and health services. PMID:22448214
Two-level structural sparsity regularization for identifying lattices and defects in noisy images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Xin; Belianinov, Alex; Dyck, Ondrej E.

This paper presents a regularized regression model with a two-level structural sparsity penalty applied to locate individual atoms in a noisy scanning transmission electron microscopy (STEM) image. In crystals, the locations of atoms are symmetric, condensed into a few lattice groups. Therefore, by identifying the underlying lattice in a given image, individual atoms can be accurately located. We propose to formulate the identification of the lattice groups as a sparse group selection problem. Furthermore, real atomic scale images contain defects and vacancies, so atomic identification based solely on a lattice group may result in false positives and false negatives. To minimize error, the model includes an individual sparsity regularization in addition to the group sparsity for a within-group selection, which results in a regression model with a two-level sparsity regularization. We propose a modification of the group orthogonal matching pursuit (gOMP) algorithm with a thresholding step to solve the atom finding problem. The convergence and statistical analyses of the proposed algorithm are presented. The proposed algorithm is also evaluated through numerical experiments with simulated images. The applicability of the algorithm to the determination of atom structures and the identification of imaging distortions and atomic defects was demonstrated using three real STEM images. In conclusion, we believe this is an important step toward automatic phase identification and assignment with the advent of genomic databases for materials.
Dental age estimation of growing children by measurement of open apices: A Malaysian formula

PubMed Central

Cugati, Navaneetha; Kumaresan, Ramesh; Srinivasan, Balamanikanda; Karthikeyan, Priyadarshini

2015-01-01

Background: Age estimation is of prime importance in forensic science and clinical dentistry. Age estimation based on teeth development is one reliable approach. Many radiographic methods for estimating dental age have been proposed on the basis of Western populations, and a similar assessment was found to be inadequate in the Malaysian population. Hence, this study aims at formulating a regression model for dental age estimation in the Malaysian child population using Cameriere's method. Materials and Methods: Orthopantomographs of 421 Malaysian children aged between 5 and 16 years involving all the three ethnic origins were digitalized and analyzed using Cameriere's method of age estimation. The subjects' age was modeled as a function of the morphological variables, gender (g), ethnicity, sum of normalized open apices (s), number of teeth with completed root formation (N0) and the first-order interaction between s and N0. Results: The variables that contributed significantly to the fit were included in the regression model, yielding the following formula: Age = 11.368 - 0.345g + 0.553N0 - 1.096s - 0.380s·N0, where g is a variable, 1 for males and 2 for females. The equation explained 87.1% of total deviance. Conclusion: The results indicate the need to reframe the original Cameriere's formula to suit the population of the nation specifically. Further studies are to be conducted to evaluate the applicability of this formula on a larger sample size. PMID:26816464
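The regression formula reported above can be evaluated directly. The sketch below only transcribes the published coefficients into a function; the function name `estimate_dental_age` and the input values in the example calls are illustrative assumptions, not measurements from the study.

```python
def estimate_dental_age(g, s, n0):
    """Evaluate Age = 11.368 - 0.345g + 0.553N0 - 1.096s - 0.380s*N0.

    g  : gender code as given in the abstract (1 for males, 2 for females)
    s  : sum of normalized open apices
    n0 : number of teeth with completed root formation
    """
    return 11.368 - 0.345 * g + 0.553 * n0 - 1.096 * s - 0.380 * s * n0

# Illustrative inputs only (not subjects from the study).
age_closed_apices = estimate_dental_age(g=1, s=0.0, n0=7)
age_open_apices = estimate_dental_age(g=2, s=1.0, n0=2)
```

The negative coefficients on s and on the s·N0 interaction mean that wider open apices pull the estimate toward a younger age, which is the direction one would expect for developing teeth.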
Exposure to air pollution and tobacco smoking and their combined effects on depression in six low- and middle-income countries.

PubMed

Lin, Hualiang; Guo, Yanfei; Kowal, Paul; Airhihenbuwa, Collins O; Di, Qian; Zheng, Yang; Zhao, Xing; Vaughn, Michael G; Howard, Steven; Schootman, Mario; Salinas-Rodriguez, Aaron; Yawson, Alfred E; Arokiasamy, Perianayagam; Manrique-Espinoza, Betty Soledad; Biritwum, Richard B; Rule, Stephen P; Minicuci, Nadia; Naidoo, Nirmala; Chatterji, Somnath; Qian, Zhengmin Min; Ma, Wenjun; Wu, Fan

2017-09-01

Background Little is known about the joint mental health effects of air pollution and tobacco smoking in low- and middle-income countries. Aims To investigate the effects of exposure to ambient fine particulate matter pollution (PM2.5) and smoking and their combined (interactive) effects on depression. Method Multilevel logistic regression analysis of baseline data of a prospective cohort study (n = 41 785). The 3-year average concentrations of PM2.5 were estimated using US National Aeronautics and Space Administration satellite data, and depression was diagnosed using a standardised questionnaire. Three-level logistic regression models were applied to examine the associations with depression. Results The odds ratio (OR) for depression was 1.09 (95% CI 1.01-1.17) per 10 μg/m3 increase in ambient PM2.5, and the association remained after adjusting for potential confounding factors (adjusted OR = 1.10, 95% CI 1.02-1.19). Tobacco smoking (smoking status, frequency, duration and amount) was also significantly associated with depression. There appeared to be a synergistic interaction between ambient PM2.5 and smoking on depression in the additive model, but the interaction was not statistically significant in the multiplicative model. Conclusions Our study suggests that exposure to ambient PM2.5 may increase the risk of depression, and smoking may enhance this effect. © The Royal College of Psychiatrists 2017.
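The distinction between additive and multiplicative interaction drawn above can be illustrated with two commonly used summary measures. The abstract does not say which measures the authors computed, so this is a sketch under assumptions: it uses the relative excess risk due to interaction (RERI) for the additive scale and a ratio of odds ratios for the multiplicative scale, and the three odds ratios in the example are hypothetical, not study results.

```python
def interaction_measures(or_pollution_only, or_smoking_only, or_both):
    """Return (RERI, multiplicative ratio) from three odds ratios.

    All odds ratios are relative to the doubly unexposed reference group.
    RERI > 0 suggests more-than-additive joint effects; a ratio > 1
    suggests more-than-multiplicative joint effects.
    """
    reri = or_both - or_pollution_only - or_smoking_only + 1.0
    ratio = or_both / (or_pollution_only * or_smoking_only)
    return reri, ratio

# Hypothetical odds ratios chosen only to show the arithmetic.
reri, ratio = interaction_measures(
    or_pollution_only=1.5, or_smoking_only=2.0, or_both=4.0
)
```

Because the two scales answer different questions, a joint effect can exceed additivity while remaining close to multiplicativity, which is one way the pattern reported in the abstract can arise.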

Atmospheric pollutants and hospital admissions due to pneumonia in children

PubMed Central

Negrisoli, Juliana; Nascimento, Luiz Fernando C.

2013-01-01

OBJECTIVE: To analyze the relationship between exposure to air pollutants and hospitalizations due to pneumonia in children of Sorocaba, São Paulo, Brazil. METHODS: Time series ecological study, from 2007 to 2008. Daily data were obtained from the State Environmental Agency for Pollution Control for particulate matter, nitric oxide, nitrogen dioxide, and ozone, as well as air temperature and relative humidity. The data concerning pneumonia admissions were collected in the public health system of Sorocaba. Correlations between the variables of interest were calculated using the Pearson coefficient. Models with lags from zero to five days after exposure to pollutants were fitted to analyze the association between the exposure to environmental pollutants and hospital admissions. The analysis used the generalized linear model of Poisson regression, with significance set at p<0.05. RESULTS: There were 1,825 admissions for pneumonia, with a daily mean of 2.5±2.1. There was a strong correlation between pollutants and hospital admissions, except for ozone. In the Poisson regression analysis with the multi-pollutant model, only nitrogen dioxide was statistically significant on the same day (relative risk - RR=1.016), as well as particulate matter with a lag of four days (RR=1.009) after exposure to pollutants. CONCLUSIONS: There was an acute effect of exposure to nitrogen dioxide and a later effect of exposure to particulate matter on children's hospitalizations for pneumonia in Sorocaba. PMID:24473956
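The lag structure described above, in which admissions on a given day are related to pollutant levels zero to five days earlier, can be prepared with a simple alignment step before any Poisson model is fitted. The sketch below is illustrative only: the helper names and the short daily series are assumptions, and the abstract does not state the pollutant unit to which each relative risk refers, so `percent_increase` is expressed per model unit.

```python
def align_with_lag(exposure, admissions, lag):
    """Pair exposure on day t with admissions on day t + lag."""
    if len(exposure) != len(admissions):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(exposure):
        raise ValueError("lag must be between 0 and len(series) - 1")
    if lag == 0:
        return list(zip(exposure, admissions))
    return list(zip(exposure[:-lag], admissions[lag:]))

def percent_increase(relative_risk, units=1.0):
    """Percent change in expected admissions for a `units` rise in exposure."""
    return (relative_risk ** units - 1.0) * 100.0

# Invented daily series, for illustration only.
daily_exposure = [1, 2, 3, 4]
daily_admissions = [10, 20, 30, 40]
pairs_lag1 = align_with_lag(daily_exposure, daily_admissions, lag=1)

# Relative risk for same-day nitrogen dioxide as reported in the abstract.
same_day_no2_increase = percent_increase(1.016)
```

Read this way, a relative risk of 1.016 corresponds to roughly a 1.6% rise in expected admissions per one-unit increase in the pollutant as scaled in the model.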
Stochastic Approximation Methods for Latent Regression Item Response Models. Research Report. ETS RR-09-09

ERIC Educational Resources Information Center

von Davier, Matthias; Sinharay, Sandip

2009-01-01

This paper presents an application of a stochastic approximation EM-algorithm using a Metropolis-Hastings sampler to estimate the parameters of an item response latent regression model. Latent regression models are extensions of item response theory (IRT) to a 2-level latent variable model in which covariates serve as predictors of the…
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression

ERIC Educational Resources Information Center

Weiss, Brandi A.; Dardick, William

2016-01-01

This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models

ERIC Educational Resources Information Center

Shieh, Gwowen

2009-01-01

In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
Building Regression Models: The Importance of Graphics.

ERIC Educational Resources Information Center

Dunn, Richard

1989-01-01

Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)
Testing Different Model Building Procedures Using Multiple Regression.

ERIC Educational Resources Information Center

Thayer, Jerome D.

The stepwise regression method of selecting predictors for computer assisted multiple regression analysis was compared with forward, backward, and best subsets regression, using 16 data sets. The results indicated the stepwise method was preferred because of its practical nature, when the models chosen by different selection methods were similar…
Genetic evaluation of egg production curve in Thai native chickens by random regression and spline models.

PubMed

Mookprom, S; Boonkum, W; Kunhareang, S; Siripanya, S; Duangjinda, M

2017-02-01

The objective of this research is to investigate appropriate random regression models with various covariance functions for the genetic evaluation of test-day egg production. Data included 7,884 monthly egg production records from 657 Thai native chickens (Pradu Hang Dam) that were obtained during the first to sixth generation and were born during 2007 to 2014 at the Research and Development Network Center for Animal Breeding (Native Chickens), Khon Kaen University. Average annual and monthly egg productions were 117 ± 41 and 10.20 ± 6.40 eggs, respectively. Nine random regression models were analyzed using the Wilmink function (WM), Koops and Grossman function (KG), Legendre polynomial functions of second, third, and fourth order (LG2, LG3, LG4), and spline functions with 4, 5, 6, and 8 knots (SP4, SP5, SP6, and SP8). All covariance functions were nested within the same additive genetic and permanent environmental random effects, and the variance components were estimated by Restricted Maximum Likelihood (REML). In model comparisons, mean square error (MSE) and the coefficient of determination (R2) were used to assess the goodness of fit, and the correlation between observed and predicted values [Formula: see text] was used to assess the cross-validated predictive abilities. We found that the covariance functions of SP5, SP6, and SP8 proved appropriate for the genetic evaluation of the egg production curves for Thai native chickens. The estimated heritability of monthly egg production ranged from 0.07 to 0.39, and the highest heritability was found during the first to third months of egg production. In conclusion, the spline functions within monthly egg production can be applied to breeding programs for the improvement of both egg number and persistence of egg production. © 2016 Poultry Science Association Inc.
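For the Legendre polynomial models mentioned above (LG2 to LG4), the covariates of a random regression are usually the polynomial values at a time point rescaled to the interval [-1, 1]. The sketch below shows that construction with the standard three-term recurrence. It is an assumption-laden illustration: the function names are made up, the 1 to 12 month range is inferred from 7,884 records on 657 birds, and the polynomials are left unnormalized, whereas animal breeding software often applies a normalizing constant.

```python
def standardize_time(t, t_min, t_max):
    """Rescale a test-day time point to the interval [-1, 1]."""
    return 2.0 * (t - t_min) / (t_max - t_min) - 1.0

def legendre_covariates(x, order):
    """Return [P0(x), ..., P_order(x)] via (n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}."""
    values = [1.0]
    if order >= 1:
        values.append(x)
    for n in range(1, order):
        next_value = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
        values.append(next_value)
    return values

# Covariates for each month of production under an assumed 1..12 month range.
monthly_covariates = [
    legendre_covariates(standardize_time(month, 1, 12), order=2)
    for month in range(1, 13)
]
```

Each bird's additive genetic and permanent environmental effects would then be modelled as random coefficients on these columns, which is what allows heritability to vary across months.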
Brain Natriuretic Hormone Predicts Stress Induced Alterations in Diastolic Function

PubMed Central

Choksy, Pratik; Davis, Harry C.; Januzzi, James; Thayer, Julian; Harshfield, Gregory; Robinson, Vincent JB; Kapuku, Gaston K.

2015-01-01

Background Mental stress (MS) reduces diastolic function (DF) and may lead to congestive heart failure with preserved systolic function. Whether brain natriuretic hormone (BNP) mediates the relationship of MS with DF is unknown. Method and Results 160 individuals aged 30 to 50 years underwent a 2-hour protocol of 40 minutes of rest, a videogame stressor, and recovery. Hemodynamics, pro-BNP samples and DF indices were obtained throughout the protocol. Separate regression analyses were conducted using rest and stress E/A, E’ and E/E’ as dependent variables. Predictor variables were entered into the stepwise regression models in a hierarchical fashion. At the first level age, sex, race, height, BMI, pro-BNP, and LVM were permitted to enter the models. The second level consisted of SBP, DBP and HR. The final level contained cross-product terms of race by SBP, DBP and HR. E/A ratio was lower during stress compared to rest and recovery (p<0.01). Resting E/A ratio was predicted by a regression model of age (−.31), pro-BNP (.16), HR (−.40) and DBP (−.23) with an R² = .33. Stress E/A ratio was predicted by age (−.24), pro-BNP (.08), HR (−.38), and SBP (−.21), total R² = .22. Resting E’ model consisted of age (−.22), pro-BNP (.26), DBP (−.27) and LVM (−.15) with an R² = .29. Stress E’ was predicted by age (−.18), pro-BNP (.35) and LVM (−.18) with an R² = .18. Resting E/E’ was predicted by race (.17, B>W) and DBP (.24) with an R² = .10. Stress E/E’ consisted of pro-BNP (−.36), height (−.26) and HR (−.21) with R² = .15. Conclusion pro-BNP predicts both resting and stress DF, suggesting that lower BNP during MS may be a marker of diastolic dysfunction in apparently healthy individuals. PMID:24841419
CREATION OF A MODEL TO PREDICT SURVIVAL IN PATIENTS WITH REFRACTORY COELIAC DISEASE USING A MULTINATIONAL REGISTRY

PubMed Central

Rubio-Tapia, Alberto; Malamut, Georgia; Verbeek, Wieke H.M.; van Wanrooij, Roy L.J.; Leffler, Daniel A.; Niveloni, Sonia I.; Arguelles-Grande, Carolina; Lahr, Brian D.; Zinsmeister, Alan R.; Murray, Joseph A.; Kelly, Ciaran P.; Bai, Julio C.; Green, Peter H.; Daum, Severin; Mulder, Chris J.J.; Cellier, Christophe

2016-01-01

Background Refractory coeliac disease is a severe complication of coeliac disease with heterogeneous outcome. Aim To create a prognostic model to estimate survival of patients with refractory coeliac disease. Methods We evaluated predictors of 5-year mortality using Cox proportional hazards regression on subjects from a multinational registry. Bootstrap re-sampling was used to internally validate the individual factors and overall model performance. The mean of the estimated regression coefficients from 400 bootstrap models was used to derive a risk score for 5-year mortality. Results The multinational cohort was composed of 232 patients diagnosed with refractory coeliac disease across 7 centers (range of 11–63 cases per center). The median age was 53 years and 150 (64%) were women. A total of 51 subjects died during 5-year follow-up (cumulative 5-year all-cause mortality = 30%). From a multiple variable Cox proportional hazards model, the following variables were significantly associated with 5-year mortality: age at refractory coeliac disease diagnosis (per 20 year increase, hazard ratio = 2.21; 95% confidence interval: 1.38, 3.55), abnormal intraepithelial lymphocytes (hazard ratio = 2.85; 95% confidence interval: 1.22, 6.62), and albumin (per 0.5 unit increase, hazard ratio = 0.72; 95% confidence interval: 0.61, 0.85). A simple weighted 3-factor risk score was created to estimate 5-year survival. Conclusions Using data from a multinational registry and previously-reported risk factors, we create a prognostic model to predict 5-year mortality among patients with refractory coeliac disease. This new model may help clinicians to guide treatment and follow-up. PMID:27485029
Prediction of Multiple Infections After Severe Burn Trauma: a Prospective Cohort Study

PubMed Central

Yan, Shuangchun; Tsurumi, Amy; Que, Yok-Ai; Ryan, Colleen M.; Bandyopadhaya, Arunava; Morgan, Alexander A.; Flaherty, Patrick J.; Tompkins, Ronald G.; Rahme, Laurence G.

2014-01-01

Objective To develop predictive models for early triage of burn patients based on hyper-susceptibility to repeated infections. Background Infection remains a major cause of mortality and morbidity after severe trauma, demanding new strategies to combat infections. Models for infection prediction are lacking. Methods Secondary analysis of 459 burn patients (≥16 years old) with ≥20% total body surface area burns recruited from six US burn centers. We compared blood transcriptomes with a 180-h cut-off on the injury-to-transcriptome interval of 47 patients (≤1 infection episode) to those of 66 hyper-susceptible patients (multiple [≥2] infection episodes [MIE]). We used LASSO regression to select biomarkers and multivariate logistic regression to build models, the accuracy of which was assessed by area under the receiver operating characteristic curve (AUROC) and cross-validation. Results Three predictive models were developed using covariates of: (1) clinical characteristics; (2) expression profiles of 14 genomic probes; (3) a combination of (1) and (2). The genomic and clinical models were highly predictive of MIE status (genomic AUROC = 0.946 [95% CI, 0.906–0.986]; clinical AUROC = 0.864 [CI, 0.794–0.933]; genomic vs clinical P = 0.044). The combined model had an increased AUROC of 0.967 (CI, 0.940–0.993) compared to the individual models (combined vs clinical P = 0.0069). Hyper-susceptible patients show early alterations in immune-related signaling pathways, epigenetic modulation and chromatin remodeling. Conclusions Early triage of burn patients more susceptible to infections can be made using clinical characteristics and/or genomic signatures. The genomic signature suggests that new insights into the pathophysiology of hyper-susceptibility to infection may lead to novel therapeutic or prophylactic targets. PMID:24950278
Prediction of morbidity and mortality in patients with type 2 diabetes.

PubMed

Wells, Brian J; Roth, Rachel; Nowacki, Amy S; Arrigain, Susana; Yu, Changhong; Rosenkrans, Wayne A; Kattan, Michael W

2013-01-01

Introduction. The objective of this study was to create a tool that accurately predicts the risk of morbidity and mortality in patients with type 2 diabetes according to an oral hypoglycemic agent. Materials and Methods. The model was based on a cohort of 33,067 patients with type 2 diabetes who were prescribed a single oral hypoglycemic agent at the Cleveland Clinic between 1998 and 2006. Competing risk regression models were created for coronary heart disease (CHD), heart failure, and stroke, while a Cox regression model was created for mortality. Propensity scores were used to account for possible treatment bias. A prediction tool was created and internally validated using tenfold cross-validation. The results were compared to a Framingham model and a model based on the United Kingdom Prospective Diabetes Study (UKPDS) for CHD and stroke, respectively. Results and Discussion. Median follow-up for the mortality outcome was 769 days. The numbers of patients experiencing events were as follows: CHD (3062), heart failure (1408), stroke (1451), and mortality (3661). The prediction tools demonstrated the following concordance indices (c-statistics) for the specific outcomes: CHD (0.730), heart failure (0.753), stroke (0.688), and mortality (0.719). The prediction tool was superior to the Framingham model at predicting CHD and was at least as accurate as the UKPDS model at predicting stroke. Conclusions. We created an accurate tool for predicting the risk of stroke, coronary heart disease, heart failure, and death in patients with type 2 diabetes. The calculator is available online at http://rcalc.ccf.org under the heading "Type 2 Diabetes" and entitled, "Predicting 5-Year Morbidity and Mortality." This may be a valuable tool to aid the clinician's choice of an oral hypoglycemic, to better inform patients, and to motivate dialogue between physician and patient.
Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

PubMed

Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

2011-06-01

Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. 
HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
Using Multivariate Regression Model with Least Absolute Shrinkage and Selection Operator (LASSO) to Predict the Incidence of Xerostomia after Intensity-Modulated Radiotherapy for Head and Neck Cancer

PubMed Central

Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan

2014-01-01

Purpose The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R2, chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Results Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R2 was satisfactory and corresponded well with the expected values. 
Conclusions Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT. PMID:24586971
A consistent framework for Horton regression statistics that leads to a modified Hack's law

USGS Publications Warehouse

Furey, P.R.; Troutman, B.M.

2008-01-01

A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression. © 2008 Elsevier B.V.
Evaluation of Regression Models of Balance Calibration Data Using an Empirical Criterion

NASA Technical Reports Server (NTRS)

Ulbrich, Norbert; Volden, Thomas R.

2012-01-01

An empirical criterion for assessing the significance of individual terms of regression models of wind tunnel strain gage balance outputs is evaluated. The criterion is based on the percent contribution of a regression model term. It considers a term to be significant if its percent contribution exceeds the empirical threshold of 0.05%. The criterion has the advantage that it can easily be computed using the regression coefficients of the gage outputs and the load capacities of the balance. First, a definition of the empirical criterion is provided. Then, it is compared with an alternate statistical criterion that is widely used in regression analysis. Finally, calibration data sets from a variety of balances are used to illustrate the connection between the empirical and the statistical criterion. A review of these results indicated that the empirical criterion seems to be suitable for a crude assessment of the significance of a regression model term as the boundary between a significant and an insignificant term cannot be defined very well. Therefore, regression model term reduction should only be performed by using the more universally applicable statistical criterion.
Widen NomoGram for multinomial logistic regression: an application to staging liver fibrosis in chronic hepatitis C patients.

PubMed

Ardoino, Ilaria; Lanzoni, Monica; Marano, Giuseppe; Boracchi, Patrizia; Sagrini, Elisabetta; Gianstefani, Alice; Piscaglia, Fabio; Biganzoli, Elia M

2017-04-01

The interpretation of regression models results can often benefit from the generation of nomograms, 'user friendly' graphical devices especially useful for assisting the decision-making processes. However, in the case of multinomial regression models, whenever categorical responses with more than two classes are involved, nomograms cannot be drawn in the conventional way. Such a difficulty in managing and interpreting the outcome could often result in a limitation of the use of multinomial regression in decision-making support. In the present paper, we illustrate the derivation of a non-conventional nomogram for multinomial regression models, intended to overcome this issue. Although it may appear less straightforward at first sight, the proposed methodology allows an easy interpretation of the results of multinomial regression models and makes them more accessible for clinicians and general practitioners too. Development of prediction model based on multinomial logistic regression and of the pertinent graphical tool is illustrated by means of an example involving the prediction of the extent of liver fibrosis in hepatitis C patients by routinely available markers.
Robust mislabel logistic regression without modeling mislabel probabilities.

PubMed

Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun

2018-03-01

Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
Complications after procedures of photorefractive keratectomy

NASA Astrophysics Data System (ADS)

Gierek-Ciaciura, Stanislawa

1998-10-01

Purpose: The aim of this study was to investigate the safety of PRK procedures. Material and method: 151 eyes after PRK for correction of myopia and 112 after PRK for correction of myopic astigmatism were examined. All PRK procedures were performed with an excimer laser manufactured by Aesculap Meditec. Results: Haze, regression, decentration, infection and overcorrection were found. Conclusions: The most frequent complication is regression. Corneal inflammation in the early postoperative period may cause regression or haze. The greater the corrected refractive error, the greater the degree of haze. Haze decreases with time.
Application of Semiparametric Spline Regression Model in Analyzing Factors that Influence Population Density in Central Java

NASA Astrophysics Data System (ADS)

Sumantari, Y. D.; Slamet, I.; Sugiyanto

2017-06-01

Semiparametric regression is a statistical analysis method that combines parametric and nonparametric regression. There are various approaches to nonparametric regression, one of which is the spline. Central Java is one of the most densely populated provinces in Indonesia. Population density in this province can be modeled by semiparametric regression because it has both parametric and nonparametric components. Therefore, the purpose of this paper is to determine the factors that influence population density in Central Java using the semiparametric spline regression model. The results show that the factors which influence population density in Central Java are Family Planning (FP) active participants and district minimum wage.
Quantifying predictive capability of electronic health records for the most harmful breast cancer

NASA Astrophysics Data System (ADS)

Wu, Yirong; Fan, Jun; Peissig, Peggy; Berg, Richard; Tafti, Ahmad Pahlavan; Yin, Jie; Yuan, Ming; Page, David; Cox, Jennifer; Burnside, Elizabeth S.

2018-03-01

Improved prediction of the "most harmful" breast cancers that cause the most substantive morbidity and mortality would enable physicians to target more intense screening and preventive measures at those women who have the highest risk; however, such prediction models for the "most harmful" breast cancers have rarely been developed. Electronic health records (EHRs) represent an underused data source that has great research and clinical potential. Our goal was to quantify the value of EHR variables in the "most harmful" breast cancer risk prediction. We identified 794 subjects who had breast cancer with primary non-benign tumors with their earliest diagnosis on or after 1/1/2004 from an existing personalized medicine data repository, including 395 "most harmful" breast cancer cases and 399 "least harmful" breast cancer cases. For these subjects, we collected EHR data comprised of 6 components: demographics, diagnoses, symptoms, procedures, medications, and laboratory results. We developed two regularized prediction models, Ridge Logistic Regression (Ridge-LR) and Lasso Logistic Regression (Lasso-LR), to predict the "most harmful" breast cancer one year in advance. The area under the ROC curve (AUC) was used to assess model performance. We observed that the AUCs of Ridge-LR and Lasso-LR models were 0.818 and 0.839 respectively. For both the Ridge-LR and Lasso-LR models, the predictive performance of the whole EHR variables was significantly higher than that of each individual component (p<0.001). In conclusion, EHR variables can be used to predict the "most harmful" breast cancer, providing the possibility to personalize care for those women at the highest risk in clinical practice.

Quantifying predictive capability of electronic health records for the most harmful breast cancer.

PubMed

Wu, Yirong; Fan, Jun; Peissig, Peggy; Berg, Richard; Tafti, Ahmad Pahlavan; Yin, Jie; Yuan, Ming; Page, David; Cox, Jennifer; Burnside, Elizabeth S

2018-02-01

Improved prediction of the "most harmful" breast cancers that cause the most substantive morbidity and mortality would enable physicians to target more intense screening and preventive measures at those women who have the highest risk; however, such prediction models for the "most harmful" breast cancers have rarely been developed. Electronic health records (EHRs) represent an underused data source that has great research and clinical potential. Our goal was to quantify the value of EHR variables in the "most harmful" breast cancer risk prediction. We identified 794 subjects who had breast cancer with primary non-benign tumors with their earliest diagnosis on or after 1/1/2004 from an existing personalized medicine data repository, including 395 "most harmful" breast cancer cases and 399 "least harmful" breast cancer cases. For these subjects, we collected EHR data comprised of 6 components: demographics, diagnoses, symptoms, procedures, medications, and laboratory results. We developed two regularized prediction models, Ridge Logistic Regression (Ridge-LR) and Lasso Logistic Regression (Lasso-LR), to predict the "most harmful" breast cancer one year in advance. The area under the ROC curve (AUC) was used to assess model performance. We observed that the AUCs of Ridge-LR and Lasso-LR models were 0.818 and 0.839 respectively. For both the Ridge-LR and Lasso-LR models, the predictive performance of the whole EHR variables was significantly higher than that of each individual component (p<0.001). In conclusion, EHR variables can be used to predict the "most harmful" breast cancer, providing the possibility to personalize care for those women at the highest risk in clinical practice.
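The AUC used above to assess model performance can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. A minimal sketch of that pairwise definition follows; the function name and the toy risks and outcomes are made up for illustration and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up predicted risks and outcomes (1 = "most harmful"), not study data.
risk = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
outcome = [1, 1, 1, 0, 0, 0]
print(auc(risk, outcome))
```

The double loop is quadratic in the number of subjects, which is acceptable for a few hundred records; a rank-based formulation would be the usual choice for larger data sets.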
Preoperatively staging liver fibrosis using noninvasive method in Hepatitis B virus-infected hepatocellular carcinoma patients

PubMed Central

Gao, Hengyi; Zhu, Feng; Wang, Min; Zhang, Hang; Ye, Dawei; Yang, Jiayin; Jiang, Li; Liu, Chang; Qin, Renyi; Yan, Lunan; Xiao, Guangqin

2017-01-01

Background Advanced liver fibrosis can result in serious complications (even death) after partial hepatectomy. Preoperative percutaneous liver biopsy is an invasive and expensive method to assess liver fibrosis. We aim to establish a noninvasive model, on the basis of preoperative biomarkers, to predict liver fibrosis in hepatocellular carcinoma (HCC) patients with hepatitis B virus (HBV) infection. Methods HBV-infected liver cancer patients who had received hepatectomy were retrospectively and prospectively enrolled in this study. Univariate analysis was used to compare the variables of the patients with mild to moderate liver fibrosis and with severe liver fibrosis. The significant factors were entered into binary logistic regression analysis. Factors determined to be significant were used to establish a noninvasive model. Then the diagnostic accuracy of this novel model was examined based on sensitivity, specificity and area under the receiver-operating characteristic curve (AUC). Results This study included 2,176 HBV-infected HCC patients who had undergone partial hepatectomy (1,682 retrospective subjects and 494 prospective subjects). Regression analysis indicated that total bilirubin and prothrombin time were positively correlated with liver fibrosis. It also demonstrated that blood platelet count and fibrinogen were negatively correlated with liver fibrosis. The AUC values of the model based on these four factors for predicting significant fibrosis, advanced fibrosis and cirrhosis were 0.79-0.83, 0.83-0.85 and 0.85-0.88, respectively. Conclusion The results showed that this novel preoperative model was an excellent noninvasive method for assessing liver fibrosis in HBV-infected HCC patients. PMID:28008144
Total inpatient treatment costs in patients with severe burns: towards a more accurate reimbursement model.

PubMed

Mehra, Tarun; Koljonen, Virve; Seifert, Burkhardt; Volbracht, Jörk; Giovanoli, Pietro; Plock, Jan; Moos, Rudolf Maria

2015-01-01

Reimbursement systems have difficulties depicting the actual cost of burn treatment, leaving care providers with a significant financial burden. Our aim was to establish a simple and accurate reimbursement model compatible with prospective payment systems. A total of 370 966 electronic medical records of patients discharged in 2012 to 2013 from Swiss university hospitals were reviewed. A total of 828 cases of burns including 109 cases of severe burns were retained. Costs, revenues and earnings for severe and nonsevere burns were analysed and a linear regression model predicting total inpatient treatment costs was established. The median total costs per case for severe burns was tenfold higher than for nonsevere burns (179 949 CHF [167 353 EUR] vs 11 312 CHF [10 520 EUR], interquartile ranges 96 782-328 618 CHF vs 4 874-27 783 CHF, p <0.001). The median of earnings per case for nonsevere burns was 588 CHF (547 EUR) (interquartile range -6 720 - 5 354 CHF) whereas severe burns incurred a large financial loss to care providers, with median earnings of -33 178 CHF (30 856 EUR) (interquartile range -95 533 - 23 662 CHF). Differences were highly significant (p <0.001). Our linear regression model predicting total costs per case with length of stay (LOS) as independent variable had an adjusted R2 of 0.67 (p <0.001 for LOS). Severe burns are systematically underfunded within the Swiss reimbursement system. Flat-rate DRG-based refunds poorly reflect the actual treatment costs. In conclusion, we suggest a reimbursement model based on a per diem rate for treatment of severe burns.
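For readers unfamiliar with the adjusted R2 quoted for the length-of-stay model, the sketch below shows the standard adjustment, 1 - (1 - R2)(n - 1)/(n - p - 1), applied to observed and predicted values. The function name and the four toy data points are illustrative assumptions and have no connection to the Swiss cost data.

```python
def adjusted_r2(observed, predicted, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1),
    where R^2 = 1 - SS_res / SS_tot and p is the number of predictors."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Toy values only: four observed outcomes and the fitted values from a
# one-predictor line (here R^2 = 0.64, so the adjusted value is lower).
observed = [1.0, 3.0, 2.0, 4.0]
predicted = [1.3, 2.1, 2.9, 3.7]
print(adjusted_r2(observed, predicted, n_predictors=1))
```

With a single predictor and several hundred cases, as in the study, the adjustment is small; it matters mainly when the number of predictors is large relative to the sample size.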
Development of an anaerobic threshold (HRLT, HRVT) estimation equation using the heart rate threshold (HRT) during the treadmill incremental exercise test.

PubMed

Ham, Joo-Ho; Park, Hun-Young; Kim, Youn-Ho; Bae, Sang-Kon; Ko, Byung-Hoon; Nam, Sang-Seok

2017-09-30

The purpose of this study was to develop a regression model to estimate the heart rate at the lactate threshold (HRLT) and the heart rate at the ventilatory threshold (HRVT) using the heart rate threshold (HRT), and to test the validity of the regression model. We performed a graded exercise test with a treadmill in 220 normal individuals (men: 112, women: 108) aged 20-59 years. HRT, HRLT, and HRVT were measured in all subjects. A regression model was developed to estimate HRLT and HRVT using HRT with 70% of the data (men: 79, women: 76) through randomization (7:3), with the Bernoulli trial. The validity of the regression model developed with the remaining 30% of the data (men: 33, women: 32) was also examined. Based on the regression coefficient, we found that the independent variable HRT was a significant variable in all regression models. The adjusted R2 of the developed regression models averaged about 70%, and the standard error of estimation of the validity test results was 11 bpm, which is similar to that of the developed model. These results suggest that HRT is a useful parameter for predicting HRLT and HRVT. ©2017 The Korean Society for Exercise Nutrition
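The development and validation design described above (a 7:3 random split by Bernoulli trial, a regression fitted on the larger part, and an error estimate on the remainder) can be sketched as follows. The data are synthetic, the coefficients used to generate them are arbitrary, and the validation error is taken here as the root mean squared prediction error, which is an assumption about how the authors computed their standard error of estimation.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def rmse(x, y, a, b):
    """Root mean squared prediction error of the line a + b * x."""
    return math.sqrt(sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(x))


rng = random.Random(0)  # fixed seed so the split is reproducible
# Synthetic (HRT, HRLT) pairs in bpm; not the study's measurements.
hrt = [rng.uniform(120, 180) for _ in range(220)]
hrlt = [0.9 * h + 10 + rng.gauss(0, 8) for h in hrt]

# One Bernoulli(0.7) draw per subject decides development vs validation.
in_train = [rng.random() < 0.7 for _ in hrt]
train = [(h, l) for h, l, t in zip(hrt, hrlt, in_train) if t]
valid = [(h, l) for h, l, t in zip(hrt, hrlt, in_train) if not t]

a, b = fit_line(*zip(*train))
error_valid = rmse(*zip(*valid), a, b)
print(len(train), len(valid), round(error_valid, 1))
```

Because each subject is assigned independently, the realized split is only approximately 7:3, which is consistent with the slightly uneven group sizes reported in the abstract.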
Prevalence of vitamin D deficiency and associated factors in women and newborns in the immediate postpartum period

PubMed Central

do Prado, Mara Rúbia Maciel Cardoso; Oliveira, Fabiana de Cássia Carvalho; Assis, Karine Franklin; Ribeiro, Sarah Aparecida Vieira; do Prado, Pedro Paulo; Sant'Ana, Luciana Ferreira da Rocha; Priore, Silvia Eloiza; Franceschini, Sylvia do Carmo Castro

2015-01-01

Abstract Objective: To assess the prevalence of vitamin D deficiency and its associated factors in women and their newborns in the postpartum period. Methods: This cross-sectional study evaluated vitamin D deficiency/insufficiency in 226 women and their newborns in Viçosa (Minas Gerais, BR) between December 2011 and November 2012. Cord blood and venous maternal blood were collected to evaluate the following biochemical parameters: vitamin D, alkaline phosphatase, calcium, phosphorus and parathyroid hormone. Poisson regression analysis, with a confidence interval of 95%, was applied to assess vitamin D deficiency and its associated factors. Multiple linear regression analysis was performed to identify factors associated with 25(OH)D deficiency in the newborns and women from the study. The criteria for variable inclusion in the multiple linear regression model was the association with the dependent variable in the simple linear regression analysis, considering p<0.20. Significance level was α <5%. Results: From 226 women included, 200 (88.5%) were 20-44 years old; the median age was 28 years. Deficient/insufficient levels of vitamin D were found in 192 (85%) women and in 182 (80.5%) neonates. The maternal 25(OH)D and alkaline phosphatase levels were independently associated with vitamin D deficiency in infants. Conclusions: This study identified a high prevalence of vitamin D deficiency and insufficiency in women and newborns and the association between maternal nutritional status of vitamin D and their infants' vitamin D status. PMID:26100593
Nonlinear isochrones in murine left ventricular pressure-volume loops: how well does the time-varying elastance concept hold?

PubMed

Claessens, T E; Georgakopoulos, D; Afanasyeva, M; Vermeersch, S J; Millar, H D; Stergiopulos, N; Westerhof, N; Verdonck, P R; Segers, P

2006-04-01

The linear time-varying elastance theory is frequently used to describe the change in ventricular stiffness during the cardiac cycle. The concept assumes that all isochrones (i.e., curves that connect pressure-volume data occurring at the same time) are linear and have a common volume intercept. Of specific interest is the steepest isochrone, the end-systolic pressure-volume relationship (ESPVR), of which the slope serves as an index for cardiac contractile function. Pressure-volume measurements, achieved with a combined pressure-conductance catheter in the left ventricle of 13 open-chest anesthetized mice, showed a marked curvilinearity of the isochrones. We therefore analyzed the shape of the isochrones by using six regression algorithms (two linear, two quadratic, and two logarithmic, each with a fixed or time-varying intercept) and discussed the consequences for the elastance concept. Our main observations were 1) the volume intercept varies considerably with time; 2) isochrones are equally well described by using quadratic or logarithmic regression; 3) linear regression with a fixed intercept shows poor correlation (R(2) < 0.75) during isovolumic relaxation and early filling; and 4) logarithmic regression is superior in estimating the fixed volume intercept of the ESPVR. In conclusion, the linear time-varying elastance fails to provide a sufficiently robust model to account for changes in pressure and volume during the cardiac cycle in the mouse ventricle. A new framework accounting for the nonlinear shape of the isochrones needs to be developed.
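To make the comparison of isochrone shapes concrete, the sketch below fits linear, quadratic, and logarithmic curves to one isochrone by least squares and reports R² for each. The functional forms, the synthetic data, and all names are assumptions for illustration; the abstract does not give the authors' exact parameterization (in particular how the fixed or time-varying volume intercept enters).

```python
import math

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_r2(volumes, pressures, basis):
    """Least-squares fit of P = sum_k c_k * basis[k](V); returns (coefficients, R^2)."""
    design = [[f(v) for f in basis] for v in volumes]
    k = len(basis)
    ata = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * p for row, p in zip(design, pressures)) for i in range(k)]
    coef = solve(ata, aty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in design]
    mean_p = sum(pressures) / len(pressures)
    ss_res = sum((p - f) ** 2 for p, f in zip(pressures, fitted))
    ss_tot = sum((p - mean_p) ** 2 for p in pressures)
    return coef, 1 - ss_res / ss_tot

# Assumed model forms (illustrative only).
MODELS = {
    "linear": [lambda v: 1.0, lambda v: v],
    "quadratic": [lambda v: 1.0, lambda v: v, lambda v: v * v],
    "logarithmic": [lambda v: 1.0, lambda v: math.log(v)],
}

# Synthetic, curvilinear isochrone (not measured data).
volumes = [10.0 + 2.0 * i for i in range(11)]
pressures = [5.0 + 20.0 * math.log(v) for v in volumes]
r2 = {name: fit_r2(volumes, pressures, basis)[1] for name, basis in MODELS.items()}
```

On data of this shape the linear R² falls below the other two, which is the kind of pattern the abstract reports for curvilinear isochrones.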
Weather variability and the incidence of cryptosporidiosis: comparison of time series poisson regression and SARIMA models.

PubMed

Hu, Wenbiao; Tong, Shilu; Mengersen, Kerrie; Connell, Des

2007-09-01

Few studies have examined the relationship between weather variables and cryptosporidiosis in Australia. This paper examines the potential impact of weather variability on the transmission of cryptosporidiosis and explores the possibility of developing an empirical forecast system. Data on weather variables, notified cryptosporidiosis cases, and population size in Brisbane were supplied by the Australian Bureau of Meteorology, Queensland Department of Health, and Australian Bureau of Statistics, respectively, for the period of January 1, 1996-December 31, 2004. Time series Poisson regression and seasonal autoregressive integrated moving average (SARIMA) models were fitted to examine the potential impact of weather variability on the transmission of cryptosporidiosis. Both the time series Poisson regression and SARIMA models show that seasonal and monthly maximum temperature at a prior moving average of 1 and 3 months were significantly associated with cryptosporidiosis. The models suggest that there may be 50 more cases a year for an increase of 1 °C in maximum temperature on average in Brisbane. Model assessments indicated that the SARIMA model had better predictive ability than the Poisson regression model (SARIMA: root mean square error (RMSE): 0.40, Akaike information criterion (AIC): -12.53; Poisson regression: RMSE: 0.54, AIC: -2.84). Furthermore, the analysis of residuals shows that the time series Poisson regression appeared to violate a modeling assumption, in that residual autocorrelation persisted. The results of this study suggest that weather variability (particularly maximum temperature) may have played a significant role in the transmission of cryptosporidiosis. A SARIMA model may be a better predictive model than a Poisson regression model in the assessment of the relationship between weather variability and the incidence of cryptosporidiosis.
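Two of the diagnostics named in this abstract, RMSE and residual autocorrelation, are simple to compute once a model's fitted values are available. The sketch below assumes plain lists of observed and predicted values; the function names are illustrative and no particular time series library is implied.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def residual_autocorrelation(residuals, lag=1):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    return sum(dev[t] * dev[t - lag] for t in range(lag, n)) / denom
```

A lag-1 value far from zero in the residuals of a fitted model would point to the kind of persistent autocorrelation the abstract reports for the Poisson regression.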
Automatic Classification of Users’ Health Information Need Context: Logistic Regression Analysis of Mouse-Click and Eye-Tracker Data

PubMed Central

Pian, Wenjing; Khoo, Christopher SG

2017-01-01

Background Users searching for health information on the Internet may be searching for their own health issue, searching for someone else’s health issue, or browsing with no particular health issue in mind. Previous research has found that these three categories of users focus on different types of health information. However, most health information websites provide static content for all users. If the three types of user health information need contexts can be identified by the Web application, the search results or information offered to the user can be customized to increase its relevance or usefulness to the user. Objective The aim of this study was to investigate the possibility of identifying the three user health information contexts (searching for self, searching for others, or browsing with no particular health issue in mind) using just hyperlink clicking behavior; using eye-tracking information; and using a combination of eye-tracking, demographic, and urgency information. Predictive models are developed using multinomial logistic regression. Methods A total of 74 participants (39 females and 35 males) who were mainly staff and students of a university were asked to browse a health discussion forum, Healthboards.com. An eye tracker recorded their examining (eye fixation) and skimming (quick eye movement) behaviors on 2 types of screens: summary result screen displaying a list of post headers, and detailed post screen. The following three types of predictive models were developed using logistic regression analysis: model 1 used only the time spent in scanning the summary result screen and reading the detailed post screen, which can be determined from the user’s mouse clicks; model 2 used the examining and skimming durations on each screen, recorded by an eye tracker; and model 3 added user demographic and urgency information to model 2. 
Results An analysis of variance (ANOVA) found that users’ browsing durations were significantly different for the three health information contexts (P<.001). The logistic regression model 3 was able to predict the user’s type of health information context with a 10-fold cross validation mean accuracy of 84% (62/74), followed by model 2 at 73% (54/74) and model 1 at 71% (52/78). In addition, correlation analysis found that particular browsing durations were highly correlated with users’ age, education level, and the urgency of their information need. Conclusions A user’s type of health information need context (ie, searching for self, for others, or with no health issue in mind) can be identified with reasonable accuracy using just user mouse clicks that can easily be detected by Web applications. Higher accuracy can be obtained using Google Glass or future computing devices with an eye-tracking function. PMID:29269342
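The 10-fold cross-validation used to score these models can be sketched generically. Everything here is an assumption for illustration: the fold construction, the pooled accuracy (correct predictions over all held-out cases), and the majority-class placeholder that stands in for the study's multinomial logistic regression.

```python
import random
from collections import Counter

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(features, labels, fit, predict, k=10, seed=0):
    """Train on k-1 folds, score on the held-out fold, and pool the correct counts."""
    n = len(labels)
    correct = 0
    for held_out in k_fold_indices(n, k, seed):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, features[i]) == labels[i] for i in held_out)
    return correct / n

# Placeholder classifier (hypothetical): always predicts the most common training label.
def fit_majority(features, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, feature):
    return model
```

With 74 participants and 10 folds, each held-out fold contains 7 or 8 cases, and a pooled accuracy of 62/74 rounds to the 84% quoted for model 3.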
Inequality and adolescent violence: an exploration of community, family, and individual factors.

PubMed Central

Bruce, Marino A.

2004-01-01

PURPOSE: The study seeks to examine whether the relationships among community, family, individual factors, and violent behavior are parallel across race- and gender-specific segments of the adolescent population. METHODS: Data from the National Longitudinal Study of Adolescent Health are analyzed to highlight the complex relationships between inequality, community, family, individual behavior, and violence. RESULTS: The results from robust regression analysis provide evidence that social environmental factors can influence adolescent violence in race- and gender-specific ways. CONCLUSIONS: Findings from this study establish the plausibility of multidimensional models that specify a complex relationship between inequality and adolescent violence. PMID:15101669
Spatial Double Generalized Beta Regression Models: Extensions and Application to Study Quality of Education in Colombia

ERIC Educational Resources Information Center

Cepeda-Cuervo, Edilberto; Núñez-Antón, Vicente

2013-01-01

In this article, a proposed Bayesian extension of the generalized beta spatial regression models is applied to the analysis of the quality of education in Colombia. We briefly revise the beta distribution and describe the joint modeling approach for the mean and dispersion parameters in the spatial regression models' setting. Finally, we motivate…
Hidden Connections between Regression Models of Strain-Gage Balance Calibration Data

NASA Technical Reports Server (NTRS)

Ulbrich, Norbert

2013-01-01

Hidden connections between regression models of wind tunnel strain-gage balance calibration data are investigated. These connections become visible whenever balance calibration data is supplied in its design format and both the Iterative and Non-Iterative Method are used to process the data. First, it is shown how the regression coefficients of the fitted balance loads of a force balance can be approximated by using the corresponding regression coefficients of the fitted strain-gage outputs. Then, data from the manual calibration of the Ames MK40 six-component force balance is chosen to illustrate how estimates of the regression coefficients of the fitted balance loads can be obtained from the regression coefficients of the fitted strain-gage outputs. The study illustrates that load predictions obtained by applying the Iterative or the Non-Iterative Method originate from two related regression solutions of the balance calibration data as long as balance loads are given in the design format of the balance, gage outputs behave highly linear, strict statistical quality metrics are used to assess regression models of the data, and regression model term combinations of the fitted loads and gage outputs can be obtained by a simple variable exchange.
Investigating the Performance of Alternate Regression Weights by Studying All Possible Criteria in Regression Models with a Fixed Set of Predictors

ERIC Educational Resources Information Center

Waller, Niels; Jones, Jeff

2011-01-01

We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n x 1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a…
Logistic models--an odd(s) kind of regression.

PubMed

Jupiter, Daniel C

2013-01-01

The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
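The key interpretive difference from linear regression is that a logistic coefficient acts on the log-odds scale, so a one-unit increase in a predictor multiplies the odds by exp(coefficient). A small numerical check, with a hypothetical intercept and coefficient:

```python
import math

def logistic(z):
    """Map a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

def odds_ratio(beta):
    """Odds ratio implied by a one-unit increase in a predictor with coefficient beta."""
    return math.exp(beta)

# Hypothetical fitted values, for illustration only.
INTERCEPT, BETA = -1.0, 0.7

p_at_2 = logistic(INTERCEPT + BETA * 2)
p_at_3 = logistic(INTERCEPT + BETA * 3)
ratio = odds(p_at_3) / odds(p_at_2)  # equals exp(BETA) up to rounding
```

The change in probability between the two points depends on where they sit on the curve, while the odds ratio is the same everywhere, which is why logistic results are usually reported as odds ratios.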
Numerically accurate computational techniques for optimal estimator analyses of multi-parameter models

NASA Astrophysics Data System (ADS)

Berger, Lukas; Kleinheinz, Konstantin; Attili, Antonio; Bisetti, Fabrizio; Pitsch, Heinz; Mueller, Michael E.

2018-05-01

Modelling unclosed terms in partial differential equations typically involves two steps: First, a set of known quantities needs to be specified as input parameters for a model, and second, a specific functional form needs to be defined to model the unclosed terms by the input parameters. Both steps involve a certain modelling error, with the former known as the irreducible error and the latter referred to as the functional error. Typically, only the total modelling error, which is the sum of functional and irreducible error, is assessed, but the concept of the optimal estimator enables the separate analysis of the total and the irreducible errors, yielding a systematic modelling error decomposition. In this work, attention is paid to the techniques themselves required for the practical computation of irreducible errors. Typically, histograms are used for optimal estimator analyses, but this technique is found to add a non-negligible spurious contribution to the irreducible error if models with multiple input parameters are assessed. Thus, the error decomposition of an optimal estimator analysis becomes inaccurate, and misleading conclusions concerning modelling errors may be drawn. In this work, numerically accurate techniques for optimal estimator analyses are identified and a suitable evaluation of irreducible errors is presented. Four different computational techniques are considered: a histogram technique, artificial neural networks, multivariate adaptive regression splines, and an additive model based on a kernel method. For multiple input parameter models, only artificial neural networks and multivariate adaptive regression splines are found to yield satisfactorily accurate results. Beyond a certain number of input parameters, the assessment of models in an optimal estimator analysis even becomes practically infeasible if histograms are used. 
The optimal estimator analysis in this paper is applied to modelling the filtered soot intermittency in large eddy simulations using a dataset of a direct numerical simulation of a non-premixed sooting turbulent flame.
Reported Theory Use by Digital Interventions for Hazardous and Harmful Alcohol Consumption, and Association With Effectiveness: Meta-Regression

PubMed Central

Crane, David; Brown, Jamie; Kaner, Eileen; Beyer, Fiona; Muirhead, Colin; Hickman, Matthew; Redmore, James; de Vocht, Frank; Beard, Emma; Michie, Susan

2018-01-01

Background Applying theory to the design and evaluation of interventions is likely to increase effectiveness and improve the evidence base from which future interventions are developed, though few interventions report this. Objective The aim of this paper was to assess how digital interventions to reduce hazardous and harmful alcohol consumption report the use of theory in their development and evaluation, and whether reporting of theory use is associated with intervention effectiveness. Methods Randomized controlled trials were extracted from a Cochrane review on digital interventions for reducing hazardous and harmful alcohol consumption. Reporting of theory use within these digital interventions was investigated using the theory coding scheme (TCS). Reported theory use was analyzed by frequency counts and descriptive statistics. Associations were analyzed with meta-regression models. Results Of 41 trials involving 42 comparisons, half did not mention theory (50% [21/42]), and only 38% (16/42) used theory to select or develop the intervention techniques. Significant heterogeneity existed between studies in the effect of interventions on alcohol reduction (I2=77.6%, P<.001). No significant associations were detected between reporting of theory use and intervention effectiveness in unadjusted models, though the meta-regression was underpowered to detect modest associations. Conclusions Digital interventions offer a unique opportunity to refine and develop new dynamic, temporally sensitive theories, yet none of the studies reported refining or developing theory. Clearer selection, application, and reporting of theory use is needed to accurately assess how useful theory is in this field and to advance the field of behavior change theories. PMID:29490895
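The heterogeneity statistic quoted in the results (I2) is conventionally derived from Cochran's Q. The sketch below uses the standard inverse-variance definitions; the input numbers in any usage are made up, and nothing here reproduces the review's actual data.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the inverse-variance pooled effect."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))

def i_squared(effects, variances):
    """I^2 = max(0, (Q - df) / Q), the share of variation attributed to heterogeneity."""
    q = cochran_q(effects, variances)
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)
```

For example, two studies with effects 0 and 1 and variance 0.01 each give Q = 50 on 1 degree of freedom, so I2 is 0.98; identical effects give I2 = 0.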
Longitudinal Analysis of Gender Differences in Academic Productivity among Medical Faculty across 24 Medical Schools in the United States

PubMed Central

Raj, Anita; Carr, Phyllis L.; Kaplan, Samantha E.; Terrin, Norma; Breeze, Janis L.; Freund, Karen M.

2017-01-01

Purpose This study examines gender differences in academic productivity, as indicated by publications and federal grant funding acquisition, among a longitudinal cohort of medical faculty from 24 medical schools across the United States, 1995 to 2012. Method Data for this research was taken from the National Faculty Study involving a survey with medical faculty recruited from medical schools in 1995, and followed up in 2012. Data included surveys and publication and grant funding databases. Outcomes were number of publications, h-index and principal investigator on a federal grant in the prior two years. Gender differences were assessed using negative binomial regression models for publication and h-index outcomes, and logistic regression for the grant funding outcome; analyses adjusted for race/ethnicity, rank, specialty area and years since first academic appointment. Results Data were available for 1,244 of the 1,275 (98%) subjects eligible for the follow up study. Men were significantly more likely than women to be married/partnered, have children, and hold the rank of professor (P < .0001). Adjusted regression models document that women have a lower rate of publication (relative number = .71; 95% CI = .63, .81; P < .0001) and h-index (relative number = .81; 95% CI = .73, .90; P < .0001) relative to men, though there was no gender difference in grant funding. Conclusions Women faculty acquire federal funding at similar rates as male faculty, yet lag behind in terms of publications and their impact. Medical academia must consider how to help address ongoing gender disparities in publication records. PMID:27276002
Collagen Triple Helix Repeat Containing-1 (CTHRC1) Expression in Oral Squamous Cell Carcinoma (OSCC): Prognostic Value and Clinico-Pathological Implications

PubMed Central

Lee, Chia Ee; Vincent-Chong, Vui King; Ramanathan, Anand; Kallarakkal, Thomas George; Karen-Ng, Lee Peng; Ghani, Wan Maria Nabillah; Rahman, Zainal Ariff Abdul; Ismail, Siti Mazlipah; Abraham, Mannil Thomas; Tay, Keng Kiong; Mustafa, Wan Mahadzir Wan; Cheong, Sok Ching; Zain, Rosnah Binti

2015-01-01

BACKGROUND: Collagen Triple Helix Repeat Containing 1 (CTHRC1) is a protein often found to be over-expressed in various types of human cancers. However, the correlation of CTHRC1 expression level with clinico-pathological characteristics and prognosis in oral cancer remains unclear. Therefore, this study aimed to determine mRNA and protein expression of CTHRC1 in oral squamous cell carcinoma (OSCC) and to evaluate the clinical and prognostic impact of CTHRC1 in OSCC. METHODS: In this study, mRNA and protein expression of CTHRC1 in OSCCs were determined by quantitative PCR and immunohistochemistry, respectively. The associations between CTHRC1 and clinico-pathological parameters were evaluated by univariate and multivariate binary logistic regression analyses. The correlation between CTHRC1 protein expression and survival was analysed using Kaplan-Meier and Cox regression models. RESULTS: The current study demonstrated that CTHRC1 was significantly overexpressed at the mRNA level in OSCC. Univariate analyses indicated that high expression of CTHRC1 was significantly associated with advanced pTNM stage, tumour size ≥ 4 cm and positive lymph node metastasis (LNM). However, only positive LNM remained significant after adjusting for other confounding factors in multivariate logistic regression analyses. Kaplan-Meier survival analyses and the Cox model demonstrated that high expression of CTHRC1 protein was associated with poor prognosis and is an independent prognostic factor in OSCC. CONCLUSION: This study indicated that over-expression of CTHRC1 is potentially an independent predictor of positive LNM and poor prognosis in OSCC. PMID:26664254
SNPs selection using support vector regression and genetic algorithms in GWAS

PubMed Central

2014-01-01

Introduction This paper proposes a new methodology to simultaneously select the most relevant SNP markers for the characterization of any measurable phenotype described by a continuous variable, using Support Vector Regression with the Pearson Universal kernel (PUK) as the fitness function of a binary genetic algorithm. The proposed methodology is multi-attribute, considering several markers simultaneously to explain the phenotype, and is based jointly on statistical tools, machine learning and computational intelligence. Results The suggested method has shown potential in simulated database 1, with additive effects only, and in the real database. In this simulated database, with a total of 1,000 markers, of which 7 have a major effect on the phenotype and the other 993 SNPs represent noise, the method identified 21 markers. Of these, 5 are among the 7 relevant SNPs and 16 are false positives. In the real database, initially with 50,752 SNPs, the method reduced the set to 3,073 markers, increasing the accuracy of the model. In simulated database 2, with additive effects and interactions (epistasis), the proposed method matched the methodology most commonly used in GWAS. Conclusions The method suggested in this paper is effective in explaining the real phenotype (PTA for milk): applying the wrapper based on the genetic algorithm and Support Vector Regression with the Pearson Universal kernel eliminated many redundant markers, increasing the prediction accuracy of the model on the real database without quality control filters. The PUK demonstrated that it can replicate the performance of linear and RBF kernels. PMID:25573332
Space, race, and poverty: Spatial inequalities in walkable neighborhood amenities?

PubMed Central

Aldstadt, Jared; Whalen, John; White, Kellee; Castro, Marcia C.; Williams, David R.

2017-01-01

BACKGROUND Multiple and varied benefits have been suggested for increased neighborhood walkability. However, spatial inequalities in neighborhood walkability likely exist and may be attributable, in part, to residential segregation. OBJECTIVE Utilizing a spatial demographic perspective, we evaluated potential spatial inequalities in walkable neighborhood amenities across census tracts in Boston, MA (US). METHODS The independent variables included minority racial/ethnic population percentages and percent of families in poverty. Walkable neighborhood amenities were assessed with a composite measure. Spatial autocorrelation in key study variables was first calculated with the Global Moran’s I statistic. Then, Spearman correlations between neighborhood socio-demographic characteristics and walkable neighborhood amenities were calculated, as well as Spearman correlations accounting for spatial autocorrelation. As a final step, we fit ordinary least squares (OLS) regression and, when appropriate, spatial autoregressive models. RESULTS Significant positive spatial autocorrelation was found in neighborhood socio-demographic characteristics (e.g. census tract percent Black), but not in walkable neighborhood amenities or in the OLS regression residuals. Spearman correlations between neighborhood socio-demographic characteristics and walkable neighborhood amenities were not statistically significant, nor were neighborhood socio-demographic characteristics significantly associated with walkable neighborhood amenities in OLS regression models. CONCLUSIONS Our results suggest that there is residential segregation in Boston and that spatial inequalities do not necessarily show up using a composite measure. 
COMMENTS Future research in other geographic areas (including international contexts) and using different definitions of neighborhoods (including small-area definitions) should evaluate if spatial inequalities are found using composite measures but also should use measures of specific neighborhood amenities. PMID:29046612
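The Global Moran's I statistic named in the methods of this record can be computed directly from a variable and a spatial weights matrix. The sketch below uses the standard formula; the chain-shaped adjacency helper and the toy values are hypothetical and stand in for real census-tract neighbors.

```python
def morans_i(values, weights):
    """Global Moran's I for values on n areas, given an n x n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

def chain_adjacency(n):
    """Toy weights: areas laid out in a line, each neighboring the next (symmetric, binary)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w
```

On a four-area chain, a steadily increasing variable such as 1, 2, 3, 4 gives I = 1/3 (similar values cluster), while an alternating pattern 1, -1, 1, -1 gives I = -1.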
Association of Stressful Life Events with Psychological Problems: A Large-Scale Community-Based Study Using Grouped Outcomes Latent Factor Regression with Latent Predictors

PubMed Central

Hassanzadeh, Akbar; Heidari, Zahra; Hassanzadeh Keshteli, Ammar; Afshar, Hamid

2017-01-01

Objective The current study is aimed at investigating the association between stressful life events and psychological problems in a large sample of Iranian adults. Method In a cross-sectional large-scale community-based study, 4763 Iranian adults, living in Isfahan, Iran, were investigated. Grouped outcomes latent factor regression on latent predictors was used for modeling the association of psychological problems (depression, anxiety, and psychological distress), measured by Hospital Anxiety and Depression Scale (HADS) and General Health Questionnaire (GHQ-12), as the grouped outcomes, and stressful life events, measured by a self-administered stressful life events (SLEs) questionnaire, as the latent predictors. Results The results showed that the personal stressors domain has significant positive association with psychological distress (β = 0.19), anxiety (β = 0.25), depression (β = 0.15), and their collective profile score (β = 0.20), with greater associations in females (β = 0.28) than in males (β = 0.13) (all P < 0.001). In addition, in the adjusted models, the regression coefficients for the association of social stressors domain and psychological problems profile score were 0.37, 0.35, and 0.46 in total sample, males, and females, respectively (P < 0.001). Conclusion Results of our study indicated that different stressors, particularly those socioeconomic related, have an effective impact on psychological problems. It is important to consider the social and cultural background of a population for managing the stressors as an effective approach for preventing and reducing the destructive burden of psychological problems. PMID:29312459

Analysis of threats to research validity introduced by audio recording clinic visits: Selection bias, Hawthorne effect, both, or neither?

PubMed Central

Henry, Stephen G.; Jerant, Anthony; Iosif, Ana-Maria; Feldman, Mitchell D.; Cipri, Camille; Kravitz, Richard L.

2015-01-01

Objective To identify factors associated with participant consent to record visits and to estimate effects of recording on patient-clinician interactions. Methods Secondary analysis of data from a randomized trial studying communication about depression; participants were asked for optional consent to audio record study visits. Multiple logistic regression was used to model likelihood of patient and clinician consent. Multivariable regression and propensity score analyses were used to estimate effects of audio recording on 6 dependent variables: discussion of depressive symptoms, preventive health, and depression diagnosis; depression treatment recommendations; visit length; visit difficulty. Results Of 867 visits involving 135 primary care clinicians, 39% were recorded. For clinicians, only working in academic settings (P=0.003) and having worked longer at their current practice (P=0.02) were associated with increased likelihood of consent. For patients, white race (P=0.002) and diabetes (P=0.03) were associated with increased likelihood of consent. Neither multivariable regression nor propensity score analyses revealed any significant effects of recording on the variables examined. Conclusion Few clinician or patient characteristics were significantly associated with consent. Audio recording had no significant effect on any dependent variables. Practice Implications Benefits of recording clinic visits likely outweigh the risks of bias in this setting. PMID:25837372
Effects of Medical Insurance on the Health Status and Life Satisfaction of the Elderly

PubMed Central

GU, Liubao; FENG, Huihui; JIN, Jian

2017-01-01

Background: Population aging has become increasingly serious in China. The demand for medical insurance of the elderly is increasing, and their health status and life satisfaction are becoming significant issues. This study investigates the effects of medical insurance on the health status and life satisfaction of the elderly. Methods: The national baseline survey data of the China Health and Retirement Longitudinal Survey in 2013 were adopted. The Ordered Probit Model was established. The effects of the medical insurance for urban employees, medical insurance for urban residents, and new rural cooperative medical insurance on the health status and life satisfaction of the elderly were investigated. Results: Medical insurance could facilitate the improvement of the health status and life satisfaction of the elderly. Accordingly, the health status and life satisfaction of the elderly who have medical insurance for urban residents improved significantly. The regression coefficients were 0.348 and 0.307. The corresponding regression coefficients of the medical insurance for urban employees were 0.189 and 0.236. The regression coefficients of the new rural cooperative medical insurance were 0.170 and 0.188. Conclusion: Medical insurance can significantly improve the health status and life satisfaction of the elderly. This development is of immense significance for the formulation of equal medical security. PMID:29026784
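In an ordered probit model, a coefficient shifts a latent score, and category probabilities come from differences of the standard normal CDF at the cutpoints. The sketch below shows how a coefficient of 0.348 would move probability toward higher categories. The cutpoints are hypothetical, and treating 0.348 as the health-status coefficient (the first of the reported pair) is an assumption.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(linear_predictor, cutpoints):
    """P(y = k) for k = 0..len(cutpoints), given ascending cutpoints."""
    cdf = [normal_cdf(c - linear_predictor) for c in cutpoints]
    bounds = [0.0] + cdf + [1.0]
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]

CUTPOINTS = [-0.5, 0.5]        # hypothetical thresholds, not from the study
COEF_URBAN_RESIDENT = 0.348    # coefficient reported in the abstract

probs_without = ordered_probit_probs(0.0, CUTPOINTS)
probs_with = ordered_probit_probs(COEF_URBAN_RESIDENT, CUTPOINTS)
```

Because the coefficients are on the latent scale, their magnitudes are best compared with one another (0.348 versus 0.189 versus 0.170) rather than read as direct changes in probability.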
Analyzing the Impact of Ambient Temperature Indicators on Transformer Life in Different Regions of Chinese Mainland

PubMed Central

Bai, Cui-fen; Gao, Wen-Sheng; Liu, Tong

2013-01-01

Regression analysis is applied to quantify the impact of different ambient temperature characteristics on transformer life at different locations in the Chinese mainland. 200 typical locations in the Chinese mainland are selected for the study. They are specially divided into six regions so that the subsequent analysis can be done in a regional context. For each region, the local historical ambient temperature and load data are provided as input variables of the life consumption model in IEEE Std. C57.91-1995 to estimate the transformer life at every location. Five ambient temperature indicators related to the transformer life are included in the partial least squares regression to describe their impact on the transformer life. According to a contribution measurement criterion of partial least squares regression, three indicators are found to be the most important factors influencing the transformer life, and an explicit expression is provided to describe the relationship between the indicators and the transformer life for every region. The analysis result is applicable to areas where the temperature characteristics are similar to those of the Chinese mainland, and the expressions obtained can be applied to other locations that are not included in this paper if these three indicators are known. PMID:23843729
Mapping the Structure-Function Relationship in Glaucoma and Healthy Patients Measured with Spectralis OCT and Humphrey Perimetry

PubMed Central

Muñoz–Negrete, Francisco J.; Oblanca, Noelia; Rebolleda, Gema

2018-01-01

Purpose To study the structure-function relationship in glaucoma and healthy patients assessed with Spectralis OCT and Humphrey perimetry using new statistical approaches. Materials and Methods Eighty-five eyes were prospectively selected and divided into 2 groups: glaucoma (44) and healthy patients (41). Three different statistical approaches were carried out: (1) factor analysis of the threshold sensitivities (dB) (automated perimetry) and the macular thickness (μm) (Spectralis OCT), subsequently applying Pearson's correlation to the obtained regions, (2) nonparametric regression analysis relating the values in each pair of regions that showed significant correlation, and (3) nonparametric spatial regressions using three models designed for the purpose of this study. Results In the glaucoma group, a map that relates structural and functional damage was drawn. The strongest correlation with visual fields was observed in the peripheral nasal region of both superior and inferior hemigrids (r = 0.602 and r = 0.458, resp.). The estimated functions obtained with the nonparametric regressions provided the mean sensitivity that corresponds to each given macular thickness. These functions allowed for accurate characterization of the structure-function relationship. Conclusions Both maps and point-to-point functions obtained linking structure and function damage contribute to a better understanding of this relationship and may help in the future to improve glaucoma diagnosis. PMID:29850196
The importance of molecular structures, endpoints' values, and predictivity parameters in QSAR research: QSAR analysis of a series of estrogen receptor binders.

PubMed

Li, Jiazhong; Gramatica, Paola

2010-11-01

Quantitative structure-activity relationship (QSAR) methodology aims to explore the relationship between molecular structures and experimental endpoints, producing a model for the prediction of new data; the predictive performance of the model must be checked by external validation. Clearly, the quality of the chemical structure information and of the experimental endpoints, as well as the statistical parameters used to verify the external predictivity, have a strong influence on QSAR model reliability. Here, we emphasize the importance of these three aspects by analyzing our models on estrogen receptor binders (Endocrine disruptor knowledge base (EDKB) database). Endocrine disrupting chemicals, which mimic or antagonize endogenous hormones such as estrogens, are a hot topic in environmental and toxicological sciences. QSAR shows great value in predicting estrogenic activity and exploring the interactions between the estrogen receptor and ligands. We have verified our previously published model by additional external validation on new EDKB chemicals. Having found some errors in the 3D molecular conformations used, we redeveloped a new model using the same data set with corrected structures, the same method (ordinary least-squares regression, OLS) and DRAGON descriptors. The new model, based on some different descriptors, is more predictive on external prediction sets. Three different formulas to calculate the correlation coefficient for the external prediction set (Q2 EXT) were compared, and the results indicated that the new proposal of Consonni et al. gave more reasonable results, consistent with the conclusions from the regression line, Williams plot and root mean square error (RMSE) values.
Finally, the importance of reliable endpoint values has been highlighted by comparing the classification assignments of EDKB with those of another estrogen receptor binders database (METI): we found that 16.1% of the assignments of the common compounds were opposite (20 among 124 common compounds). In order to verify the real assignments for these inconsistent compounds, we predicted these samples, as a blind external set, by our regression models and compared the results with the two databases. The results indicated that most of the predictions were consistent with METI. Furthermore, we built a kNN classification model using the 104 consistent compounds to predict those inconsistent ones, and most of the predictions were also in agreement with the METI database.
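The abstract does not spell out the three Q2 EXT formulas it compares. The sketch below uses the definitions usually cited in the QSAR literature (often labeled F1, F2 and F3, the last being the Consonni et al. proposal); treating these as the formulas meant here is an assumption, and the function names and data are made up for illustration.

```python
import numpy as np

def q2_f1(y_ext, y_pred, y_train):
    """External Q2 referenced to the training-set mean."""
    press = np.sum((y_ext - y_pred) ** 2)
    return 1.0 - press / np.sum((y_ext - np.mean(y_train)) ** 2)

def q2_f2(y_ext, y_pred):
    """External Q2 referenced to the external-set mean."""
    press = np.sum((y_ext - y_pred) ** 2)
    return 1.0 - press / np.sum((y_ext - np.mean(y_ext)) ** 2)

def q2_f3(y_ext, y_pred, y_train):
    """External Q2 scaled by the training-set variance (per-compound averages)."""
    press_mean = np.mean((y_ext - y_pred) ** 2)
    tss_train_mean = np.mean((y_train - np.mean(y_train)) ** 2)
    return 1.0 - press_mean / tss_train_mean

# Hypothetical endpoint values, for illustration only.
y_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_ext = np.array([2.5, 3.5, 4.5])
y_pred = np.array([2.7, 3.4, 4.1])
scores = (q2_f1(y_ext, y_pred, y_train), q2_f2(y_ext, y_pred), q2_f3(y_ext, y_pred, y_train))
```

Only the third variant is independent of how the external set happens to be distributed around its own mean, which is one reason the three values can disagree on the same predictions.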
A Bayesian ridge regression analysis of congestion's impact on urban expressway safety.

PubMed

Shi, Qi; Abdel-Aty, Mohamed; Lee, Jaeyoung

2016-03-01

With the rapid growth of traffic in urban areas, concerns about congestion and traffic safety have been heightened. This study leveraged both Automatic Vehicle Identification (AVI) system and Microwave Vehicle Detection System (MVDS) installed on an expressway in Central Florida to explore how congestion impacts the crash occurrence in urban areas. Multiple congestion measures from the two systems were developed. To ensure more precise estimates of the congestion's effects, the traffic data were aggregated into peak and non-peak hours. Multicollinearity among traffic parameters was examined. The results showed the presence of multicollinearity especially during peak hours. As a response, ridge regression was introduced to cope with this issue. Poisson models with uncorrelated random effects, correlated random effects, and both correlated random effects and random parameters were constructed within the Bayesian framework. It was proven that correlated random effects could significantly enhance model performance. The random parameters model has similar goodness-of-fit compared with the model with only correlated random effects. However, by accounting for the unobserved heterogeneity, more variables were found to be significantly related to crash frequency. The models indicated that congestion increased crash frequency during peak hours while during non-peak hours it was not a major crash contributing factor. Using the random parameter model, the three congestion measures were compared. It was found that all congestion indicators had similar effects while Congestion Index (CI) derived from MVDS data was a better congestion indicator for safety analysis. Also, analyses showed that the segments with higher congestion intensity could not only increase property damage only (PDO) crashes, but also more severe crashes. 
In addition, the issues regarding the necessity to incorporate a specific congestion indicator for congestion's effects on safety and to take care of the multicollinearity between explanatory variables were also discussed. By including a specific congestion indicator, the model performance significantly improved. When comparing models with and without ridge regression, the magnitude of the coefficients was altered in the presence of multicollinearity. These conclusions suggest that the use of an appropriate congestion measure and consideration of multicollinearity among the variables would improve the models and our understanding of the effects of congestion on traffic safety. Copyright © 2015 Elsevier Ltd. All rights reserved.
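To make the multicollinearity point concrete, here is a minimal sketch of closed-form ridge regression on two nearly collinear predictors. It is an ordinary linear ridge fit on simulated data, not the Bayesian Poisson models of the study; the penalty value and all variable names are illustrative assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^-1 X'y; lam = 0 gives OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)  # almost a copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)     # true coefficients are (1, 1)

beta_ols = ridge_fit(X, y, 0.0)      # unstable split between x1 and x2
beta_ridge = ridge_fit(X, y, 10.0)   # penalty stabilizes the split
```

The penalty leaves the well-determined sum of the two coefficients close to its true value while shrinking the poorly determined difference, which is the sense in which coefficient magnitudes change once ridge regression is applied.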
Survival Data and Regression Models

NASA Astrophysics Data System (ADS)

Grégoire, G.

2014-12-01

We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
ℓ(p)-Norm multikernel learning approach for stock market price forecasting.

PubMed

Shao, Xigao; Wu, Kun; Liao, Bifeng

2012-01-01

Linear multiple kernel learning models have been used for predicting financial time series. However, ℓ(1)-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ(p)-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of the Shanghai Stock Index in China. Experimental results show that our proposed model performs better than the ℓ(1)-norm multiple support vector regression model.
Multiple-Instance Regression with Structured Data

NASA Technical Reports Server (NTRS)

Wagstaff, Kiri L.; Lane, Terran; Roper, Alex

2008-01-01

We present a multiple-instance regression algorithm that models internal bag structure to identify the items most relevant to the bag labels. Multiple-instance regression (MIR) operates on a set of bags with real-valued labels, each containing a set of unlabeled items, in which the relevance of each item to its bag label is unknown. The goal is to predict the labels of new bags from their contents. Unlike previous MIR methods, MI-ClusterRegress can operate on bags that are structured in that they contain items drawn from a number of distinct (but unknown) distributions. MI-ClusterRegress simultaneously learns a model of the bag's internal structure, the relevance of each item, and a regression model that accurately predicts labels for new bags. We evaluated this approach on the challenging MIR problem of crop yield prediction from remote sensing data. MI-ClusterRegress provided predictions that were more accurate than those obtained with non-multiple-instance approaches or MIR methods that do not model the bag structure.
Poisson regression models outperform the geometrical model in estimating the peak-to-trough ratio of seasonal variation: a simulation study.

PubMed

Christensen, A L; Lundbye-Christensen, S; Dethlefsen, C

2011-12-01

Several statistical methods of assessing seasonal variation are available. Brookhart and Rothman [3] proposed a second-order moment-based estimator based on the geometrical model derived by Edwards [1], and reported that this estimator is superior in estimating the peak-to-trough ratio of seasonal variation compared with Edwards' estimator with respect to bias and mean squared error. Alternatively, seasonal variation may be modelled using a Poisson regression model, which provides flexibility in modelling the pattern of seasonal variation and adjustments for covariates. Based on a Monte Carlo simulation study three estimators, one based on the geometrical model, and two based on log-linear Poisson regression models, were evaluated with regard to bias and standard deviation (SD). We evaluated the estimators on data simulated according to schemes varying in seasonal variation and presence of a secular trend. All methods and analyses in this paper are available in the R package Peak2Trough [13]. Applying a Poisson regression model resulted in lower absolute bias and SD for data simulated according to the corresponding model assumptions. Poisson regression models had lower bias and SD for data simulated to deviate from the corresponding model assumptions than the geometrical model. This simulation study encourages the use of Poisson regression models in estimating the peak-to-trough ratio of seasonal variation as opposed to the geometrical model. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
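For a log-linear Poisson model with a single annual harmonic, log μ(t) = β0 + β1 cos(2πt/T) + β2 sin(2πt/T), the peak-to-trough ratio follows directly from the two seasonal coefficients. The sketch below assumes that one-harmonic form, which the abstract does not confirm; it is not the interface of the Peak2Trough package, and the function name is hypothetical.

```python
import math

def peak_to_trough_ratio(beta_cos, beta_sin):
    """Ratio of peak to trough rate for a one-harmonic log-linear seasonal model."""
    # The seasonal term swings between -amplitude and +amplitude on the log scale.
    amplitude = math.hypot(beta_cos, beta_sin)
    return math.exp(2.0 * amplitude)

# Illustrative coefficients: amplitude 0.5 on the log scale.
ratio = peak_to_trough_ratio(0.3, 0.4)
```

Because the intercept and any covariate terms cancel in the ratio, the estimate depends only on the fitted seasonal coefficients.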
Predictors of course in obsessive-compulsive disorder: logistic regression versus Cox regression for recurrent events.

PubMed

Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M

2007-09-01

Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
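The remission criterion described in this abstract can be written as a small predicate. This sketch reads "a reduction of seven points" as at least seven points below baseline, which is an assumption; the function name and example scores are hypothetical.

```python
def is_remission(baseline_score, current_score):
    """Y-BOCS remission: score below 13 and at least 7 points below baseline (assumed reading)."""
    return current_score < 13 and (baseline_score - current_score) >= 7

# Hypothetical patient: baseline 25, then four later measurements.
baseline = 25
follow_up_scores = [20, 12, 15, 10]
remission_by_visit = [is_remission(baseline, s) for s in follow_up_scores]
```

A logistic model uses the status at one chosen visit only, whereas a recurrent-events model can use the full sequence of remissions and relapses, which is the contrast the abstract draws.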
Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

PubMed

Balabin, Roman M; Lomakina, Ekaterina I

2011-04-21

In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
Floating Data and the Problem with Illustrating Multiple Regression.

ERIC Educational Resources Information Center

Sachau, Daniel A.

2000-01-01

Discusses how to introduce basic concepts of multiple regression by creating a large-scale, three-dimensional regression model using the classroom walls and floor. Addresses teaching points that should be covered and reveals student reaction to the model. Finds that the greatest benefit of the model is the low fear, walk-through, nonmathematical…
Regression Model Optimization for the Analysis of Experimental Data

NASA Technical Reports Server (NTRS)

Ulbrich, N.

2009-01-01

A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
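The search metric described in this abstract can be sketched with the standard leave-one-out identity e_i / (1 - h_ii), where h_ii is the leverage of observation i. The code below is a minimal illustration on simulated data, not the Ames algorithm: the candidate models, the use of the sample standard deviation (ddof=1), and all names are assumptions, and the singular-value and significance constraints are left out.

```python
import numpy as np

def press_residuals(X, y):
    """Leave-one-out (PRESS) residuals of an OLS fit, without refitting n times."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    # Leverages are the diagonal of the hat matrix; with X = QR they are the
    # squared row norms of Q (X is assumed to have full column rank).
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q ** 2, axis=1)
    return residuals / (1.0 - leverage)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=30)
y = 1.0 + 2.0 * x - 0.5 * x ** 2 + 0.1 * rng.normal(size=30)

# Two hypothetical candidate math models for the same response.
candidates = {
    "linear": np.column_stack([np.ones_like(x), x]),
    "quadratic": np.column_stack([np.ones_like(x), x, x ** 2]),
}
search_metric = {name: np.std(press_residuals(Xc, y), ddof=1) for name, Xc in candidates.items()}
best = min(search_metric, key=search_metric.get)
```

The candidate with the smallest metric is the one that predicts held-out points best, so the metric penalizes both underfitting and terms that only fit noise.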
Adjusting for overdispersion in piecewise exponential regression models to estimate excess mortality rate in population-based research.

PubMed

Luque-Fernandez, Miguel Angel; Belot, Aurélien; Quaresma, Manuela; Maringe, Camille; Coleman, Michel P; Rachet, Bernard

2016-10-01

In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer using the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the set of covariates x_i are equal is strong and may fail to account for overdispersion given the variability of the rate parameter (the variance exceeds the mean). Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion including a quasi-likelihood, robust standard errors estimation, negative binomial regression and flexible piecewise modelling. All piecewise exponential regression models showed the presence of significant inherent overdispersion (p-value <0.001). However, the flexible piecewise exponential model showed the smallest overdispersion parameter (3.2 versus 21.3 for non-flexible piecewise exponential models). We showed that there were no major differences between methods. However, using flexible piecewise regression modelling, with either a quasi-likelihood or robust standard errors, was the best approach as it deals with both overdispersion due to model misspecification and true or inherent overdispersion.
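A common first check for overdispersion is the Pearson dispersion estimate, the sum of squared Pearson residuals divided by the residual degrees of freedom. The sketch below fits a plain Poisson regression by iteratively reweighted least squares on simulated counts; it is not the excess mortality model or the score test of the paper, and the simulation settings and function names are assumptions.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=50):
    """Poisson log-link regression via IRLS; the first column of X must be the intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu              # working response
        WX = X * mu[:, None]                 # weights equal the fitted means
        beta = np.linalg.solve(X.T @ WX, X.T @ (mu * z))
    return beta

def pearson_dispersion(X, y, beta):
    """Pearson chi-square divided by residual degrees of freedom (about 1 under Poisson)."""
    mu = np.exp(X @ beta)
    return np.sum((y - mu) ** 2 / mu) / (len(y) - X.shape[1])

# Simulated overdispersed counts: negative binomial with mean mu_true and size k.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
mu_true = np.exp(1.0 + 0.5 * x)
k = 2.0
y = rng.negative_binomial(k, k / (k + mu_true))

beta_hat = fit_poisson_irls(X, y)
phi_hat = pearson_dispersion(X, y, beta_hat)
```

Under the quasi-likelihood correction the coefficient estimates stay the same and their standard errors are multiplied by the square root of this dispersion estimate.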
Effect of the Modified Glasgow Coma Scale Score Criteria for Mild Traumatic Brain Injury on Mortality Prediction: Comparing Classic and Modified Glasgow Coma Scale Score Model Scores of 13

PubMed Central

Mena, Jorge Humberto; Sanchez, Alvaro Ignacio; Rubiano, Andres M.; Peitzman, Andrew B.; Sperry, Jason L.; Gutierrez, Maria Isabel; Puyana, Juan Carlos

2011-01-01

Objective The Glasgow Coma Scale (GCS) classifies Traumatic Brain Injuries (TBI) as Mild (14–15), Moderate (9–13), or Severe (3–8). The ATLS modified this classification so that a GCS score of 13 is categorized as mild TBI. We investigated the effect of this modification on mortality prediction, comparing patients with a GCS of 13 classified as moderate TBI (Classic Model) to patients with GCS of 13 classified as mild TBI (Modified Model). Methods We selected adult TBI patients from the Pennsylvania Outcome Study database (PTOS). Logistic regressions adjusting for age, sex, cause, severity, trauma center level, comorbidities, and isolated TBI were performed. A second evaluation included the time trend of mortality. A third evaluation also included hypothermia, hypotension, mechanical ventilation, screening for drugs, and severity of TBI. Discrimination of the models was evaluated using the area under the receiver operating characteristic curve (AUC). Calibration was evaluated using the Hosmer-Lemeshow goodness of fit (GOF) test. Results In the first evaluation, the AUCs were 0.922 (95% CI, 0.917–0.926) and 0.908 (95% CI, 0.903–0.912) for the classic and modified models, respectively. Both models showed poor calibration (p<0.001). In the third evaluation, the AUCs were 0.946 (95% CI, 0.943–0.949) and 0.938 (95% CI, 0.934–0.940) for the classic and modified models, respectively, with improvements in calibration (p=0.30 and p=0.02 for the classic and modified models, respectively). Conclusion The lack of overlap between ROC curves of both models reveals a statistically significant difference in their ability to predict mortality. The classic model demonstrated better GOF than the modified model. A GCS of 13 classified as moderate TBI in a multivariate logistic regression model performed better than a GCS of 13 classified as mild. PMID:22071923
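The discrimination measure used here, the AUC, equals the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison; the function name and the toy labels and scores are hypothetical and unrelated to the study data.

```python
import numpy as np

def auc_pairwise(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count one half."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy example: 1 = died, 0 = survived, scores = predicted mortality risk.
auc_example = auc_pairwise([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form costs O(n_pos × n_neg) memory, so for a registry-sized data set a rank-based implementation would be the more practical choice.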
Bias in logistic regression due to imperfect diagnostic test results and practical correction approaches.

PubMed

Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul

2015-11-04

Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
An Expert System for the Evaluation of Cost Models

DTIC Science & Technology

1990-09-01

contrast to the condition of equal error variance, called homoscedasticity. (Reference: Applied Linear Regression Models by John Neter - page 423...normal. (Reference: Applied Linear Regression Models by John Neter - page 125) Click Here to continue -> Autocorrelation Click Here for the index - Index...over time. Error terms correlated over time are said to be autocorrelated or serially correlated. (REFERENCE: Applied Linear Regression Models by John
Digital Image Restoration Under a Regression Model - The Unconstrained, Linear Equality and Inequality Constrained Approaches

DTIC Science & Technology

1974-01-01

REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January 1974 Nelson Delfino d’Avila Mascarenhas Image...Report 520 DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January...a two-dimensional form adequately describes the linear model. A discretization is performed by using quadrature methods. By trans

Estimating Contraceptive Prevalence Using Logistics Data for Short-Acting Methods: Analysis Across 30 Countries

PubMed Central

Cunningham, Marc; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana

2015-01-01

Background: Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Methods: Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. Results: For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. 
With the exception of the CYP-based model for condoms, models were able to estimate public-sector prevalence rates for each short-acting method to within 2 percentage points in at least 85% of countries. Conclusions: Public-sector contraceptive logistics data are strongly correlated with public-sector prevalence rates for short-acting methods, demonstrating the quality of current logistics data and their ability to provide relatively accurate prevalence estimates. The models provide a starting point for generating interim estimates of contraceptive use when timely survey data are unavailable. All models except the condoms CYP model performed well; the regression models were most accurate but the CYP model offers the simplest calculation method. Future work extending the research to other modern methods, relating subnational logistics data with prevalence rates, and tracking that relationship over time is needed. PMID:26374805
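The two evaluation metrics named in this abstract, mean absolute error and the share of countries whose modeled prevalence falls within a given number of percentage points of the survey value, are simple to compute. The numbers below are invented for illustration and are not taken from the study; the function names are hypothetical.

```python
import numpy as np

def mean_absolute_error(reference, modeled):
    """Average absolute gap between survey and modeled prevalence, in percentage points."""
    return float(np.mean(np.abs(np.asarray(modeled) - np.asarray(reference))))

def share_within(reference, modeled, points):
    """Proportion of countries whose modeled value is within `points` of the reference."""
    gaps = np.abs(np.asarray(modeled) - np.asarray(reference))
    return float(np.mean(gaps <= points))

# Invented prevalence rates (%) for four countries.
dhs_rate = [10.0, 5.5, 20.0, 8.0]
model_rate = [11.0, 5.0, 26.0, 8.5]

mae = mean_absolute_error(dhs_rate, model_rate)
within = {k: share_within(dhs_rate, model_rate, k) for k in (1, 2, 5)}
```

Reporting both metrics is useful because one badly estimated country can raise the mean absolute error while leaving the within-threshold shares unchanged.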
Exploring models for the roles of health systems’ responsiveness and social determinants in explaining universal health coverage and health outcomes

PubMed Central

Bonsel, Gouke J.

2016-01-01

Background Intersectoral perspectives of health are present in the rhetoric of the sustainable development goals. Yet its descriptions of systematic approaches for an intersectoral monitoring vision, joining determinants of health, and barriers or facilitators to accessing healthcare services are lacking. Objective To explore models of associations between health outcomes and health service coverage, and health determinants and health systems responsiveness, and thereby to contribute to monitoring, analysis, and assessment approaches informed by an intersectoral vision of health. Design The study is designed as a series of ecological, cross-country regression analyses, covering between 23 and 57 countries with dependent health variables concentrated on the years 2002–2003. Countries cover a range of development contexts. Health outcome and health service coverage dependent variables were derived from World Health Organization (WHO) information sources. Predictor variables representing determinants are derived from the WHO and World Bank databases; variables used for health systems’ responsiveness are derived from the WHO World Health Survey. Responsiveness is a measure of acceptability of health services to the population, complementing financial health protection. Results Health determinants’ indicators – access to improved drinking sources, accountability, and average years of schooling – were statistically significant in particular health outcome regressions. Statistically significant coefficients were more common for mortality rate regressions than for coverage rate regressions. Responsiveness was systematically associated with poorer health and health service coverage. With respect to levels of inequality in health, the indicator of responsiveness problems experienced by the unhealthy poor groups in the population was statistically significant for regressions on measles vaccination inequalities between rich and poor. 
For the broader determinants, the Gini mattered most for inequalities in child mortality; education mattered more for inequalities in births attended by skilled personnel. Conclusions This paper adds to the literature on comparative health systems research. National and international health monitoring frameworks need to incorporate indicators on trends in and impacts of other policy sectors on health. This will empower the health sector to carry out public health practices that promote health and health equity. PMID:26942516
When ab ≠ c - c': published errors in the reports of single-mediator models.

PubMed

Petrocelli, John V; Clarkson, Joshua J; Whitmire, Melanie B; Moon, Paul E

2013-06-01

Accurate reports of mediation analyses are critical to the assessment of inferences related to causality, since these inferences are consequential for both the evaluation of previous research (e.g., meta-analyses) and the progression of future research. However, upon reexamination, approximately 15% of published articles in psychology contain at least one incorrect statistical conclusion (Bakker & Wicherts, Behavior research methods, 43, 666-678 2011), disparities that beget the question of inaccuracy in mediation reports. To quantify this question of inaccuracy, articles reporting standard use of single-mediator models in three high-impact journals in personality and social psychology during 2011 were examined. More than 24% of the 156 models coded failed an equivalence test (i.e., ab = c - c'), suggesting that one or more regression coefficients in mediation analyses are frequently misreported. The authors cite common sources of errors, provide recommendations for enhanced accuracy in reports of single-mediator models, and discuss implications for alternative methods.
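The equivalence ab = c - c' holds exactly when all three regressions are ordinary least squares fits on the same sample, which is what makes it usable as a consistency check on published coefficients. The sketch below demonstrates the identity on simulated data; the tolerance in the checking function is a hypothetical allowance for rounding in reported values, not the criterion used by the authors.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def passes_equivalence(a, b, c, c_prime, tol=0.01):
    """Check reported coefficients against ab = c - c' up to a rounding tolerance (assumed)."""
    return abs(a * b - (c - c_prime)) <= tol

rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)            # mediator
y = 0.4 * m + 0.2 * x + rng.normal(size=n)  # outcome
ones = np.ones(n)

c = ols(np.column_stack([ones, x]), y)[1]              # total effect of x on y
a = ols(np.column_stack([ones, x]), m)[1]              # effect of x on m
_, c_prime, b = ols(np.column_stack([ones, x, m]), y)  # direct effect and m -> y
gap = a * b - (c - c_prime)
```

A reported set of coefficients that fails the check by more than rounding can explain suggests a transcription error or models fitted on different samples.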
Extension and refinement of the predictive value of different classes of markers in ADNI: Four year follow-up data

PubMed Central

Gomar, Jesus J; Conejero-Goldberg, Concepcion; Davies, Peter; Goldberg, Terry E

2014-01-01

Background This study examined the predictive value of different classes of markers in the progression from Mild Cognitive Impairment (MCI) to Alzheimer’s disease (AD) over an extended 4 year follow-up in ADNI. Methods MCI patients were assessed on clinical, cognitive, MRI, PET-FDG, and CSF markers at baseline, and followed on a yearly basis for four years to ascertain progression to AD. Logistic regression models were fitted in clusters including demographics, APOE genotype, cognitive markers, and biomarkers (morphometric, PET-FDG, CSF Abeta and tau). Results The predictive model at four years revealed that two cognitive measures, an episodic memory measure and a clock drawing screening test, were the best predictors of conversion (AUC = 0.78). Conclusions This model of prediction is consistent with the previous model at two years, thus highlighting the importance of cognitive measures in progression from MCI to AD. Cognitive markers were more robust predictors than biomarkers. PMID:24613706
Variable selection and model choice in geoadditive regression models.

PubMed

Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard

2009-06-01

Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.
Bayesian isotonic density regression

PubMed Central

Wang, Lianming; Dunson, David B.

2011-01-01

Density regression models allow the conditional distribution of the response given predictors to change flexibly over the predictor space. Such models are much more flexible than nonparametric mean regression models with nonparametric residual distributions, and are well supported in many applications. A rich variety of Bayesian methods have been proposed for density regression, but it is not clear whether such priors have full support so that any true data-generating model can be accurately approximated. This article develops a new class of density regression models that incorporate stochastic-ordering constraints which are natural when a response tends to increase or decrease monotonely with a predictor. Theory is developed showing large support. Methods are developed for hypothesis testing, with posterior computation relying on a simple Gibbs sampler. Frequentist properties are illustrated in a simulation study, and an epidemiology application is considered. PMID:22822259
Linear regression crash prediction models: issues and proposed solutions.

DOT National Transportation Integrated Search

2010-05-01

The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novice data aggregation to satisfy linear regression assumptions; namely error structure normality ...
Should metacognition be measured by logistic regression?

PubMed

Rausch, Manuel; Zehetleitner, Michael

2017-03-01

Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Modelling of capital asset pricing by considering the lagged effects

NASA Astrophysics Data System (ADS)

Sukono; Hidayat, Y.; Bon, A. Talib bin; Supian, S.

2017-01-01

In this paper, the problem of modelling the Capital Asset Pricing Model (CAPM) with lagged effects is discussed. It is assumed that asset returns are influenced by the market return and the return of risk-free assets. The relationship between asset returns, the market return, and the return of risk-free assets is analysed using a CAPM regression equation and a distributed-lag CAPM regression equation. In connection with the distributed-lag CAPM regression equation, this paper also develops a Koyck-transformation CAPM regression equation. The results show that the Koyck-transformation CAPM regression equation has the advantage of simplicity, as it requires only three parameters, compared with the distributed-lag CAPM regression equation.
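The step from a distributed-lag regression to a three-parameter equation can be sketched with the textbook Koyck transformation. The notation is an assumption made for illustration and may differ from the paper's: $y_t$ denotes the asset return in excess of the risk-free return, $x_t$ the market return in excess of the risk-free return, and the lag weights are assumed to decline geometrically.

```latex
% Distributed-lag form (infinitely many lag coefficients \beta_k):
y_t = \alpha + \sum_{k=0}^{\infty} \beta_k \, x_{t-k} + \varepsilon_t

% Koyck assumption: geometrically declining weights
\beta_k = \beta_0 \lambda^{k}, \qquad 0 < \lambda < 1

% Lag the equation one period, multiply by \lambda, and subtract:
y_t - \lambda y_{t-1} = \alpha (1 - \lambda) + \beta_0 x_t
                        + (\varepsilon_t - \lambda \varepsilon_{t-1})

% Resulting regression, with three parameters
% \alpha(1-\lambda), \beta_0 and \lambda:
y_t = \alpha (1 - \lambda) + \beta_0 x_t + \lambda y_{t-1} + v_t,
\qquad v_t = \varepsilon_t - \lambda \varepsilon_{t-1}
```

Under these assumptions the infinite set of lag coefficients collapses to an intercept, a contemporaneous slope, and a decay rate, which matches the three-parameter count stated in the abstract.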
[Comparison of predictive effect between the single auto regressive integrated moving average (ARIMA) model and the ARIMA-generalized regression neural network (GRNN) combination model on the incidence of scarlet fever].

PubMed

Zhu, Yu; Xia, Jie-lai; Wang, Jing

2009-09-01

This study applied the single auto regressive integrated moving average (ARIMA) model and the ARIMA-generalized regression neural network (GRNN) combination model to the incidence of scarlet fever. An ARIMA model was established from the monthly incidence of scarlet fever in one city from 2000 to 2006. The fitted values of the ARIMA model were used as the input of the GRNN, and the actual values were used as its output. After the GRNN was trained, the performance of the single ARIMA model and of the ARIMA-GRNN combination model was compared. The mean error rates (MER) of the single ARIMA model and the ARIMA-GRNN combination model were 31.6% and 28.7%, respectively, and the determination coefficients (R²) of the two models were 0.801 and 0.872, respectively. The ARIMA-GRNN combination model fitted better than the single ARIMA model and has practical value in research on time series data such as the incidence of scarlet fever.
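The combination step described above, in which ARIMA fitted values are the GRNN input and observed values are its target, can be sketched as follows. This is a generic one-dimensional GRNN (a Gaussian-kernel weighted average of training targets), not the authors' implementation; the function name, the smoothing parameter `sigma`, and the numbers are hypothetical, and the ARIMA fitting itself is not shown.

```python
import math

# Minimal sketch of a one-input generalized regression neural network.
# Function name, sigma and the data are illustrative assumptions.

def grnn_predict(train_x, train_y, x, sigma=1.0):
    """Predict y at x as a kernel-weighted mean of the training targets."""
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in train_x]
    total = sum(weights)
    # total can underflow to 0.0 if x is very far from every training point
    return sum(w * yi for w, yi in zip(weights, train_y)) / total

# Made-up stand-ins: ARIMA fitted values (input) and observed values (target).
arima_fitted = [1.0, 2.0, 3.0]
observed = [10.0, 20.0, 30.0]
print(grnn_predict(arima_fitted, observed, 2.0, sigma=1.0))
```

A smaller `sigma` makes the prediction follow the nearest training points more closely, while a larger one smooths toward the overall mean of the targets.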
ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches

PubMed Central

Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.

2017-01-01

The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. 
The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969
Comparison of Regression Analysis and Transfer Function in Estimating the Parameters of Central Pulse Waves from Brachial Pulse Wave.

PubMed

Chai, Rui; Xu, Li-Sheng; Yao, Yang; Hao, Li-Ling; Qi, Lin

2017-01-01

This study analyzed the ascending branch slope (A_slope), dicrotic notch height (Hn), diastolic area (Ad), systolic area (As), diastolic blood pressure (DBP), systolic blood pressure (SBP), pulse pressure (PP), subendocardial viability ratio (SEVR), waveform parameter (k), stroke volume (SV), cardiac output (CO), and peripheral resistance (RS) of central pulse waves measured invasively and non-invasively. Invasively measured parameters were compared with parameters derived from brachial pulse waves by a regression model and by a transfer function model. The accuracy of the parameters estimated by the two models was also compared. The findings showed that the k value and the invasively measured central pulse wave and brachial pulse wave parameters correlated positively. The regression model estimates of A_slope, DBP, and SEVR, and the transfer function model estimates, were consistent with the invasively measured parameters, and the two models were equally consistent. SBP, PP, SV, and CO could be calculated with the regression model, but less accurately than with the transfer function model.
Increased Risk of Venous Thromboembolism in Women with Uterine Leiomyoma: A Nationwide, Population-Based Case-Control Study

PubMed Central

Huang, Hung-Kai; Kor, Chew-Teng; Chen, Ching-Pei; Chen, Hung-Te; Yang, Po-Ta; Tsai, Chen-Dao; Huang, Ching-Hui

2018-01-01

Background Venous thromboembolism (VTE) is a sex-specific disease that has different presentations between men and women. Women with uterine leiomyoma can present with VTE without exhibiting the traditional risk factors. We investigated the relationship between a history of uterine leiomyoma and the risk of VTE using the National Health Insurance Research Database (NHIRD). Methods We conducted a retrospective, nationwide, population-based case-control study using the NHIRD. We identified 2,282 patients with diagnosed VTE and 392,635 subjects without VTE from 2000 to 2013. After development of an age and index diagnosis year frequency-matched model and propensity score-matched model, 2 models with a case-to-control ratio of 1 to 4 were established. Using the diagnosis of uterine leiomyoma as the exposure factor, conditional logistic regression was performed to examine the association between uterine leiomyoma and VTE. Multiple logistic regression analysis was used to investigate the joint effect of uterine leiomyoma and comorbid diseases on the risk of VTE. Results A strong association was observed between uterine leiomyoma and VTE in the overall patient model, frequency-matched model and propensity score-matched model [p < 0.0001, odds ratio (OR): 1.547; p = 0.0005, OR: 1.486; p = 0.0405, OR: 1.26, respectively]. In the subgroup analyses, women with uterine leiomyoma who were ≥ 45 years old were less likely to experience VTE, but women with uterine leiomyoma and anemia, cancer, coronary artery disease or heart failure were more likely to experience VTE. Conclusions Women with uterine leiomyomas have an increased risk of developing VTE, especially during reproductive periods or in the presence of specific diseases. PMID:29375226
Development of a Multicomponent Prediction Model for Acute Esophagitis in Lung Cancer Patients Receiving Chemoradiotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

De Ruyck, Kim; Sabbe, Nick; Oberije, Cary

2011-10-01

Purpose: To construct a model for the prediction of acute esophagitis in lung cancer patients receiving chemoradiotherapy by combining clinical data, treatment parameters, and genotyping profile. Patients and Methods: Data were available for 273 lung cancer patients treated with curative chemoradiotherapy. Clinical data included gender, age, World Health Organization performance score, nicotine use, diabetes, chronic disease, tumor type, tumor stage, lymph node stage, tumor location, and medical center. Treatment parameters included chemotherapy, surgery, radiotherapy technique, tumor dose, mean fractionation size, mean and maximal esophageal dose, and overall treatment time. A total of 332 genetic polymorphisms were considered in 112 candidate genes. The predicting model was achieved by lasso logistic regression for predictor selection, followed by classic logistic regression for unbiased estimation of the coefficients. Performance of the model was expressed as the area under the curve of the receiver operating characteristic and as the false-negative rate in the optimal point on the receiver operating characteristic curve. Results: A total of 110 patients (40%) developed acute esophagitis Grade ≥2 (Common Terminology Criteria for Adverse Events v3.0). The final model contained chemotherapy treatment, lymph node stage, mean esophageal dose, gender, overall treatment time, radiotherapy technique, rs2302535 (EGFR), rs16930129 (ENG), rs1131877 (TRAF3), and rs2230528 (ITGB2). The area under the curve was 0.87, and the false-negative rate was 16%. Conclusion: Prediction of acute esophagitis can be improved by combining clinical, treatment, and genetic factors. A multicomponent prediction model for acute esophagitis with a sensitivity of 84% was constructed with two clinical parameters, four treatment parameters, and four genetic polymorphisms.
Long-Term Trajectories of the Development of Speech Sound Production in Pediatric Cochlear Implant Recipients

PubMed Central

Tomblin, J. Bruce; Peng, Shu-Chen; Spencer, Linda J.; Lu, Nelson

2011-01-01

Purpose This study characterized the development of speech sound production in prelingually deaf children with a minimum of 8 years of cochlear implant (CI) experience. Method Twenty-seven pediatric CI recipients' spontaneous speech samples from annual evaluation sessions were phonemically transcribed. Accuracy for these speech samples was evaluated in piecewise regression models. Results As a group, pediatric CI recipients showed steady improvement in speech sound production following implantation, but the improvement rate declined after 6 years of device experience. Piecewise regression models indicated that the slope estimating the participants' improvement rate was statistically greater than 0 during the first 6 years postimplantation, but not after 6 years. The group of pediatric CI recipients' accuracy of speech sound production after 4 years of device experience reasonably predicts their speech sound production after 5–10 years of device experience. Conclusions The development of speech sound production in prelingually deaf children stabilizes after 6 years of device experience, and typically approaches a plateau by 8 years of device use. Early growth in speech before 4 years of device experience did not predict later rates of growth or levels of achievement. However, good predictions could be made after 4 years of device use. PMID:18695018
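A piecewise regression of the kind described above can be sketched with a single hinge term, so that the slope before the knot and the slope after it are estimated jointly. The knot at 6 years follows the abstract's finding, but fixing it in advance is an assumption, as are the function names and the toy data; the study's actual model specification is not given here.

```python
# Minimal sketch of a two-segment piecewise linear regression:
#   y = b0 + b1 * t + b2 * max(0, t - knot)
# Slope before the knot is b1; slope after it is b1 + b2.
# Function names, the fixed knot and the data are illustrative assumptions.

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_piecewise(t, y, knot=6.0):
    """Least-squares fit; returns (intercept, slope_before, slope_after)."""
    X = [[1.0, ti, max(0.0, ti - knot)] for ti in t]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
    b0, b1, b2 = solve(XtX, Xty)
    return b0, b1, b1 + b2

# Made-up accuracy scores: steady growth for 6 years, then a plateau.
years = [float(k) for k in range(1, 11)]
accuracy = [20.0 + 10.0 * min(k, 6.0) for k in years]
print(fit_piecewise(years, accuracy))
```

Testing whether the post-knot slope differs from zero, as the abstract reports, would additionally need standard errors, which this sketch does not compute.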
Detection and quantification of adulteration in sandalwood oil through near infrared spectroscopy.

PubMed

Kuriakose, Saji; Thankappan, Xavier; Joe, Hubert; Venkataraman, Venkateswaran

2010-10-01

The confirmation of authenticity of essential oils and the detection of adulteration are problems of increasing importance in the perfumes, pharmaceutical, flavor and fragrance industries. This is especially true for 'value added' products like sandalwood oil. A methodical study is conducted here to demonstrate the potential use of Near Infrared (NIR) spectroscopy along with multivariate calibration models like principal component regression (PCR) and partial least square regression (PLSR) as rapid analytical techniques for the qualitative and quantitative determination of adulterants in sandalwood oil. After suitable pre-processing of the NIR raw spectral data, the models are built-up by cross-validation. The lowest Root Mean Square Error of Cross-Validation and Calibration (RMSECV and RMSEC % v/v) are used as a decision supporting system to fix the optimal number of factors. The coefficient of determination (R²) and the Root Mean Square Error of Prediction (RMSEP % v/v) in the prediction sets are used as the evaluation parameters (R² = 0.9999 and RMSEP = 0.01355). The overall result leads to the conclusion that NIR spectroscopy with chemometric techniques could be successfully used as a rapid, simple, instant and non-destructive method for the detection of adulterants, even 1% of the low-grade oils, in the high quality form of sandalwood oil.
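The two evaluation parameters named above can be computed from a prediction set as sketched below. The function names and numbers are hypothetical, and the coefficient of determination is taken here as 1 - SS_res / SS_tot; chemometrics software sometimes reports the squared correlation instead, so this choice is an assumption rather than the authors' definition.

```python
import math

# Minimal sketch of RMSEP and the coefficient of determination on a
# prediction set. Names and data are illustrative assumptions.

def rmsep(actual, predicted):
    """Root mean square error of prediction."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Coefficient of determination, defined as 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up adulterant levels (% v/v): reference values and model predictions.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.9]
print(rmsep(reference, predicted), r_squared(reference, predicted))
```

RMSEP is expressed in the units of the measured quantity (% v/v in the abstract), whereas the coefficient of determination is unitless.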
Hazards of New Media: Youth’s Exposure to Tobacco Ads/Promotions

PubMed Central

2014-01-01

Background: A gap in knowledge exists about the youth’s exposure to protobacco campaigns via new electronic media outlets. In response, we use national data to delineate the associations between tobacco ads/promotions delivered through new media outlets (i.e., social network sites and text messages) and youth attitudes/beliefs about tobacco and intent to use (among youth who had not yet used tobacco). Methods: Data were derived from the 2011 National Youth Tobacco Survey, a nationally representative sample of U.S. youth enrolled in both public and private schools (N = 15,673). Logistic regression models were used to examine associations between demographic characteristics and reported exposure to tobacco ads/promotions via social networking sites and text messages. Logistic regression models were also used to investigate associations between exposure to tobacco ads/promotions and attitudes toward tobacco. Results: We found that highly susceptible youth (i.e., minorities, very young youth, and youth who have not yet used tobacco) have observed tobacco ads/promotions on social networking sites and text messages. These youth are more likely to have favorable attitudes toward tobacco, including the intention to use tobacco among those who had not yet used tobacco. Conclusions: Our findings underscore the need for policy strategies to more effectively monitor and regulate tobacco advertising via new media outlets. PMID:24163285
Impact of Trichiasis Surgery on Physical Functioning in Ethiopian Patients: STAR Trial

PubMed Central

Wolle, Meraf A.; Cassard, Sandra D.; Gower, Emily W.; Munoz, Beatriz E.; Wang, Jiangxia; Alemayehu, Wondu; West, Sheila K.

2010-01-01

Purpose To evaluate the physical functioning of Ethiopian trichiasis surgery patients before and six months after surgery. Design Nested Cohort Study Methods This study was nested within the Surgery for Trichiasis, Antibiotics to Prevent Recurrence (STAR) clinical trial conducted in Ethiopia. Demographic information, ocular examinations, and physical functioning assessments were collected before and 6 months after surgery. A single score for patients’ physical functioning was constructed using Rasch analysis. A multivariate linear regression model was used to determine if change in physical functioning was associated with change in visual acuity. Results Of the 438 participants, 411 (93.8%) had both baseline and follow-up questionnaires. Physical functioning scores at baseline ranged from −6.32 (great difficulty) to +6.01 (no difficulty). The percent of participants reporting no difficulty in physical functioning increased by 32.6%; the proportion of participants in the mild/no visual impairment category increased by 8.6%. A multivariate linear regression model showed that for every line of vision gained, physical functioning improves significantly (0.09 units; 95% CI: 0.02–0.16). Conclusions Surgery to correct trichiasis appears to improve patients’ physical functioning as measured at 6 months. More effort in promoting trichiasis surgery is essential, not only to prevent corneal blindness, but also to enable improved functioning in daily life. PMID:21333268
Adherence of older women with strength training and aerobic exercise

PubMed Central

Picorelli, Alexandra Miranda Assumpção; Pereira, Daniele Sirineu; Felício, Diogo Carvalho; Dos Anjos, Daniela Maria; Pereira, Danielle Aparecida Gomes; Dias, Rosângela Corrêa; Assis, Marcella Guimarães; Pereira, Leani Souza Máximo

2014-01-01

Background Participation of older people in a program of regular exercise is an effective strategy to minimize the physical decline associated with age. The purpose of this study was to assess adherence rates in older women enrolled in two different exercise programs (one aerobic exercise and one strength training) and identify any associated clinical or functional factors. Methods This was an exploratory observational study in a sample of 231 elderly women of mean age 70.5 years. We used a structured questionnaire with standardized tests to evaluate the relevant clinical and functional measures. A specific adherence questionnaire was developed by the researchers to determine motivators and barriers to exercise adherence. Results The adherence rate was 49.70% in the aerobic exercise group and 56.20% in the strength training group. Multiple logistic regression models for motivation were significant (P=0.003) for the muscle strengthening group (R2=0.310) and also significant (P=0.008) for the aerobic exercise group (R2=0.154). A third regression model for barriers to exercise was significant (P=0.003) only for the muscle strengthening group (R2=0.236). The present study shows no direct relationship between worsening health status and poor adherence. Conclusion Factors related to adherence with exercise in the elderly are multifactorial. PMID:24600212
