NASA Astrophysics Data System (ADS)
Haris, A.; Nafian, M.; Riyanto, A.
2017-07-01
Danish North Sea fields comprise several formations (Ekofisk, Tor, and Cromer Knoll) spanning ages from the Paleocene to the Miocene. In this study, seismic and well log data are integrated to determine the chalk sand distribution in the Danish North Sea field. The integration is performed using seismic inversion and seismic multi-attribute analysis. The seismic inversion algorithm used to derive acoustic impedance (AI) is a model-based technique. The derived AI is then used as an external attribute for the multi-attribute analysis. The multi-attribute analysis, in turn, is used to derive linear and non-linear transformations among well log properties. For the linear model, the transformation is selected by step-wise linear regression (SWR), while the non-linear model is built with probabilistic neural networks (PNN). The porosity estimated by PNN fits the well log data better than the SWR result. This is to be expected, since PNN performs non-linear regression, so the relationship between the attribute data and the predicted log can be better optimized. The distribution of chalk sand has been successfully identified and is characterized by porosity values ranging from 23% to 30%.
Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data
Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.
2013-01-01
Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689
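The algebraic equivalence this abstract re-derives can be checked numerically. Below is a minimal sketch with invented data: the exponential-survival log-likelihood and the Poisson log-likelihood with a log(time) offset differ only by a term that is constant in the coefficient, so both are maximized by the same value.

```python
import math

# Invented toy data: follow-up times in a sleep state, event indicators
# (1 = transition observed, 0 = censored), and one binary covariate
# (e.g. disease status).
times  = [2.0, 5.0, 1.5, 7.0]
events = [1, 0, 1, 1]
x      = [0.0, 1.0, 0.0, 1.0]

def exp_survival_loglik(beta):
    # hazard lambda_i = exp(beta * x_i); exponential survival contribution
    # is d_i * log(lambda_i) - lambda_i * t_i
    ll = 0.0
    for t, d, xi in zip(times, events, x):
        lam = math.exp(beta * xi)
        ll += d * math.log(lam) - lam * t
    return ll

def poisson_offset_loglik(beta):
    # treat d_i as Poisson with mean t_i * exp(beta * x_i), i.e. a
    # log-linear model with offset log(t_i); the -log(d_i!) term is
    # dropped since it is constant in beta
    ll = 0.0
    for t, d, xi in zip(times, events, x):
        mu = t * math.exp(beta * xi)
        ll += d * math.log(mu) - mu
    return ll

# The two log-likelihoods differ by sum(d_i * log(t_i)), which does not
# involve beta, so the difference is the same at any beta.
diff_a = poisson_offset_loglik(0.3) - exp_survival_loglik(0.3)
diff_b = poisson_offset_loglik(-1.2) - exp_survival_loglik(-1.2)
print(abs(diff_a - diff_b) < 1e-12)  # True
```

This constant offset is why a Poisson log-linear fit with a log(exposure-time) offset reproduces the exponential survival coefficients.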
On the equivalence of case-crossover and time series methods in environmental epidemiology.
Lu, Yun; Zeger, Scott L
2007-04-01
The case-crossover design was introduced in epidemiology 15 years ago as a method for studying the effects of a risk factor on a health event using only cases. The idea is to compare a case's exposure immediately prior to or during the case-defining event with that same person's exposure at otherwise similar "reference" times. An alternative approach to the analysis of daily exposure and case-only data is time series analysis. Here, log-linear regression models express the expected total number of events on each day as a function of the exposure level and potential confounding variables. In time series analyses of air pollution, smooth functions of time and weather are the main confounders. Time series and case-crossover methods are often viewed as competing methods. In this paper, we show that case-crossover using conditional logistic regression is a special case of time series analysis when there is a common exposure such as in air pollution studies. This equivalence provides computational convenience for case-crossover analyses and a better understanding of time series models. Time series log-linear regression accounts for overdispersion of the Poisson variance, while case-crossover analyses typically do not. This equivalence also permits model checking for case-crossover data using standard log-linear model diagnostics.
Protocol Analysis as a Tool in Function and Task Analysis
1999-10-01
Autocontingency... The use of log-linear and logistic regression methods to analyse sequential data seems appealing, and is strongly advocated by... collection and analysis of observational data. Behavior Research Methods, Instruments, and Computers, 23(3), 415-429. Patrick, J. D. (1991). Snob: A
Reliability Analysis of the Gradual Degradation of Semiconductor Devices.
1983-07-20
under the heading of linear models or linear statistical models. We have not used this material in this report. Assuming catastrophic failure when... assuming a catastrophic model. In this treatment we first modify our system loss formula and then proceed to the actual analysis. II. ANALYSIS OF... failure times T1, T2, ..., Tn are easily analyzed by simple linear regression, since we have assumed a log normal/Arrhenius activation
Kilian, Reinhold; Matschinger, Herbert; Löeffler, Walter; Roick, Christiane; Angermeyer, Matthias C
2002-03-01
Transformation of the dependent cost variable is often used to solve the problems of heteroscedasticity and skewness in linear ordinary least square regression of health service cost data. However, transformation may cause difficulties in the interpretation of regression coefficients and the retransformation of predicted values. The study compares the advantages and disadvantages of different methods to estimate regression based cost functions using data on the annual costs of schizophrenia treatment. Annual costs of psychiatric service use and clinical and socio-demographic characteristics of the patients were assessed for a sample of 254 patients with a diagnosis of schizophrenia (ICD-10 F 20.0) living in Leipzig. The clinical characteristics of the participants were assessed by means of the BPRS 4.0, the GAF, and the CAN for service needs. Quality of life was measured by WHOQOL-BREF. A linear OLS regression model with non-parametric standard errors, a log-transformed OLS model and a generalized linear model with a log-link and a gamma distribution were used to estimate service costs. For the estimation of robust non-parametric standard errors, the variance estimator by White and a bootstrap estimator based on 2000 replications were employed. Models were evaluated by the comparison of the R2 and the root mean squared error (RMSE). RMSE of the log-transformed OLS model was computed with three different methods of bias-correction. The 95% confidence intervals for the differences between the RMSE were computed by means of bootstrapping. A split-sample-cross-validation procedure was used to forecast the costs for the one half of the sample on the basis of a regression equation computed for the other half of the sample. All three methods showed significant positive influences of psychiatric symptoms and met psychiatric service needs on service costs. 
Only the log-transformed OLS model showed a significant negative impact of age, and only the GLM showed significant negative influences of employment status and partnership on costs. All three models provided an R2 of about 0.31. The residuals of the linear OLS model revealed significant deviations from normality and homoscedasticity. The residuals of the log-transformed model were normally distributed but still heteroscedastic. The linear OLS model provided the lowest prediction error and the best forecast of the dependent cost variable. The log-transformed model provided the lowest RMSE when the heteroscedastic bias correction was used. The RMSE of the GLM with a log link and a gamma distribution was higher than those of the linear OLS model and the log-transformed OLS model. The difference between the RMSE of the linear OLS model and that of the log-transformed OLS model without bias correction was significant at the 95% level. In the cross-validation procedure, the linear OLS model provided the lowest RMSE, followed by the log-transformed OLS model with a heteroscedastic bias correction. The GLM again showed the weakest model fit. None of the differences between the RMSE resulting from the cross-validation procedure were found to be significant. The comparison of the fit indices of the different regression models revealed that the linear OLS model provided a better fit than the log-transformed model and the GLM, but the differences between the models' RMSE were not significant. Due to the small number of cases in the study, the lack of significance does not sufficiently prove that the differences between the RMSE for the different models are zero, and the superiority of the linear OLS model cannot be generalized. The lack of significant differences among the alternative estimators may reflect a sample size inadequate to detect important differences among the estimators employed. 
Further studies with larger case numbers are necessary to confirm the results. Specification of an adequate regression model requires careful examination of the characteristics of the data. Estimation of standard errors and confidence intervals by non-parametric methods, which are robust against deviations from normality and homoscedasticity of the residuals, is a suitable alternative to transformation of the skewed dependent cost variable.
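The retransformation difficulty this abstract discusses has a standard illustration: after OLS on log(cost), naively exponentiating predictions understates the mean cost, and Duan's smearing estimator is one common correction. The sketch below uses invented data and presents the smearing correction as a general technique, not necessarily the exact bias correction employed in this study.

```python
import math

# Invented data: a hypothetical severity score and annual treatment costs.
severity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cost     = [120.0, 180.0, 310.0, 400.0, 700.0, 950.0]
log_cost = [math.log(c) for c in cost]

# Closed-form simple OLS fit of log(cost) on severity.
n = len(severity)
xbar = sum(severity) / n
ybar = sum(log_cost) / n
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(severity, log_cost)) \
     / sum((x - xbar) ** 2 for x in severity)
b0 = ybar - b1 * xbar

# Duan's smearing factor: the mean of the exponentiated log-scale residuals.
residuals = [y - (b0 + b1 * x) for x, y in zip(severity, log_cost)]
smear = sum(math.exp(r) for r in residuals) / n

def predict_cost(x_new):
    # bias-corrected retransformation: smear * exp(linear predictor)
    return smear * math.exp(b0 + b1 * x_new)

# By Jensen's inequality the smearing factor is at least 1, since the
# OLS residuals average to zero on the log scale.
print(smear >= 1.0)  # True
```

The factor `smear` is exactly the heteroscedasticity-naive smearing correction; heteroscedastic variants compute it within residual strata.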
Using nonlinear quantile regression to estimate the self-thinning boundary curve
Quang V. Cao; Thomas J. Dean
2015-01-01
The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws.
Xiao, Xiao; White, Ethan P; Hooten, Mevin B; Durham, Susan L
2011-10-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
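The LR-versus-NLR comparison can be sketched in a few lines. This is a hedged toy version with invented data and a crude grid search standing in for proper nonlinear least squares; the paper's simulations are far more extensive.

```python
import math
import random

# Simulate a power law y = a * x^b with multiplicative, lognormal error.
random.seed(1)
a_true, b_true = 2.0, 0.75
xs = [0.5 + 0.5 * i for i in range(1, 41)]
ys = [a_true * x ** b_true * math.exp(random.gauss(0.0, 0.2)) for x in xs]

# Method 1: LR on log-transformed data (the traditional approach);
# the slope of the log-log fit estimates the exponent b.
lx = [math.log(x) for x in xs]
ly = [math.log(y) for y in ys]
mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
b_lr = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) \
       / sum((u - mx) ** 2 for u in lx)

# Method 2: NLR by grid search, minimizing squared error on the raw scale.
def sse(a, b):
    return sum((y - a * x ** b) ** 2 for x, y in zip(xs, ys))

best, a_nlr, b_nlr = float("inf"), None, None
for bi in (0.3 + 0.01 * k for k in range(91)):      # b in [0.3, 1.2]
    for ai in (0.5 + 0.05 * k for k in range(71)):  # a in [0.5, 4.0]
        s = sse(ai, bi)
        if s < best:
            best, a_nlr, b_nlr = s, ai, bi

# With multiplicative error, LR is the better-specified estimator here;
# both estimates should land near the true exponent 0.75.
print(round(b_lr, 2), round(b_nlr, 2))
```

Swapping the error model to additive normal noise reverses which estimator is well specified, which is the abstract's central point.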
Ma, Wan-Li; Sun, De-Zhi; Shen, Wei-Guo; Yang, Meng; Qi, Hong; Liu, Li-Yan; Shen, Ji-Min; Li, Yi-Fan
2011-07-01
A comprehensive sampling campaign was carried out to study atmospheric concentrations of polycyclic aromatic hydrocarbons (PAHs) in Beijing and to evaluate the effectiveness of source control strategies in reducing PAH pollution after the 29th Olympic Games. A sub-cooled liquid vapor pressure (log P°L)-based model and an octanol-air partition coefficient (Koa)-based model were applied to each seasonal dataset. Regression analysis among log KP, log P°L, and log Koa exhibited highly significant correlations for all four seasons. Source factors were identified by principal component analysis, and contributions were further estimated by multiple linear regression. Pyrogenic sources and coke oven emission were identified as major sources for both the non-heating and heating seasons. Compared with literature values, the mean PAH concentrations before and after the 29th Olympic Games were reduced by more than 60%, indicating that the source control measures were effective in reducing PAH pollution in Beijing. Copyright © 2011 Elsevier Ltd. All rights reserved.
Lamm, Steven H; Ferdosi, Hamid; Dissen, Elisabeth K; Li, Ji; Ahn, Jaeil
2015-12-07
High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1-1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100-150 µg/L arsenic.
Effect of Malmquist bias on correlation studies with IRAS data base
NASA Technical Reports Server (NTRS)
Verter, Frances
1993-01-01
The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. Linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B), as well as ratios of these quantities. The linear correlations are corrected for Malmquist bias using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method introduced here, in which each galaxy observation is weighted by its sampling volume. The correlation and regression results for the sample change significantly in the anticipated sense: the corrected correlation confidences are lower, and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias removes the nonlinear rise in luminosity that has led some authors to hypothesize additional components of FIR emission.
Statistical considerations in the analysis of data from replicated bioassays
USDA-ARS?s Scientific Manuscript database
Multiple-dose bioassay is generally the preferred method for characterizing virulence of insect pathogens. Linear regression of probit mortality on log dose enables estimation of LD50/LC50 and slope, the latter having substantial effect on LD90/95s (doses of considerable interest in pest management)...
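The probit-on-log-dose calculation referred to above can be sketched as follows. This is a simplified unweighted least-squares fit to empirical probits with invented counts; production probit analysis uses iteratively reweighted maximum likelihood, so treat this only as an illustration of where the LD50 comes from.

```python
import math
from statistics import NormalDist

# Invented bioassay data: five doses, 50 insects per dose.
doses      = [1.0, 2.0, 4.0, 8.0, 16.0]
killed     = [2, 8, 24, 40, 47]
n_per_dose = 50

# Empirical probits: inverse normal CDF of the observed mortality
# proportions, plotted against log10(dose).
inv_cdf = NormalDist().inv_cdf
x = [math.log10(d) for d in doses]
y = [inv_cdf(k / n_per_dose) for k in killed]

# Closed-form simple linear regression of probit on log10(dose).
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
        / sum((xi - xbar) ** 2 for xi in x)
intercept = ybar - slope * xbar

# LD50 is the dose where the fitted probit crosses zero (50% mortality).
ld50 = 10 ** (-intercept / slope)
print(round(ld50, 2))  # between doses 4 and 8 for these toy counts
```

The fitted slope is the quantity the entry flags as driving LD90/LD95 estimates: a shallower slope pushes the upper-percentile doses far above the LD50.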
Minimizing bias in biomass allometry: Model selection and log transformation of data
Joseph Mascaro; Flint Hughes; Amanda Uowolo; Stefan A. Schnitzer
2011-01-01
Nonlinear regression is increasingly used to develop allometric equations for forest biomass estimation (i.e., as opposed to the traditional approach of log-transformation followed by linear regression). Most statistical software packages, however, assume additive errors by default, violating a key assumption of allometric theory and possibly producing spurious models....
A FORTRAN program for multivariate survival analysis on the personal computer.
Mulder, P G
1988-01-01
In this paper a FORTRAN program is presented for multivariate survival or life-table regression analysis in a competing-risks situation. The relevant failure rate (for example, a particular disease or mortality rate) is modelled as a log-linear function of a vector of (possibly time-dependent) explanatory variables. The explanatory variables may also include time itself, which is useful for parameterizing piecewise exponential time-to-failure distributions in a Gompertz-like or Weibull-like way as a more efficient alternative to Cox's proportional hazards model. Maximum likelihood estimates of the coefficients of the log-linear relationship are obtained by the iterative Newton-Raphson method. The program runs on a personal computer under DOS; running time is quite acceptable, even for large samples.
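The Newton-Raphson scheme the abstract names can be sketched in a few lines. This is a hedged, simplified stand-in for the FORTRAN routine, assuming fully observed exponential failure times and a single binary covariate; all numbers are invented.

```python
import math

# Invented failure times and a binary covariate (e.g. stress condition).
# The failure rate is log-linear: lambda_i = exp(beta * x_i).
times = [3.2, 1.1, 4.8, 2.0, 6.5, 0.7]
x     = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

beta = 0.0
for _ in range(25):
    # Score (first derivative) and observed information (negative second
    # derivative) of the log-likelihood sum_i [beta*x_i - exp(beta*x_i)*t_i].
    score = sum(xi - xi * math.exp(beta * xi) * t for t, xi in zip(times, x))
    info  = sum(xi * xi * math.exp(beta * xi) * t for t, xi in zip(times, x))
    step = score / info
    beta += step
    if abs(step) < 1e-12:
        break

# With one binary covariate the MLE has a closed form to check against:
# exp(beta) = (events with x=1) / (total time at risk with x=1).
closed_form = math.log(3 / (1.1 + 2.0 + 0.7))
print(abs(beta - closed_form) < 1e-9)  # True
```

The full program generalizes this to a coefficient vector (matrix score and information) and to censored, competing-risk data, but the iteration is the same.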
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.
Kong, Shengchun; Nan, Bin
2014-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso-penalized Cox regression, using pointwise arguments to tackle the difficulties caused by the lack of iid Lipschitz losses.
ERIC Educational Resources Information Center
Denham, Bryan E.
2009-01-01
Grounded conceptually in social cognitive theory, this research examines how personal, behavioral, and environmental factors are associated with risk perceptions of anabolic-androgenic steroids. Ordinal logistic regression and logit log-linear models applied to data gathered from high-school seniors (N = 2,160) in the 2005 Monitoring the Future…
Cronin, Matthew A.; Amstrup, Steven C.; Durner, George M.; Noel, Lynn E.; McDonald, Trent L.; Ballard, Warren B.
1998-01-01
There is concern that caribou (Rangifer tarandus) may avoid roads and facilities (i.e., infrastructure) in the Prudhoe Bay oil field (PBOF) in northern Alaska, and that this avoidance can have negative effects on the animals. We quantified the relationship between caribou distribution and PBOF infrastructure during the post-calving period (mid-June to mid-August) with aerial surveys from 1990 to 1995. We conducted four to eight surveys per year with complete coverage of the PBOF. We identified active oil field infrastructure and used a geographic information system (GIS) to construct ten 1-km-wide concentric intervals surrounding the infrastructure. We tested whether caribou distribution is related to distance from infrastructure with a chi-squared habitat utilization-availability analysis and log-linear regression. We considered bulls, calves, and total caribou of all sex/age classes separately. The habitat utilization-availability analysis indicated there was no consistent trend of attraction to or avoidance of infrastructure. Caribou frequently were more abundant than expected in the intervals close to infrastructure, and this trend was more pronounced for bulls and for total caribou of all sex/age classes than for calves. Log-linear regressions (with Poisson error structure) of caribou numbers on distance from infrastructure were also performed, with and without combining data into the 1-km distance intervals. The analysis without intervals revealed no relationship between caribou distribution and distance from oil field infrastructure, or between caribou distribution and Julian date, year, or distance from the Beaufort Sea coast. The log-linear regression with caribou combined into distance intervals showed that the density of bulls and total caribou of all sex/age classes declined with distance from infrastructure. 
Our results indicate that during the post-calving period: 1) caribou distribution is largely unrelated to distance from infrastructure; 2) caribou regularly use habitats in the PBOF; 3) caribou often occur close to infrastructure; and 4) caribou do not appear to avoid oil field infrastructure.
Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert
2012-01-01
Purpose: To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods: Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R2. Hypothesis testing of coefficients over the patient population was performed. Results: Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R2 from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions: Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748
USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES
The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...
Bhamidipati, Ravi Kanth; Syed, Muzeeb; Mullangi, Ramesh; Srinivas, Nuggehally
2018-02-01
1. Dalbavancin, a lipoglycopeptide, is approved for treating gram-positive bacterial infections. The area under the plasma concentration versus time curve (AUCinf) of dalbavancin is a key parameter, and the AUCinf/MIC ratio is a critical pharmacodynamic marker. 2. Using the end-of-intravenous-infusion concentration (i.e. Cmax), the Cmax versus AUCinf relationship for dalbavancin was established by regression analyses (i.e. linear, log-log, log-linear and power models) using 21 pairs of subject data. 3. Predictions of AUCinf were performed by applying the regression equations to published Cmax data. The quotient of observed/predicted values rendered the fold difference. The mean absolute error (MAE)/root mean square error (RMSE) and correlation coefficient (r) were used in the assessment. 4. MAE and RMSE values for the various models were comparable. Cmax versus AUCinf exhibited excellent correlation (r > 0.9488). The internal data evaluation showed narrow confinement (0.84-1.14-fold difference) with an RMSE < 10.3%. The external data evaluation showed that the models predicted AUCinf with an RMSE of 3.02-27.46%, with the fold difference largely contained within 0.64-1.48. 5. Regardless of the regression model, a single-time-point strategy using Cmax (i.e. end of 30-min infusion) is amenable as a prospective tool for predicting AUCinf of dalbavancin in patients.
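One of the model forms named above, the power model, can be sketched as a regression of log(AUC) on log(Cmax), with predictions judged by fold difference and percent RMSE. All values below are invented for illustration; they are not the study's dalbavancin data.

```python
import math

# Invented subject pairs: end-of-infusion concentration and AUCinf.
cmax = [250.0, 280.0, 310.0, 350.0, 400.0]            # mg/L
auc  = [11000.0, 12500.0, 14000.0, 16200.0, 19000.0]  # mg*h/L

# Power model: log(AUC) = a0 + b * log(Cmax), fit by closed-form OLS.
lx = [math.log(c) for c in cmax]
ly = [math.log(a) for a in auc]
mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) \
    / sum((u - mx) ** 2 for u in lx)
a0 = my - b * mx

# Back-transform to predicted AUCs, then compute the two assessment
# metrics used in the abstract: fold difference and percent RMSE.
pred = [math.exp(a0 + b * u) for u in lx]
fold = [obs / p for obs, p in zip(auc, pred)]   # observed / predicted
mean_auc = sum(auc) / len(auc)
rmse_pct = 100 * math.sqrt(sum((o - p) ** 2 for o, p in zip(auc, pred))
                           / len(auc)) / mean_auc

print(all(0.9 < f < 1.1 for f in fold))  # folds cluster tightly around 1
```

The same scaffolding covers the other model forms by swapping the transform applied to each axis (none, log, or log on one side only).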
Estimation of octanol/water partition coefficients using LSER parameters
Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.
1998-01-01
The logarithms of octanol/water partition coefficients, log Kow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for log Kow was 0.49. The regression equation was then used to estimate log Kow for a test set of 146 chemicals, which included pesticides and other diverse polyfunctional compounds. Thus the octanol/water partition coefficient may be estimated from LSER parameters without elaborate software, but only moderate accuracy should be expected.
Athanasopoulos, Leonidas V; Dritsas, Athanasios; Doll, Helen A; Cokkinos, Dennis V
2010-08-01
This study was conducted to explain the variance in quality of life (QoL) and activity capacity of patients with congestive heart failure from pathophysiological changes as estimated by laboratory data. Peak oxygen consumption (peak VO2) and ventilation (VE)/carbon dioxide output (VCO2) slope derived from cardiopulmonary exercise testing, plasma N-terminal prohormone of B-type natriuretic peptide (NT-proBNP), and echocardiographic markers [left atrium (LA), left ventricular ejection fraction (LVEF)] were measured in 62 patients with congestive heart failure, who also completed the Minnesota Living with Heart Failure Questionnaire and the Specific Activity Questionnaire. All regression models were adjusted for age and sex. On linear regression analysis, peak VO2 with P value less than 0.001, VE/VCO2 slope with P value less than 0.01, LVEF with P value less than 0.001, LA with P=0.001, and logNT-proBNP with P value less than 0.01 were found to be associated with QoL. On stepwise multiple linear regression, peak VO2 and LVEF continued to be predictive, accounting for 40% of the variability in Minnesota Living with Heart Failure Questionnaire score. On linear regression analysis, peak VO2 with P value less than 0.001, VE/VCO2 slope with P value less than 0.001, LVEF with P value less than 0.05, LA with P value less than 0.001, and logNT-proBNP with P value less than 0.001 were found to be associated with activity capacity. On stepwise multiple linear regression, peak VO2 and LA continued to be predictive, accounting for 53% of the variability in Specific Activity Questionnaire score. Peak VO2 is independently associated both with QoL and activity capacity. In addition to peak VO2, LVEF is independently associated with QoL, and LA with activity capacity.
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that assumptions are valid, including independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
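The log-odds interpretation described above can be sketched directly: because the log odds is linear in the predictors, a one-unit increase in a predictor multiplies the odds by exp(coefficient). The coefficients below are invented for illustration; they are not the study's fitted values.

```python
import math

# Hypothetical coefficients: intercept, family history of obesity (0/1),
# and frequent routine meals intake (0/1).
b0, b_family, b_meals = -2.0, 1.1, 0.6

def prob_overweight(family_obesity, routine_meals):
    # inverse logit: P(y=1) = 1 / (1 + exp(-(b0 + b1*x1 + b2*x2)))
    log_odds = b0 + b_family * family_obesity + b_meals * routine_meals
    return 1.0 / (1.0 + math.exp(-log_odds))

p0 = prob_overweight(0, 1)   # no family history of obesity
p1 = prob_overweight(1, 1)   # family history of obesity
odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))

# The odds ratio recovers exp(b_family) exactly, regardless of the values
# the other covariates are held at.
print(round(odds_ratio, 4), round(math.exp(b_family), 4))
```

An interaction term, as in the ethnicity-by-meals effect reported here, simply adds a product of predictors to the log-odds line, so its exponentiated coefficient scales the odds ratio of one predictor depending on the other.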
Vucicevic, J; Popovic, M; Nikolic, K; Filipic, S; Obradovic, D; Agbaba, D
2017-03-01
For this study, 31 compounds, including 16 imidazoline/α-adrenergic receptor (IRs/α-ARs) ligands and 15 central nervous system (CNS) drugs, were characterized in terms of the retention factors (k) obtained using biopartitioning micellar and classical reversed-phase chromatography (log kBMC and log kwRP, respectively). Based on the retention factor (log kwRP) and the slope of the linear curve (S), the isocratic parameter (φ0) was calculated. The obtained retention factors were correlated with experimental log BB values for the group of examined compounds. High correlations were obtained between the logarithm of the biopartitioning micellar chromatography (BMC) retention factor and effective permeability (r(log kBMC/log BB): 0.77), while for the RP-HPLC system the correlations were lower (r(log kwRP/log BB): 0.58; r(S/log BB): -0.50; r(φ0/Pe): 0.61). Based on the log kBMC retention data and calculated molecular parameters of the examined compounds, quantitative structure-permeability relationship (QSPR) models were developed using partial least squares, stepwise multiple linear regression, support vector machine and artificial neural network methodologies. A high degree of structural diversity of the analysed IRs/α-ARs ligands and CNS drugs provides a wide applicability domain of the QSPR models for estimation of blood-brain barrier penetration of related compounds.
On the null distribution of Bayes factors in linear regression
USDA-ARS?s Scientific Manuscript database
We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
NASA Astrophysics Data System (ADS)
de Andrés, Javier; Landajo, Manuel; Lorca, Pedro; Labra, Jose; Ordóñez, Patricia
Artificial neural networks have proven to be useful tools for solving financial analysis problems such as financial distress prediction and audit risk assessment. In this paper we focus on the performance of robust (least absolute deviation-based) neural networks in measuring the liquidity of firms. The problem of learning the bivariate relationship between the components (namely, current liabilities and current assets) of the so-called current ratio is analyzed, and the predictive performance of several modelling paradigms (namely, linear and log-linear regressions, classical ratios and neural networks) is compared. An empirical analysis is conducted on a representative database from the Spanish economy. Results indicate that classical ratio models are largely inadequate as a realistic description of the studied relationship, especially when used for predictive purposes. In a number of cases, especially when the analyzed firms are microenterprises, the linear specification is improved upon by the flexible non-linear structures provided by neural networks.
Holtschlag, David J.; Shively, Dawn; Whitman, Richard L.; Haack, Sheridan K.; Fogarty, Lisa R.
2008-01-01
Regression analyses and hydrodynamic modeling were used to identify environmental factors and flow paths associated with Escherichia coli (E. coli) concentrations at Memorial and Metropolitan Beaches on Lake St. Clair in Macomb County, Mich. Lake St. Clair is part of the binational waterway between the United States and Canada that connects Lake Huron with Lake Erie in the Great Lakes Basin. Linear regression, regression-tree, and logistic regression models were developed from E. coli concentration and ancillary environmental data. Linear regression models on log10 E. coli concentrations indicated that rainfall prior to sampling, water temperature, and turbidity were positively associated with bacteria concentrations at both beaches. Flow from Clinton River, changes in water levels, wind conditions, and log10 E. coli concentrations 2 days before or after the target bacteria concentrations were statistically significant at one or both beaches. In addition, various interaction terms were significant at Memorial Beach. Linear regression models for both beaches explained only about 30 percent of the variability in log10 E. coli concentrations. Regression-tree models were developed from data from both Memorial and Metropolitan Beaches but were found to have limited predictive capability in this study. The results indicate that too few observations were available to develop reliable regression-tree models. Linear logistic models were developed to estimate the probability of E. coli concentrations exceeding 300 most probable number (MPN) per 100 milliliters (mL). Rainfall amounts before bacteria sampling were positively associated with exceedance probabilities at both beaches. Flow of Clinton River, turbidity, and log10 E. coli concentrations measured before or after the target E. coli measurements were related to exceedances at one or both beaches. The linear logistic models were effective in estimating bacteria exceedances at both beaches. 
A receiver operating characteristic (ROC) analysis was used to determine cut points that maximize the true-positive rate while minimizing the false-positive rate. A two-dimensional hydrodynamic model was developed to simulate horizontal current patterns on Lake St. Clair in response to wind, flow, and water-level conditions at model boundaries. Simulated velocity fields were used to track hypothetical massless particles backward in time from the beaches along flow paths toward source areas. Reverse particle tracking for idealized steady-state conditions shows changes in expected flow paths and travel times with wind speeds and directions from 24 sectors. The results indicate that three to four sets of contiguous wind sectors have similar effects on flow paths in the vicinity of the beaches. In addition, reverse particle tracking was used under transient conditions to identify expected flow paths for 10 E. coli sampling events in 2004. These results demonstrate the ability to track hypothetical particles from the beaches, backward in time, to likely source areas. This ability, coupled with a greater frequency of bacteria sampling, may provide insight into changes in bacteria concentrations between source and sink areas.
New method for calculating a mathematical expression for streamflow recession
Rutledge, Albert T.
1991-01-01
An empirical method has been devised to calculate the master recession curve, which is a mathematical expression for streamflow recession during times of negligible direct runoff. The method is based on the assumption that the storage-delay factor, which is the time per log cycle of streamflow recession, varies linearly with the logarithm of streamflow. The resulting master recession curve can be nonlinear. The method can be executed by a computer program that reads a data file of daily mean streamflow, then allows the user to select several near-linear segments of streamflow recession. The storage-delay factor for each segment is one of the coefficients of the equation that results from linear least-squares regression. Using results for each recession segment, a mathematical expression of the storage-delay factor as a function of the log of streamflow is determined by linear least-squares regression. The master recession curve, which is a second-order polynomial expression for time as a function of log of streamflow, is then derived using the coefficients of this function.
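The procedure described above can be sketched in a few lines. The recession-segment values below are invented stand-ins for the near-linear segments a user would select; the storage-delay factor K (days per log cycle) is regressed linearly on log streamflow, and integrating dt/d(log Q) = -K(log Q) yields the second-order polynomial master curve:

```python
# Each tuple: (mean log10 streamflow of a recession segment,
#              storage-delay factor K = days per log cycle). Values invented.
segments = [(2.5, 40.0), (2.0, 55.0), (1.5, 70.0), (1.0, 85.0)]

def least_squares(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

logq = [s[0] for s in segments]
k = [s[1] for s in segments]
a, b = least_squares(logq, k)  # storage-delay factor: K(logQ) = a + b*logQ

# Recession satisfies dt/d(logQ) = -K(logQ); integrating from a reference
# logq0 gives the master recession curve: time as a second-order polynomial
# in the log of streamflow.
logq0 = 2.5  # log10 of the starting (highest) streamflow

def master_curve(lq):
    """Days elapsed as streamflow recedes from logq0 down to lq."""
    return -(a * (lq - logq0) + 0.5 * b * (lq ** 2 - logq0 ** 2))
```

With the invented values, K grows as streamflow declines, so the master curve is nonlinear exactly as the abstract allows.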
Moran, John L; Solomon, Patricia J
2012-05-16
For the analysis of length-of-stay (LOS) data, which is characteristically right-skewed, a number of statistical estimators have been proposed as alternatives to the traditional ordinary least squares (OLS) regression with log dependent variable. Using a cohort of patients identified in the Australian and New Zealand Intensive Care Society Adult Patient Database, 2008-2009, 12 different methods were used for estimation of intensive care (ICU) length of stay. These encompassed risk-adjusted regression analysis of firstly: log LOS using OLS, linear mixed model [LMM], treatment effects, skew-normal and skew-t models; and secondly: unmodified (raw) LOS via OLS, generalised linear models [GLMs] with log-link and 4 different distributions [Poisson, gamma, negative binomial and inverse-Gaussian], extended estimating equations [EEE] and a finite mixture model including a gamma distribution. A fixed covariate list and ICU-site clustering with robust variance were utilised for model fitting with split-sample determination (80%) and validation (20%) data sets, and model simulation was undertaken to establish over-fitting (Copas test). Indices of model specification using Bayesian information criterion [BIC: lower values preferred] and residual analysis as well as predictive performance (R2, concordance correlation coefficient (CCC), mean absolute error [MAE]) were established for each estimator. The data-set consisted of 111663 patients from 131 ICUs; with mean(SD) age 60.6(18.8) years, 43.0% were female, 40.7% were mechanically ventilated and ICU mortality was 7.8%. ICU length-of-stay was 3.4(5.1) (median 1.8, range (0.17-60)) days and demonstrated marked kurtosis and right skew (29.4 and 4.4 respectively). BIC showed considerable spread, from a maximum of 509801 (OLS-raw scale) to a minimum of 210286 (LMM). R2 ranged from 0.22 (LMM) to 0.17 and the CCC from 0.334 (LMM) to 0.149, with MAE 2.2-2.4. Superior residual behaviour was established for the log-scale estimators. 
There was a general tendency for over-prediction (negative residuals) and for over-fitting, the exception being the GLM negative binomial estimator. The mean-variance function was best approximated by a quadratic function, consistent with log-scale estimation; the link function was estimated (EEE) as 0.152(0.019, 0.285), consistent with a fractional-root function. For ICU length of stay, log-scale estimation, in particular the LMM, appeared to be the most consistently performing estimator(s). Neither the GLM variants nor the skew-regression estimators dominated.
Schmidt, Rebecca J; Hansen, Robin L; Hartiala, Jaana; Allayee, Hooman; Sconberg, Jaime L; Schmidt, Linda C; Volk, Heather E; Tassone, Flora
2015-08-01
Vitamin D is essential for proper neurodevelopment and cognitive and behavioral function. We examined associations between autism spectrum disorder (ASD) and common, functional polymorphisms in vitamin D pathways. Children aged 24-60 months enrolled from 2003 to 2009 in the population-based CHARGE case-control study were evaluated clinically and confirmed to have ASD (n=474) or typical development (TD, n=281). Maternal, paternal, and child DNA samples for 384 (81%) families of children with ASD and 234 (83%) families of TD children were genotyped for: TaqI, BsmI, FokI, and Cdx2 in the vitamin D receptor (VDR) gene, and CYP27B1 rs4646536, GC rs4588, and CYP2R1 rs10741657. Case-control logistic regression, family-based log-linear, and hybrid log-linear analyses were conducted to produce risk estimates and 95% confidence intervals (CI) for each allelic variant. Paternal VDR TaqI homozygous variant genotype was significantly associated with ASD in case-control analysis (odds ratio [OR] [CI]: 6.3 [1.9-20.7]) and there was a trend towards increased risk associated with VDR BsmI (OR [CI]: 4.7 [1.6-13.4]). Log-linear triad analyses detected parental imprinting, with greater effects of paternally-derived VDR alleles. Child GC AA-genotype/A-allele was associated with ASD in log-linear and ETDT analyses. A significant association between decreased ASD risk and child CYP2R1 AA-genotype was found in hybrid log-linear analysis. There were limitations of low statistical power for less common alleles due to missing paternal genotypes. This study provides preliminary evidence that paternal and child vitamin D metabolism could play a role in the etiology of ASD; further research in larger study populations is warranted. Copyright © 2015. Published by Elsevier Ireland Ltd.
"Geo-statistics methods and neural networks in geophysical applications: A case study"
NASA Astrophysics Data System (ADS)
Rodriguez Sandoval, R.; Urrutia Fucugauchi, J.; Ramirez Cruz, L. C.
2008-12-01
The study focuses on the Ebano-Panuco basin of northeastern Mexico, which is being explored for hydrocarbon reservoirs. These reservoirs are in limestones, and there is interest in determining porosity and permeability in the carbonate sequences. The porosity maps presented in this study are estimated by applying multiattribute and neural network techniques, which combine geophysical logs and 3-D seismic data by means of statistical relationships. The multiattribute analysis is a process to predict a volume of any underground petrophysical measurement from well-log and seismic data. The data consist of a series of target logs from wells which tie a 3-D seismic volume. The target logs are neutron porosity logs. From the 3-D seismic volume a series of sample attributes is calculated. The objective of this study is to derive a relationship between a set of attributes and the target log values. The selected set is determined by a process of forward stepwise regression. The analysis can be linear or nonlinear. In the linear mode the method consists of a series of weights derived by least-squares minimization. In the nonlinear mode, a neural network is trained using the selected attributes as inputs; in this case we used a probabilistic neural network (PNN). The method is applied to a real data set from PEMEX. For better reservoir characterization the porosity distribution was estimated using both techniques. The case shows a continuous improvement in the prediction of porosity from the multiattribute to the neural network analysis. The improvement is in the training and the validation, which are important indicators of the reliability of the results. The neural network showed an improvement in resolution over the multiattribute analysis. The final maps provide more realistic results for the porosity distribution.
Schønning, Kristian; Johansen, Kim; Nielsen, Lone Gilmor; Weis, Nina; Westh, Henrik
2018-07-01
Quantification of HBV DNA is used for initiating and monitoring antiviral treatment. Analytical test performance consequently impacts treatment decisions. To compare the analytical performance of the Aptima HBV Quant Assay (Aptima) and the COBAS Ampliprep/COBAS TaqMan HBV Test v2.0 (CAPCTMv2) for the quantification of HBV DNA in plasma samples. The performance of the two tests was compared on 129 prospective plasma samples, and on 63 archived plasma samples of which 53 were genotyped. Linearity of the two assays was assessed on dilutions series of three clinical samples (Genotype B, C, and D). Bland-Altman analysis of 120 clinical samples, which quantified in both tests, showed an average quantification bias (Aptima - CAPCTMv2) of -0.19 Log IU/mL (SD: 0.33 Log IU/mL). A single sample quantified more than three standard deviations higher in Aptima than in CAPCTMv2. Only minor differences were observed between genotype A (N = 4; average difference -0.01 Log IU/mL), B (N = 8; -0.13 Log IU/mL), C (N = 8; -0.31 Log IU/mL), D (N = 25; -0.22 Log IU/mL), and E (N = 7; -0.03 Log IU/mL). Deming regression showed that the two tests were excellently correlated (slope of the regression line 1.03; 95% CI: 0.998-1.068). Linearity of the tests was evaluated on dilution series and showed an excellent correlation of the two tests. Both tests were precise with %CV less than 3% for HBV DNA ≥3 Log IU/mL. The Aptima and CAPCTMv2 tests are highly correlated, and both tests are useful for monitoring patients chronically infected with HBV. Copyright © 2018 Elsevier B.V. All rights reserved.
Huang, Jian; Zhang, Cun-Hui
2013-01-01
The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted ℓ1-penalized estimator in sparse, high-dimensional settings where the number of predictors p can be much larger than the sample size n. Adaptive Lasso is considered as a special case. A multistage method is developed to approximate concave regularized estimation by applying an adaptive Lasso recursively. We provide prediction and estimation oracle inequalities for single- and multi-stage estimators, a general selection consistency theorem, and an upper bound for the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results. PMID:24348100
Al-Chalabi, Ammar; Calvo, Andrea; Chio, Adriano; Colville, Shuna; Ellis, Cathy M; Hardiman, Orla; Heverin, Mark; Howard, Robin S; Huisman, Mark H B; Keren, Noa; Leigh, P Nigel; Mazzini, Letizia; Mora, Gabriele; Orrell, Richard W; Rooney, James; Scott, Kirsten M; Scotton, William J; Seelen, Meinie; Shaw, Christopher E; Sidle, Katie S; Swingler, Robert; Tsuda, Miho; Veldink, Jan H; Visser, Anne E; van den Berg, Leonard H; Pearce, Neil
2014-11-01
Amyotrophic lateral sclerosis shares characteristics with some cancers, such as onset being more common in later life, progression usually being rapid, the disease affecting a particular cell type, and showing complex inheritance. We used a model originally applied to cancer epidemiology to investigate the hypothesis that amyotrophic lateral sclerosis is a multistep process. We generated incidence data by age and sex from amyotrophic lateral sclerosis population registers in Ireland (registration dates 1995-2012), the Netherlands (2006-12), Italy (1995-2004), Scotland (1989-98), and England (2002-09), and calculated age and sex-adjusted incidences for each register. We regressed the log of age-specific incidence against the log of age with least squares regression. We did the analyses within each register, and also did a combined analysis, adjusting for register. We identified 6274 cases of amyotrophic lateral sclerosis from a catchment population of about 34 million people. We noted a linear relationship between log incidence and log age in all five registers: England r(2)=0·95, Ireland r(2)=0·99, Italy r(2)=0·95, the Netherlands r(2)=0·99, and Scotland r(2)=0·97; overall r(2)=0·99. All five registers gave similar estimates of the linear slope ranging from 4·5 to 5·1, with overlapping confidence intervals. The combination of all five registers gave an overall slope of 4·8 (95% CI 4·5-5·0), with similar estimates for men (4·6, 4·3-4·9) and women (5·0, 4·5-5·5). A linear relationship between the log incidence and log age of onset of amyotrophic lateral sclerosis is consistent with a multistage model of disease. The slope estimate suggests that amyotrophic lateral sclerosis is a six-step process. Identification of these steps could lead to preventive and therapeutic avenues. 
UK Medical Research Council; UK Economic and Social Research Council; Ireland Health Research Board; The Netherlands Organisation for Health Research and Development (ZonMw); the Ministry of Health and Ministry of Education, University, and Research in Italy; the Motor Neurone Disease Association of England, Wales, and Northern Ireland; and the European Commission (Seventh Framework Programme). Copyright © 2014 Elsevier Ltd. All rights reserved.
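The multistage analysis above regresses log incidence on log age; under a k-step model, incidence scales as age^(k-1), so the fitted slope estimates k - 1. A sketch with synthetic incidence data generated from a six-step process (the rate constant is invented; only the scaling matters):

```python
import math

# Synthetic age-specific incidence from a k-step multistage model:
# incidence proportional to age**(k - 1), here with k = 6 steps.
ages = [45, 50, 55, 60, 65, 70]
k_steps = 6
incidence = [1e-9 * age ** (k_steps - 1) for age in ages]  # arbitrary constant

# Least-squares slope of log(incidence) against log(age).
xs = [math.log(a) for a in ages]
ys = [math.log(i) for i in incidence]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))

estimated_steps = slope + 1  # slope estimates k - 1
```

The study's observed slope of 4.8 (95% CI 4.5-5.0) corresponds, by the same arithmetic, to roughly six steps.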
Detecting trends in raptor counts: power and type I error rates of various statistical tests
Hatfield, J.S.; Gould, W.R.; Hoover, B.A.; Fuller, M.R.; Lindquist, E.L.
1996-01-01
We conducted simulations that estimated power and type I error rates of statistical tests for detecting trends in raptor population count data collected from a single monitoring site. Results of the simulations were used to help analyze count data of bald eagles (Haliaeetus leucocephalus) from 7 national forests in Michigan, Minnesota, and Wisconsin during 1980-1989. Seven statistical tests were evaluated, including simple linear regression on the log scale and linear regression with a permutation test. Using 1,000 replications each, we simulated n = 10 and n = 50 years of count data and trends ranging from -5 to 5% change/year. We evaluated the tests at 3 critical levels (alpha = 0.01, 0.05, and 0.10) for both upper- and lower-tailed tests. Exponential count data were simulated by adding sampling error with a coefficient of variation of 40% from either a log-normal or autocorrelated log-normal distribution. Not surprisingly, tests performed with 50 years of data were much more powerful than tests with 10 years of data. Positive autocorrelation inflated alpha-levels upward from their nominal levels, making the tests less conservative and more likely to reject the null hypothesis of no trend. Of the tests studied, Cox and Stuart's test and Pollard's test clearly had lower power than the others. Surprisingly, the linear regression t-test, Collins' linear regression permutation test, and the nonparametric Lehmann's and Mann's tests all had similar power in our simulations. Analyses of the count data suggested that bald eagles had increasing trends on at least 2 of the 7 national forests during 1980-1989.
A Tutorial on Multilevel Survival Analysis: Methods, Models and Applications.
Austin, Peter C
2017-08-01
Data that have a multilevel structure occur frequently across a range of disciplines, including epidemiology, health services research, public health, education and sociology. We describe three families of regression models for the analysis of multilevel survival data. First, Cox proportional hazards models with mixed effects incorporate cluster-specific random effects that modify the baseline hazard function. Second, piecewise exponential survival models partition the duration of follow-up into mutually exclusive intervals and fit a model that assumes that the hazard function is constant within each interval. This is equivalent to a Poisson regression model that incorporates the duration of exposure within each interval. By incorporating cluster-specific random effects, generalised linear mixed models can be used to analyse these data. Third, after partitioning the duration of follow-up into mutually exclusive intervals, one can use discrete time survival models that use a complementary log-log generalised linear model to model the occurrence of the outcome of interest within each interval. Random effects can be incorporated to account for within-cluster homogeneity in outcomes. We illustrate the application of these methods using data consisting of patients hospitalised with a heart attack. We illustrate the application of these methods using three statistical programming languages (R, SAS and Stata).
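The equivalence invoked above for piecewise exponential models can be made concrete: with a log(person-time) offset, the intercept-only Poisson MLE for an interval is simply events divided by person-time, i.e. the constant hazard for that interval. The interval boundaries and counts below are invented for illustration:

```python
import math

# Hypothetical follow-up data, with the duration of follow-up partitioned
# into two mutually exclusive intervals.
intervals = {
    "0-30 days":  {"events": 12, "person_time": 480.0},
    "31-90 days": {"events": 7,  "person_time": 1350.0},
}

hazards = {}
for name, d in intervals.items():
    # Intercept-only Poisson model with offset log(person_time):
    # the MLE satisfies log(rate) = log(events) - log(person_time).
    log_rate = math.log(d["events"]) - math.log(d["person_time"])
    hazards[name] = math.exp(log_rate)
# The hazard is assumed constant within each interval but may differ
# between intervals -- the piecewise exponential assumption.
```

Cluster-specific random effects would enter this same Poisson formulation as additive terms on the log-rate scale, which is why generalised linear mixed model software can fit the multilevel version.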
An empirical model for estimating annual consumption by freshwater fish populations
Liao, H.; Pierce, C.L.; Larscheid, J.G.
2005-01-01
Population consumption is an important process linking predator populations to their prey resources. Simple tools are needed to enable fisheries managers to estimate population consumption. We assembled 74 individual estimates of annual consumption by freshwater fish populations and their mean annual population size, 41 of which also included estimates of mean annual biomass. The data set included 14 freshwater fish species from 10 different bodies of water. From this data set we developed two simple linear regression models predicting annual population consumption. Log-transformed population size explained 94% of the variation in log-transformed annual population consumption. Log-transformed biomass explained 98% of the variation in log-transformed annual population consumption. We quantified the accuracy of our regressions and three alternative consumption models as the mean percent difference from observed (bioenergetics-derived) estimates in a test data set. Predictions from our population-size regression matched observed consumption estimates poorly (mean percent difference = 222%). Predictions from our biomass regression matched observed consumption reasonably well (mean percent difference = 24%). The biomass regression was superior to an alternative model, similar in complexity, and comparable to two alternative models that were more complex and difficult to apply. Our biomass regression model, log10(consumption) = 0.5442 + 0.9962 × log10(biomass), will be a useful tool for fishery managers, enabling them to make reasonably accurate annual population consumption predictions from mean annual biomass estimates. © Copyright by the American Fisheries Society 2005.
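The fitted biomass regression quoted above can be applied directly. The sketch below uses the published coefficients; the biomass value is chosen arbitrarily for illustration (units follow the original data set):

```python
import math

# Published model: log10(consumption) = 0.5442 + 0.9962 * log10(biomass)
def annual_consumption(mean_annual_biomass):
    """Predict annual population consumption from mean annual biomass."""
    return 10 ** (0.5442 + 0.9962 * math.log10(mean_annual_biomass))

c100 = annual_consumption(100.0)
c200 = annual_consumption(200.0)
# The near-unit slope means consumption scales almost proportionally with
# biomass: doubling biomass multiplies predicted consumption by 2**0.9962.
ratio = c200 / c100
```

Because the slope is so close to 1, the model behaves almost like a fixed consumption-to-biomass ratio, which is part of why it is easy for managers to apply.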
Cook, James P; Mahajan, Anubha; Morris, Andrew P
2017-02-01
Linear mixed models are increasingly used for the analysis of genome-wide association studies (GWAS) of binary phenotypes because they can efficiently and robustly account for population stratification and relatedness through inclusion of random effects for a genetic relationship matrix. However, the utility of linear (mixed) models in the context of meta-analysis of GWAS of binary phenotypes has not been previously explored. In this investigation, we present simulations to compare the performance of linear and logistic regression models under alternative weighting schemes in a fixed-effects meta-analysis framework, considering designs that incorporate variable case-control imbalance, confounding factors and population stratification. Our results demonstrate that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case-control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale. Our conclusions thus provide essential recommendations for the development of robust protocols for meta-analysis of binary phenotypes with linear models.
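The two weighting schemes recommended above can be sketched as follows; the per-study Z-scores, effect sizes, standard errors, and effective sample sizes are invented placeholders:

```python
import math

# Scheme (i): effective-sample-size weighting of Z-scores.
# For a case-control study, n_eff is often 4 / (1/cases + 1/controls).
studies_z = [
    {"z": 2.1, "n_eff": 900.0},  # invented summaries
    {"z": 1.4, "n_eff": 400.0},
]
w = [math.sqrt(s["n_eff"]) for s in studies_z]
z_meta = (sum(wi * s["z"] for wi, s in zip(w, studies_z))
          / math.sqrt(sum(wi ** 2 for wi in w)))

# Scheme (ii): inverse-variance weighting of allelic effect sizes after
# conversion onto the log-odds scale.
studies_b = [
    {"beta": 0.20, "se": 0.08},  # log-odds-scale effect and standard error
    {"beta": 0.10, "se": 0.12},
]
iv = [1.0 / s["se"] ** 2 for s in studies_b]
beta_meta = sum(v * s["beta"] for v, s in zip(iv, studies_b)) / sum(iv)
se_meta = math.sqrt(1.0 / sum(iv))
```

Scheme (ii) requires the linear-model effects to be converted to the log-odds scale first; applied to raw linear-model coefficients under case-control imbalance, the inverse-variance weights would be miscalibrated, which is the failure mode the simulations guard against.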
Kapke, G E; Watson, G; Sheffler, S; Hunt, D; Frederick, C
1997-01-01
Several assays for quantification of DNA have been developed and are currently used in research and clinical laboratories. However, comparison of assay results has been difficult owing to the use of different standards and units of measurement as well as differences between assays in dynamic range and quantification limits. Although a few studies have compared results generated by different assays, there has been no consensus on conversion factors, and thorough analysis has been precluded by small sample sizes and the limited dynamic range studied. In this study, we have compared the Chiron branched DNA (bDNA) and Abbott liquid hybridization assays for quantification of hepatitis B virus (HBV) DNA in clinical specimens and have derived conversion factors to facilitate comparison of assay results. Additivity and variance stabilizing (AVAS) regression, a form of non-linear regression analysis, was performed on assay results for specimens from HBV clinical trials. Our results show that there is a strong linear relationship (R2 = 0.96) between log Chiron and log Abbott assay results. Conversion factors derived from regression analyses were found to be non-constant and ranged from 6-40. Analysis of paired assay results below and above each assay's limit of quantification (LOQ) indicated that a significantly (P < 0.01) larger proportion of observations were below the Abbott assay LOQ but above the Chiron assay LOQ, indicating that the Chiron assay is significantly more sensitive than the Abbott assay. Testing of replicate specimens showed that the Chiron assay consistently yielded lower per cent coefficients of variation (% CVs) than the Abbott assay, indicating that the Chiron assay provides superior precision.
Speech Data Analysis for Semantic Indexing of Video of Simulated Medical Crises
2015-05-01
scheduled approximately twice per week and are recorded as video data. During each session, the physician/instructor must manually review and annotate... spectrum, y, using regression line: y = ln(1 + Jx), (2.3) where x is the auditory power spectral amplitude and J is a signal-dependent positive constant... The amplitude-warping transform is linear-like for J ≪ 1 and logarithmic-like for J ≫ 1. 3. RASTA filtering: reintegrate the log critical-band
A hierarchical model for estimating change in American Woodcock populations
Sauer, J.R.; Link, W.A.; Kendall, W.L.; Kelley, J.R.; Niven, D.K.
2008-01-01
The Singing-Ground Survey (SGS) is a primary source of information on population change for American woodcock (Scolopax minor). We analyzed the SGS using a hierarchical log-linear model and compared the estimates of change and annual indices of abundance to a route regression analysis of SGS data. We also grouped SGS routes into Bird Conservation Regions (BCRs) and estimated population change and annual indices using BCRs within states and provinces as strata. Based on the hierarchical model-based estimates, we concluded that woodcock populations were declining in North America between 1968 and 2006 (trend = -0.9%/yr, 95% credible interval: -1.2, -0.5). Singing-Ground Survey results are generally similar between analytical approaches, but the hierarchical model has several important advantages over the route regression. Hierarchical models better accommodate changes in survey efficiency over time and space by treating strata, years, and observers as random effects in the context of a log-linear model, providing trend estimates that are derived directly from the annual indices. We also conducted a hierarchical model analysis of woodcock data from the Christmas Bird Count and the North American Breeding Bird Survey. All surveys showed general consistency in patterns of population change, but the SGS had the shortest credible intervals. We suggest that population management and conservation planning for woodcock involving interpretation of the SGS use estimates provided by the hierarchical model.
Serum Spot 14 concentration is negatively associated with thyroid-stimulating hormone level
Chen, Yen-Ting; Tseng, Fen-Yu; Chen, Pei-Lung; Chi, Yu-Chao; Han, Der-Sheng; Yang, Wei-Shiung
2016-01-01
Abstract Spot 14 (S14) is a protein involved in fatty acid synthesis and was shown to be induced by thyroid hormone in rat liver. However, the presence of S14 in human serum and its relations with thyroid function status have not been investigated. The objectives of this study were to compare serum S14 concentrations in patients with hyperthyroidism or euthyroidism and to evaluate the associations between serum S14 and free thyroxine (fT4) or thyroid-stimulating hormone (TSH) levels. We set up an immunoassay for human serum S14 concentrations and compared its levels between hyperthyroid and euthyroid subjects. Twenty-six hyperthyroid patients and 29 euthyroid individuals were recruited. Data of all patients were pooled for the analysis of the associations between the levels of S14 and fT4, TSH, or quartile of TSH. The hyperthyroid patients had significantly higher serum S14 levels than the euthyroid subjects (median [Q1, Q3]: 975 [669, 1612] ng/mL vs 436 [347, 638] ng/mL, P < 0.001). In univariate linear regression, the log-transformed S14 level (logS14) was positively associated with fT4 but negatively associated with creatinine (Cre), total cholesterol (T-C), triglyceride (TG), low-density lipoprotein cholesterol (LDL-C), and TSH. The positive associations between logS14 and fT4 and the negative associations between logS14 and Cre, TG, T-C, or TSH remained significant after adjustment with sex and age. These associations were prominent in females but not in males. The logS14 levels were negatively associated with the TSH levels grouped by quartile (β = −0.3020, P < 0.001). The association between logS14 and TSH quartile persisted after adjustment with sex and age (β = −0.2828, P = 0.001). In stepwise multivariate regression analysis, only TSH grouped by quartile remained significantly associated with logS14 level. We developed an ELISA to measure serum S14 levels in humans. 
Female patients with hyperthyroidism had higher serum S14 levels than the female subjects with euthyroidism. The serum logS14 concentrations were negatively associated with TSH levels. Changes of serum S14 level in the whole thyroid function spectrum deserve further investigation. PMID:27749565
The allometry of coarse root biomass: log-transformed linear regression or nonlinear regression?
Lai, Jiangshan; Yang, Bo; Lin, Dunmei; Kerkhoff, Andrew J; Ma, Keping
2013-01-01
Precise estimation of root biomass is important for understanding carbon stocks and dynamics in forests. Traditionally, biomass estimates are based on allometric scaling relationships between stem diameter and coarse root biomass calculated using linear regression (LR) on log-transformed data. Recently, it has been suggested that nonlinear regression (NLR) is a preferable fitting method for scaling relationships. But while this claim has been contested on both theoretical and empirical grounds, and statistical methods have been developed to aid in choosing between the two methods in particular cases, few studies have examined the ramifications of erroneously applying NLR. Here, we use direct measurements of 159 trees belonging to three locally dominant species in east China to compare the LR and NLR models of diameter-root biomass allometry. We then contrast model predictions by estimating stand coarse root biomass based on census data from the nearby 24-ha Gutianshan forest plot and by testing the ability of the models to predict known root biomass values measured on multiple tropical species at the Pasoh Forest Reserve in Malaysia. Based on likelihood estimates for model error distributions, as well as the accuracy of extrapolative predictions, we find that LR on log-transformed data is superior to NLR for fitting diameter-root biomass scaling models. More importantly, inappropriately using NLR leads to grossly inaccurate stand biomass estimates, especially for stands dominated by smaller trees.
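The LR-on-log-transformed-data approach favored above amounts to ordinary least squares on (log diameter, log biomass) pairs; NLR would instead minimize squared error on the arithmetic scale, implying additive rather than multiplicative error. A sketch of the LR fit, with invented tree measurements:

```python
import math

# Invented measurements, illustrative only: stem diameter (cm) and
# coarse root biomass (kg), roughly following a power law M = a * D**b.
diameters = [5.0, 10.0, 20.0, 40.0]
root_biomass = [0.8, 3.0, 12.5, 48.0]

# OLS on log-transformed data: log(M) = log(a) + b * log(D).
xs = [math.log(d) for d in diameters]
ys = [math.log(m) for m in root_biomass]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
     / sum((x - mx) ** 2 for x in xs))
log_a = my - b * mx

def predict(d):
    """Back-transformed allometric prediction (no bias correction applied)."""
    return math.exp(log_a) * d ** b
```

Note the back-transformation: exponentiating the fitted log-scale mean yields a median-type prediction; whether a log-normal bias correction should be applied is a separate modeling choice from the LR-versus-NLR question the paper addresses.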
Rosenblum, Michael; van der Laan, Mark J.
2010-01-01
Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636
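The special case stated above can be illustrated by simulation (the rates, covariate, and sample size are illustrative choices, not from the paper): a main-terms Poisson working model is fit by Newton-Raphson even though the true rates are not log-linear, and the treatment coefficient is compared with the true marginal log rate ratio:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000
T = rng.integers(0, 2, n)        # randomized treatment assignment
W = rng.uniform(0.0, 1.0, n)     # baseline covariate, independent of T

# True conditional rates are deliberately NOT log-linear in (T, W),
# so the main-terms working model below is misspecified
rate = np.where(T == 1, np.exp(1.0 + W**2), np.exp(W))
Y = rng.poisson(rate)

# Main-terms Poisson working model log(mu) = b0 + b1*T + b2*W (Newton-Raphson)
X = np.column_stack([np.ones(n), T, W])
beta = np.zeros(3)
for _ in range(25):
    mu = np.exp(X @ beta)
    beta = beta + np.linalg.solve((X.T * mu) @ X, X.T @ (Y - mu))

# True marginal log rate ratio log(E[Y|T=1] / E[Y|T=0]) by fine-grid averaging
w = np.linspace(0.0, 1.0, 100001)
true_log_rr = np.log(np.mean(np.exp(1.0 + w**2)) / np.mean(np.exp(w)))
```

The treatment coefficient `beta[1]` lands close to `true_log_rr` despite the misspecification, consistent with the asymptotic-unbiasedness claim; the score equations force each arm's fitted mean to match its observed mean, and randomization balances the covariate across arms.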
Rothenberg, Stephen J; Rothenberg, Jesse C
2005-09-01
Statistical evaluation of the dose-response function in lead epidemiology is rarely attempted. Economic evaluation of health benefits of lead reduction usually assumes a linear dose-response function, regardless of the outcome measure used. We reanalyzed a previously published study, an international pooled data set combining data from seven prospective lead studies examining the contemporaneous blood lead effect on IQ (intelligence quotient) of 7-year-old children (n = 1,333). We constructed alternative linear multiple regression models with linear blood lead terms (linear-linear dose response) and natural-log-transformed blood lead terms (log-linear dose response). We tested the two lead specifications for nonlinearity in the models, compared the two lead specifications for significantly better fit to the data, and examined the effects of possible residual confounding on the functional form of the dose-response relationship. We found that a log-linear lead-IQ relationship was a significantly better fit than was a linear-linear relationship for IQ (p = 0.009), with little evidence of residual confounding of included model variables. We substituted the log-linear lead-IQ effect in a previously published health benefits model and found that the economic savings due to the U.S. population lead decrease between 1976 and 1999 (from 17.1 µg/dL to 2.0 µg/dL) was 2.2 times ($319 billion) that calculated using a linear-linear dose-response function ($149 billion). The Centers for Disease Control and Prevention action limit of 10 µg/dL for children fails to protect against most damage and economic cost attributable to lead exposure.
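The linear-linear versus log-linear comparison can be illustrated on synthetic data (the slope, noise level, and lead range below are hypothetical, not the pooled-study estimates): when the truth is log-linear, the log-linear specification attains the smaller residual sum of squares:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1333
pb = rng.uniform(1.0, 25.0, n)    # blood lead (synthetic)

# Hypothetical log-linear truth: IQ declines with log(blood lead)
iq = 100.0 - 5.0 * np.log(pb) + rng.normal(0.0, 5.0, n)

def sse(x, y):
    """Residual sum of squares from a simple linear fit of y on x."""
    coef = np.polyfit(x, y, 1)
    return np.sum((y - np.polyval(coef, x))**2)

sse_linear = sse(pb, iq)             # linear-linear dose response
sse_loglinear = sse(np.log(pb), iq)  # log-linear dose response
```

A log-linear dose response also implies that the same absolute blood-lead reduction buys more IQ at the low end of the exposure range, which is the mechanism behind the larger economic-benefit estimate reported above.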
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Yangho; Lee, Byung-Kook, E-mail: bklee@sch.ac.kr
Introduction: The objective of this study was to evaluate associations between blood lead, cadmium, and mercury levels with estimated glomerular filtration rate in a general population of South Korean adults. Methods: This was a cross-sectional study based on data obtained in the Korean National Health and Nutrition Examination Survey (KNHANES) (2008-2010). The final analytical sample consisted of 5924 participants. Estimated glomerular filtration rate (eGFR) was calculated using the MDRD Study equation as an indicator of glomerular function. Results: In multiple linear regression analysis of log2-transformed blood lead as a continuous variable on eGFR, after adjusting for covariates including cadmium and mercury, the difference in eGFR levels associated with a doubling of blood lead was -2.624 mL/min per 1.73 m² (95% CI: -3.803 to -1.445). In multiple linear regression analysis using quartiles of blood lead as the independent variable, the difference in eGFR levels comparing participants in the highest versus the lowest quartiles of blood lead was -3.835 mL/min per 1.73 m² (95% CI: -5.730 to -1.939). In a multiple linear regression analysis using blood cadmium and mercury, as continuous or categorical variables, as independent variables, neither metal was a significant predictor of eGFR. Odds ratios (ORs) and 95% CI values for reduced eGFR calculated for log2-transformed blood metals and quartiles of the three metals showed similar trends after adjustment for covariates. Discussion: In this large, representative sample of South Korean adults, elevated blood lead level was consistently associated with lower eGFR levels and with the prevalence of reduced eGFR, even at blood lead levels below 10 µg/dL. In conclusion, elevated blood lead level was associated with lower eGFR in the Korean general population, supporting the role of lead as a risk factor for chronic kidney disease.
Three-parameter modeling of the soil sorption of acetanilide and triazine herbicide derivatives.
Freitas, Mirlaine R; Matias, Stella V B G; Macedo, Renato L G; Freitas, Matheus P; Venturin, Nelson
2014-02-01
Herbicides vary widely in toxicity, and many of them are persistent soil contaminants. The acetanilide and triazine families of herbicides are in widespread use, but interest in developing new herbicides that are more effective and less environmentally hazardous has been rising. The environmental risk of new herbicides can be assessed by estimating their soil sorption (logKoc), which is usually correlated with the octanol/water partition coefficient (logKow). However, earlier findings have shown that this correlation is not valid for some acetanilide and triazine herbicides. Thus, easily accessible quantitative structure-property relationship models are required to predict logKoc for analogues of these compounds. The octanol/water partition coefficient, molecular weight, and molecular volume were calculated and then regressed against logKoc for two series of acetanilide and triazine herbicides using multiple linear regression, resulting in predictive and validated models.
Log-normal frailty models fitted as Poisson generalized linear mixed models.
Hirsch, Katharina; Wienke, Andreas; Kuss, Oliver
2016-12-01
The equivalence of a survival model with a piecewise constant baseline hazard function and a Poisson regression model has been known for decades. As shown in recent studies, this equivalence carries over to clustered survival data: a frailty model with a log-normal frailty term can be interpreted and estimated as a generalized linear mixed model with a binary response, a Poisson likelihood, and a specific offset. Proceeding this way, statistical theory and software for generalized linear mixed models are readily available for fitting frailty models. This gain in flexibility comes at the small price of (1) having to fix the number of pieces for the baseline hazard in advance and (2) having to "explode" the data set by the number of pieces. In this paper we extend the simulations of former studies by using a more realistic baseline hazard (Gompertz) and by comparing the model under consideration with competing models. Furthermore, the SAS macro %PCFrailty is introduced to apply the Poisson generalized linear mixed approach to frailty models. The simulations show good results for the shared frailty model. Our new %PCFrailty macro provides proper estimates, especially in the case of 4 events per piece. The suggested Poisson generalized linear mixed approach for log-normal frailty models based on the %PCFrailty macro provides several advantages in the analysis of clustered survival data with respect to more flexible modelling of fixed and random effects, exact (in the sense of non-approximate) maximum likelihood estimation, and standard errors and different types of confidence intervals for all variance parameters. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
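The "explode the data set" step can be sketched on a toy survival sample (follow-up times and cut points are invented for illustration). With only piece indicators in the model, the Poisson MLE with a log(exposure) offset reduces to events divided by person-time within each piece:

```python
import numpy as np

# Toy survival data: follow-up time and event indicator per subject
t = np.array([2.3, 5.1, 1.2, 7.8, 3.3, 6.6, 0.9, 4.4])
d = np.array([1, 0, 1, 1, 0, 1, 1, 0])
cuts = [0.0, 2.0, 4.0, 8.0]      # piece boundaries for the baseline hazard

# "Explode" the data: one record per subject per piece entered, carrying
# the exposure time in the piece and whether the event fell in it
rows = []
for ti, di in zip(t, d):
    for j in range(len(cuts) - 1):
        lo, hi = cuts[j], cuts[j + 1]
        if ti <= lo:
            break
        rows.append((j, min(ti, hi) - lo, int(bool(di) and lo < ti <= hi)))

# With only piece indicators, the Poisson MLE with a log(exposure) offset
# has the closed form events / person-time within each piece
events = np.zeros(len(cuts) - 1)
persontime = np.zeros(len(cuts) - 1)
for j, expo, ev in rows:
    events[j] += ev
    persontime[j] += expo
hazard = events / persontime
```

With covariates or frailty terms added, the exploded records would instead be passed to a Poisson GLM(M) with `log(exposure)` as the offset, which is the approach the abstract describes.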
TENSOR DECOMPOSITIONS AND SPARSE LOG-LINEAR MODELS
Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.
2017-01-01
Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions. PMID:29332971
van Os-Medendorp, Harmieke; van Leent-de Wit, Ilse; de Bruin-Weller, Marjolein; Knulst, André
2015-05-23
Two online self-management programs for patients with atopic dermatitis (AD) or food allergy (FA) were developed with the aim of helping patients cope with their condition, follow the prescribed treatment regimen, and deal with the consequences of their illness in daily life. Both programs consist of several modules containing information, personal stories by fellow patients, videos, and exercises with feedback. Health care professionals can refer their patients to the programs. However, the use of the programs in daily practice is unknown. The aim of this study was to explore the use and characteristics of users of the online self-management programs "Living with eczema" and "Living with food allergy," and to investigate factors related to the use of the programs. A cross-sectional design was used in which the outcome parameters were the number of log-ins by patients, the number of hits on the system's core features, disease severity, quality of life, and domains of self-management. Descriptive statistics were used to summarize sample characteristics and to describe the number of log-ins and hits per module and per functionality. Correlation and regression analyses were used to explore the relation between the number of log-ins and patient characteristics. Since the start, 299 adult patients have been referred to the online AD program; 173 logged in at least once. Data from 75 AD patients were available for analyses. The mean number of log-ins was 3.1 (range 1-11). Linear regression with the number of log-ins as the dependent variable showed that age and quality of life contributed most to the model, with betas of .35 (P=.002) and .26 (P=.05), respectively, and an R² of .23. Two hundred fourteen adult FA patients were referred to the online FA training; 124 logged in at least once, and data from 45 patients were available for analysis. The mean number of log-ins was 3.0 (range 1-11).
Linear regression with the number of log-ins as the dependent variable revealed that adding the self-management domain "social integration and support" to the model led to an R² of .13. The modules with information about the disease, diagnosis, and treatment were most visited. Most hits were on the information parts of the modules (55-58%), followed by exercises (30-32%). The online self-management programs "Living with eczema" and "Living with food allergy" were used by patients in addition to usual face-to-face care. Almost 60% of all referred patients logged in, with an average of three log-ins. All modules seemed to be relevant, but there is room for improvement in the use of the training. Age, quality of life, and lower social integration and support were related to the use of the training, but only part of the variance in use could be explained by these variables.
Schalasta, Gunnar; Börner, Anna; Speicher, Andrea; Enders, Martin
2018-03-28
Proper management of patients with chronic hepatitis B virus (HBV) infection requires monitoring of plasma or serum HBV DNA levels using a highly sensitive nucleic acid amplification test. Because commercially available assays differ in performance, we compared herein the performance of the Hologic Aptima HBV Quant assay (Aptima) to that of the Roche Cobas TaqMan HBV test for use with the high pure system (HPS/CTM). Assay performance was assessed using HBV reference panels as well as plasma and serum samples from chronically HBV-infected patients. Method correlation, analytical sensitivity, precision/reproducibility, linearity, bias and influence of genotype were evaluated. Data analysis was performed using linear regression, Deming correlation analysis and Bland-Altman analysis. Agreement between the assays for the two reference panels was good, with a difference in assay values vs. target <0.5 log. Qualitative assay results for 159 clinical samples showed good concordance (88.1%; κ=0.75; 95% confidence interval: 0.651-0.845). For the 106 samples quantitated by both assays, viral load results were highly correlated (R=0.92) and differed on average by 0.09 log, with 95.3% of the samples being within the 95% limit of agreement of the assays. Linearity for viral loads 1-7 log was excellent for both assays (R2>0.98). The two assays had similar bias and precision across the different genotypes tested at low viral loads (25-1000 IU/mL). Aptima has a performance comparable with that of HPS/CTM, making it suitable for use for HBV infection monitoring. Aptima runs on a fully automated platform (the Panther system) and therefore offers a significantly improved workflow compared with HPS/CTM.
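The Bland-Altman part of such a method comparison can be sketched as follows; the synthetic paired log viral loads below merely mimic the reported summary (bias near 0.09 log, most points within the 95% limits of agreement) and are not the study data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic paired log10 viral loads for two assays; the 0.09-log offset
# mimics the reported mean difference, but the data are invented
assay1 = rng.uniform(1.0, 7.0, 106)
assay2 = assay1 + rng.normal(0.09, 0.25, assay1.size)

diff = assay2 - assay1
bias = diff.mean()                          # Bland-Altman bias
sd = diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # 95% limits of agreement
within = np.mean((diff >= loa[0]) & (diff <= loa[1]))
```

In practice the differences are also plotted against the pairwise means to check that the bias is constant across the measuring range.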
Using Log Linear Analysis for Categorical Family Variables.
ERIC Educational Resources Information Center
Moen, Phyllis
The Goodman technique of log linear analysis is ideal for family research, because it is designed for categorical (non-quantitative) variables. Variables are dichotomized (for example, married/divorced, childless/with children) or otherwise categorized (for example, level of permissiveness, life cycle stage). Contingency tables are then…
Reimus, Paul W; Callahan, Timothy J; Ware, S Doug; Haga, Marc J; Counce, Dale A
2007-08-15
Diffusion cell experiments were conducted to measure nonsorbing solute matrix diffusion coefficients in forty-seven different volcanic rock matrix samples from eight different locations (with multiple depth intervals represented at several locations) at the Nevada Test Site. The solutes used in the experiments included bromide, iodide, pentafluorobenzoate (PFBA), and tritiated water (³HHO). The porosity and saturated permeability of most of the diffusion cell samples were measured to evaluate the correlation of these two variables with tracer matrix diffusion coefficients divided by the free-water diffusion coefficient (Dm/D*). To investigate the influence of fracture coating minerals on matrix diffusion, ten of the diffusion cells represented paired samples from the same depth interval in which one sample contained a fracture surface with mineral coatings and the other sample consisted of only pure matrix. The log of (Dm/D*) was found to be positively correlated with both the matrix porosity and the log of matrix permeability. A multiple linear regression analysis indicated that both parameters contributed significantly to the regression at the 95% confidence level. However, the log of the matrix diffusion coefficient was more highly correlated with the log of matrix permeability than with matrix porosity, which suggests that matrix diffusion coefficients, like matrix permeabilities, have a greater dependence on the interconnectedness of matrix porosity than on the matrix porosity itself. The regression equation for the volcanic rocks was found to provide satisfactory predictions of log(Dm/D*) for other types of rocks with similar ranges of matrix porosity and permeability as the volcanic rocks, but it did a poorer job predicting log(Dm/D*) for rocks with lower porosities and/or permeabilities. The presence of mineral coatings on fracture walls did not appear to have a significant effect on matrix diffusion in the ten paired diffusion cell experiments.
Asquith, William H.; Thompson, David B.
2008-01-01
The U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, investigated a refinement of the regional regression method and developed alternative equations for estimation of peak-streamflow frequency for undeveloped watersheds in Texas. A common model for estimation of peak-streamflow frequency is based on the regional regression method. The current (2008) regional regression equations for 11 regions of Texas are based on log10 transformations of all regression variables (drainage area, main-channel slope, and watershed shape). Exclusive use of the log10 transformation does not fully linearize the relations between the variables. As a result, some systematic bias remains in the current equations. The bias results in overestimation of peak streamflow for both the smallest and largest watersheds. The bias increases with increasing recurrence interval. The primary source of the bias is the discernible curvilinear relation in log10 space between peak streamflow and drainage area. Bias is demonstrated by selected residual plots with superimposed LOWESS trend lines. To address the bias, a statistical framework based on minimization of the PRESS statistic through power transformation of drainage area is described and implemented, and the resulting regression equations are reported. Compared to the log10-exclusive equations, the equations derived from PRESS minimization have smaller PRESS statistics and residual standard errors. Selected residual plots for the PRESS-minimized equations are presented to demonstrate that systematic bias in regional regression equations for peak-streamflow frequency estimation in Texas can be reduced. Because the overall error is similar to the error associated with previous equations and because the bias is reduced, the PRESS-minimized equations reported here provide alternative equations for peak-streamflow frequency estimation.
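PRESS minimization through power transformation of a predictor can be sketched on synthetic data (the exponent, noise level, and search grid below are illustrative, not the Texas values). The PRESS statistic is computed from leave-one-out prediction residuals via the hat matrix, and the power giving the smallest PRESS is compared with an exclusively log10-transformed model:

```python
import numpy as np

def press(X, y):
    """PRESS statistic: sum of squared leave-one-out prediction residuals."""
    H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix
    e = y - H @ y                            # ordinary residuals
    return float(np.sum((e / (1.0 - np.diag(H)))**2))

rng = np.random.default_rng(4)
A = 10**rng.uniform(0, 4, 120)               # "drainage area" (synthetic)
y = 2.0 + 0.8 * A**0.25 + rng.normal(0.0, 0.2, A.size)  # true power 0.25

# Search the power transformation of the predictor that minimizes PRESS
lams = np.linspace(0.05, 1.0, 20)
scores = [press(np.column_stack([np.ones(A.size), A**lam]), y) for lam in lams]
best_lam = lams[int(np.argmin(scores))]

# Exclusively log10-transformed alternative, for comparison
press_log10 = press(np.column_stack([np.ones(A.size), np.log10(A)]), y)
```

Because each PRESS term is a deleted residual, minimizing it rewards transformations that predict well out of sample rather than merely fitting the calibration data.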
ELASTIC NET FOR COX'S PROPORTIONAL HAZARDS MODEL WITH A SOLUTION PATH ALGORITHM.
Wu, Yichao
2012-01-01
For least squares regression, Efron et al. (2004) proposed an efficient solution path algorithm, the least angle regression (LAR). They showed that a slight modification of the LAR leads to the whole LASSO solution path. Both the LAR and LASSO solution paths are piecewise linear. Recently Wu (2011) extended the LAR to generalized linear models and the quasi-likelihood method. In this work we extend the LAR further to handle Cox's proportional hazards model. The goal is to develop a solution path algorithm for the elastic net penalty (Zou and Hastie (2005)) in Cox's proportional hazards model. This goal is achieved in two steps. First we extend the LAR to optimizing the log partial likelihood plus a fixed small ridge term. Then we define a path modification, which leads to the solution path of the elastic net regularized log partial likelihood. Our solution path is exact and piecewise determined by ordinary differential equation systems.
Farsa, Oldřich
2013-01-01
The log BB parameter is the logarithm of the ratio of a compound's equilibrium concentrations in the brain tissue versus the blood plasma. This parameter is a useful descriptor in assessing the ability of a compound to permeate the blood-brain barrier. The aim of this study was to develop a Hansch-type linear regression QSAR model that correlates the parameter log BB and the retention time of drugs and other organic compounds on a reversed-phase HPLC containing an embedded amide moiety. The retention time was expressed by the capacity factor log k'. The second aim was to estimate the brain's absorption of 2-(azacycloalkyl)acetamidophenoxyacetic acids, which are analogues of piracetam, nefiracetam, and meclofenoxate. Notably, these acids may be novel nootropics. Two simple regression models that relate log BB and log k' were developed from an assay performed using a reversed-phase HPLC that contained an embedded amide moiety. Both the quadratic and linear models yielded statistical parameters comparable to previously published models of log BB dependence on various structural characteristics. The models predict that four members of the substituted phenoxyacetic acid series have a strong chance of permeating the barrier and being absorbed in the brain. The results of this study show that a reversed-phase HPLC system containing an embedded amide moiety is a functional in vitro surrogate of the blood-brain barrier. These results suggest that racetam-type nootropic drugs containing a carboxylic moiety could be more poorly absorbed than analogues devoid of the carboxyl group, especially if the compounds penetrate the barrier by a simple diffusion mechanism.
Porcaro, Antonio B; Ghimenton, Claudio; Petrozziello, Aldo; Sava, Teodoro; Migliorini, Filippo; Romano, Mario; Caruso, Beatrice; Cocco, Claudio; Antoniolli, Stefano Zecchinini; Lacola, Vincenzo; Rubilotta, Emanuele; Monaco, Carmelo
2012-10-01
To evaluate estradiol (E(2)) physiopathology along the pituitary-testicular-prostate axis at the time of initial diagnosis of prostate cancer (PC) and subsequent cluster selection of the patient population. Records of the diagnosed (n=105) and operated (n=91) patients were retrospectively reviewed. Age, percentage of positive cores at-biopsy (P+), biopsy Gleason score (bGS), E(2), prolactin (PRL), luteinizing hormone (LH), follicle-stimulating hormone (FSH), total testosterone (TT), free-testosterone (FT), prostate-specific antigen (PSA), pathology Gleason score (pGS), estimated tumor volume in relation to percentage of prostate volume (V+), overall prostate weight (Wi), clinical stage (cT), biopsy Gleason pattern (bGP) and pathology stage (pT), were the investigated variables. None of the patients had previously undergone hormonal manipulations. E(2) correlation and prediction by multiple linear regression analysis (MLRA) was performed. At diagnosis, the log E(2)/log bGS ratio clustered the population into groups A (log E(2)/log bGS ≤ 2.25), B (2.25
Weighted SGD for ℓp Regression with Randomized Preconditioning.
Yang, Jiyan; Chow, Yin-Lam; Ré, Christopher; Mahoney, Michael W
2016-01-01
In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. SGD methods are easy to implement and applicable to a wide range of convex optimization problems. In contrast, RLA algorithms provide much stronger performance guarantees but are applicable to a narrower class of problems. We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems, e.g., ℓ2 and ℓ1 regression problems. We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. By rewriting a deterministic ℓp regression problem as a stochastic optimization problem, we connect pwSGD to several existing ℓp solvers, including RLA methods with algorithmic leveraging (RLA for short). We prove that pwSGD inherits faster convergence rates that depend only on the lower dimension of the linear system, while maintaining low computational complexity. Such SGD convergence rates are superior to those of other related SGD algorithms such as the weighted randomized Kaczmarz algorithm. In particular, when solving ℓ1 regression with size n by d, pwSGD returns an approximate solution with ε relative error in the objective value in 𝒪(log n·nnz(A)+poly(d)/ε²) time. This complexity is uniformly better than that of RLA methods in terms of both ε and d when the problem is unconstrained. In the presence of constraints, pwSGD only has to solve a sequence of much simpler and smaller optimization problems over the same constraints.
In general this is more efficient than solving the constrained subproblem required in RLA. For ℓ2 regression, pwSGD returns an approximate solution with ε relative error in the objective value and in the solution vector measured in prediction norm in 𝒪(log n·nnz(A)+poly(d) log(1/ε)/ε) time. We show that for unconstrained ℓ2 regression, this complexity is comparable to that of RLA and is asymptotically better than several state-of-the-art solvers in the regime where the desired accuracy ε, high dimension n, and low dimension d satisfy d ≥ 1/ε and n ≥ d²/ε. We also provide lower bounds on the coreset complexity for more general regression problems, indicating that new ideas will still be needed to extend similar RLA preconditioning ideas to weighted SGD algorithms for more general regression problems. Finally, the effectiveness of such algorithms is illustrated numerically on both synthetic and real datasets, and the results are consistent with our theoretical findings and demonstrate that pwSGD converges to a medium-precision solution, e.g., ε = 10⁻³, more quickly.
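The weighted randomized Kaczmarz iteration that the abstract uses as a baseline can be sketched in a few lines (problem sizes and noise level are arbitrary); this is not the full pwSGD algorithm, which additionally preconditions the system with RLA techniques:

```python
import numpy as np

rng = np.random.default_rng(5)
n, d = 2000, 10
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=n)   # nearly consistent system

# Importance sampling with p_i proportional to ||a_i||^2 (row norms)
p = np.einsum('ij,ij->i', A, A)
p = p / p.sum()

x = np.zeros(d)
for i in rng.choice(n, size=5000, p=p):
    # Project the current iterate onto the i-th sampled equation
    x += ((b[i] - A[i] @ x) / (A[i] @ A[i])) * A[i]

x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
rel_err = np.linalg.norm(x - x_ls) / np.linalg.norm(x_ls)
```

For noisy (inconsistent) systems this iteration converges only to a ball around the least-squares solution, i.e., to medium precision, which is the regime the abstract's comparison addresses.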
Box-Cox transformation of firm size data in statistical analysis
NASA Astrophysics Data System (ADS)
Chen, Ting Ting; Takaishi, Tetsuya
2014-03-01
Firm size data usually do not show the normality that is often assumed in statistical analysis such as regression analysis. In this study we focus on two firm-size variables: the number of employees and sales. Both deviate considerably from a normal distribution. To improve the normality of these data we transform them by the Box-Cox transformation with appropriate parameters. The Box-Cox transformation parameters are determined so that the transformed data best show the kurtosis of a normal distribution. It is found that the two firm-size variables transformed by the Box-Cox transformation show strong linearity. This indicates that the number of employees and sales have similar properties as firm-size indicators. The Box-Cox parameters obtained for the firm size data are found to be very close to zero. In this case the Box-Cox transformation is approximately a log-transformation, which suggests that the firm size data we used are approximately log-normally distributed.
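The parameter-selection rule described above can be sketched as follows (synthetic log-normal "firm size" data; the grid and sample size are illustrative): choose the Box-Cox λ whose transformed data have excess kurtosis closest to the normal value of zero, and observe that it lands near zero, i.e., near the log-transformation:

```python
import numpy as np

def boxcox(x, lam):
    """Box-Cox transform; reduces to log as lam -> 0."""
    return np.log(x) if abs(lam) < 1e-12 else (x**lam - 1.0) / lam

def excess_kurtosis(z):
    z = (z - z.mean()) / z.std()
    return float(np.mean(z**4) - 3.0)       # 0 for a normal distribution

rng = np.random.default_rng(6)
size = rng.lognormal(3.0, 1.0, 5000)        # synthetic "firm size" data

# Choose lambda so the transformed data best match normal kurtosis
lams = np.linspace(-0.5, 1.0, 151)
best_lam = min(lams, key=lambda l: abs(excess_kurtosis(boxcox(size, l))))
```

For genuinely log-normal data the kurtosis criterion selects λ ≈ 0, matching the paper's finding that the fitted Box-Cox parameters are very close to zero.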
Lundblad, Runar; Abdelnoor, Michel; Svennevig, Jan Ludvig
2004-09-01
Simple linear resection and endoventricular patch plasty are alternative techniques to repair postinfarction left ventricular aneurysm. The aim of the study was to compare these 2 methods with regard to early mortality and long-term survival. We retrospectively reviewed 159 patients undergoing operations between 1989 and 2003. The epidemiologic design was of an exposed (simple linear repair, n = 74) versus nonexposed (endoventricular patch plasty, n = 85) cohort with 2 endpoints: early mortality and long-term survival. The crude effect of aneurysm repair technique versus endpoint was estimated by odds ratio, rate ratio, or relative risk and their 95% confidence intervals. Stratification analysis by using the Mantel-Haenszel method was done to quantify confounders and pinpoint effect modifiers. Adjustment for multiconfounders was performed by using logistic regression and Cox regression analysis. Survival curves were analyzed with the Breslow test and the log-rank test. Early mortality was 8.2% for all patients, 13.5% after linear repair and 3.5% after endoventricular patch plasty. When adjusted for multiconfounders, the risk of early mortality was significantly higher after simple linear repair than after endoventricular patch plasty (odds ratio, 4.4; 95% confidence interval, 1.1-17.8). Mean follow-up was 5.8 +/- 3.8 years (range, 0-14.0 years). Overall 5-year cumulative survival was 78%, 70.1% after linear repair and 91.4% after endoventricular patch plasty. The risk of total mortality was significantly higher after linear repair than after endoventricular patch plasty when controlled for multiconfounders (relative risk, 4.5; 95% confidence interval, 2.0-9.7). Linear repair dominated early in the series and patch plasty dominated later, giving a possible learning-curve bias in favor of patch plasty that could not be adjusted for in the regression analysis. Postinfarction left ventricular aneurysm can be repaired with satisfactory early and late results. 
Surgical risk was lower and long-term survival was higher after endoventricular patch plasty than simple linear repair. Differences in outcome should be interpreted with care because of the retrospective study design and the chronology of the 2 repair methods.
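The crude odds ratio with a Woolf-type (log-based) 95% confidence interval used for the early-mortality endpoint can be sketched as follows. The 2x2 counts below are approximate reconstructions from the reported percentages (13.5% of 74 linear repairs, 3.5% of 85 patch plasties), not the authors' exact data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) 95% CI for a 2x2 table:
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

# Counts reconstructed approximately from the reported percentages:
# 13.5% of 74 linear repairs -> ~10 deaths; 3.5% of 85 patch plasties -> ~3.
or_, lo, hi = odds_ratio_ci(10, 64, 3, 82)
```

The crude interval obtained this way is wide, consistent with the small number of early deaths; the reported 4.4 (1.1-17.8) is the multiconfounder-adjusted estimate, which this sketch does not reproduce.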
A Spreadsheet for a 2 x 3 x 2 Log-Linear Analysis. AIR 1991 Annual Forum Paper.
ERIC Educational Resources Information Center
Saupe, Joe L.
This paper describes a personal computer spreadsheet set up to carry out hierarchical log-linear analyses, a type of analysis useful for institutional research into multidimensional frequency tables formed from categorical variables such as faculty rank, student class level, gender, or retention status. The spreadsheet provides a concrete vehicle…
Sun, Lili; Zhou, Liping; Yu, Yu; Lan, Yukun; Li, Zhiliang
2007-01-01
Polychlorinated diphenyl ethers (PCDEs) have attracted increasing concern as a group of ubiquitous potential persistent organic pollutants (POPs). By using the molecular electronegativity distance vector (MEDV-4), multiple linear regression (MLR) models are developed for the sub-cooled liquid vapor pressures (P(L)), n-octanol/water partition coefficients (K(OW)), and sub-cooled liquid water solubilities (S(W,L)) of 209 PCDEs and diphenyl ether. The correlation coefficients (R) and the leave-one-out (LOO) cross-validation correlation coefficients (R(CV)) of all the 6-descriptor models for logP(L), logK(OW), and logS(W,L) exceed 0.98. By using stepwise multiple regression (SMR), the descriptors are selected, and the resulting models are a 5-descriptor model for logP(L), a 4-descriptor model for logK(OW), and a 6-descriptor model for logS(W,L), respectively. All these models exhibit excellent estimation capabilities for the internal sample set and good predictive capabilities for the external sample set. The consistency between observed and estimated/predicted values is best for logP(L) (R=0.996, R(CV)=0.996), followed by logK(OW) (R=0.992, R(CV)=0.992) and logS(W,L) (R=0.983, R(CV)=0.980). Because they use MEDV-4 descriptors, the QSPR models can be used for prediction, and the model predictions can hence extend the current database of experimental values.
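The leave-one-out cross-validated correlation (R(CV)) used to validate MLR models of this kind can be sketched as below. The descriptor matrix here is synthetic stand-in data, not MEDV-4 descriptors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for MEDV-4 descriptors: 40 "compounds", 5 descriptors,
# and a property (e.g. logP_L) that is a noisy linear function of them.
X = rng.normal(size=(40, 5))
true_w = np.array([1.2, -0.8, 0.5, 0.3, -0.4])
y = X @ true_w + rng.normal(scale=0.1, size=40)

def loo_predictions(X, y):
    """Leave-one-out predictions for ordinary least squares with intercept:
    refit the model n times, each time predicting the held-out sample."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])     # design matrix with intercept
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        w, *_ = np.linalg.lstsq(Xd[mask], y[mask], rcond=None)
        preds[i] = Xd[i] @ w
    return preds

preds = loo_predictions(X, y)
r_cv = np.corrcoef(y, preds)[0, 1]   # LOO cross-validated correlation (R_CV)
```

A high R_CV close to the in-sample R, as reported in the abstract, indicates the fit is not driven by overfitting to individual compounds.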
Rampersaud, E; Morris, R W; Weinberg, C R; Speer, M C; Martin, E R
2007-01-01
Genotype-based likelihood-ratio tests (LRT) of association that examine maternal and parent-of-origin effects have been previously developed in the framework of log-linear and conditional logistic regression models. In the situation where parental genotypes are missing, the expectation-maximization (EM) algorithm has been incorporated in the log-linear approach to allow incomplete triads to contribute to the LRT. We present an extension to this model which we call the Combined_LRT that incorporates additional information from the genotypes of unaffected siblings to improve assignment of incompletely typed families to mating type categories, thereby improving inference of missing parental data. Using simulations involving a realistic array of family structures, we demonstrate the validity of the Combined_LRT under the null hypothesis of no association and provide power comparisons under varying levels of missing data and using sibling genotype data. We demonstrate the improved power of the Combined_LRT compared with the family-based association test (FBAT), another widely used association test. Lastly, we apply the Combined_LRT to a candidate gene analysis in Autism families, some of which have missing parental genotypes. We conclude that the proposed log-linear model will be an important tool for future candidate gene studies, for many complex diseases where unaffected siblings can often be ascertained and where epigenetic factors such as imprinting may play a role in disease etiology.
Prenatal Lead Exposure and Fetal Growth: Smaller Infants Have Heightened Susceptibility
Rodosthenous, Rodosthenis S.; Burris, Heather H.; Svensson, Katherine; Amarasiriwardena, Chitra J.; Cantoral, Alejandra; Schnaas, Lourdes; Mercado-García, Adriana; Coull, Brent A.; Wright, Robert O.; Téllez-Rojo, Martha M.; Baccarelli, Andrea A.
2016-01-01
Background: As population lead levels decrease, the toxic effects of lead may be distributed to more sensitive populations, such as infants with poor fetal growth. Objectives: To determine the association of prenatal lead exposure and fetal growth; and to evaluate whether infants with poor fetal growth are more susceptible to lead toxicity than those with normal fetal growth. Methods: We examined the association of second trimester maternal blood lead levels (BLL) with birthweight-for-gestational age (BWGA) z-score in 944 mother-infant participants of the PROGRESS cohort. We determined the association between maternal BLL and BWGA z-score by using both linear and quantile regression. We estimated odds ratios for small-for-gestational age (SGA) infants between maternal BLL quartiles using logistic regression. Maternal age, body mass index, socioeconomic status, parity, household smoking exposure, hemoglobin levels, and infant sex were included as confounders. Results: While linear regression showed a negative association between maternal BLL and BWGA z-score (β=−0.06 z-score units per log2 BLL increase; 95% CI: −0.13, 0.003; P=0.06), quantile regression revealed larger magnitudes of this association in the <30th percentiles of BWGA z-score (β range [−0.08, −0.13] z-score units per log2 BLL increase; all P values <0.05). Mothers in the highest BLL quartile had an odds ratio of 1.62 (95% CI: 0.99–2.65) for having a SGA infant compared to the lowest BLL quartile. Conclusions: While both linear and quantile regression showed a negative association between prenatal lead exposure and birthweight, quantile regression revealed that smaller infants may represent a more susceptible subpopulation. PMID:27923585
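The contrast between mean (linear) and tail (quantile) regression described above can be illustrated with a minimal pinball-loss fit. The data are simulated with a deliberately stronger association in the lower tail; all values are illustrative assumptions, not PROGRESS data.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Simulated stand-in: x = log2 blood lead, y = birthweight z-score, with a
# deliberately stronger negative association in the lower tail of y.
n = 500
x = rng.uniform(0, 4, n)
y = -0.06 * x + rng.normal(scale=0.5 + 0.2 * x, size=n)

def fit_quantile(x, y, tau):
    """Linear quantile regression: minimize the mean pinball (check) loss."""
    slope0, icept0 = np.polyfit(x, y, 1)                 # OLS warm start
    icept0 += np.quantile(y - (icept0 + slope0 * x), tau)  # shift to tau-quantile

    def loss(beta):
        r = y - (beta[0] + beta[1] * x)
        return np.mean(np.where(r >= 0, tau * r, (tau - 1) * r))

    res = minimize(loss, x0=[icept0, slope0], method="Nelder-Mead",
                   options={"xatol": 1e-8, "fatol": 1e-10, "maxiter": 5000})
    return res.x

b10 = fit_quantile(x, y, 0.10)   # 10th-percentile fit: steeper negative slope
b50 = fit_quantile(x, y, 0.50)   # median fit: slope near the mean effect
```

In this simulation the 10th-percentile slope is markedly more negative than the median slope, mirroring the abstract's finding that the association is larger below the 30th percentile of BWGA z-score.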
Prenatal lead exposure and fetal growth: Smaller infants have heightened susceptibility.
Rodosthenous, Rodosthenis S; Burris, Heather H; Svensson, Katherine; Amarasiriwardena, Chitra J; Cantoral, Alejandra; Schnaas, Lourdes; Mercado-García, Adriana; Coull, Brent A; Wright, Robert O; Téllez-Rojo, Martha M; Baccarelli, Andrea A
2017-02-01
As population lead levels decrease, the toxic effects of lead may be distributed to more sensitive populations, such as infants with poor fetal growth. To determine the association of prenatal lead exposure and fetal growth; and to evaluate whether infants with poor fetal growth are more susceptible to lead toxicity than those with normal fetal growth. We examined the association of second trimester maternal blood lead levels (BLL) with birthweight-for-gestational age (BWGA) z-score in 944 mother-infant participants of the PROGRESS cohort. We determined the association between maternal BLL and BWGA z-score by using both linear and quantile regression. We estimated odds ratios for small-for-gestational age (SGA) infants between maternal BLL quartiles using logistic regression. Maternal age, body mass index, socioeconomic status, parity, household smoking exposure, hemoglobin levels, and infant sex were included as confounders. While linear regression showed a negative association between maternal BLL and BWGA z-score (β=-0.06 z-score units per log2 BLL increase; 95% CI: -0.13, 0.003; P=0.06), quantile regression revealed larger magnitudes of this association in the <30th percentiles of BWGA z-score (β range [-0.08, -0.13] z-score units per log2 BLL increase; all P values <0.05). Mothers in the highest BLL quartile had an odds ratio of 1.62 (95% CI: 0.99-2.65) for having a SGA infant compared to the lowest BLL quartile. While both linear and quantile regression showed a negative association between prenatal lead exposure and birthweight, quantile regression revealed that smaller infants may represent a more susceptible subpopulation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Vandenhove, H; Van Hees, M; Wouters, K; Wannijn, J
2007-01-01
The present study aims to quantify the influence of soil parameters on soil solution uranium concentration for (238)U-spiked soils. Eighteen soils collected under pasture were selected such that they covered a wide range for those parameters hypothesised as being potentially important in determining U sorption. Maximum soil solution uranium concentrations were observed at alkaline pH, high inorganic carbon content and low cation exchange capacity, organic matter content, clay content, amorphous Fe and phosphate levels. Except for the significant correlation between the solid-liquid distribution coefficients (K(d), L kg(-1)) and the organic matter content (R(2)=0.70) and amorphous Fe content (R(2)=0.63), there was no single soil parameter significantly explaining the soil solution uranium concentration (which varied 100-fold). Above pH=6, log(K(d)) was linearly related with pH [log(K(d))=-1.18 pH+10.8, R(2)=0.65]. Multiple linear regression analysis did result in improved predictions of the soil solution uranium concentration, but the model was complex.
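The reported pH relation can be wrapped in a small helper for illustration. The units (L/kg for K(d)) and the pH > 6 validity range follow the text; everything else, including the example pH, is an assumption.

```python
def log_kd(pH):
    """Reported empirical relation (fitted only above pH 6, R^2 = 0.65):
    log10(Kd / (L/kg)) = -1.18 * pH + 10.8"""
    if pH <= 6:
        raise ValueError("relation was fitted only for pH > 6")
    return -1.18 * pH + 10.8

# Illustrative evaluation at pH 7: Kd in L/kg.
kd_at_7 = 10 ** log_kd(7.0)
```

The strong negative slope means roughly a 15-fold drop in K(d) per pH unit in the alkaline range, consistent with the observation that soil solution uranium peaks at alkaline pH.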
Linearly Supporting Feature Extraction for Automated Estimation of Stellar Atmospheric Parameters
NASA Astrophysics Data System (ADS)
Li, Xiangru; Lu, Yu; Comte, Georges; Luo, Ali; Zhao, Yongheng; Wang, Yongjun
2015-05-01
We describe a scheme to extract linearly supporting (LSU) features from stellar spectra to automatically estimate the atmospheric parameters T_eff, log g, and [Fe/H]. “Linearly supporting” means that the atmospheric parameters can be accurately estimated from the extracted features through a linear model. The successive steps of the process are as follows: first, decompose the spectrum using a wavelet packet (WP) and represent it by the derived decomposition coefficients; second, detect representative spectral features from the decomposition coefficients using the least absolute shrinkage and selection operator (LASSO); third, estimate the atmospheric parameters T_eff, log g, and [Fe/H] from the detected features using a linear regression method. One prominent characteristic of this scheme is its ability to evaluate quantitatively the contribution of each detected feature to the atmospheric parameter estimate and also to trace back the physical significance of that feature. This work also shows that the usefulness of a component depends on both the wavelength and frequency. The proposed scheme has been evaluated on both real spectra from the Sloan Digital Sky Survey (SDSS)/SEGUE and synthetic spectra calculated from Kurucz's NEWODF models. On real spectra, we extracted 23 features to estimate T_eff, 62 features for log g, and 68 features for [Fe/H]. Test consistencies between our estimates and those provided by the Spectroscopic Parameter Pipeline of SDSS show that the mean absolute errors (MAEs) are 0.0062 dex for log T_eff (83 K for T_eff), 0.2345 dex for log g, and 0.1564 dex for [Fe/H]. For the synthetic spectra, the MAE test accuracies are 0.0022 dex for log T_eff (32 K for T_eff), 0.0337 dex for log g, and 0.0268 dex for [Fe/H].
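Steps two and three of the scheme (sparse feature detection followed by linear regression on the detected features) can be sketched with a lasso fit, implemented here by plain coordinate descent rather than the paper's own selection variant. The data, penalty value, and feature indices are illustrative assumptions, not wavelet-packet coefficients of real spectra.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for wavelet-packet coefficients: 200 "spectra", 100
# coefficients, of which only 5 actually carry the parameter (e.g. log g).
X = rng.normal(size=(200, 100))
informative = [3, 17, 42, 60, 88]
y = X[:, informative] @ np.array([1.0, -0.7, 0.5, 0.4, -0.3]) \
    + rng.normal(scale=0.05, size=200)

def lasso_cd(X, y, alpha, n_iter=100):
    """Lasso by cyclic coordinate descent: (1/2n)||y - Xb||^2 + alpha*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]      # partial residual excluding j
            rho = X[:, j] @ r
            b[j] = np.sign(rho) * max(abs(rho) - n * alpha, 0.0) / col_sq[j]
    return b

# Step 2: sparse feature detection (nonzero lasso coefficients).
selected = np.flatnonzero(lasso_cd(X, y, alpha=0.02))

# Step 3: plain linear regression restricted to the detected features.
Xs = np.column_stack([np.ones(len(y)), X[:, selected]])
w, *_ = np.linalg.lstsq(Xs, y, rcond=None)
mae = np.mean(np.abs(Xs @ w - y))
```

The final unpenalized fit on the selected coefficients mirrors the "linearly supporting" idea: once the right features are found, a plain linear model suffices.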
Farsa, Oldřich
2013-01-01
The log BB parameter is the logarithm of the ratio of a compound’s equilibrium concentrations in the brain tissue versus the blood plasma. This parameter is a useful descriptor in assessing the ability of a compound to permeate the blood-brain barrier. The aim of this study was to develop a Hansch-type linear regression QSAR model that correlates the parameter log BB and the retention time of drugs and other organic compounds on a reversed-phase HPLC containing an embedded amide moiety. The retention time was expressed by the capacity factor log k′. The second aim was to estimate the brain’s absorption of 2-(azacycloalkyl)acetamidophenoxyacetic acids, which are analogues of piracetam, nefiracetam, and meclofenoxate. Notably, these acids may be novel nootropics. Two simple regression models that relate log BB and log k′ were developed from an assay performed using a reversed-phase HPLC that contained an embedded amide moiety. Both the quadratic and linear models yielded statistical parameters comparable to previously published models of log BB dependence on various structural characteristics. The models predict that four members of the substituted phenoxyacetic acid series have a strong chance of permeating the barrier and being absorbed in the brain. The results of this study show that a reversed-phase HPLC system containing an embedded amide moiety is a functional in vitro surrogate of the blood-brain barrier. These results suggest that racetam-type nootropic drugs containing a carboxylic moiety could be more poorly absorbed than analogues devoid of the carboxyl group, especially if the compounds penetrate the barrier by a simple diffusion mechanism. PMID:23641330
Demonstration of the Web-based Interspecies Correlation Estimation (Web-ICE) modeling application
The Web-based Interspecies Correlation Estimation (Web-ICE) modeling application is available to the risk assessment community through a user-friendly internet platform (http://epa.gov/ceampubl/fchain/webice/). ICE models are log-linear least-squares regressions that predict acute...
ELASTIC NET FOR COX’S PROPORTIONAL HAZARDS MODEL WITH A SOLUTION PATH ALGORITHM
Wu, Yichao
2012-01-01
For least squares regression, Efron et al. (2004) proposed an efficient solution path algorithm, the least angle regression (LAR). They showed that a slight modification of the LAR leads to the whole LASSO solution path. Both the LAR and LASSO solution paths are piecewise linear. Recently Wu (2011) extended the LAR to generalized linear models and the quasi-likelihood method. In this work we extend the LAR further to handle Cox’s proportional hazards model. The goal is to develop a solution path algorithm for the elastic net penalty (Zou and Hastie (2005)) in Cox’s proportional hazards model. This goal is achieved in two steps. First we extend the LAR to optimizing the log partial likelihood plus a fixed small ridge term. Then we define a path modification, which leads to the solution path of the elastic net regularized log partial likelihood. Our solution path is exact and piecewise determined by ordinary differential equation systems. PMID:23226932
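A minimal sketch of the elastic net penalized log partial likelihood, optimized here by plain proximal gradient descent rather than the paper's exact ODE-determined path algorithm. The data, penalty values, and step size are illustrative assumptions (no ties, no censoring).

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated survival data: 100 subjects, 10 covariates, event times whose
# hazard depends on the first two covariates; all events observed, no ties.
n, p = 100, 10
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:2] = [1.0, -1.0]
t = rng.exponential(1.0 / np.exp(X @ true_beta))
order = np.argsort(t)
X, t = X[order], t[order]

def neg_log_partial_lik_grad(beta):
    """Negative Cox log partial likelihood and gradient (sorted times, no ties,
    no censoring, so the risk set of subject i is {i, i+1, ..., n-1})."""
    eta = X @ beta
    log_risk = np.logaddexp.accumulate(eta[::-1])[::-1]   # log sum_{j>=i} e^eta_j
    loglik = np.sum(eta - log_risk)
    w = np.exp(eta)
    cum_w = np.cumsum(w[::-1])[::-1]                      # sum_{j>=i} w_j
    cum_wx = np.cumsum((w[:, None] * X)[::-1], axis=0)[::-1]
    grad = np.sum(X - cum_wx / cum_w[:, None], axis=0)
    return -loglik, -grad

def soft_threshold(z, thr):
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)

# Proximal gradient descent on loss + (lam2/2)||b||^2, with the l1 part handled
# by the soft-threshold prox: the elastic net penalty of Zou and Hastie (2005).
lam1, lam2, step = 2.0, 1.0, 1e-3
beta = np.zeros(p)
objective = []
for _ in range(500):
    loss, grad = neg_log_partial_lik_grad(beta)
    objective.append(loss + 0.5 * lam2 * beta @ beta + lam1 * np.abs(beta).sum())
    beta = soft_threshold(beta - step * (grad + lam2 * beta), step * lam1)
```

This recovers the signs of the two informative coefficients while shrinking most noise coefficients to exactly zero; the paper's contribution is computing the whole solution path in lam1 exactly, rather than solving one penalty value at a time as here.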
ERIC Educational Resources Information Center
Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.
2014-01-01
The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…
Flow-covariate prediction of stream pesticide concentrations.
Mosquin, Paul L; Aldworth, Jeremy; Chen, Wenlin
2018-01-01
Potential peak functions (e.g., maximum rolling averages over a given duration) of annual pesticide concentrations in the aquatic environment are important exposure parameters (or target quantities) for ecological risk assessments. These target quantities require accurate concentration estimates on nonsampled days in a monitoring program. We examined stream flow as a covariate via universal kriging to improve predictions of maximum m-day (m = 1, 7, 14, 30, 60) rolling averages and the 95th percentiles of atrazine concentration in streams where data were collected every 7 or 14 d. The universal kriging predictions were evaluated against the target quantities calculated directly from the daily (or near daily) measured atrazine concentration at 32 sites (89 site-yr) as part of the Atrazine Ecological Monitoring Program in the US corn belt region (2008-2013) and 4 sites (62 site-yr) in Ohio by the National Center for Water Quality Research (1993-2008). Because stream flow data are strongly skewed to the right, 3 transformations of the flow covariate were considered: log transformation, short-term flow anomaly, and normalized Box-Cox transformation. The normalized Box-Cox transformation resulted in predictions of the target quantities that were comparable to those obtained from log-linear interpolation (i.e., linear interpolation on the log scale) for 7-d sampling. However, the predictions appeared to be negatively affected by variability in regression coefficient estimates across different sample realizations of the concentration time series. Therefore, revised models incorporating seasonal covariates and partially or fully constrained regression parameters were investigated, and they were found to provide much improved predictions in comparison with those from log-linear interpolation for all rolling average measures. Environ Toxicol Chem 2018;37:260-273. © 2017 SETAC.
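The normalized Box-Cox transformation of a right-skewed flow covariate can be sketched as follows, assuming SciPy's maximum-likelihood choice of the Box-Cox parameter. The flow series here is simulated, not monitoring-program data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Simulated stand-in for daily stream flow: strongly right-skewed.
flow = rng.lognormal(mean=2.0, sigma=1.0, size=1000)

# Box-Cox transform (lambda chosen by maximum likelihood), then standardize
# to zero mean / unit variance for use as a kriging covariate.
transformed, lam = stats.boxcox(flow)
normalized = (transformed - transformed.mean()) / transformed.std()

skew_before = stats.skew(flow)
skew_after = stats.skew(normalized)
```

For lognormal-like flows the fitted lambda sits near zero (i.e., close to a plain log transform), and the transformed covariate is far less skewed, which is the property the kriging model exploits.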
NASA Technical Reports Server (NTRS)
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
Wu, Chih Cheng; Lee, Grace W M; Yang, Shinhao; Yu, Kuo-Pin; Lou, Chia Ling
2006-10-15
Although negative air ionizers are commonly used for indoor air cleaning, few studies have examined the concentration gradient of negative air ions (NAI) in indoor environments. This study investigated the concentration gradient of NAI at various relative humidities and distances from the source in indoor air. The NAI was generated by single-electrode negative electric discharge; the discharge was kept at dark discharge and 30.0 kV. The NAI concentrations were measured at various distances (10-900 cm) from the discharge electrode in order to identify the distribution of NAI in an indoor environment. The profile of NAI concentration was monitored at different relative humidities (38.1-73.6% RH) and room temperatures (25.2+/-1.4 degrees C). Experimental results indicate that the influence of relative humidity on the concentration gradient of NAI was complicated. There were four trends in the relationship between NAI concentration and relative humidity at different distances from the discharge electrode. The changes of NAI concentration with an increase in relative humidity at different distances were quite steady (10-30 cm), strongly declining (70-360 cm), approaching stability (420-450 cm) and moderately increasing (560-900 cm). Additionally, regression analysis of NAI concentrations and distances from the discharge electrode indicated a logarithmic-linear (log-linear) relationship; the distance of log-linear tendency (lambda) decreased with an increase in relative humidity, such that the log-linear distance at 38.1% RH was 2.9 times that at 73.6% RH. Moreover, an empirical curve fit for the concentration gradient of NAI generated by negative electric discharge in indoor air was developed, based on this study, for estimating the NAI concentration at different relative humidities and distances from the source of electric discharge.
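A log-linear concentration-distance relationship of the form log10(C) = log10(C0) - x/lambda can be fitted by least squares as sketched below. The concentrations and the characteristic distance here are simulated, illustrative values, not the measured NAI data.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated log-linear decay of NAI concentration with distance x (cm):
# log10(C) = log10(C0) - x / lam, lam being the characteristic distance.
true_lam = 150.0
x = np.linspace(10, 400, 40)
log_c = 5.0 - x / true_lam + rng.normal(scale=0.02, size=x.size)

slope, intercept = np.polyfit(x, log_c, 1)
lam_est = -1.0 / slope            # recovered characteristic distance (cm)
```

The abstract's finding that lambda shrinks at high humidity corresponds, in this parametrization, to a steeper (more negative) fitted slope of log10(C) versus distance.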
Inflammation, homocysteine and carotid intima-media thickness.
Baptista, Alexandre P; Cacdocar, Sanjiva; Palmeiro, Hugo; Faísca, Marília; Carrasqueira, Herménio; Morgado, Elsa; Sampaio, Sandra; Cabrita, Ana; Silva, Ana Paula; Bernardo, Idalécio; Gome, Veloso; Neves, Pedro L
2008-01-01
Cardiovascular disease is the main cause of morbidity and mortality in chronic renal patients. Carotid intima-media thickness (CIMT) is one of the most accurate markers of atherosclerosis risk. In this study, the authors set out to evaluate a population of chronic renal patients to determine which factors are associated with an increase in intima-media thickness. We included 56 patients (F=22, M=34), with a mean age of 68.6 years, and an estimated glomerular filtration rate of 15.8 ml/min (calculated by the MDRD equation). Various laboratory and inflammatory parameters (hsCRP, IL-6 and TNF-alpha) were evaluated. All subjects underwent measurement of internal carotid artery intima-media thickness by high-resolution real-time B-mode ultrasonography using a 10 MHz linear transducer. Intima-media thickness was used as a dependent variable in a simple linear regression model, with the various laboratory parameters as independent variables. Only parameters showing a significant correlation with CIMT were evaluated in a multiple regression model: age (p=0.001), hemoglobin (p=00.3), logCRP (p=0.042), logIL-6 (p=0.004) and homocysteine (p=0.002). In the multiple regression model we found that age (p=0.001) and homocysteine (p=0.027) were independently correlated with CIMT. LogIL-6 did not reach statistical significance (p=0.057), probably due to the small population size. The authors conclude that age and homocysteine correlate with carotid intima-media thickness, and thus can be considered as markers/risk factors in chronic renal patients.
Suárez-Ortegón, M F; Arbeláez, A; Mosquera, M; Méndez, F; Aguilar-de Plata, C
2012-08-01
Ferritin levels have been associated with metabolic syndrome and insulin resistance. The aim of the present study was to evaluate the prediction of ferritin levels by variables related to cardiometabolic disease risk in a multivariate analysis. For this aim, 123 healthy women (72 premenopausal and 51 postmenopausal) were recruited. Data were collected through anthropometric measurements, questionnaires on personal/familial antecedents and dietary intake (24-h recall), and biochemical determinations (ferritin, C-reactive protein (CRP), glucose, insulin, and lipid profile) in blood serum samples. Multiple linear regression analysis was used, and variables with no normal distribution were log-transformed for this analysis. In premenopausal women, a model to explain log-ferritin levels was found with log-CRP levels, familial history of heart attack, and waist circumference as independent predictors. Ferritin behaves like other cardiovascular markers in terms of prediction of its levels by documented predictors of cardiometabolic disease and related disorders. This is the first report of a relationship between familial history of heart attack and ferritin levels. Further research is required to evaluate the mechanisms explaining the relationship of central body fat and familial history of heart attack with body iron stores.
Comparison of Survival Models for Analyzing Prognostic Factors in Gastric Cancer Patients
Habibi, Danial; Rafiei, Mohammad; Chehrei, Ali; Shayan, Zahra; Tafaqodi, Soheil
2018-03-27
Objective: There are a number of models for determining risk factors for survival of patients with gastric cancer. This study was conducted to select the model showing the best fit with the available data. Methods: Cox regression and parametric models (exponential, Weibull, Gompertz, log-normal, log-logistic and generalized gamma) were utilized in unadjusted and adjusted forms to detect factors influencing mortality of patients. Comparisons were made with the Akaike Information Criterion (AIC) by using STATA 13 and R 3.1.3 software. Results: The results of this study indicated that all parametric models outperform the Cox regression model. The log-normal, log-logistic and generalized gamma models provided the best performance in terms of AIC values (179.2, 179.4 and 181.1, respectively). On unadjusted analysis, the results of the Cox regression and parametric models indicated stage, grade, largest diameter of metastatic nest, largest diameter of LM, number of involved lymph nodes and the largest ratio of metastatic nests to lymph nodes to be variables influencing the survival of patients with gastric cancer. On adjusted analysis, according to the best model (log-normal), grade was found to be the significant variable. Conclusion: The results suggested that all parametric models outperform the Cox model. The log-normal model provides the best fit and is a good substitute for Cox regression.
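The AIC-based comparison of parametric survival models can be sketched for the exponential and Weibull cases on simulated, uncensored times. This is a minimal illustration of the model-selection logic, not a re-analysis of the gastric cancer data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Simulated uncensored survival times from a Weibull with shape 2 (increasing
# hazard), which a one-parameter exponential model cannot capture.
t = rng.weibull(2.0, size=300) * 10.0          # scale parameter 10

def aic(nll, n_params):
    """Akaike Information Criterion from a negative log-likelihood."""
    return 2 * n_params + 2 * nll

# Exponential model: the MLE of the rate is 1/mean; one parameter.
rate = 1.0 / t.mean()
nll_exp = -(np.log(rate) * len(t) - rate * t.sum())
aic_exp = aic(nll_exp, 1)

# Weibull model: two parameters (shape, scale), ML fit with location fixed at 0.
shape, loc, scale = stats.weibull_min.fit(t, floc=0)
nll_wei = -stats.weibull_min.logpdf(t, shape, loc, scale).sum()
aic_wei = aic(nll_wei, 2)
```

The Weibull AIC is lower here because the extra shape parameter captures the increasing hazard; the same penalized-likelihood comparison is what ranks the log-normal model best in the abstract.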
LAS bioconcentration is isomer specific
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tolls, J.; Haller, M.; Graaf, I. de
1995-12-31
The authors measured parent compound specific bioconcentration data for linear alkylbenzene sulfonates in Pimephales promelas. They did so by using cold, custom-synthesized sulfophenyl alkanes. They observed that, within homologous series of isomers, the uptake rate constants (k{sub 1}) and the bioconcentration factor (BCF) increase with increasing number of carbon atoms in the alkyl chain (n{sub C-atoms}). In contrast, the elimination rate constant k{sub 2} appears to be independent of the alkyl chain length. Regressions of log BCF vs n{sub C-atoms} yielded different slopes for the homologous groups of the 5- and the 2-sulfophenyl alkane isomers. Regression of all log BCF-data vs log 1/CMC yielded a good description of the data. However, when regressing the data for both homologous series separately, again very different slopes are obtained. The results therefore indicate that hydrophobicity-bioconcentration relationships may be different for different homologous groups of sulfophenyl alkanes.
Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P
2014-06-26
To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination was low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
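A minimal sketch of the Poisson point estimate of a risk ratio for a common binary outcome, fitted by IRLS on simulated data with a true risk ratio of 2. The robust (sandwich) standard errors that give the "modified Poisson" method its name are omitted here for brevity.

```python
import numpy as np

rng = np.random.default_rng(7)

# Binary outcome with a true risk ratio of 2 for the exposed group:
# P(Y=1) = 0.1 if unexposed, 0.2 if exposed.
n = 20000
exposed = rng.integers(0, 2, size=n)
y = (rng.random(n) < 0.1 * np.where(exposed == 1, 2.0, 1.0)).astype(float)

def poisson_log_fit(X, y, n_iter=25):
    """Poisson regression with log link via IRLS; applied to a binary outcome,
    exp(beta) gives the risk-ratio point estimate of the modified Poisson
    approach (standard errors would need a robust sandwich estimator)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        z = X @ beta + (y - mu) / mu          # working response
        W = mu                                 # IRLS weights
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

X = np.column_stack([np.ones(n), exposed])     # intercept + exposure indicator
beta = poisson_log_fit(X, y)
risk_ratio = np.exp(beta[1])
```

Unlike logistic regression, whose exponentiated coefficient is an odds ratio, the log-link Poisson coefficient exponentiates directly to a risk ratio, which is why it is preferred when the outcome is common.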
A Comparison of Strategies for Estimating Conditional DIF
ERIC Educational Resources Information Center
Moses, Tim; Miao, Jing; Dorans, Neil J.
2010-01-01
In this study, the accuracies of four strategies were compared for estimating conditional differential item functioning (DIF), including raw data, logistic regression, log-linear models, and kernel smoothing. Real data simulations were used to evaluate the estimation strategies across six items, DIF and No DIF situations, and four sample size…
Hoffman, Jennifer C.; Anton, Peter A.; Baldwin, Gayle Cocita; Elliott, Julie; Anisman-Posner, Deborah; Tanner, Karen; Grogan, Tristan; Elashoff, David; Sugar, Catherine; Yang, Otto O.
2014-01-01
Seminal plasma HIV-1 RNA level is an important determinant of the risk of HIV-1 sexual transmission. We investigated potential associations between seminal plasma cytokine levels and viral concentration in the seminal plasma of HIV-1-infected men. This was a prospective, observational study of paired blood and semen samples from 18 HIV-1 chronically infected men off antiretroviral therapy. HIV-1 RNA levels and cytokine levels in seminal plasma and blood plasma were measured and analyzed using simple linear regressions to screen for associations between cytokines and seminal plasma HIV-1 levels. Forward stepwise regression was performed to construct the final multivariate model. The median HIV-1 RNA concentrations were 4.42 log10 copies/ml (IQR 2.98, 4.70) and 2.96 log10 copies/ml (IQR 2, 4.18) in blood and seminal plasma, respectively. In stepwise multivariate linear regression analysis, blood HIV-1 RNA level (p<0.0001) was most strongly associated with seminal plasma HIV-1 RNA level. After controlling for blood HIV-1 RNA level, seminal plasma HIV-1 RNA level was positively associated with interferon (IFN)-γ (p=0.03) and interleukin (IL)-17 (p=0.03) and negatively associated with IL-5 (p=0.0007) in seminal plasma. In addition to blood HIV-1 RNA level, cytokine profiles in the male genital tract are associated with HIV-1 RNA levels in semen. The Th1 and Th17 cytokines IFN-γ and IL-17 are associated with increased seminal plasma HIV-1 RNA, while the Th2 cytokine IL-5 is associated with decreased seminal plasma HIV-1 RNA. These results support the importance of genital tract immunomodulation in HIV-1 transmission. PMID:25209674
Albumin, a marker for post-operative myocardial damage in cardiac surgery.
van Beek, Dianne E C; van der Horst, Iwan C C; de Geus, A Fred; Mariani, Massimo A; Scheeren, Thomas W L
2018-06-06
Low serum albumin (SA) is a prognostic factor for poor outcome after cardiac surgery. The aim of this study was to estimate the association between pre-operative SA, early post-operative SA, and post-operative myocardial injury. This single-center cohort study included adult patients undergoing cardiac surgery during 4 consecutive years. Post-operative myocardial damage was quantified by calculating the area under the curve (AUC) of troponin (Tn) values during the first 72 h after surgery; its association with SA was analyzed using linear regression, and with multivariable linear regression to account for patient-related and procedural confounders. The association between SA and the secondary outcomes (peri-operative myocardial infarction [PMI], requiring ventilation >24 h, rhythm disturbances, 30-day mortality) was studied using (multivariable) log-binomial regression analysis. In total 2757 patients were included. The mean pre-operative SA was 29 ± 13 g/l and the mean post-operative SA was 26 ± 6 g/l. Post-operative SA levels (measured on average 26 min after surgery) were inversely associated with post-operative myocardial damage in both univariable analysis (regression coefficient -0.019, 95% CI -0.022/-0.015, p < 0.005) and after adjustment for patient-related and surgical confounders (regression coefficient -0.014 [95% CI -0.020/-0.008], p < 0.0005). Post-operative albumin levels were significantly correlated with the amount of post-operative myocardial damage in patients undergoing cardiac surgery, independent of typical confounders. Copyright © 2018. Published by Elsevier Inc.
NASA Astrophysics Data System (ADS)
Kim, Jeong-Man; Choi, Jang-Young; Lee, Kyu-Seok; Lee, Sung-Ho
2017-05-01
This study focuses on the design and analysis of a linear oscillatory single-phase permanent magnet generator for free-piston Stirling engine (FPSE) systems. To arrive at a linear oscillatory generator (LOG) design suitable for FPSEs, we conducted electromagnetic analysis of LOGs with varying design parameters. Then, detent force analysis was conducted using an assisted PM; using the assisted PM offered the advantage of providing mechanical stiffness through the detent force. To improve the efficiency, we conducted characteristic analysis of the eddy-current loss with respect to the PM segmentation. Finally, the experimental results were analyzed to confirm the predictions of the FEA.
Model synthesis in frequency analysis of Missouri floods
Hauth, Leland D.
1974-01-01
Synthetic flood records for 43 small-stream sites aided in the definition of techniques for estimating the magnitude and frequency of floods in Missouri. The long-term synthetic flood records were generated by use of a digital computer model of the rainfall-runoff process. A relatively short period of concurrent rainfall and runoff data observed at each of the 43 sites was used to calibrate the model, and rainfall records spanning 66 to 78 years for four Missouri sites, together with pan-evaporation data, were used to generate the synthetic records. Flood magnitude and frequency characteristics of both the synthetic records and observed long-term flood records available for 109 large-stream sites were used in a multiple-regression analysis to define relations for estimating future flood characteristics at ungaged sites. That analysis indicated that drainage basin size and slope were the most useful estimating variables. It also indicated that a more complex regression model than the commonly used log-linear one was needed for the range of drainage basin sizes available in this study.
Tang, Ronggui; Ding, Changfeng; Ma, Yibing; Wan, Mengxue; Zhang, Taolin; Wang, Xingxiang
2018-06-02
To explore the main controlling factors in soil and to build a predictive model relating the lead concentrations in earthworms (Pb_earthworm) to soil physicochemical parameters, 13 soils with low levels of lead contamination were used in earthworm toxicity experiments. The results indicated that relatively high bioaccumulation factors appeared in the soils with low pH values. Log-transformed lead concentrations in earthworms and soils were significantly positively correlated (R² = 0.46, P < 0.0001, n = 39). Stepwise multiple linear regression analysis yielded a fitted empirical model between Pb_earthworm and the soil physicochemical properties: log(Pb_earthworm) = 0.96 log(Pb_soil) − 0.74 log(OC) − 0.22 pH + 0.95 (R² = 0.66, n = 39). Furthermore, path analysis confirmed that the Pb concentration in soil (Pb_soil), soil pH, and soil organic carbon (OC) were the primary controlling factors of Pb_earthworm, with high path coefficients (0.71, −0.51, and −0.49, respectively). The predictive model, based on Pb_earthworm in a nationwide range of soils with low-level lead contamination, could provide a reference for the establishment of safety thresholds in Pb-contaminated soils from the perspective of soil-animal systems.
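The fitted empirical model above can be applied directly to new soils. A minimal sketch in Python (the function name and units are illustrative; base-10 logs are assumed, as is conventional for such bioaccumulation regressions, though the abstract does not state the base):

```python
import math

def predict_pb_earthworm(pb_soil, oc, ph):
    """Apply the fitted empirical model
    log(Pb_earthworm) = 0.96 log(Pb_soil) - 0.74 log(OC) - 0.22 pH + 0.95.

    pb_soil: soil Pb (mg/kg); oc: soil organic carbon (%); ph: soil pH.
    Base-10 logs are assumed (the abstract does not state the base)."""
    log_pb = (0.96 * math.log10(pb_soil)
              - 0.74 * math.log10(oc)
              - 0.22 * ph
              + 0.95)
    return 10.0 ** log_pb
```

Under these assumptions, a soil with 100 mg/kg Pb, 2% organic carbon, and pH 6 predicts roughly 21 mg/kg Pb in earthworm tissue.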
Smooth individual level covariates adjustment in disease mapping.
Huque, Md Hamidul; Anderson, Craig; Walton, Richard; Woolford, Samuel; Ryan, Louise
2018-05-01
Spatial models for disease mapping should ideally account for covariates measured both at individual and area levels. The newly available "indiCAR" model fits the popular conditional autoregresssive (CAR) model by accommodating both individual and group level covariates while adjusting for spatial correlation in the disease rates. This algorithm has been shown to be effective but assumes log-linear associations between individual level covariates and outcome. In many studies, the relationship between individual level covariates and the outcome may be non-log-linear, and methods to track such nonlinearity between individual level covariate and outcome in spatial regression modeling are not well developed. In this paper, we propose a new algorithm, smooth-indiCAR, to fit an extension to the popular conditional autoregresssive model that can accommodate both linear and nonlinear individual level covariate effects while adjusting for group level covariates and spatial correlation in the disease rates. In this formulation, the effect of a continuous individual level covariate is accommodated via penalized splines. We describe a two-step estimation procedure to obtain reliable estimates of individual and group level covariate effects where both individual and group level covariate effects are estimated separately. This distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates. We evaluate the performance of smooth-indiCAR through simulation. Our results indicate that the smooth-indiCAR method provides reliable estimates of all regression and random effect parameters. We illustrate our proposed methodology with an analysis of data on neutropenia admissions in New South Wales (NSW), Australia. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Normality of raw data in general linear models: The most widespread myth in statistics
Kery, Marc; Hatfield, Jeff S.
2003-01-01
In years of statistical consulting for ecologists and wildlife biologists, by far the most common misconception we have come across has been the one about normality in general linear models. These comprise a very large part of the statistical models used in ecology and include t tests, simple and multiple linear regression, polynomial regression, and analysis of variance (ANOVA) and covariance (ANCOVA). There is a widely held belief that the normality assumption pertains to the raw data rather than to the model residuals. We suspect that this error may also occur in countless published studies, whenever the normality assumption is tested prior to analysis. This may lead to the use of nonparametric alternatives (if there are any), when parametric tests would indeed be appropriate, or to use of transformations of raw data, which may introduce hidden assumptions such as multiplicative effects on the natural scale in the case of log-transformed data. Our aim here is to dispel this myth. We very briefly describe relevant theory for two cases of general linear models to show that the residuals need to be normally distributed if tests requiring normality are to be used, such as t and F tests. We then give two examples demonstrating that the distribution of the response variable may be nonnormal, and yet the residuals are well behaved. We do not go into the issue of how to test normality; instead we display the distributions of response variables and residuals graphically.
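A small simulation makes the point concrete (a sketch with synthetic data, not drawn from the examples in the paper): the raw response from a two-group design is strongly bimodal, yet the residuals from the fitted group-means model are approximately normal.

```python
import numpy as np

rng = np.random.default_rng(42)

# Two-group "ANOVA" design: group means differ widely, errors are normal.
n = 1000
group = np.repeat([0, 1], n)
means = np.where(group == 0, 0.0, 10.0)
y = means + rng.normal(0.0, 1.0, size=2 * n)   # raw response: clearly bimodal

# Fit the one-way model (group means) and take residuals.
fitted = np.where(group == 0, y[group == 0].mean(), y[group == 1].mean())
resid = y - fitted

# The raw response is far from normal (two modes, large spread),
# while the residuals are approximately N(0, 1).
print(round(float(y.std()), 2), round(float(resid.std()), 2))
```

A normality test applied to y would reject decisively, while resid is consistent with normality; it is the latter that matters for the validity of t and F tests.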
Evaluation of third-degree and fourth-degree laceration rates as quality indicators.
Friedman, Alexander M; Ananth, Cande V; Prendergast, Eri; D'Alton, Mary E; Wright, Jason D
2015-04-01
To examine the patterns and predictors of third-degree and fourth-degree laceration in women undergoing vaginal delivery. We identified a population-based cohort of women in the United States who underwent a vaginal delivery between 1998 and 2010 using the Nationwide Inpatient Sample. Multivariable log-linear regression models were developed to account for patient, obstetric, and hospital factors related to lacerations. Between-hospital variability of laceration rates was calculated using generalized log-linear mixed models. Among 7,096,056 women who underwent vaginal delivery in 3,070 hospitals, 3.3% (n=232,762) had a third-degree laceration and 1.1% (n=76,347) had a fourth-degree laceration. In an adjusted model for fourth-degree lacerations, important risk factors included shoulder dystocia and forceps and vacuum deliveries with and without episiotomy. Other demographic, obstetric, medical, and hospital variables, although statistically significant, were not major determinants of lacerations. Risk factors in a multivariable model for third-degree lacerations were similar to those in the fourth-degree model. Regression analysis of hospital rates (n=3,070) of lacerations demonstrated limited between-hospital variation. Risk of third-degree and fourth-degree laceration was most strongly related to operative delivery and shoulder dystocia. Between-hospital variation was limited. Given these findings and that the most modifiable practice related to lacerations would be reduction in operative vaginal deliveries (and a possible increase in cesarean delivery), third-degree and fourth-degree laceration rates may be a quality metric of limited utility.
Morikawa, Go; Suzuka, Chihiro; Shoji, Atsushi; Shibusawa, Yoichi; Yanagida, Akio
2016-01-05
A high-throughput method for determining the octanol/water partition coefficient (P(o/w)) of a large variety of compounds exhibiting a wide range in hydrophobicity was established. The method combines a simple shake-flask method with a novel two-phase solvent system comprising an acetonitrile-phosphate buffer (0.1 M, pH 7.4)-1-octanol (25:25:4, v/v/v; AN system). The AN system partition coefficients (K(AN)) of 51 standard compounds for which log P(o/w) (at pH 7.4; log D) values had been reported were determined by single two-phase partitioning in test tubes, followed by measurement of the solute concentration in both phases using an automatic flow injection-ultraviolet detection system. The log K(AN) values were closely related to reported log D values, and the relationship could be expressed by the following linear regression equation: log D = 2.8630 log K(AN) − 0.1497 (n = 51). The relationship reveals that log D values (+8 to −8) for a large variety of highly hydrophobic and/or hydrophilic compounds can be estimated indirectly from the narrow range of log K(AN) values (+3 to −3) determined using the present method. Furthermore, log K(AN) values for highly polar compounds for which no log D values have been reported, such as amino acids, peptides, proteins, nucleosides, and nucleotides, can be estimated using the present method. The wide-ranging log D values (+5.9 to −7.5) of these molecules were estimated for the first time from their log K(AN) values and the above regression equation. Copyright © 2015 Elsevier B.V. All rights reserved.
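Given the reported regression, a measured AN-system partition coefficient converts to an estimated log D in one line. A sketch in Python (the function name and the choice of which phase goes in the numerator of K(AN) are assumptions for illustration; the abstract defines K(AN) only as the AN-system partition coefficient):

```python
import math

def log_d_from_kan(conc_phase_a, conc_phase_b):
    """Estimate log D (octanol/water, pH 7.4) from a single AN-system
    partitioning experiment, using the regression from the abstract:
    log D = 2.8630 * log K(AN) - 0.1497  (n = 51)."""
    log_kan = math.log10(conc_phase_a / conc_phase_b)  # K(AN) from phase concentrations
    return 2.8630 * log_kan - 0.1497
```

For example, a solute ten-fold enriched in one phase (log K(AN) = 1) maps to an estimated log D of about 2.71, illustrating how the narrow K(AN) range (+3 to −3) spans the wide log D range (+8 to −8).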
Lee, Ho-Won; Muniyappa, Ranganath; Yan, Xu; Yue, Lilly Q.; Linden, Ellen H.; Chen, Hui; Hansen, Barbara C.
2011-01-01
The euglycemic glucose clamp is the reference method for assessing insulin sensitivity in humans and animals. However, clamps are ill-suited for large studies because of extensive requirements for cost, time, labor, and technical expertise. Simple surrogate indexes of insulin sensitivity/resistance including quantitative insulin-sensitivity check index (QUICKI) and homeostasis model assessment (HOMA) have been developed and validated in humans. However, validation studies of QUICKI and HOMA in both rats and mice suggest that differences in metabolic physiology between rodents and humans limit their value in rodents. Rhesus monkeys are a species more similar to humans than rodents. Therefore, in the present study, we evaluated data from 199 glucose clamp studies obtained from a large cohort of 86 monkeys with a broad range of insulin sensitivity. Data were used to evaluate simple surrogate indexes of insulin sensitivity/resistance (QUICKI, HOMA, Log HOMA, 1/HOMA, and 1/Fasting insulin) with respect to linear regression, predictive accuracy using a calibration model, and diagnostic performance using receiver operating characteristic. Most surrogates had modest linear correlations with SIClamp (r ≈ 0.4–0.64) with comparable correlation coefficients. Predictive accuracy determined by calibration model analysis demonstrated better predictive accuracy of QUICKI than HOMA and Log HOMA. Receiver operating characteristic analysis showed equivalent sensitivity and specificity of most surrogate indexes to detect insulin resistance. Thus, unlike in rodents but similar to humans, surrogate indexes of insulin sensitivity/resistance including QUICKI and log HOMA may be reasonable to use in large studies of rhesus monkeys where it may be impractical to conduct glucose clamp studies. PMID:21209021
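The surrogate indexes evaluated above have standard closed forms (well established in the human literature, though not restated in the abstract). A sketch in Python using the conventional units (fasting insulin in μU/mL, fasting glucose in mg/dL):

```python
import math

def quicki(fasting_insulin, fasting_glucose):
    """QUICKI = 1 / (log10(I0) + log10(G0)),
    with insulin I0 in uU/mL and glucose G0 in mg/dL."""
    return 1.0 / (math.log10(fasting_insulin) + math.log10(fasting_glucose))

def homa_ir(fasting_insulin, fasting_glucose):
    """HOMA-IR = I0 * G0 / 405 for glucose in mg/dL
    (the divisor is 22.5 when glucose is in mmol/L)."""
    return fasting_insulin * fasting_glucose / 405.0

def log_homa(fasting_insulin, fasting_glucose):
    """Log HOMA, one of the transformed surrogates compared in the study."""
    return math.log10(homa_ir(fasting_insulin, fasting_glucose))
```

For instance, fasting insulin of 10 μU/mL and fasting glucose of 90 mg/dL give QUICKI ≈ 0.34 and HOMA-IR ≈ 2.2; note that 1/HOMA and log HOMA are monotone transforms of the same two inputs, which is why their diagnostic performances are so similar.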
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
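The key idea, profiling out the stratum-specific background rates rather than estimating a coefficient per stratum, can be sketched for the simplest case of a single binary exposure. The Python toy example below (data and variable names invented for illustration; not the authors' implementation) exploits the fact that, for any fixed rate ratio, the MLE of each stratum's background rate has a closed form:

```python
import numpy as np

# Toy cohort: rows are background strata (e.g. age bands), columns are
# unexposed/exposed person-years and case counts. Values are illustrative.
py = np.array([[1000.0, 400.0], [800.0, 600.0], [500.0, 500.0]])
cases = np.array([[10, 8], [12, 15], [9, 14]])

def profile_loglik(log_rr):
    """Poisson log-likelihood with stratum rates profiled out at a
    given log rate ratio (omitting the constant factorial term)."""
    mult = np.array([1.0, np.exp(log_rr)])          # relative rate by exposure
    alpha = cases.sum(axis=1) / (py * mult).sum(axis=1)  # closed-form stratum MLEs
    mu = alpha[:, None] * py * mult                  # expected counts
    return (cases * np.log(mu) - mu).sum()

# One-dimensional search over the parameter of interest only.
grid = np.linspace(-1.0, 1.0, 2001)
ll = np.array([profile_loglik(b) for b in grid])
beta_hat = grid[ll.argmax()]
print("estimated log rate ratio:", round(float(beta_hat), 3))
```

Because the background rates have a closed form for any fixed rate ratio, the search stays one-dimensional regardless of how many strata there are, which mirrors the practical advantage claimed for the approach when the number of strata is large.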
NASA Astrophysics Data System (ADS)
Díez, I.; Santolaria, A.; Gorostiaga, J. M.
2003-04-01
Subtidal vegetation distribution patterns in relation to environmental conditions (pollution, wave exposure, sedimentation, substratum slope and depth) were studied along the western Basque coast, northern Spain, by applying canonical correspondence analysis and log-linear regressions. A total of 90 species of macrophytes were recorded by systematic sampling along 21 transects. Mesophyllum lichenoides and Cystoseira baccata were the most abundant (accounting for 47% of the overall algal cover). Gelidium sesquipedale, Pterosiphonia complanata, Zanardinia prototypus, Codium decorticatum and Asparagopsis armata (Falkenbergia phase) were other macrophytes with significant cover. Ordination analysis indicates that the five environmental variables explored together account for 52% of the species data variance. Pollution, sedimentation and wave exposure were the principal factors explaining differences in floral composition and abundance (24, 14 and 12% of the explained variance, respectively). Log-linear regressions and canonical correspondence analyses reveal that C. baccata and G. sesquipedale exhibit a negative relationship with pollution, while sediment loading negatively affects G. sesquipedale, and C. baccata cannot withstand high wave exposure levels. In contrast, P. complanata and C. decorticatum show a positive relationship with pollution and can bear high levels of sedimentation and wave exposure. M. lichenoides and Z. prototypus present a wide tolerance range for all these factors. Macroalgal cover, species richness and diversity remain practically constant from unpolluted to slightly polluted sites, but they decrease sharply under moderately polluted conditions. In the same way, algal cover decreases as sediment loading increases, but diversity and species richness show the highest values at intermediate levels of sedimentation.
In relation to wave exposure, maximum algal cover was achieved at very exposed habitats whereas diversity and species richness were higher under semi-exposed conditions.
Schüle, Steffen Andreas; Gabriel, Katharina M A; Bolte, Gabriele
2017-06-01
The environmental justice framework states that, besides environmental burdens, resources may also be socially unequally distributed, both at the individual and at the neighbourhood level. This ecological study investigated whether neighbourhood socioeconomic position (SEP) was associated with neighbourhood public green space availability in a large German city with more than 1 million inhabitants. Two different measures were defined for green space availability. Firstly, the percentage of green space within neighbourhoods was calculated, with the additional consideration of various buffers around the boundaries. Secondly, the percentage of green space was calculated based on various radii around the neighbourhood centroid. An index of neighbourhood SEP was calculated with principal component analysis. Log-gamma regression from the group of generalized linear models was applied in order to account for the non-normal distribution of the response variable. All models were adjusted for population density. Low neighbourhood SEP was associated with decreasing neighbourhood green space availability for 200 m up to 1000 m buffers around the neighbourhood boundaries. Low neighbourhood SEP was also associated with decreasing green space availability based on catchment areas measured from neighbourhood centroids with different radii (1000 m up to 3000 m). With an increasing radius, the strength of the associations decreased. Socially unequally distributed green space may amplify environmental health inequalities in an urban context. Thus, the identification of vulnerable neighbourhoods and population groups plays an important role in epidemiological research and healthy city planning. As a methodological aspect, log-gamma regression offers an adequate parametric modelling strategy for positively distributed environmental variables. Copyright © 2017 Elsevier GmbH. All rights reserved.
Ku, Po-Wen; Steptoe, Andrew; Liao, Yung; Hsueh, Ming-Chun; Chen, Li-Jung
2018-05-25
The appropriate limit to the amount of daily sedentary time (ST) required to minimize mortality is uncertain. This meta-analysis aimed to quantify the dose-response association between daily ST and all-cause mortality and to explore the cut-off point above which health is impaired in adults aged 18-64 years old. We also examined whether there are differences between studies using self-report ST and those with device-based ST. Prospective cohort studies providing effect estimates of daily ST (exposure) on all-cause mortality (outcome) were identified via MEDLINE, PubMed, Scopus, Web of Science, and Google Scholar databases until January 2018. Dose-response relationships between daily ST and all-cause mortality were examined using random-effects meta-regression models. Based on the pooled data for more than 1 million participants from 19 studies, the results showed a log-linear dose-response association between daily ST and all-cause mortality. Overall, more time spent in sedentary behaviors is associated with increased mortality risks. However, the method of measuring ST moderated the association between daily ST and mortality risk (p < 0.05). The cut-off of daily ST in studies with self-report ST was 7 h/day in comparison with 9 h/day for those with device-based ST. Higher amounts of daily ST are log-linearly associated with increased risk of all-cause mortality in adults. On the basis of a limited number of studies using device-based measures, the findings suggest that it may be appropriate to encourage adults to engage in less sedentary behaviors, with fewer than 9 h a day being relevant for all-cause mortality.
Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data
Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.
2017-01-01
Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564
Lawrence, Stephen J.
2012-01-01
Regression analyses show that E. coli density in samples was strongly related to turbidity, streamflow characteristics, and season at both sites. The regression equation chosen for the Norcross data showed that 78 percent of the variability in E. coli density (in log base 10 units) was explained by the variability in turbidity values (in log base 10 units), streamflow event (dry-weather flow or stormflow), season (cool or warm), and an interaction term that is the cross product of streamflow event and turbidity. The regression equation chosen for the Atlanta data showed that 76 percent of the variability in E. coli density (in log base 10 units) was explained by the variability in turbidity values (in log base 10 units), water temperature, streamflow event, and an interaction term that is the cross product of streamflow event and turbidity. Residual analysis and model confirmation using new data indicated the regression equations selected at both sites predicted E. coli density within the 90 percent prediction intervals of the equations and could be used to predict E. coli density in real time at both sites.
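The regression form described, log10 E. coli density on log10 turbidity, streamflow event, season, and a turbidity-by-event cross product, can be fit by ordinary least squares. A sketch in Python on synthetic data (all coefficients below are invented for illustration and are not the USGS estimates):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Synthetic predictors mimicking the described model form.
log_turb = rng.uniform(0.5, 2.5, n)            # log10 turbidity
storm = rng.integers(0, 2, n).astype(float)    # 0 = dry-weather flow, 1 = stormflow
season = rng.integers(0, 2, n).astype(float)   # 0 = cool, 1 = warm

# b0, turbidity, event, season, event*turbidity -- invented values.
true_beta = np.array([0.5, 0.9, 0.4, 0.3, 0.5])
X = np.column_stack([np.ones(n), log_turb, storm, season, storm * log_turb])
log_ecoli = X @ true_beta + rng.normal(0.0, 0.2, n)

# Ordinary least squares recovers the coefficients, including the
# cross-product (interaction) term described in the abstract.
beta_hat, *_ = np.linalg.lstsq(X, log_ecoli, rcond=None)
print(np.round(beta_hat, 2))
```

The interaction term lets the turbidity slope differ between dry-weather flow and stormflow, which is what allows a single equation to serve both flow regimes.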
NASA Astrophysics Data System (ADS)
Schaperow, J.; Cooper, M. G.; Cooley, S. W.; Alam, S.; Smith, L. C.; Lettenmaier, D. P.
2017-12-01
As climate regimes shift, streamflows and our ability to predict them will change as well. Elasticity of summer minimum streamflow is estimated for 138 unimpaired headwater river basins across the maritime western US mountains to better understand how climatologic variables and geologic characteristics interact to determine the response of summer low flows to winter precipitation (PPT), spring snow water equivalent (SWE), and summertime potential evapotranspiration (PET). Elasticities are calculated using log-log linear regression, and linear reservoir storage coefficients are used to represent basin geology. Storage coefficients are estimated using baseflow recession analysis. On average, SWE, PET, and PPT explain about one third of the summertime low flow variance. Snow-dominated basins with long timescales of baseflow recession are least sensitive to changes in SWE, PPT, and PET, while rainfall-dominated, faster-draining basins are most sensitive. There are also implications for the predictability of summer low flows: the R² between streamflow and SWE drops from 0.62 to 0.47 from snow-dominated to rain-dominated basins, while there is no corresponding increase in R² between streamflow and PPT.
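Elasticity estimation by log-log regression reduces to fitting a straight line to the log-transformed flow and driver series; the slope is the elasticity. A sketch in Python for a single synthetic basin (the true elasticity of 0.5 and all other numbers are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 40  # e.g. 40 years of record for one basin (synthetic)

# Synthetic basin: summer low flow responds to SWE with a true elasticity
# of 0.5, i.e. a 1% change in SWE gives a 0.5% change in minimum flow.
log_swe = rng.normal(np.log(300.0), 0.4, n)
log_q7 = 0.5 * log_swe + rng.normal(0.0, 0.2, n)

# Elasticity = slope of the log-log regression of flow on the driver.
slope, intercept = np.polyfit(log_swe, log_q7, 1)
print("estimated elasticity:", round(float(slope), 2))
```

The same fit, repeated per basin and per driver (SWE, PPT, PET), yields the elasticities whose patterns across snow- and rain-dominated basins are summarized above.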
Linear and nonlinear methods in modeling the aqueous solubility of organic compounds.
Catana, Cornel; Gao, Hua; Orrenius, Christian; Stouten, Pieter F W
2005-01-01
Solubility data for 930 diverse compounds have been analyzed using linear Partial Least Squares (PLS) and nonlinear PLS methods, Continuum Regression (CR), and Neural Networks (NN). 1D and 2D descriptors from the MOE package in combination with E-state or ISIS keys have been used. The best model was obtained using linear PLS for a combination of 22 MOE descriptors and 65 ISIS keys. It has a correlation coefficient (r²) of 0.935 and a root-mean-square error (RMSE) of 0.468 log molar solubility (log S(w)). The model, validated on a test set of 177 compounds not included in the training set, has an r² of 0.911 and an RMSE of 0.475 log S(w). The descriptors were ranked according to their importance, with the 22 MOE descriptors at the top of the list. The CR model produced results as good as PLS, and because of the way in which cross-validation was done it is expected to be a valuable prediction tool alongside the PLS model. The statistics obtained using nonlinear methods did not surpass those obtained with linear ones. The good statistics obtained for linear PLS and CR recommend these models for prediction when it is difficult or impossible to make experimental measurements, for virtual screening, combinatorial library design, and efficient lead optimization.
The relative toxic response of 27 selected phenols in the 96-hr acute flowthrough Pimephales promelas (fathead minnow) and the 48- to 60-hr chronic static Tetrahymena pyriformis (ciliate protozoan) test systems was evaluated. Log Kow-dependent linear regression analyses revealed ...
Microbial Transformation of Esters of Chlorinated Carboxylic Acids
Paris, D. F.; Wolfe, N. L.; Steen, W. C.
1984-01-01
Two groups of compounds were selected for microbial transformation studies. In the first group were carboxylic acid esters having a fixed aromatic moiety and an increasing length of the alkyl component. Ethyl esters of chlorine-substituted carboxylic acids were in the second group. Microorganisms from environmental waters and a pure culture of Pseudomonas putida U were used. The bacterial populations were monitored by plate counts, and disappearance of the parent compound was followed by gas-liquid chromatography as a function of time. The products of microbial hydrolysis were the respective carboxylic acids. Octanol-water partition coefficients (Kow) for the compounds were measured. These values spanned three orders of magnitude, whereas microbial transformation rate constants (kb) varied only 50-fold. The microbial rate constants of the carboxylic acid esters with a fixed aromatic moiety increased with an increasing length of alkyl substituents. The regression coefficient for the linear relationships between log kb and log Kow was high for group 1 compounds, indicating that these parameters correlated well. The regression coefficient for the linear relationships for group 2 compounds, however, was low, indicating that these parameters correlated poorly. PMID:16346459
Golmohammadi, Hassan
2009-11-30
A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 141 organic compounds to their octanol-water partition coefficients (log P(o/w)). A genetic algorithm was applied as a variable selection tool. Modeling of log P(o/w) of these compounds as a function of theoretically derived descriptors was established by multiple linear regression (MLR), partial least squares (PLS), and artificial neural network (ANN). The best selected descriptors that appear in the models are: atomic charge weighted partial positively charged surface area (PPSA-3), fractional atomic charge weighted partial positive surface area (FPSA-3), minimum atomic partial charge (Qmin), molecular volume (MV), total dipole moment of the molecule (mu), maximum antibonding contribution of a molecular orbital in the molecule (MAC), and maximum free valency of a C atom in the molecule (MFV). The results obtained showed the ability of the developed artificial neural network to predict the partition coefficients of organic compounds. The results also revealed the superiority of the ANN over the MLR and PLS models. Copyright 2009 Wiley Periodicals, Inc.
Bao, Jie; Hou, Zhangshuan; Huang, Maoyi; ...
2015-12-04
Here, effective sensitivity analysis approaches are needed to identify important parameters or factors and their uncertainties in complex Earth system models composed of multi-phase multi-component phenomena and multiple biogeophysical-biogeochemical processes. In this study, the impacts of 10 hydrologic parameters in the Community Land Model on simulations of runoff and latent heat flux are evaluated using data from a watershed. Different metrics, including residual statistics, the Nash-Sutcliffe coefficient, and log mean square error, are used as alternative measures of the deviations between the simulated and field observed values. Four sensitivity analysis (SA) approaches, including analysis of variance based on the generalized linear model, generalized cross validation based on the multivariate adaptive regression splines model, standardized regression coefficients based on a linear regression model, and analysis of variance based on support vector machine, are investigated. Results suggest that these approaches show consistent measurement of the impacts of major hydrologic parameters on response variables, but with differences in the relative contributions, particularly for the secondary parameters. The convergence behaviors of the SA with respect to the number of sampling points are also examined with different combinations of input parameter sets and output response variables and their alternative metrics. This study helps identify the optimal SA approach, provides guidance for the calibration of the Community Land Model parameters to improve the model simulations of land surface fluxes, and approximates the magnitudes to be adjusted in the parameter values during parametric model optimization.
Spreco, A; Eriksson, O; Dahlström, Ö; Timpka, T
2017-07-01
Methods for the detection of influenza epidemics and prediction of their progress have seldom been comparatively evaluated using prospective designs. This study aimed to perform a prospective comparative trial of algorithms for the detection and prediction of increased local influenza activity. Data on clinical influenza diagnoses recorded by physicians and syndromic data from a telenursing service were used. Five detection and three prediction algorithms previously evaluated in public health settings were calibrated and then evaluated over 3 years. When applied to diagnostic data, only detection using the Serfling regression method and prediction using the non-adaptive log-linear regression method showed acceptable performance during winter influenza seasons. For the syndromic data, none of the detection algorithms displayed satisfactory performance, while non-adaptive log-linear regression was the best performing prediction method. We conclude that the available algorithms for influenza detection and prediction display satisfactory performance when applied to local diagnostic data during winter influenza seasons. When applied to local syndromic data, the evaluated algorithms did not display consistent performance. Further evaluation and research on combining methods of these types in public health information infrastructures for 'nowcasting' (integrated detection and prediction) of influenza activity are warranted.
Statistical Methodology for the Analysis of Repeated Duration Data in Behavioral Studies.
Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre
2018-03-15
Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate to analyze such data, because they usually consist of nonnegative and skew-distributed variables. Therefore, we recommend use of a statistical methodology specific to duration data. We propose a methodology based on Cox mixed models and written under the R language. This semiparametric model is indeed flexible enough to fit duration data. To compare log-linear and Cox mixed models in terms of goodness-of-fit on real data sets, we also provide a procedure based on simulations and quantile-quantile plots. We present two examples from a data set of speech and gesture interactions, which illustrate the limitations of linear and log-linear mixed models, as compared to Cox models. The linear models are not validated on our data, whereas Cox models are. Moreover, in the second example, the Cox model exhibits a significant effect that the linear model does not. We provide methods to select the best-fitting models for repeated duration data and to compare statistical methodologies. In this study, we show that Cox models are best suited to the analysis of our data set.
Factors Associated With Surgery Clerkship Performance and Subsequent USMLE Step Scores.
Dong, Ting; Copeland, Annesley; Gangidine, Matthew; Schreiber-Gregory, Deanna; Ritter, E Matthew; Durning, Steven J
2018-03-12
We conducted an in-depth empirical investigation to achieve a better understanding of the surgery clerkship from multiple perspectives, including the influence of clerkship sequence on performance, the relationship between self-logged work hours and performance, and the association between surgery clerkship performance and subsequent USMLE Step exam scores. The study cohort consisted of medical students graduating between 2015 and 2018 (n = 687). The primary measures of interest were clerkship sequence (internal medicine clerkship before or after surgery clerkship), self-logged work hours during surgery clerkship, surgery NBME subject exam score, surgery clerkship overall grade, and Step 1, Step 2 CK, and Step 3 exam scores. We reported the descriptive statistics and conducted correlation analysis, stepwise linear regression analysis, and variable selection analysis of logistic regression to answer the research questions. Students who completed the internal medicine clerkship prior to the surgery clerkship performed better on the surgery subject exam. The subject exam score explained an additional 28% of the variance of the Step 2 CK score, and the clerkship overall score accounted for an additional 24% of the variance after the MCAT scores and undergraduate GPA were controlled. Our finding suggests that the clerkship sequence does matter when it comes to performance on the surgery NBME subject exam. Performance on the surgery subject exam is predictive of performance on subsequent USMLE Step exams. Copyright © 2018 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Jardínez, Christiaan; Vela, Alberto; Cruz-Borbolla, Julián; Alvarez-Mendez, Rodrigo J; Alvarado-Rodríguez, José G
2016-12-01
The relationship between the chemical structure and biological activity (log IC50) of 40 derivatives of 1,4-dihydropyridines (DHPs) was studied using density functional theory (DFT) and multiple linear regression analysis methods. With the aim of improving the quantitative structure-activity relationship (QSAR) model, the reduced density gradient s(r) of the optimized equilibrium geometries was used as a descriptor to include weak non-covalent interactions. The QSAR model highlights the correlation of log IC50 with the highest occupied molecular orbital energy (EHOMO), molecular volume (V), partition coefficient (log P), non-covalent interactions NCI(H4-G), and the dual descriptor [Δf(r)]. The model yielded values of R² = 79.57 and Q² = 69.67 that were validated with the following four internal analytical validations: DK = 0.076, DQ = -0.006, RP = 0.056, and RN = 0.000, and the external validation Q²boot = 64.26. The resulting QSAR model can be used to estimate biological activity with high reliability in new compounds based on a DHP series. Graphical abstract: The good correlation between log IC50 and the NCI(H4-G) estimated by the reduced density gradient approach of the DHP derivatives.
Rothenberg, Stephen J.; Rothenberg, Jesse C.
2005-01-01
Statistical evaluation of the dose–response function in lead epidemiology is rarely attempted. Economic evaluation of health benefits of lead reduction usually assumes a linear dose–response function, regardless of the outcome measure used. We reanalyzed a previously published study, an international pooled data set combining data from seven prospective lead studies examining contemporaneous blood lead effect on IQ (intelligence quotient) of 7-year-old children (n = 1,333). We constructed alternative linear multiple regression models with linear blood lead terms (linear–linear dose response) and natural-log–transformed blood lead terms (log-linear dose response). We tested the two lead specifications for nonlinearity in the models, compared the two lead specifications for significantly better fit to the data, and examined the effects of possible residual confounding on the functional form of the dose–response relationship. We found that a log-linear lead–IQ relationship was a significantly better fit than was a linear–linear relationship for IQ (p = 0.009), with little evidence of residual confounding of included model variables. We substituted the log-linear lead–IQ effect in a previously published health benefits model and found that the economic savings due to U.S. population lead decrease between 1976 and 1999 (from 17.1 μg/dL to 2.0 μg/dL) was 2.2 times ($319 billion) that calculated using a linear–linear dose–response function ($149 billion). The Centers for Disease Control and Prevention action limit of 10 μg/dL for children fails to protect against most damage and economic cost attributable to lead exposure. PMID:16140626
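The linear-linear versus log-linear specification contest can be illustrated with a small sketch. The data here are hypothetical (a log-linear truth plus noise), not the pooled seven-study IQ data, and the fit comparison uses raw residual sums of squares rather than the article's formal test:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical blood-lead (ug/dL) and IQ data generated from a
# log-linear (diminishing-returns) dose-response, plus noise.
lead = rng.uniform(1.0, 30.0, size=400)
iq = 105.0 - 4.0 * np.log(lead) + rng.normal(0.0, 3.0, size=400)

def rss(x, y):
    """Residual sum of squares from a simple least-squares line."""
    X = np.column_stack([np.ones_like(x), x])
    _, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(res[0])

# Compare the linear-linear and log-linear lead specifications:
rss_linear = rss(lead, iq)
rss_loglin = rss(np.log(lead), iq)
print(rss_loglin < rss_linear)  # log-linear fits this data better
```

Because a log-linear curve is steepest at low doses, the choice of specification drives the benefit estimates for population-wide blood-lead reductions, as the abstract's 2.2-fold difference shows.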
Chen, Yasheng; An, Hongyu; Zhu, Hongtu; Jewells, Valerie; Armao, Diane; Shen, Dinggang; Gilmore, John H.; Lin, Weili
2011-01-01
Although diffusion tensor imaging (DTI) has provided substantial insights into early brain development, most DTI studies based on fractional anisotropy (FA) and mean diffusivity (MD) may not capitalize on the information derived from the three principal diffusivities (e.g. eigenvalues). In this study, we explored the spatial and temporal evolution of white matter structures during early brain development using two geometrical diffusion measures, namely, linear (Cl) and planar (Cp) diffusion anisotropies, from 71 longitudinal datasets acquired from 29 healthy, full-term pediatric subjects. The growth trajectories were estimated with generalized estimating equations (GEE) using linear fitting with the logarithm of age (days). The presence of the white matter structures in Cl and Cp was observed in neonates, suggesting that both the cylindrical and fanning or crossing structures in various white matter regions may already have been formed at birth. Moreover, we found that both Cl and Cp evolved in a temporally nonlinear and spatially inhomogeneous manner. The growth velocities of Cl in central white matter were significantly higher than those in peripheral, or more laterally located, white matter: central growth velocity Cl = 0.0465±0.0273/log(days) versus peripheral growth velocity Cl = 0.0198±0.0127/log(days), p < 10⁻⁶. In contrast, the growth velocities of Cp in central white matter were significantly lower than those in peripheral white matter: central growth velocity Cp = 0.0014±0.0058/log(days) versus peripheral growth velocity Cp = 0.0289±0.0101/log(days), p < 10⁻⁶. Depending on the underlying white matter site analyzed, our findings suggest that ongoing physiologic and microstructural changes in the developing brain may exert different effects on the temporal evolution of these two geometrical diffusion measures. Thus, future studies utilizing DTI with correlative histological analysis in the study of early brain development are warranted. PMID:21784163
Grantz, Erin; Haggard, Brian; Scott, J Thad
2018-06-12
We calculated four median datasets (chlorophyll a, Chl a; total phosphorus, TP; and transparency) using multiple approaches to handling censored observations, including substituting fractions of the quantification limit (QL; dataset 1 = 1QL, dataset 2 = 0.5QL) and statistical methods for censored datasets (datasets 3-4), for approximately 100 Texas, USA reservoirs. Trend analyses of differences between dataset 1 and 3 medians indicated that percent difference increased linearly above thresholds in percent censored data (%Cen). This relationship was extrapolated to estimate medians for site-parameter combinations with %Cen > 80%, which were combined with dataset 3 as dataset 4. Changepoint analysis of Chl a-TP and transparency-TP relationships indicated threshold differences of up to 50% between datasets. Recursive analysis identified secondary thresholds in dataset 4. Threshold differences show that information introduced via substitution or missing due to limitations of statistical methods biased values, underestimated error, and inflated the strength of TP thresholds identified in datasets 1-3. Analysis of covariance identified differences in linear regression models relating transparency and TP between datasets 1, 2, and the more statistically robust datasets 3-4. Study findings identify high-risk scenarios for biased analytical outcomes when using substitution. These include a high probability of median overestimation when %Cen > 50-60% for a single QL, or when %Cen is as low as 16% for multiple QLs. Changepoint analysis was uniquely vulnerable to substitution effects when using medians from sites with %Cen > 50%. Linear regression analysis was less sensitive to substitution and missing data effects, but differences in model parameters for transparency cannot be discounted and could be magnified by log-transformation of the variables.
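The substitution bias at issue can be sketched with a toy censored series. The QL and values below are hypothetical, chosen so that half the observations are censored:

```python
import numpy as np

# Hypothetical censored water-quality series: TP values below a single
# quantification limit (QL = 0.05 mg/L) are censored. The two
# substitution approaches replace them with 1*QL or 0.5*QL before
# computing the site median.
ql = 0.05
observed = np.array([0.02, 0.03, 0.04, 0.045, 0.06, 0.08, 0.12, 0.20])
censored = observed < ql          # 4 of 8 observations (50%) censored

median_1ql = np.median(np.where(censored, ql, observed))
median_05ql = np.median(np.where(censored, 0.5 * ql, observed))

# With half the data censored, substituting the full QL pulls the
# median upward relative to 0.5*QL -- the substitution bias the study
# weighs against statistically robust censored-data estimators.
print(median_1ql > median_05ql)  # prints True
```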
ERIC Educational Resources Information Center
Si, Yajuan; Reiter, Jerome P.
2013-01-01
In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian,…
Competing regression models for longitudinal data.
Alencar, Airlane P; Singer, Julio M; Rocha, Francisco Marcelo M
2012-03-01
The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest-posttest longitudinal data. In particular, we consider log-normal linear mixed models (LNLMM), generalized linear mixed models (GLMM), and models based on generalized estimating equations (GEE). We show how some special features of the data, like a nonconstant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE-based models may be preferable when the goal is to compare the marginal expected responses. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A Tutorial on Multilevel Survival Analysis: Methods, Models and Applications
Austin, Peter C.
2017-01-01
Data that have a multilevel structure occur frequently across a range of disciplines, including epidemiology, health services research, public health, education and sociology. We describe three families of regression models for the analysis of multilevel survival data. First, Cox proportional hazards models with mixed effects incorporate cluster-specific random effects that modify the baseline hazard function. Second, piecewise exponential survival models partition the duration of follow-up into mutually exclusive intervals and fit a model that assumes that the hazard function is constant within each interval. This is equivalent to a Poisson regression model that incorporates the duration of exposure within each interval. By incorporating cluster-specific random effects, generalised linear mixed models can be used to analyse these data. Third, after partitioning the duration of follow-up into mutually exclusive intervals, one can use discrete time survival models that use a complementary log–log generalised linear model to model the occurrence of the outcome of interest within each interval. Random effects can be incorporated to account for within-cluster homogeneity in outcomes. We illustrate the application of these methods using data consisting of patients hospitalised with a heart attack. We illustrate the application of these methods using three statistical programming languages (R, SAS and Stata). PMID:29307954
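The equivalence that the piecewise exponential family rests on (a constant hazard within an interval is the same model as a Poisson count with a log person-time offset) can be checked numerically. The counts below are hypothetical, and the Poisson likelihood is maximized by a simple grid search rather than a GLM fitter:

```python
import numpy as np

# One interval of a piecewise exponential model (hypothetical data):
d, T = 7, 95.0   # events and total person-time in the interval

# Exponential survival MLE of the constant hazard has a closed form:
hazard_closed = d / T

# Poisson view: d ~ Poisson(lam * T), i.e. a log-link model with
# offset log(T). Maximizing the Poisson log-likelihood over a dense
# grid recovers the same hazard, illustrating the equivalence.
grid = np.linspace(1e-4, 0.5, 20001)
loglik = d * np.log(grid * T) - grid * T
hazard_poisson = grid[np.argmax(loglik)]

print(abs(hazard_closed - hazard_poisson) < 1e-3)  # prints True
```

This is the same algebraic identity invoked in the Poisson log-linear hypnogram models above, which is why interval-partitioned survival data can be handed to standard generalised linear mixed model software.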
Marrero-Ponce, Yovani; Martínez-Albelo, Eugenio R; Casañola-Martín, Gerardo M; Castillo-Garit, Juan A; Echevería-Díaz, Yunaimy; Zaldivar, Vicente Romero; Tygat, Jan; Borges, José E Rodriguez; García-Domenech, Ramón; Torrens, Francisco; Pérez-Giménez, Facundo
2010-11-01
Novel bond-level molecular descriptors are proposed, based on linear maps similar to the ones defined in algebra theory. The kth edge-adjacency matrix (E(k)) denotes the matrix of bond linear indices (non-stochastic) with regard to the canonical basis set. The kth stochastic edge-adjacency matrix, ES(k), is here proposed as a new molecular representation easily calculated from E(k). Then, the kth stochastic bond linear indices are calculated using ES(k) as operators of linear transformations. In both cases, the bond-type formalism is developed. The kth non-stochastic and stochastic total linear indices are calculated by adding the kth non-stochastic and stochastic bond linear indices, respectively, of all bonds in the molecule. First, the new bond-based molecular descriptors (MDs) are tested for suitability for the QSPRs by analyzing regressions of the novel indices for selected physicochemical properties of octane isomers (first round). The general performance of the new descriptors in these QSPR studies is evaluated with regard to the well-known sets of 2D/3D MDs. From the analysis, we can conclude that the non-stochastic and stochastic bond-based linear indices have an overall good modeling capability, proving their usefulness in QSPR studies. Later, the novel bond-level MDs are also used for the description and prediction of the boiling point of 28 alkyl-alcohols (second round), and for the modeling of the specific rate constant (log k), partition coefficient (log P), as well as the antibacterial activity of 34 derivatives of 2-furylethylenes (third round). The comparison with other approaches (edge- and vertex-based connectivity indices, total and local spectral moments, and quantum chemical descriptors as well as E-state/biomolecular encounter parameters) shows the good behavior of our method in these QSPR studies.
Finally, the approach described in this study appears to be a very promising structural invariant, useful not only for QSPR studies but also for similarity/diversity analysis and drug discovery protocols.
Oh, Eric J; Shepherd, Bryan E; Lumley, Thomas; Shaw, Pamela A
2018-04-15
For time-to-event outcomes, a rich literature exists on the bias introduced by covariate measurement error in regression models, such as the Cox model, and methods of analysis to address this bias. By comparison, less attention has been given to understanding the impact or addressing errors in the failure time outcome. For many diseases, the timing of an event of interest (such as progression-free survival or time to AIDS progression) can be difficult to assess or reliant on self-report and therefore prone to measurement error. For linear models, it is well known that random errors in the outcome variable do not bias regression estimates. With nonlinear models, however, even random error or misclassification can introduce bias into estimated parameters. We compare the performance of 2 common regression models, the Cox and Weibull models, in the setting of measurement error in the failure time outcome. We introduce an extension of the SIMEX method to correct for bias in hazard ratio estimates from the Cox model and discuss other analysis options to address measurement error in the response. A formula to estimate the bias induced into the hazard ratio by classical measurement error in the event time for a log-linear survival model is presented. Detailed numerical studies are presented to examine the performance of the proposed SIMEX method under varying levels and parametric forms of the error in the outcome. We further illustrate the method with observational data on HIV outcomes from the Vanderbilt Comprehensive Care Clinic. Copyright © 2017 John Wiley & Sons, Ltd.
Separate-channel analysis of two-channel microarrays: recovering inter-spot information.
Smyth, Gordon K; Altman, Naomi S
2013-05-26
Two-channel (or two-color) microarrays are cost-effective platforms for comparative analysis of gene expression. They are traditionally analysed in terms of the log-ratios (M-values) of the two channel intensities at each spot, but this analysis does not use all the information available in the separate channel observations. Mixed models have been proposed to analyse intensities from the two channels as separate observations, but such models can be complex to use and the gain in efficiency over the log-ratio analysis is difficult to quantify. Mixed models yield test statistics for which the null distributions can be specified only approximately, and some approaches do not borrow strength between genes. This article reformulates the mixed model to clarify the relationship with the traditional log-ratio analysis, to facilitate information borrowing between genes, and to obtain an exact distributional theory for the resulting test statistics. The mixed model is transformed to operate on the M-values and A-values (average log-expression for each spot) instead of on the log-expression values. The log-ratio analysis is shown to ignore information contained in the A-values. The relative efficiency of the log-ratio analysis is shown to depend on the size of the intraspot correlation. A new separate channel analysis method is proposed that assumes a constant intra-spot correlation coefficient across all genes. This approach permits the mixed model to be transformed into an ordinary linear model, allowing the data analysis to use a well-understood empirical Bayes analysis pipeline for linear modeling of microarray data. This yields statistically powerful test statistics that have an exact distributional theory. The log-ratio, mixed model and common correlation methods are compared using three case studies. The results show that separate channel analyses that borrow strength between genes are more powerful than log-ratio analyses. 
The common correlation analysis is the most powerful of all. The common correlation method proposed in this article for separate-channel analysis of two-channel microarray data is no more difficult to apply in practice than the traditional log-ratio analysis. It provides an intuitive and powerful means to conduct analyses and make comparisons that might otherwise not be possible.
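The M-value/A-value transformation at the heart of the reformulated model can be sketched with simulated intensities (this is an illustration of the transformation, not the article's empirical Bayes pipeline):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated red/green channel intensities for a handful of spots.
R = rng.lognormal(8.0, 1.0, size=6)
G = rng.lognormal(8.0, 1.0, size=6)

# M-value (log-ratio) and A-value (average log-expression) per spot:
M = np.log2(R) - np.log2(G)
A = 0.5 * (np.log2(R) + np.log2(G))

# (M, A) is an invertible rotation of the two log-intensities, so
# analysing both channels via (M, A) retains the separate-channel
# information that an M-only (log-ratio) analysis discards.
R_back = 2.0 ** (A + M / 2.0)
G_back = 2.0 ** (A - M / 2.0)
print(np.allclose(R_back, R) and np.allclose(G_back, G))  # prints True
```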
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of acute myocardial infarction (AMI) incidence in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trend (Cochran-Armitage trend P value
Serum Fetuin-A Levels and Thyroid Function in Middle-aged and Elderly Chinese.
Deng, Xin Ru; Ding, Lin; Wang, Tian Ge; Xu, Min; Lu, Jie Li; Li, Mian; Zhao, Zhi Yun; Chen, Yu Hong; Bi, Yu Fang; Xu, Yi Ping; Xu, Yu
2017-06-01
Serum fetuin-A levels are reportedly elevated in hyperthyroidism. However, there are few relevant epidemiologic studies. We conducted a cross-sectional study in Songnan community, China in 2009 to investigate the association between serum fetuin-A concentrations and thyroid function. A total of 2,984 participants aged 40 years and older were analyzed. Multivariable linear regression analysis revealed that serum fetuin-A concentrations were positively associated with log (free triiodothyronine) and were inversely associated with log (thyroid peroxidase antibody) after adjustment (both P < 0.05). Compared with the participants in the lowest tertile of free triiodothyronine and free thyroxine level, those in the highest tertile had higher fetuin-A concentrations. Additionally, high serum fetuin-A concentrations were related to high thyroid function (odds ratio 1.27, 95% confidence interval 1.01-1.61), after adjustment for conventional risk factors. Copyright © 2017 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lipfert, F.W.
1992-11-01
1980 data from up to 149 metropolitan areas were used to define cross-sectional associations between community air pollution and excess human mortality. The regression model proposed by Oezkaynak and Thurston, which accounted for age, race, education, poverty, and population density, was evaluated and several new models were developed. The new models also accounted for population change, drinking water hardness, and smoking, and included a more detailed description of race. Cause-of-death categories analyzed include all causes, all non-external causes, major cardiovascular diseases, and chronic obstructive pulmonary diseases (COPD). Both annual mortality rates and their logarithms were analyzed. The data on particulates were averaged across all monitoring stations available for each SMSA and the TSP data were restricted to the year 1980. The associations between mortality and air pollution were found to be dependent on the socioeconomic factors included in the models, the specific locations included in the data set, and the type of statistical model used. Statistically significant associations were found between TSP and mortality due to non-external causes with log-linear models, but not with a linear model, and between TSP and COPD mortality for both linear and log-linear models. When the sulfate contribution to TSP was subtracted, the relationship with COPD mortality was strengthened. Scatter plots and quintile analyses suggested a TSP threshold for COPD mortality at around 65 µg/m³ (annual average). SO₄²⁻, Mn, PM₁₅, and PM₂.₅ were not significantly associated with mortality using the new models.
Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.
Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E
2018-03-01
Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.
Biostatistics Series Module 10: Brief Overview of Multivariate Methods.
Hazra, Avijit; Gogtay, Nithya
2017-01-01
Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, which make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with wider availability, and increasing sophistication of statistical software and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.
Plasma myelin basic protein assay using Gilford enzyme immunoassay cuvettes.
Groome, N P
1981-10-01
The assay of myelin basic protein in body fluids has potential clinical importance as a routine indicator of demyelination. Preliminary details of a competitive enzyme immunoassay for this protein have previously been published by the author (Groome, N. P. (1980) J. Neurochem. 35, 1409-1417). The present paper describes the adaptation of this assay for use on human plasma and various aspects of routine data processing. A commercially available cuvette system was found to have advantages over microtitre plates but required a permuted arrangement of sample replicates for consistent results. For dose interpolation, the standard curve could be fitted to a three-parameter non-linear equation by regression analysis or linearised by the logit/log transformation.
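The logit/log linearisation and dose interpolation can be sketched as follows. The standard curve here is hypothetical (a clean logistic with midpoint at dose 25), not the published assay data:

```python
import numpy as np

# Hypothetical competitive-immunoassay standard curve: the response y
# (fraction of maximal binding) falls with dose along a logistic
# shape, so logit(y) is linear in log(dose).
dose = np.array([1.0, 3.0, 10.0, 30.0, 100.0, 300.0])
y = 1.0 / (1.0 + (dose / 25.0) ** 1.2)   # simulated standards

logit = np.log(y / (1.0 - y))
x = np.log(dose)

# Straight-line fit of logit(y) on log(dose); interpolation inverts it.
slope, intercept = np.polyfit(x, logit, 1)

def interpolate_dose(y_unknown):
    """Read a dose off the linearised standard curve."""
    return np.exp((np.log(y_unknown / (1.0 - y_unknown)) - intercept) / slope)

# A sample reading halfway down the curve maps back to the midpoint.
print(round(float(interpolate_dose(0.5)), 1))  # prints 25.0
```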
Wu, Zilan; Lin, Tian; Li, Zhongxia; Jiang, Yuqing; Li, Yuanyuan; Yao, Xiaohong; Gao, Huiwang; Guo, Zhigang
2017-11-01
We measured 15 parent polycyclic aromatic hydrocarbons (PAHs) in atmosphere and water during a research cruise from the East China Sea (ECS) to the northwestern Pacific Ocean (NWP) in the spring of 2015 to investigate the occurrence, air-sea gas exchange, and gas-particle partitioning of PAHs, with a particular focus on the influence of East Asian continental outflow. The gaseous PAH composition and identification of sources were consistent with PAHs from the upwind area, indicating that the gaseous PAHs (three- to five-ring PAHs) were influenced by upwind land pollution. In addition, air-sea exchange fluxes of gaseous PAHs were estimated to be -54.2 to 107.4 ng m⁻² d⁻¹, indicative of variations in land-based PAH inputs. The logarithmic gas-particle partition coefficient (log Kp) of PAHs regressed linearly against the logarithmic subcooled liquid vapor pressure (log PL0), with a slope of -0.25. This was significantly larger than the theoretical value (-1), implying disequilibrium between the gaseous and particulate PAHs over the NWP. The non-equilibrium of PAH gas-particle partitioning was shielded from the volatilization of three-ring gaseous PAHs from seawater and lower soot concentrations in particular when the oceanic air masses prevailed. Modeling PAH absorption into organic matter and adsorption onto soot carbon revealed that the status of PAH gas-particle partitioning deviated more from the modeled Kp for oceanic air masses than for continental air masses, which coincided with higher volatilization of three-ring PAHs and confirmed the influence of air-sea exchange. Meanwhile, significant linear regressions between log Kp and log Koa (log Ksa) for PAHs were observed for continental air masses, suggesting the dominant effect of East Asian continental outflow on atmospheric PAHs over the NWP during the sampling campaign. Copyright © 2017 Elsevier Ltd. All rights reserved.
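The slope diagnostic for partitioning disequilibrium can be sketched with synthetic congener data. Only the -0.25 slope is taken from the abstract; the pressures, intercept, and noise level below are made up:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical gas-particle partitioning data: log Kp regressed on the
# log subcooled liquid vapor pressure (log PL0) across PAH congeners.
logPL0 = rng.uniform(-6.0, 0.0, size=40)
logKp = -0.25 * logPL0 - 4.0 + rng.normal(0.0, 0.1, size=40)

slope, intercept = np.polyfit(logPL0, logKp, 1)

# A slope near -1 would indicate equilibrium partitioning; a much
# shallower slope (here around -0.25, as reported for the NWP samples)
# signals disequilibrium between gaseous and particulate PAHs.
print(slope > -1.0)  # prints True
```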
Papadimitriou, Konstantinos I.; Liu, Shih-Chii; Indiveri, Giacomo; Drakakis, Emmanuel M.
2014-01-01
The field of neuromorphic silicon synapse circuits is revisited and a parsimonious mathematical framework able to describe the dynamics of this class of log-domain circuits in the aggregate and in a systematic manner is proposed. Starting from the Bernoulli Cell Formalism (BCF), originally formulated for the modular synthesis and analysis of externally linear, time-invariant logarithmic filters, and by means of the identification of new types of Bernoulli Cell (BC) operators presented here, a generalized formalism (GBCF) is established. The expanded formalism covers two new possible and practical combinations of a MOS transistor (MOST) and a linear capacitor. The corresponding mathematical relations codifying each case are presented and discussed through the tutorial treatment of three well-known transistor-level examples of log-domain neuromorphic silicon synapses. The proposed mathematical tool unifies past analysis approaches of the same circuits under a common theoretical framework. The speed advantage of the proposed mathematical framework as an analysis tool is also demonstrated by a compelling comparative circuit analysis example of high order, where the GBCF and another well-known log-domain circuit analysis method are used for the determination of the input-output transfer function of the high (4th) order topology. PMID:25653579
Barth, Nancy A.; Veilleux, Andrea G.
2012-01-01
The U.S. Geological Survey (USGS) is currently updating at-site flood frequency estimates for USGS streamflow-gaging stations in the desert region of California. The at-site flood-frequency analysis is complicated by short record lengths (less than 20 years is common) and numerous zero flows/low outliers at many sites. Estimates of the three parameters (mean, standard deviation, and skew) required for fitting the log Pearson Type 3 (LP3) distribution are likely to be highly unreliable based on the limited and heavily censored at-site data. In a generalization of the recommendations in Bulletin 17B, a regional analysis was used to develop regional estimates of all three parameters (mean, standard deviation, and skew) of the LP3 distribution. A regional skew value of zero from a previously published report was used with a new estimated mean squared error (MSE) of 0.20. A weighted least squares (WLS) regression method was used to develop both a regional standard deviation model and a regional mean model based on annual peak-discharge data for 33 USGS stations throughout California's desert region. At-site standard deviation and mean values were determined by using an expected moments algorithm (EMA) method for fitting the LP3 distribution to the logarithms of annual peak-discharge data. Additionally, a multiple Grubbs-Beck (MGB) test, a generalization of the test recommended in Bulletin 17B, was used for detecting multiple potentially influential low outliers in a flood series. The WLS regression found that no basin characteristics could explain the variability of the standard deviation. Consequently, a constant regional standard deviation model was selected, resulting in a log-space value of 0.91 with a MSE of 0.03 log units. However, drainage area was found to be statistically significant in explaining the site-to-site variability in the mean. The linear WLS regional mean model based on drainage area had a pseudo-R² of 51 percent and a MSE of 0.32 log units. 
The regional parameter estimates were then used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins. The final equations are functions of drainage area. Average standard errors of prediction for these regression equations range from 214.2 to 856.2 percent.
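The regional mean model described above can be sketched as an inverse-variance weighted least squares fit of at-site log-space means against log drainage area. All station values below are hypothetical, purely to illustrate the WLS mechanics; they are not the report's data.

```python
import numpy as np

# Hypothetical at-site statistics for a handful of gauging stations:
# log10 drainage area (sq mi), at-site mean of log10 peak discharge,
# and a per-site sampling variance used as the WLS weight.
log_area = np.array([0.5, 1.2, 1.8, 2.3, 2.9])
site_mean = np.array([1.1, 1.6, 2.0, 2.5, 2.8])
site_var = np.array([0.10, 0.05, 0.04, 0.08, 0.06])

# Weighted least squares: weight each site by the inverse of its variance,
# so short-record (high-variance) sites influence the regional model less.
W = np.diag(1.0 / site_var)
X = np.column_stack([np.ones_like(log_area), log_area])
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ site_mean)
print(beta)  # [intercept, slope] of the regional mean model
```

A constant regional model, as selected for the standard deviation, corresponds to dropping the slope column and fitting the weighted mean alone.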
Van Gemert-Pijnen, Julia Ewc; Kelders, Saskia M; Bohlmeijer, Ernst T
2014-01-31
Web-based interventions for the early treatment of depressive symptoms can be considered effective in reducing mental complaints. However, there is a limited understanding of which elements in an intervention contribute to effectiveness. For efficiency and effectiveness of interventions, insight is needed into the use of content and persuasive features. The aims of this study were (1) to illustrate how log data can be used to understand the uptake of the content of a Web-based intervention that is based on the acceptance and commitment therapy (ACT) and (2) to discover how log data can be of value for improving the incorporation of content in Web-based interventions. Data from 206 participants (out of 239) who started the first nine lessons of the Web-based intervention, Living to the Full, were used for a secondary analysis of a subset of the log data of the parent study about adherence to the intervention. The log files used in this study were, per lesson: login, start mindfulness, download mindfulness, view success story, view feedback message, start multimedia, turn on text-message coach, turn off text-message coach, and view text message. Differences in usage between lessons were explored with repeated measures ANOVAs (analysis of variance). Differences between groups were explored with one-way ANOVAs. To explore the possible predictive value of the login per lesson quartiles on the outcome measures, four linear regressions were used with login quartiles as predictor and with the outcome measures (Center for Epidemiologic Studies-Depression [CES-D] and the Hospital Anxiety and Depression Scale-Anxiety [HADS-A] on post-intervention and follow-up) as dependent variables. A significant decrease in logins and in the use of content and persuasive features over time was observed. The usage of features varied significantly during the treatment process.
The usage of persuasive features increased during the third part of the ACT (commitment to value-based living), which might indicate that at that stage motivational support was relevant. Higher logins over time (9 weeks) corresponded with a higher usage of features (in most cases significant); when predicting depressive symptoms at post-intervention, the linear regression yielded a significant model with login quartile as a significant predictor (explained variance is 2.7%). A better integration of content and persuasive features in the design of the intervention and a better intra-usability of features within the system are needed to identify which combination of features works best for whom. Pattern recognition can be used to tailor the intervention based on usage patterns from the earlier lessons and to support the uptake of content essential for therapy. An adaptable interface for a modular composition of therapy features presupposes a dynamic approach for Web-based treatment; not a predefined path for all, but a flexible way to go through all features that have to be used.
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Cho, In-Jeong; Chang, Hyuk-Jae; Heo, Ran; Kim, In-Cheol; Sung, Ji Min; Chang, Byung-Chul; Shim, Chi Young; Hong, Geu-Ru; Chung, Namsik
2017-01-01
Substantial aortic calcification is known to be associated with aortic stiffening and subsequent left ventricular (LV) hypertrophy. This study examined whether the thoracic aorta calcium score (TACS) is related to LV hypertrophy and whether it leads to an adverse prognosis in patients with severe aortic stenosis (AS) after aortic valve replacement (AVR). We retrospectively reviewed 47 patients (mean age, 64 ± 11 years) with isolated severe AS who underwent noncontrast computed tomography of the entire thoracic aorta and who received AVR. TACS was quantified using the volume method, with values log-transformed (log[TACS+1]). Transthoracic echocardiography was performed before and 1 year after the operation. Preoperative LV mass index (LVMI) displayed significant positive correlations with male gender (r = 0.430, p = 0.010) and log(TACS+1) (r = 0.556, p = 0.003). In multivariate linear regression analysis, only log(TACS+1) was independently associated with LVMI, even after adjusting for age, gender, transaortic mean pressure gradient, and coronary or valve calcium score. At 1 year of follow-up echocardiography, independent determinants of postoperative LVMI included log(TACS+1) and preoperative LVMI, adjusting for age, gender, indexed effective orifice area, and coronary or valve calcium score. During a median follow-up period of 54 months after AVR, there were 10 events (21%): 4 deaths from all causes, 3 strokes, 2 inpatient admissions for heart failure, and 1 myocardial infarction. The event-free survival rate was significantly lower for patients with TACS of 2,257 mm³ or higher compared with those whose TACS was lower than 2,257 mm³ (log-rank p < 0.001). High TACS was associated with increased LVMI among patients with severe AS. Further, high TACS usefully predicted less regression of LVMI and poor clinical outcomes after AVR.
TACS may serve as a useful proxy for predicting LV remodeling and adverse prognosis in patients with severe AS undergoing AVR. Copyright © 2017 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
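The log(TACS+1) transform used above is the standard device for right-skewed calcium scores: the logarithm symmetrizes the distribution and the +1 keeps zero-calcium patients defined. A minimal sketch with entirely made-up patient values (not the study's data):

```python
import numpy as np
from scipy import stats

# Illustrative values only: TACS volumes (mm^3) and LV mass index.
tacs = np.array([120.0, 850.0, 2300.0, 4100.0, 9800.0, 400.0])
lvmi = np.array([95.0, 110.0, 128.0, 140.0, 155.0, 102.0])

# Log-transform with +1 so a zero calcium score maps to log(1) = 0.
log_tacs = np.log(tacs + 1.0)
r, p = stats.pearsonr(log_tacs, lvmi)
print(f"r = {r:.3f}, p = {p:.4f}")
```

In the study this correlation is then carried into a multivariate linear regression with the clinical covariates as additional columns.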
Cheng, Ta-Chun; Tung, Yi-Ching; Chu, Pei-Yu; Chuang, Chih-Hung; Hsieh, Yuan-Chin; Huang, Chien-Chiao; Wang, Yeng-Tseng; Kao, Chien-Han; Roffler, Steve R.; Cheng, Tian-Lu
2016-01-01
Molecular weight markers that can tolerate denaturing conditions and be auto-detected by secondary antibodies offer great efficacy and convenience for Western Blotting. Here, we describe M&R LE protein markers which contain linear epitopes derived from the heavy chain constant regions of mouse and rabbit immunoglobulin G (IgG Fc LE). These markers can be directly recognized and stained by a wide range of anti-mouse and anti-rabbit secondary antibodies. We selected three mouse (M1, M2 and M3) linear IgG1 and three rabbit (R1, R2 and R3) linear IgG heavy chain epitope candidates based on their respective crystal structures. Western blot analysis indicated that M2 and R2 linear epitopes are effectively recognized by anti-mouse and anti-rabbit secondary antibodies, respectively. We fused the M2 and R2 epitopes (M&R LE) and incorporated the polypeptide in a range of 15–120 kDa auto-detecting markers (M&R LE protein marker). The M&R LE protein marker can be auto-detected by anti-mouse and anti-rabbit IgG secondary antibodies in standard immunoblots. Linear regression analysis of the M&R LE protein marker plotted as gel mobility versus the log of the marker molecular weights revealed good linearity with a correlation coefficient R2 value of 0.9965, indicating that the M&R LE protein marker displays high accuracy for determining protein molecular weights. This accurate, regular and auto-detected M&R LE protein marker may provide a simple, efficient and economical tool for protein analysis. PMID:27494183
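The linearity the abstract reports (R2 = 0.9965 for gel mobility versus log molecular weight) is the basis for sizing unknown bands. A sketch with hypothetical ladder values, not the paper's measurements:

```python
import numpy as np

# Hypothetical ladder: relative gel mobility (Rf) of marker bands of known size (kDa).
rf = np.array([0.90, 0.75, 0.62, 0.48, 0.35, 0.20, 0.10])
kda = np.array([15.0, 25.0, 35.0, 50.0, 70.0, 100.0, 120.0])

# Gel mobility is approximately linear in log10(molecular weight);
# fit the standard curve, then invert it for an unknown band.
slope, intercept = np.polyfit(rf, np.log10(kda), 1)

unknown_rf = 0.55
est_kda = 10 ** (slope * unknown_rf + intercept)
print(f"estimated size ~ {est_kda:.1f} kDa")
```

The slope is negative because smaller proteins migrate farther (higher Rf).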
Model-based Bayesian inference for ROC data analysis
NASA Astrophysics Data System (ADS)
Lei, Tianhu; Bae, K. Ty
2013-03-01
This paper presents a study of model-based Bayesian inference applied to Receiver Operating Characteristic (ROC) data. The model is a simple version of a general non-linear regression model. Unlike the Dorfman model, it uses a probit link function with a binary (zero-one) covariate to express the binormal distributions in a single formula. The model also includes a scale parameter. Bayesian inference is implemented by the Markov Chain Monte Carlo (MCMC) method, carried out by Bayesian inference Using Gibbs Sampling (BUGS). In contrast to classical statistical theory, the Bayesian approach treats model parameters as random variables characterized by prior distributions. With a substantial number of simulated samples generated by the sampling algorithm, posterior distributions of parameters, as well as the parameters themselves, can be accurately estimated. MCMC-based BUGS adopts the Adaptive Rejection Sampling (ARS) protocol, which requires that the probability density function (pdf) from which samples are drawn be log-concave with respect to the targeted parameters. Our study corrects a common misconception and proves that the pdf of this regression model is log-concave with respect to its scale parameter. Therefore, ARS's requirement is satisfied, and a Gaussian prior, which is conjugate and possesses many analytic and computational advantages, is assigned to the scale parameter. A cohort of 20 simulated data sets and 20 simulations from each data set are used in our study. Output analysis and convergence diagnostics for the MCMC method are assessed by the CODA package. Models and methods using a continuous Gaussian prior and a discrete categorical prior are compared. Intensive simulations and performance measures are given to illustrate our practice in the framework of model-based Bayesian inference using the MCMC method.
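The binormal ROC model behind the probit link can be written in closed form: with separation a and slope b, TPF = Phi(a + b Phi^{-1}(FPF)) and AUC = Phi(a / sqrt(1 + b^2)). A quick numeric check (parameter values are illustrative, not from the paper):

```python
import numpy as np
from scipy.stats import norm

a, b = 1.5, 0.8  # illustrative binormal parameters

# The ROC curve under the binormal model is exactly a probit link
# in the transformed covariate Phi^{-1}(FPF).
fpf = np.linspace(1e-6, 1 - 1e-6, 20001)
tpf = norm.cdf(a + b * norm.ppf(fpf))

# Closed-form AUC, checked against a trapezoid-rule integration of the curve.
auc_closed = norm.cdf(a / np.sqrt(1 + b ** 2))
auc_numeric = float(np.sum((tpf[1:] + tpf[:-1]) * np.diff(fpf)) / 2)
print(auc_closed, auc_numeric)
```

A Bayesian treatment, as in the paper, would place priors on a, b, and the scale parameter and sample their posterior by MCMC rather than evaluate the curve directly.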
Statistical analysis of dendritic spine distributions in rat hippocampal cultures
2013-01-01
Background Dendritic spines serve as key computational structures in brain plasticity. Much remains to be learned about their spatial and temporal distribution among neurons. Our aim in this study was to perform exploratory analyses based on the population distributions of dendritic spines with regard to their morphological characteristics and period of growth in dissociated hippocampal neurons. We fit a log-linear model to the contingency table of spine features such as spine type and distance from the soma to first determine which features were important in modeling the spines, as well as the relationships between such features. A multinomial logistic regression was then used to predict the spine types using the features suggested by the log-linear model, along with neighboring spine information. Finally, an important variant of Ripley’s K-function applicable to linear networks was used to study the spatial distribution of spines along dendrites. Results Our study indicated that in the culture system, (i) dendritic spine densities were "completely spatially random", (ii) spine type and distance from the soma were independent quantities, and most importantly, (iii) spines had a tendency to cluster with other spines of the same type. Conclusions Although these results may vary with other systems, our primary contribution is the set of statistical tools for morphological modeling of spines which can be used to assess neuronal cultures following gene manipulation such as RNAi, and to study induced pluripotent stem cells differentiated to neurons. PMID:24088199
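For a two-way contingency table, the log-linear model of independence is equivalent to the classical chi-square test of independence, which gives a quick first check of the spine-type-by-distance question. The counts below are hypothetical:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: spine type (rows: stubby, mushroom, thin)
# by distance-from-soma band (columns: proximal, middle, distal).
table = np.array([
    [30, 28, 32],
    [45, 47, 43],
    [25, 25, 25],
])

# A large p-value is consistent with independence of type and distance,
# matching finding (ii) of the abstract.
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
```

The full log-linear analysis in the paper generalizes this to higher-way tables of spine features and lets interaction terms be tested individually.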
Factors relating to windblown dust in associations between ...
Introduction: Effect estimates of city-specific PM2.5-mortality associations across the United States (US) show a substantial amount of spatial heterogeneity. Some of this heterogeneity may be due to the mass distribution of PM; areas where PM2.5 is likely to be dominated by large size fractions (above 1 micron; e.g., from windblown dust) may have a weaker association with mortality. Methods: Log rate ratios (betas) for the PM2.5-mortality association—derived from a model adjusting for time, an interaction with age group, day of week, and natural splines of current temperature, current dew point, and unconstrained temperature at lags 1, 2, and 3, for 313 core-based statistical areas (CBSA) and their metropolitan divisions (MD) over 1999-2005—were used as the outcome. Using inverse-variance weighted linear regression, we examined the change in log rate ratios in association with the PM10-PM2.5 correlation as a marker of windblown dust/higher PM size fraction; linearity of associations was assessed in models using splines with knots at quintile values. Results: The weighted mean PM2.5 association (0.96 percent increase in total non-accidental mortality for a 10 ug/m3 increment in PM2.5) increased by 0.34 (95% confidence interval: 0.20, 0.48) per interquartile change (0.25) in the PM10-PM2.5 correlation, and explained approximately 8% of the observed heterogeneity; the association was linear based on spline analysis. Conclusions: Preliminary results pro
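The second-stage model here is an inverse-variance weighted regression of city betas on a city-level covariate. A minimal sketch with invented city values (not the study's estimates):

```python
import numpy as np

# Hypothetical second-stage data: city-specific log rate ratios (betas) for the
# PM2.5-mortality association, their variances, and the PM10-PM2.5 correlation.
beta = np.array([0.0008, 0.0010, 0.0013, 0.0006, 0.0015, 0.0011])
var = np.array([2e-7, 1e-7, 3e-7, 4e-7, 2e-7, 1e-7])
corr = np.array([0.30, 0.45, 0.60, 0.25, 0.70, 0.50])

# Inverse-variance weighting: cities with more precise first-stage betas
# get proportionally more weight in the second-stage fit.
w = 1.0 / var
X = np.column_stack([np.ones_like(corr), corr])
WX = X * w[:, None]
coef = np.linalg.solve(X.T @ WX, X.T @ (w * beta))
print(coef)  # [intercept, slope per unit change in PM10-PM2.5 correlation]
```

Multiplying the slope by the interquartile range of the covariate gives the "per interquartile change" contrast reported in the abstract.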
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kwon, Deukwoo; Little, Mark P.; Miller, Donald L.
Purpose: To determine more accurate regression formulas for estimating peak skin dose (PSD) from reference air kerma (RAK) or kerma-area product (KAP). Methods: After grouping of the data from 21 procedures into 13 clinically similar groups, assessments were made of optimal clustering using the Bayesian information criterion to obtain the optimal linear regressions of (log-transformed) PSD vs RAK, PSD vs KAP, and PSD vs RAK and KAP. Results: Three clusters of clinical groups were optimal in the regression of PSD vs RAK, seven clusters of clinical groups were optimal in the regression of PSD vs KAP, and six clusters of clinical groups were optimal in the regression of PSD vs RAK and KAP. Prediction of PSD using both RAK and KAP is significantly better than prediction of PSD with either RAK or KAP alone. The regression of PSD vs RAK provided better predictions of PSD than the regression of PSD vs KAP. The partial-pooling (clustered) method yields smaller mean squared errors compared with the complete-pooling method. Conclusion: PSD distributions for interventional radiology procedures are log-normal. Estimates of PSD derived from RAK and KAP jointly are most accurate, followed closely by estimates derived from RAK alone. Estimates of PSD derived from KAP alone are the least accurate. Using a stochastic search approach, it is possible to cluster together certain dissimilar types of procedures to minimize the total error sum of squares.
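Because PSD is roughly log-normal, the regressions above work on log-transformed variables. A complete-pooling sketch (no clustering) with made-up dose values:

```python
import numpy as np

# Illustrative dose data: reference air kerma (mGy), kerma-area product
# (Gy*cm^2), and measured peak skin dose (mGy) for a few procedures.
rak = np.array([400.0, 900.0, 1500.0, 2600.0, 4000.0])
kap = np.array([30.0, 70.0, 110.0, 220.0, 310.0])
psd = np.array([350.0, 700.0, 1300.0, 2100.0, 3400.0])

# Regress log(PSD) on log(RAK) and log(KAP) jointly, matching the
# abstract's best-performing model form.
X = np.column_stack([np.ones_like(rak), np.log(rak), np.log(kap)])
coef, *_ = np.linalg.lstsq(X, np.log(psd), rcond=None)

# Back-transform to obtain PSD predictions on the original scale.
pred = np.exp(X @ coef)
print(coef, pred)
```

The partial-pooling version would fit separate coefficients per cluster of clinical groups, with the cluster assignment chosen by BIC.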
Kalkanis, Alexandros; Kalkanis, Dimitrios; Drougas, Dimitrios; Vavougios, George D; Datseris, Ioannis; Judson, Marc A; Georgiou, Evangelos
2016-03-01
The objective of our study was to assess the possible relationship between splenic F-18-fluorodeoxyglucose (18F-FDG) uptake and other established biochemical markers of sarcoidosis activity. Thirty treatment-naive sarcoidosis patients were prospectively enrolled in this study. They underwent biochemical laboratory tests, including serum interleukin-2 receptor (sIL-2R), serum C-reactive protein, serum angiotensin-I converting enzyme, and 24-h urine calcium levels, and a whole-body combined 18F-FDG PET/computed tomography (PET/CT) scan as part of an ongoing study at our institute. These biomarkers were statistically compared in these patients. A statistically significant linear dependence was detected between sIL-2R and the log-transformed spleen-average standardized uptake value (SUVavg) (R2=0.488, P<0.0001) and the log-transformed spleen-maximum standardized uptake value (SUVmax) (R2=0.490, P<0.0001). sIL-2R levels and splenic size correlated linearly (Pearson's r=0.373, P=0.042). Multivariate linear regression analysis revealed that this correlation remained significant after age and sex adjustment (β=0.001, SE=0.001, P=0.024). No statistically significant associations were detected between (a) any two serum biomarkers or (b) spleen-SUV measurements and any serum biomarker other than sIL-2R. Our analysis revealed an association between sIL-2R levels and spleen 18F-FDG uptake and size, whereas all other serum biomarkers were not significantly associated with each other or with PET 18F-FDG uptake. Our results suggest that splenic inflammation may be related to the systemic inflammatory response in sarcoidosis that may be associated with elevated sIL-2R levels.
Symmetric log-domain diffeomorphic registration: a demons-based approach.
Vercauteren, Tom; Pennec, Xavier; Perchant, Aymeric; Ayache, Nicholas
2008-01-01
Modern morphometric studies use non-linear image registration to compare anatomies and perform group analysis. Recently, log-Euclidean approaches have contributed to promote the use of such computational anatomy tools by permitting simple computations of statistics on a rather large class of invertible spatial transformations. In this work, we propose a non-linear registration algorithm well suited to log-Euclidean statistics on diffeomorphisms. Our algorithm works completely in the log-domain, i.e., it uses a stationary velocity field. This implies that we guarantee the invertibility of the deformation and have access to the true inverse transformation. This also means that our output can be directly used for log-Euclidean statistics without relying on the heavy computation of the log of the spatial transformation. As is often desirable, our algorithm is symmetric with respect to the order of the input images. Furthermore, we use an alternate optimization approach related to Thirion's demons algorithm to provide a fast non-linear registration algorithm. First results show that our algorithm outperforms both the demons algorithm and the recently proposed diffeomorphic demons algorithm in terms of accuracy of the transformation while remaining computationally efficient.
Latent log-linear models for handwritten digit classification.
Deselaers, Thomas; Gass, Tobias; Heigold, Georg; Ney, Hermann
2012-06-01
We present latent log-linear models, an extension of log-linear models incorporating latent variables, and we propose two applications thereof: log-linear mixture models and image deformation-aware log-linear models. The resulting models are fully discriminative, can be trained efficiently, and the model complexity can be controlled. Log-linear mixture models offer additional flexibility within the log-linear modeling framework. Unlike previous approaches, the image deformation-aware model directly considers image deformations and allows for a discriminative training of the deformation parameters. Both are trained using alternating optimization. For certain variants, convergence to a stationary point is guaranteed and, in practice, even variants without this guarantee converge and find models that perform well. We tune the methods on the USPS data set and evaluate on the MNIST data set, demonstrating the generalization capabilities of our proposed models. Our models, although using significantly fewer parameters, are able to obtain competitive results with models proposed in the literature.
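The base model that the latent variants extend is a plain log-linear (softmax) classifier trained discriminatively by gradient descent. A self-contained sketch on a toy three-class problem (the latent extension would add a maximization over hidden variables inside the same framework):

```python
import numpy as np

# Toy 2D data: three well-separated Gaussian classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(30, 2)) for c in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([0, 1, 2], 30)

Xb = np.hstack([X, np.ones((90, 1))])   # append a bias feature
W = np.zeros((3, 3))                    # one weight row per class

for _ in range(300):
    logits = Xb @ W.T
    p = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    p /= p.sum(axis=1, keepdims=True)
    onehot = np.eye(3)[y]
    W -= 0.1 * (p - onehot).T @ Xb / len(y)  # gradient of the log-loss

acc = (np.argmax(Xb @ W.T, axis=1) == y).mean()
print(f"training accuracy = {acc:.2f}")
```

A log-linear mixture replaces the single weight matrix per class with several, taking the maximum (or log-sum) over components, which is what makes alternating optimization necessary.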
[Evaluation of the estimation of prevalence ratio using a Bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of the prevalence ratio (PR) using a Bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence relative to caregivers' recognition of risk signs of diarrhea in their infants by using a Bayesian log-binomial regression model in OpenBUGS software. The results showed that caregivers' recognition of infants' risk signs of diarrhea was associated with a significant 13% increase in medical care-seeking. Meanwhile, we compared the differences in the point and interval estimation of the PR and the convergence of three models (model 1: not adjusting for covariates; model 2: adjusting for duration of caregivers' education; model 3: adjusting for distance between village and township and child month-age, based on model 2) between the Bayesian log-binomial regression model and the conventional log-binomial regression model. All three Bayesian log-binomial regression models converged, and the estimated PRs were 1.130 (95%CI: 1.005-1.265), 1.128 (95%CI: 1.001-1.264), and 1.132 (95%CI: 1.004-1.267), respectively. Conventional log-binomial regression models 1 and 2 converged, with PRs of 1.130 (95%CI: 1.055-1.206) and 1.126 (95%CI: 1.051-1.203), respectively, but model 3 failed to converge, so the COPY method was used to estimate the PR, which was 1.125 (95%CI: 1.051-1.200). In addition, the point and interval estimates of the PRs from the three Bayesian log-binomial regression models differed slightly from those of the conventional log-binomial regression model, but they showed good consistency in estimating the PR. Therefore, the Bayesian log-binomial regression model can effectively estimate the PR with fewer convergence problems and has advantages in application compared with the conventional log-binomial regression model.
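In a log-binomial model the outcome probability is exp(Xβ), so exp(β1) is directly a prevalence ratio. A minimal frequentist sketch on simulated data (the Bayesian version in the abstract would instead put priors on β and sample the posterior; the true PR below is set to about 1.13 to mirror the reported effect):

```python
import numpy as np
from scipy.optimize import minimize

# Simulated data: binary exposure z raises the outcome probability
# from 0.5 to 0.5 * exp(0.12), i.e. a true PR of about 1.13.
rng = np.random.default_rng(1)
n = 2000
z = rng.integers(0, 2, n)
p_true = 0.5 * np.exp(0.12 * z)
y = rng.random(n) < p_true

def negloglik(beta):
    # Log link: log P(y=1) = beta0 + beta1 * z.
    eta = beta[0] + beta[1] * z
    eta = np.minimum(eta, -1e-9)      # keep fitted probabilities below 1
    p = np.exp(eta)
    return -np.sum(y * np.log(p) + (~y) * np.log1p(-p))

fit = minimize(negloglik, x0=[-1.0, 0.0], method="Nelder-Mead")
pr = np.exp(fit.x[1])                 # prevalence ratio for the exposure
print(f"estimated PR = {pr:.3f}")
```

The boundary constraint in `negloglik` is exactly what makes conventional log-binomial fitting fragile; the Bayesian sampler and the COPY method are two ways around that convergence problem.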
Gupta, Deepak K; Claggett, Brian; Wells, Quinn; Cheng, Susan; Li, Man; Maruthur, Nisa; Selvin, Elizabeth; Coresh, Josef; Konety, Suma; Butler, Kenneth R; Mosley, Thomas; Boerwinkle, Eric; Hoogeveen, Ron; Ballantyne, Christie M; Solomon, Scott D
2015-01-01
Background Natriuretic peptides promote natriuresis, diuresis, and vasodilation. Experimental deficiency of natriuretic peptides leads to hypertension (HTN) and cardiac hypertrophy, conditions more common among African Americans. Hospital-based studies suggest that African Americans may have reduced circulating natriuretic peptides, as compared to Caucasians, but definitive data from community-based cohorts are lacking. Methods and Results We examined plasma N-terminal pro B-type natriuretic peptide (NTproBNP) levels according to race in 9137 Atherosclerosis Risk in Communities (ARIC) Study participants (22% African American) without prevalent cardiovascular disease at visit 4 (1996–1998). Multivariable linear and logistic regression analyses were performed adjusting for clinical covariates. Among African Americans, percent European ancestry was determined from genetic ancestry informative markers and then examined in relation to NTproBNP levels in multivariable linear regression analysis. NTproBNP levels were significantly lower in African Americans (median, 43 pg/mL; interquartile range [IQR], 18, 88) than Caucasians (median, 68 pg/mL; IQR, 36, 124; P<0.0001). In multivariable models, adjusted log NTproBNP levels were 40% lower (95% confidence interval [CI], −43, −36) in African Americans, compared to Caucasians, which was consistent across subgroups of age, gender, HTN, diabetes, insulin resistance, and obesity. African-American race was also significantly associated with having nondetectable NTproBNP (adjusted OR, 5.74; 95% CI, 4.22, 7.80). In multivariable analyses in African Americans, a 10% increase in genetic European ancestry was associated with a 7% (95% CI, 1, 13) increase in adjusted log NTproBNP. Conclusions African Americans have lower levels of plasma NTproBNP than Caucasians, which may be partially owing to genetic variation. 
Low natriuretic peptide levels in African Americans may contribute to the greater risk for HTN and its sequelae in this population. PMID:25999400
USDA-ARS?s Scientific Manuscript database
Using linear regression models, we studied the main and two-way interaction effects of the predictor variables gender, age, BMI, and 64 folate/vitamin B-12/homocysteine/lipid/cholesterol-related single nucleotide polymorphisms (SNP) on log-transformed plasma homocysteine normalized by red blood cell...
Ushigome, Emi; Fukui, Michiaki; Hamaguchi, Masahide; Tanaka, Toru; Atsuta, Haruhiko; Ohnishi, Masayoshi; Tsunoda, Sei; Yamazaki, Masahiro; Hasegawa, Goji; Nakamura, Naoto
2014-06-01
Epidemiological studies have shown that elevated heart rate (HR) is associated with an increased risk of diabetic nephropathy, as well as cardiovascular events and mortality, in patients with type 2 diabetes mellitus. Recently, the advantages of the self-measurement of blood pressure (BP) at home have been recognized. The aim of this study was to investigate the relationship between home-measured HR and albuminuria in patients with type 2 diabetes mellitus. We designed a cross-sectional multicenter analysis of 1245 patients with type 2 diabetes mellitus. We investigated the relationship between the logarithm of urinary albumin excretion (log UAE) and home-measured HR or other factors that may be related to nephropathy using univariate and multivariate analyses. Multivariate linear regression analysis indicated that age, duration of diabetes mellitus, morning HR (β=0.131, P<0.001), morning systolic BP (β=0.311, P<0.001), hemoglobin A1C, triglycerides, daily consumption of alcohol, use of angiotensin II receptor blockers and use of beta-blockers were independently associated with the log UAE. Multivariate logistic regression analysis indicated that the odds ratio (95% confidence interval) associated with 1 beat per min and 1 mm Hg increases in the morning HR and morning systolic BP for albuminuria were 1.024 ((1.008-1.040), P=0.004) and 1.039 ((1.029-1.048), P<0.001), respectively. In conclusion, home-measured HR was significantly associated with albuminuria independent of the known risk factors for nephropathy, including home-measured systolic BP, in patients with type 2 diabetes mellitus.
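The odds ratios quoted above come straight from the logistic coefficients: OR for an increment d is exp(d·β). A tiny sketch; the coefficient is chosen to be consistent with the reported OR of 1.024 per beat per minute and is otherwise illustrative:

```python
import math

# Hypothetical logistic coefficient per 1 beat/min, consistent with OR ~ 1.024.
beta_hr = 0.0237

or_1bpm = math.exp(beta_hr)
or_10bpm = math.exp(10 * beta_hr)  # ORs compound multiplicatively on the log scale
print(f"OR per 1 bpm = {or_1bpm:.3f}, per 10 bpm = {or_10bpm:.3f}")
```

The same exponentiation applied to the confidence limits of β gives the CI of the odds ratio.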
A structure-activity analysis of the variation in oxime efficacy against nerve agents
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maxwell, Donald M.; Koplovitz, Irwin; Worek, Franz
2008-09-01
A structure-activity analysis was used to evaluate the variation in oxime efficacy of 2-PAM, obidoxime, HI-6 and ICD585 against nerve agents. In vivo oxime protection and in vitro oxime reactivation were used as indicators of oxime efficacy against VX, sarin, VR and cyclosarin. Analysis of in vivo oxime protection was conducted with oxime protective ratios (PR) from guinea pigs receiving oxime and atropine therapy after sc administration of nerve agent. Analysis of in vitro reactivation was conducted with second-order rate constants (kr2) for oxime reactivation of agent-inhibited acetylcholinesterase (AChE) from guinea pig erythrocytes. In vivo oxime PR and in vitro kr2 decreased as the volume of the alkylmethylphosphonate moiety of nerve agents increased from VX to cyclosarin. This effect was greater with 2-PAM and obidoxime (> 14-fold decrease in PR) than with HI-6 and ICD585 (< 3.7-fold decrease in PR). The decrease in oxime PR and kr2 as the volume of the agent moiety conjugated to AChE increased was consistent with a steric hindrance mechanism. Linear regression of log (PR-1) against log (kr2 · [oxime dose]) produced two offset parallel regression lines that delineated a significant difference between the coupling of oxime reactivation and oxime protection for HI-6 and ICD585 compared to 2-PAM and obidoxime. HI-6 and ICD585 appeared to be 6.8-fold more effective than 2-PAM and obidoxime at coupling oxime reactivation to oxime protection, which suggested that the isonicotinamide group that is common to both of these oximes, but absent from 2-PAM and obidoxime, is important for oxime efficacy.
Tiwari, Anjani K; Ojha, Himanshu; Kaul, Ankur; Dutta, Anupama; Srivastava, Pooja; Shukla, Gauri; Srivastava, Rakesh; Mishra, Anil K
2009-07-01
Nuclear magnetic resonance imaging is a very useful tool in modern medical diagnostics, especially when gadolinium (III)-based contrast agents are administered to the patient with the aim of increasing the image contrast between normal and diseased tissues. With the use of soft modelling techniques such as quantitative structure-activity relationship/quantitative structure-property relationship after a suitable description of their molecular structure, we have studied a series of phosphonic acids for designing new MRI contrast agents. Quantitative structure-property relationship studies with multiple linear regression analysis were applied to find correlations between different calculated molecular descriptors of the phosphonic acid-based chelating agents and their stability constants. The final quantitative structure-property relationship mathematical models were: Model 1 (phosphonic acid series): log K(ML) = 5.00243(±0.7102) - 0.0263(±0.540) MR; n = 12, |r| = 0.942, s = 0.183, F = 99.165. Model 2 (phosphonic acid series): log K(ML) = 5.06280(±0.3418) - 0.0252(±0.198) MR; n = 12, |r| = 0.956, s = 0.186, F = 99.256.
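A one-descriptor model of this shape is fit by ordinary least squares; the descriptor values below are invented for illustration, not the paper's dataset:

```python
# Least-squares fit of log K(ML) = a + b*MR for a hypothetical descriptor set.
mr = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]        # molar refractivity (made up)
log_k = [5.0 - 0.026 * v for v in mr]            # synthetic, noise-free responses

n = len(mr)
mx = sum(mr) / n
my = sum(log_k) / n
b = sum((x - mx) * (y - my) for x, y in zip(mr, log_k)) / \
    sum((x - mx) ** 2 for x in mr)
a = my - b * mx
print(f"log K(ML) = {a:.4f} + ({b:.4f}) MR")
```

With real data the ± terms reported in the abstract would come from the standard errors of a and b.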
Support vector regression to predict porosity and permeability: Effect of sample size
NASA Astrophysics Data System (ADS)
Al-Anazi, A. F.; Gates, I. D.
2012-02-01
Porosity and permeability are key petrophysical parameters obtained from laboratory core analysis. Cores, obtained from drilled wells, are often few in number for most oil and gas fields. Porosity and permeability correlations based on conventional techniques such as linear regression or neural networks trained with core and geophysical logs suffer poor generalization to wells with only geophysical logs. The generalization problem of correlation models often becomes pronounced when the training sample size is small. This is attributed to the underlying assumption that conventional techniques employing the empirical risk minimization (ERM) inductive principle converge asymptotically to the true risk values as the number of samples increases. In small sample size estimation problems, the available training samples must span the complexity of the parameter space so that the model is able both to match the available training samples reasonably well and to generalize to new data. This is achieved using the structural risk minimization (SRM) inductive principle by matching the capability of the model to the available training data. One method that uses SRM is the support vector regression (SVR) network. In this research, the capability of SVR to predict porosity and permeability in a heterogeneous sandstone reservoir under the effect of small sample size is evaluated. In particular, the impact of Vapnik's ε-insensitivity loss function and the least-modulus loss function on generalization performance was empirically investigated. The results are compared to the multilayer perceptron (MLP) neural network, a widely used regression method, which operates under the ERM principle. The mean square error and correlation coefficients were used to measure the quality of predictions. The results demonstrate that SVR yields consistently better predictions of porosity and permeability with small sample sizes than the MLP method. Also, the performance of SVR depends on both the kernel function type and the loss function used.
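Vapnik's ε-insensitive loss, named above, ignores residuals smaller than ε, a property squared loss lacks; a minimal sketch of our own (not the paper's code):

```python
def eps_insensitive(residual: float, eps: float) -> float:
    """Vapnik's epsilon-insensitive loss: zero inside the eps-tube."""
    return max(0.0, abs(residual) - eps)

def squared(residual: float) -> float:
    """Ordinary squared-error loss, which penalizes every residual."""
    return residual * residual

# Residuals inside the tube cost nothing, so small measurement noise
# does not pull the SVR regression function around.
print(eps_insensitive(0.05, eps=0.1))  # inside the tube: no penalty
print(eps_insensitive(0.50, eps=0.1))  # outside the tube: linear penalty
print(squared(0.05))                   # squared loss always penalizes
```

This insensitivity, combined with SRM's capacity control, is what gives SVR its small-sample robustness relative to ERM-trained networks.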
Estimating Driving Performance Based on EEG Spectrum Analysis
NASA Astrophysics Data System (ADS)
Lin, Chin-Teng; Wu, Ruei-Cheng; Jung, Tzyy-Ping; Liang, Sheng-Fu; Huang, Teng-Yi
2005-12-01
The growing number of traffic accidents in recent years has become a serious concern to society. Accidents caused by drivers' drowsiness behind the steering wheel have a high fatality rate because of the marked decline in perception, recognition, and vehicle-control abilities while sleepy. Preventing such accidents is highly desirable but requires techniques for continuously detecting, estimating, and predicting the level of alertness of drivers and delivering effective feedback to maintain their maximum performance. This paper proposes an EEG-based drowsiness estimation system that combines electroencephalogram (EEG) log subband power spectrum, correlation analysis, principal component analysis, and linear regression models to indirectly estimate the driver's drowsiness level in a virtual-reality-based driving simulator. Our results demonstrate that it is feasible to accurately and quantitatively estimate driving performance, expressed as the deviation between the center of the vehicle and the center of the cruising lane, in a realistic driving simulator.
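The PCA-plus-linear-regression stage of such a pipeline can be sketched with NumPy on toy data; the "EEG features" here are fabricated placeholders for log subband powers, not the paper's recordings:

```python
import numpy as np

# Toy stand-in for log subband power features (rows = epochs, cols = bands).
t = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # latent alertness factor
X = np.outer(t, [1.0, 1.0])                 # two perfectly correlated "bands"
y = 3.0 * t                                 # lane-deviation measure (synthetic)

# PCA via SVD of the centered feature matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[0]                         # projection on the first principal component

# Linear regression of driving performance on the leading PC score.
A = np.column_stack([scores, np.ones_like(scores)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
print(float(np.abs(pred - y).max()))        # near-zero residual on this toy data
```

In the paper the PC scores are derived from correlation-selected EEG subbands; here one latent factor drives both features, so a single component carries all the signal.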
NASA Technical Reports Server (NTRS)
Bigger, J. T. Jr; Steinman, R. C.; Rolnitzky, L. M.; Fleiss, J. L.; Albrecht, P.; Cohen, R. J.
1996-01-01
BACKGROUND. The purposes of the present study were (1) to establish normal values for the regression of log(power) on log(frequency) for RR-interval fluctuations in healthy middle-aged persons, (2) to determine the effects of myocardial infarction on the regression of log(power) on log(frequency), (3) to determine the effect of cardiac denervation on the regression of log(power) on log(frequency), and (4) to assess the ability of power law regression parameters to predict death after myocardial infarction. METHODS AND RESULTS. We studied three groups: (1) 715 patients with recent myocardial infarction; (2) 274 healthy persons age and sex matched to the infarct sample; and (3) 19 patients with heart transplants. Twenty-four-hour RR-interval power spectra were computed using fast Fourier transforms and log(power) was regressed on log(frequency) between 10⁻⁴ and 10⁻² Hz. There was a power law relation between log(power) and log(frequency). That is, the function described a descending straight line that had a slope of approximately -1 in healthy subjects. For the myocardial infarction group, the regression line for log(power) on log(frequency) was shifted downward and had a steeper negative slope (-1.15). The transplant (denervated) group showed a larger downward shift in the regression line and a much steeper negative slope (-2.08). The correlation between traditional power spectral bands and slope was weak, and that with log(power) at 10⁻⁴ Hz was only moderate. Slope and log(power) at 10⁻⁴ Hz were used to predict mortality and were compared with the predictive value of traditional power spectral bands. Slope and log(power) at 10⁻⁴ Hz were excellent predictors of all-cause mortality or arrhythmic death. To optimize the prediction of death, we calculated a log(power) intercept that was uncorrelated with the slope of the power law regression line. We found that the combination of slope and zero-correlation log(power) was an outstanding predictor, with a relative risk of > 10, and was better than any combination of the traditional power spectral bands. The combination of slope and log(power) at 10⁻⁴ Hz also was an excellent predictor of death after myocardial infarction. CONCLUSIONS. Myocardial infarction or denervation of the heart causes a steeper slope and decreased height of the power law regression relation between log(power) and log(frequency) of RR-interval fluctuations. Individually and, especially, combined, the power law regression parameters are excellent predictors of death of any cause or arrhythmic death and predict these outcomes better than the traditional power spectral bands.
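The core computation, regressing log(power) on log(frequency) to obtain the power-law slope, can be sketched on a synthetic 1/f spectrum (the healthy-subject case, where the slope is -1 by construction):

```python
import math

# Synthetic 1/f spectrum over 1e-4 to 1e-2 Hz: power = c / f, so the
# regression of log(power) on log(frequency) has slope exactly -1.
freqs = [1e-4 * 10 ** (k / 10) for k in range(21)]   # 1e-4 ... 1e-2 Hz
power = [0.05 / f for f in freqs]

lx = [math.log10(f) for f in freqs]
ly = [math.log10(p) for p in power]
mx = sum(lx) / len(lx)
my = sum(ly) / len(ly)
slope = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) / \
        sum((x - mx) ** 2 for x in lx)
intercept = my - slope * mx
print(round(slope, 6))  # -1.0 for an exact 1/f spectrum
```

On real RR-interval spectra the slope steepens (toward -1.15 after infarction, -2.08 after denervation) and the intercept drops, which is what the study exploits prognostically.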
Zhang, Qun; Zhang, Qunzhi; Sornette, Didier
2016-01-01
We augment the existing literature using the Log-Periodic Power Law Singular (LPPLS) structures in the log-price dynamics to diagnose financial bubbles by providing three main innovations. First, we introduce the quantile regression to the LPPLS detection problem. This allows us to disentangle (at least partially) the genuine LPPLS signal and the a priori unknown complicated residuals. Second, we propose to combine the many quantile regressions with a multi-scale analysis, which aggregates and consolidates the obtained ensembles of scenarios. Third, we define and implement the so-called DS LPPLS Confidence™ and Trust™ indicators that enrich considerably the diagnostic of bubbles. Using a detailed study of the "S&P 500 1987" bubble and presenting analyses of 16 historical bubbles, we show that the quantile regression of LPPLS signals contributes useful early warning signals. The comparison between the constructed signals and the price development in these 16 historical bubbles demonstrates their significant predictive ability around the real critical time when the burst/rally occurs.
Nam, R K; Klotz, L H; Jewett, M A; Danjoux, C; Trachtenberg, J
1998-01-01
To study the rate of change in prostate specific antigen (PSA velocity) in patients with prostate cancer initially managed by 'watchful waiting'. Serial PSA levels were determined in 141 patients with prostate cancer confirmed by biopsy, who were initially managed expectantly and enrolled between May 1990 and December 1995. Sixty-seven patients eventually underwent surgery (mean age 59 years) because they chose it (the decision for surgery was not based on PSA velocity). A cohort of 74 patients remained on 'watchful waiting' (mean age 69 years). Linear regression and logarithmic transformations were used to segregate those patients who showed a rapid rise, defined as a > 50% rise in PSA per year (or a doubling time of < 2 years) and designated 'rapid risers'. An initial analysis based on a minimum of two PSA values showed that 31% were rapid risers. Only 15% of patients with more than three serial PSA determinations over > or = 6 months showed a rapid rise in PSA level. There was no advantage of log-linear analysis over linear regression models. Three serial PSA determinations over > or = 6 months in patients with clinically localized prostate cancer identifies a subset (15%) of patients with a rapidly rising PSA level. Shorter PSA surveillance with fewer PSA values may falsely identify patients with rapid rises in PSA level. However, further follow-up is required to determine if a rapid rise in PSA level identifies a subset of patients with an aggressive biological phenotype who are either still curable or who have already progressed to incurability through metastatic disease.
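The log-linear analysis of PSA velocity reduces to regressing ln(PSA) on time; a sketch with a hypothetical patient series (not study data), using the abstract's doubling-time criterion:

```python
import math

def psa_doubling_time(years, psa):
    """Slope of ln(PSA) vs. time by least squares; doubling time = ln(2)/slope."""
    ly = [math.log(v) for v in psa]
    mx = sum(years) / len(years)
    my = sum(ly) / len(ly)
    slope = sum((t - mx) * (y - my) for t, y in zip(years, ly)) / \
            sum((t - mx) ** 2 for t in years)
    return math.log(2) / slope if slope > 0 else float("inf")

# Hypothetical series: PSA doubling every 1.5 years from a baseline of 4 ng/ml.
years = [0.0, 0.5, 1.0, 1.5, 2.0]
psa = [4.0 * 2 ** (t / 1.5) for t in years]
dt = psa_doubling_time(years, psa)
rapid_riser = dt < 2.0          # the abstract's threshold: doubling time < 2 years
print(round(dt, 3), rapid_riser)
```

Note the abstract treats "> 50% rise per year" and "doubling time < 2 years" as roughly equivalent labels for the rapid-riser subset; the code applies only the doubling-time form.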
(Draft) Community air pollution and mortality: Analysis of 1980 data from US metropolitan areas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lipfert, F.W.
1992-11-01
1980 data from up to 149 metropolitan areas were used to define cross-sectional associations between community air pollution and "excess" human mortality. The regression model proposed by Ozkaynak and Thurston (1987), which accounted for age, race, education, poverty, and population density, was evaluated and several new models were developed. The new models also accounted for migration, drinking water hardness, and smoking, and included a more detailed description of race. Cause-of-death categories analyzed include all causes, all "non-external" causes, major cardiovascular diseases, and chronic obstructive pulmonary diseases (COPD). Both annual mortality rates and their logarithms were analyzed. Air quality data were obtained from the EPA AIRS database (TSP, SO4²⁻, Mn, and ozone) and from the inhalable particulate network (PM15, PM2.5 and SO4²⁻, for 63 locations). The data on particulates were averaged across all monitoring stations available for each SMSA and the TSP data were restricted to the year 1980. The associations between mortality and air pollution were found to be dependent on the socioeconomic factors included in the models, the specific locations included in the data set, and the type of statistical model used. Statistically significant associations were found as follows: between TSP and mortality due to non-external causes with log-linear models, but not with a linear model; between estimated 10-year average (1980-90) ozone levels and 1980 non-external and cardiovascular deaths; and between TSP and COPD mortality for both linear and log-linear models. When the sulfate contribution to TSP was subtracted, the relationship with COPD mortality was strengthened.
NASA Astrophysics Data System (ADS)
Jarzyna, Jadwiga A.; Krakowska, Paulina I.; Puskarczyk, Edyta; Wawrzyniak-Guz, Kamila; Zych, Marcin
2018-03-01
More than 70 rock samples from so-called sweet spots, i.e. the Ordovician Sa Formation and the Silurian Ja Member of the Pa Formation from the Baltic Basin (North Poland), were examined in the laboratory to determine bulk and grain density, total and effective/dynamic porosity, absolute permeability, pore diameter, total surface area, and natural radioactivity. Results of pyrolysis, i.e. TOC (Total Organic Carbon) together with S1 and S2, parameters used to determine the hydrocarbon generation potential of rocks, were also considered. Elemental composition from chemical analyses and mineral composition from XRD measurements were also included. SCAL analysis, NMR experiments and Pressure Decay Permeability measurements, together with water immersion porosimetry and the adsorption/desorption of nitrogen vapors method, were carried out along with a comprehensive interpretation of the outcomes. Simple and multiple linear statistical regressions were used to recognize mutual relationships between parameters. The observed correlations, together with the large dispersion of data and discrepancies in the property values obtained from different methods in some cases, were the basis for building a shale gas rock model for well logging interpretation. The model was verified by the results of Monte Carlo modelling of the spectral neutron-gamma log response in comparison with GEM log results.
Yu, S; Gao, S; Gan, Y; Zhang, Y; Ruan, X; Wang, Y; Yang, L; Shi, J
2016-04-01
Quantitative structure-property relationship modelling can be a valuable alternative method to replace or reduce experimental testing. In particular, some endpoints such as octanol-water (KOW) and organic carbon-water (KOC) partition coefficients of polychlorinated biphenyls (PCBs) are easier to predict and various models have already been developed. In this paper, two different methods, multiple linear regression based on descriptors generated using Dragon software and hologram quantitative structure-activity relationships, were employed to predict suspended particulate matter (SPM) derived log KOC and generator column, shake flask and slow stirring method derived log KOW values of 209 PCBs. The predictive ability of the derived models was validated using a test set. The performances of all these models were compared with EPI Suite™ software. The results indicated that the proposed models were robust and satisfactory, and could provide feasible and promising tools for the rapid assessment of the SPM derived log KOC and generator column, shake flask and slow stirring method derived log KOW values of PCBs.
Jones, Andrew M; Lomas, James; Moore, Peter T; Rice, Nigel
2016-10-01
We conduct a quasi-Monte-Carlo comparison of the recent developments in parametric and semiparametric regression methods for healthcare costs, both against each other and against standard practice. The population of English National Health Service hospital in-patient episodes for the financial year 2007-2008 (summed for each patient) is randomly divided into two equally sized subpopulations to form an estimation set and a validation set. Evaluating out-of-sample using the validation set, a conditional density approximation estimator shows considerable promise in forecasting conditional means, performing best for accuracy of forecasting and among the best four for bias and goodness of fit. The best performing model for bias is linear regression with square-root-transformed dependent variables, whereas a generalized linear model with square-root link function and Poisson distribution performs best in terms of goodness of fit. Commonly used models utilizing a log-link are shown to perform badly relative to other models considered in our comparison.
An approach to checking case-crossover analyses based on equivalence with time-series methods.
Lu, Yun; Symons, James Morel; Geyh, Alison S; Zeger, Scott L
2008-03-01
The case-crossover design has been increasingly applied to epidemiologic investigations of acute adverse health effects associated with ambient air pollution. The correspondence of the design to that of matched case-control studies makes it inferentially appealing for epidemiologic studies. Case-crossover analyses generally use conditional logistic regression modeling. This technique is equivalent to time-series log-linear regression models when there is a common exposure across individuals, as in air pollution studies. Previous methods for obtaining unbiased estimates for case-crossover analyses have assumed that time-varying risk factors are constant within reference windows. In this paper, we rely on the connection between case-crossover and time-series methods to illustrate model-checking procedures from log-linear model diagnostics for time-stratified case-crossover analyses. Additionally, we compare the relative performance of the time-stratified case-crossover approach to time-series methods under 3 simulated scenarios representing different temporal patterns of daily mortality associated with air pollution in Chicago, Illinois, during 1995 and 1996. Whenever a model, be it time-series or case-crossover, fails to account appropriately for fluctuations in time that confound the exposure, the effect estimate will be biased. It is therefore important to perform model-checking in time-stratified case-crossover analyses rather than assume the estimator is unbiased.
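In the time-stratified design, referent days for each case day are the other days in the same calendar month sharing the case day's weekday; a minimal sketch of that referent selection (our own illustration, not the authors' code):

```python
from datetime import date, timedelta

def time_stratified_referents(case_day: date) -> list[date]:
    """All other days in the same month with the same weekday as case_day."""
    d = date(case_day.year, case_day.month, 1)
    out = []
    while d.month == case_day.month:
        if d.weekday() == case_day.weekday() and d != case_day:
            out.append(d)
        d += timedelta(days=1)
    return out

# Example: a Wednesday in July 1995 (within the Chicago study period).
print(time_stratified_referents(date(1995, 7, 12)))
# [datetime.date(1995, 7, 5), datetime.date(1995, 7, 19), datetime.date(1995, 7, 26)]
```

Because the strata are fixed calendar blocks, this scheme yields the equivalence with a time-series log-linear model stratified on month-by-weekday that the paper exploits for model checking.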
González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O
2013-07-01
There is extensive evidence of the negative impacts on health linked to the rise of the regional background of particulate matter (PM) 10 levels. These levels are often increased over urban areas, becoming one of the main air pollution concerns. This is the case for the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7 years of historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative (log[NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity) and qualitative (working days/weekends, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation). Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms selected following Sawa's Bayesian Information Criterion (INT-BIC). Each type of model is calculated using two different periods: the training dataset (6 years) and the testing dataset (1 year). The results for each type of model show that the INT-BIC-based model (R² = 0.42) is the best. Results were R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to state-of-the-art methodology. The error calculated for longer time intervals (monthly or seasonal means) diminished significantly (R of 0.75-0.80 for monthly means and R of 0.80 to 0.98 for seasonal means) with respect to shorter periods.
Francisco, Fabiane Lacerda; Saviano, Alessandro Morais; Almeida, Túlia de Souza Botelho; Lourenço, Felipe Rebello
2016-05-01
Microbiological assays are widely used to estimate the relative potencies of antibiotics in order to guarantee the efficacy, safety, and quality of drug products. Despite the advantages of turbidimetric bioassays compared to other methods, they have limitations concerning the linearity and range of the dose-response curve determination. Here, we proposed to use partial least squares (PLS) regression to overcome these limitations and to improve the prediction of relative potencies of antibiotics. Kinetic-reading microplate turbidimetric bioassays for apramycin and vancomycin were performed using Escherichia coli (ATCC 8739) and Bacillus subtilis (ATCC 6633), respectively. Microbial growth was measured as absorbance up to 180 and 300 min for the apramycin and vancomycin turbidimetric bioassays, respectively. Conventional dose-response curves (absorbances or area under the microbial growth curve vs. log of antibiotic concentration) showed significant regression; however, there were significant deviations from linearity. Thus, they could not be used for relative potency estimations. PLS regression allowed us to construct a predictive model for estimating the relative potencies of apramycin and vancomycin without over-fitting, and it improved the linear range of the turbidimetric bioassay. In addition, PLS regression provided predictions of relative potencies equivalent to those obtained from official agar diffusion methods. Therefore, we conclude that PLS regression may be used to estimate the relative potencies of antibiotics with significant advantages when compared to conventional dose-response curve determination. Copyright © 2016 Elsevier B.V. All rights reserved.
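A single-component PLS regression (the core step of the NIPALS algorithm) can be sketched in NumPy; the kinetic-reading data below are synthetic rank-one stand-ins, not the bioassay measurements:

```python
import numpy as np

# Synthetic rank-one "kinetic readings": rows = samples, cols = time points.
latent = np.array([-1.0, 0.0, 1.0])      # e.g. centered log-dose deviations
loadings = np.array([1.0, 2.0, 3.0])     # absorbance profile over time
X = np.outer(latent, loadings)           # centered predictor block
y = 2.0 * latent                         # centered response (potency scale)

# One NIPALS PLS component: weight vector, sample scores, inner regression.
w = X.T @ y
w = w / np.linalg.norm(w)                # weight vector (unit length)
t = X @ w                                # sample scores
q = (y @ t) / (t @ t)                    # inner regression coefficient

y_fit = q * (X @ w)                      # fitted response
print(float(np.abs(y_fit - y).max()))    # ~0 for rank-one data
```

Unlike regressing on a single fixed time point, the PLS weight vector pools information across the whole growth curve, which is the source of the extended linear range reported in the abstract.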
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
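The least-squares mechanics reviewed here reduce to two closed-form estimates; a self-contained sketch with a toy dataset of our own:

```python
def simple_linear_regression(x, y):
    """Return (slope, intercept, r_squared) by ordinary least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # slope = Sxy / Sxx
    intercept = my - slope * mx            # line passes through (mean x, mean y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared

# Toy example: the outcome rises exactly 2 units per unit of the predictor.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
print(simple_linear_regression(x, y))  # (2.0, 1.0, 1.0)
```

The four assumptions the article reviews (linearity, independence, normality and equal variance of errors) concern the residuals this fit leaves behind, not the fit itself.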
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research involved relationships among 20 topsoil variables analyzed prior to the planting of paddy at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model alone and the combination of the multiple linear regression model and the fuzzy c-means method. Analysis of normality and multicollinearity indicates that the data are normally distributed without multicollinearity among the independent variables. Fuzzy c-means analysis clusters the paddy yields into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model alone, with a lower mean square error.
Yip, Cyril C Y; Sridhar, Siddharth; Cheng, Andrew K W; Fung, Ami M Y; Cheng, Vincent C C; Chan, Kwok-Hung; Yuen, Kwok-Yung
2017-08-01
HHV-6 reactivation in immunocompromised patients is common and may be associated with serious morbidity and mortality; therefore, early detection and initiation of therapy might be of benefit. Real-time PCR assays allow for early identification of HHV-6 reactivation to assist in providing a timely response. Thus, we compared the performance of an in-house developed HHV-6 quantitative PCR assay with a commercially available kit, the RealStar® HHV-6 PCR Kit. The analytical sensitivity, analytical specificity, linearity, precision and accuracy of the in-house developed HHV-6 qPCR assay were evaluated. The diagnostic performance of the in-house HHV-6 qPCR assay was compared with the RealStar® HHV-6 PCR Kit, using 72 clinical specimens and 17 proficiency testing samples. Linear regression analysis of the quantitative results showed a dynamic range from 2 to 10 log10 copies/ml and a coefficient of determination (R²) of 0.999 for the in-house assay. A dilution series demonstrated a limit of detection and a limit of quantification of 1.7 log10 and 2 log10 copies/ml, respectively. The precision of the assay was highly reproducible among runs, with coefficients of variance (CV) ranging from 0.27% to 4.37%. A comparison of 27 matched samples showed an excellent correlation between the quantitative viral loads measured by the in-house HHV-6 qPCR assay and the RealStar® HHV-6 PCR Kit (R²=0.926; P<0.0001), with an average bias of -0.24 log10 copies/ml. The in-house developed HHV-6 qPCR method is a sensitive and reliable assay with lower cost for the detection and quantification of HHV-6 DNA when compared to the RealStar® HHV-6 PCR Kit. Copyright © 2017 Elsevier B.V. All rights reserved.
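Quantification of this kind rests on a standard curve: a linear regression of Ct on log10 copies/ml, inverted to convert a specimen's Ct into a viral load. The standards below are idealized (exactly 3.32 cycles per 10-fold dilution, i.e. 100% PCR efficiency), not the study's calibrators:

```python
# Idealized qPCR standards: Ct drops 3.32 cycles per 10-fold increase in copies.
log10_copies = [2.0, 4.0, 6.0, 8.0, 10.0]
ct = [38.0 - 3.32 * (lc - 2.0) for lc in log10_copies]

n = len(ct)
mx = sum(log10_copies) / n
my = sum(ct) / n
slope = sum((x - mx) * (y - my) for x, y in zip(log10_copies, ct)) / \
        sum((x - mx) ** 2 for x in log10_copies)
intercept = my - slope * mx

def quantify(ct_value: float) -> float:
    """Invert the standard curve: Ct -> log10 copies/ml."""
    return (ct_value - intercept) / slope

amp_efficiency = 10 ** (-1.0 / slope) - 1.0   # ~1.0 means 100% efficiency
print(round(quantify(38.0 - 3.32 * 3.0), 3))  # a 5 log10 copies/ml specimen
```

The reported average bias between two assays is then simply the mean of the paired differences of such log10 loads across matched specimens.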
Quantum algorithm for linear regression
NASA Astrophysics Data System (ADS)
Wang, Guoming
2017-07-01
We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Unlike previous algorithms, which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in classical form. So by running it once, one completely determines the fitted model and can then use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model and can handle data sets with nonsparse design matrices. It runs in time poly(log₂(N), d, κ, 1/ε), where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ε is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.
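For contrast, the classical least-squares fit that the quantum algorithm speeds up is a one-liner in NumPy, with the condition number κ of the design matrix (the same κ that enters the quantum runtime) readily computed alongside; the small matrix is our own example:

```python
import numpy as np

# Small design matrix: rows = observations, columns = adjustable parameters.
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 2.0]])
y = np.array([1.0, 4.0, 5.0])     # consistent: x = (1, 2) fits exactly

kappa = np.linalg.cond(A)         # condition number of the design matrix
x, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(x, round(float(kappa), 3))  # fitted parameters and kappa
```

Classically this costs time polynomial in N; the quoted quantum runtime is polylogarithmic in N but still polynomial in d, κ and 1/ε.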
Yelland, Lisa N; Salter, Amy B; Ryan, Philip
2011-10-15
Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
On comparison of net survival curves.
Pavlič, Klemen; Perme, Maja Pohar
2017-05-02
Relative survival analysis is a subfield of survival analysis where competing risks data are observed, but the causes of death are unknown. A first step in the analysis of such data is usually the estimation of a net survival curve, possibly followed by regression modelling. Recently, a log-rank type test for comparison of net survival curves has been introduced, and the goal of this paper is to explore its properties and put this methodological advance into the context of the field. We build on the association between the log-rank test and the univariate or stratified Cox model and show the analogy in the relative survival setting. We study the properties of the methods using both theoretical arguments and simulations. We provide an R function to enable practical usage of the log-rank type test. Both the log-rank type test and its model alternatives perform satisfactorily under the null, even if the correlation between their p-values is rather low, implying that both approaches cannot be used simultaneously. The stratified version has a higher power in case of non-homogeneous hazards, but also carries a different interpretation. The log-rank type test and its stratified version can be interpreted in the same way as the results of an analogous semi-parametric additive regression model, despite the fact that no direct theoretical link can be established between the test statistics.
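The classical two-sample log-rank statistic on which the test builds can be computed directly; this sketch handles the uncensored case for brevity and is the standard version, not the net-survival variant the paper develops:

```python
def logrank_statistic(times1, times2):
    """Two-sample log-rank chi-squared statistic (all observations are events)."""
    event_times = sorted(set(times1) | set(times2))
    obs1 = exp1 = var = 0.0
    for t in event_times:
        n1 = sum(1 for x in times1 if x >= t)   # at risk in group 1
        n2 = sum(1 for x in times2 if x >= t)   # at risk in group 2
        d1 = times1.count(t)                    # events in group 1 at t
        d2 = times2.count(t)
        n, d = n1 + n2, d1 + d2
        obs1 += d1
        exp1 += d * n1 / n                      # expected events under the null
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return (obs1 - exp1) ** 2 / var

# Identical survival experience in both groups: statistic of 0.
print(logrank_statistic([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
# Clearly separated groups: large statistic.
print(logrank_statistic([1.0, 2.0, 3.0], [10.0, 11.0, 12.0]))
```

The statistic is compared to a chi-squared distribution with one degree of freedom; the relative-survival version replaces the observed-minus-expected counts with excess-hazard analogues.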
Nistal-Nuño, Beatriz
2017-03-31
In Chile, a new law introduced in March 2012 lowered the blood alcohol concentration (BAC) limit for impaired drivers from 0.1% to 0.08% and the BAC limit for driving under the influence of alcohol from 0.05% to 0.03%, but its effectiveness remained uncertain. The goal of this investigation was to evaluate the effects of this enactment on road traffic injuries and fatalities in Chile. This was a retrospective cohort study. Data were analyzed using a descriptive approach and generalized linear models, specifically Poisson regression, to analyze deaths and injuries in a series of additive log-linear models accounting for the effects of law implementation, month, a linear time trend, and population exposure. National databases in Chile were reviewed from 2003 to 2014 to evaluate the monthly rates of traffic fatalities and injuries, in total and associated with alcohol. The monthly rate of alcohol-related traffic fatalities decreased by 28.1 percent compared with the period before the law (P<0.001); adding a linear time trend as a predictor, the decrease was 20.9 percent (P<0.001). The monthly rate of alcohol-related traffic injuries decreased by 10.5 percent compared with before the law (P<0.001); adding a linear time trend as a predictor, the decrease was 24.8 percent (P<0.001). Positive results followed from this new 'zero-tolerance' law implemented in 2012 in Chile: the country experienced a significant reduction in alcohol-related traffic fatalities and injuries, making the law a successful public health intervention.
Havla, Lukas; Schneider, Moritz J; Thierfelder, Kolja M; Beyer, Sebastian E; Ertl-Wagner, Birgit; Reiser, Maximilian F; Sommer, Wieland H; Dietrich, Olaf
2016-02-01
The purpose of this study was to propose and evaluate a new wavelet-based technique for the classification of arterial and venous vessels using time-resolved cerebral CT perfusion data sets. Fourteen consecutive patients (mean age 73 yr, range 17-97) with suspected stroke but no pathology on follow-up MRI were included. A CT perfusion scan with 32 dynamic phases was performed during intravenous bolus contrast-agent application. After rigid-body motion correction, a Paul wavelet (order 1) was used to calculate the wavelet power spectrum (WPS) of each attenuation-time course voxel by voxel. The angiographic intensity A was defined as the maximum of the WPS, located at the coordinates T (time axis) and W (scale/width axis) within the WPS. Using these three parameters (A, T, W) separately, as well as combined by (1) Fisher's linear discriminant analysis (FLDA), (2) logistic regression (LogR) analysis, or (3) support vector machine (SVM) analysis, their potential to classify 18 different arterial and venous vessel segments per subject was evaluated. The best vessel classification was obtained using all three parameters A, T, and W combined [area under the curve (AUC): 0.953 with FLDA and 0.957 with LogR or SVM]. In direct comparison, the wavelet-derived parameters provided performance at least equal to conventional attenuation-time-course parameters. The maximum AUC obtained from the proposed wavelet parameters was slightly (although not statistically significantly) higher than the maximum AUC (0.945) obtained from the conventional parameters. A new method to classify arterial and venous cerebral vessels with high statistical accuracy was introduced, based on the time-domain wavelet transform of dynamic CT perfusion data in combination with linear or nonlinear multidimensional classification techniques.
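A minimal sketch of the FLDA step on three features, assuming hypothetical (A, T, W) values rather than the study's CT measurements — arteries are simulated to enhance earlier (smaller T) and with a narrower bolus (smaller W) than veins:

```python
import numpy as np

def fisher_lda(Xa, Xv):
    """Fisher's linear discriminant for two classes.
    Returns the projection vector w and the midpoint threshold."""
    ma, mv = Xa.mean(0), Xv.mean(0)
    # pooled within-class scatter matrix
    Sw = np.cov(Xa.T) * (len(Xa) - 1) + np.cov(Xv.T) * (len(Xv) - 1)
    w = np.linalg.solve(Sw, ma - mv)
    thresh = w @ (ma + mv) / 2
    return w, thresh

# Hypothetical (A, T, W) feature vectors for arterial and venous voxels.
rng = np.random.default_rng(1)
arteries = rng.normal([100, 8, 4], [15, 1.5, 1.0], size=(200, 3))
veins    = rng.normal([ 90, 14, 7], [15, 1.5, 1.0], size=(200, 3))
w, thresh = fisher_lda(arteries, veins)
scores = np.concatenate([arteries, veins]) @ w
labels = np.r_[np.ones(200), np.zeros(200)]        # 1 = artery
accuracy = ((scores > thresh) == labels).mean()
print(accuracy)   # close to 1 for this well-separated toy data
```

LogR and SVM classifiers would consume the same three-feature vectors; FLDA is shown because it is the simplest of the three combiners compared in the study.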
NASA Astrophysics Data System (ADS)
Paul, Suman; Ali, Muhammad; Chatterjee, Rima
2018-01-01
The velocity of compressional waves (Vp) in coal and non-coal lithologies is predicted for five wells from the Bokaro coalfield (CF), India. Shear sonic travel-time logs were not recorded for all wells in the study area; shear wave velocity (Vs) is available for only two wells, one from east and the other from west Bokaro CF. The major lithologies of this CF are dominated by coal and shaly coal of the Barakar Formation. This paper focuses on (a) the relationship between Vp and Vs, (b) the prediction of Vp using regression and neural network modeling, and (c) the estimation of maximum horizontal stress from image logs. Coal is characterized by low acoustic impedance (AI) compared to the overlying and underlying strata. A cross-plot between AI and Vp/Vs is able to discriminate coal, shaly coal, shale and sandstone in wells of the Bokaro CF. The relationship between Vp and Vs is obtained with excellent goodness of fit (R2) ranging from 0.90 to 0.93. Linear multiple regression and multi-layered feed-forward neural network (MLFN) models are developed for predicting Vp from two wells using four input log parameters: gamma ray, resistivity, bulk density and neutron porosity. The regression-predicted Vp shows poor (R2 = 0.28) to good (R2 = 0.79) fits with the observed velocity, whereas the MLFN-predicted Vp shows satisfactory to good fits, with R2 varying from 0.62 to 0.92. The maximum horizontal stress orientation at a well in west Bokaro CF is studied from a Formation Micro-Imager (FMI) log. Breakouts and drilling-induced fractures (DIFs) are identified from the FMI log. A breakout length of 4.5 m is oriented towards N60°W, whereas the orientation of DIFs, over a cumulative length of 26.5 m, varies from N15°E to N35°E. The mean maximum horizontal stress in this CF is oriented towards N28°E.
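The linear multiple regression step can be sketched as follows; the well-log values and the linear relation below are synthetic assumptions for illustration, not Bokaro CF data:

```python
import numpy as np

# Hypothetical well-log matrix: gamma ray, resistivity, bulk density and
# neutron porosity as predictors of Vp, mirroring the four inputs used
# in the regression model.
rng = np.random.default_rng(2)
n = 500
gr   = rng.uniform(20, 150, n)          # gamma ray (API)
res  = rng.uniform(1, 100, n)           # resistivity (ohm.m)
rhob = rng.uniform(1.3, 2.7, n)         # bulk density (g/cc)
nphi = rng.uniform(0.05, 0.5, n)        # neutron porosity (fraction)
# assumed synthetic relation plus noise, for illustration only
vp = 1.2 + 1.5 * rhob - 2.0 * nphi - 0.002 * gr + 0.003 * res \
     + rng.normal(0, 0.1, n)

X = np.column_stack([np.ones(n), gr, res, rhob, nphi])
coef, *_ = np.linalg.lstsq(X, vp, rcond=None)

pred = X @ coef
ss_res = ((vp - pred) ** 2).sum()
ss_tot = ((vp - vp.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot                # goodness of fit, as in the paper
print(r2)
```

The MLFN alternative replaces the linear map with a trained feed-forward network on the same four inputs; the R2 diagnostic is computed identically.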
Sieve analysis using the number of infecting pathogens.
Follmann, Dean; Huang, Chiung-Yu
2017-12-14
Assessment of vaccine efficacy as a function of the similarity of the infecting pathogen to the vaccine is an important scientific goal. Characterization of pathogen strains for which vaccine efficacy is low can increase understanding of the vaccine's mechanism of action and offer targets for vaccine improvement. Traditional sieve analysis estimates differential vaccine efficacy using a single identifiable pathogen for each subject. The similarity between this single entity and the vaccine immunogen is quantified, for example, by exact match or number of mismatched amino acids. With new technology, we can now obtain the actual count of genetically distinct pathogens that infect an individual. Let F be the number of distinct features of a species of pathogen. We assume a log-linear model for the expected number of infecting pathogens with feature "f," f=1,…,F. The model can be used directly in studies with passive surveillance of infections where the count of each type of pathogen is recorded at the end of some interval, or active surveillance where the time of infection is known. For active surveillance, we additionally assume that a proportional intensity model applies to the time of potentially infectious exposures and derive product and weighted estimating equation (WEE) estimators for the regression parameters in the log-linear model. The WEE estimator explicitly allows for waning vaccine efficacy and time-varying distributions of pathogens. We give conditions where sieve parameters have a per-exposure interpretation under passive surveillance. We evaluate the methods by simulation and analyze a phase III trial of a malaria vaccine. © 2017, The International Biometric Society.
Liu, Peter Y; Takahashi, Paul Y; Roebuck, Pamela D; Iranmanesh, Ali; Veldhuis, Johannes D
2005-09-01
Pulsatile and thus total testosterone (Te) secretion declines in older men, albeit for unknown reasons. Analytical models forecast that aging may reduce the capability of endogenous luteinizing hormone (LH) pulses to stimulate Leydig cell steroidogenesis. This notion has been difficult to test experimentally. The present study used graded doses of a selective gonadotropin releasing hormone (GnRH)-receptor antagonist to yield four distinct strata of pulsatile LH release in each of 18 healthy men ages 23-72 yr. Deconvolution analysis was applied to frequently sampled LH and Te concentration time series to quantitate pulsatile Te secretion over a 16-h interval. Log-linear regression was used to relate pulsatile LH secretion to attendant pulsatile Te secretion (LH-Te drive) across the four stepwise interventions in each subject. Linear regression of the 18 individual estimates of LH-Te feedforward dose-response slopes on age disclosed a strongly negative relationship (r = -0.721, P < 0.001). Accordingly, the present data support the thesis that aging in healthy men attenuates amplitude-dependent LH drive of burst-like Te secretion. The experimental strategy of graded suppression of neuroglandular outflow may have utility in estimating dose-response adaptations in other endocrine systems.
NASA Astrophysics Data System (ADS)
Wu, Z.; Guo, Z.
2017-12-01
We measured 15 parent polycyclic aromatic hydrocarbons (PAHs) in the atmosphere and water during a research cruise from the East China Sea (ECS) to the northwestern Pacific Ocean (NWP) in the spring of 2015 to investigate the occurrence, air-sea gas exchange, and gas-particle partitioning of PAHs, with a particular focus on the influence of East Asian continental outflow. The gaseous PAH composition and identified sources were consistent with PAHs from the upwind area, indicating that the gaseous PAHs (three- to five-ring PAHs) were influenced by upwind land pollution. In addition, air-sea exchange fluxes of gaseous PAHs were estimated to be -54.2 to 107.4 ng m-2 d-1, which was indicative of variations in land-based PAH inputs. The logarithmic gas-particle partition coefficient (logKp) of PAHs regressed linearly against the logarithmic subcooled liquid vapor pressure, with a slope of -0.25. This was significantly larger than the theoretical value (-1), implying disequilibrium between the gaseous and particulate PAHs over the NWP. The non-equilibrium of PAH gas-particle partitioning was linked to the volatilization of three-ring gaseous PAHs from seawater and to lower soot concentrations, in particular when oceanic air masses prevailed. Modeling PAH absorption into organic matter and adsorption onto soot carbon revealed that the status of PAH gas-particle partitioning deviated more from the modeled Kp for oceanic air masses than for continental air masses, which coincided with higher volatilization of three-ring PAHs and confirmed the influence of air-sea exchange. Meanwhile, significant linear regressions between logKp and logKoa (logKsa) were observed for continental air masses, suggesting the dominant effect of East Asian continental outflow on atmospheric PAHs over the NWP during the sampling campaign.
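The partitioning regression can be sketched as follows; the (log p_L, log Kp) pairs are invented to illustrate a shallow slope of roughly -0.25, which would indicate disequilibrium relative to the theoretical equilibrium slope of -1:

```python
import numpy as np

# Illustrative gas-particle partitioning data: log Kp of several PAH
# congeners against log p_L (subcooled liquid vapour pressure).
log_pL = np.array([-1.5, -2.0, -2.8, -3.5, -4.2, -5.0, -5.8])
log_Kp = np.array([-4.6, -4.5, -4.3, -4.1, -3.9, -3.8, -3.5])

slope, intercept = np.polyfit(log_pL, log_Kp, 1)
print(slope)   # around -0.25 for these illustrative points
```

A slope much shallower than -1 is the diagnostic the paper uses: at equilibrium, partitioning theory predicts logKp to fall one decade per decade of vapour pressure.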
O'Boyle, Cathy; Chen, Sean I; Little, Julie-Anne
2017-04-01
Clinically, picture acuity tests are thought to overestimate visual acuity (VA) compared with letter tests, but this has not been systematically investigated in children with amblyopia. This study compared VA measurements from the LogMAR Crowded Kay Picture test with the LogMAR Crowded Keeler Letter acuity test in a group of young children with amblyopia. 58 children (34 male) with amblyopia (22 anisometropic, 18 strabismic and 18 with combined strabismic/anisometropic amblyopia) aged 4-6 years (mean=68.7, range=48-83 months) underwent VA measurements. VA chart testing order was randomised, but the amblyopic eye was tested before the fellow eye. All participants wore up-to-date refractive correction. The Kay Picture test significantly overestimated VA by 0.098 logMAR (95% limits of agreement (LOA), 0.13) in the amblyopic eye and 0.088 logMAR (95% LOA, 0.13) in the fellow eye (p<0.001). No interactions were found from occlusion therapy, refractive correction or type of amblyopia on VA results (p>0.23). For both the amblyopic and fellow eyes, Bland-Altman plots demonstrated a systematic and predictable difference between the Kay Picture and Keeler Letter charts across the range of acuities tested (Keeler acuity: amblyopic eye 0.75 to -0.05 logMAR; fellow eye 0.45 to -0.15 logMAR). Linear regression analysis (p<0.00001), together with slope values close to one (amblyopic 0.98; fellow 0.86), demonstrated that there is no proportional bias. The Kay Picture test consistently overestimated VA by approximately 0.10 logMAR compared with the Keeler Letter test in young children with amblyopia. Given the predictable difference found between the two crowded logMAR acuity tests, it is reasonable to adjust Kay Picture acuity thresholds by +0.10 logMAR to compute expected Keeler Letter acuity scores. Published by the BMJ Publishing Group Limited.
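A minimal sketch of the Bland-Altman computation behind the bias and 95% limits of agreement, using invented paired logMAR scores (the ~0.10 offset mimics the reported direction, lower logMAR meaning better acuity), not the study data:

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman agreement between two tests: mean bias and
    95% limits of agreement (bias +/- 1.96 SD of the differences)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical paired logMAR scores for eight eyes.
keeler = np.array([0.70, 0.55, 0.40, 0.30, 0.20, 0.10, 0.00, -0.05])
kay = keeler - 0.10 + np.array([0.02, -0.01, 0.03, -0.02,
                                0.00, 0.01, -0.03, 0.02])
bias, (lo, hi) = bland_altman(kay, keeler)
print(bias)   # near -0.10: the picture test reads lower (better) logMAR
```

A mean difference of about -0.10 logMAR with narrow limits of agreement is what justifies the simple +0.10 adjustment proposed in the conclusion.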
Zhang, Qun; Zhang, Qunzhi; Sornette, Didier
2016-01-01
We augment the existing literature using the Log-Periodic Power Law Singular (LPPLS) structures in the log-price dynamics to diagnose financial bubbles by providing three main innovations. First, we introduce the quantile regression to the LPPLS detection problem. This allows us to disentangle (at least partially) the genuine LPPLS signal and the a priori unknown complicated residuals. Second, we propose to combine the many quantile regressions with a multi-scale analysis, which aggregates and consolidates the obtained ensembles of scenarios. Third, we define and implement the so-called DS LPPLS Confidence™ and Trust™ indicators that enrich considerably the diagnostic of bubbles. Using a detailed study of the “S&P 500 1987” bubble and presenting analyses of 16 historical bubbles, we show that the quantile regression of LPPLS signals contributes useful early warning signals. The comparison between the constructed signals and the price development in these 16 historical bubbles demonstrates their significant predictive ability around the real critical time when the burst/rally occurs. PMID:27806093
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
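A minimal sketch of OLP (geometric mean, also called reduced major axis) regression, which needs only sample moments; the two error-prone "methods" are simulated for illustration:

```python
import numpy as np

def least_products(x, y):
    """Ordinary least products (geometric mean) regression, a Model II
    method: the slope is the ratio of standard deviations, signed by
    the correlation, and the line passes through the means."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * y.std(ddof=1) / x.std(ddof=1)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

# Two methods measuring the same quantity, both subject to error,
# so the x-values are not fixed by the experimenter (Model II).
rng = np.random.default_rng(3)
truth = rng.uniform(10, 50, 100)
x = truth + rng.normal(0, 2, 100)       # method A with error
y = truth + rng.normal(0, 2, 100)       # method B with error
slope, intercept = least_products(x, y)
print(slope)   # near 1: OLP is not attenuated by error in x
```

Ordinary (Model I) least squares on the same data would give a slope biased below 1, because measurement error in x attenuates the OLS estimate; the confidence intervals, as the article notes, are better obtained by bootstrapping (e.g. with smatr) than from spreadsheet formulae.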
Hourcade-Potelleret, F; Laporte, S; Lehnert, V; Delmar, P; Benghozi, Renée; Torriani, U; Koch, R; Mismetti, P
2015-06-01
Epidemiological evidence that the risk of coronary heart disease is inversely associated with the level of high-density lipoprotein cholesterol (HDL-C) has motivated several phase III programmes with cholesteryl ester transfer protein (CETP) inhibitors. To assess alternative methods to predict the clinical response to CETP inhibitors, a meta-regression analysis was performed on HDL-C-raising drugs (statins, fibrates, niacin) in randomised controlled trials: 51 trials in secondary prevention with a total of 167,311 patients, with follow-up >1 year, in which HDL-C was measured at baseline and during treatment. The meta-regression analysis showed no significant association between change in HDL-C (treatment vs comparator) and log risk ratio (RR) of the clinical endpoint (non-fatal myocardial infarction or cardiac death). CETP inhibitor data are consistent with this finding (RR: 1.03; P5-P95: 0.99-1.21). A prespecified sensitivity analysis by drug class suggested that the strength of the relationship might differ between pharmacological groups. A significant association was shown for both statins (p<0.02, log RR=-0.169-0.0499*HDL-C change, R(2)=0.21) and niacin (p=0.02, log RR=1.07-0.185*HDL-C change, R(2)=0.61) but not fibrates (p=0.18, log RR=-0.367+0.077*HDL-C change, R(2)=0.40). However, the association was no longer detectable after adjustment for low-density lipoprotein cholesterol for statins, or after exclusion of open trials for niacin. Meta-regression suggested that CETP inhibitors might not influence coronary risk. The relation between change in HDL-C level and clinical endpoint may be drug dependent, which limits the use of HDL-C as a surrogate marker of coronary events. Other markers of HDL function may be more relevant. Published by the BMJ Publishing Group Limited.
Analyzing Response Times in Tests with Rank Correlation Approaches
ERIC Educational Resources Information Center
Ranger, Jochen; Kuhn, Jorg-Tobias
2013-01-01
It is common practice to log-transform response times before analyzing them with standard factor analytical methods. However, sometimes the log-transformation is not capable of linearizing the relation between the response times and the latent traits. Therefore, a more general approach to response time analysis is proposed in the current…
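A rank-correlation approach of the kind motivated here can be sketched with Spearman's rho, which is unchanged by the log-transformation (or any other strictly monotone transform); the lognormal response-time model below is an illustrative assumption:

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.
    Being rank-based, it is invariant to monotone transformations,
    so no linearizing transform needs to be chosen."""
    rx = np.argsort(np.argsort(x))      # ranks of x (no ties here)
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(4)
speed = rng.normal(size=300)                              # latent trait
rt = np.exp(1.0 - 0.8 * speed + rng.normal(0, 0.3, 300))  # lognormal RTs

rho_raw = spearman(speed, rt)
rho_log = spearman(speed, np.log(rt))
print(rho_raw, rho_log)   # identical: ranks ignore the transformation
```

This invariance is exactly what makes rank approaches attractive when the log-transform fails to linearize the trait-response relation.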
Body burden levels of dioxin, furans, and PCBs among frequent consumers of Great Lakes sport fish
DOE Office of Scientific and Technical Information (OSTI.GOV)
Falk, C.; Hanrahan, L.; Anderson, H.A.
1999-02-01
Dioxins, furans, and polychlorinated biphenyls (PCBs) are toxic, persist in the environment, and bioaccumulate to concentrations that can be harmful to humans. The Health Departments of five Great Lakes (GL) states, Wisconsin, Michigan, Ohio, Illinois, and Indiana, formed a consortium to study body burden levels of chemical residues in fish consumers of Lakes Michigan, Huron, and Erie. In Fall 1993, a telephone survey was administered to sport angler households to obtain fish consumption habits and demographics. A blood sample was obtained from a portion of the study subjects. One hundred serum samples were analyzed for 8 dioxin, 10 furan, and 4 coplanar PCB congeners. Multiple linear regression was conducted to assess the predictive value of the following covariates: GL sport fish species, age, BMI, gender, years of sport fish consumption, and lake. Median total dioxin toxic equivalents (TEq), total furan TEq, and total coplanar PCB TEq were higher among all men than all women (P = 0.0001). Lake trout, salmon, age, BMI, and gender were significant regression predictors of log(total coplanar PCBs). Lake trout, age, gender, and lake were significant regression predictors of log(total furans). Age was the only significant predictor of total dioxin levels.
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
Wockner, Leesa F; Hoffmann, Isabell; O'Rourke, Peter; McCarthy, James S; Marquart, Louise
2017-08-25
The efficacy of vaccines aimed at inhibiting the growth of malaria parasites in the blood can be assessed by comparing the growth rate of parasitaemia in subjects treated with a test vaccine to that in controls. In studies using induced blood stage malaria (IBSM), a type of controlled human malaria infection, parasite growth rate has been measured using models with the intercept on the y-axis fixed to the inoculum size. A set of statistical models was evaluated to determine an optimal methodology for estimating parasite growth rate in IBSM studies. Parasite growth rates were estimated using data from 40 subjects published in three IBSM studies. Data were fitted using 12 statistical models: log-linear, and sine-wave with the period either fixed to 48 h or not fixed; these models were fitted with the intercept either fixed to the inoculum size or not fixed. All models were fitted by individual, and overall by study using a mixed effects model with a random effect for the individual. Log-linear models and sine-wave models, with the period fixed or not fixed, resulted in similar parasite growth rate estimates (within 0.05 log10 parasites per mL/day). Average parasite growth rate estimates for models fitted by individual with the intercept fixed to the inoculum size were substantially lower, by an average of 0.17 log10 parasites per mL/day (range 0.06-0.24), than those from non-fixed-intercept models. Variability of parasite growth rate estimates across the three studies analysed was substantially higher (3.5 times) for fixed-intercept models than for non-fixed-intercept models. The same tendency was observed in models fitted overall by study. Modelling data by individual or overall by study had minimal effect on parasite growth estimates.
The most appropriate statistical model to estimate the growth rate of blood-stage parasites in IBSM studies appears to be a log-linear model fitted by individual and with the intercept estimated in the log-linear regression. Future studies should use this model to estimate parasite growth rates.
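The recommended log-linear fit, and the effect of fixing the intercept to the inoculum, can be sketched as follows; the parasitaemia values and the assumed inoculum of log10 = 1.0 at day 0 are illustrative, not IBSM trial data:

```python
import numpy as np

# Illustrative per-individual data: log10 parasitaemia vs day.
days = np.array([4, 5, 6, 7, 8, 9])
log10_parasites = np.array([2.1, 2.9, 3.6, 4.4, 5.1, 5.9])

# Intercept estimated (recommended): ordinary least squares.
slope_est, intercept_est = np.polyfit(days, log10_parasites, 1)

# Intercept fixed to the inoculum: slope of a line forced through
# (day 0, assumed log10 inoculum of 1.0).
inoc = 1.0
slope_fix = (days @ (log10_parasites - inoc)) / (days @ days)

print(slope_est, slope_fix)   # growth rates, log10 parasites/mL per day
```

With these numbers the fixed-intercept slope comes out markedly lower than the free-intercept slope, reproducing in miniature the bias the study reports when the intercept is pinned to the inoculum.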
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life studies have an important role in health care, especially in chronic diseases, in clinical judgment and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normally distributed the results can be misleading. The aim of this study was to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results with linear regression. A cross-sectional study was conducted on 119 breast cancer patients admitted and treated in the chemotherapy ward of Namazi Hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. Quantile regression was employed to assess the associated factors, and the results were compared with linear regression. All analyses were carried out using SAS. The mean score for global health status in these breast cancer patients was 64.92 ± 11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis, and dyspnea was significant only for the first quartile. Emotional functioning and duration of disease also statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about the predictors of quality of life in breast cancer patients.
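A sketch of median (tau = 0.5) quantile regression via iteratively reweighted least squares; production solvers (including those in statistical packages) minimize the pinball loss by linear programming instead, and the right-skewed outcome below is simulated for illustration:

```python
import numpy as np

def quantile_regression(X, y, tau, n_iter=50, eps=1e-6):
    """Linear quantile regression via iteratively reweighted least
    squares on the pinball loss -- a minimal sketch, not a
    production solver."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # start at OLS
    for _ in range(n_iter):
        resid = y - X @ beta
        # weights turn squared error into (approximate) pinball loss
        w = np.where(resid > 0, tau, 1 - tau) / np.maximum(np.abs(resid), eps)
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)
    return beta

def pinball(X, y, beta, tau):
    r = y - X @ beta
    return np.mean(np.where(r > 0, tau * r, (tau - 1) * r))

# Right-skewed outcome (like bounded QOL scores): the conditional mean
# and the conditional median can lead to different conclusions.
rng = np.random.default_rng(5)
n = 1000
x = rng.uniform(0, 10, n)
y = 50.0 + 2.0 * x + rng.exponential(10.0, n)
X = np.column_stack([np.ones(n), x])

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_med = quantile_regression(X, y, tau=0.5)
print(beta_med[1])   # slope of the conditional median, near 2
```

Changing tau to 0.25 or 0.75 fits the first or third quartile, which is how quartile-specific predictors like those in the abstract are obtained.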
Acquah, Gifty E.; Via, Brian K.; Billor, Nedret; Fasina, Oladiran O.; Eckhardt, Lori G.
2016-01-01
As new markets, technologies and economies evolve in the low carbon bioeconomy, forest logging residue, a largely untapped renewable resource will play a vital role. The feedstock can however be variable depending on plant species and plant part component. This heterogeneity can influence the physical, chemical and thermochemical properties of the material, and thus the final yield and quality of products. Although it is challenging to control compositional variability of a batch of feedstock, it is feasible to monitor this heterogeneity and make the necessary changes in process parameters. Such a system will be a first step towards optimization, quality assurance and cost-effectiveness of processes in the emerging biofuel/chemical industry. The objective of this study was therefore to qualitatively classify forest logging residue made up of different plant parts using both near infrared spectroscopy (NIRS) and Fourier transform infrared spectroscopy (FTIRS) together with linear discriminant analysis (LDA). Forest logging residue harvested from several Pinus taeda (loblolly pine) plantations in Alabama, USA, were classified into three plant part components: clean wood, wood and bark and slash (i.e., limbs and foliage). Five-fold cross-validated linear discriminant functions had classification accuracies of over 96% for both NIRS and FTIRS based models. An extra factor/principal component (PC) was however needed to achieve this in FTIRS modeling. Analysis of factor loadings of both NIR and FTIR spectra showed that, the statistically different amount of cellulose in the three plant part components of logging residue contributed to their initial separation. This study demonstrated that NIR or FTIR spectroscopy coupled with PCA and LDA has the potential to be used as a high throughput tool in classifying the plant part makeup of a batch of forest logging residue feedstock. 
Thus, NIR/FTIR could be employed as a tool to rapidly probe/monitor the variability of forest biomass so that the appropriate online adjustments to parameters can be made in time to ensure process optimization and product quality. PMID:27618901
Asano, Elio Fernando; Rasera, Irineu; Shiraga, Elisabete Cristina
2012-12-01
This is an exploratory analysis of potential variables associated with the pattern of hospitalization resource use for open Roux-en-Y gastric bypass (RYGB) surgery. It is a cross-sectional study based on records from an administrative database (DATASUS). Inclusion criteria were adult patients undergoing RYGB between Jan/2008 and Jun/2011. Dependent variables were length of stay (LoS) and ICU need. Independent variables were: gender, age, region, hospital volume, surgery at a center of excellence (CoE) certified by the Surgical Review Corporation (SRC), teaching hospital, and year of hospitalization. Univariate and multivariate analyses (logistic regression for ICU need and linear regression for length of stay) were performed. Data from 13,069 surgeries were analyzed. In the crude analysis, hospital volume was the variable most strongly associated with log-transformed LoS (1.312 ± 0.302 high volume vs. 1.670 ± 0.581 low volume, p < 0.001), whereas for ICU need it was certified CoE (odds ratio (OR), 0.016; 95% confidence interval (CI), 0.010-0.026). After adjustment by logistic regression, certified CoE remained the strongest predictor of ICU need (OR, 0.011; 95% CI, 0.007-0.018), followed by hospital volume (OR, 3.096; 95% CI, 2.861-3.350). Age group, male gender, and teaching hospital were also significantly associated (p < 0.001). For log-transformed LoS, the final model included hospital volume (coefficient, -0.223; 95% CI, -0.250 to -0.196) and teaching hospital (coefficient, 0.375; 95% CI, 0.351-0.398). Region of Brazil was not associated with either outcome. High-volume hospital was the strongest predictor of shorter LoS, whereas SRC certification was the strongest predictor of lower ICU need. Public health policies targeting increased efficiency and patient access to the procedure should take these results into account.
Brummel, Sean S; Singh, Kumud K; Maihofer, Adam X.; Farhad, Mona; Qin, Min; Fenton, Terry; Nievergelt, Caroline M.; Spector, Stephen A.
2015-01-01
Background: Ancestry informative markers (AIMs) measure genetic admixture within an individual beyond self-reported racial/ethnic (SRR) groups. Here, we used genetically determined ancestry (GDA) across SRR groups and examined associations between GDA and HIV-1 RNA and CD4+ counts in HIV-positive children in the US. Methods: 41 AIMs, developed to distinguish 7 continental regions, were detected by real-time PCR in 994 HIV-positive, antiretroviral-naïve children. GDA was estimated by comparing each individual's genotypes to allele frequencies found in a large set of reference individuals originating from global populations using STRUCTURE. The means of GDA were calculated for each category of SRR. Linear regression was used to model GDA on CD4+ count and log10 RNA, adjusting for SRR and age. Results: Subjects were 61% Black, 25% Hispanic, 13% White and 1.3% Unknown. The mean age was 2.3 years (45% male), mean CD4+ count 981 cells/mm3, and mean log10 RNA 5.11. Marked heterogeneity was found for all SRR groups, with high admixture for Hispanics. In adjusted linear regression models, subjects with 100% European ancestry were estimated to have 0.33 higher log10 RNA levels (95% CI: 0.03 to 0.62, p=0.028) and 253 cells/mm3 lower CD4+ counts (95% CI: -517 to 11, p=0.06) compared to subjects with 100% African ancestry. Conclusion: Marked continental admixture was found among this cohort of HIV-infected children from the US. GDA contributed to differences in RNA and CD4+ counts beyond SRR, and should be considered when outcomes associated with HIV infection are likely to have a genetic component. PMID:26536313
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Coban, Melahat; Inci, Ayca
2018-07-01
Autosomal dominant polycystic kidney disease (ADPKD) is a common congenital chronic kidney disease (CKD). We report here the relationship of serum angiopoietin-1 (Ang-1), Ang-2, and vascular endothelial growth factor (VEGF) with total kidney volume (TKV), total cyst volume (TCV), and renal failure in adult ADPKD patients at various stages of CKD. This cross-sectional study was conducted with 50 patients diagnosed with ADPKD and a control group of 45 age-matched healthy volunteers. In the patient group, TKV and TCV were determined with upper abdominal magnetic resonance imaging, whereas in controls, TKV was determined with ultrasonography according to the ellipsoid formula. Renal function was assessed with serum creatinine, estimated glomerular filtration rate (eGFR), and spot urinary protein/creatinine ratio (UPCR). Ang-1, Ang-2, and VEGF were measured using enzyme-linked immunosorbent assay. Patients with ADPKD had significantly higher TKV (p < 0.001) and UPCR (p < 0.001), and lower eGFR (p ≤ 0.001), compared to the controls. Log10 Ang-2 was found to be higher in ADPKD patients at all CKD stages. Multiple linear regression analysis showed that there was no association between log10 Ang-1, log10 Ang-2, or log10 VEGF and creatinine, eGFR, UPCR, or log10 TKV (p > 0.05). There was no association of serum angiogenic growth factors with TKV or renal failure in ADPKD patients. Increased serum Ang-2 observed in stages 1-2 CKD suggests that angiogenesis plays a role in the progression of early-stage ADPKD, but not at later stages of the disease. This may be explained by possible cessation of angiogenesis in advanced stages of CKD due to the increased number of sclerotic glomeruli.
van der Lee, R; Pfaffendorf, M; van Zwieten, P A
2000-11-01
To investigate a possible relationship between the time courses of action of various calcium antagonists and their lipophilicity, characterized as log P-values. The functional experiments were performed in vitro in human small subcutaneous arteries (internal diameter 591 +/- 51 microm, n = 7 for each concentration), obtained from cosmetic surgery (mamma reduction and abdominoplasty). The vessels were investigated in an isometric wire myograph. The vasodilator effect of the calcium antagonists was quantified by means of log IC50-values, and the onset of the vasodilator effect for each concentration studied was expressed as time to Eeq90-values (time to reach 90% of the maximal effect). Log IC50-values were -8.46 +/- 0.09, -8.33 +/- 0.25 and -8.72 +/- 0.16 for nifedipine, felodipine and (S)-lercanidipine, respectively (not significant). On average, nifedipine reached time to Eeq90 in 11 +/- 1 min. For felodipine and (S)-lercanidipine the corresponding values were 60 +/- 11 min and 99 +/- 9 min, respectively. The differences between these values were statistically significant (P< 0.01). In spite of these differences in the in-vitro human vascular model, the three calcium antagonists are equipotent with regard to their vasodilator effects. Linear regression analysis of the correlation between the logarithm of the membrane partition coefficient (log P-values) of the calcium antagonists tested [2.50, 4.46 and 6.88 for nifedipine, felodipine and (S)-lercanidipine, respectively] and their respective values found for time to Eeq90 was highly significant. It appears that a higher log P-value is correlated with a slower onset of action.
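The reported log P values and mean time-to-Eeq90 values are enough to reproduce the direction and strength of the correlation. A minimal sketch using only numbers taken from the abstract (with only three compounds this is illustrative, not a substitute for the authors' significance test):

```python
import math

# Data taken directly from the abstract: log P of nifedipine, felodipine, and
# (S)-lercanidipine, and the corresponding mean time-to-Eeq90 values (minutes).
log_p = [2.50, 4.46, 6.88]
t_eeq90 = [11.0, 60.0, 99.0]

def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

r = pearson_r(log_p, t_eeq90)
print(f"r = {r:.3f}")  # strongly positive, consistent with the abstract's finding
```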
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
Common pitfalls in statistical analysis: Linear regression analysis
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022
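The simple linear regression discussed in the article reduces to two closed-form estimates: the slope is the covariance of x and y over the variance of x, and the intercept follows from the means. A minimal self-contained sketch with toy data:

```python
# Closed-form simple linear regression: y = a + b*x, with
# b = cov(x, y) / var(x) and a = mean(y) - b * mean(x).
def fit_line(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

# Toy data lying exactly on y = 2 + 0.5*x, so the fit is exact.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.5, 3.0, 3.5, 4.0]
a, b = fit_line(x, y)
print(a, b)  # 2.0 0.5
```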
Wiley, J.B.; Atkins, John T.; Tasker, Gary D.
2000-01-01
Multiple and simple least-squares regression models for the log10-transformed 100-year discharge with independent variables describing the basin characteristics (log10-transformed and untransformed) for 267 streamflow-gaging stations were evaluated, and the regression residuals were plotted as areal distributions that defined three regions of the State, designated East, North, and South. Exploratory data analysis procedures identified 31 gaging stations at which discharges are different than would be expected for West Virginia. Regional equations for the 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year peak discharges were determined by generalized least-squares regression using data from 236 gaging stations. Log10-transformed drainage area was the most significant independent variable for all regions. Equations developed in this study are applicable only to rural, unregulated streams within the boundaries of West Virginia. The accuracy of estimating equations is quantified by measuring the average prediction error (from 27.7 to 44.7 percent) and equivalent years of record (from 1.6 to 20.0 years).
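The log10-log10 form used for these regional equations is a power law in the original units: log10(Q) = log10(c) + m*log10(A) is equivalent to Q = c*A^m. A sketch with hypothetical station data (the exponent 0.8, coefficient 100, and drainage areas are invented for illustration, not taken from the study):

```python
import math

# Hypothetical station data: drainage area (sq mi) and 100-year peak discharge,
# generated here from an assumed power law Q = 100 * A^0.8.
areas = [5.0, 20.0, 80.0, 300.0, 1200.0]
discharges = [100.0 * a ** 0.8 for a in areas]

# Regressing log10(Q) on log10(A) turns the power law into a straight line.
xs = [math.log10(a) for a in areas]
ys = [math.log10(q) for q in discharges]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
m = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
     / sum((x - mx) ** 2 for x in xs))
c = 10 ** (my - m * mx)
print(m, c)  # recovers the assumed exponent 0.8 and coefficient 100
```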
Decomposition and model selection for large contingency tables.
Dahinden, Corinne; Kalisch, Markus; Bühlmann, Peter
2010-04-01
Large contingency tables summarizing categorical variables arise in many areas. One example is in biology, where large numbers of biomarkers are cross-tabulated according to their discrete expression level. Interactions of the variables are of great interest and are generally studied with log-linear models. The structure of a log-linear model can be visually represented by a graph from which the conditional independence structure can then be easily read off. However, since the number of parameters in a saturated model grows exponentially in the number of variables, this generally comes with a heavy computational burden. Even if we restrict ourselves to models of lower-order interactions or other sparse structures, we are faced with the problem of a large number of cells which play the role of sample size. This is in sharp contrast to high-dimensional regression or classification procedures because, in addition to a high-dimensional parameter, we also have to deal with the analogue of a huge sample size. Furthermore, high-dimensional tables naturally feature a large number of sampling zeros which often leads to the nonexistence of the maximum likelihood estimate. We therefore present a decomposition approach, where we first divide the problem into several lower-dimensional problems and then combine these to form a global solution. Our methodology is computationally feasible for log-linear interaction models with many categorical variables each or some of them having many levels. We demonstrate the proposed method on simulated data and apply it to a bio-medical problem in cancer research.
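As a minimal concrete case of a log-linear model, the independence model for a two-way table, log(mu_ij) = u + u_i + u_j, has closed-form fitted counts: (row total * column total) / N. The table below is hypothetical, chosen so that independence holds exactly:

```python
# Illustrative counts for a hypothetical 2x3 biomarker-by-expression-level table.
table = [
    [10, 20, 30],
    [20, 40, 60],
]

N = sum(sum(row) for row in table)
row_tot = [sum(row) for row in table]
col_tot = [sum(table[i][j] for i in range(len(table))) for j in range(len(table[0]))]

# Fitted counts under the independence log-linear model.
fitted = [[row_tot[i] * col_tot[j] / N for j in range(len(col_tot))]
          for i in range(len(row_tot))]
print(fitted)  # equals the observed table, since the two rows are proportional
```

The computational burden the abstract describes comes from the fact that a saturated model for k variables has a parameter for every cell, and the cell count grows exponentially in k.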
Neary, M; Lamorde, M; Olagunju, A; Darin, K M; Merry, C; Byakika-Kibwika, P; Back, D J; Siccardi, M; Owen, A; Scarsi, K K
2017-09-01
Reduced levonorgestrel concentrations from the levonorgestrel contraceptive implant was previously seen when given concomitantly with efavirenz. We sought to assess whether single nucleotide polymorphisms (SNPs) in genes involved in efavirenz and nevirapine metabolism were linked to these changes in levonorgestrel concentration. SNPs in CYP2B6, CYP2A6, NR1I2, and NR1I3 were analyzed. Associations of participant demographics and genotype with levonorgestrel pharmacokinetics were evaluated in HIV-positive women using the levonorgestrel implant plus efavirenz- or nevirapine-based antiretroviral therapy (ART), in comparison to ART-naïve women using multivariate linear regression. Efavirenz group: CYP2B6 516G>T was associated with lower levonorgestrel log 10 C max and log 10 AUC. CYP2B6 15582C>T was associated with lower log 10 AUC. Nevirapine group: CYP2B6 516G>T was associated with higher log 10 C max and lower log 10 C min . Pharmacogenetic variations influenced subdermal levonorgestrel pharmacokinetics in HIV-positive women, indicating that the magnitude of the interaction with non-nucleoside reverse transcriptase inhibitors (NNRTIs) is influenced by host genetics. © 2017 American Society for Clinical Pharmacology and Therapeutics.
[Association of mineral and bone disorder with increasing PWV in CKD 1-5 patients].
Shiota, Jun; Watanabe, Mitsuhiro
2007-01-01
The association between pulse wave velocity(PWV) and chronic kidney disease mineral and bone disorder(CKD-MBD) was investigated in CKD 1-5 patients without dialysis. Pulse pressure(PP), PWV, serum Cr, non-HDL-cholesterol, Alb, Ca, Pi, calcitriol, intact-PTH and BAP were measured in sixty patients not receiving a phosphate binder or vitamin D. Using the relationship between age and baPWV in healthy subjects, we determined delta baPWV(measured baPWV-calculated baPWV) as an index for the effect of CKD-related factors. delta baPWV was significantly higher in diabetic patients (p < 0.00001). Simple regression analysis revealed that delta baPWV was positively correlated with PP (p < 0.05) and Log(intact-PTH) (p < 0.01), but negatively correlated with Log(estimated GFR) and Log(calcitriol) (p < 0.01). Multiple regression analysis revealed that delta baPWV was significantly associated with PP and calcitriol, or PP and intact-PTH. These results suggest a relationship between PWV and CKD-MBD.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and the sample size is small. A number of more advanced methods are available, but they are often technically challenging, and a comparative assessment of their performance in behavioral setups has not been performed. We studied the performance of some methods applicable to the analysis of proportions, namely linear regression, Poisson regression, beta-binomial regression, and generalized linear mixed models (GLMMs). We report on a simulation study evaluating the power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers, and we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
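The clustering mechanism this comparison is built around can be mimicked by drawing each subject's success probability from a beta distribution before counting trials, which is exactly the beta-binomial model. A sketch showing the resulting overdispersion relative to a plain binomial (all parameters invented for illustration):

```python
import random

random.seed(1)

def beta_binomial(n_trials, alpha, beta_):
    """Draw one beta-binomial count: p ~ Beta(alpha, beta), then Binomial(n, p)."""
    p = random.betavariate(alpha, beta_)
    return sum(random.random() < p for _ in range(n_trials))

n_subjects, n_trials = 4000, 20
# alpha = beta = 2 gives mean p = 0.5 but substantial subject-to-subject spread.
counts = [beta_binomial(n_trials, 2.0, 2.0) for _ in range(n_subjects)]

mean = sum(counts) / n_subjects
var = sum((c - mean) ** 2 for c in counts) / n_subjects
binom_var = n_trials * 0.5 * 0.5  # variance if every subject shared p = 0.5
print(mean, var, binom_var)  # the empirical variance is well above the binomial value
```

This extra-binomial variance is what distorts inference when a model that ignores clustering is applied.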
Postmolar gestational trophoblastic neoplasia: beyond the traditional risk factors.
Bakhtiyari, Mahmood; Mirzamoradi, Masoumeh; Kimyaiee, Parichehr; Aghaie, Abbas; Mansournia, Mohammd Ali; Ashrafi-Vand, Sepideh; Sarfjoo, Fatemeh Sadat
2015-09-01
To investigate the slope of linear regression of postevacuation serum hCG as an independent risk factor for postmolar gestational trophoblastic neoplasia (GTN). Multicenter retrospective cohort study. Academic referral health care centers. All subjects with confirmed hydatidiform mole and at least four measurements of β-hCG titer. None. Type and magnitude of the relationship between the slope of linear regression of β-hCG as a new risk factor and GTN using Bayesian logistic regression with penalized log-likelihood estimation. Among the high-risk and low-risk molar pregnancy cases, 11 (18.6%) and 19 cases (13.3%) had GTN, respectively. No significant relationship was found between the components of a high-risk pregnancy and GTN. The β-hCG return slope was higher in the spontaneous cure group. However, the initial level of this hormone in the first measurement was higher in the GTN group compared with in the spontaneous recovery group. The average time for diagnosing GTN in the high-risk molar pregnancy group was 2 weeks less than that of the low-risk molar pregnancy group. In addition to the slope of linear regression of β-hCG (odds ratio [OR], 12.74; confidence interval [CI], 5.42-29.2), abortion history (OR, 2.53; 95% CI, 1.27-5.04) and large uterine height for gestational age (OR, 1.26; CI, 1.04-1.54) had the maximum effects on GTN outcome, respectively. The slope of linear regression of β-hCG was introduced as an independent risk factor, which could be used for clinical decision making based on records of β-hCG titer and subsequent prevention program. Copyright © 2015 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
A watershed's response to logging and roads: South Fork of Caspar Creek, California, 1967-1976
Raymond M. Rice; Forest B. Tilley; Patricia A. Datzman
1979-01-01
The effect of logging and roadbuilding on erosion and sedimentation is analyzed by comparing the North Fork and South Fork of Caspar Creek, in northern California. Increased sediment production during the 4 years after road construction was 326 cu yd/sq mi/yr, 80 percent greater than that predicted by the predisturbance regression analysis. The average...
Nixon, R M; Bansback, N; Brennan, A
2007-03-15
Mixed treatment comparison (MTC) is a generalization of meta-analysis. Instead of the same treatment for a disease being tested in a number of studies, a number of different interventions are considered. Meta-regression is also a generalization of meta-analysis where an attempt is made to explain the heterogeneity between the treatment effects in the studies by regressing on study-level covariables. Our focus is where there are several different treatments considered in a number of randomized controlled trials in a specific disease, the same treatment can be applied in several arms within a study, and where differences in efficacy can be explained by differences in the study settings. We develop methods for simultaneously comparing several treatments and adjusting for study-level covariables by combining ideas from MTC and meta-regression. We use a case study from rheumatoid arthritis. We identified relevant trials of biologic versus standard therapy or placebo and extracted the doses, comparators and patient baseline characteristics. Efficacy is measured using the log odds ratio of achieving six-month ACR50 responder status. A random-effects meta-regression model is fitted which adjusts the log odds ratio for study-level prognostic factors. A different random-effect distribution on the log odds ratios is allowed for each different treatment. The odds ratio is found as a function of the prognostic factors for each treatment. The apparent differences in the randomized trials between tumour necrosis factor alpha (TNF-alpha) antagonists are explained by differences in prognostic factors and the analysis suggests that these drugs as a class are not different from each other. Copyright (c) 2006 John Wiley & Sons, Ltd.
Oki, Ryo; Ito, Kazuto; Suzuki, Rie; Fujizuka, Yuji; Arai, Seiji; Miyazawa, Yoshiyuki; Sekine, Yoshitaka; Koike, Hidekazu; Matsui, Hiroshi; Shibata, Yasuhiro; Suzuki, Kazuhiro
2018-04-26
Japan has experienced a drastic increase in the incidence of prostate cancer (PC). To assess changes in the risk for PC, we investigated baseline prostate specific antigen (PSA) levels in first-time screened men, across a 25-year period. In total, 72,654 men, aged 50-79, underwent first-time PSA screening in Gunma prefecture between 1992 and 2016. Changes in the distribution of PSA levels were investigated, including the percentage of men with a PSA above cut-off values and linear regression analyses comparing log10 PSA with age. The 'ultimate incidence' of PC and clinically significant PC (CSPC) were estimated using the PC risk calculator. Changes in the age-standardized incidence rate (AIR) during this period were analyzed. The calculated coefficients of linear regression for age versus log10 PSA fluctuated during the 25-year period, but no trend was observed. In addition, the percentage of men with a PSA above cut-off values varied in each 5-year period, with no specific trend. The 'risk calculator (RC)-based AIR' of PC and CSPC were stable between 1992 and 2016. Therefore, the baseline risk for developing PC has remained unchanged in the past 25 years, in Japan. The drastic increase in the incidence of PC, beginning around 2000, may be primarily due to increased PSA screening in the country. © 2018 UICC.
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Zheng, Han; Kimber, Alan; Goodwin, Victoria A; Pickering, Ruth M
2018-01-01
A common design for a falls prevention trial is to assess falling at baseline, randomize participants into an intervention or control group, and ask them to record the number of falls they experience during a follow-up period. This paper addresses how best to include the baseline count in the analysis of the follow-up count of falls in negative binomial (NB) regression. We examine the performance of various approaches in simulated datasets where both counts are generated from a mixed Poisson distribution with a shared random subject effect. Including the baseline count after log-transformation as a regressor in NB regression (NB-logged) or as an offset (NB-offset) resulted in greater power than including the untransformed baseline count (NB-unlogged). Cook and Wei's conditional negative binomial (CNB) model replicates the underlying process generating the data. In our motivating dataset, a statistically significant intervention effect resulted from the NB-logged, NB-offset, and CNB models, but not from NB-unlogged, and large, outlying baseline counts were overly influential in NB-unlogged but not in NB-logged. We conclude that there is little to lose by including the log-transformed baseline count in standard NB regression compared to CNB for moderate to larger sized datasets. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
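The offset construction compared in the paper can be sketched in a simpler Poisson analogue (the paper itself uses negative binomial models; Poisson is shown here only as the simpler case of the same construction). Entering log(baseline) as an offset fixes its coefficient at 1, so the remaining coefficients model rate ratios relative to baseline. All data below are simulated with invented parameters:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Hypothetical falls data: a baseline count and a treatment indicator.
baseline = rng.poisson(3.0, n) + 1          # +1 so log() is defined
treat = rng.integers(0, 2, n).astype(float)

# Follow-up counts with rate proportional to baseline (offset coefficient = 1)
# and a true treatment rate ratio of exp(-0.4).
mu = baseline * np.exp(0.2 - 0.4 * treat)
y = rng.poisson(mu)

# Poisson regression with log(baseline) as an offset, fitted by Newton/IRLS.
X = np.column_stack([np.ones(n), treat])
offset = np.log(baseline)
beta = np.zeros(2)
for _ in range(25):
    m = np.exp(X @ beta + offset)
    grad = X.T @ (y - m)                    # score
    hess = X.T @ (X * m[:, None])           # observed information
    beta = beta + np.linalg.solve(hess, grad)
print(beta)  # approximately [0.2, -0.4]
```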
Using foreground/background analysis to determine leaf and canopy chemistry
NASA Technical Reports Server (NTRS)
Pinzon, J. E.; Ustin, S. L.; Hart, Q. J.; Jacquemoud, S.; Smith, M. O.
1995-01-01
Spectral Mixture Analysis (SMA) has become a well established procedure for analyzing imaging spectrometry data; however, the technique is relatively insensitive to minor sources of spectral variation (e.g., discriminating stressed from unstressed vegetation and variations in canopy chemistry). Other statistical approaches have been tried, e.g., stepwise multiple linear regression (SMLR) analysis to predict canopy chemistry. Grossman et al. reported that SMLR is sensitive to measurement error and that the predictions of minor chemical components are not independent of patterns observed in more dominant spectral components like water. Further, they observed that the relationships were strongly dependent on the mode of expressing reflectance (R, -log R) and whether chemistry was expressed on a weight (g/g) or area basis (g/sq m). Thus, alternative multivariate techniques need to be examined. Smith et al. reported a revised SMA that they termed Foreground/Background Analysis (FBA) that permits directing the analysis along any axis of variance by identifying vectors through the n-dimensional spectral volume orthonormal to each other. Here, we report an application of the FBA technique for the detection of canopy chemistry using a modified form of the analysis.
Regional flow duration curves: Geostatistical techniques versus multivariate regression
Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.
2016-01-01
A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods which are capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method, Top-kriging, employing a linear weighted average of dimensionless empirical FDCs, standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we termed total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly throughout flow-regimes, with average Nash-Sutcliffe Efficiencies of 0.566 and 0.662 (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and the linear regression models, respectively. The differences between the reproductions of FDCs occurred mostly for low flows with exceedance probability (i.e., duration) above 0.98.
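The Nash-Sutcliffe Efficiency used to score both methods is a one-line statistic: one minus the ratio of squared prediction error to the variance of the observations, so 1.0 is a perfect fit and values below 0 are worse than predicting the mean. A minimal sketch with toy flow values:

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - SSE / SST; 1.0 means a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

obs = [10.0, 8.0, 6.0, 4.0, 2.0]
nse_perfect = nash_sutcliffe(obs, obs)
nse_offby = nash_sutcliffe(obs, [9.0, 8.0, 6.0, 4.0, 3.0])
print(nse_perfect, nse_offby)  # 1.0 and 0.95
```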
Analysis of the two-point velocity correlations in turbulent boundary layer flows
NASA Technical Reports Server (NTRS)
Oberlack, M.
1995-01-01
The general objective of the present work is to explore the use of Rapid Distortion Theory (RDT) in analysis of the two-point statistics of the log-layer. RDT is applicable only to unsteady flows where the non-linear turbulence-turbulence interaction can be neglected in comparison to linear turbulence-mean interactions. Here we propose to use RDT to examine the structure of the large energy-containing scales and their interaction with the mean flow in the log-region. The contents of the work are twofold: First, two-point analysis methods will be used to derive the law-of-the-wall for the special case of zero mean pressure gradient. The basic assumptions needed are one-dimensionality in the mean flow and homogeneity of the fluctuations. It will be shown that a formal solution of the two-point correlation equation can be obtained as a power series in the von Karman constant, known to be on the order of 0.4. In the second part, a detailed analysis of the two-point correlation function in the log-layer will be given. The fundamental set of equations and a functional relation for the two-point correlation function will be derived. An asymptotic expansion procedure will be used in the log-layer to match Kolmogorov's universal range and the one-point correlations to the inviscid outer region valid for large correlation distances.
Nie, Xiaobing; Zheng, Wei Xing; Cao, Jinde
2016-12-01
In this paper, the coexistence and dynamical behaviors of multiple equilibrium points are discussed for a class of memristive neural networks (MNNs) with unbounded time-varying delays and nonmonotonic piecewise linear activation functions. By means of the fixed point theorem, nonsmooth analysis theory and rigorous mathematical analysis, it is proven that under some conditions, such n-neuron MNNs can have 5^n equilibrium points located in ℜ^n, and 3^n of them are locally μ-stable. As a direct application, some criteria are also obtained on the multiple exponential stability, multiple power stability, multiple log-stability and multiple log-log-stability. All these results reveal that the addressed neural networks with activation functions introduced in this paper can generate greater storage capacity than the ones with Mexican-hat-type activation function. Numerical simulations are presented to substantiate the theoretical results. Copyright © 2016 Elsevier Ltd. All rights reserved.
Role of T1 mapping as a complementary tool to T2* for non-invasive cardiac iron overload assessment.
Torlasco, Camilla; Cassinerio, Elena; Roghi, Alberto; Faini, Andrea; Capecchi, Marco; Abdel-Gadir, Amna; Giannattasio, Cristina; Parati, Gianfranco; Moon, James C; Cappellini, Maria D; Pedrotti, Patrizia
2018-01-01
Iron overload-related heart failure is the principal cause of death in transfusion dependent patients, including those with Thalassemia Major. Linking cardiac siderosis measured by T2* to therapy improves outcomes. T1 mapping can also measure iron; preliminary data suggest it may have higher sensitivity for iron, particularly for early overload (the conventional cut-point for no iron by T2* is 20ms, but this is believed insensitive). We compared T1 mapping to T2* in cardiac iron overload. In a prospective, large, single-centre study of 138 Thalassemia Major patients and 32 healthy controls, we compared T1 mapping to dark blood and bright blood T2* acquired at 1.5T. Linear regression analysis was used to assess the association of T2* and T1. A "moving window" approach was taken to understand the strength of the association at different levels of iron overload. The relationship between T2* (here dark blood) and T1 is described by a log-log linear regression, which can be split into three different slopes: 1) low T2*, <20ms, r² = 0.92; 2) T2* = 20-30ms, r² = 0.48; 3) T2* >30ms, weak relationship. All subjects with T2* <20ms had low T1; among those with T2* >20ms, 38% had low T1, with most of the subjects in the T2* range 20-30ms having a low T1. In established cardiac iron overload, T1 and T2* are concordant. However, in the 20-30ms T2* range, T1 mapping appears to detect iron. These data support previous suggestions that T1 detects missed iron in 1 out of 3 subjects with normal T2*, and that T1 mapping is complementary to T2*. The clinical significance of a low T1 with normal T2* should be further investigated.
Gupta, Deepak K; Claggett, Brian; Wells, Quinn; Cheng, Susan; Li, Man; Maruthur, Nisa; Selvin, Elizabeth; Coresh, Josef; Konety, Suma; Butler, Kenneth R; Mosley, Thomas; Boerwinkle, Eric; Hoogeveen, Ron; Ballantyne, Christie M; Solomon, Scott D
2015-05-21
Natriuretic peptides promote natriuresis, diuresis, and vasodilation. Experimental deficiency of natriuretic peptides leads to hypertension (HTN) and cardiac hypertrophy, conditions more common among African Americans. Hospital-based studies suggest that African Americans may have reduced circulating natriuretic peptides, as compared to Caucasians, but definitive data from community-based cohorts are lacking. We examined plasma N-terminal pro B-type natriuretic peptide (NTproBNP) levels according to race in 9137 Atherosclerosis Risk in Communities (ARIC) Study participants (22% African American) without prevalent cardiovascular disease at visit 4 (1996-1998). Multivariable linear and logistic regression analyses were performed adjusting for clinical covariates. Among African Americans, percent European ancestry was determined from genetic ancestry informative markers and then examined in relation to NTproBNP levels in multivariable linear regression analysis. NTproBNP levels were significantly lower in African Americans (median, 43 pg/mL; interquartile range [IQR], 18, 88) than Caucasians (median, 68 pg/mL; IQR, 36, 124; P<0.0001). In multivariable models, adjusted log NTproBNP levels were 40% lower (95% confidence interval [CI], -43, -36) in African Americans, compared to Caucasians, which was consistent across subgroups of age, gender, HTN, diabetes, insulin resistance, and obesity. African-American race was also significantly associated with having nondetectable NTproBNP (adjusted OR, 5.74; 95% CI, 4.22, 7.80). In multivariable analyses in African Americans, a 10% increase in genetic European ancestry was associated with a 7% (95% CI, 1, 13) increase in adjusted log NTproBNP. African Americans have lower levels of plasma NTproBNP than Caucasians, which may be partially owing to genetic variation. Low natriuretic peptide levels in African Americans may contribute to the greater risk for HTN and its sequelae in this population. © 2015 The Authors.
Published on behalf of the American Heart Association, Inc., by Wiley Blackwell.
Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F
2018-06-01
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
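One of the two approaches, IRLS, can be sketched for a straight-line calibration fit using Huber weights and a MAD-based robust scale. This is a common IRLS variant shown for illustration, not necessarily the exact weighting scheme used by the authors; the data are invented and include one gross outlier:

```python
import random
import statistics

random.seed(3)

def fit_wls(x, y, w):
    """Weighted least squares for y = a + b*x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)))
    return my - b * mx, b

def huber_irls(x, y, k=1.345, iters=30):
    """Iteratively re-weighted least squares with Huber weights."""
    w = [1.0] * len(x)
    a = b = 0.0
    for _ in range(iters):
        a, b = fit_wls(x, y, w)
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        s = 1.4826 * statistics.median([abs(r) for r in resid])  # robust scale
        if s < 1e-12:
            break
        w = [1.0 if abs(r) <= k * s else k * s / abs(r) for r in resid]
    return a, b

# Line y = 1 + 2x with noise, plus one gross outlier at x = 9.
x = list(range(10))
y = [1 + 2 * xi + random.gauss(0, 0.5) for xi in x]
y[9] = 100.0

_, b_ols = fit_wls(x, y, [1.0] * len(x))
_, b_rob = huber_irls(x, y)
print(b_ols, b_rob)  # OLS slope badly inflated by the outlier; robust slope near 2
```

The downweighting of large residuals is what lets robust line fitting lower the apparent LOD/LOQ: a few aberrant calibration points no longer dominate the slope and intercept.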
Adsorptive removal of pharmaceuticals from water by commercial and waste-based carbons.
Calisto, Vânia; Ferreira, Catarina I A; Oliveira, João A B P; Otero, Marta; Esteves, Valdemar I
2015-04-01
This work describes the single-compound adsorption of seven pharmaceuticals (carbamazepine, oxazepam, sulfamethoxazole, piroxicam, cetirizine, venlafaxine and paroxetine) from water onto a commercially available activated carbon and a non-activated carbon produced by pyrolysis of primary paper mill sludge. Kinetic and equilibrium adsorption studies were performed using a batch experimental approach. For all pharmaceuticals, both carbons presented fast kinetics (equilibrium times varying from less than 5 min to 120 min), mainly described by a pseudo-second order model. Equilibrium data were appropriately described by the Langmuir and Freundlich isotherm models, the latter giving slightly higher correlation coefficients. The fitted parameters obtained for both models were quite different for the seven pharmaceuticals under study. In order to evaluate the influence of the pharmaceuticals' water solubility, log Kow, pKa, polar surface area and number of hydrogen bond acceptors on the adsorption parameters, multiple linear regression analysis was performed. The variability is mainly due to log Kow followed by water solubility in the case of the waste-based carbon, and due to water solubility in the case of the commercial activated carbon. Copyright © 2015 Elsevier Ltd. All rights reserved.
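As a sketch of the equilibrium-fitting step described above, the Langmuir and Freundlich isotherms can be fitted by non-linear least squares. The data here are synthetic and noise-free, and the "true" parameters qmax = 80 and KL = 0.3 are invented, not values from the study.

```python
import numpy as np
from scipy.optimize import curve_fit

def langmuir(Ce, qmax, KL):
    """Langmuir isotherm: qe = qmax*KL*Ce / (1 + KL*Ce)."""
    return qmax * KL * Ce / (1.0 + KL * Ce)

def freundlich(Ce, KF, n):
    """Freundlich isotherm: qe = KF * Ce**(1/n)."""
    return KF * Ce ** (1.0 / n)

# synthetic equilibrium concentrations and uptakes (all values invented)
Ce = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0])
q_obs = langmuir(Ce, 80.0, 0.3)

(qmax_fit, KL_fit), _ = curve_fit(langmuir, Ce, q_obs, p0=[50.0, 0.1])
(KF_fit, n_fit), _ = curve_fit(freundlich, Ce, q_obs, p0=[10.0, 2.0])
```

With real, noisy data the two isotherms give genuinely different correlation coefficients, which is the comparison the abstract reports.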
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
A decline in the prevalence of injecting drug users in Estonia, 2005–2009
Uusküla, A; Rajaleid, K; Talu, A; Abel-Ollo, K; Des Jarlais, DC
2013-01-01
Aims and setting Descriptions of behavioural epidemics have received little attention compared with infectious disease epidemics in Eastern Europe. Here we report a study aimed at estimating trends in the prevalence of injection drug use between 2005 and 2009 in Estonia. Design and methods The number of injection drug users (IDUs) aged 15–44 in each year between 2005 and 2009 was estimated using capture-recapture methodology based on four data sources (two treatment databases: drug abuse and non-fatal overdose treatment; criminal justice (drug-related offences) data; and mortality (injection drug use-related deaths) data). Poisson log-linear regression models were applied to the matched data, with interactions between data sources fitted to replicate the dependencies between the sources. Linear regression was used to estimate average change over time. Findings There were 24305, 12292, 238 and 545 records, corresponding to 8100, 1655, 155 and 545 individual IDUs, identified in the four capture sources (police, drug treatment, overdose and death registry, respectively) over the period 2005–2009. The estimated prevalence of IDUs among the population aged 15–44 declined from 2.7% (1.8–7.9%) in 2005 to 2.0% (1.4–5.0%) in 2008, and 0.9% (0.7–1.7%) in 2009. Regression analysis indicated an average reduction of over 1700 injectors per year. Conclusion While the capture-recapture method has known limitations, the results are consistent with other data from Estonia. Identifying the drivers of change in the prevalence of injection drug use warrants further research. PMID:23290632
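The study fits four-source Poisson log-linear models with interaction terms; as a much simpler illustration of the capture-recapture principle behind them, a two-source estimator can be sketched. The list sizes below come from the abstract, but the overlap count m = 600 is hypothetical, invented purely for the example.

```python
def chapman_estimate(n1, n2, m):
    """Two-source capture-recapture population estimate
    (Chapman's bias-corrected version of Lincoln-Petersen):
    N ~ (n1+1)(n2+1)/(m+1) - 1, where m is the overlap."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# police (8100) and drug-treatment (1655) list sizes are from the abstract;
# the overlap m = 600 is a hypothetical value for illustration only
N_hat = chapman_estimate(8100, 1655, 600)
```

Multi-source log-linear models generalize this by modeling cell counts of the capture-history cross-classification, with source-by-source interactions absorbing list dependence.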
Schwantes-An, Tae-Hwi; Sung, Heejong; Sabourin, Jeremy A; Justice, Cristina M; Sorant, Alexa J M; Wilson, Alexander F
2016-01-01
In this study, the effects of (a) the minor allele frequency of the single nucleotide variant (SNV), (b) the degree of departure from normality of the trait, and (c) the position of the SNVs on type I error rates were investigated in the Genetic Analysis Workshop (GAW) 19 whole exome sequence data. To test the distribution of the type I error rate, 5 simulated traits were considered: standard normal and gamma distributed traits; 2 transformed versions of the gamma trait (log10 and rank-based inverse normal transformations); and trait Q1 provided by GAW 19. Each trait was tested with 313,340 SNVs. Tests of association were performed with simple linear regression, and average type I error rates were determined for minor allele frequency classes. Rare SNVs (minor allele frequency < 0.05) showed inflated type I error rates for non-normally distributed traits, and the inflation increased as the minor allele frequency decreased. The inflation of average type I error rates also increased as the significance threshold decreased. Normally distributed traits did not show inflated type I error rates with respect to the minor allele frequency for rare SNVs. There was no consistent effect of transformation on the uniformity of the distribution of the locations of SNVs with type I errors.
Circulating fibrinogen but not D-dimer level is associated with vital exhaustion in school teachers.
Kudielka, Brigitte M; Bellingrath, Silja; von Känel, Roland
2008-07-01
Meta-analyses have established elevated fibrinogen and D-dimer levels in the circulation as biological risk factors for the development and progression of coronary artery disease (CAD). Here, we investigated whether vital exhaustion (VE), a known psychosocial risk factor for CAD, is associated with fibrinogen and D-dimer levels in a sample of apparently healthy school teachers. The teaching profession has been proposed as a potentially highly stressful occupation due to enhanced psychosocial stress at the workplace. Plasma fibrinogen and D-dimer levels were measured in 150 middle-aged male and female teachers from the first year of the Trier-Teacher-Stress-Study. Log-transformed levels were analyzed using linear regression. Results yielded a significant association between VE and fibrinogen (p = 0.02), but not D-dimer, after controlling for relevant covariates. Further investigation of possible interaction effects revealed a significant association between fibrinogen and the interaction term "VE x gender" (p = 0.05). In a secondary analysis, we reran the linear regression models for males and females separately. Gender-specific results revealed that the association between fibrinogen and VE remained significant in males but not in females. In sum, the present data support the notion that fibrinogen levels are positively related to VE. Elevated fibrinogen might be one biological pathway by which chronic work stress impacts teachers' cardiovascular health in the long run.
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in investigating the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were first analyzed using simple linear regression and then subjected to multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression results used to identify TCD predictors of stroke prognosis were confirmed by the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single-vessel involvement, as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
A kinetic energy model of two-vehicle crash injury severity.
Sobhani, Amir; Young, William; Logan, David; Bahrololoom, Sareh
2011-05-01
An important part of any model of vehicle crashes is the development of a procedure to estimate crash injury severity. After reviewing existing models of crash severity, this paper outlines the development of a modelling approach aimed at measuring the injury severity of people in two-vehicle road crashes. This model can be incorporated into a discrete event traffic simulation model, using simulation model outputs as its input. The model can then serve as an integral part of a simulation model estimating the crash potential of components of the traffic system. The model is developed using Newtonian Mechanics and Generalised Linear Regression. The factors contributing to the speed change (ΔV(s)) of a subject vehicle are identified using the law of conservation of momentum. A Log-Gamma regression model is fitted to measure speed change (ΔV(s)) of the subject vehicle based on the identified crash characteristics. The kinetic energy applied to the subject vehicle is calculated by the model, which in turn uses a Log-Gamma Regression Model to estimate the Injury Severity Score of the crash from the calculated kinetic energy, crash impact type, presence of airbag and/or seat belt and occupant age. Copyright © 2010 Elsevier Ltd. All rights reserved.
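The momentum step described above can be sketched directly. A perfectly plastic collision (common post-impact velocity) is assumed here for simplicity, and the masses and speeds are invented; the paper's actual ΔV model is a fitted log-gamma regression on crash characteristics.

```python
def delta_v(m1, m2, v1, v2):
    """Speed change of the subject vehicle (1) in a two-vehicle collision,
    assuming both vehicles share a common post-impact velocity
    (conservation of momentum, perfectly plastic impact)."""
    v_common = (m1 * v1 + m2 * v2) / (m1 + m2)
    return abs(v_common - v1)

# invented example: 1500 kg car at 15 m/s strikes a stationary 1000 kg car
dv = delta_v(1500.0, 1000.0, 15.0, 0.0)
ke = 0.5 * 1500.0 * dv ** 2   # kinetic energy term that would feed the severity model
```

In the paper's framework, a quantity like this kinetic energy (together with impact type, restraint use and occupant age) enters the log-gamma regression for the Injury Severity Score.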
NASA Astrophysics Data System (ADS)
Bartiko, Daniel; Chaffe, Pedro; Bonumá, Nadia
2017-04-01
Floods may be strongly affected by climate, land-use, land-cover and water-infrastructure changes. However, it is common to model flood frequency as stationary. This approach has been questioned, especially when estimating the frequency and magnitude of extreme events for designing and maintaining hydraulic structures such as those responsible for flood control and dam safety. Brazil is the third largest producer of hydroelectricity in the world, and many of the country's dams are located in its Southern Region, so it seems appropriate to investigate the presence of non-stationarity in the inflows to these plants. In our study, we used historical flood data from the Brazilian National Grid Operator (ONS) to explore trends in annual maximum river flow of the 38 main rivers flowing into Southern Brazilian reservoirs (records range from 43 to 84 years). In the analysis, we assumed a two-parameter log-normal distribution, and a linear regression model was applied to allow the mean to vary with time. We computed recurrence reduction factors to characterize changes in the return period of a flood initially estimated as the 100-year event by a stationary log-normal model. To evaluate whether a particular site exhibits a positive trend, we considered only data series whose linear regression slope coefficients were significant (p < 0.05); significance was assessed using a one-sided Student's t-test. The trend-model residuals were analyzed using the Anderson-Darling normality test, the Durbin-Watson test for independence and the Breusch-Pagan test for heteroscedasticity. Our results showed that 22 of the 38 data series analyzed have a significant positive trend. The trends were concentrated in three large basins: Iguazu, Uruguay and Paranapanema, which have undergone changes in land use and flow regularization in recent years.
The calculated return period for the series with positive trends varied from 50 to 77 years for a 100-year flood estimated by the stationary model, when considering a planning horizon of ten years. We conclude that attention should be given to future projects developed in this area, including the incorporation of non-stationarity analysis, the search for the drivers of such changes and the incorporation of new data to increase the reliability of the estimates.
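A minimal sketch of the recurrence-reduction idea, assuming log-flows are normal with a mean that drifts linearly in time. All parameter values (mu0, sigma, slope) are invented for illustration; only the standard-normal 99th percentile for the 100-year event is hardcoded.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def effective_return_period(mu0, sigma, slope, years_ahead):
    """Return period, under a linearly trending mean, of the flood that a
    stationary log-normal model rates as the 100-year event at time zero.
    Works in log space: log-flows ~ Normal(mu0 + slope*t, sigma)."""
    z99 = 2.3263                          # Phi^{-1}(0.99), hardcoded for T = 100
    q100 = mu0 + z99 * sigma              # stationary 100-year quantile (log space)
    mu_t = mu0 + slope * years_ahead      # mean at the planning horizon
    p_exceed = 1.0 - norm_cdf((q100 - mu_t) / sigma)
    return 1.0 / p_exceed

# illustrative values only: mu0 = 3.0, sigma = 0.2, trend of 0.005 per year
T_eff = effective_return_period(3.0, 0.2, 0.005, years_ahead=10)
```

With these invented numbers the stationary "100-year" flood becomes roughly a 50-year event after a ten-year horizon, the same order of reduction the abstract reports.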
Kumar, K Vasanth
2007-04-02
Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to the pseudo-second order models of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie pseudo-second order models are equivalent. Non-linear regression analysis showed that Blanchard et al. and Ho express similar ideas on the pseudo-second order model but with different assumptions. The best fit of the experimental data to Ho's pseudo-second order expression by both linear and non-linear regression showed that Ho's model was a better kinetic expression than the other pseudo-second order kinetic expressions.
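The linear-vs-non-linear comparison for Ho's pseudo-second-order model can be sketched on synthetic data (the true qe = 25 and k = 0.01 are invented). On noise-free data both routes recover the same parameters; with real noisy data the linearization distorts the error structure, which is why the abstract favors the non-linear method.

```python
import numpy as np
from scipy.optimize import curve_fit

def pso(t, qe, k):
    """Ho's pseudo-second-order kinetics: qt = qe**2*k*t / (1 + qe*k*t)."""
    return qe * qe * k * t / (1.0 + qe * k * t)

# synthetic, noise-free uptake data (true qe = 25, k = 0.01 are invented)
t = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 40.0, 60.0])
q = pso(t, 25.0, 0.01)

# linear method: t/qt = 1/(k*qe**2) + t/qe, so slope = 1/qe
slope, intercept = np.polyfit(t, t / q, 1)
qe_lin, k_lin = 1.0 / slope, slope * slope / intercept

# non-linear method: least squares directly on the rate equation
(qe_nl, k_nl), _ = curve_fit(pso, t, q, p0=[20.0, 0.005])
```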
Defining a Family of Cognitive Diagnosis Models Using Log-Linear Models with Latent Variables
ERIC Educational Resources Information Center
Henson, Robert A.; Templin, Jonathan L.; Willse, John T.
2009-01-01
This paper uses log-linear models with latent variables (Hagenaars, in "Loglinear Models with Latent Variables," 1993) to define a family of cognitive diagnosis models. In doing so, the relationship between many common models is explicitly defined and discussed. In addition, because the log-linear model with latent variables is a general model for…
Characterizing Sleep Structure Using the Hypnogram
Swihart, Bruce J.; Caffo, Brian; Bandeen-Roche, Karen; Punjabi, Naresh M.
2008-01-01
Objectives: Research on the effects of sleep-disordered breathing (SDB) on sleep structure has traditionally been based on composite sleep-stage summaries. The primary objective of this investigation was to demonstrate the utility of log-linear and multistate analysis of the sleep hypnogram in evaluating differences in nocturnal sleep structure in subjects with and without SDB. Methods: A community-based sample of middle-aged and older adults with and without SDB matched on age, sex, race, and body mass index was identified from the Sleep Heart Health Study. Sleep was assessed with home polysomnography and categorized into rapid eye movement (REM) and non-REM (NREM) sleep. Log-linear and multistate survival analysis models were used to quantify the frequency and hazard rates of transitioning, respectively, between wakefulness, NREM sleep, and REM sleep. Results: Whereas composite sleep-stage summaries were similar between the two groups, subjects with SDB had higher frequencies and hazard rates for transitioning between the three states. Specifically, log-linear models showed that subjects with SDB had more wake-to-NREM sleep and NREM sleep-to-wake transitions, compared with subjects without SDB. Multistate survival models revealed that subjects with SDB transitioned more quickly from wake-to-NREM sleep and NREM sleep-to-wake than did subjects without SDB. Conclusions: The description of sleep continuity with log-linear and multistate analysis of the sleep hypnogram suggests that such methods can identify differences in sleep structure that are not evident with conventional sleep-stage summaries. Detailed characterization of nocturnal sleep evolution with event history methods provides additional means for testing hypotheses on how specific conditions impact sleep continuity and whether sleep disruption is associated with adverse health outcomes. Citation: Swihart BJ; Caffo B; Bandeen-Roche K; Punjabi NM. Characterizing sleep structure using the hypnogram. 
J Clin Sleep Med 2008;4(4):349–355. PMID:18763427
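The transition tallies that feed such log-linear models can be sketched from an epoch-by-epoch hypnogram. The sequence below is a toy example (W = wake, N = NREM, R = REM), not data from the study.

```python
from collections import Counter

def transition_counts(stages):
    """Tally epoch-to-epoch stage transitions in a hypnogram.
    Self-transitions (e.g. N->N) are counted too; they can be dropped
    before log-linear modeling of between-state transitions."""
    return Counter(zip(stages, stages[1:]))

hypnogram = ["W", "N", "N", "R", "W", "N", "W"]   # toy sequence
counts = transition_counts(hypnogram)
```

Contingency tables of such counts, cross-classified by group (e.g. SDB vs no SDB), are exactly the input the log-linear models above operate on.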
Sedimentary sequence evolution in a Foredeep basin: Eastern Venezuela
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bejarano, C.; Funes, D.; Sarzalho, S.
1996-08-01
Well log-seismic sequence stratigraphy analysis in the Eastern Venezuela Foreland Basin leads to study of the evolution of sedimentary sequences onto the Cretaceous-Paleocene passive margin. This basin comprises two different foredeep sub-basins: the older Guarico sub-basin to the west and the younger Maturin sub-basin to the east. A foredeep switching between these two sub-basins is observed at 12.5 m.y. Seismic interpretation and well log sections across the study area show sedimentary sequences with transgressive sands and coastal onlaps to the east-southeast for the Guarico sub-basin, as well as truncations below the switching sequence (12.5 m.y.), while the Maturin sub-basin shows apparent coastal onlaps to the west-northwest, as well as a marine onlap (deeper water) in the west, where it starts to establish. Sequence stratigraphy analysis of these sequences with well logs allowed the study of the evolution of the stratigraphic section from the Paleocene to the middle Miocene (68.0-12.0 m.y.). On the basis of well log patterns, the sequences were divided into regressive-transgressive-regressive sedimentary cycles caused by changes in relative sea level. Facies distributions were analyzed and the sequences were divided into simple sequences or sub-sequences of greater frequency than third order depositional sequences.
Akter, Salima; Rahman, Mohammad Khalilur
2015-01-01
Adipose tissue-derived hormone leptin plays a functional role in glucose tolerance through its effects on insulin secretion and insulin sensitivity, which also represent risk factors for nonalcoholic fatty liver disease (NAFLD). The present study explored the gender-specific association of serum leptin and insulinemic indices with NAFLD in Bangladeshi prediabetic subjects. Under a cross-sectional analytical design, a total of 110 ultrasound-examined prediabetic subjects, aged 25–68 years, consisting of 57.3% male (55.6% non-NAFLD and 44.4% NAFLD) and 42.7% female (57.4% non-NAFLD and 42.6% NAFLD), were investigated. Insulin secretory function (HOMA%B) and insulin sensitivity (HOMA%S) were calculated from the homeostasis model assessment (HOMA). Serum leptin showed significant positive correlations with fasting insulin (r = 0.530, P = 0.004), postprandial insulin (r = 0.384, P = 0.042) and HOMA-IR (r = 0.541, P = 0.003), as well as significant negative correlations with HOMA%S (r = -0.388, P = 0.046) and HOMA%B (r = -0.356, P = 0.039), in male prediabetic subjects with NAFLD. In multiple linear regression analysis, log-transformed leptin showed a significant positive association with HOMA-IR (β = 0.706, P <0.001) after adjusting for the effects of body mass index (BMI), triglyceride (TG) and HOMA%B in male subjects with NAFLD. In binary logistic regression analysis, only log leptin [OR 1.29 95% (C.I) (1.11–1.51), P = 0.001] in male subjects, as well as HOMA%B [OR 0.94 95% (C.I) (0.89–0.98), P = 0.012], HOMA-IR [OR 3.30 95% (C.I) (0.99–10.95), P = 0.049] and log leptin [OR 1.10 95% (C.I) (1.01–1.20), P = 0.026] in female subjects, were found to be independent determinants of NAFLD after adjusting for BMI and TG. Serum leptin seems to have an association with NAFLD in both male and female prediabetic subjects, and this association, in turn, is mediated by insulin secretory dysfunction and insulin resistance among these subjects. PMID:26569494
Raymond M. Rice; Norman H. Pillsbury; Kurt W. Schmidt
1985-01-01
Abstract - A linear discriminant function, developed to predict debris avalanches after clearcut logging on a granitic batholith in northwestern California, was tested on data from two batholiths. The equation was inaccurate in predicting slope stability on one of them. A new equation based on slope, crown cover, and distance from a stream (retained from the original...
Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data
Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.
2014-01-01
In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified, and unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438
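A minimal simulation of the phenomenon discussed above (all coefficients and sample sizes are invented; the logistic fits use plain Newton-Raphson): because the odds ratio is non-collapsible, the unadjusted treatment coefficient is attenuated relative to the covariate-adjusted one even under perfect randomization.

```python
import numpy as np

def logit_fit(X, y, iters=30):
    """Logistic regression fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        W = mu * (1.0 - mu)
        beta += np.linalg.solve((X * W[:, None]).T @ X, X.T @ (y - mu))
    return beta

rng = np.random.default_rng(7)
n = 20000
x = rng.normal(size=n)                                 # strong prognostic covariate
treat = rng.integers(0, 2, size=n).astype(float)       # randomized treatment
p = 1.0 / (1.0 + np.exp(-(0.5 * treat + 2.0 * x)))     # true conditional model
y = (rng.random(n) < p).astype(float)

ones = np.ones(n)
b_unadj = logit_fit(np.column_stack([ones, treat]), y)[1]      # attenuated
b_adj = logit_fit(np.column_stack([ones, treat, x]), y)[1]     # near the true 0.5
```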
Statistical Methodology for the Analysis of Repeated Duration Data in Behavioral Studies
ERIC Educational Resources Information Center
Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre
2018-01-01
Purpose: Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate to analyze such data, because they usually consist of nonnegative and skew-distributed variables. Therefore, we recommend use of a statistical methodology specific to duration data. Method: We propose a…
Janik, Leslie J; Forrester, Sean T; Soriano-Disla, José M; Kirby, Jason K; McLaughlin, Michael J; Reimann, Clemens
2015-02-01
The authors' aim was to develop rapid and inexpensive regression models for the prediction of partitioning coefficients (Kd), defined as the ratio of the total or surface-bound metal/metalloid concentration of the solid phase to the total concentration in the solution phase. Values of Kd were measured for boric acid (B[OH]3(0)) and selected added soluble oxoanions: molybdate (MoO4(2-)), antimonate (Sb[OH](6-)), selenate (SeO4(2-)), tellurate (TeO4(2-)) and vanadate (VO4(3-)). Models were developed using approximately 500 spectrally representative soils of the Geochemical Mapping of Agricultural Soils of Europe (GEMAS) program. These calibration soils represented the major properties of the entire 4813 soils of the GEMAS project. Multiple linear regression (MLR) from soil properties, partial least-squares regression (PLSR) using mid-infrared diffuse reflectance Fourier-transformed (DRIFT) spectra, and models using DRIFT spectra plus analytical pH values (DRIFT + pH), were compared for the prediction of log K(d + 1) values. Apart from selenate (R(2) = 0.43), the DRIFT + pH calibrations resulted in marginally better models for predicting log K(d + 1) values (R(2) = 0.62-0.79) than those from PLSR-DRIFT (R(2) = 0.61-0.72) and MLR (R(2) = 0.54-0.79). The DRIFT + pH calibrations were applied to the prediction of log K(d + 1) values in the remaining 4313 soils. An example map of predicted log K(d + 1) values for added soluble MoO4(2-) in soils across Europe is presented. The DRIFT + pH PLSR models provided a rapid and inexpensive tool to assess the risk of mobility and potential availability of boric acid and selected oxoanions in European soils. For these models to be used in the prediction of log K(d + 1) values in soils globally, additional research will be needed to determine whether soil variability is accounted for in the calibration. © 2014 SETAC.
Radio Propagation Prediction Software for Complex Mixed Path Physical Channels
2006-08-14
Contents excerpt: 4.4.6 Applied Linear Regression Analysis in the Frequency Range 1-50 MHz; 4.4.7 Projected Scaling to... From Section 4.4.6, Applied Linear Regression Analysis in the Frequency Range 1-50 MHz: In order to construct a comprehensive numerical algorithm capable of
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
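The point that analysis of variance is a special case of regression can be shown in a few lines: dummy-coding a three-group factor makes the least-squares coefficients equal the reference-group mean and the group-mean differences. The toy data below are invented.

```python
import numpy as np

# toy one-way layout: three groups of two observations each
y = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
group = np.array([0, 0, 1, 1, 2, 2])

# dummy-code groups 1 and 2 against the reference group 0
X = np.column_stack([np.ones_like(y),
                     (group == 1).astype(float),
                     (group == 2).astype(float)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta = [mean(group 0), mean(group 1) - mean(group 0), mean(group 2) - mean(group 0)]
```

The same device extends to analysis of covariance (add the continuous covariate as a column) and curvilinear regression (add polynomial columns), the other applications the abstract lists.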
Categorical Data Analysis Using a Skewed Weibull Regression Model
NASA Astrophysics Data System (ADS)
Caron, Renault; Sinha, Debajyoti; Dey, Dipak; Polpo, Adriano
2018-03-01
In this paper, we present a Weibull link (skewed) model for categorical response data arising from binomial as well as multinomial models. We show that, for such types of categorical data, the most commonly used models (logit, probit and complementary log-log) can be obtained as limiting cases. We further compare the proposed model with some other asymmetrical models. The Bayesian as well as frequentist estimation procedures for binomial and multinomial data responses are presented in detail. Analyses of two data sets are performed to show the efficiency of the proposed model.
Hawkins, Marquis S; Sevick, Mary Ann; Richardson, Caroline R; Fried, Linda F; Arena, Vincent C; Kriska, Andrea M
2011-08-01
Chronic kidney disease is a condition characterized by the deterioration of the kidney's ability to remove waste products from the body. Although treatments to slow the progression of the disease are available, chronic kidney disease may eventually lead to a complete loss of kidney function. Previous studies have shown that physical activities of moderate intensity may have renal benefits. Few studies have examined the effects of total movement on kidney function. The purpose of this study was to determine the association between time spent at all levels of physical activity intensity and sedentary behavior and kidney function. Data were obtained from the 2003-2004 and 2005-2006 National Health and Nutrition Examination Survey, a cross-sectional study of a complex, multistage probability sample of the US population. Physical activity was assessed using an accelerometer and a questionnaire. Estimated glomerular filtration rate (eGFR) was calculated using the Modification of Diet in Renal Disease study formula. Linear regression was used to assess linear associations between levels of physical activity and sedentary behavior and log-transformed eGFR. In general, physical activity (light and total) was related to log eGFR in females and males. For females, the association between light and total physical activity and log eGFR was consistent regardless of diabetes status. For males, the association between light and total physical activity and log eGFR was significant only in males without diabetes. When examining the association between physical activity, measured objectively with an accelerometer, and kidney function, total and light physical activities were found to be positively associated with kidney function.
A simplified competition data analysis for radioligand specific activity determination.
Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A
1990-01-01
Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superimposable. The non-linear regression method is considered the most adequate procedure for calculating the specific activity, but if suitable software is not available, the other described methods are also suitable.
Nasari, Masoud M; Szyszkowicz, Mieczysław; Chen, Hong; Crouse, Daniel; Turner, Michelle C; Jerrett, Michael; Pope, C Arden; Hubbell, Bryan; Fann, Neal; Cohen, Aaron; Gapstur, Susan M; Diver, W Ryan; Stieb, David; Forouzanfar, Mohammad H; Kim, Sun-Young; Olives, Casey; Krewski, Daniel; Burnett, Richard T
2016-01-01
The effectiveness of regulatory actions designed to improve air quality is often assessed by predicting changes in public health resulting from their implementation. Risk of premature mortality from long-term exposure to ambient air pollution is the single most important contributor to such assessments and is estimated from observational studies generally assuming a log-linear, no-threshold association between ambient concentrations and death. There has been only limited assessment of this assumption in part because of a lack of methods to estimate the shape of the exposure-response function in very large study populations. In this paper, we propose a new class of variable coefficient risk functions capable of capturing a variety of potentially non-linear associations which are suitable for health impact assessment. We construct the class by defining transformations of concentration as the product of either a linear or log-linear function of concentration multiplied by a logistic weighting function. These risk functions can be estimated using hazard regression survival models with currently available computer software and can accommodate large population-based cohorts which are increasingly being used for this purpose. We illustrate our modeling approach with two large cohort studies of long-term concentrations of ambient air pollution and mortality: the American Cancer Society Cancer Prevention Study II (CPS II) cohort and the Canadian Census Health and Environment Cohort (CanCHEC). We then estimate the number of deaths attributable to changes in fine particulate matter concentrations over the 2000 to 2010 time period in both Canada and the USA using both linear and non-linear hazard function models.
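The transformation class described above — a linear or log term in concentration multiplied by a logistic weighting function — can be sketched as follows. The location mu and scale tau values used in the example are illustrative only, not estimates from the paper.

```python
from math import exp, log

def risk_transform(z, mu, tau, use_log=True):
    """Concentration transform: a linear or log-linear term in concentration z
    multiplied by a logistic weighting function with location mu and scale tau.
    The hazard model would use exp(theta * risk_transform(z, ...))."""
    base = log(1.0 + z) if use_log else z
    weight = 1.0 / (1.0 + exp(-(z - mu) / tau))
    return base * weight

# illustrative evaluation at three concentrations with invented mu, tau
t5, t10, t20 = (risk_transform(z, mu=5.0, tau=2.0) for z in (5.0, 10.0, 20.0))
```

Varying mu and tau sweeps the family from near-linear through sublinear, threshold-like shapes, which is what lets the same parametric class capture the potentially non-linear exposure-response associations the abstract describes.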
A break-even analysis for dementia care collaboration: Partners in Dementia Care.
Morgan, Robert O; Bass, David M; Judge, Katherine S; Liu, C F; Wilson, Nancy; Snow, A Lynn; Pirraglia, Paul; Garcia-Maldonado, Maurilio; Raia, Paul; Fouladi, N N; Kunik, Mark E
2015-06-01
Dementia is a costly disease. People with dementia, their families, and their friends are affected on personal, emotional, and financial levels. Prior work has shown that the "Partners in Dementia Care" (PDC) intervention addresses unmet needs and improves psychosocial outcomes and satisfaction with care. We examined whether PDC reduced direct Veterans Health Administration (VHA) health care costs compared with usual care. This study was a cost analysis of the PDC intervention in a 30-month trial involving five VHA medical centers. Study subjects were veterans (N = 434) 50 years of age and older with dementia and their caregivers at two intervention (N = 269) and three comparison sites (N = 165). PDC is a telephone-based care coordination and support service for veterans with dementia and their caregivers, delivered through partnerships between VHA medical centers and local Alzheimer's Association chapters. We tested for differences in total VHA health care costs, including hospital, emergency department, nursing home, outpatient, and pharmacy costs, as well as program costs for intervention participants. Covariates included caregiver reports of veterans' cognitive impairment, behavior problems, and personal care dependencies. We used linear mixed model regression to model change in log total cost post-baseline over a 1-year follow-up period. Intervention participants showed higher VHA costs than usual-care participants both before and after the intervention but did not differ significantly regarding change in log costs from pre- to post-baseline periods. Pre-baseline log cost (p ≤ 0.001), baseline cognitive impairment (p ≤ 0.05), number of personal care dependencies (p ≤ 0.01), and VA service priority (p ≤ 0.01) all predicted change in log total cost. These analyses show that PDC meets veterans' needs without significantly increasing VHA health care costs. 
PDC addresses the priority area of care coordination in the National Plan to Address Alzheimer's Disease, offering a low-cost, structured, protocol-driven, evidence-based method for effectively delivering care coordination.
Independent effects of both right and left ventricular function on plasma brain natriuretic peptide.
Vogelsang, Thomas Wiis; Jensen, Ruben J; Monrad, Astrid L; Russ, Kaspar; Olesen, Uffe H; Hesse, Birger; Kjaer, Andreas
2007-09-01
Brain natriuretic peptide (BNP) is increased in heart failure; however, the relative contributions of the right and left ventricles are largely unknown. The aim was to investigate whether right ventricular function has an independent influence on plasma BNP concentration. Right ventricular ejection fraction (RVEF), left ventricular ejection fraction (LVEF), and left ventricular end-diastolic volume index (LVEDVI) were determined in 105 consecutive patients by first-pass radionuclide ventriculography (FP-RNV) and multiple ECG-gated equilibrium radionuclide ventriculography (ERNV), respectively. BNP was analyzed by immunoassay. Mean LVEF was 0.51 (range 0.10-0.83), with 36% having a reduced LVEF (<0.50). Mean RVEF was 0.50 (range 0.26-0.78), with 43% having a reduced RVEF (<0.50). The mean LVEDVI was 92 ml/m2, with 22% above the upper normal limit (117 ml/m2). Mean BNP was 239 pg/ml (range 0.63-2523). In univariate linear regression analysis, LVEF, LVEDVI and RVEF all correlated significantly with log BNP (p<0.0001). In a multivariate analysis, only RVEF and LVEF remained significant. The parameter estimates of the final adjusted model indicated that the influences of RVEF and LVEF on log BNP were of the same magnitude. BNP, which is a strong prognostic marker in heart failure, depends independently on both left and right ventricular systolic function. This might, at least in part, explain why BNP holds stronger prognostic value than LVEF alone.
ERIC Educational Resources Information Center
Xu, Xueli; von Davier, Matthias
2008-01-01
The general diagnostic model (GDM) utilizes located latent classes for modeling a multidimensional proficiency variable. In this paper, the GDM is extended by employing a log-linear model for multiple populations that assumes constraints on parameters across multiple groups. This constrained model is compared to log-linear models that assume…
Fatigue shifts and scatters heart rate variability in elite endurance athletes.
Schmitt, Laurent; Regnard, Jacques; Desmarets, Maxime; Mauny, Fréderic; Mourot, Laurent; Fouillot, Jean-Pierre; Coulmy, Nicolas; Millet, Grégoire
2013-01-01
This longitudinal study aimed to compare heart rate variability (HRV) in elite athletes identified as either in a 'fatigue' or a 'no-fatigue' state under 'real life' conditions. 57 elite Nordic skiers were surveyed over 4 years. R-R intervals were recorded supine (SU) and standing (ST). Fatigue state was assessed with a validated questionnaire. A multilevel linear regression model was used to analyze relationships between heart rate (HR) and HRV descriptors [total spectral power (TP), power in the low (LF) and high frequency (HF) ranges expressed in ms² and normalized units (nu)] and the status without and with fatigue. Variables not distributed normally were transformed by taking their common logarithm (log10). 172 trials were identified as in a 'fatigue' and 891 as in a 'no-fatigue' state. All supine HR and HRV parameters (Beta±SE) differed significantly (P<0.0001) between 'fatigue' and 'no-fatigue': HRSU (+6.27±0.61 bpm), logTPSU (-0.36±0.04), logLFSU (-0.27±0.04), logHFSU (-0.46±0.05), logLF/HFSU (+0.19±0.03), HFSU(nu) (-9.55±1.33). Differences were also significant (P<0.0001) standing: HRST (+8.83±0.89), logTPST (-0.28±0.03), logLFST (-0.29±0.03), logHFST (-0.32±0.04). The intra-individual variance of HRV parameters was also larger (P<0.05) in the 'fatigue' state (logTPSU: 0.26 vs. 0.07, logLFSU: 0.28 vs. 0.11, logHFSU: 0.32 vs. 0.08, logTPST: 0.13 vs. 0.07, logLFST: 0.16 vs. 0.07, logHFST: 0.25 vs. 0.14). HRV was thus significantly lower in 'fatigue' than in 'no-fatigue', but with larger intra-individual variance of the HRV parameters in 'fatigue'. This broader intra-individual variance might encompass different changes from the no-fatigue state, possibly reflecting different fatigue-induced alterations of the HRV pattern.
Behavioral economic analysis of drug preference using multiple choice procedure data.
Greenwald, Mark K
2008-01-11
The multiple choice procedure (MCP) has been used to evaluate preference for psychoactive drugs, relative to money amounts (price), in human subjects. The present re-analysis shows that MCP data are compatible with behavioral economic analysis of drug choices. Demand curves were constructed from studies with intravenous fentanyl, intramuscular hydromorphone and oral methadone in opioid-dependent individuals; oral d-amphetamine, oral MDMA alone and during fluoxetine treatment, and smoked marijuana alone or following naltrexone pretreatment in recreational drug users. For each participant and dose, the MCP crossover point was converted into unit price (UP) by dividing the money value ($) by the drug dose (mg/70 kg). At the crossover value, the dose ceases to function as a reinforcer, so "0" was entered for this and higher UPs to reflect lack of drug choice. At lower UPs, the dose functions as a reinforcer, and "1" was entered to reflect drug choice. Data for UP vs. average percent choice were plotted in log-log space to generate demand functions. Rank order of opioid inelasticity (slope of non-linear regression) was: fentanyl > hydromorphone (continuing heroin users) > methadone > hydromorphone (heroin abstainers). Rank order of psychostimulant inelasticity was d-amphetamine > MDMA > MDMA + fluoxetine. Smoked marijuana was more inelastic with high-dose naltrexone. These findings show that this method translates individuals' drug preferences into estimates of population demand, which has the potential to yield insights into pharmacotherapy efficacy, abuse liability assessment, and individual differences in susceptibility to drug abuse.
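The unit-price construction and log-log demand fit described above can be sketched in code. The crossover values below are hypothetical, and an ordinary-least-squares fit on log coordinates stands in for the non-linear regression used in the paper:

```python
import math

def unit_price(crossover_dollars, dose_mg_per_70kg):
    """Unit price: money value at the crossover point divided by drug dose."""
    return crossover_dollars / dose_mg_per_70kg

def choice_indicators(crossover_up, price_grid):
    """1 below the crossover unit price (drug chosen), 0 at or above it."""
    return [1 if up < crossover_up else 0 for up in price_grid]

# Hypothetical crossover unit prices for three participants at one dose
crossovers = [2.0, 4.0, 8.0]
grid = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]

# Average percent choice at each unit price on the grid
pct_choice = [100 * sum(choice_indicators(c, [up])[0] for c in crossovers) / len(crossovers)
              for up in grid]

# Demand elasticity approximated by the slope of log(choice) vs log(price),
# using OLS on the points with non-zero average choice
pts = [(math.log(up), math.log(pc)) for up, pc in zip(grid, pct_choice) if pc > 0]
n = len(pts)
mx = sum(x for x, _ in pts) / n
my = sum(y for _, y in pts) / n
slope = sum((x - mx) * (y - my) for x, y in pts) / sum((x - mx) ** 2 for x, _ in pts)
```

A negative slope indicates declining consumption with rising unit price; a steeper (more negative) slope corresponds to more elastic demand.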
A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
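The fixed-knot regression splines described above can be illustrated with a truncated power basis, a classical explicit construction that builds the continuity and smoothness side conditions into the design matrix; the paper's reparameterization itself is not reproduced here. A minimal numpy sketch:

```python
import numpy as np

def truncated_power_basis(t, knots, degree=2):
    """Design columns for a fixed-knot regression spline: global polynomial
    terms 1, t, ..., t^degree plus one truncated term (t - k)_+^degree per
    knot, which enforces continuity (and smoothness of derivatives up to
    degree - 1) at each join point."""
    cols = [t ** d for d in range(degree + 1)]
    cols += [np.clip(t - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

t = np.linspace(0.0, 10.0, 50)
X = truncated_power_basis(t, knots=[3.0, 7.0], degree=2)
# columns: intercept, t, t^2, (t-3)_+^2, (t-7)_+^2
beta = np.array([1.0, 0.5, -0.1, 0.2, 0.3])
y = X @ beta   # a smooth piecewise-quadratic mean curve
```

In a mixed-model setting, such columns can enter the fixed-effects and/or random-effects design matrices; software such as SAS or S-plus then fits the model as usual.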
Temperature-Dependent Survival of Hepatitis A Virus during Storage of Contaminated Onions
Sun, Y.; Laird, D. T.
2012-01-01
Pre- or postharvest contamination of green onions by hepatitis A virus (HAV) has been linked to large numbers of food-borne illnesses. Understanding HAV survival in onions would assist in projecting the risk of disease associated with their consumption. This study defined HAV inactivation rates in contaminated green onions contained in air-permeable, moisture-retaining high-density polyethylene packages that were stored at 3, 10, 14, 20, 21, 22, and 23°C. A protocol was established to recover HAV from whole green onions, with 31% as the average recovery by infectivity assay. Viruses in eluates were primarily analyzed by a 6-well plaque assay on FRhK-4 cells. Eight storage trials, including two trials at 3°C, were conducted, with 3 to 7 onion samples per sampling and 4 to 7 samplings per trial. Linear regression correlation (r2 = 0.80 to 0.98) was observed between HAV survival and storage time for each of the 8 trials held at specific temperatures. Increases in storage temperature resulted in greater HAV inactivation rates, e.g., a reduction of 0.033 log PFU/day at 3.4 ± 0.3°C versus 0.185 log PFU/day at 23.4 ± 0.7°C. Thus, decimal reduction time (D) values of 30, 14, 11, and 5 days, respectively, were obtained for HAV in onions stored at 3, 10, 14, and 23°C. Further regression analysis determined that each 1°C increase in storage temperature increased HAV inactivation by 0.007 log PFU/day in onions (r2 = 0.97). The data suggest that natural degradation of HAV in contaminated fresh produce is minimal and that a preventive strategy is critical to produce safety. The results are useful in predicting the risks associated with HAV contamination in fresh produce. PMID:22544253
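Under the log-linear inactivation model used above, the decimal reduction time is simply the reciprocal of the log10 reduction rate. A small sketch (reusing the reported rates; the helper names are ours) reproduces the reported D values to rounding:

```python
def d_value_days(log10_reduction_per_day):
    """Decimal reduction time: days for a 1-log10 (90%) drop in titer,
    assuming log-linear (first-order) inactivation."""
    return 1.0 / log10_reduction_per_day

# Rates reported for green onions stored at ~3°C and ~23°C
d_cold = d_value_days(0.033)   # ≈ 30 days
d_warm = d_value_days(0.185)   # ≈ 5.4 days

def surviving_fraction(log10_rate, days):
    """Fraction of infectious virus remaining after a given storage time."""
    return 10 ** (-log10_rate * days)
```

At the cold-storage rate, roughly 30 days are needed for a single 1-log reduction, which is why the study concludes that natural degradation alone offers little protection.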
NASA Astrophysics Data System (ADS)
Bloomfield, J. P.; Allen, D. J.; Griffiths, K. J.
2009-06-01
Linear regression methods can be used to quantify geological controls on baseflow index (BFI). This is illustrated using an example from the Thames Basin, UK. Two approaches have been adopted. The areal extents of geological classes based on lithostratigraphic and hydrogeological classification schemes have been correlated with BFI for 44 'natural' catchments from the Thames Basin. When regression models built using lithostratigraphic classes include a constant term, the model is shown to have some physical meaning and the relative influence of the different geological classes on BFI can be quantified. For example, the regression constants for two such models, 0.64 and 0.69, are consistent with the mean observed BFI (0.65) for the Thames Basin, and the signs and relative magnitudes of the regression coefficients for each of the lithostratigraphic classes are consistent with the hydrogeology of the Basin. In addition, regression coefficients for the lithostratigraphic classes scale linearly with estimates of log10 hydraulic conductivity for each lithological class. When a regression is built using a hydrogeological classification scheme with no constant term, the model does not have any physical meaning, but it has a relatively high adjusted R2 value and, because of the continuous coverage of the hydrogeological classification scheme, can be used for predictive purposes. A model calibrated on the 44 'natural' catchments and using four hydrogeological classes (low-permeability surficial deposits, consolidated aquitards, fractured aquifers and intergranular aquifers) is shown to perform as well as a model based on a hydrology of soil types (BFIHOST) scheme in predicting BFI in the Thames Basin. Validation of this model using 110 other 'variably impacted' catchments in the Basin shows that there is a correlation between modelled and observed BFI.
Where the observed BFI is significantly higher than the modelled BFI, the deviations can be explained by an exogenous factor: catchment urban area. It is inferred that this may be due to influences from sewage discharge, mains leakage, and leakage from septic tanks.
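The areal-extent regression described above amounts to predicting BFI as a constant plus a weighted sum of geological class fractions. A minimal sketch with hypothetical class names and coefficients (not the paper's fitted values):

```python
def predicted_bfi(fractions, coefs, constant=0.0):
    """BFI predicted as a regression constant plus areal-fraction-weighted
    coefficients, one per geological class; fractions should sum to 1."""
    assert abs(sum(fractions.values()) - 1.0) < 1e-9
    return constant + sum(coefs[k] * f for k, f in fractions.items())

# Hypothetical catchment: 60% fractured aquifer, 40% consolidated aquitard.
# Coefficients are illustrative only: positive for aquifers (higher baseflow),
# negative for aquitards, around a constant near the basin mean BFI (~0.65).
fractions = {"fractured_aquifer": 0.6, "consolidated_aquitard": 0.4}
coefs = {"fractured_aquifer": 0.25, "consolidated_aquitard": -0.20}
bfi = predicted_bfi(fractions, coefs, constant=0.65)
```

With a constant term included, the constant plays the role of a basin-wide baseline and the class coefficients carry the interpretable geological signal, as the abstract notes.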
The Seismic Tool-Kit (STK): an open source software for seismology and signal processing.
NASA Astrophysics Data System (ADS)
Reymond, Dominique
2016-04-01
We present an open-source software project (GNU Public License), named STK: Seismic ToolKit, dedicated mainly to seismology and signal processing. The STK project, started in 2007, is hosted by SourceForge.net and counts more than 19,500 downloads at the time of writing. The STK project is composed of two main branches. The first is a graphical interface dedicated to signal processing in the SAC format (SAC_ASCII and SAC_BIN), where the signal can be plotted, zoomed, filtered, integrated, differentiated, etc. (a large variety of IIR and FIR filters is provided). The spectral density of the signal is estimated via the Fourier transform, with visualization of the power spectral density (PSD) on a linear or log scale, as well as an evolutive time-frequency representation (sonagram). Three-component signals can also be processed to estimate their polarization properties, either for a given window or for evolutive windows along the time axis. This polarization analysis is useful for extracting polarized noise and for differentiating P waves, Rayleigh waves, Love waves, etc. The second branch is a panel of utility programs for working in terminal mode, with basic programs for computing azimuth and distance in spherical geometry, inter/auto-correlation, spectral density, time-frequency representations for an entire directory of signals, focal planes and main component axes, the radiation pattern of P waves, polarization analysis of different waves (including noise), under/over-sampling of signals, cubic-spline smoothing, and linear/non-linear regression analysis of data sets. A MINimum library of Linear AlGebra (MIN-LINAG) is also provided for the main matrix operations: QR/QL decomposition, Cholesky solution of linear systems, finding eigenvalues/eigenvectors, and QR-solve/Eigen-solve of linear equation systems. STK is developed in C/C++, mainly under Linux, and has also been partially ported to MS Windows.
Useful links: http://sourceforge.net/projects/seismic-toolkit/ http://sourceforge.net/p/seismic-toolkit/wiki/browse_pages/
Dorrucci, Maria; Rezza, Giovanni; Porter, Kholoud; Phillips, Andrew
2007-02-15
To determine whether early postseroconversion CD4 cell counts and human immunodeficiency virus (HIV) loads have changed over time. Our analysis was based on 22 cohorts of people with known dates of seroconversion from Europe, Australia, and Canada (Concerted Action on Seroconversion to AIDS and Death in Europe Collaboration). We focused on individuals seroconverting between 1985 and 2002 who had the first CD4 cell count (n=3687) or HIV load (n=1584) measured within 2 years of seroconversion and before antiretroviral use. Linear regression models were used to assess time trends in postseroconversion CD4 cell count and HIV load. Trends in time to key thresholds were also assessed, using survival analysis. The overall median initial CD4 cell count was 570 cells/μL (interquartile range [IQR], 413-780 cells/μL). The median initial HIV load was 35,542 copies/mL (IQR, 7600-153,050 copies/mL; on the log10 scale, 3.9-5.2 log10 copies/mL). The postseroconversion CD4 cell count changed by an average of -6.33 cells/μL/year (95% confidence interval [CI], -8.47 to -4.20 cells/μL/year; P<.001), whereas an increase was observed in log10 HIV load (+0.044 log10 copies/mL/year; 95% CI, +0.034 to +0.053 log10 copies/mL/year). These trends remained after adjusting for potential confounders. The probability of progressing to a CD4 cell count of <500 cells/μL by 24 months from seroconversion increased from 0.66 (95% CI, 0.63-0.69) for individuals who seroconverted before 1991 to 0.80 (95% CI, 0.75-0.84) for those who seroconverted during 1999-2002. These data suggest that, in Europe, there has been a trend of decrease in the early CD4 cell count and of increase in the early HIV load. Additional research will be necessary to determine whether similar trends exist in other geographical areas.
Linear Equations with the Euler Totient Function
2007-02-13
Luca, Florian; Stănică, Pantelimon
... of positive integers n such that φ(n) = φ(n + 1), and that the set of Phibonacci numbers is A(1,1,−1) + 2. Theorem 2.1. Let C(t, a) = t³ log H(a). Then the estimate #A_a(x) ≪ C(t, a) · x · (log log log x)/√(log log x) holds uniformly in a and 1 ≤ t < y. Note
Inoue, Tomoaki; Maeda, Yasutaka; Sonoda, Noriyuki; Sasaki, Shuji; Kabemura, Teppei; Kobayashi, Kunihisa; Inoguchi, Toyoshi
2016-01-01
Objective Although diabetes mellitus is associated with an increased risk of heart failure with preserved ejection fraction, the underlying mechanisms leading to left ventricular diastolic dysfunction (LVDD) remain poorly understood. The study was designed to assess the risk factors for LVDD in patients with type 2 diabetes mellitus. Research design and methods The study cohort included 101 asymptomatic patients with type 2 diabetes mellitus without overt heart disease. Left ventricular diastolic function was estimated as the ratio of early diastolic velocity (E) from transmitral inflow to early diastolic velocity (e’) of tissue Doppler at mitral annulus (E/e’). Parameters of glycemic control, plasma insulin concentration, treatment with antidiabetic drugs, lipid profile, and other clinical characteristics were evaluated, and their association with E/e’ determined. Patients with New York Heart Association class >1, ejection fraction <50%, history of coronary artery disease, severe valvulopathy, chronic atrial fibrillation, or creatinine clearance <30 mL/min, as well as those receiving insulin treatment, were excluded. Results Univariate analysis showed that E/e’ was significantly correlated with age (p<0.001), sex (p<0.001), duration of diabetes (p=0.002), systolic blood pressure (p=0.017), pulse pressure (p=0.010), fasting insulin concentration (p=0.025), and sulfonylurea use (p<0.001). Multivariate linear regression analysis showed that log E/e’ was significantly and positively correlated with log age (p=0.034), female sex (p=0.019), log fasting insulin concentration (p=0.010), and sulfonylurea use (p=0.027). Conclusions Hyperinsulinemia and sulfonylurea use may be important in the development of LVDD in patients with type 2 diabetes mellitus. PMID:27648285
Detection of changes in leaf water content using near- and middle-infrared reflectances
NASA Technical Reports Server (NTRS)
Hunt, E. Raymond, Jr.; Rock, Barrett N.
1989-01-01
A method to detect plant water stress by remote sensing is proposed using indices of near-IR and mid-IR wavelengths. The ability of the Leaf Water Content Index (LWCI) to determine leaf relative water content (RWC) is tested on species with different leaf morphologies, and the way in which the Moisture Stress Index (MSI) varies with RWC is studied. In tests with several species, it is found that LWCI is equal to RWC, although the reflectances at 1.6 microns for two different RWCs must be known to accurately predict an unknown RWC. A linear correlation is found between MSI and RWC, with each species having a different regression equation. Also, MSI is correlated with log10 Equivalent Water Thickness (EWT), with data for all species falling on the same regression line. It is found that the minimum significant change of RWC that could be detected by applying the linear regression equation of MSI to EWT is 52 percent. Because the natural RWC variation from water stress is about 20 percent for most species, it is concluded that the near-IR and mid-IR reflectances cannot be used to remotely sense water stress.
Correlations between chromatographic parameters and bioactivity predictors of potential herbicides.
Janicka, Małgorzata
2014-08-01
Different liquid chromatography techniques, including reversed-phase liquid chromatography on Purosphere RP-18e, IAM.PC.DD2 and Cosmosil Cholester columns and micellar liquid chromatography with a Purosphere RP-8e column and using buffered sodium dodecyl sulfate-acetonitrile as the mobile phase, were applied to study the lipophilic properties of 15 newly synthesized phenoxyacetic and carbamic acid derivatives, which are potential herbicides. Chromatographic lipophilicity descriptors were used to extrapolate log k parameters (log kw and log km) and log k values. Partitioning lipophilicity descriptors, i.e., log P coefficients in an n-octanol-water system, were computed from the molecular structures of the tested compounds. Bioactivity descriptors, including partition coefficients in a water-plant cuticle system and water-human serum albumin and coefficients for human skin partition and permeation, were calculated in silico by ACD/ADME software using the linear solvation energy relationship of Abraham. Principal component analysis was applied to describe similarities between various chromatographic and partitioning lipophilicities. Highly significant, predictive linear relationships were found between chromatographic parameters and bioactivity descriptors. © The Author [2013]. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Arnaoutakis, George J.; George, Timothy J.; Alejo, Diane E.; Merlo, Christian A.; Baumgartner, William A.; Cameron, Duke E.; Shah, Ashish S.
2011-01-01
Context: The impact of the Society of Thoracic Surgeons (STS) predicted mortality risk score on resource utilization after aortic valve replacement (AVR) has not been previously studied. Objective: We hypothesized that increasing STS risk scores in patients having AVR are associated with greater hospital charges. Design, Setting, and Patients: Clinical and financial data for patients undergoing AVR at a tertiary care, university hospital over a ten-year period (1/2000–12/2009) were retrospectively reviewed. The current STS formula (v2.61) for in-hospital mortality was used for all patients. After stratification into risk quartiles (Q), index admission hospital charges were compared across risk strata with rank-sum tests. Linear regression and Spearman's coefficient assessed correlation and goodness of fit. Multivariable analysis assessed the relative contributions of individual variables to overall charges. Main Outcome Measures: Inflation-adjusted index hospitalization total charges. Results: 553 patients had AVR during the study period. Average predicted mortality was 2.9% (±3.4) and actual mortality was 3.4% for AVR. Median charges were greater in the upper quartile of AVR patients [Q1–3: $39,949 (IQR $32,708–$51,323) vs. Q4: $62,301 (IQR $45,952–$97,103), p<0.01]. On univariate linear regression, there was a positive correlation between STS risk score and log-transformed charges (coefficient: 0.06, 95% CI 0.05–0.07, p<0.01). Spearman's correlation R-value was 0.51. This positive correlation persisted in risk-adjusted multivariable linear regression. Each 1% increase in STS risk score was associated with an added $3,000 in hospital charges. Conclusions: This study showed that an increasing STS risk score predicts greater charges after AVR. As competing therapies such as percutaneous valve replacement emerge to treat high-risk patients, these results serve as a benchmark for comparing resource utilization. PMID:21497834
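Because the charges were log-transformed, the 0.06 coefficient per 1% of STS risk implies a multiplicative effect of exp(0.06) − 1 ≈ 6.2% per risk point; applied to a median charge in the rough range reported, that is consistent with the quoted ~$3,000 increment. A sketch of the back-transformation (the median figure below is illustrative, not taken from the study):

```python
import math

def pct_change_from_log_coef(beta):
    """Multiplicative effect implied by a regression coefficient on a
    log-transformed outcome: exp(beta) - 1, expressed as a percentage."""
    return (math.exp(beta) - 1.0) * 100.0

per_point = pct_change_from_log_coef(0.06)   # ≈ 6.2% more charges per 1% STS risk

# Implied dollar increment per risk point on a hypothetical median charge
median_charge = 48_000.0                        # illustrative figure only
increment = median_charge * per_point / 100.0   # ≈ $3,000
```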
Conjoint Analysis: A Study of the Effects of Using Person Variables.
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…
Spencer, Monique E; Jain, Alka; Matteini, Amy; Beamer, Brock A; Wang, Nae-Yuh; Leng, Sean X; Punjabi, Naresh M; Walston, Jeremy D; Fedarko, Neal S
2010-08-01
Neopterin, a GTP metabolite expressed by macrophages, is a marker of immune activation. We hypothesize that levels of this serum marker alter with donor age, reflecting increased chronic immune activation in normal aging. In addition to age, we assessed gender, race, body mass index (BMI), and percentage of body fat (%fat) as potential covariates. Serum was obtained from 426 healthy participants whose age ranged from 18 to 87 years. Anthropometric measures included %fat and BMI. Neopterin concentrations were measured by competitive ELISA. The paired associations between neopterin and age, BMI, or %fat were analyzed by Spearman's correlation or by linear regression of log-transformed neopterin, whereas overall associations were modeled by multiple regression of log-transformed neopterin as a function of age, gender, race, BMI, %fat, and interaction terms. Across all participants, neopterin exhibited a positive association with age, BMI, and %fat. Multiple regression modeling of neopterin in women and men as a function of age, BMI, and race revealed that each covariate contributed significantly to neopterin values and that optimal modeling required an interaction term between race and BMI. The covariate %fat was highly correlated with BMI and could be substituted for BMI to yield similar regression coefficients. The association of age and gender with neopterin levels and their modification by race, BMI, or %fat reflect the biology underlying chronic immune activation and perhaps gender differences in disease incidence, morbidity, and mortality.
Majorization Minimization by Coordinate Descent for Concave Penalized Generalized Linear Models
Jiang, Dingfeng; Huang, Jian
2013-01-01
Recent studies have demonstrated theoretical attractiveness of a class of concave penalties in variable selection, including the smoothly clipped absolute deviation and minimax concave penalties. The computation of the concave penalized solutions in high-dimensional models, however, is a difficult task. We propose a majorization minimization by coordinate descent (MMCD) algorithm for computing the concave penalized solutions in generalized linear models. In contrast to the existing algorithms that use local quadratic or local linear approximation to the penalty function, the MMCD seeks to majorize the negative log-likelihood by a quadratic loss, but does not use any approximation to the penalty. This strategy makes it possible to avoid the computation of a scaling factor in each update of the solutions, which improves the efficiency of coordinate descent. Under certain regularity conditions, we establish theoretical convergence property of the MMCD. We implement this algorithm for a penalized logistic regression model using the SCAD and MCP penalties. Simulation studies and a data example demonstrate that the MMCD works sufficiently fast for the penalized logistic regression in high-dimensional settings where the number of covariates is much larger than the sample size. PMID:25309048
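The key trick described above can be sketched for MCP-penalized logistic regression: the logistic curvature p(1 − p) is bounded by 1/4, so a quadratic with that fixed curvature majorizes the negative log-likelihood, and each coordinate update applies a closed-form MCP ("firm") threshold with no recomputed scaling factor. This is our illustration of the general strategy, not the authors' reference implementation:

```python
import numpy as np

def mcp_threshold(z, lam, gamma, v):
    """Closed-form coordinate minimizer of (v/2)*b^2 - z*b + MCP(|b|; lam, gamma)
    (the 'firm' threshold). Requires gamma > 1/v."""
    if abs(z) <= gamma * lam * v:
        soft = np.sign(z) * max(abs(z) - lam, 0.0)
        return soft / (v - 1.0 / gamma)
    return z / v

def mmcd_logistic_mcp(X, y, lam=0.05, gamma=8.0, n_cycles=200):
    """MCP-penalized logistic regression by majorization-minimization
    coordinate descent: curvature fixed at v = 1/4, the global upper bound
    on p(1-p), so the surrogate majorizes the negative log-likelihood.
    Columns of X are assumed standardized; no intercept, for brevity."""
    n, p = X.shape
    beta = np.zeros(p)
    v = 0.25
    eta = X @ beta
    for _ in range(n_cycles):
        for j in range(p):
            prob = 1.0 / (1.0 + np.exp(-eta))
            z = v * beta[j] + X[:, j] @ (y - prob) / n
            new = mcp_threshold(z, lam, gamma, v)
            eta += X[:, j] * (new - beta[j])   # keep linear predictor in sync
            beta[j] = new
    return beta

# Simulated example: only the first of five predictors is truly active.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-2.0 * X[:, 0]))).astype(float)
beta_hat = mmcd_logistic_mcp(X, y)
```

With standardized predictors the curvature bound holds for every coordinate, which is what makes the fixed scaling factor (and hence the efficiency gain over local quadratic approximation) possible.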
Dai, James Y.; Chan, Kwun Chuen Gary; Hsu, Li
2014-01-01
Instrumental variable regression is one way to overcome unmeasured confounding and estimate causal effects in observational studies. Built on structural mean models, there has been considerable work recently developed for consistent estimation of causal relative risks and causal odds ratios. Such models can sometimes suffer from identification issues for weak instruments, which has hampered the applicability of Mendelian randomization analysis in genetic epidemiology. When there are multiple genetic variants available as instrumental variables, and the causal effect is defined in a generalized linear model in the presence of unmeasured confounders, we propose to test concordance between instrumental variable effects on the intermediate exposure and instrumental variable effects on the disease outcome, as a means to test the causal effect. We show that a class of generalized least squares estimators provide valid and consistent tests of causality. For the causal effect of a continuous exposure on a dichotomous outcome in logistic models, the proposed estimators are shown to be asymptotically conservative. When the disease outcome is rare, such estimators are consistent due to the log-linear approximation of the logistic function. Optimality of such estimators relative to the well-known two-stage least squares estimator and the double-logistic structural mean model is further discussed. PMID:24863158
Random forest models to predict aqueous solubility.
Palmer, David S; O'Boyle, Noel M; Glen, Robert C; Mitchell, John B O
2007-01-01
Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 degrees C gave an r2 = 0.89 and RMSE = 0.69 log S units. For a standard data set selected from the literature, the model performed well with respect to other documented methods. Finally, the diversity of the training and test sets are compared to the chemical space occupied by molecules in the MDL drug data report, on the basis of molecular descriptors selected by the regression analysis.
The association between subgingival periodontal pathogens and systemic inflammation.
Winning, Lewis; Patterson, Christopher C; Cullen, Kathy M; Stevenson, Kathryn A; Lundy, Fionnuala T; Kee, Frank; Linden, Gerard J
2015-09-01
To investigate associations between periodontal disease pathogens and levels of systemic inflammation measured by C-reactive protein (CRP). A representative sample of dentate 60-70-year-old men in Northern Ireland had a comprehensive periodontal examination. Men taking statins were excluded. Subgingival plaque samples were analysed by quantitative real time PCR to identify the presence of Aggregatibacter actinomycetemcomitans, Porphyromonas gingivalis, Treponema denticola and Tannerella forsythia. High-sensitivity CRP (mg/l) was measured from fasting blood samples. Multiple linear regression analysis was performed using log-transformed CRP concentration as the dependent variable, with the presence of each periodontal pathogen as predictor variables, with adjustment for various potential confounders. A total of 518 men (mean age 63.6 SD 3.0 years) were included in the analysis. Multiple regression analysis showed that body mass index (p < 0.001), current smoking (p < 0.01), the detectable presence of P. gingivalis (p < 0.01) and hypertension (p = 0.01), were independently associated with an increased CRP. The detectable presence of P. gingivalis was associated with a 20% (95% confidence interval 4-35%) increase in CRP (mg/l) after adjustment for all other predictor variables. In these 60-70-year-old dentate men, the presence of P. gingivalis in subgingival plaque was significantly associated with a raised level of C-reactive protein. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.
1983-09-01
books Applied Linear Regression [Ref. 39] and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event...Indexes, Master's Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Sanford, Applied Linear Regression, Wiley, 1980. 40
Teaching the Concept of Breakdown Point in Simple Linear Regression.
ERIC Educational Resources Information Center
Chan, Wai-Sum
2001-01-01
Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
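The zero breakdown point of ordinary least squares is easy to demonstrate numerically: a single arbitrarily bad observation can move the fitted slope arbitrarily far. A small sketch with synthetic data (not from the article):

```python
import numpy as np

def ols_slope(x, y):
    # least-squares slope: cov(x, y) / var(x)
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)

x = np.arange(10.0)
y = 2.0 * x + 1.0              # ten points exactly on the line y = 1 + 2x

x_bad = np.append(x, 9.0)      # add a single gross outlier ...
y_bad = np.append(y, 1e6)      # ... and the least-squares fit is destroyed
```

Here `ols_slope(x, y)` recovers the true slope 2, while one contaminated point out of eleven drives the slope to tens of thousands, which is exactly what a breakdown point of 1/n formalizes.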
The role of NT-proBNP in explaining the variance in anaerobic threshold and VE/VCO(2) slope.
Athanasopoulos, Leonidas V; Dritsas, Athanasios; Doll, Helen A; Cokkinos, Dennis V
2011-01-01
We investigated whether anaerobic threshold (AT) and ventilatory efficiency (minute ventilation/carbon dioxide production slope, VE/VCO2 slope), both significantly associated with mortality, can be predicted by questionnaire scores and/or other laboratory measurements. Anaerobic threshold and VE/VCO2 slope, plasma N-terminal pro-brain natriuretic peptide (NT-proBNP), and the echocardiographic markers left ventricular ejection fraction (LVEF) and left atrial (LA) diameter were measured in 62 patients with heart failure (HF), who also completed the Minnesota Living with Heart Failure Questionnaire (MLHF) and the Specific Activity Questionnaire (SAQ). Linear regression models, adjusting for age and gender, were fitted. While the etiology of HF, SAQ score, MLHF score, LVEF, LA diameter, and logNT-proBNP were each significantly predictive of both AT and VE/VCO2 slope, on stepwise multiple linear regression only SAQ score (P < .001) and logNT-proBNP (P = .001) were significantly predictive of AT, explaining 56% of the variability (adjusted R2 = 0.525), while logNT-proBNP (P < .001) and etiology of HF (P = .003) were significantly predictive of VE/VCO2 slope, explaining 49% of the variability (adjusted R2 = 0.45). The area under the ROC curve for NT-proBNP to identify patients with a VE/VCO2 slope greater than 34 and AT less than 11 mL · kg(-1) · min(-1) was 0.797 (P < .001) and 0.712 (P = .044), respectively. A plasma concentration greater than 429.5 pg/mL (sensitivity: 78%; specificity: 70%) and greater than 674.5 pg/mL (sensitivity: 77.8%; specificity: 65%) identified a VE/VCO2 slope greater than 34 and AT lower than 11 mL · kg(-1) · min(-1), respectively. NT-proBNP is independently related to both AT and VE/VCO2 slope. Specific Activity Questionnaire score is independently related only to AT and the etiology of HF only to VE/VCO2 slope.
Calculating the Solubilities of Drugs and Drug-Like Compounds in Octanol.
Alantary, Doaa; Yalkowsky, Samuel
2016-09-01
A modification of the Van't Hoff equation is used to predict the solubility of organic compounds in dry octanol. The new equation describes a linear relationship between the logarithm of the solubility of a solute in octanol and its melting temperature. More than 620 experimentally measured octanol solubilities, collected from the literature, are used to validate the equation without using any regression or fitting. The average absolute error of the prediction is 0.66 log units. Copyright © 2016 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
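The functional form described above can be sketched as follows; note that the intercept `a` and slope `b` are placeholder values for illustration only, not the coefficients of the cited paper:

```python
def log_s_octanol(tm_celsius, a=0.5, b=0.01):
    """Illustrative Van't Hoff-type form: log10 solubility in dry octanol
    decreases linearly with melting temperature above 25 C. The coefficients
    a and b are hypothetical placeholders, not the published values."""
    if tm_celsius <= 25.0:
        return a          # compounds liquid at 25 C: no melting-point penalty
    return a - b * (tm_celsius - 25.0)
```

Because the relation is fixed in advance, validating it against measured solubilities requires no regression, which is the point the abstract emphasizes.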
Development of a pharmacogenetic-guided warfarin dosing algorithm for Puerto Rican patients.
Ramos, Alga S; Seip, Richard L; Rivera-Miranda, Giselle; Felici-Giovanini, Marcos E; Garcia-Berdecia, Rafael; Alejandro-Cowan, Yirelia; Kocherla, Mohan; Cruz, Iadelisse; Feliu, Juan F; Cadilla, Carmen L; Renta, Jessica Y; Gorowski, Krystyna; Vergara, Cunegundo; Ruaño, Gualberto; Duconge, Jorge
2012-12-01
This study was aimed at developing a pharmacogenetic-driven warfarin-dosing algorithm in 163 admixed Puerto Rican patients on stable warfarin therapy. A multiple linear-regression analysis was performed using log-transformed effective warfarin dose as the dependent variable, and combining CYP2C9 and VKORC1 genotyping with other relevant nongenetic clinical and demographic factors as independent predictors. The model explained more than two-thirds of the observed variance in the warfarin dose among Puerto Ricans, and also produced significantly better 'ideal dose' estimates than two pharmacogenetic models and clinical algorithms published previously, with the greatest benefit seen in patients ultimately requiring <7 mg/day. We also assessed the clinical validity of the model using an independent validation cohort of 55 Puerto Rican patients from Hartford, CT, USA (R(2) = 51%). Our findings provide the basis for planning prospective pharmacogenetic studies to demonstrate the clinical utility of genotyping warfarin-treated Puerto Rican patients.
Deeb, Omar; Shaik, Basheerulla; Agrawal, Vijay K
2014-10-01
Quantitative Structure-Activity Relationship (QSAR) models for binding affinity constants (log Ki) of 78 flavonoid ligands towards the benzodiazepine site of the GABA(A) receptor complex were calculated using the machine learning methods artificial neural network (ANN) and support vector machine (SVM). The models obtained were compared with those obtained using multiple linear regression (MLR) analysis. Descriptor selection and model building were performed with 10-fold cross-validation using the training data set. The SVM and MLR coefficients of determination are 0.944 and 0.879, respectively, for the training set and are higher than those of the ANN models. Although the SVM model shows improved fitting of the training set, the ANN model was superior to SVM and MLR in predicting the test set. A randomization test was employed to check the suitability of the models.
Sgroi, Dennis C; Chapman, Judy-Anne W; Badovinac-Crnjevic, T; Zarella, Elizabeth; Binns, Shemeica; Zhang, Yi; Schnabel, Catherine A; Erlander, Mark G; Pritchard, Kathleen I; Han, Lei; Shepherd, Lois E; Goss, Paul E; Pollak, Michael
2016-01-04
Biomarkers that can be used to accurately assess the residual risk of disease recurrence in women with hormone receptor-positive breast cancer are clinically valuable. We evaluated the prognostic value of the Breast Cancer Index (BCI), a continuous risk index based on a combination of HOXB13:IL17BR and molecular grade index, in women with early breast cancer treated with either tamoxifen alone or tamoxifen plus octreotide in the NCIC MA.14 phase III clinical trial (ClinicalTrials.gov Identifier NCT00002864; registered 1 November 1999). Gene expression analysis of BCI by real-time polymerase chain reaction was performed blinded to outcome on RNA extracted from archived formalin-fixed, paraffin-embedded tumor samples of 299 patients with both lymph node-negative (LN-) and lymph node-positive (LN+) disease enrolled in the MA.14 trial. Our primary objective was to determine the prognostic performance of BCI based on relapse-free survival (RFS). MA.14 patients experienced similar RFS on both treatment arms. Association of gene expression data with RFS was evaluated in univariate analysis with a stratified log-rank test statistic, depicted with a Kaplan-Meier plot and an adjusted Cox survivor plot. In the multivariate assessment, we used stratified Cox regression. The prognostic performance of an emerging, optimized linear BCI model was also assessed in a post hoc analysis. Of the 299 samples, 292 were successfully assessed for BCI (146 patients accrued in each MA.14 treatment arm). BCI risk groups had a significant univariate association with RFS (stratified log-rank p = 0.005, unstratified log-rank p = 0.007). Adjusted 10-year RFS in BCI low-, intermediate-, and high-risk groups was 87.5%, 83.9%, and 74.7%, respectively. 
BCI had a significant prognostic effect [hazard ratio (HR) 2.34, 95 % confidence interval (CI) 1.33-4.11; p = 0.004], although not a predictive effect, on RFS in stratified multivariate analysis, adjusted for pathological tumor stage (HR 2.22, 95 % CI 1.22-4.07; p = 0.01). In the post hoc multivariate analysis, higher linear BCI was associated with shorter RFS (p = 0.002). BCI had a strong prognostic effect on RFS in patients with early-stage breast cancer treated with tamoxifen alone or with tamoxifen and octreotide. BCI was prognostic in both LN- and LN+ patients. This retrospective study is an independent validation of the prognostic performance of BCI in a prospective trial.
Chronic Kidney Disease Is Associated With White Matter Hyperintensity Volume
Khatri, Minesh; Wright, Clinton B.; Nickolas, Thomas L.; Yoshita, Mitsuhiro; Paik, Myunghee C.; Kranwinkel, Grace; Sacco, Ralph L.; DeCarli, Charles
2010-01-01
Background and Purpose White matter hyperintensities have been associated with increased risk of stroke, cognitive decline, and dementia. Chronic kidney disease is a risk factor for vascular disease and has been associated with inflammation and endothelial dysfunction, which have been implicated in the pathogenesis of white matter hyperintensities. Few studies have explored the relationship between chronic kidney disease and white matter hyperintensities. Methods The Northern Manhattan Study is a prospective, community-based cohort of which a subset of stroke-free participants underwent MRIs. MRIs were analyzed quantitatively for white matter hyperintensities volume, which was log-transformed to yield a normal distribution (log-white matter hyperintensity volume). Kidney function was modeled using serum creatinine, the Cockcroft-Gault formula for creatinine clearance, and the Modification of Diet in Renal Disease formula for estimated glomerular filtration rate. Creatinine clearance and estimated glomerular filtration rate were trichotomized to 15 to 60 mL/min, 60 to 90 mL/min, and >90 mL/min (reference). Linear regression was used to measure the association between kidney function and log-white matter hyperintensity volume adjusting for age, gender, race–ethnicity, education, cardiac disease, diabetes, homocysteine, and hypertension. Results Baseline data were available on 615 subjects (mean age 70 years, 60% women, 18% whites, 21% blacks, 62% Hispanics). In multivariate analysis, creatinine clearance 15 to 60 mL/min was associated with increased log-white matter hyperintensity volume (β 0.322; 95% CI, 0.095 to 0.550) as was estimated glomerular filtration rate 15 to 60 mL/min (β 0.322; 95% CI, 0.080 to 0.564). Serum creatinine, per 1-mg/dL increase, was also positively associated with log-white matter hyperintensity volume (β 1.479; 95% CI, 1.067 to 2.050). 
Conclusions The association between moderate–severe chronic kidney disease and white matter hyperintensity volume highlights the growing importance of kidney disease as a possible determinant of cerebrovascular disease and/or as a marker of microangiopathy. PMID:17962588
MIXOR: a computer program for mixed-effects ordinal regression analysis.
Hedeker, D; Gibbons, R D
1996-03-01
MIXOR provides marginal maximum likelihood estimates for mixed-effects ordinal probit, logistic, and complementary log-log regression models. These models can be used for analysis of dichotomous and ordinal outcomes from either a clustered or longitudinal design. For clustered data, the mixed-effects model assumes that data within clusters are dependent. The degree of dependency is jointly estimated with the usual model parameters, thus adjusting for dependence resulting from clustering of the data. Similarly, for longitudinal data, the mixed-effects approach can allow for individual-varying intercepts and slopes across time, and can estimate the degree to which these time-related effects vary in the population of individuals. Estimation is by marginal maximum likelihood using a Fisher-scoring solution. For the scoring solution, the Cholesky factor of the random-effects variance-covariance matrix is estimated, along with the effects of model covariates. Examples illustrating usage and features of MIXOR are provided.
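Of the three link functions MIXOR supports, the complementary log-log is the least familiar. As a sketch, the link and its inverse are simply the following (this illustrates the link alone, not the mixed-effects marginal maximum likelihood machinery):

```python
import math

def cloglog(p):
    # complementary log-log link: maps a probability in (0, 1) to the real line
    return math.log(-math.log(1.0 - p))

def inv_cloglog(eta):
    # inverse link: cumulative probability implied by linear predictor eta
    return 1.0 - math.exp(-math.exp(eta))
```

Unlike the probit and logit links, the complementary log-log is asymmetric around p = 0.5, which is why it arises naturally for grouped survival-time (discrete hazard) data.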
von Eye, Alexander; Mun, Eun Young; Bogat, G Anne
2008-03-01
This article reviews the premises of configural frequency analysis (CFA), including methods of choosing significance tests and base models, as well as protecting alpha, and discusses why CFA is a useful approach when conducting longitudinal person-oriented research. CFA operates at the manifest variable level. Longitudinal CFA seeks to identify those temporal patterns that stand out as more frequent (CFA types) or less frequent (CFA antitypes) than expected with reference to a base model. A base model that has been used frequently in CFA applications, prediction CFA, and a new base model, auto-association CFA, are discussed for analysis of cross-classifications of longitudinal data. The former base model takes the associations among predictors and among criteria into account. The latter takes the auto-associations among repeatedly observed variables into account. Application examples of each are given using data from a longitudinal study of domestic violence. It is demonstrated that CFA results are not redundant with results from log-linear modeling or multinomial regression and that, of these approaches, CFA shows particular utility when conducting person-oriented research.
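The core CFA computation can be sketched for the simplest first-order base model: expected cell counts under row-by-column independence, with per-cell standardized residuals flagging types and antitypes. This is a minimal sketch only; a real CFA protects alpha across the many cells (e.g. Bonferroni), and the prediction and auto-association base models discussed in the article impose different margins:

```python
import numpy as np

def cfa_independence(table, z_crit=1.96):
    """First-order CFA sketch: expected counts under the independence base
    model, standardized residuals, and boolean masks for configurations
    occurring more (types) or less (antitypes) often than expected."""
    table = np.asarray(table, dtype=float)
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    z = (table - expected) / np.sqrt(expected)   # standardized residuals
    return expected, z, z > z_crit, z < -z_crit

# Toy 2x2 cross-classification with a strong diagonal pattern.
expected, z, types, antitypes = cfa_independence([[30, 5], [5, 30]])
```

For this toy table every expected count is 17.5, so the concordant cells emerge as types and the discordant cells as antitypes, illustrating how CFA results differ from an omnibus log-linear fit statistic.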
Twenty-year trends in cardiovascular risk factors in India and influence of educational status.
Gupta, Rajeev; Guptha, Soneil; Gupta, V P; Agrawal, Aachu; Gaur, Kiran; Deedwania, Prakash C
2012-12-01
Urban middle-socioeconomic status (SES) subjects have high burden of cardiovascular risk factors in low-income countries. To determine secular trends in risk factors among this population and to correlate risks with educational status we performed epidemiological studies in India. Five cross-sectional studies were performed in middle-SES urban locations in Jaipur, India from years 1992 to 2010. Cluster sampling was performed. Subjects (men, women) aged 20-59 years evaluated were 712 (459, 253) in 1992-94, 558 (286, 272) in 1999-2001, 374 (179, 195) in 2002-03, 887 (414, 473) in 2004-05, and 530 (324, 206) in 2009-10. Data were obtained by history, anthropometry, and fasting blood glucose and lipids estimation. Response rates varied from 55 to 75%. Mean values and risk factor prevalence were determined. Secular trends were identified using quadratic and log-linear regression and chi-squared for trend. Across the studies, there was high prevalence of overweight, hypertension, and lipid abnormalities. Age- and sex-adjusted trends showed significant increases in mean body mass index (BMI), fasting glucose, total cholesterol, high-density lipoprotein (HDL) cholesterol, and triglycerides (quadratic and log-linear regression, p < 0.001). Systolic blood pressure (BP) decreased while insignificant changes were observed for waist-hip ratio and low-density lipoprotein (LDL) cholesterol. Categorical trends showed increase in overweight and decrease in smoking (p < 0.05); insignificant changes were observed in truncal obesity, hypertension, hypercholesterolaemia, and diabetes. Adjustment for educational status attenuated linear trends in BMI and total and LDL cholesterol and accentuated trends in systolic BP, glucose, and HDL cholesterol. There was significant association of an increase in education with decline in smoking and an increase in overweight (two-line regression p < 0.05). In Indian urban middle-SES subjects there is high prevalence of cardiovascular risk factors. 
Over a 20-year period BMI and overweight increased, smoking and systolic BP decreased, and truncal obesity, hypercholesterolaemia, and diabetes remained stable. Increasing educational status attenuated trends for systolic BP, glucose and HDL cholesterol, and BMI.
Mairinger, Fabian D; Schmeller, Jan; Borchert, Sabrina; Wessolly, Michael; Mairinger, Elena; Kollmeier, Jens; Hager, Thomas; Mairinger, Thomas; Christoph, Daniel C; Walter, Robert F H; Eberhardt, Wilfried E E; Plönes, Till; Wohlschlaeger, Jeremias; Jasani, Bharat; Schmid, Kurt Werner; Bankfalvi, Agnes
2018-04-27
Malignant pleural mesothelioma (MPM) is a biologically highly aggressive tumor arising from the pleura with a dismal prognosis. Cisplatin is the drug of choice for the treatment of MPM, and carboplatin seems to have comparable efficacy. Nevertheless, cisplatin treatment results in a response rate of merely 14% and a median survival of less than seven months. Due to their role in many cellular processes, metallothioneins (MTs) have been widely studied in various cancers. The known heavy metal detoxifying effect of MT-I and MT-II may be the reason for heavy metal drug resistance of various cancers including MPM. 105 patients were retrospectively analyzed immunohistochemically for their MT expression levels. Survival analysis was done by Cox-regression, and statistical significance determined using likelihood ratio, Wald test and Score (logrank) tests. Cox-regression analyses were done in a linear and logarithmic scale revealing a significant association between expression of MT and shortened overall survival (OS) in a linear (p=0.0009) and logarithmic scale (p=0.0003). Reduced progression-free survival (PFS) was also observed for MT-expressing tumors (linear: p=0.0134, log: p=0.0152). Since both overall survival and progression-free survival are negatively correlated with detectable MT expression in MPM, our results indicate a possible resistance to platin-based chemotherapy associated with MT expression upregulation, found exclusively in progressive MPM samples. Initial cell culture studies suggest promoter DNA hypomethylation and expression of miRNA-566, a direct regulator of the copper transporter SLC31A1 and a putative regulator of MT1A and MT2A gene expression, to be responsible for the drug resistance.
Local Linear Regression for Data with AR Errors.
Li, Runze; Li, Yan
2009-07-01
In many statistical applications, data are collected over time and are likely to be correlated. In this paper, we investigate how to incorporate this correlation information into local linear regression. Under the assumption that the error process is an autoregressive process, a new estimation procedure is proposed for the nonparametric regression using the local linear regression method and profile least squares techniques. We further propose the SCAD-penalized profile least squares method to determine the order of the autoregressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures and to compare them with the existing one. In our empirical studies, the newly proposed procedures dramatically improve the accuracy of naive local linear regression with a working-independence error structure. We illustrate the proposed methodology with an analysis of a real data set.
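The building block here, the local linear smoother, can be sketched directly: at each point it fits a kernel-weighted least-squares line and reads off the intercept. This sketch shows the smoother only, under a working-independence assumption; the paper's contribution (profile least squares with an AR error process) is not reproduced:

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear estimate of m(x0): weighted least squares of y on
    (1, x - x0) with Gaussian kernel weights of bandwidth h; the fitted
    intercept is the estimate at x0."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    X = np.column_stack([np.ones_like(x), x - x0])
    XtW = X.T * w                         # weight each observation's row
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta[0]

# On exactly linear data the smoother is exact for any bandwidth.
x = np.linspace(0.0, 1.0, 21)
y = 3.0 + 2.0 * x
m_half = local_linear(0.5, x, y, h=0.1)
```

Exactness on linear functions is the defining advantage of local linear over local constant (Nadaraya-Watson) smoothing, and is why it has smaller boundary bias.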
NASA Astrophysics Data System (ADS)
Li, X.; Gao, M.
2017-12-01
The magnitude of an earthquake is one of its basic parameters and is a measure of its scale. It plays a significant role in seismology and earthquake engineering research, particularly in the calculations of the seismic rate and b value in earthquake prediction and seismic hazard analysis. However, several current types of magnitudes used in seismology research, such as local magnitude (ML), surface wave magnitude (MS), and body-wave magnitude (MB), have a common limitation, which is the magnitude saturation phenomenon. Fortunately, the problem of magnitude saturation was solved by a formula for calculating the seismic moment magnitude (MW) based on the seismic moment, which describes the seismic source strength. Now the moment magnitude is very commonly used in seismology research. However, in China, the earthquake scale is primarily based on local and surface-wave magnitudes. In the present work, we studied the empirical relationships between moment magnitude (MW) and local magnitude (ML) as well as surface wave magnitude (MS) in the Chinese Mainland. The China Earthquake Networks Center (CENC) ML catalog, China Seismograph Network (CSN) MS catalog, ANSS Comprehensive Earthquake Catalog (ComCat), and Global Centroid Moment Tensor (GCMT) are adopted to regress the relationships using the orthogonal regression method. The obtained relationships are as follows: MW=0.64+0.87MS; MW=1.16+0.75ML. Therefore, in China, if the moment magnitude of an earthquake is not reported by any agency in the world, we can use the equations mentioned above for converting ML to MW and MS to MW. These relationships are very important, because they will allow the China earthquake catalogs to be used more effectively for seismic hazard analysis, earthquake prediction, and other seismology research. We also computed the relationships of logMo to ML and of logMo to MS (where Mo is the seismic moment) by linear regression using the Global Centroid Moment Tensor. 
The obtained relationships are as follows: logMo=18.21+1.05ML; logMo=17.04+1.32MS. These formulas can be used by seismologists to convert the ML/MS of Chinese mainland events into their seismic moments.
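The four empirical relations quoted above are linear, so applying them is a one-line computation each; as a sketch (the coefficients are taken verbatim from the abstract, while the dyne-centimetre scale for Mo is inferred from the intercepts rather than stated in the source):

```python
def ms_to_mw(ms):
    # MW = 0.64 + 0.87 * MS (orthogonal regression relation from the study)
    return 0.64 + 0.87 * ms

def ml_to_mw(ml):
    # MW = 1.16 + 0.75 * ML
    return 1.16 + 0.75 * ml

def ml_to_log_mo(ml):
    # log Mo = 18.21 + 1.05 * ML (Mo scale inferred, likely dyne-cm)
    return 18.21 + 1.05 * ml

def ms_to_log_mo(ms):
    # log Mo = 17.04 + 1.32 * MS
    return 17.04 + 1.32 * ms
```

For example, an MS 6.0 event converts to roughly MW 5.9, while an ML 6.0 event converts to roughly MW 5.7, reflecting the different saturation behavior of the two scales.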
Techniques for estimating flood-peak discharges of rural, unregulated streams in Ohio
Koltun, G.F.; Roberts, J.W.
1990-01-01
Multiple-regression equations are presented for estimating flood-peak discharges having recurrence intervals of 2, 5, 10, 25, 50, and 100 years at ungaged sites on rural, unregulated streams in Ohio. The average standard errors of prediction for the equations range from 33.4% to 41.4%. Peak discharge estimates determined by log-Pearson Type III analysis using data collected through the 1987 water year are reported for 275 streamflow-gaging stations. Ordinary least-squares multiple-regression techniques were used to divide the State into three regions and to identify a set of basin characteristics that help explain station-to-station variation in the log-Pearson estimates. Contributing drainage area, main-channel slope, and storage area were identified as suitable explanatory variables. Generalized least-squares procedures, which include historical flow data and account for differences in the variance of flows at different gaging stations, spatial correlation among gaging-station records, and variable lengths of station record, were used to estimate the regression parameters. Weighted peak-discharge estimates computed as a function of the log-Pearson Type III and regression estimates are reported for each station. A method is provided to adjust regression estimates for ungaged sites by use of weighted and regression estimates for a gaged site located on the same stream. Limitations and shortcomings cited in an earlier report on the magnitude and frequency of floods in Ohio are addressed in this study. Geographic bias is no longer evident for the Maumee River basin of northwestern Ohio. No bias is found to be associated with the forested-area characteristic for the range used in the regression analysis (0.0 to 99.0%), nor is this characteristic significant in explaining peak discharges. 
Surface-mined area likewise is not significant in explaining peak discharges, and the regression equations are not biased when applied to basins having approximately 30% or less surface-mined area. Analyses of residuals indicate that the equations tend to overestimate flood-peak discharges for basins having approximately 30% or more surface-mined area. (USGS)
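The log-Pearson Type III analysis mentioned above fits the mean, standard deviation, and skew of log10 peak discharge at each station and reads quantiles off a frequency factor. A generic textbook sketch using the Wilson-Hilferty approximation to the frequency factor (this is an illustration of the method class, not the report's exact station procedure):

```python
import math

def frequency_factor(z, g):
    """Wilson-Hilferty approximation to the Pearson Type III frequency
    factor K(z, g) for standard-normal quantile z and log-space skew g."""
    if abs(g) < 1e-9:
        return z                      # zero skew reduces to the normal case
    k = g / 6.0
    return (2.0 / g) * ((1.0 + k * z - k * k) ** 3 - 1.0)

def lp3_quantile(mean_log10, sd_log10, skew, z):
    # log-Pearson Type III quantile: Q_T = 10 ** (mean + K * sd),
    # with moments computed from log10 of the annual peak series
    return 10.0 ** (mean_log10 + frequency_factor(z, skew) * sd_log10)
```

For instance, with zero skew the T-year quantile is just the lognormal quantile, and positive skew stretches the upper tail, raising the 100-year estimate.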
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs
Andrew F. Howard; Daniel A. Yaussy
1986-01-01
A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...
Using Configural Frequency Analysis as a Person-Centered Analytic Approach with Categorical Data
ERIC Educational Resources Information Center
Stemmler, Mark; Heine, Jörg-Henrik
2017-01-01
Configural frequency analysis and log-linear modeling are presented as person-centered analytic approaches for the analysis of categorical or categorized data in multi-way contingency tables. Person-centered developmental psychology, based on the holistic interactionistic perspective of the Stockholm working group around David Magnusson and Lars…
Basa, Ranor C B; Davies, Vince; Li, Xiaoxiao; Murali, Bhavya; Shah, Jinel; Yang, Bing; Li, Shi; Khan, Mohammad W; Tian, Mengxi; Tejada, Ruth; Hassan, Avan; Washington, Allen; Mukherjee, Bhramar; Carethers, John M; McGuire, Kathleen L
2016-01-01
Colorectal cancer is a leading cause of cancer-related deaths in the U.S., with African-Americans having higher incidence and mortality rates than Caucasian-Americans. Recent studies have demonstrated that anti-tumor cytotoxic T lymphocytes provide protection to patients with colon cancer, while patients deficient in these responses have significantly worse prognosis. To determine if differences in cytotoxic immunity might play a role in racial disparities in colorectal cancer, 258 microsatellite-stable colon tumors were examined for infiltrating immune biomarkers via immunohistochemistry. Descriptive summary statistics were calculated using two-sample Wilcoxon rank sum tests, while linear regression models with log-transformed data were used to assess differences in race and Pearson and Spearman correlations were used to correlate different biomarkers. The association between different biomarkers was also assessed using linear regression after adjusting for covariates. No significant differences were observed in CD8+ (p = 0.83), CD57+ (p = 0.55), and IL-17-expressing (p = 0.63) cell numbers within the tumor samples tested. When infiltration of granzyme B+ cells was analyzed, however, a significant difference was observed, with African Americans having lower infiltration of cells expressing this cytotoxic marker than Caucasians (p<0.01). Analysis of infiltrating granzyme B+ cells at the invasive borders of the tumor revealed an even greater difference by race (p<0.001). Taken together, the data presented suggest differences in anti-tumor immune cytotoxicity may be a contributing factor in the racial disparities observed in colorectal cancer.
Li, Feiming; Gimpel, John R; Arenson, Ethan; Song, Hao; Bates, Bruce P; Ludwin, Fredric
2014-04-01
Few studies have investigated how well scores from the Comprehensive Osteopathic Medical Licensing Examination-USA (COMLEX-USA) series predict resident outcomes, such as performance on board certification examinations. To determine how well COMLEX-USA predicts performance on the American Osteopathic Board of Emergency Medicine (AOBEM) Part I certification examination. The target study population was first-time examinees who took AOBEM Part I in 2011 and 2012 with matched performances on COMLEX-USA Level 1, Level 2-Cognitive Evaluation (CE), and Level 3. Pearson correlations were computed between AOBEM Part I first-attempt scores and COMLEX-USA performances to measure the association between these examinations. Stepwise linear regression analysis was conducted to predict AOBEM Part I scores by the 3 COMLEX-USA scores. An independent t test was conducted to compare mean COMLEX-USA performances between candidates who passed and who failed AOBEM Part I, and a stepwise logistic regression analysis was used to predict the log-odds of passing AOBEM Part I on the basis of COMLEX-USA scores. Scores from AOBEM Part I had the highest correlation with COMLEX-USA Level 3 scores (.57) and slightly lower correlation with COMLEX-USA Level 2-CE scores (.53). The lowest correlation was between AOBEM Part I and COMLEX-USA Level 1 scores (.47). According to the stepwise regression model, COMLEX-USA Level 1 and Level 2-CE scores, which residency programs often use as selection criteria, together explained 30% of variance in AOBEM Part I scores. Adding Level 3 scores explained 37% of variance. The independent t test indicated that the 397 examinees passing AOBEM Part I performed significantly better than the 54 examinees failing AOBEM Part I in all 3 COMLEX-USA levels (P<.001 for all 3 levels). The logistic regression model showed that COMLEX-USA Level 1 and Level 3 scores predicted the log-odds of passing AOBEM Part I (P=.03 and P<.001, respectively). 
The present study empirically supported the predictive and discriminant validities of the COMLEX-USA series in relation to the AOBEM Part I certification examination. Although residency programs may use COMLEX-USA Level 1 and Level 2-CE scores as partial criteria in selecting residents, Level 3 scores, though typically not available at the time of application, are actually the most statistically related to performances on AOBEM Part I.
ERIC Educational Resources Information Center
Rocconi, Louis M.
2013-01-01
This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative structure-activity relationship was obtained by applying multiple linear regression analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio)thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R²_CV = 0.8160; S_PRESS = 0.5680) proved to be very accurate in both the training and predictive stages.
Poisson Regression Analysis of Illness and Injury Surveillance Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Frome E.L., Watkins J.P., Ellis E.D.
2012-12-12
The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational database, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. 
In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra-Poisson variation. The R open source software environment for statistical computing and graphics is used for analysis. Additional details about R and the data that were used in this report are provided in an Appendix. Information on how to obtain R and utility functions that can be used to duplicate results in this report are provided.
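The log-linear rate model described in this report can be sketched with a small hand-rolled iteratively reweighted least squares (IRLS) fit. The report's own analysis uses R; this numpy version, with invented occupational groups, absence counts, and person-time, is only illustrative:

```python
import numpy as np

def poisson_irls(X, y, offset, n_iter=25):
    """Log-linear Poisson rate model: log E[count] = X @ beta + log(person-time).
    Fit by iteratively reweighted least squares (Newton-Raphson)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta + offset
        mu = np.exp(eta)                   # expected counts
        z = eta - offset + (y - mu) / mu   # working response, net of offset
        W = mu                             # Poisson working weights
        XtW = X.T * W                      # X^T diag(W)
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# Two hypothetical occupational groups with absence rates of 2 and 4 events
# per person-year, observed over differing person-time at risk.
person_time = np.array([10.0, 20.0, 10.0, 20.0])
group = np.array([0.0, 0.0, 1.0, 1.0])
counts = np.array([20.0, 40.0, 40.0, 80.0])   # exactly rate * person-time
X = np.column_stack([np.ones(4), group])
beta = poisson_irls(X, counts, np.log(person_time))
# beta[0] ≈ log 2 (baseline log rate); beta[1] ≈ log 2 (rate ratio of 2)
```

The log person-time offset is what turns the count model into a rate model, the same device that links Poisson regression and exponential survival regression in the sleep-hypnogram paper above.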
Developing and applying metamodels of high resolution ...
As defined by Wikipedia (https://en.wikipedia.org/wiki/Metamodeling), “(a) metamodel or surrogate model is a model of a model, and metamodeling is the process of generating such metamodels.” The goals of metamodeling include, but are not limited to (1) developing functional or statistical relationships between a model’s input and output variables for model analysis, interpretation, or information consumption by users’ clients; (2) quantifying a model’s sensitivity to alternative or uncertain forcing functions, initial conditions, or parameters; and (3) characterizing the model’s response or state space. Using five existing models developed by US Environmental Protection Agency, we generate a metamodeling database of the expected environmental and biological concentrations of 644 organic chemicals released into nine US rivers from wastewater treatment works (WTWs) assuming multiple loading rates and sizes of populations serviced. The chemicals of interest have log n-octanol/water partition coefficients (log K_OW) ranging from 3 to 14, and the rivers of concern have mean annual discharges ranging from 1.09 to 3240 m3/s. Log linear regression models are derived to predict mean annual dissolved and total water concentrations and total sediment concentrations of chemicals of concern based on their log K_OW, Henry’s Law constant, and WTW loading rate and on the mean annual discharges of the receiving rivers. Metamodels are also derived to predict mean annual chemical
ERIC Educational Resources Information Center
Rocconi, Louis M.
2011-01-01
Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R² from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
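The binomial model for non-random selection of plants within a single family can be sketched with an exact two-sided test built from the binomial pmf. The counts below are hypothetical, not the Shuar data:

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success prob p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_test_two_sided(k, n, p):
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    no more likely than the observed count k (the usual two-sided convention)."""
    pk = binom_pmf(k, n, p)
    return sum(binom_pmf(i, n, p) for i in range(n + 1)
               if binom_pmf(i, n, p) <= pk + 1e-12)

# Hypothetical family: 8 of its 10 species are medicinal, against a
# flora-wide medicinal proportion of 0.5.
p_value = binom_test_two_sided(8, 10, 0.5)   # 112/1024 = 0.109375
```

At the conventional 0.05 level this hypothetical family would not depart significantly from the null model, illustrating why, with many families tested, only a handful reach significance.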
Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)
1987-10-01
[Table-of-contents fragments from the scanned report] Repair FADAC Printed Circuit Board; 3. Data Analysis Techniques; a. Multiple Linear Regression; Analysis/Discussion; 1. Example of Regression Analysis; 2. Regression results for all tasks; Table 9. Task Grouping for Analysis; Table 10. Remove/Replace H60A3 Power Pack.
The prisoner's dilemma as a cancer model.
West, Jeffrey; Hasnain, Zaki; Mason, Jeremy; Newton, Paul K
2016-09-01
Tumor development is an evolutionary process in which a heterogeneous population of cells with different growth capabilities compete for resources in order to gain a proliferative advantage. What are the minimal ingredients needed to recreate some of the emergent features of such a developing complex ecosystem? What is a tumor doing before we can detect it? We outline a mathematical model, driven by a stochastic Moran process, in which cancer cells and healthy cells compete for dominance in the population. Each is assigned payoffs according to a Prisoner's Dilemma evolutionary game where the healthy cells are the cooperators and the cancer cells are the defectors. With point mutational dynamics, heredity, and a fitness landscape controlling birth and death rates, natural selection acts on the cell population and simulated 'cancer-like' features emerge, such as Gompertzian tumor growth driven by heterogeneity, the log-kill law which (linearly) relates therapeutic dose density to the (log) probability of cancer cell survival, and the Norton-Simon hypothesis which (linearly) relates tumor regression rates to tumor growth rates. We highlight the utility, clarity, and power that such models provide, despite (and because of) their simplicity and built-in assumptions.
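A minimal Moran birth-death process with Prisoner's Dilemma payoffs can be sketched as follows, in the spirit of the model above; the payoff values and selection intensity are illustrative, not the paper's parameters:

```python
import numpy as np

# Prisoner's Dilemma payoff matrix: rows = focal strategy, cols = opponent.
# Index 0 = cooperator (healthy cell), 1 = defector (cancer cell).
PAYOFF = np.array([[3.0, 0.0],   # cooperator vs (cooperator, defector)
                   [5.0, 1.0]])  # defector   vs (cooperator, defector)

def moran_step(n_defect, n_total, w, rng):
    """One birth-death step of a Moran process with frequency-dependent
    fitness from the PD payoffs; w is the selection intensity."""
    n_coop = n_total - n_defect
    # Average payoff against a randomly chosen other individual.
    pi_c = (PAYOFF[0, 0] * (n_coop - 1) + PAYOFF[0, 1] * n_defect) / (n_total - 1)
    pi_d = (PAYOFF[1, 0] * n_coop + PAYOFF[1, 1] * (n_defect - 1)) / (n_total - 1)
    f_c = 1 - w + w * pi_c
    f_d = 1 - w + w * pi_d
    # Birth proportional to fitness, death uniform: population size stays fixed.
    p_birth_d = n_defect * f_d / (n_coop * f_c + n_defect * f_d)
    born_d = rng.random() < p_birth_d
    dies_d = rng.random() < n_defect / n_total
    return n_defect + int(born_d) - int(dies_d)

rng = np.random.default_rng(1)
n = 1                      # start from a single mutant "cancer" cell
for _ in range(2000):
    if n in (0, 100):      # absorption: extinction or fixation
        break
    n = moran_step(n, 100, 0.5, rng)
```

Because defectors earn the higher payoff against every opponent, selection favors the cancer cells once they escape early stochastic extinction, which is the competitive dynamic the abstract describes.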
Survival analysis: Part I — analysis of time-to-event
2018-01-01
Length of time is a variable often encountered during data analysis. Survival analysis provides simple, intuitive results concerning time-to-event for events of interest, which are not confined to death. This review introduces methods of analyzing time-to-event. The Kaplan-Meier survival analysis, log-rank test, and Cox proportional hazards regression modeling method are described with examples of hypothetical data. PMID:29768911
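The Kaplan-Meier estimator mentioned above can be computed from first principles: at each observed event time, the survival curve is multiplied by the fraction of at-risk subjects who survive that time. The follow-up times below are hypothetical:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate. events[i] = 1 if the event occurred
    at times[i], 0 if the observation was censored at that time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival, s = [], 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        d = sum(e for tt, e in data if tt == t)        # events at time t
        removed = sum(1 for tt, _ in data if tt == t)  # events + censored at t
        if d > 0:
            s *= 1 - d / n_at_risk                     # product-limit update
            survival.append((t, s))
        n_at_risk -= removed
        i += removed
    return survival

# Hypothetical follow-up times (months); event flag 0 marks censoring.
km = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 0, 1, 0])
# S(1) = 4/5, S(2) = 4/5 * 3/4 = 3/5, S(4) = 3/5 * 1/2 = 3/10
```

Censored subjects (here at months 3 and 5) leave the risk set without forcing a drop in the curve, which is exactly what distinguishes survival analysis from a naive proportion.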
Statistical power for detecting trends with applications to seabird monitoring
Hatch, Shyla A.
2003-01-01
Power analysis is helpful in defining goals for ecological monitoring and evaluating the performance of ongoing efforts. I examined detection standards proposed for population monitoring of seabirds using two programs (MONITOR and TRENDS) specially designed for power analysis of trend data. Neither program models within- and among-years components of variance explicitly and independently, thus an error term that incorporates both components is an essential input. Residual variation in seabird counts consisted of day-to-day variation within years and unexplained variation among years in approximately equal parts. The appropriate measure of error for power analysis is the standard error of estimation (S.E.est) from a regression of annual means against year. Replicate counts within years are helpful in minimizing S.E.est but should not be treated as independent samples for estimating power to detect trends. Other issues include a choice of assumptions about variance structure and selection of an exponential or linear model of population change. Seabird count data are characterized by strong correlations between S.D. and mean, thus a constant CV model is appropriate for power calculations. Time series were fit about equally well with exponential or linear models, but log transformation ensures equal variances over time, a basic assumption of regression analysis. Using sample data from seabird monitoring in Alaska, I computed the number of years required (with annual censusing) to detect trends of -1.4% per year (50% decline in 50 years) and -2.7% per year (50% decline in 25 years). At α = 0.05 and a desired power of 0.9, estimated study intervals ranged from 11 to 69 years depending on species, trend, software, and study design. Power to detect a negative trend of 6.7% per year (50% decline in 10 years) is suggested as an alternative standard for seabird monitoring that achieves a reasonable match between statistical and biological significance.
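A simulation-based power calculation in the spirit of the analysis above (not the MONITOR or TRENDS programs) regresses log counts on year under a constant-CV noise model and counts rejections. The |t| > 2 criterion is a rough stand-in for α = 0.05, and the baseline count of 1000 is arbitrary:

```python
import numpy as np

def trend_power(n_years, annual_change, cv, n_sims=2000, seed=0):
    """Monte Carlo power to detect an exponential population trend by
    regressing log(count) on year; reject when |t| exceeds 2."""
    rng = np.random.default_rng(seed)
    years = np.arange(n_years, dtype=float)
    mean = 1000.0 * (1.0 + annual_change) ** years
    rejections = 0
    for _ in range(n_sims):
        counts = mean * np.exp(rng.normal(0.0, cv, n_years))  # constant-CV noise
        y = np.log(counts)                 # log transform equalizes variances
        x = years - years.mean()
        slope = (x @ (y - y.mean())) / (x @ x)
        resid = y - y.mean() - slope * x
        se = np.sqrt(resid @ resid / (n_years - 2) / (x @ x))
        if abs(slope / se) > 2.0:
            rejections += 1
    return rejections / n_sims

# A strong decline (-6.7%/yr, ~50% in 10 years) with 10% CV noise and
# annual censusing over 11 years should be detected almost every time.
power = trend_power(n_years=11, annual_change=-0.067, cv=0.10)
```

Raising the CV or shortening the series drives the power down, which is the trade-off between error variance and study interval that the abstract quantifies.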
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wiedemeier, Heribert, E-mail: wiedeh@rpi.edu
The observed linear (Na-, K-halides) and near-linear (Mg-, Sr-, Zn-, Cd-, and Hg-chalcogenides) dependences of Schottky constants on reciprocal interatomic distances yield the relation log K_S = (s_s(1/T) + i_s)(1/d_(A−B)) + s_i(1/T) + i_i, where K_S is the product of metal and non-metal thermal equilibrium vacancy concentrations, and s_s, i_s, s_i and i_i are the group-specific slope and intercept values obtained from an extended analysis of the above log K_S versus 1/d_(A−B) data. The previously reported linear dependences of log K_S on the Born–Haber lattice energies [1] are the basis for combining the earlier results [1] with the Born–Mayer lattice energy equation to yield a new thermodynamic relationship, namely log K_S = −(2.303nRT)^(−1)(c_(B−M)/d_(A−B) − I_e), where c_(B−M) is the product of the constants of the Born–Mayer equation and I_e is the metal ionization energy of the above compounds. These results establish a correlation between point defect concentrations and basic thermodynamic, coulombic, and structural solid state properties for selected I–VII and II–VI semiconductor materials.
Normal reference values for bladder wall thickness on CT in a healthy population.
Fananapazir, Ghaneh; Kitich, Aleksandar; Lamba, Ramit; Stewart, Susan L; Corwin, Michael T
2018-02-01
To determine normal bladder wall thickness on CT in patients without bladder disease. Four hundred and nineteen patients presenting for trauma with normal CTs of the abdomen and pelvis were included in our retrospective study. Bladder wall thickness was assessed, and bladder volume was measured using both the ellipsoid formula and an automated technique. Patient age, gender, and body mass index were recorded. Linear regression models were created to account for bladder volume, age, gender, and body mass index, and the multiple correlation coefficient with bladder wall thickness was computed. Bladder volume and bladder wall thickness were log-transformed to achieve approximate normality and homogeneity of variance. Variables that did not contribute substantively to the model were excluded, a parsimonious model was created, and the multiple correlation coefficient was calculated. Expected bladder wall thickness was estimated for different bladder volumes, and 1.96 standard deviations above expected provided the upper limit of normal on the log scale. Age, gender, and bladder volume were associated with bladder wall thickness (p = 0.049, 0.024, and < 0.001, respectively). The linear regression model had an R² of 0.52. Age and gender were negligible in contribution to the model, and a parsimonious model using only volume was created for both the ellipsoid and automated volumes (R² = 0.52 and 0.51, respectively). Bladder wall thickness correlates with bladder volume. The study provides reference bladder wall thicknesses on CT utilizing both the ellipsoid formula and automated bladder volumes.
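The log-scale regression and upper limit of normal described above can be sketched as follows; the volumes, coefficients, and residual spread are invented for illustration, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical CT measurements: wall thickness shrinks as the bladder fills.
volume = rng.uniform(50, 500, 400)                       # mL
log_t = 1.5 - 0.4 * np.log(volume) + rng.normal(0, 0.15, 400)

# Regress log(thickness) on log(volume); the upper limit of normal is the
# fitted value plus 1.96 residual SDs, back-transformed from the log scale.
X = np.column_stack([np.ones(400), np.log(volume)])
beta, *_ = np.linalg.lstsq(X, log_t, rcond=None)
resid_sd = np.std(log_t - X @ beta, ddof=2)
uln = np.exp(X @ beta + 1.96 * resid_sd)                 # volume-specific limit
frac_below = np.mean(np.exp(log_t) < uln)                # expect ≈ 0.975
```

Working on the log scale makes the 1.96-SD band multiplicative after back-transformation, which suits a measurement whose spread grows with its mean.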
Body mass index in relation to serum prostate-specific antigen levels and prostate cancer risk.
Bonn, Stephanie E; Sjölander, Arvid; Tillander, Annika; Wiklund, Fredrik; Grönberg, Henrik; Bälter, Katarina
2016-07-01
High body mass index (BMI) has been directly associated with risk of aggressive or fatal prostate cancer. One possible explanation may be an effect of BMI on serum levels of prostate-specific antigen (PSA). To study the association between BMI and serum PSA as well as prostate cancer risk, a large cohort of men without prostate cancer at baseline was followed prospectively for prostate cancer diagnoses until 2015. Serum PSA and BMI were assessed among 15,827 men at baseline in 2010-2012. During follow-up, 735 men were diagnosed with prostate cancer, with 282 (38.4%) classified as high-grade cancers. Multivariable linear regression models and natural cubic linear regression splines were fitted for analyses of BMI and log-PSA. For risk analysis, Cox proportional hazards regression models were used to estimate hazard ratios (HR) and 95% confidence intervals (CI), and natural cubic Cox regression splines producing standardized cancer-free probabilities were fitted. Results showed that baseline serum PSA decreased by 1.6% (95% CI: -2.1 to -1.1) with every one-unit increase in BMI. Statistically significant decreases of 3.7, 11.7 and 32.3% were seen for increasing BMI categories of 25 to <30, 30 to <35 and ≥35 kg/m², respectively, compared to the reference (18.5 to <25 kg/m²). No statistically significant associations were seen between BMI and prostate cancer risk, although results were indicative of a positive association with incidence rates of high-grade disease and an inverse association with incidence of low-grade disease. However, findings regarding risk are limited by the short follow-up time. In conclusion, BMI was inversely associated with PSA levels. BMI should be taken into consideration when referring men to a prostate biopsy based on serum PSA levels. © 2016 UICC.
Using landslide risk analysis to protect fish habitat
R. M. Rice
1986-01-01
The protection of anadromous fish habitat is an important water quality concern in the Pacific Northwest. Sediment from logging-related debris avalanches can cause habitat degradation. Research on conditions associated with the sites where debris avalanches originate has resulted in a risk assessment methodology based on linear discriminant analysis. The probability...
Questionable Validity of Poisson Assumptions in a Combined Loglinear/MDS Mapping Model.
ERIC Educational Resources Information Center
Gleason, John M.
1993-01-01
This response to an earlier article on a combined log-linear/MDS model for mapping journals by citation analysis discusses the underlying assumptions of the Poisson model with respect to characteristics of the citation process. The importance of empirical data analysis is also addressed. (nine references) (LRW)
Biostatistics Series Module 6: Correlation and Linear Regression.
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If the normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ), may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
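The quantities defined above (Pearson's r, the coefficient of determination, and the least-squares line y = a + bx) can be computed from first principles; the paired measurements below are made up:

```python
import numpy as np

def pearson_and_line(x, y):
    """Pearson r and least-squares regression line y = a + b*x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    r = (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))  # correlation coefficient
    b = (xc @ yc) / (xc @ xc)                       # least-squares slope
    a = y.mean() - b * x.mean()                     # intercept
    return r, a, b

# Made-up paired data with an exact linear relation y = 2 + 0.5x.
x = [1, 2, 3, 4, 5]
y = [2.5, 3.0, 3.5, 4.0, 4.5]
r, a, b = pearson_and_line(x, y)   # r = 1.0, a = 2.0, b = 0.5
```

Here r² = 1, i.e. all of the variability in y is attributed to its linear relation with x; with noisy data r² would report the explained proportion, as the module describes.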
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
COMBATXXI, JDAFS, and LBC Integration Requirements for EASE
2015-10-06
…process as linear, and as new data is made available, any previous analysis is obsolete and has to start the process over again. Figure 2 proposes a… …final line of the manifest file names the scenario file associated with the run. Under the usual practice, the analyst now starts the COMBATXXI… …describes which events are to be logged. Finally the scenario is started with the click of a button. The simulation generates logs of a couple of sorts
Kinetics of hydrogen peroxide decomposition by catalase: hydroxylic solvent effects.
Raducan, Adina; Cantemir, Anca Ruxandra; Puiu, Mihaela; Oancea, Dumitru
2012-11-01
The effect of water-alcohol (methanol, ethanol, propan-1-ol, propan-2-ol, ethane-1,2-diol and propane-1,2,3-triol) binary mixtures on the kinetics of hydrogen peroxide decomposition in the presence of bovine liver catalase is investigated. In all solvents, the activity of catalase is smaller than in water. The results are discussed on the basis of a simple kinetic model. The kinetic constants for product formation through enzyme-substrate complex decomposition and for inactivation of catalase are estimated. The organic solvents are characterized by several physical properties: dielectric constant (D), hydrophobicity (log P), concentration of hydroxyl groups ([OH]), polarizability (α), Kamlet-Taft parameter (β) and Kosower parameter (Z). The relationships between the initial rate, kinetic constants and medium properties are analyzed by linear and multiple linear regression.
Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija
2018-01-01
The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrP^C) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A. R.
2014-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
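The key idea separating quantile regression from linear regression is replacing squared error with the asymmetric pinball (check) loss, which each conditional quantile minimizes. A minimal intercept-only sketch with simulated skewed data (not the High School and Beyond or Sustained Effects databases):

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Check (pinball) loss minimized by the tau-quantile: over-predictions
    are weighted by (1 - tau), under-predictions by tau."""
    d = y - pred
    return np.mean(np.where(d >= 0, tau * d, (tau - 1) * d))

# Right-skewed outcome: the mean and the upper quantiles tell different stories.
rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=10_000)

tau = 0.9
loss_at_quantile = pinball_loss(y, np.quantile(y, tau), tau)
loss_at_mean = pinball_loss(y, y.mean(), tau)
# The 0.9-quantile minimizes the tau = 0.9 pinball loss, so it beats the mean.
```

Adding predictors simply makes the minimizer a function of covariates, giving the across-distribution estimates the article demonstrates.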
Caraviello, D Z; Weigel, K A; Gianola, D
2004-05-01
Predicted transmitting abilities (PTA) of US Jersey sires for daughter longevity were calculated using a Weibull proportional hazards sire model and compared with predictions from a conventional linear animal model. Culling data from 268,008 Jersey cows with first calving from 1981 to 2000 were used. The proportional hazards model included time-dependent effects of herd-year-season contemporary group and parity by stage of lactation interaction, as well as time-independent effects of sire and age at first calving. Sire variances and parameters of the Weibull distribution were estimated, providing heritability estimates of 4.7% on the log scale and 18.0% on the original scale. The PTA of each sire was expressed as the expected risk of culling relative to daughters of an average sire. Risk ratios (RR) ranged from 0.7 to 1.3, indicating that the risk of culling for daughters of the best sires was 30% lower than for daughters of average sires and nearly 50% lower than for daughters of the poorest sires. Sire PTA from the proportional hazards model were compared with PTA from a linear model similar to that used for routine national genetic evaluation of length of productive life (PL) using cross-validation in independent samples of herds. Models were compared using logistic regression of daughters' stayability to second, third, fourth, or fifth lactation on their sires' PTA values, with alternative approaches for weighting the contribution of each sire. Models were also compared using logistic regression of daughters' stayability to 36, 48, 60, 72, and 84 mo of life. The proportional hazards model generally yielded more accurate predictions according to these criteria, but differences in predictive ability between methods were smaller when using a Kullback-Leibler distance than with other approaches. Results of this study suggest that survival analysis methodology may provide more accurate predictions of genetic merit for longevity than conventional linear models.
The word frequency effect during sentence reading: A linear or nonlinear effect of log frequency?
White, Sarah J; Drieghe, Denis; Liversedge, Simon P; Staub, Adrian
2016-10-20
The effect of word frequency on eye movement behaviour during reading has been reported in many experimental studies. However, the vast majority of these studies compared only two levels of word frequency (high and low). Here we assess whether the effect of log word frequency on eye movement measures is linear, in an experiment in which a critical target word in each sentence was at one of three approximately equally spaced log frequency levels. Separate analyses treated log frequency as a categorical or a continuous predictor. Both analyses showed only a linear effect of log frequency on the likelihood of skipping a word, and on first fixation duration. Ex-Gaussian analyses of first fixation duration showed similar effects on distributional parameters in comparing high- and medium-frequency words, and medium- and low-frequency words. Analyses of gaze duration and the probability of a refixation suggested a nonlinear pattern, with a larger effect at the lower end of the log frequency scale. However, the nonlinear effects were small, and Bayes Factor analyses favoured the simpler linear models for all measures. The possible roles of lexical and post-lexical factors in producing nonlinear effects of log word frequency during sentence reading are discussed.
Treatment of singularities in a middle-crack tension specimen
NASA Technical Reports Server (NTRS)
Shivakumar, K. N.; Raju, I. S.
1990-01-01
A three-dimensional finite-element analysis of a middle-crack tension specimen subjected to mode I loading was performed to study the stress singularity along the crack front. The specimen was modeled using 20-node isoparametric elements with collapsed nonsingular elements at the crack front. The displacements and stresses from the analysis were used to estimate the power of singularities, by a log-log regression analysis, along the crack front. Analyses showed that finite-sized cracked bodies have two singular stress fields. Because of two singular stress fields near the free surface and the classical square root singularity elsewhere, the strain energy release rate appears to be an appropriate parameter all along the crack front.
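The log-log regression used above to estimate the power of the singularity can be sketched on synthetic data exhibiting the classical square-root behavior; the amplitude 7.5 and the range of radii are arbitrary stand-ins for finite-element output:

```python
import numpy as np

# Near a crack tip the stress varies as sigma ~ C * r**(-lam); taking logs
# gives log sigma = log C - lam * log r, so the singularity power lam is the
# negative slope of a log-log regression of stress against distance.
r = np.geomspace(1e-4, 1e-2, 20)     # distances from the crack front
sigma = 7.5 * r**(-0.5)              # classical square-root singularity
slope, intercept = np.polyfit(np.log(r), np.log(sigma), 1)
# slope ≈ -0.5, so the estimated singularity power is 0.5
```

Applied near the free surface of the finite-element model, the same fit would return a slope departing from -0.5, which is how the analysis detects the second singular stress field.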
Thelin, E P; Zibung, E; Riddez, L; Nordenvall, C
2016-10-01
Worldwide, the use of bicycles, for both recreation and commuting, is increasing. S100B, a suggested protein biomarker for cerebral injury, has been shown to correlate with extracranial injury as well. Using serum levels of S100B, we aimed to investigate how S100B could be used when assessing injuries in patients suffering from bicycle trauma. As a secondary aim, we investigated how hospital length of stay and injury severity score (ISS) were correlated with S100B levels. We performed a retrospective database study including all patients admitted for bicycle trauma to a level 1 trauma center over a four-year period with admission samples of S100B (n = 127). Computed tomography (CT) scans were reviewed and remaining data were collected from case records. Univariate and multivariate regression analyses, linear regressions, and comparative statistics (Mann-Whitney) were used where appropriate. Both intra- and extracranial injuries were correlated with S100B levels. Stockholm CT score presented the best correlation of an intracranial parameter with S100B levels (p < 0.0001), while the presence of extremity injury, thoracic injury, and non-cervical spinal injury was also significantly correlated (all p < 0.0001, respectively). A multivariate linear regression revealed that Stockholm CT score, non-cervical spinal injury, and abdominal injury all independently correlated with levels of S100B. Patients with an ISS > 15 had higher S100B levels than patients with ISS < 16 (p < 0.0001). Patients with extracranial, as well as intracranial and extracranial injuries, had significantly higher levels of S100B than patients without injuries (p < 0.05 and p < 0.01, respectively). The admission serum levels of S100B (log, µg/L) were correlated with ISS (log) (r = 0.53) and length of stay (log, days) (r = 0.45). S100B levels were independently correlated with intracranial pathology, but also with the extent of extracranial injury. 
Length of stay and ISS were both correlated with the admission levels of S100B in bicycle trauma, suggesting S100B to be a good marker of aggregated injury severity. Further studies are warranted to confirm our findings.
Rosenbaum, Paula F; Weinstock, Ruth S; Silverstone, Allen E; Sjödin, Andreas; Pavuk, Marian
2017-11-01
The Anniston Community Health Survey, a cross-sectional study, was undertaken in 2005-2007 to study environmental exposure to polychlorinated biphenyl (PCB) and organochlorine (OC) pesticides and health outcomes among residents of Anniston, AL, United States. The examination of potential associations between these pollutants and metabolic syndrome, a cluster of cardiovascular risk factors (i.e., hypertension, central obesity, dyslipidemia and dysglycemia), was the focus of this analysis. Participants were 548 adults who completed the survey and a clinic visit, were free of diabetes, and had a serum sample for clinical laboratory parameters as well as PCB and OC pesticide concentrations. Associations between summed concentrations of 35 PCB congeners and 9 individual pesticides and metabolic syndrome were examined using generalized linear modeling and logistic regression; odds ratios (OR) and 95% confidence intervals (CI) are reported. Pollutants were evaluated as quintiles and as log transformations of continuous serum concentrations. Participants were mostly female (68%) with a mean age (SD) of 53.6 (16.2) years. The racial distribution was 56% white and 44% African American; 49% met the criteria for metabolic syndrome. In unadjusted logistic regression, statistically significant and positive associations across the majority of quintiles were noted for seven individually modeled pesticides (p,p'-DDT, p,p'-DDE, HCB, β-HCCH, oxychlor, tNONA, Mirex). Following adjustment for covariables (i.e., age, sex, race, education, marital status, current smoking, alcohol consumption, positive family history of diabetes or cardiovascular disease, liver disease, BMI), significant elevations in risk were noted for p,p'-DDT across multiple quintiles (range of ORs 1.61 to 2.36), for tNONA (range of ORs 1.62-2.80) and for p,p'-DDE [OR (95% CI)] of 2.73 (1.09-6.88) in the highest quintile relative to the first.
Significant trends were observed in adjusted logistic models for log10 HCB [OR=6.15 (1.66-22.88)], log10 oxychlor [OR=2.09 (1.07-4.07)] and log10 tNONA [3.19 (1.45-7.00)]. Summed PCB concentrations were significantly and positively associated with metabolic syndrome only in unadjusted models; adjustment resulted in attenuation of the ORs in both the quintile and log-transformed models. In conclusion, several OC pesticides were found to have significant associations with metabolic syndrome in the Anniston study population while no association was observed for PCBs. Copyright © 2017 Elsevier Ltd. All rights reserved.
Yang, Xiaowei; Nie, Kun
2008-03-15
Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since an FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.
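The three-step procedure described above (transform, fit in the transformed domain, test) can be sketched as follows. This is a minimal illustration on simulated two-group curves with a simplified adaptive Neyman statistic; the group sizes, cosine signal, and noise level are all hypothetical, not values from the smoking cessation trial:

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 0: simulated repeated measures, two groups of 30 subjects, 64 time points
n, t = 30, 64
time = np.linspace(0, 1, t)
g0 = rng.normal(0.0, 1.0, (n, t))                             # flat noisy curves
g1 = np.cos(2 * np.pi * time) + rng.normal(0.0, 1.0, (n, t))  # curves with a signal

# Step 1: Fourier transform each curve (real parts only, a simplification)
f0 = np.fft.rfft(g0, axis=1).real
f1 = np.fft.rfft(g1, axis=1).real

# Step 2: a linear model in the transformed domain; for a two-group design this
# reduces to per-coefficient two-sample z statistics
diff = f1.mean(axis=0) - f0.mean(axis=0)
se = np.sqrt(f1.var(axis=0, ddof=1) / n + f0.var(axis=0, ddof=1) / n)
z = diff / se

# Step 3: adaptive Neyman statistic, maximized over how many coefficients to keep
m = np.arange(1, z.size + 1)
t_an = np.max(np.cumsum(z ** 2 - 1) / np.sqrt(2 * m))
print(f"adaptive Neyman statistic: {t_an:.1f}")
```

Large values of the statistic reject equality of the group mean curves; a full analysis would also use the imaginary parts of the transform and calibrate the null distribution.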
Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M
2017-04-01
A carboxy methyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt % dimethyl dialkyl (C14-C18) amine (DMDA)) composite was prepared by the solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-ray diffraction (XRD) and scanning electron microscopy (SEM). The composite was evaluated for its sorption efficiency toward the pesticides atrazine, imidacloprid and thiamethoxam. The sorption data were fitted to the Langmuir and Freundlich isotherms using linear and non-linear methods. The linear regression method suggested that the sorption data fitted best to the Type II Langmuir and Freundlich isotherms. To avoid the bias resulting from linearization, seven different error parameters were also analyzed by the non-linear regression method. The non-linear error analysis suggested that the sorption data fitted the Langmuir model better than the Freundlich model. The maximum sorption capacity, Q0 (μg/g), was highest for imidacloprid (2000), followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the coefficient of determination from linear regression alone cannot be used to compare the fits of the Langmuir and Freundlich models, and non-linear error analysis is needed to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
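The linear-versus-non-linear fitting issue discussed above can be sketched by comparing a linearized (Type II) Langmuir fit with a direct non-linear least-squares fit. The concentrations, noise level and "true" parameters below are invented for illustration, not the study's measurements:

```python
import numpy as np

# Hypothetical sorption data: equilibrium concentration C vs. sorbed amount q (ug/g)
C = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
q_true = 2000 * 0.3 * C / (1 + 0.3 * C)          # Langmuir with Q0=2000, KL=0.3
rng = np.random.default_rng(1)
q = q_true * (1 + rng.normal(0, 0.03, C.size))   # 3% multiplicative noise

# Linear (Type II) fit: 1/q = (1/(KL*Q0)) * (1/C) + 1/Q0
slope, intercept = np.polyfit(1 / C, 1 / q, 1)
Q0_lin = 1 / intercept
KL_lin = intercept / slope

# Non-linear fit: minimize SSE on the original scale over a parameter grid,
# which avoids the error-structure distortion introduced by linearization
Q0_grid = np.linspace(1000, 3000, 401)
KL_grid = np.linspace(0.05, 1.0, 381)
Q0g, KLg = np.meshgrid(Q0_grid, KL_grid)
pred = Q0g[..., None] * KLg[..., None] * C / (1 + KLg[..., None] * C)
sse = ((pred - q) ** 2).sum(axis=-1)
i, j = np.unravel_index(sse.argmin(), sse.shape)
Q0_nl, KL_nl = Q0_grid[j], KL_grid[i]
print(Q0_lin, KL_lin, Q0_nl, KL_nl)
```

With noisy data the two routes can give different parameter estimates, which is why error analysis on the non-linear fit is recommended before ranking isotherm models.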
Ahearn, Elizabeth A.
2004-01-01
Multiple linear-regression equations were developed to estimate the magnitudes of floods in Connecticut for recurrence intervals ranging from 2 to 500 years. The equations can be used for nonurban, unregulated stream sites in Connecticut with drainage areas ranging from about 2 to 715 square miles. Flood-frequency data and hydrologic characteristics from 70 streamflow-gaging stations and the upstream drainage basins were used to develop the equations. The hydrologic characteristics (drainage area, mean basin elevation, and 24-hour rainfall) are used in the equations to estimate the magnitude of floods. Average standard errors of prediction for the equations are 31.8, 32.7, 34.4, 35.9, 37.6 and 45.0 percent for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals, respectively. Simplified equations using only one hydrologic characteristic (drainage area) also were developed. The regression analysis is based on generalized least-squares regression techniques. Observed flows (log-Pearson Type III analysis of the annual maximum flows) from five streamflow-gaging stations in urban basins in Connecticut were compared to flows estimated from national three-parameter and seven-parameter urban regression equations. The comparison shows that the three- and seven-parameter equations used in conjunction with the new statewide equations generally provide reasonable estimates of flood flows for urban sites in Connecticut, although a national urban flood-frequency study indicated that the three-parameter equations significantly underestimated flood flows in many regions of the country. Verification of the accuracy of the three-parameter or seven-parameter national regression equations using new data from Connecticut stations was beyond the scope of this study. A technique for calculating flood flows at streamflow-gaging stations using a weighted average also is described.
Two estimates of flood flows (one based on the log-Pearson Type III analyses of the annual maximum flows at the gaging station, and the other from the regression equation) are weighted together based on the years of record at the gaging station and the equivalent years of record value determined from the regression. Weighted averages of flood flows for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals are tabulated for the 70 streamflow-gaging stations used in the regression analysis. Generally, weighted averages give the most accurate estimate of flood flows at gaging stations. An evaluation of Connecticut's streamflow-gaging network was performed to determine whether the spatial coverage and range of geographic and hydrologic conditions are adequately represented for transferring flood characteristics from gaged to ungaged sites. Fifty-one of 54 stations in the current (2004) network support one or more flood needs of federal, state, and local agencies. Twenty-five of 54 stations in the current network are considered high-priority stations by the U.S. Geological Survey because of their contribution to the long-term understanding of floods, and their application for regional flood analysis. Enhancements to the network to improve overall effectiveness for regionalization can be made by increasing the spatial coverage of gaging stations, establishing stations in regions of the state that are not well-represented, and adding stations in basins with drainage area sizes not represented. Additionally, the usefulness of the network for characterizing floods can be maintained and improved by continuing operation at the current stations because flood flows can be more accurately estimated at stations with continuous, long-term record.
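A minimal sketch of the weighting technique described above, assuming the standard practice of averaging the two estimates in log space with weights given by the station's years of record and the regression's equivalent years of record. The discharge values in the example are hypothetical:

```python
import math

def weighted_flood_estimate(q_station, years_record, q_regression, equiv_years):
    """Weight the station (log-Pearson Type III) and regression estimates of a
    T-year flood by their respective years of record, averaging in log space."""
    n, e = years_record, equiv_years
    log_q = (n * math.log10(q_station) + e * math.log10(q_regression)) / (n + e)
    return 10 ** log_q

# Hypothetical example: 40 years of station record vs. 10 equivalent years
q = weighted_flood_estimate(q_station=5000.0, years_record=40,
                            q_regression=6500.0, equiv_years=10)
print(f"{q:.0f} cfs")   # falls closer to the station estimate, which has more record
```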
Fatigue Shifts and Scatters Heart Rate Variability in Elite Endurance Athletes
Schmitt, Laurent; Regnard, Jacques; Desmarets, Maxime; Mauny, Fréderic; Mourot, Laurent; Fouillot, Jean-Pierre; Coulmy, Nicolas; Millet, Grégoire
2013-01-01
Purpose This longitudinal study aimed at comparing heart rate variability (HRV) in elite athletes identified either in ‘fatigue’ or in ‘no-fatigue’ state in ‘real life’ conditions. Methods 57 elite Nordic-skiers were surveyed over 4 years. R-R intervals were recorded supine (SU) and standing (ST). A fatigue state was assessed with a validated questionnaire. A multilevel linear regression model was used to analyze relationships between heart rate (HR) and HRV descriptors [total spectral power (TP), power in low (LF) and high frequency (HF) ranges expressed in ms2 and normalized units (nu)] and the status without and with fatigue. The variables not distributed normally were transformed by taking their common logarithm (log10). Results 172 trials were identified as in a ‘fatigue’ state and 891 as in a ‘no-fatigue’ state. All supine HR and HRV parameters (Beta±SE) were significantly different (P<0.0001) between ‘fatigue’ and ‘no-fatigue’: HRSU (+6.27±0.61 bpm), logTPSU (−0.36±0.04), logLFSU (−0.27±0.04), logHFSU (−0.46±0.05), logLF/HFSU (+0.19±0.03), HFSU(nu) (−9.55±1.33). Differences were also significant (P<0.0001) in standing: HRST (+8.83±0.89), logTPST (−0.28±0.03), logLFST (−0.29±0.03), logHFST (−0.32±0.04). Also, intra-individual variance of HRV parameters was larger (P<0.05) in the ‘fatigue’ state (logTPSU: 0.26 vs. 0.07, logLFSU: 0.28 vs. 0.11, logHFSU: 0.32 vs. 0.08, logTPST: 0.13 vs. 0.07, logLFST: 0.16 vs. 0.07, logHFST: 0.25 vs. 0.14). Conclusion HRV was significantly lower in 'fatigue' vs. 'no-fatigue' but accompanied by larger intra-individual variance of HRV parameters in 'fatigue'. The broader intra-individual variance of HRV parameters might encompass different changes from the no-fatigue state, possibly reflecting different fatigue-induced alterations of HRV pattern. PMID:23951198
Daily magnesium intake and serum magnesium concentration among Japanese people.
Akizawa, Yoriko; Koizumi, Sadayuki; Itokawa, Yoshinori; Ojima, Toshiyuki; Nakamura, Yosikazu; Tamura, Tarou; Kusaka, Yukinori
2008-01-01
It remains unclear which vitamins and minerals are deficient in the daily diet of a normal adult. To address this question, we conducted a population survey focusing on the relationship between dietary magnesium intake and serum magnesium level. The subjects were 62 individuals from Fukui Prefecture who participated in the 1998 National Nutrition Survey. The survey investigated the physical status, nutritional status, and dietary data of the subjects. Holidays and special occasions were avoided, and a day when people were most likely to be on an ordinary diet was selected as the survey date. The mean (+/-standard deviation) daily magnesium intake was 322 (+/-132), 323 (+/-163), and 322 (+/-147) mg/day for men, women, and the entire group, respectively. The mean (+/-standard deviation) serum magnesium concentration was 20.69 (+/-2.83), 20.69 (+/-2.88), and 20.69 (+/-2.83) ppm for men, women, and the entire group, respectively. The distribution of serum magnesium concentration was normal. Dietary magnesium intake showed a log-normal distribution and was therefore log-transformed before estimating the regression coefficients. The slope of the regression line between the serum magnesium concentration (Y ppm) and daily magnesium intake (X mg) was determined using the formula Y = 4.93 log10(X) + 8.49. The coefficient of correlation (r) was 0.29. A regression line (Y = 14.65X + 19.31) was observed between the daily intake of magnesium (Y mg) and serum magnesium concentration (X ppm). The coefficient of correlation was 0.28. The daily magnesium intake correlated with serum magnesium concentration, and a linear regression model between them was proposed.
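The log-transform-then-regress step can be sketched as follows. The synthetic intakes and noise level are hypothetical; the reported fit Y = 4.93 log10(X) + 8.49 is used only as a data generator, not as a reproduction of the survey data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic intakes: log-normal, centred near the reported mean of ~322 mg/day
intake = rng.lognormal(mean=np.log(322), sigma=0.4, size=62)

# Generate serum values from the reported fit Y = 4.93*log10(X) + 8.49, plus noise
serum = 4.93 * np.log10(intake) + 8.49 + rng.normal(0, 2.7, intake.size)

# Regress serum concentration on log10(intake), as in the survey analysis
slope, intercept = np.polyfit(np.log10(intake), serum, 1)
r = np.corrcoef(np.log10(intake), serum)[0, 1]
print(slope, intercept, r)
```

Because intake is log-normally distributed, regressing the serum level on log10(intake) rather than on the raw intake linearizes the relationship.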
A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.
Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S
2017-06-01
The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LETd) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting polynomials of 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LETd based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models.
As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum based and linear LETd based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.
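The weighted polynomial comparison at the heart of the study can be sketched with a partial F-test between nested fits. The LET range, error model and curvature below are hypothetical, not the 85-point experimental database:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical in vitro data: RBE vs LET (keV/um) with a non-linear trend
let = np.linspace(1, 20, 40)
rbe_true = 1.0 + 0.02 * let + 0.004 * let ** 2     # curved RBE-LET relation
sigma = 0.05 + 0.01 * let                          # larger errors at high LET
rbe = rbe_true + rng.normal(0, sigma)

w = 1 / sigma                                      # np.polyfit weights ~ 1/sd

def weighted_sse(deg):
    """Weighted sum of squared residuals for a polynomial fit of given degree."""
    coeffs = np.polyfit(let, rbe, deg, w=w)
    resid = (rbe - np.polyval(coeffs, let)) * w
    return (resid ** 2).sum()

# Partial F-test: does a quartic (5 parameters) improve on a line (2 parameters)?
sse1, sse4 = weighted_sse(1), weighted_sse(4)
f = ((sse1 - sse4) / 3) / (sse4 / (let.size - 5))
print(f"F = {f:.1f}")
```

A large F value favors the higher-degree polynomial; with experimental uncertainties folded into the weights, the ranking of candidate models can change, as the abstract notes.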
Sedentary behavior, physical activity, and concentrations of insulin among US adults.
Ford, Earl S; Li, Chaoyang; Zhao, Guixiang; Pearson, William S; Tsai, James; Churilla, James R
2010-09-01
Time spent watching television has been linked to obesity, metabolic syndrome, and diabetes, all conditions characterized to some degree by hyperinsulinemia and insulin resistance. However, limited evidence relates screen time (watching television or using a computer) directly to concentrations of insulin. We examined the cross-sectional associations between time spent watching television or using a computer, physical activity, and serum concentrations of insulin using data from 2800 participants aged 20 years or older in the 2003-2006 National Health and Nutrition Examination Survey. The amount of time spent watching television and using a computer as well as physical activity was self-reported. The unadjusted geometric mean concentration of insulin increased from 6.2 microU/mL among participants who did not watch television to 10.0 microU/mL among those who watched television for 5 or more hours per day (P = .001). After adjustment for age, sex, race or ethnicity, educational status, concentration of cotinine, alcohol intake, physical activity, waist circumference, and body mass index using multiple linear regression analysis, the log-transformed concentrations of insulin were significantly and positively associated with time spent watching television (P < .001). Reported time spent using a computer was significantly associated with log-transformed concentrations of insulin before but not after accounting for waist circumference and body mass index. Leisure-time physical activity but not transportation or household physical activity was significantly and inversely associated with log-transformed concentrations of insulin. Sedentary behavior, particularly the amount of time spent watching television, may be an important modifiable determinant of concentrations of insulin. Published by Elsevier Inc.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used in multicollinearity diagnosis, the basic principle of principal component regression, and a method for determining the 'best' equation. An example illustrates how to carry out principal component regression analysis with SPSS 10.0, covering the full calculation process of the principal component regression and the operation of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance of multicollinearity, yielding a simpler, faster and more accurate statistical analysis.
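A minimal sketch of the same principal component regression idea outside SPSS, on a hypothetical data set with two nearly collinear predictors: standardize the predictors, extract components by SVD, regress the response on the leading components, and map the coefficients back to predictor scale:

```python
import numpy as np

rng = np.random.default_rng(4)

# Collinear predictors: x2 is nearly a copy of x1, a classic multicollinearity case
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2 * x1 + 1 * x3 + rng.normal(scale=0.5, size=n)

# Standardize, then get principal components via SVD
Z = (X - X.mean(0)) / X.std(0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
cond = s[0] / s[-1]              # a large condition number flags collinearity

# Regress y on the first k components, then map back to standardized-predictor scale
k = 2
T = Z @ Vt[:k].T                 # component scores
gamma = np.linalg.lstsq(T, y - y.mean(), rcond=None)[0]
beta_std = Vt[:k].T @ gamma      # standardized regression coefficients
print(cond, beta_std)
```

Dropping the near-null component stabilizes the coefficients: the signal on the two collinear predictors is shared between them instead of exploding in opposite directions, which is exactly the disturbance the abstract refers to.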
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
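One standard metric for detecting the linear and near-linear dependencies between regression model terms mentioned above is the variance inflation factor (VIF). A sketch on hypothetical calibration-style regressors; the load variables and the near-duplicate column are invented, not MK40 balance data:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical regressors: two loads, a cross term, and a near-duplicate of f1
n = 120
f1 = rng.uniform(-1, 1, n)
f2 = rng.uniform(-1, 1, n)
near_dup = f1 + 0.01 * rng.normal(size=n)        # near-linear dependency with f1
X = np.column_stack([f1, f2, f1 * f2, near_dup])

def vif(X, j):
    """Variance inflation factor of column j: 1/(1-R^2) from regressing
    column j on the remaining columns (with an intercept)."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    coef = np.linalg.lstsq(A, X[:, j], rcond=None)[0]
    resid = X[:, j] - A @ coef
    r2 = 1 - resid.var() / X[:, j].var()
    return 1 / (1 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
print(np.round(vifs, 1))   # large values flag near-dependent terms
```

Terms with a VIF far above roughly 10 are candidates for removal before judging the statistical significance of the remaining coefficients.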
Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage
NASA Astrophysics Data System (ADS)
Cepowski, Tomasz
2017-06-01
The paper presents mathematical relationships that allow the main engine power of new container ships to be estimated, based on data concerning vessels built in 2005-2015. The presented approximations allow the engine power to be estimated from the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimating the container ship engine power needed in preliminary parametric design of the ship. The results show that multiple linear regression predicts the main engine power of a container ship more accurately than simple linear regression.
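The two regression variants can be contrasted on a toy data set. The ship particulars below are hypothetical, and the paper's actual regression coefficients are not reproduced:

```python
import numpy as np

# Hypothetical container-ship data: length between perpendiculars Lpp (m),
# container capacity (TEU), and installed main engine power (kW)
lpp  = np.array([210.0, 250.0, 290.0, 320.0, 350.0, 365.0])
teu  = np.array([2800, 4500, 6500, 8500, 11000, 13000])
p_kw = np.array([21000, 32000, 45000, 55000, 65000, 70000])

# Multiple linear regression: P = b0 + b1*Lpp + b2*TEU
A = np.column_stack([np.ones(lpp.size), lpp, teu])
b, *_ = np.linalg.lstsq(A, p_kw, rcond=None)
pred = A @ b

# Simple linear regression on Lpp alone, for comparison
b1 = np.polyfit(lpp, p_kw, 1)
pred1 = np.polyval(b1, lpp)

sse_multi = ((p_kw - pred) ** 2).sum()
sse_simple = ((p_kw - pred1) ** 2).sum()
print(sse_multi <= sse_simple)   # the richer model fits at least as well
```

Because the simple model is nested in the multiple one, its in-sample error can never be lower; whether the extra predictor helps out of sample is what the paper's comparison addresses.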
White matter degeneration in schizophrenia: a comparative diffusion tensor analysis
NASA Astrophysics Data System (ADS)
Ingalhalikar, Madhura A.; Andreasen, Nancy C.; Kim, Jinsuh; Alexander, Andrew L.; Magnotta, Vincent A.
2010-03-01
Schizophrenia is a serious and disabling mental disorder. Diffusion tensor imaging (DTI) studies performed on schizophrenia have demonstrated white matter degeneration either due to loss of myelination or deterioration of fiber tracts although the areas where the changes occur are variable across studies. Most of the population based studies analyze the changes in schizophrenia using scalar indices computed from the diffusion tensor such as fractional anisotropy (FA) and relative anisotropy (RA). The scalar measures may not capture the complete information from the diffusion tensor. In this paper we have applied the RADTI method on a group of 9 controls and 9 patients with schizophrenia. The RADTI method converts the tensors to log-Euclidean space where a linear regression model is applied and hypothesis testing is performed between the control and patient groups. Results show that there is a significant difference in the anisotropy between patients and controls especially in the parts of forceps minor, superior corona radiata, anterior limb of internal capsule and genu of corpus callosum. To check if the tensor analysis gives a better idea of the changes in anisotropy, we compared the results with voxelwise FA analysis as well as voxelwise geodesic anisotropy (GA) analysis.
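The core of the log-Euclidean step is taking the matrix logarithm of each symmetric positive-definite diffusion tensor, after which ordinary linear models can be applied componentwise. A minimal sketch with one hypothetical tensor, including the inverse (matrix exponential) round trip:

```python
import numpy as np

def log_euclidean(tensor):
    """Matrix logarithm of a symmetric positive-definite diffusion tensor,
    computed via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(tensor)
    return vecs @ np.diag(np.log(vals)) @ vecs.T

# Hypothetical 3x3 diffusion tensor (symmetric positive-definite)
D = np.array([[1.7, 0.2, 0.0],
              [0.2, 0.4, 0.1],
              [0.0, 0.1, 0.3]])
L = log_euclidean(D)

# Linear operations (means, regression fits) happen on L; the matrix exponential
# maps results back to valid positive-definite tensors
vals, vecs = np.linalg.eigh(L)
D_back = vecs @ np.diag(np.exp(vals)) @ vecs.T
print(np.allclose(D_back, D))
```

Averaging or regressing in log space guarantees that the back-transformed result is still a valid positive-definite tensor, which raw componentwise averaging does not.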
Disconcordance in Statistical Models of Bisphenol A and Chronic Disease Outcomes in NHANES 2003-08
Casey, Martin F.; Neidell, Matthew
2013-01-01
Background Bisphenol A (BPA), a high-production chemical commonly found in plastics, has drawn great attention from researchers due to the substance’s potential toxicity. Using data from three National Health and Nutrition Examination Survey (NHANES) cycles, we explored the consistency and robustness of BPA’s reported effects on coronary heart disease and diabetes. Methods and Findings We report the use of three different statistical models in the analysis of BPA: (1) logistic regression, (2) log-linear regression, and (3) dose-response logistic regression. In each variation, confounders were added in six blocks to account for demographics, urinary creatinine, source of BPA exposure, healthy behaviours, and phthalate exposure. Results were sensitive to the variations in functional form of our statistical models, but no single model yielded consistent results across NHANES cycles. Reported ORs were also found to be sensitive to inclusion/exclusion criteria. Further, observed effects, which were most pronounced in NHANES 2003-04, could not be explained away by confounding. Conclusions Limitations in the NHANES data and a poor understanding of the mode of action of BPA have made it difficult to develop informative statistical models. Given the sensitivity of effect estimates to functional form, researchers should report results using multiple specifications with different assumptions about BPA measurement, thus allowing for the identification of potential discrepancies in the data. PMID:24223205
Interpretation of a compositional time series
NASA Astrophysics Data System (ADS)
Tolosana-Delgado, R.; van den Boogaart, K. G.
2012-04-01
Common methods for multivariate time series analysis use linear operations, from the definition of a time-lagged covariance/correlation to the prediction of new outcomes. However, when the time series response is a composition (a vector of positive components showing the relative importance of a set of parts in a total, like percentages and proportions), then linear operations are afflicted by several problems. For instance, it has long been recognised that (auto/cross-)correlations between raw percentages are spurious, more dependent on which other components are being considered than on any natural link between the components of interest. Also, a long-term forecast of a composition in models with a linear trend will ultimately predict negative components. In general terms, compositional data should not be treated on a raw scale, but after a log-ratio transformation (Aitchison, 1986: The statistical analysis of compositional data. Chapman and Hall). This is so because the information conveyed by compositional data is relative, as stated in their definition. The principle of working in coordinates allows one to apply any sort of multivariate analysis to a log-ratio transformed composition, as long as this transformation is invertible. This principle applies fully to time series analysis. We will discuss how results (both auto/cross-correlation functions and predictions) can be back-transformed, viewed and interpreted in a meaningful way. One view is to use the exhaustive set of all possible pairwise log-ratios, which allows the results to be expressed as D(D - 1)/2 separate, interpretable sets of one-dimensional models showing the behaviour of each possible pairwise log-ratio. Another view is the interpretation of estimated coefficients or correlations back-transformed in terms of compositions. These two views are compatible and complementary. These issues are illustrated with time series of seasonal precipitation patterns at different rain gauges of the USA.
In this data set, the proportion of annual precipitation falling in winter, spring, summer and autumn is considered a 4-component time series. Three invertible log-ratios are defined for calculations, balancing rainfall in autumn vs. winter, in summer vs. spring, and in autumn-winter vs. spring-summer. Results suggest a 2-year correlation range, and certain oscillatory behaviour in the last balance, which does not occur in the other two.
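The three balances named above can be written as isometric log-ratio coordinates. A sketch on one hypothetical seasonal composition, including the inverse transform that the interpretation step relies on:

```python
import numpy as np

# One seasonal composition (proportions of annual precipitation), summing to 1
x = np.array([0.30, 0.28, 0.20, 0.22])   # winter, spring, summer, autumn
win, spr, summ, aut = x

# Three balances as in the text: autumn vs winter, summer vs spring,
# and autumn+winter vs spring+summer (orthonormal log-contrast coefficients)
b1 = np.log(aut / win) / np.sqrt(2)
b2 = np.log(summ / spr) / np.sqrt(2)
b3 = 0.5 * np.log((aut * win) / (spr * summ))

# The transform is invertible: rebuild the centred log-ratios from the balances,
# exponentiate, and re-close to proportions
y = np.array([
    -b1 / np.sqrt(2) + b3 / 2,   # log-contrast for winter
    -b2 / np.sqrt(2) - b3 / 2,   # spring
    +b2 / np.sqrt(2) - b3 / 2,   # summer
    +b1 / np.sqrt(2) + b3 / 2,   # autumn
])
x_back = np.exp(y) / np.exp(y).sum()
print(np.allclose(x_back, x))
```

Any time series model (correlations, forecasts) is fitted to the three balance coordinates; because the mapping is invertible, fitted values and predictions can be carried back to proportions that are guaranteed positive and sum to one.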
Erice, Alejo; Brambilla, Donald; Bremer, James; Jackson, J. Brooks; Kokka, Robert; Yen-Lieberman, Belinda; Coombs, Robert W.
2000-01-01
The QUANTIPLEX HIV-1 RNA assay, version 3.0 (a branched DNA, version 3.0, assay [bDNA 3.0 assay]), was evaluated by analyzing spiked and clinical plasma samples and was compared with the AMPLICOR HIV-1 MONITOR Ultrasensitive (ultrasensitive reverse transcription-PCR [US-RT-PCR]) method. A panel of spiked plasma samples that contained 0 to 750,000 copies of human immunodeficiency virus type 1 (HIV-1) RNA per ml was tested four times in each of four laboratories (1,344 assays). Negative results (<50 copies/ml) were obtained in 30 of 32 (94%) assays with seronegative samples, 66 of 128 (52%) assays with HIV-1 RNA at 50 copies/ml, and 5 of 128 (4%) assays with HIV-1 RNA at 100 copies/ml. The assay was linear from 100 to 500,000 copies/ml. The within-run standard deviation (SD) of the log10 estimated HIV-1 RNA concentration was 0.08 at 1,000 to 500,000 copies/ml, increased below 1,000 copies/ml, and was 0.17 at 100 copies/ml. Between-run reproducibility at 100 to 500 copies/ml was <0.10 log10 in most comparisons. Interlaboratory differences across runs were ≤0.10 log10 at all concentrations examined. A subset of the panel (25 to 500 copies/ml) was also analyzed by the US-RT-PCR assay. The within-run SD varied inversely with the log10 HIV-1 RNA concentration but was higher than the SD for the bDNA 3.0 assay at all concentrations. Log-log regression analysis indicated that the two methods produced very similar estimates at 100 to 500 copies/ml. In parallel testing of clinical specimens with low HIV-1 RNA levels, 80 plasma samples with <50 copies/ml by the US-RT-PCR assay had <50 copies/ml when they were retested by the bDNA 3.0 assay. In contrast, 11 of 78 (14%) plasma samples with <50 copies/ml by the bDNA 3.0 assay had ≥50 copies/ml when they were retested by the US-RT-PCR assay (median, 86 copies/ml; range, 50 to 217 copies/ml). 
Estimation of bDNA 3.0 values of <50 copies/ml by extending the standard curve of the assay showed that these samples with discrepant results had higher HIV-1 RNA levels than the samples with concordant results (median, 34 versus 17 copies/ml; P = 0.0051 by the Wilcoxon two-sample test). The excellent reproducibility, broad linear range, and good sensitivity of the bDNA 3.0 assay make it a very attractive method for quantitation of HIV-1 RNA levels in plasma. PMID:10921936
Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C
2011-09-01
Selective genotyping can increase power in quantitative trait association studies. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased estimate of the genetic effect. Here, we present a simple correction for the bias.
Adsorption of polar organic molecules on sediments: Case-study on Callovian-Oxfordian claystone.
Rasamimanana, S; Lefèvre, G; Dagnelie, R V H
2017-08-01
The release and transport of anthropogenic organic matter through the geosphere is often an environmental safety criterion. Sedimentary rocks are widely studied in this context as geological barriers for waste management. This is the case for Callovian-Oxfordian claystone (COx), for which several studies report adsorption of anthropogenic organic molecules. In this study, we evaluated and reviewed adsorption data of polar organic molecules on COx claystone. Experiments were performed on raw claystone, decarbonated and clay fractions. Adsorption isotherms were measured with adsorbates of various polarities: adipate, benzoate, ortho-phthalate, succinate, gluconate, oxalate, EDTA, citrate. Significant adsorption was observed for multidentate polycarboxylic acids, as evidenced with phthalate, succinate, oxalate, gluconate, EDTA and citrate (Rd = 1.53, 3.52, 8.4, 8.8, 12.4, and 54.7 L kg-1, respectively). Multiple linear regressions were performed as a statistical analysis to determine the predictors from these adsorption data. A linear correlation between adsorption data (Rd) and the dipole moment (μ) of adsorbates was evidenced (R2 = 0.91). Molecules with a high dipole moment, μ(D) > 2.5, displayed significant adsorption, Rd ≫ 1 L kg-1. A qualitative correlation can also be estimated using the water/octanol partition coefficient, Pow, of adsorbates (R2 = 0.77). In this case, two opposite trends were distinguished for polar and apolar molecules. The organic carbon content of sediments is relevant for predicting adsorption of apolar compounds, log(Pow) > +1, while the oxide/clay contents may be relevant for polar molecules, log(apparent Pow) < -1. The proposed scheme offers a general methodology for the investigation of geo-barriers towards heterogeneous organic plumes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma
2016-01-01
Aim: This research was conducted to determine the parameters most affecting the hatchability of indigenous and improved local chickens' eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influential one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. The Alexandria strain had the highest commercial hatchability (80.70%). Highly significant differences in hatching chick weight among strains were also observed. In the multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: Prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666
Analysis of the SFR-M∗ plane at z < 3: single fitting versus multi-Gaussian decomposition
NASA Astrophysics Data System (ADS)
Bisigello, L.; Caputi, K. I.; Grogin, N.; Koekemoer, A.
2018-01-01
The analysis of galaxies on the star formation rate-stellar mass (SFR-M∗) plane is a powerful diagnostic for galaxy evolution at different cosmic times. We consider a sample of 24 463 galaxies from the CANDELS/GOODS-S survey to conduct a detailed analysis of the SFR-M∗ relation at redshifts z < 3, spanning more than three dex in stellar mass. To obtain SFR estimates, we utilise mid- and far-IR photometry when available, and rest-UV fluxes for all the other galaxies. We perform our analysis in different redshift bins, with two different methods: 1) a linear regression fitting of all star-forming galaxies, defined as those with specific SFRs log10(sSFR/yr-1) > -9.8, similarly to what is typically done in the literature; 2) a multi-Gaussian decomposition to identify the galaxy main sequence (MS), the starburst sequence and the quenched galaxy cloud. We find that the MS slope becomes flatter when higher stellar mass cuts are adopted, and that the apparent slope change observed at high masses depends on the SFR estimation method. In addition, the multi-Gaussian decomposition reveals the presence of a starburst population which increases towards low stellar masses and high redshifts. We find that starbursts make up 5% of all galaxies at z = 0.5-1.0, while they account for 16% of galaxies at 2 < z < 3.
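The multi-Gaussian decomposition described above can be sketched as a small EM fit to a one-dimensional sSFR distribution. This is an illustrative numpy implementation, not the authors' code; the two-component choice, the quantile-based initialisation and the sample values used below are assumptions.

```python
import numpy as np

def fit_gmm_1d(x, k=2, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))  # spread initial means over the data
    sigma = np.full(k, x.std())
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means and standard deviations
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return w, mu, sigma
```

Applied to log10(sSFR) values in a stellar-mass bin, the recovered components would play the roles of the main sequence and the quenched cloud (a third component could capture starbursts).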
DOE Office of Scientific and Technical Information (OSTI.GOV)
DiCostanzo, D; Ayan, A; Woollard, J
Purpose: To predict potential failures of hardware within the Varian TrueBeam linear accelerator in order to proactively replace parts and decrease machine downtime. Methods: Machine downtime is a problem for all radiation oncology departments and vendors. Most often it is the result of unexpected equipment failure, and it is increased by a lack of in-house clinical engineering support. Preventative maintenance attempts to reduce downtime but is often ineffective at preemptively catching many failure modes, such as MLC motor failures, the need to tighten a gantry chain, or the replacement of a jaw motor. To address this, software was developed in house that determines the maximum value of each axis enumerated in the TrueBeam trajectory log files. After patient treatments, these data are stored in a SQL database. Microsoft Power BI is used to plot the daily average maximum error of each axis for each machine as a function of time. The results are then correlated with actual faults that occurred at the machine with the help of Varian service engineers. Results: Over the course of six months, 76,312 trajectory logs have been written into the database and plotted in Power BI. During the analysis, MLC motors were replaced on three machines thanks to the early warning provided by the trajectory log analysis. The service engineers were also alerted to possible gantry issues on one occasion. Conclusion: Analyzing trajectory log data is a viable and effective early warning system for potential failures of the TrueBeam linear accelerator. With further analysis and tightening of the tolerance values used to flag a possible imminent failure, it should be possible to pinpoint future issues more thoroughly and for more axes of motion.
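The early-warning scheme above (daily averages of per-treatment maximum axis errors, flagged against a tolerance) can be sketched in a few lines of plain Python. The record format, axis name and threshold below are hypothetical, not Varian's log schema.

```python
from collections import defaultdict
from statistics import mean

def daily_mean_max_error(records):
    """records: iterable of (date_str, axis_name, max_abs_error) tuples,
    one per trajectory log.  Returns {(date, axis): mean of daily maxima}."""
    buckets = defaultdict(list)
    for date, axis, err in records:
        buckets[(date, axis)].append(err)
    return {key: mean(vals) for key, vals in buckets.items()}

def flag_drift(daily, axis, threshold):
    """Return sorted dates whose mean max error for an axis exceeds a tolerance."""
    return sorted(d for (d, a), v in daily.items() if a == axis and v > threshold)
```

In practice the daily aggregates would live in the SQL database and the flagging threshold would be tuned per axis against known fault history.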
Study on power grid characteristics in summer based on linear regression analysis
NASA Astrophysics Data System (ADS)
Tang, Jin-hui; Liu, You-fei; Liu, Juan; Liu, Qiang; Liu, Zhuan; Xu, Xi
2018-05-01
Correlation analysis of power load and temperature is a precondition for accurate load prediction, and a great deal of research has been devoted to it. This paper constructs a linear correlation model between temperature and power load and then investigates the correlation of fault-maintenance work orders with power load. Detailed data from Jiangxi province for the summer of 2017, including temperature, power load, and fault-maintenance work orders, were used for data analysis and mining. The linear regression models established in this paper will help to further refine electricity load growth forecasting, fault-repair work order review, distribution network weakness analysis, and related work.
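A minimal sketch of the kind of temperature-load linear model the paper describes; the daily observation values below are invented for illustration.

```python
import numpy as np

# Hypothetical daily observations: peak temperature (deg C) and peak load (MW).
temp = np.array([28.0, 30.5, 32.0, 33.5, 35.0, 36.5, 38.0])
load = np.array([610.0, 655.0, 690.0, 720.0, 755.0, 790.0, 820.0])

# Least-squares fit of load = a * temp + b.
a, b = np.polyfit(temp, load, 1)

# Goodness of fit (coefficient of determination).
pred = a * temp + b
r2 = 1 - np.sum((load - pred) ** 2) / np.sum((load - load.mean()) ** 2)
```

The slope `a` quantifies the extra megawatts of load per degree of summer temperature, which is the quantity such a correlation study would report.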
Aspects of porosity prediction using multivariate linear regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Byrnes, A.P.; Wilson, M.D.
1991-03-01
Highly accurate multiple linear regression models have been developed for sandstones of diverse compositions. Porosity reduction or enhancement processes are controlled by the fundamental variables pressure (P), temperature (T), time (t), and composition (X), where composition includes mineralogy, size, sorting, fluid composition, etc. The multiple linear regression equation, of which all linear porosity prediction models are subsets, takes the generalized form: Porosity = C0 + C1(P) + C2(T) + C3(X) + C4(t) + C5(PT) + C6(PX) + C7(Pt) + C8(TX) + C9(Tt) + C10(Xt) + C11(PTX) + C12(PXt) + C13(PTt) + C14(TXt) + C15(PTXt). The four primary variables are often interactive, thus requiring terms involving two or more primary variables (the form shown implies interaction and not necessarily multiplication). The final terms used may also involve simple mathematical transforms such as log X, e^T, X^2, or more complex transformations such as the Time-Temperature Index (TTI). The X term in the equation above represents a suite of compositional variables, and a fully expanded equation may therefore include a series of terms incorporating these variables. Numerous published bivariate porosity prediction models involving P (or depth) or Tt (TTI) are effective to a degree, largely because of the high degree of collinearity between P and TTI. However, all such bivariate models ignore the unique contributions of P and Tt, as well as various X terms. These simpler models become poor predictors in regions where collinear relations change, where important variables have been ignored, or where the database does not include a sufficient range or weight distribution for the critical variables.
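The generalized form above can be illustrated by assembling a design matrix with main effects and interaction terms and solving by least squares. Everything numeric here is invented, and the paper's X, which is a whole suite of compositional variables, is collapsed to a single score for the sketch.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
P = rng.uniform(10, 60, n)    # pressure, hypothetical units
T = rng.uniform(50, 180, n)   # temperature, hypothetical units
X = rng.uniform(0, 1, n)      # single composition score (illustrative)
t = rng.uniform(1, 100, n)    # time, hypothetical units

# Design matrix with main effects plus a couple of interaction terms,
# mirroring Porosity = C0 + C1(P) + C2(T) + C3(X) + C4(t) + C5(PT) + C8(TX) + ...
D = np.column_stack([np.ones(n), P, T, X, t, P * T, T * X])
true_c = np.array([35.0, -0.1, -0.05, 4.0, -0.02, 0.0005, -0.01])
porosity = D @ true_c + rng.normal(0, 0.2, n)  # synthetic observations

coef, *_ = np.linalg.lstsq(D, porosity, rcond=None)
```

With a real dataset, the transform terms (log X, e^T, TTI, ...) would simply become additional columns of the same design matrix.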
Coping with Guilt and Shame: A Narrative Approach
ERIC Educational Resources Information Center
Silfver, Mia
2007-01-01
Autobiographical narratives (N = 97) of guilt and shame experiences were analysed to determine how the nature of emotion and context relate to ways of coping in such situations. The coding categories were created by content analysis, and the connections between categories were analysed with optimal scaling and log-linear analysis. Two theoretical…
ERIC Educational Resources Information Center
Zwick, Rebecca; Lenaburg, Lubella
2009-01-01
In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…
Real-time soil sensing based on fiber optics and spectroscopy
NASA Astrophysics Data System (ADS)
Li, Minzan
2005-08-01
Using NIR spectroscopic techniques, correlation and regression analyses for soil parameter estimation were conducted with raw soil samples collected in a cornfield and a forage field. The soil parameters analyzed were soil moisture, soil organic matter, nitrate nitrogen, soil electrical conductivity and pH. Results showed that all soil parameters could be evaluated by NIR spectral reflectance. For soil moisture, a linear regression model was suitable at low moisture contents below 30% db, while an exponential model could be used over a wide range of moisture content up to 100% db. Nitrate nitrogen estimation required a multi-spectral exponential model, and electrical conductivity could be evaluated by a single spectral regression. Based on these results, a real-time soil sensor system based on fiber optics and spectroscopy was developed. The sensor system was composed of a soil subsoiler with four optical fiber probes, a spectrometer, and a control unit. Two optical fiber probes were used for illumination and the other two for collecting soil reflectance from visible to NIR wavebands at depths around 30 cm. The spectrometer was used to obtain the spectra of the reflected light. The control unit consisted of a data logging device, a personal computer, and a pulse generator. Experiments showed that clear photo-spectral reflectance was obtained from the underground soil. The soil reflectance was equal to that obtained by the desktop spectrophotometer in laboratory tests. Using the spectral reflectance, soil parameters such as soil moisture, pH, EC and SOM were evaluated.
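The exponential moisture model mentioned above is conveniently fitted by linearising with a log transform; a sketch with invented calibration values (the actual calibration data are not given in the abstract):

```python
import numpy as np

# Hypothetical calibration data: moisture (% db) vs NIR reflectance (arbitrary units),
# generated here from an assumed exponential law reflect = a * exp(b * moisture).
moisture = np.array([5.0, 10.0, 20.0, 30.0, 45.0, 60.0, 80.0, 100.0])
reflect = 0.9 * np.exp(-0.015 * moisture)

# Linearise: log(reflect) = log(a) + b * moisture, then ordinary least squares.
b, log_a = np.polyfit(moisture, np.log(reflect), 1)
a = np.exp(log_a)
```

The same log-transform trick extends to the multi-spectral exponential model for nitrate nitrogen, with several wavebands entering the regression instead of one.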
Stuart, James Ian; Delport, Johan; Lannigan, Robert; Zahariadis, George
2014-01-01
BACKGROUND: Disease monitoring of viruses using real-time polymerase chain reaction (PCR) requires knowledge of the precision of the test to determine what constitutes a significant change. Calculation of quantitative PCR confidence limits requires bivariate statistical methods. OBJECTIVE: To develop a simple-to-use graphical user interface to determine the uncertainty of measurement (UOM) of BK virus, cytomegalovirus (CMV) and Epstein-Barr virus (EBV) real-time PCR assays. METHODS: Thirty positive clinical samples for each of the three viral assays were repeated once. A graphical user interface was developed using a spreadsheet (Excel, Microsoft Corporation, USA) to enable data entry and calculation of the UOM (according to Fieller’s theorem) and PCR efficiency. RESULTS: The confidence limits for the BK virus, CMV and EBV tests were ∼0.5 log, 0.5 log to 1.0 log, and 0.5 log to 1.0 log, respectively. The efficiencies of these assays, in the same order were 105%, 119% and 90%. The confidence limits remained stable over the linear range of all three tests. DISCUSSION: A >5 fold (0.7 log) and a >3-fold (0.5 log) change in viral load were significant for CMV and EBV when the results were ≤1000 copies/mL and >1000 copies/mL, respectively. A >3-fold (0.5 log) change in viral load was significant for BK virus over its entire linear range. PCR efficiency was ideal for BK virus and EBV but not CMV. Standardized international reference materials and shared reporting of UOM among laboratories are required for the development of treatment guidelines for BK virus, CMV and EBV in the context of changes in viral load. PMID:25285125
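The UOM calculation above relies on Fieller's theorem for the confidence interval of a ratio of two estimates. What follows is a generic sketch of the textbook Fieller interval, not the authors' spreadsheet implementation; the numeric inputs in the test are invented.

```python
import math

def fieller_ci(a, b, var_a, var_b, cov_ab=0.0, t=1.96):
    """Fieller confidence interval for the ratio a/b of two
    approximately normal estimates with the given (co)variances.
    Valid when g < 1, i.e. b is significantly different from zero."""
    g = t * t * var_b / (b * b)
    if g >= 1:
        raise ValueError("denominator not significantly different from zero")
    rho = a / b
    centre = rho - g * cov_ab / var_b
    half = abs(t / b) * math.sqrt(
        var_a - 2 * rho * cov_ab + rho * rho * var_b
        - g * (var_a - cov_ab * cov_ab / var_b)
    )
    return (centre - half) / (1 - g), (centre + half) / (1 - g)
```

When the denominator is estimated precisely (g near zero) the interval reduces to the familiar delta-method interval for a ratio.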
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ma, Brigette; King, Ann; Lo, Y.M. Dennis
Purpose: Plasma Epstein-Barr virus DNA (pEBV DNA) is an important prognostic marker in nasopharyngeal carcinoma (NPC). This study tested the hypotheses that pEBV DNA reflects tumor burden and metabolic activity by evaluating its relationship with tumor volume and 18F-fluorodeoxyglucose (18F-FDG) uptake in NPC. Methods and Materials: Pre-treatment pEBV DNA analysis, 18F-FDG positron emission tomography-computed tomography scan (PET-CT) and magnetic resonance imaging (MRI) of the head and neck were performed in 57 patients. Net volumes (cm3) of the primary tumor (Tvol) and regional nodes (Nvol) were quantified on MRI. 18F-FDG uptake was expressed as the maximum standardized uptake value (SUVmax) at the primary tumor (Tsuv) and regional nodes (Nsuv). Lesions with SUVmax >= 2.5 were considered malignant. The relationship between SUVmax, the natural logarithm (log) of pEBV DNA, and the square root (sq) of the MRI volumes was analyzed using the Wilcoxon test. A linear regression model was constructed to test for any interaction between variables and disease stage. Results: Log-pEBV DNA showed significant correlation with sq-Tvol (r = 0.393), sq-Nvol (r = 0.452), total tumor volume (sq-Totalvol = Tvol + Nvol, r = 0.554), Tsuv (r = 0.276), Nsuv (r = 0.434), and total SUVmax (Totalsuv = Tsuv + Nsuv, r = 0.457). Likewise, sq-Tvol was correlated with Tsuv (r = 0.426), and sq-Nvol with Nsuv (r = 0.651). Regression analysis showed that only log-pEBV DNA was significantly associated with sq-Totalvol (p < 0.001; parameter estimate = 8.844; 95% confidence interval = 3.986-13.703), whereas sq-Tvol was significantly associated with Tsuv (p = 0.002; parameter estimate = 3.923; 95% confidence interval = 1.498-6.348).
Conclusion: This study supports the hypothesis that cell-free plasma EBV DNA is a marker of tumor burden in EBV-related NPC.
Dispositional optimism, depression, disability and quality of life in Parkinson’s disease
Gison, Annalisa; Dall’Armi, Valentina; Donati, Valentina; Rizza, Federica; Giaquinto, Salvatore
2014-01-01
Summary Very little research on dispositional optimism (DO) has been carried out in the field of Parkinson's disease (PD). The present cross-sectional study, focusing on this personality trait, was performed with two main aims: i) to compare DO between patients with PD and a control group (CG); ii) to perform, in the PD group, a regression analysis including health-related variables, such as depression, anxiety, quality of life (QoL) and activities of daily living. Seventy PD participants and 70 healthy volunteers were enrolled in the study. The Mann-Whitney test was used to compare life orientation between the PD and CG groups. In the PD group, Pearson's correlation analysis was used to investigate the relationship between the measures of DO and the other variables. Log-linear regression models were also used: mean ratios adjusted for sex, age, education, and severity of disease were estimated, with their 95% confidence intervals and p-values. The main results were as follows: i) no significant difference in DO was found between the PD participants and the CG; ii) DO was positively associated with QoL and emotional distress and inversely correlated with the Unified Parkinson's Disease Rating Scale; iii) DO was not correlated with disability. In conclusion, high DO predicts a satisfactory quality of life, low emotional distress and reduced disease severity in PD. PMID:25306121
voom: precision weights unlock linear model analysis tools for RNA-seq read counts
Law, Charity W; Chen, Yunshun; Shi, Wei; Smyth, Gordon K
2014-02-03
New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments. The voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline. This opens access for RNA-seq analysts to a large body of methodology developed for microarrays. Simulation studies show that voom performs as well or better than count-based RNA-seq methods even when the data are generated according to the assumptions of the earlier methods. Two case studies illustrate the use of linear modeling and gene set testing methods. PMID:24485249
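The voom idea, estimating the mean-variance trend of the log-counts and converting it into per-observation inverse-variance weights, can be sketched as follows. This uses a simple polynomial trend where limma-voom uses lowess, so it illustrates the principle rather than reimplementing the method.

```python
import numpy as np

def voom_style_weights(counts, lib_size=None, degree=2):
    """Sketch of voom-style precision weights for a genes x samples count matrix.
    Models the gene-wise mean vs sqrt-standard-deviation trend of log-CPM with
    a polynomial (stand-in for voom's lowess) and returns 1/predicted-variance."""
    counts = np.asarray(counts, dtype=float)
    if lib_size is None:
        lib_size = counts.sum(axis=0)
    # log2 counts-per-million with small offsets, as in voom
    logcpm = np.log2((counts + 0.5) / (lib_size + 1.0) * 1e6)
    gene_mean = logcpm.mean(axis=1)
    gene_sqrt_sd = np.sqrt(logcpm.std(axis=1, ddof=1))
    trend = np.polyfit(gene_mean, gene_sqrt_sd, degree)
    fitted_sqrt_sd = np.polyval(trend, logcpm)        # per-observation prediction
    return 1.0 / np.clip(fitted_sqrt_sd, 1e-4, None) ** 4  # weight = 1/variance
```

In the real pipeline these weights, together with the log-CPM matrix, are handed to limma's weighted linear model fit and empirical Bayes moderation.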
Supek, Fran; Ramljak, Tatjana Šumanovac; Marjanović, Marko; Buljubašić, Maja; Kragol, Goran; Ilić, Nataša; Smuc, Tomislav; Zahradka, Davor; Mlinarić-Majerski, Kata; Kralj, Marijeta
2011-08-01
18-crown-6 ethers are known to exert their biological activity by transporting K(+) ions across cell membranes. Using non-linear support vector machine regression, we searched for structural features that influence antiproliferative activity in a diverse set of 19 known oxa-, monoaza- and diaza-18-crown-6 ethers. Here, we show that the logP of the molecule is the most important molecular descriptor, among ∼1300 tested descriptors, in determining biological potency (cross-validated R2 = 0.704). The optimal logP was 5.5 (Ghose-Crippen ALOGP estimate), while both higher and lower values were detrimental to biological potency. After controlling for logP, we found that the antiproliferative activity of the molecule was generally not affected by side chain length, molecular symmetry, or the presence of side chain amide links. To validate this QSAR model, we synthesized six novel, highly lipophilic diaza-18-crown-6 derivatives with adamantane moieties attached to the side arms. These compounds have near-optimal logP values and consequently exhibit strong growth inhibition in various human cancer cell lines and a bacterial system. The bioactivities of different diaza-18-crown-6 analogs in Bacillus subtilis and cancer cells were correlated, suggesting that conserved molecular features may be mediating the cytotoxic response. We conclude that relying primarily on the logP is a sensible strategy in preparing future 18-crown-6 analogs with optimized biological activity. Copyright © 2011 Elsevier Masson SAS. All rights reserved.
Functional Relationships and Regression Analysis.
ERIC Educational Resources Information Center
Preece, Peter F. W.
1978-01-01
Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…
Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression
ERIC Educational Resources Information Center
Beckstead, Jason W.
2012-01-01
The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, while the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises both multiplicative and additive interaction. The former is only of statistical significance, whereas the latter has biological significance. In this paper, macros were written using SAS 9.4 to calculate the relative excess risk due to interaction, the attributable proportion due to interaction and the synergy index alongside the interaction terms of logistic and Cox regressions, and Wald, delta and profile likelihood confidence intervals were used to evaluate additive interaction, for reference in big data analysis in clinical epidemiology and in analyses of genetic multiplicative and additive interactions.
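The three additive-interaction measures have standard closed forms in terms of the joint and separate relative risks (often approximated by odds or hazard ratios from the fitted model); a minimal sketch:

```python
def additive_interaction(rr11, rr10, rr01):
    """Additive interaction measures from relative risks:
    rr11 = RR for both exposures, rr10/rr01 = RR for each exposure alone
    (all relative to the doubly unexposed group).
    Returns (RERI, attributable proportion AP, synergy index S)."""
    reri = rr11 - rr10 - rr01 + 1.0            # relative excess risk due to interaction
    ap = reri / rr11                           # share of joint effect due to interaction
    s = (rr11 - 1.0) / ((rr10 - 1.0) + (rr01 - 1.0))  # synergy index
    return reri, ap, s
```

No additive interaction corresponds to RERI = 0, AP = 0 and S = 1; confidence intervals for these quantities (Wald, delta or profile likelihood, as in the abstract) require the covariance matrix of the regression coefficients and are not sketched here.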
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
ERIC Educational Resources Information Center
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification
Fan, Jianqing; Feng, Yang; Jiang, Jiancheng; Tong, Xin
2015-01-01
We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing. PMID:27185970
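The core FANS transformation, replacing each feature by the estimated log ratio of its class-conditional marginal densities, can be sketched with kernel density estimates. In the full method the transformed features feed a penalized logistic regression and the density estimation is done on a held-out split; both refinements are omitted in this illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

def fans_transform(X_train, y_train, X):
    """Replace each feature x_j by log(f1_j(x_j) / f0_j(x_j)), where f1_j and
    f0_j are kernel estimates of the class-conditional marginal densities.
    Illustrative sketch of the FANS augmentation step only."""
    X_train, X = np.asarray(X_train, float), np.asarray(X, float)
    y_train = np.asarray(y_train)
    out = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        f1 = gaussian_kde(X_train[y_train == 1, j])
        f0 = gaussian_kde(X_train[y_train == 0, j])
        # small floor avoids log(0) far in the tails
        out[:, j] = np.log(f1(X[:, j]) + 1e-12) - np.log(f0(X[:, j]) + 1e-12)
    return out
```

A linear classifier on the transformed features then realises the nonlinear decision boundary the abstract describes, since each augmented coordinate is already an (estimated) optimal univariate discriminant.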
Monitoring Seismo-volcanic and Infrasonic Signals at Volcanoes: Mt. Etna Case Study
NASA Astrophysics Data System (ADS)
Cannata, Andrea; Di Grazia, Giuseppe; Aliotta, Marco; Cassisi, Carmelo; Montalto, Placido; Patanè, Domenico
2013-11-01
Volcanoes generate a broad range of seismo-volcanic and infrasonic signals, whose features and variations are often closely related to volcanic activity. The study of these signals is hence very useful in the monitoring and investigation of volcano dynamics. The analysis of seismo-volcanic and infrasonic signals requires specifically developed techniques due to their unique characteristics, which are generally quite distinct compared with tectonic and volcano-tectonic earthquakes. In this work, we describe analysis methods used to detect and locate seismo-volcanic and infrasonic signals at Mt. Etna. Volcanic tremor sources are located using a method based on spatial seismic amplitude distribution, assuming propagation in a homogeneous medium. The tremor source is found by calculating the goodness of the linear regression fit (R2) of the log-linearized equation of the seismic amplitude decay with distance. The location method for long-period events is based on the joint computation of semblance and R2 values, and the location method of very long-period events is based on the application of radial semblance. Infrasonic events and tremor are located by semblance-brightness- and semblance-based methods, respectively. The techniques described here can also be applied to other volcanoes and do not require particular network geometries (such as arrays) but rather simple sparse networks. Using the source locations of all the considered signals, we were able to reconstruct the shallow plumbing system (above sea level) during 2011.
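The amplitude-decay location method can be sketched as a grid search that, at each trial source, linearly regresses log amplitude on log hypocentral distance and keeps the node with the best R2. This is a homogeneous-medium, 2-D toy version; the station layout, decay exponent and grid are invented.

```python
import numpy as np

def locate_tremor(stations, amps, grid):
    """Grid-search tremor location from station amplitudes.
    Assumes A = A0 / r**b, i.e. log10(A) linear in log10(r);
    returns the grid node maximising the regression R^2."""
    log_a = np.log10(amps)
    best_node, best_r2 = None, -np.inf
    for node in grid:
        r = np.linalg.norm(stations - node, axis=1)
        log_r = np.log10(r)
        slope, intercept = np.polyfit(log_r, log_a, 1)
        pred = slope * log_r + intercept
        r2 = 1 - np.sum((log_a - pred) ** 2) / np.sum((log_a - log_a.mean()) ** 2)
        if r2 > best_r2:
            best_node, best_r2 = node, r2
    return best_node, best_r2
```

A real implementation works in 3-D, smooths amplitudes over a time window, and may restrict the fitted decay exponent to physically plausible values.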
Natarajan, R; Nirdosh, I; Venuvanalingam, P; Ramalingam, M
2002-07-01
The QPPR approach has been used to model cupferrons as mineral collectors. Separation efficiencies (Es) of these chelating agents have been correlated with property parameters namely, log P, log Koc, substituent-constant sigma, Mullikan and ESP derived charges using multiple regression analysis. Es of substituted-cupferrons in the flotation of a uranium ore could be predicted within experimental error either by log P or log Koc and an electronic parameter. However, when a halo, methoxy or phenyl substituent was in para to the chelating group, experimental Es was greater than the predicted values. Inclusion of a Boolean type indicative parameter improved significantly the predictability power. This approach has been extended to 2-aminothiophenols that were used to float a zinc ore and the correlations were found to be reasonably good.
Gong, Jian; Duan, Dandan; Yang, Yu; Ran, Yong; Chen, Diyun
2016-12-01
Endocrine disrupting chemicals (EDCs) were seasonally investigated in surface water, suspended particulate matter, and sediments of the Pearl River Delta (PRD), South China. EDC concentrations in the surface water were generally higher in the summer than in winter. The surface water in the investigated rivers was heavily contaminated by the phenolic xenoestrogens. Moreover, the in-situ log K soc and log K poc values and their regression with log K ow in the field experiments suggest that binding mechanisms other than hydrophobic interaction are present for the sedimentary organic carbon and particulate organic carbon (SOC/POC). The logK soc -logK ow and logK poc -logK ow regression analyses imply that higher complexity of nonhydrophobic interactions with EDCs is present on the SOC samples comparing with the POC samples, which is related to their different sources. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Saunders, Ian; Ottemöller, Lars; Brandt, Martin B. C.; Fourie, Christoffel J. S.
2013-04-01
A relation to determine local magnitude (M L) based on the original Richter definition is empirically derived from synthetic Wood-Anderson seismograms recorded by the South African National Seismograph Network. In total, 263 earthquakes in the distance range 10 to 1,000 km, yielding 1,681 trace amplitudes measured in nanometers from synthesized Wood-Anderson records on the vertical channel, were used to derive an attenuation relation appropriate for South Africa through multiple regression analysis. Additionally, station corrections were determined for 26 stations during the regression analysis, resulting in values ranging between -0.31 and 0.50. The most appropriate M L scale for South Africa from this study satisfies the equation: M L = log10(A) + 1.149 log10(R) + 0.00063R - 2.04 - S. The anelastic attenuation term derived from this study indicates that ground motion attenuation is significantly different from that of Southern California but comparable with stable continental regions.
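A relation of this form can be evaluated directly; a minimal sketch (amplitude A in nm, hypocentral distance R in km, S a station correction; the constant is applied here as a -2.04 offset, consistent with comparable nm-based local magnitude scales):

```python
import math

def local_magnitude(amp_nm, dist_km, station_corr=0.0):
    """M_L = log10(A) + 1.149*log10(R) + 0.00063*R - 2.04 - S,
    with A the Wood-Anderson amplitude in nm and R in km."""
    return (math.log10(amp_nm) + 1.149 * math.log10(dist_km)
            + 0.00063 * dist_km - 2.04 - station_corr)
```

For example, a 1,000 nm amplitude at 100 km gives M L = 3 + 2.298 + 0.063 - 2.04 = 3.321, reduced further by any positive station correction.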
Harvard, Stephanie; Guh, Daphne; Bansback, Nick; Richette, Pascal; Saraux, Alain; Fautrel, Bruno; Anis, Aslam H
2017-10-01
To evaluate a classification system to define adherence to axial spondyloarthritis (axSpA) anti-tumor necrosis factor (anti-TNF) use recommendations and examine the effect of adherence on outcomes in the DESIR cohort (Devenir des Spondylarthropathies Indifférenciées Récentes). Using alternate definitions of adherence, patients were classified as adherent "timely" anti-TNF users, nonadherent "late" anti-TNF users, adherent nonusers ("no anti-TNF need"), or nonadherent nonusers ("unmet anti-TNF need"). Multivariate models were fitted to examine the effect of adherence on quality-adjusted life-years (QALY), total costs, and nonbiologic costs 1 year following an index date. Generalized linear regression models assuming a γ-distribution with log link were used for cost outcomes, and linear regression models for QALY outcomes. Using the main definition of adherence, there were no significant differences between late anti-TNF users and timely anti-TNF users in total costs (RR 0.86, 95% CI 0.54-1.36, p = 0.516) or nonbiologic costs (RR 0.72, 95% CI 0.44-1.18, p = 0.187). However, in the sensitivity analysis, late anti-TNF users had significantly increased nonbiologic costs compared with timely users (RR 1.58, 95% CI 1.06-2.36, p = 0.026). In the main analysis, there were no significant differences in QALY between timely anti-TNF users and late anti-TNF users, or between timely users and patients with unmet anti-TNF need. In the sensitivity analysis, patients with unmet anti-TNF need had significantly lower QALY than timely anti-TNF users (-0.04, 95% CI -0.07 to -0.01, p = 0.016). The effect of adherence to anti-TNF recommendations on outcomes was sensitive to the definition of adherence used, highlighting the need to validate methods to measure adherence.
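The γ-distribution/log-link cost model can be sketched with a minimal IRLS fit on simulated data; the covariate name, effect sizes, and sample size below are illustrative assumptions, not the DESIR data.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
late_user = rng.integers(0, 2, n)             # hypothetical adherence indicator
X = np.column_stack([np.ones(n), late_user])  # intercept + group
mu_true = np.exp(7.0 + 0.3 * late_user)       # mean cost on the natural scale
shape = 2.0
costs = rng.gamma(shape, mu_true / shape)     # gamma outcome with mean mu_true

# Start from OLS on log(y); for the log link with gamma variance V(mu) = mu^2
# the IRLS working weights are constant, so each step is just OLS on the
# working response z = eta + (y - mu)/mu.
beta, *_ = np.linalg.lstsq(X, np.log(costs), rcond=None)
for _ in range(100):
    eta = X @ beta
    mu = np.exp(eta)
    z = eta + (costs - mu) / mu
    beta_new, *_ = np.linalg.lstsq(X, z, rcond=None)
    if np.max(np.abs(beta_new - beta)) < 1e-12:
        beta = beta_new
        break
    beta = beta_new

rate_ratio = float(np.exp(beta[1]))  # multiplicative effect of the group on mean cost
```

Exponentiating a coefficient from the log-link model gives the rate ratio reported in the study (here it recovers the simulated ratio of about exp(0.3)).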
Hammer, Jort; Haftka, Joris J-H; Scherpenisse, Peter; Hermens, Joop L M; de Voogt, Pim W P
2017-02-01
To predict the fate and potential effects of organic contaminants, information about their hydrophobicity is required. However, common parameters to describe the hydrophobicity of organic compounds (e.g., the octanol-water partition constant [K OW]) have proved to be inadequate for ionic and nonionic surfactants because of their surface-active properties. As an alternative approach to determining their hydrophobicity, the aim of the present study was therefore to measure the retention of a wide range of surfactants on a C 18 stationary phase. Capacity factors in pure water (k' 0) increased linearly with increasing number of carbon atoms in the surfactant structure. Fragment contribution values were determined for each structural unit with multilinear regression, and the results were consistent with the expected influence of these fragments on the hydrophobicity of surfactants. Capacity factors of reference compounds and log K OW values from the literature were used to estimate log K OW values for surfactants (log K OW,HPLC). These log K OW,HPLC values were also compared to log K OW values calculated with 4 computational programs: KOWWIN, Marvin calculator, SPARC, and COSMOThermX. In conclusion, capacity factors from a C 18 stationary phase are found to better reflect the hydrophobicity of surfactants than their K OW values. Environ Toxicol Chem 2017;36:329-336. © 2016 The Authors. Environmental Toxicology and Chemistry Published by Wiley Periodicals, Inc. on behalf of SETAC.
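The fragment-contribution step reduces to an ordinary least-squares problem on fragment counts; a minimal sketch with a hypothetical two-fragment scheme (the fragments, counts, and contribution values are invented for illustration, not the study's data):

```python
import numpy as np

# Rows: surfactants; columns: counts of (CH2 units, ethoxylate units).
counts = np.array([[8, 0], [10, 0], [12, 0], [10, 2], [12, 4], [14, 2]], float)
true = np.array([0.30, 0.50, -0.20])   # intercept, CH2, EO contributions (assumed)
X = np.column_stack([np.ones(len(counts)), counts])
log_k0 = X @ true                      # noiseless synthetic retention data

# Multilinear regression recovers the per-fragment increments: here each
# extra CH2 adds 0.50 log units to the capacity factor, each EO unit -0.20.
coef, *_ = np.linalg.lstsq(X, log_k0, rcond=None)
```

With real capacity factors the fit is noisy, but the design matrix of fragment counts is the same.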
Regression Commonality Analysis: A Technique for Quantitative Theory Building
ERIC Educational Resources Information Center
Nimon, Kim; Reio, Thomas G., Jr.
2011-01-01
When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
Anderson, S.C.; Kupfer, J.A.; Wilson, R.R.; Cooper, R.J.
2000-01-01
The purpose of this research was to develop a model that could be used to provide a spatial representation of uneven-aged silvicultural treatments on forest crown area. We began by developing species-specific linear regression equations relating tree DBH to crown area for eight bottomland tree species at White River National Wildlife Refuge, Arkansas, USA. The relationships were highly significant for all species, with coefficients of determination (r²) ranging from 0.37 for Ulmus crassifolia to nearly 0.80 for Quercus nuttallii and Taxodium distichum. We next located and measured the diameters of more than 4000 stumps from a single-tree and group-selection timber harvest. Stump locations were recorded with respect to an established grid point system and entered into a Geographic Information System (ARC/INFO). The area occupied by the crown of each logged individual was then estimated by using the stump dimensions (adjusted to DBHs) and the regression equations relating tree DBH to crown area. Our model projected that the selection cuts removed roughly 300 m² of basal area from the logged sites, resulting in the loss of approximately 55,000 m² of crown area. The model developed in this research represents a tool that can be used in conjunction with remote sensing applications to assist in forest inventory and management, as well as to estimate the impacts of selective timber harvest on wildlife.
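The stump-to-crown-area pipeline can be sketched as follows; the regression coefficients and the stump-to-DBH conversion are hypothetical placeholders, not the fitted values from the study.

```python
# Species-specific linear models crown_area_m2 = a + b * dbh_cm.
# All coefficients here are made-up placeholders for illustration.
CROWN_EQ = {
    "Quercus nuttallii": (-4.0, 1.9),
    "Taxodium distichum": (-2.5, 1.2),
}

def stump_to_dbh(stump_diam_cm):
    """Adjust stump diameter to DBH (a simple proportional taper, assumed)."""
    return 0.9 * stump_diam_cm

def crown_area_lost(stumps):
    """Sum estimated crown area (m^2) over (species, stump diameter cm) pairs."""
    total = 0.0
    for species, stump_cm in stumps:
        a, b = CROWN_EQ[species]
        total += a + b * stump_to_dbh(stump_cm)
    return total

# Two logged stems as a toy "harvest" record.
harvest = [("Quercus nuttallii", 60.0), ("Taxodium distichum", 80.0)]
```

Applied over the thousands of geolocated stumps, the same per-stem estimate yields the spatial crown-loss map described above.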
Ciura, Krzesimir; Belka, Mariusz; Kawczak, Piotr; Bączek, Tomasz; Markuszewski, Michał J; Nowakowska, Joanna
2017-09-05
The objective of this paper is to build QSRR/QSAR models for predicting blood-brain barrier (BBB) permeability. The obtained models are based on salting-out thin-layer chromatography (SOTLC) constants and calculated molecular descriptors. Among chromatographic methods, SOTLC was chosen because its mobile phases are free of organic solvents; as a consequence, they are less toxic and have a lower environmental impact than classical reversed-phase liquid chromatography (RPLC). Three stationary phases were examined: silica gel, cellulose plates, and neutral aluminum oxide. The model set of solutes covers a wide range of log BB values, containing compounds that cross the BBB readily as well as molecules poorly distributed to the brain, including drugs acting on the nervous system and peripherally acting drugs. Additionally, three regression models were compared: multiple linear regression (MLR), partial least squares (PLS), and orthogonal partial least squares (OPLS). The designed QSRR/QSAR models could be useful for predicting the BBB permeability of newly synthesized compounds in the drug development pipeline and are attractive alternatives to time-consuming and demanding direct methods for log BB measurement. The study also showed that significant differences in model performance, as measured by R² and Q², can be obtained among regression techniques; hence it is strongly suggested to evaluate all available options, i.e., MLR, PLS, and OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
Thackeray, J F; Dykes, S
2016-02-01
Thackeray has previously explored the possibility of using a morphometric approach to quantify the "amount" of variation within species and to assess probabilities of conspecificity when two fossil specimens are compared, instead of "pigeon-holing" them into discrete species. In an attempt to obtain a statistical (probabilistic) definition of a species, Thackeray has recognized an approximation of a biological species constant (T=-1.61) based on the log-transformed standard error of the coefficient m (log sem) in regression analysis of cranial and other data from pairs of specimens of conspecific extant species, associated with regression equations of the form y=mx+c, where m is the slope and c is the intercept, using measurements of any specimen A (x axis) and any specimen B of the same species (y axis). The log-transformed standard error of the coefficient m (log sem) is a measure of the degree of similarity between pairs of specimens, and in this study shows central tendency around a mean value of -1.61 and a standard deviation of 0.10 for modern conspecific specimens. In this paper we focus attention on the need to take into account the range of difference in log sem values (Δlog sem or "delta log sem") obtained from comparisons when, first, specimen A (x axis) is compared to B (y axis) and, second, when specimen A (y axis) is compared to B (x axis). Thackeray's approach can be refined to focus on high probabilities of conspecificity for pairs of specimens for which log sem is less than -1.61 and for which Δlog sem is less than 0.03. We appeal for the adoption of a concept here called "sigma taxonomy" (as opposed to "alpha taxonomy"), recognizing that boundaries between species are not always well defined. Copyright © 2015 Elsevier GmbH. All rights reserved.
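The pairwise statistic can be sketched directly from its definition: regress the measurements of one specimen on those of the other, take log10 of the standard error of the slope m, and obtain Δlog sem by swapping the axes. The measurement values below are hypothetical.

```python
import numpy as np

def log_sem(x, y):
    """log10 of the standard error of the slope in the OLS fit y = m*x + c."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    m, c = np.polyfit(x, y, 1)
    resid = y - (m * x + c)
    s2 = np.sum(resid**2) / (n - 2)                      # residual variance
    se_m = np.sqrt(s2 / np.sum((x - x.mean())**2))       # SE of the slope m
    return np.log10(se_m)

# Hypothetical homologous cranial measurements (mm) for two specimens A and B.
A = [98.0, 112.0, 75.0, 130.0, 64.0, 88.0, 105.0]
B = [101.0, 115.0, 73.0, 133.0, 66.0, 85.0, 108.0]

d_log_sem = abs(log_sem(A, B) - log_sem(B, A))  # the paper's "delta log sem"
```

Similar specimens yield strongly negative log sem (a tightly determined slope); conspecificity would then be assessed against the -1.61 reference and the Δlog sem < 0.03 criterion.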
Accounting for measurement error in log regression models with applications to accelerated testing.
Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M
2018-01-01
In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
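The weighted-regression approximation is estimated by iteratively re-weighted least squares; a generic sketch follows, in which the Huber-style weight rule and simulated data are assumed stand-ins, not the paper's measurement-error-derived weights.

```python
import numpy as np

def wls(X, y, w):
    """One weighted least-squares solve: minimizes sum_i w_i*(y_i - x_i @ b)^2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def irls(X, y, weight_fn, n_iter=20):
    """Iteratively re-weighted least squares with a caller-supplied weight rule."""
    beta = wls(X, y, np.ones(len(y)))        # start from ordinary least squares
    for _ in range(n_iter):
        w = weight_fn(y - X @ beta)          # re-derive weights from residuals
        beta = wls(X, y, w)
    return beta

def huber_weights(r, scale=0.1, k=1.345):
    """Downweight large residuals (an assumed choice of weight rule)."""
    return np.minimum(1.0, k * scale / np.maximum(np.abs(r), 1e-12))

# Simulated line y = 1 + 2x with small noise and a few gross outliers.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.1, size=200)
y[:5] += 10.0
beta = irls(X, y, huber_weights)
```

The same skeleton applies with any weight derivation: only `weight_fn` changes when the weights come from a measurement-error approximation rather than robustness considerations.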
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
NASA Astrophysics Data System (ADS)
Rahman, Md Mushfiqur; Lei, Yu; Kalantzis, Georgios
2018-01-01
Quality Assurance (QA) for medical linear accelerators (linacs) is one of the primary concerns in external beam radiation therapy. Continued advancements in clinical accelerators and computer control technology make QA procedures more complex and time consuming, often requiring dedicated software accompanied by specific phantoms. To ameliorate this, we introduce QALMA (Quality Assurance for Linac with MATLAB), a MATLAB toolkit which aims to simplify the quantitative analysis of QA for linacs, including Star-Shot analysis, the Picket Fence test, the Winston-Lutz test, Multileaf Collimator (MLC) log file analysis, and verification of light and radiation field coincidence.
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
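Class (1), an unweighted regression line with bootstrap resampling of the slope uncertainty, can be sketched as follows (synthetic data standing in for an astronomical sample):

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic linear relation y = 2 + 0.5x with scatter (illustrative only).
x = rng.uniform(0, 10, 80)
y = 2.0 + 0.5 * x + rng.normal(scale=0.4, size=80)

def slope(xs, ys):
    return np.polyfit(xs, ys, 1)[0]

# Bootstrap: resample (x, y) pairs with replacement, refit, read off the spread.
boot = np.empty(2000)
for i in range(2000):
    idx = rng.integers(0, len(x), len(x))
    boot[i] = slope(x[idx], y[idx])

slope_hat = slope(x, y)
slope_err = boot.std(ddof=1)   # bootstrap standard error of the slope
```

The jackknife variant deletes one point at a time instead of resampling; both avoid the distributional assumptions behind the analytic slope error.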
Police work stressors and cardiac vagal control.
Andrew, Michael E; Violanti, John M; Gu, Ja K; Fekedulegn, Desta; Li, Shengqiao; Hartley, Tara A; Charles, Luenda E; Mnatsakanova, Anna; Miller, Diane B; Burchfiel, Cecil M
2017-09-10
This study examines relationships between the frequency and intensity of police work stressors and cardiac vagal control, estimated using the high frequency component of heart rate variability (HRV). This is a cross-sectional study of 360 officers from the Buffalo New York Police Department. Police stress was measured using the Spielberger police stress survey, which includes exposure indices created as the product of the self-evaluation of how stressful certain events were and the self-reported frequency with which they occurred. Vagal control was estimated using the high frequency component of resting HRV calculated in units of milliseconds squared and reported in natural log scale. Associations between police work stressors and vagal control were examined using linear regression for significance testing and analysis of covariance for descriptive purposes, stratified by gender, and adjusted for age and race/ethnicity. There were no significant associations between police work stressor exposure indices and vagal control among men. Among women, the inverse associations between the lack of support stressor exposure and vagal control were statistically significant in adjusted models for indices of exposure over the past year (lowest stressor quartile: M = 5.57, 95% CI 5.07 to 6.08, and highest stressor quartile: M = 5.02, 95% CI 4.54 to 5.51, test of association from continuous linear regression of vagal control on lack of support stressor β = -0.273, P = .04). This study supports an inverse association between lack of organizational support and vagal control among female but not male police officers. © 2017 Wiley Periodicals, Inc.
CAG repeat expansion in Huntington disease determines age at onset in a fully dominant fashion
Lee, J.-M.; Ramos, E.M.; Lee, J.-H.; Gillis, T.; Mysore, J.S.; Hayden, M.R.; Warby, S.C.; Morrison, P.; Nance, M.; Ross, C.A.; Margolis, R.L.; Squitieri, F.; Orobello, S.; Di Donato, S.; Gomez-Tortosa, E.; Ayuso, C.; Suchowersky, O.; Trent, R.J.A.; McCusker, E.; Novelletto, A.; Frontali, M.; Jones, R.; Ashizawa, T.; Frank, S.; Saint-Hilaire, M.H.; Hersch, S.M.; Rosas, H.D.; Lucente, D.; Harrison, M.B.; Zanko, A.; Abramson, R.K.; Marder, K.; Sequeiros, J.; Paulsen, J.S.; Landwehrmeyer, G.B.; Myers, R.H.; MacDonald, M.E.; Durr, Alexandra; Rosenblatt, Adam; Frati, Luigi; Perlman, Susan; Conneally, Patrick M.; Klimek, Mary Lou; Diggin, Melissa; Hadzi, Tiffany; Duckett, Ayana; Ahmed, Anwar; Allen, Paul; Ames, David; Anderson, Christine; Anderson, Karla; Anderson, Karen; Andrews, Thomasin; Ashburner, John; Axelson, Eric; Aylward, Elizabeth; Barker, Roger A.; Barth, Katrin; Barton, Stacey; Baynes, Kathleen; Bea, Alexandra; Beall, Erik; Beg, Mirza Faisal; Beglinger, Leigh J.; Biglan, Kevin; Bjork, Kristine; Blanchard, Steve; Bockholt, Jeremy; Bommu, Sudharshan Reddy; Brossman, Bradley; Burrows, Maggie; Calhoun, Vince; Carlozzi, Noelle; Chesire, Amy; Chiu, Edmond; Chua, Phyllis; Connell, R.J.; Connor, Carmela; Corey-Bloom, Jody; Craufurd, David; Cross, Stephen; Cysique, Lucette; Santos, Rachelle Dar; Davis, Jennifer; Decolongon, Joji; DiPietro, Anna; Doucette, Nicholas; Downing, Nancy; Dudler, Ann; Dunn, Steve; Ecker, Daniel; Epping, Eric A.; Erickson, Diane; Erwin, Cheryl; Evans, Ken; Factor, Stewart A.; Farias, Sarah; Fatas, Marta; Fiedorowicz, Jess; Fullam, Ruth; Furtado, Sarah; Garde, Monica Bascunana; Gehl, Carissa; Geschwind, Michael D.; Goh, Anita; Gooblar, Jon; Goodman, Anna; Griffith, Jane; Groves, Mark; Guttman, Mark; Hamilton, Joanne; Harrington, Deborah; Harris, Greg; Heaton, Robert K.; Helmer, Karl; Henneberry, Machelle; Hershey, Tamara; Herwig, Kelly; Howard, Elizabeth; Hunter, Christine; Jankovic, Joseph; Johnson, Hans; Johnson, Arik; Jones, Kathy; Juhl, 
Andrew; Kim, Eun Young; Kimble, Mycah; King, Pamela; Klimek, Mary Lou; Klöppel, Stefan; Koenig, Katherine; Komiti, Angela; Kumar, Rajeev; Langbehn, Douglas; Leavitt, Blair; Leserman, Anne; Lim, Kelvin; Lipe, Hillary; Lowe, Mark; Magnotta, Vincent A.; Mallonee, William M.; Mans, Nicole; Marietta, Jacquie; Marshall, Frederick; Martin, Wayne; Mason, Sarah; Matheson, Kirsty; Matson, Wayne; Mazzoni, Pietro; McDowell, William; Miedzybrodzka, Zosia; Miller, Michael; Mills, James; Miracle, Dawn; Montross, Kelsey; Moore, David; Mori, Sasumu; Moser, David J.; Moskowitz, Carol; Newman, Emily; Nopoulos, Peg; Novak, Marianne; O'Rourke, Justin; Oakes, David; Ondo, William; Orth, Michael; Panegyres, Peter; Pease, Karen; Perlman, Susan; Perlmutter, Joel; Peterson, Asa; Phillips, Michael; Pierson, Ron; Potkin, Steve; Preston, Joy; Quaid, Kimberly; Radtke, Dawn; Rae, Daniela; Rao, Stephen; Raymond, Lynn; Reading, Sarah; Ready, Rebecca; Reece, Christine; Reilmann, Ralf; Reynolds, Norm; Richardson, Kylie; Rickards, Hugh; Ro, Eunyoe; Robinson, Robert; Rodnitzky, Robert; Rogers, Ben; Rosenblatt, Adam; Rosser, Elisabeth; Rosser, Anne; Price, Kathy; Price, Kathy; Ryan, Pat; Salmon, David; Samii, Ali; Schumacher, Jamy; Schumacher, Jessica; Sendon, Jose Luis Lópenz; Shear, Paula; Sheinberg, Alanna; Shpritz, Barnett; Siedlecki, Karen; Simpson, Sheila A.; Singer, Adam; Smith, Jim; Smith, Megan; Smith, Glenn; Snyder, Pete; Song, Allen; Sran, Satwinder; Stephan, Klaas; Stober, Janice; Sü?muth, Sigurd; Suter, Greg; Tabrizi, Sarah; Tempkin, Terry; Testa, Claudia; Thompson, Sean; Thomsen, Teri; Thumma, Kelli; Toga, Arthur; Trautmann, Sonja; Tremont, Geoff; Turner, Jessica; Uc, Ergun; Vaccarino, Anthony; van Duijn, Eric; Van Walsem, Marleen; Vik, Stacie; Vonsattel, Jean Paul; Vuletich, Elizabeth; Warner, Tom; Wasserman, Paula; Wassink, Thomas; Waterman, Elijah; Weaver, Kurt; Weir, David; Welsh, Claire; Werling-Witkoske, Chris; Wesson, Melissa; Westervelt, Holly; Weydt, Patrick; Wheelock, Vicki; 
Williams, Kent; Williams, Janet; Wodarski, Mary; Wojcieszek, Joanne; Wood, Jessica; Wood-Siverio, Cathy; Wu, Shuhua; Yastrubetskaya, Olga; de Yebenes, Justo Garcia; Zhao, Yong Qiang; Zimbelman, Janice; Zschiegner, Roland; Aaserud, Olaf; Abbruzzese, Giovanni; Andrews, Thomasin; Andrich, Jurgin; Antczak, Jakub; Arran, Natalie; Artiga, Maria J. Saiz; Bachoud-Lévi, Anne-Catherine; Banaszkiewicz, Krysztof; di Poggio, Monica Bandettini; Bandmann, Oliver; Barbera, Miguel A.; Barker, Roger A.; Barrero, Francisco; Barth, Katrin; Bas, Jordi; Beister, Antoine; Bentivoglio, Anna Rita; Bertini, Elisabetta; Biunno, Ida; Bjørgo, Kathrine; Bjørnevoll, Inga; Bohlen, Stefan; Bonelli, Raphael M.; Bos, Reineke; Bourne, Colin; Bradbury, Alyson; Brockie, Peter; Brown, Felicity; Bruno, Stefania; Bryl, Anna; Buck, Andrea; Burg, Sabrina; Burgunder, Jean-Marc; Burns, Peter; Burrows, Liz; Busquets, Nuria; Busse, Monica; Calopa, Matilde; Carruesco, Gemma T.; Casado, Ana Gonzalez; Catena, Judit López; Chu, Carol; Ciesielska, Anna; Clapton, Jackie; Clayton, Carole; Clenaghan, Catherine; Coelho, Miguel; Connemann, Julia; Craufurd, David; Crooks, Jenny; Cubillo, Patricia Trigo; Cubo, Esther; Curtis, Adrienne; De Michele, Giuseppe; De Nicola, A.; de Souza, Jenny; de Weert, A. Marit; de Yébenes, Justo Garcia; Dekker, M.; Descals, A. 
Martínez; Di Maio, Luigi; Di Pietro, Anna; Dipple, Heather; Dose, Matthias; Dumas, Eve M.; Dunnett, Stephen; Ecker, Daniel; Elifani, F.; Ellison-Rose, Lynda; Elorza, Marina D.; Eschenbach, Carolin; Evans, Carole; Fairtlough, Helen; Fannemel, Madelein; Fasano, Alfonso; Fenollar, Maria; Ferrandes, Giovanna; Ferreira, Jaoquim J.; Fillingham, Kay; Finisterra, Ana Maria; Fisher, K.; Fletcher, Amy; Foster, Jillian; Foustanos, Isabella; Frech, Fernando A.; Fullam, Robert; Fullham, Ruth; Gago, Miguel; García, RocioGarcía-Ramos; García, Socorro S.; Garrett, Carolina; Gellera, Cinzia; Gill, Paul; Ginestroni, Andrea; Golding, Charlotte; Goodman, Anna; Gørvell, Per; Grant, Janet; Griguoli, A.; Gross, Diana; Guedes, Leonor; BascuñanaGuerra, Monica; Guerra, Maria Rosalia; Guerrero, Rosa; Guia, Dolores B.; Guidubaldi, Arianna; Hallam, Caroline; Hamer, Stephanie; Hammer, Kathrin; Handley, Olivia J.; Harding, Alison; Hasholt, Lis; Hedge, Reikha; Heiberg, Arvid; Heinicke, Walburgis; Held, Christine; Hernanz, Laura Casas; Herranhof, Briggitte; Herrera, Carmen Durán; Hidding, Ute; Hiivola, Heli; Hill, Susan; Hjermind, Lena. 
E.; Hobson, Emma; Hoffmann, Rainer; Holl, Anna Hödl; Howard, Liz; Hunt, Sarah; Huson, Susan; Ialongo, Tamara; Idiago, Jesus Miguel R.; Illmann, Torsten; Jachinska, Katarzyna; Jacopini, Gioia; Jakobsen, Oda; Jamieson, Stuart; Jamrozik, Zygmunt; Janik, Piotr; Johns, Nicola; Jones, Lesley; Jones, Una; Jurgens, Caroline K.; Kaelin, Alain; Kalbarczyk, Anna; Kershaw, Ann; Khalil, Hanan; Kieni, Janina; Klimberg, Aneta; Koivisto, Susana P.; Koppers, Kerstin; Kosinski, Christoph Michael; Krawczyk, Malgorzata; Kremer, Berry; Krysa, Wioletta; Kwiecinski, Hubert; Lahiri, Nayana; Lambeck, Johann; Lange, Herwig; Laver, Fiona; Leenders, K.L.; Levey, Jamie; Leythaeuser, Gabriele; Lezius, Franziska; Llesoy, Joan Roig; Löhle, Matthias; López, Cristobal Diez-Aja; Lorenza, Fortuna; Loria, Giovanna; Magnet, Markus; Mandich, Paola; Marchese, Roberta; Marcinkowski, Jerzy; Mariotti, Caterina; Mariscal, Natividad; Markova, Ivana; Marquard, Ralf; Martikainen, Kirsti; Martínez, Isabel Haro; Martínez-Descals, Asuncion; Martino, T.; Mason, Sarah; McKenzie, Sue; Mechi, Claudia; Mendes, Tiago; Mestre, Tiago; Middleton, Julia; Milkereit, Eva; Miller, Joanne; Miller, Julie; Minster, Sara; Möller, Jens Carsten; Monza, Daniela; Morales, Blas; Moreau, Laura V.; Moreno, Jose L. 
López-Sendón; Münchau, Alexander; Murch, Ann; Nielsen, Jørgen E.; Niess, Anke; Nørremølle, Anne; Novak, Marianne; O'Donovan, Kristy; Orth, Michael; Otti, Daniela; Owen, Michael; Padieu, Helene; Paganini, Marco; Painold, Annamaria; Päivärinta, Markku; Partington-Jones, Lucy; Paterski, Laurent; Paterson, Nicole; Patino, Dawn; Patton, Michael; Peinemann, Alexander; Peppa, Nadia; Perea, Maria Fuensanta Noguera; Peterson, Maria; Piacentini, Silvia; Piano, Carla; Càrdenas, Regina Pons i; Prehn, Christian; Price, Kathleen; Probst, Daniela; Quarrell, Oliver; Quiroga, Purificacion Pin; Raab, Tina; Rakowicz, Maryla; Raman, Ashok; Raymond, Lucy; Reilmann, Ralf; Reinante, Gema; Reisinger, Karin; Retterstol, Lars; Ribaï, Pascale; Riballo, Antonio V.; Ribas, Guillermo G.; Richter, Sven; Rickards, Hugh; Rinaldi, Carlo; Rissling, Ida; Ritchie, Stuart; Rivera, Susana Vázquez; Robert, Misericordia Floriach; Roca, Elvira; Romano, Silvia; Romoli, Anna Maria; Roos, Raymond A.C.; Røren, Niini; Rose, Sarah; Rosser, Elisabeth; Rosser, Anne; Rossi, Fabiana; Rothery, Jean; Rudzinska, Monika; Ruíz, Pedro J. 
García; Ruíz, Belan Garzon; Russo, Cinzia Valeria; Ryglewicz, Danuta; Saft, Carston; Salvatore, Elena; Sánchez, Vicenta; Sando, Sigrid Botne; Šašinková, Pavla; Sass, Christian; Scheibl, Monika; Schiefer, Johannes; Schlangen, Christiane; Schmidt, Simone; Schöggl, Helmut; Schrenk, Caroline; Schüpbach, Michael; Schuierer, Michele; Sebastián, Ana Rojo; Selimbegovic-Turkovic, Amina; Sempolowicz, Justyna; Silva, Mark; Sitek, Emilia; Slawek, Jaroslaw; Snowden, Julie; Soleti, Francesco; Soliveri, Paola; Sollom, Andrea; Soltan, Witold; Sorbi, Sandro; Sorensen, Sven Asger; Spadaro, Maria; Städtler, Michael; Stamm, Christiane; Steiner, Tanja; Stokholm, Jette; Stokke, Bodil; Stopford, Cheryl; Storch, Alexander; Straßburger, Katrin; Stubbe, Lars; Sulek, Anna; Szczudlik, Andrzej; Tabrizi, Sarah; Taylor, Rachel; Terol, Santiago Duran-Sindreu; Thomas, Gareth; Thompson, Jennifer; Thomson, Aileen; Tidswell, Katherine; Torres, Maria M. Antequera; Toscano, Jean; Townhill, Jenny; Trautmann, Sonja; Tucci, Tecla; Tuuha, Katri; Uhrova, Tereza; Valadas, Anabela; van Hout, Monique S.E.; van Oostrom, J.C.H.; van Vugt, Jeroen P.P.; vanm, Walsem Marleen R.; Vandenberghe, Wim; Verellen-Dumoulin, Christine; Vergara, Mar Ruiz; Verstappen, C.C.P.; Verstraelen, Nichola; Viladrich, Celia Mareca; Villanueva, Clara; Wahlström, Jan; Warner, Thomas; Wehus, Raghild; Weindl, Adolf; Werner, Cornelius J.; Westmoreland, Leann; Weydt, Patrick; Wiedemann, Alexandra; Wild, Edward; Wild, Sue; Witjes-Ané, Marie-Noelle; Witkowski, Grzegorz; Wójcik, Magdalena; Wolz, Martin; Wolz, Annett; Wright, Jan; Yardumian, Pam; Yates, Shona; Yudina, Elizaveta; Zaremba, Jacek; Zaugg, Sabine W.; Zdzienicka, Elzbieta; Zielonka, Daniel; Zielonka, Euginiusz; Zinzi, Paola; Zittel, Simone; Zucker, Birgrit; Adams, John; Agarwal, Pinky; Antonijevic, Irina; Beck, Christopher; Chiu, Edmond; Churchyard, Andrew; Colcher, Amy; Corey-Bloom, Jody; Dorsey, Ray; Drazinic, Carolyn; Dubinsky, Richard; Duff, Kevin; Factor, Stewart; Foroud, 
Tatiana; Furtado, Sarah; Giuliano, Joe; Greenamyre, Timothy; Higgins, Don; Jankovic, Joseph; Jennings, Dana; Kang, Un Jung; Kostyk, Sandra; Kumar, Rajeev; Leavitt, Blair; LeDoux, Mark; Mallonee, William; Marshall, Frederick; Mohlo, Eric; Morgan, John; Oakes, David; Panegyres, Peter; Panisset, Michel; Perlman, Susan; Perlmutter, Joel; Quaid, Kimberly; Raymond, Lynn; Revilla, Fredy; Robertson, Suzanne; Robottom, Bradley; Sanchez-Ramos, Juan; Scott, Burton; Shannon, Kathleen; Shoulson, Ira; Singer, Carlos; Tabbal, Samer; Testa, Claudia; van, Kammen Dan; Vetter, Louise; Walker, Francis; Warner, John; Weiner, illiam; Wheelock, Vicki; Yastrubetskaya, Olga; Barton, Stacey; Broyles, Janice; Clouse, Ronda; Coleman, Allison; Davis, Robert; Decolongon, Joji; DeLaRosa, Jeanene; Deuel, Lisa; Dietrich, Susan; Dubinsky, Hilary; Eaton, Ken; Erickson, Diane; Fitzpatrick, Mary Jane; Frucht, Steven; Gartner, Maureen; Goldstein, Jody; Griffith, Jane; Hickey, Charlyne; Hunt, Victoria; Jaglin, Jeana; Klimek, Mary Lou; Lindsay, Pat; Louis, Elan; Loy, Clemet; Lucarelli, Nancy; Malarick, Keith; Martin, Amanda; McInnis, Robert; Moskowitz, Carol; Muratori, Lisa; Nucifora, Frederick; O'Neill, Christine; Palao, Alicia; Peavy, Guerry; Quesada, Monica; Schmidt, Amy; Segro, Vicki; Sperin, Elaine; Suter, Greg; Tanev, Kalo; Tempkin, Teresa; Thiede, Curtis; Wasserman, Paula; Welsh, Claire; Wesson, Melissa; Zauber, Elizabeth
2012-01-01
Objective: Age at onset of diagnostic motor manifestations in Huntington disease (HD) is strongly correlated with an expanded CAG trinucleotide repeat. The length of the normal CAG repeat allele has been reported also to influence age at onset, in interaction with the expanded allele. Due to profound implications for disease mechanism and modification, we tested whether the normal allele, interaction between the expanded and normal alleles, or presence of a second expanded allele affects age at onset of HD motor signs. Methods: We modeled natural log-transformed age at onset as a function of CAG repeat lengths of expanded and normal alleles and their interaction by linear regression. Results: An apparently significant effect of interaction on age at motor onset among 4,068 subjects was dependent on a single outlier data point. A rigorous statistical analysis with a well-behaved dataset that conformed to the fundamental assumptions of linear regression (e.g., constant variance and normally distributed error) revealed significance only for the expanded CAG repeat, with no effect of the normal CAG repeat. Ten subjects with 2 expanded alleles showed an age at motor onset consistent with the length of the larger expanded allele. Conclusions: Normal allele CAG length, interaction between expanded and normal alleles, and presence of a second expanded allele do not influence age at onset of motor manifestations, indicating that the rate of HD pathogenesis leading to motor diagnosis is determined by a completely dominant action of the longest expanded allele and as yet unidentified genetic or environmental factors. Neurology® 2012;78:690–695 PMID:22323755
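The regression model, natural-log age at onset on expanded and normal CAG lengths plus their interaction, can be sketched on simulated data in which, matching the study's conclusion, only the expanded allele carries an effect (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
cag_exp = rng.integers(40, 56, n).astype(float)   # expanded-allele repeat lengths
cag_norm = rng.integers(15, 26, n).astype(float)  # normal-allele repeat lengths
# Simulate onset driven only by the expanded allele (effect sizes assumed).
log_onset = 6.0 - 0.05 * cag_exp + rng.normal(scale=0.1, size=n)

# Design: intercept, expanded, normal, and expanded x normal interaction.
X = np.column_stack([np.ones(n), cag_exp, cag_norm, cag_exp * cag_norm])
beta, *_ = np.linalg.lstsq(X, log_onset, rcond=None)
# The marginal slope for the expanded allele, beta[1] + beta[3]*mean(cag_norm),
# recovers -0.05; the normal allele's marginal slope hovers near zero.
```

Checking the marginal slopes (rather than raw coefficients) sidesteps the collinearity between the main effects and the uncentered interaction term.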
Lorenzo-Seva, Urbano; Ferrando, Pere J
2011-03-01
We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the regression equation, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
Fujisawa, Seiichiro; Kadoma, Yoshinori
2012-01-01
We investigated the quantitative structure-activity relationships between hemolytic activity (log 1/H50) or in vivo mouse intraperitoneal (ip) LD50 using reported data for α,β-unsaturated carbonyl compounds such as (meth)acrylate monomers and their 13C-NMR β-carbon chemical shift (δ). The log 1/H50 value for methacrylates was linearly correlated with the δCβ value. That for (meth)acrylates was linearly correlated with log P, an index of lipophilicity. The ipLD50 for (meth)acrylates was linearly correlated with δCβ but not with log P. For (meth)acrylates, the δCβ value, which is dependent on the π-electron density on the β-carbon, was linearly correlated with PM3-based theoretical parameters (chemical hardness, η; electronegativity, χ; electrophilicity, ω), whereas log P was linearly correlated with heat of formation (HF). Also, the interaction between (meth)acrylates and DPPC liposomes in cell membrane molecular models was investigated using 1H-NMR spectroscopy and differential scanning calorimetry (DSC). The log 1/H50 value was related to the difference in chemical shift (ΔδHa) (Ha: H (trans) attached to the β-carbon) between the free monomer and the DPPC liposome-bound monomer. Monomer-induced DSC phase transition properties were related to HF for monomers. NMR chemical shifts may represent a valuable parameter for investigating the biological mechanisms of action of (meth)acrylates. PMID:22312284
Excess adiposity, inflammation, and iron-deficiency in female adolescents.
Tussing-Humphreys, Lisa M; Liang, Huifang; Nemeth, Elizabeta; Freels, Sally; Braunschweig, Carol A
2009-02-01
Iron deficiency is more prevalent in overweight children and adolescents but the mechanisms that underlie this condition remain unclear. The purpose of this cross-sectional study was to assess the relationship between iron status and excess adiposity, inflammation, menarche, diet, physical activity, and poverty status in female adolescents included in the National Health and Nutrition Examination Survey 2003-2004 dataset. Descriptive and simple comparative statistics (t test, chi-square) were used to assess differences between normal-weight (5th ≤ body mass index [BMI] percentile < 85th) and heavier-weight girls (≥85th percentile for BMI) for demographic, biochemical, dietary, and physical activity variables. In addition, logistic regression analyses predicting iron deficiency and linear regression predicting serum iron levels were performed. Heavier-weight girls had an increased prevalence of iron deficiency compared to those with normal weight. Dietary iron, age of and time since first menarche, poverty status, and physical activity were similar between the two groups and were not independent predictors of iron deficiency or log serum iron levels. Logistic modeling revealed that having a BMI ≥85th percentile, and each 1 mg/dL increase in C-reactive protein, more than doubled the odds of iron deficiency. The best-fit linear model to predict serum iron levels included both serum transferrin receptor and C-reactive protein following log-transformation for normalization of these variables. Findings indicate that heavier-weight female adolescents are at greater risk for iron deficiency and that inflammation stemming from excess adipose tissue contributes to this phenomenon. Food and nutrition professionals should consider elevated BMI as an additional risk factor for iron deficiency in female adolescents.
Connolly, Alison; Jones, Kate; Galea, Karen S; Basinas, Ioannis; Kenny, Laura; McGowan, Padraic; Coggins, Marie
2017-08-01
Pesticides and their potential adverse health effects are of great concern, and there is a dearth of knowledge regarding occupational exposure to pesticides among amenity horticulturalists. This study aims to measure the occupational exposures of amenity horticulturalists using pesticides containing the active ingredients glyphosate and fluroxypyr, by urinary biomonitoring. A total of 40 work tasks involving glyphosate and fluroxypyr were surveyed over the period June - October 2015. Workers used a variety of pesticide application methods: manual knapsack sprayers, controlled droplet applicators, pressurised lance applicators and boom sprayers. Pesticide concentrations were measured in urine samples collected pre- and post-work task using liquid chromatography tandem mass spectrometry (LC-MS/MS). Differences in pesticide urinary concentrations pre- and post-work task, and across application methods, were analysed using paired t-tests and linear regression. Pesticide urinary concentrations were higher than those reported for environmental exposures and comparable to those reported in some agricultural studies. Log-transformed pesticide concentrations were statistically significantly higher in post-work samples compared to those in pre-work samples (paired t-test, p<0.001; for both μg/L and μmol/mol creatinine). Urinary pesticide concentrations in post-work samples had a geometric mean (geometric standard deviation) of 0.66 (1.11) μg/L for glyphosate and 0.29 (1.69) μg/L for fluroxypyr. Linear regression revealed a statistically significant positive association between the time-interval between samples and the log-transformed adjusted (i.e. post- minus pre-task) pesticide urinary concentrations (β=0.0039; p<0.0001). Amenity horticulturalists can be exposed to pesticides during tasks involving these products. Further research is required to evaluate routes of exposure among this occupational group. Crown Copyright © 2017. Published by Elsevier GmbH. All rights reserved.
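The paired pre/post comparison on log-transformed concentrations can be illustrated with a hand-rolled paired t statistic; the concentration values below are invented for illustration and are not the study's data.

```python
import math
from statistics import mean, stdev

# Hypothetical pre- and post-task urinary concentrations (ug/L) for five workers.
pre  = [0.10, 0.12, 0.08, 0.11, 0.09]
post = [0.60, 0.70, 0.55, 0.66, 0.62]

# Per-worker difference on the log10 scale (post minus pre).
diffs = [math.log10(b / a) for a, b in zip(pre, post)]

# Paired t statistic: mean difference divided by its standard error.
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
```

With consistently elevated post-task values the statistic is large and positive, matching the direction of the reported p<0.001 result.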
Compositional data analysis for physical activity, sedentary time and sleep research.
Dumuid, Dorothea; Stanford, Tyman E; Martin-Fernández, Josep-Antoni; Pedišić, Željko; Maher, Carol A; Lewis, Lucy K; Hron, Karel; Katzmarzyk, Peter T; Chaput, Jean-Philippe; Fogelholm, Mikael; Hu, Gang; Lambert, Estelle V; Maia, José; Sarmiento, Olga L; Standage, Martyn; Barreira, Tiago V; Broyles, Stephanie T; Tudor-Locke, Catrine; Tremblay, Mark S; Olds, Timothy
2017-01-01
The health effects of daily activity behaviours (physical activity, sedentary time and sleep) are widely studied. While previous research has largely examined activity behaviours in isolation, recent studies have adjusted for multiple behaviours. However, the inclusion of all activity behaviours in traditional multivariate analyses has not been possible due to the perfect multicollinearity of 24-h time budget data. The ensuing lack of adjustment for known effects on the outcome undermines the validity of study findings. We describe a statistical approach that enables the inclusion of all daily activity behaviours, based on the principles of compositional data analysis. Using data from the International Study of Childhood Obesity, Lifestyle and the Environment, we demonstrate the application of compositional multiple linear regression to estimate adiposity from children's daily activity behaviours expressed as isometric log-ratio coordinates. We present a novel method for predicting change in a continuous outcome based on relative changes within a composition, and for calculating associated confidence intervals to allow for statistical inference. The compositional data analysis presented overcomes the lack of adjustment that has plagued traditional statistical methods in the field, and provides robust and reliable insights into the health effects of daily activity behaviours.
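The isometric log-ratio (ilr) coordinates at the heart of this approach can be computed directly. A minimal sketch of one standard ilr basis (pivot coordinates) follows; the three-part activity composition is illustrative, not ISCOLE data.

```python
import math

def ilr(parts):
    """Pivot-coordinate ilr transform of a positive composition (D parts -> D-1 coords)."""
    logs = [math.log(p) for p in parts]
    coords = []
    for j in range(1, len(parts)):
        gmean_log = sum(logs[:j]) / j          # log geometric mean of the first j parts
        coords.append(math.sqrt(j / (j + 1)) * (gmean_log - logs[j]))
    return coords

# e.g. minutes of sleep, sedentary time, and physical activity in a 24-h day
z = ilr([510.0, 600.0, 330.0])
```

The coordinates depend only on ratios between parts, so rescaling the whole composition (e.g. minutes vs proportions of the day) leaves them unchanged; they can then be entered into an ordinary multiple linear regression.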
SU-G-BRB-02: An Open-Source Software Analysis Library for Linear Accelerator Quality Assurance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerns, J; Yaldo, D
Purpose: Routine linac quality assurance (QA) tests have become complex enough to require automation of most test analyses. A new data analysis software library was built that allows physicists to automate routine linear accelerator quality assurance tests. The package is open source, code tested, and benchmarked. Methods: Images and data were generated on a TrueBeam linac for the following routine QA tests: VMAT, starshot, CBCT, machine logs, Winston Lutz, and picket fence. The analysis library was built using the general programming language Python. Each test was analyzed with the library algorithms and compared to manual measurements taken at the time of acquisition. Results: VMAT QA results agreed within 0.1% between the library and manual measurements. Machine logs (dynalogs & trajectory logs) were successfully parsed; mechanical axis positions were verified for accuracy and MLC fluence agreed well with EPID measurements. CBCT QA measurements were within 10 HU and 0.2mm where applicable. Winston Lutz isocenter size measurements were within 0.2mm of TrueBeam’s Machine Performance Check. Starshot analysis was within 0.2mm of the Winston Lutz results for the same conditions. Picket fence images with and without a known error showed that the library was capable of detecting MLC offsets within 0.02mm. Conclusion: A new routine QA software library has been benchmarked and is available for use by the community. The library is open-source and extensible for use in larger systems.
Arnaoutakis, George J; George, Timothy J; Alejo, Diane E; Merlo, Christian A; Baumgartner, William A; Cameron, Duke E; Shah, Ashish S
2011-09-01
The impact of Society of Thoracic Surgeons predicted mortality risk score on resource use has not been previously studied. We hypothesize that increasing Society of Thoracic Surgeons risk scores in patients undergoing aortic valve replacement are associated with greater hospital charges. Clinical and financial data for patients undergoing aortic valve replacement at The Johns Hopkins Hospital over a 10-year period (January 2000 to December 2009) were reviewed. The current Society of Thoracic Surgeons formula (v2.61) for in-hospital mortality was used for all patients. After stratification into risk quartiles, index admission hospital charges were compared across risk strata with rank-sum and Kruskal-Wallis tests. Linear regression and Spearman's coefficient assessed correlation and goodness of fit. Multivariable analysis assessed relative contributions of individual variables on overall charges. A total of 553 patients underwent aortic valve replacement during the study period. Average predicted mortality was 2.9% (±3.4) and actual mortality was 3.4% for aortic valve replacement. Median charges were greater in the upper quartile of patients undergoing aortic valve replacement (quartiles 1-3, $39,949 [interquartile range, 32,708-51,323] vs quartile 4, $62,301 [interquartile range, 45,952-97,103], P < .01). On univariate linear regression, there was a positive correlation between Society of Thoracic Surgeons risk score and log-transformed charges (coefficient, 0.06; 95% confidence interval, 0.05-0.07; P < .01). Spearman's correlation R-value was 0.51. This positive correlation persisted in risk-adjusted multivariable linear regression. Each 1% increase in Society of Thoracic Surgeons risk score was associated with an added $3000 in hospital charges. This is the first study to show that increasing Society of Thoracic Surgeons risk score predicts greater charges after aortic valve replacement. 
As competing therapies, such as percutaneous valve replacement, emerge to treat high-risk patients, these results serve as a benchmark to compare resource use. Copyright © 2011 The American Association for Thoracic Surgery. Published by Mosby, Inc. All rights reserved.
Anstey, Chris M
2005-06-01
Currently, three strong ion models exist for the determination of plasma pH. Mathematically, they vary in their treatment of weak acids, and this study was designed to determine whether any significant differences exist in the simulated performance of these models. The models were subjected to a "metabolic" stress either in the form of variable strong ion difference and fixed weak acid effect, or vice versa, and compared over the range 25 ≤ PCO2 ≤ 135 Torr. The predictive equations for each model were iteratively solved for pH at each PCO2 step, and the results were plotted as a series of log(PCO2)-pH titration curves. The results were analyzed for linearity by using ordinary least squares regression and for collinearity by using correlation. In every case, the results revealed a linear relationship between log(PCO2) and pH over the range 6.8 ≤ pH ≤ 7.8, and no significant difference between the curve predictions under metabolic stress. The curves were statistically collinear. Ultimately, their clinical utility will be determined both by acceptance of the strong ion framework for describing acid-base physiology and by the ease of measurement of the independent model parameters.
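The log(PCO2)-pH linearity can be seen even in the simplest bicarbonate buffer model (Henderson-Hasselbalch with bicarbonate held fixed) — a deliberately simplified stand-in for the strong ion models above, used here only to illustrate fitting the titration curve by ordinary least squares.

```python
import math

# pH = 6.1 + log10([HCO3-] / (0.03 * PCO2)); hold [HCO3-] at 24 mmol/L (illustrative).
hco3 = 24.0
pco2 = [25.0 + 5.0 * i for i in range(23)]              # 25..135 Torr
x = [math.log10(p) for p in pco2]
y = [6.1 + math.log10(hco3 / (0.03 * p)) for p in pco2]

# Ordinary least-squares slope and R^2 of pH on log10(PCO2).
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
slope = sxy / sxx
ss_res = sum((yi - (my + slope * (xi - mx))) ** 2 for xi, yi in zip(x, y))
ss_tot = sum((yi - my) ** 2 for yi in y)
r2 = 1.0 - ss_res / ss_tot
```

In this fixed-bicarbonate case the relationship is exactly linear with slope -1; the Stewart-type models in the study produce slightly different but still statistically linear curves.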
Comparison of Abbott and Da-an real-time PCR for quantitating serum HBV DNA.
Qiu, Ning; Li, Rui; Yu, Jian-Guo; Yang, Wen; Zhang, Wei; An, Yong; Li, Tong; Liu, Xue-En; Zhuang, Hui
2014-09-07
To compare the performance of the Da-an real-time hepatitis B virus (HBV) DNA assay and the Abbott RealTime HBV assay, HBV DNA standards as well as a total of 180 clinical serum samples from patients with chronic hepatitis B were measured using the Abbott and Da-an real-time polymerase chain reaction (PCR) assays. Correlation and Bland-Altman plot analysis was used to compare the performance of the Abbott and Da-an assays. The HBV DNA levels were logarithmically transformed for analysis. All statistical analyses were performed using SPSS for Windows version 18.0. The correlation between the two assays was analyzed by Pearson's correlation and linear regression. Bland-Altman plots were used to analyze the agreement between the two assays. A P value of < 0.05 was considered statistically significant. The HBV DNA values measured by the Abbott or Da-an assay were significantly correlated with the expected values of HBV DNA standards (r = 0.999 for Abbott; r = 0.987 for Da-an, P < 0.001). A Bland-Altman plot showed good agreement between these two assays in detecting HBV DNA standards. Among the 180 clinical serum samples, 126 were quantifiable by both assays. Fifty-two samples were detectable by the Abbott assay but below the detection limit of the Da-an assay. Moreover, HBV DNA levels measured by the Abbott assay were significantly higher than those of the Da-an assay (6.23 ± 1.76 log IU/mL vs 5.46 ± 1.55 log IU/mL, P < 0.001). A positive correlation was observed between HBV DNA concentrations determined by the two assays in 126 paired samples (r = 0.648, P < 0.001). One hundred and fifteen of 126 (91.3%) specimens tested with both assays were within the mean difference ± 1.96 SD of HBV DNA levels. The Da-an assay showed lower sensitivity and a narrower linear range than the Abbott assay, suggesting that it needs improvement.
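The Bland-Altman agreement check used above is straightforward to reproduce: compute the mean between-assay difference (bias) and the ±1.96 SD limits of agreement, then count pairs inside those limits. The paired log10 differences below are synthetic, chosen to echo the ~0.77 log IU/mL bias reported, not the study's data.

```python
from statistics import mean, stdev

# Hypothetical Abbott-minus-Da-an differences in log10 HBV DNA (IU/mL) for 10 samples.
diffs = [0.70, 0.80, 0.75, 0.90, 0.60, 0.77, 0.82, 0.74, 0.69, 0.85]

bias = mean(diffs)                        # mean difference between assays
loa_low = bias - 1.96 * stdev(diffs)      # lower limit of agreement
loa_high = bias + 1.96 * stdev(diffs)     # upper limit of agreement
within = sum(loa_low <= d <= loa_high for d in diffs)
```

A Bland-Altman plot is simply these differences plotted against the per-pair means, with horizontal lines at `bias`, `loa_low`, and `loa_high`.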
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
NASA Astrophysics Data System (ADS)
Vásquez Lavín, F. A.; Hernandez, J. I.; Ponce, R. D.; Orrego, S. A.
2017-07-01
During recent decades, water demand estimation has gained considerable attention from scholars. From an econometric perspective, the most used functional forms include log-log and linear specifications. Despite the advances in this field and the relevance for policymaking, little attention has been paid to the functional forms used in these estimations, and most authors have not provided justifications for their selection of functional forms. A discrete continuous choice model of the residential water demand is estimated using six functional forms (log-log, full-log, log-quadratic, semilog, linear, and Stone-Geary), and the expected consumption and price elasticity are evaluated. From a policy perspective, our results highlight the relevance of functional form selection for both the expected consumption and price elasticity.
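The practical consequence of functional-form choice is easiest to see in the implied price elasticity: constant under a log-log specification, but consumption-dependent under a linear one. The demand coefficients below are illustrative assumptions, not estimates from the study.

```python
def elasticity_loglog(b_price):
    """Log-log demand ln Q = a + b ln P: the elasticity is the coefficient itself."""
    return b_price

def elasticity_linear(a, b_price, price):
    """Linear demand Q = a + b P: elasticity = b * P / Q, so it varies with price."""
    q = a + b_price * price
    return b_price * price / q

e_ll = elasticity_loglog(-0.35)                    # same at every price point
e_lin_low = elasticity_linear(30.0, -0.5, 10.0)    # -0.5 * 10 / 25 = -0.2
e_lin_high = elasticity_linear(30.0, -0.5, 40.0)   # -0.5 * 40 / 10 = -2.0
```

The same linear demand curve is inelastic at low prices and strongly elastic at high prices, which is exactly why form selection matters for policy conclusions.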
NASA Astrophysics Data System (ADS)
Al-Mudhafar, W. J.
2013-12-01
Precise prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships used to estimate properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution, supporting an accurate reservoir model for optimal future reservoir performance. In this paper, facies estimation has been done through multinomial logistic regression (MLR) with respect to the well logs and core data in a well in the upper sandstone formation of the South Rumaila oil field. The independent variables are gamma ray, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. Firstly, the Robust Sequential Imputation Algorithm has been used to impute the missing data. This algorithm starts from a complete subset of the dataset and sequentially estimates the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix; the completed observation is then added to the complete data matrix and the algorithm continues with the next observation with missing values. MLR has been chosen to maximize the likelihood and minimize the standard error of the nonlinear relationships between facies and the core and log data. MLR predicts the probabilities of the different possible facies given each independent variable by constructing a linear predictor function, a set of weights linearly combined with the independent variables via a dot product. A beta distribution of facies has been considered as prior knowledge, and the resulting predicted (posterior) probability has been estimated from MLR based on Bayes' theorem, which relates the posterior probability to the conditional probability and the prior knowledge. 
To assess the statistical accuracy of the model, the bootstrap should be carried out to estimate the extra-sample prediction error by randomly drawing datasets with replacement from the training data. Each bootstrap sample has the same size as the original training set, and the procedure can be repeated N times to produce N bootstrap datasets on which the model is re-fitted, reducing the squared difference between the estimated and observed categorical variable (facies) and thereby the degree of uncertainty.
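The bootstrap error estimate described above can be sketched as follows. For brevity a continuous response and ordinary least squares stand in for the multinomial facies model, and all numbers are synthetic: each replicate refits the model on a resampled dataset and scores it on the original data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data: y = 1 + 2x + noise (illustrative stand-in for the facies problem).
n = 100
x = rng.uniform(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)
X = np.column_stack([np.ones(n), x])

def mse_of_fit(idx):
    """Fit OLS on a bootstrap sample, then score squared error on the full data."""
    beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    resid = y - X @ beta
    return float(np.mean(resid ** 2))

N = 200  # number of bootstrap replicates
boot_err = float(np.mean([mse_of_fit(rng.integers(0, n, n)) for _ in range(N)]))
```

For the actual facies model, `mse_of_fit` would be replaced by a multinomial logistic fit scored by misclassification rate, but the resampling loop is identical.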
Blaya, Josefa; Lloret, Eva; Santísima-Trinidad, Ana B; Ros, Margarita; Pascual, Jose A
2016-04-01
Currently, real-time polymerase chain reaction (qPCR) is the technique most often used to quantify pathogen presence. Digital PCR (dPCR) is a new technique with the potential to have a substantial impact on plant pathology research owing to its reproducibility, sensitivity and low susceptibility to inhibitors. In this study, we evaluated the feasibility of using dPCR and qPCR to quantify Phytophthora nicotianae in several background matrices, including host tissues (stems and roots) and soil samples. In spite of the low dynamic range of dPCR (3 logs compared with 7 logs for qPCR), this technique proved to have very high precision applicable at very low copy numbers. The dPCR was able to detect accurately the pathogen in all type of samples in a broad concentration range. Moreover, dPCR seems to be less susceptible to inhibitors than qPCR in plant samples. Linear regression analysis showed a high correlation between the results obtained with the two techniques in soil, stem and root samples, with R(2) = 0.873, 0.999 and 0.995 respectively. These results suggest that dPCR is a promising alternative for quantifying soil-borne pathogens in environmental samples, even in early stages of the disease. © 2015 Society of Chemical Industry.
NASA Astrophysics Data System (ADS)
Laha, S.; Guainazzi, M.; Dewangan, G.; Chakravorty, S.; Kembhavi, A.
2014-07-01
We present results from a homogeneous analysis of the broadband 0.3-10 keV CCD-resolution spectra, as well as the soft X-ray high-resolution grating spectra, of a hard X-ray flux-limited sample of 26 Seyfert galaxies observed with XMM-Newton. We place a strict lower limit of 50% on the detection fraction. We find a gap in the distribution of the ionisation parameter in the range 0.5
Daily Magnesium Intake and Serum Magnesium Concentration among Japanese People
Akizawa, Yoriko; Koizumi, Sadayuki; Itokawa, Yoshinori; Ojima, Toshiyuki; Nakamura, Yosikazu; Tamura, Tarou; Kusaka, Yukinori
2008-01-01
Background The vitamins and minerals that are deficient in the daily diet of a normal adult remain unknown. To answer this question, we conducted a population survey focusing on the relationship between dietary magnesium intake and serum magnesium level. Methods The subjects were 62 individuals from Fukui Prefecture who participated in the 1998 National Nutrition Survey. The survey investigated the physical status, nutritional status, and dietary data of the subjects. Holidays and special occasions were avoided, and a day when people are most likely to be on an ordinary diet was selected as the survey date. Results The mean (±standard deviation) daily magnesium intake was 322 (±132), 323 (±163), and 322 (±147) mg/day for men, women, and the entire group, respectively. The mean (±standard deviation) serum magnesium concentration was 20.69 (±2.83), 20.69 (±2.88), and 20.69 (±2.83) ppm for men, women, and the entire group, respectively. The distribution of serum magnesium concentration was normal. Dietary magnesium intake showed a log-normal distribution, which was transformed by logarithmic conversion before the regression coefficients were examined. The regression line between serum magnesium concentration (Y, ppm) and daily magnesium intake (X, mg) was Y = 4.93 log10(X) + 8.49, with a coefficient of correlation (r) of 0.29. The inverse regression line between daily magnesium intake (Y, mg) and serum magnesium concentration (X, ppm) was Y = 14.65X + 19.31, with a coefficient of correlation of 0.28. Conclusion The daily magnesium intake correlated with serum magnesium concentration, and a linear regression model between them was proposed. PMID:18635902
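The two reported regression lines can be checked against each other and against the sample means directly; plugging the group-mean intake into the first line should roughly reproduce the group-mean serum concentration, and vice versa.

```python
import math

# Serum Mg (ppm) predicted from the mean daily intake (322 mg): Y = 4.93*log10(X) + 8.49
serum_pred = 4.93 * math.log10(322.0) + 8.49   # ~20.85 ppm vs the observed mean of 20.69

# Daily intake (mg) predicted from the mean serum Mg (20.69 ppm): Y = 14.65*X + 19.31
intake_pred = 14.65 * 20.69 + 19.31            # ~322.4 mg vs the observed mean of 322
```

Both lines pass close to the point of means, as expected for least-squares fits.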
Mouse Allergen, Lung Function, and Atopy in Puerto Rican Children
Forno, Erick; Cloutier, Michelle M.; Datta, Soma; Paul, Kathryn; Sylvia, Jody; Calvert, Deanna; Thornton-Thompson, Sherell; Wakefield, Dorothy B.; Brehm, John; Hamilton, Robert G.; Alvarez, María; Colón-Semidey, Angel; Acosta-Pérez, Edna; Canino, Glorisa; Celedón, Juan C.
2012-01-01
Objective To examine the relation between mouse allergen exposure and asthma in Puerto Rican children. Methods Mus m 1, Der p 1, Bla g 2, and Fel d 1 allergens were measured in dust samples from homes of Puerto Rican children with (cases) and without (controls) asthma in Hartford, CT (n = 449) and San Juan (SJ), Puerto Rico (n = 678). Linear or logistic regression was used for the multivariate analysis of mouse allergen (Mus m 1) and lung function (FEV1 and FEV1/FVC) and allergy (total IgE and skin test reactivity (STR) to ≥1 allergen) measures. Results Homes in SJ had lower mouse allergen levels than those in Hartford. In multivariate analyses, mouse allergen was associated with higher FEV1 in cases in Hartford (+70.6 ml, 95% confidence interval (CI) = 8.6–132.7 ml, P = 0.03) and SJ (+45.1 ml, 95% CI = −0.5 to 90.6 ml, P = 0.05). In multivariate analyses of controls, mouse allergen was inversely associated with STR to ≥1 allergen in non-sensitized children (odds ratio [OR] for each log-unit increment in Mus m 1 = 0.7, 95% CI = 0.5–0.9, P<0.01). In a multivariate analysis including all children at both study sites, each log-increment in mouse allergen was positively associated with FEV1 (+28.3 ml, 95% CI = 1.4–55.2 ml, P = 0.04) and inversely associated with STR to ≥1 allergen (OR for each log-unit increment in Mus m 1 = 0.8, 95% CI = 0.6–0.9, P<0.01). Conclusions Mouse allergen is associated with a higher FEV1 and lower odds of STR to ≥1 allergen in Puerto Rican children. This may be explained by the allergen itself or correlated microbial exposures. PMID:22815744
Dawdy, M R; Munter, D W; Gilmore, R A
1997-03-01
This study was designed to examine the relationship between patient entry rates (a measure of physician work load) and documentation errors/omissions in both handwritten and dictated emergency treatment records. The study was carried out in two phases. Phase I examined handwritten records and Phase II examined dictated and transcribed records. A total of 838 charts for three common chief complaints (chest pain, abdominal pain, asthma/chronic obstructive pulmonary disease) were retrospectively reviewed and scored for the presence or absence of 11 predetermined criteria. Patient entry rates were determined by reviewing the emergency department patient registration logs. The data were analyzed using simple correlation and linear regression analysis. A positive correlation was found between patient entry rates and documentation errors in handwritten charts. No such correlation was found in the dictated charts. We conclude that work load may negatively affect documentation accuracy when charts are handwritten. However, the use of dictation services may minimize or eliminate this effect.
Exploratory Analysis of Exercise Adherence Patterns with Sedentary Pregnant Women
Yeo, SeonAe; Cisewski, Jessi; Lock, Eric F.; Marron, J. S.
2010-01-01
Background It is not well understood how sedentary women who wish to engage in regular exercise adhere to interventions during pregnancy and what factors may influence adherence over time. Objective To examine longitudinal patterns of pregnant women’s adherence to exercise. Methods Exploratory secondary data analyses were carried out with 124 previously sedentary pregnant women (ages 31 ± 5 years; 85% non-Hispanic White) from a randomized controlled trial. Daily exercise logs (n = 92) from 18 through 35 weeks of gestation were explored using linear regression, functional data, and principal component analyses. Results Adherence decreased as gestation week increased (p < .001); the top adherers maintained levels of adherence, and the bottom adherers decreased levels of adherence; and adherence pattern was influenced by types of exercise throughout the study period. Discussion Exercise behavior patterns were explored in a randomized controlled trial study, using chronometric data on exercise attendance. A new analytic approach revealed that sedentary pregnant women may adopt exercise habits differently from other populations. PMID:20585224
NASA Technical Reports Server (NTRS)
Barrett, Charles A.
1992-01-01
A large body of high temperature cyclic oxidation data generated from tests at NASA Lewis Research Center involving gravimetric/time values for 36 Ni- and Co-base superalloys was reduced to a single attack parameter, K_a, for each run. This K_a value was used to rank the cyclic oxidation resistance of each alloy at 1000, 1100, and 1150 C. These K_a values were also used to derive an estimating equation using multiple linear regression involving log10(K_a) as a function of alloy chemistry and test temperature. This estimating equation has a high degree of fit and could be used to predict cyclic oxidation behavior for similar alloys and to design an optimum high strength Ni-base superalloy with maximum high temperature cyclic oxidation resistance. The critical alloy elements found to be beneficial were Al, Cr, and Ta.
Real-Time PCR Quantification Using A Variable Reaction Efficiency Model
Platts, Adrian E.; Johnson, Graham D.; Linnemann, Amelia K.; Krawetz, Stephen A.
2008-01-01
Quantitative real-time PCR remains a cornerstone technique in gene expression analysis and sequence characterization. Despite the importance of the approach to experimental biology the confident assignment of reaction efficiency to the early cycles of real-time PCR reactions remains problematic. Considerable noise may be generated where few cycles in the amplification are available to estimate peak efficiency. An alternate approach that uses data from beyond the log-linear amplification phase is explored with the aim of reducing noise and adding confidence to efficiency estimates. PCR reaction efficiency is regressed to estimate the per-cycle profile of an asymptotically departed peak efficiency, even when this is not closely approximated in the measurable cycles. The process can be repeated over replicates to develop a robust estimate of peak reaction efficiency. This leads to an estimate of the maximum reaction efficiency that may be considered primer-design specific. Using a series of biological scenarios we demonstrate that this approach can provide an accurate estimate of initial template concentration. PMID:18570886
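A simplified, hypothetical version of the efficiency-regression idea can be sketched as follows. The saturating amplification model, the parameter values, and the masking rule are assumptions for illustration, not the authors' algorithm; the point is that per-cycle efficiency falls roughly linearly with fluorescence beyond the log-linear phase, so extrapolating the regression back to zero fluorescence recovers the peak efficiency:

```python
import numpy as np

# Simulate a saturating PCR: efficiency declines as signal approaches plateau.
E_max, F_cap = 0.95, 1.0            # peak efficiency, plateau fluorescence
F = [1e-6]                          # starting template signal (arbitrary units)
for _ in range(60):
    F.append(F[-1] * (1 + E_max * (1 - F[-1] / F_cap)))
F = np.array(F)

eff = F[1:] / F[:-1] - 1            # observed per-cycle efficiency
# Use cycles past the log-linear phase but before the plateau (assumed window).
mask = (eff > 0.1 * E_max) & (eff < 0.9 * E_max)
slope, intercept = np.polyfit(F[:-1][mask], eff[mask], 1)
print(f"estimated peak efficiency = {intercept:.3f}")  # ~0.95
```

Repeating this over replicate wells, as the abstract suggests, would give a robust primer-specific estimate of peak efficiency.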
Development of a pharmacogenetic-guided warfarin dosing algorithm for Puerto Rican patients
Ramos, Alga S; Seip, Richard L; Rivera-Miranda, Giselle; Felici-Giovanini, Marcos E; Garcia-Berdecia, Rafael; Alejandro-Cowan, Yirelia; Kocherla, Mohan; Cruz, Iadelisse; Feliu, Juan F; Cadilla, Carmen L; Renta, Jessica Y; Gorowski, Krystyna; Vergara, Cunegundo; Ruaño, Gualberto; Duconge, Jorge
2012-01-01
Aim This study was aimed at developing a pharmacogenetic-driven warfarin-dosing algorithm in 163 admixed Puerto Rican patients on stable warfarin therapy. Patients & methods A multiple linear-regression analysis was performed using log-transformed effective warfarin dose as the dependent variable, and combining CYP2C9 and VKORC1 genotyping with other relevant nongenetic clinical and demographic factors as independent predictors. Results The model explained more than two-thirds of the observed variance in the warfarin dose among Puerto Ricans, and also produced significantly better ‘ideal dose’ estimates than two pharmacogenetic models and clinical algorithms published previously, with the greatest benefit seen in patients ultimately requiring <7 mg/day. We also assessed the clinical validity of the model using an independent validation cohort of 55 Puerto Rican patients from Hartford, CT, USA (R2 = 51%). Conclusion Our findings provide the basis for planning prospective pharmacogenetic studies to demonstrate the clinical utility of genotyping warfarin-treated Puerto Rican patients. PMID:23215886
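The modelling strategy can be illustrated with a minimal sketch on synthetic data. The genotype frequencies, effect sizes, and the single clinical covariate below are invented, and the published algorithm includes additional predictors; the sketch only shows regressing log-transformed dose and back-transforming the prediction:

```python
import numpy as np

# Synthetic cohort: genotype indicators plus age, log-dose outcome.
rng = np.random.default_rng(1)
n = 163
cyp2c9_variant = rng.binomial(1, 0.3, n)  # hypothetical CYP2C9 variant carrier
vkorc1_aa = rng.binomial(1, 0.4, n)       # hypothetical VKORC1 -1639 AA genotype
age = rng.uniform(30, 80, n)

log_dose = (np.log(5.0) - 0.45 * cyp2c9_variant - 0.60 * vkorc1_aa
            - 0.006 * (age - 55) + rng.normal(0, 0.15, n))

X = np.column_stack([np.ones(n), cyp2c9_variant, vkorc1_aa, age - 55])
beta, *_ = np.linalg.lstsq(X, log_dose, rcond=None)

# Predicted dose for a 60-year-old VKORC1 AA carrier, no CYP2C9 variant:
x_new = np.array([1.0, 0.0, 1.0, 5.0])
dose_pred = float(np.exp(x_new @ beta))
print(f"predicted dose = {dose_pred:.2f} mg/day")
```

Exponentiating the fitted linear predictor is what turns the log-dose model back into a dosing estimate in mg/day.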
Mathematical modeling of tetrahydroimidazole benzodiazepine-1-one derivatives as an anti HIV agent
NASA Astrophysics Data System (ADS)
Ojha, Lokendra Kumar
2017-07-01
The goal of the present work is the study of drug-receptor interaction via QSAR (Quantitative Structure-Activity Relationship) analysis for a set of 89 TIBO (Tetrahydroimidazole Benzodiazepine-1-one) derivatives. The MLR (Multiple Linear Regression) method is utilized to generate predictive models of quantitative structure-activity relationships between a set of molecular descriptors and biological activity (IC50). The best QSAR model was selected, having a correlation coefficient (r) of 0.9299, a Standard Error of Estimation (SEE) of 0.5022, a Fisher Ratio (F) of 159.822, and a Quality factor (Q) of 1.852. This model is statistically significant and strongly favours substitution of a sulphur atom at the -Z position of the TIBO derivatives, captured by the indicator parameter IS. Two other parameters, logP (octanol-water partition coefficient) and SAG (Surface Area Grid), also played a vital role in the generation of the best QSAR model. All three descriptors show very good stability towards data variation in leave-one-out (LOO) cross-validation.
Fujimoto, Kayo; Williams, Mark L
2015-06-01
Mixing patterns within sexual networks have been shown to have an effect on HIV transmission, both within and across groups. This study examined sexual mixing patterns involving HIV-unknown status and risky sexual behavior conditioned on assortative/dissortative mixing by race/ethnicity. The sample used for this study consisted of drug-using male sex workers and their male sex partners. A log-linear analysis of 257 most at-risk MSM and 3,072 sex partners was conducted. The analysis found two significant patterns. HIV-positive most at-risk Black MSM had a strong tendency to have HIV-unknown Black partners (relative risk, RR = 2.91, p < 0.001) and to engage in risky sexual behavior (RR = 2.22, p < 0.001). White most at-risk MSM with unknown HIV status also had a tendency to engage in risky sexual behavior with Whites (RR = 1.72, p < 0.001). The results suggest that interventions that target the most at-risk MSM and their sex partners should account for specific sexual network mixing patterns by HIV status.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seong W. Lee
During this reporting period, the literature survey, including the gasifier temperature measurement literature, the ultrasonic application and its background study in cleaning applications, and the spray coating process, was completed. The gasifier simulator (cold model) testing has been successfully conducted. Four factors (blower voltage, ultrasonic application, injection time intervals, particle weight) were considered as significant factors that affect the temperature measurement. Analysis of Variance (ANOVA) was applied to analyze the test data. The analysis shows that all four factors are significant to the temperature measurements in the gasifier simulator (cold model). The regression analysis for the case with the normalized room temperature shows that a linear model fits the temperature data with 82% accuracy (18% error). The regression analysis for the case without the normalized room temperature shows 72.5% accuracy (27.5% error). The nonlinear regression analysis indicates a better fit than that of the linear regression: the nonlinear model's accuracy is 88.7% (11.3% error) for the normalized room temperature case, which is better than the linear regression analysis. The hot model thermocouple sleeve design and fabrication are completed. The gasifier simulator (hot model) design and fabrication are completed. The system tests of the gasifier simulator (hot model) have been conducted and some modifications have been made. Based on the system tests and results analysis, the gasifier simulator (hot model) has met the proposed design requirements and is ready for system testing. The ultrasonic cleaning method is under evaluation and will be further studied for the gasifier simulator (hot model) application. The progress of this project has been on schedule.
NASA Technical Reports Server (NTRS)
Nelson, Ross; Margolis, Hank; Montesano, Paul; Sun, Guoqing; Cook, Bruce; Corp, Larry; Andersen, Hans-Erik; DeJong, Ben; Pellat, Fernando Paz; Fickel, Thaddeus;
2016-01-01
Existing national forest inventory plots, an airborne lidar scanning (ALS) system, and a space profiling lidar system (ICESat-GLAS) are used to generate circa 2005 estimates of total aboveground dry biomass (AGB) in forest strata, by state, in the continental United States (CONUS) and Mexico. The airborne lidar is used to link ground observations of AGB to space lidar measurements. Two sets of models are generated, the first relating ground estimates of AGB to airborne laser scanning (ALS) measurements and the second set relating ALS estimates of AGB (generated using the first model set) to GLAS measurements. GLAS then, is used as a sampling tool within a hybrid estimation framework to generate stratum-, state-, and national-level AGB estimates. A two-phase variance estimator is employed to quantify GLAS sampling variability and, additively, ALS-GLAS model variability in this current, three-phase (ground-ALS-space lidar) study. The model variance component characterizes the variability of the regression coefficients used to predict ALS-based estimates of biomass as a function of GLAS measurements. Three different types of predictive models are considered in CONUS to determine which produced biomass totals closest to ground-based national forest inventory estimates - (1) linear (LIN), (2) linear-no-intercept (LNI), and (3) log-linear. For CONUS at the national level, the GLAS LNI model estimate (23.95 +/- 0.45 Gt AGB), agreed most closely with the US national forest inventory ground estimate, 24.17 +/- 0.06 Gt, i.e., within 1%. The national biomass total based on linear ground-ALS and ALS-GLAS models (25.87 +/- 0.49 Gt) overestimated the national ground-based estimate by 7.5%. The comparable log-linear model result (63.29 +/-1.36 Gt) overestimated ground results by 261%. All three national biomass GLAS estimates, LIN, LNI, and log-linear, are based on 241,718 pulses collected on 230 orbits. 
The US national forest inventory (ground) estimates are based on 119,414 ground plots. At the US state level, the average absolute value of the deviation of LNI GLAS estimates from the comparable ground estimate of total biomass was 18.8% (range: Oregon, -40.8% to North Dakota, 128.6%). Log-linear models produced gross overestimates in the continental US, i.e., >2.6x, and the use of this model to predict regional biomass using GLAS data in temperate, western hemisphere forests is not appropriate. The best model form, LNI, is used to produce biomass estimates in Mexico. The average biomass density in Mexican forests is 53.10 +/- 0.88 t/ha, and the total biomass for the country, given a total forest area of 688,096 sq km, is 3.65 +/- 0.06 Gt. In Mexico, our GLAS biomass total underestimated a 2005 FAO estimate (4.152 Gt) by 12% and overestimated a 2007/8 radar study's figure (3.06 Gt) by 19%.
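The three candidate model forms can be sketched on synthetic data; here `h` stands in for a lidar height metric and every number is invented for illustration, not taken from the study:

```python
import numpy as np

# The three forms compared in the study, fitted to synthetic data:
#   LIN:        agb = a + b*h
#   LNI:        agb = b*h            (no intercept)
#   log-linear: log(agb) = a + b*log(h)
rng = np.random.default_rng(2)
h = rng.uniform(5, 30, 200)               # lidar height metric, m
agb = 4.0 * h + rng.normal(0, 5, 200)     # true relation is proportional
agb = np.clip(agb, 1, None)               # biomass cannot be negative

beta_lin, *_ = np.linalg.lstsq(np.column_stack([np.ones_like(h), h]), agb, rcond=None)
beta_lni, *_ = np.linalg.lstsq(h[:, None], agb, rcond=None)
beta_log, *_ = np.linalg.lstsq(
    np.column_stack([np.ones_like(h), np.log(h)]), np.log(agb), rcond=None)

# Totals from each form can then be compared against a ground reference;
# log-linear predictions must be back-transformed first, which is one
# place where large discrepancies can enter.
total_lni = float(np.sum(h * beta_lni[0]))
total_log = float(np.sum(np.exp(beta_log[0] + beta_log[1] * np.log(h))))
print(f"LNI slope = {beta_lni[0]:.2f}, LIN slope = {beta_lin[1]:.2f}")
```

Summing per-plot predictions, as in the last two lines, mirrors how stratum- and national-level totals are accumulated from plot-level model predictions.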
NASA Astrophysics Data System (ADS)
Kim, Taeyoun; Hwang, Seho; Jang, Seonghyung
2017-01-01
When finding the "sweet spot" of a shale gas reservoir, it is essential to estimate the brittleness index (BI) and total organic carbon (TOC) of the formation. In particular, the BI is one of the key factors in determining the crack propagation and crushing efficiency for hydraulic fracturing. There are several methods for estimating the BI of a formation, but most of them are empirical equations that are specific to particular rock types. We estimated the mineralogical BI based on the elemental capture spectroscopy (ECS) log and the elastic BI based on well log data, and we propose a new method for predicting S-wave velocity (VS) using the mineralogical BI and the elastic BI. The TOC is related to the gas content of shale gas reservoirs. Since it is difficult to perform core analysis for all intervals of shale gas reservoirs, we develop empirical equations for the Horn River Basin, Canada, as well as a TOC log, using a linear relation between core-tested TOC and well log data. In addition, two empirical equations have been suggested for VS prediction based on the density and gamma ray logs used for TOC analysis. The validity of the empirical equations suggested in this paper has been tested by applying them, from the perspective of BI and TOC, to well log data from another well and comparing the predicted VS log with the real VS log.
ERIC Educational Resources Information Center
Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.
2004-01-01
We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…
European Multicenter Study on Analytical Performance of DxN Veris System HCV Assay.
Braun, Patrick; Delgado, Rafael; Drago, Monica; Fanti, Diana; Fleury, Hervé; Gismondo, Maria Rita; Hofmann, Jörg; Izopet, Jacques; Kühn, Sebastian; Lombardi, Alessandra; Marcos, Maria Angeles; Sauné, Karine; O'Shea, Siobhan; Pérez-Rivilla, Alfredo; Ramble, John; Trimoulet, Pascale; Vila, Jordi; Whittaker, Duncan; Artus, Alain; Rhodes, Daniel W
2017-04-01
The analytical performance of the Veris HCV Assay for use on the new and fully automated Beckman Coulter DxN Veris Molecular Diagnostics System (DxN Veris System) was evaluated at 10 European virology laboratories. Precision, analytical sensitivity, specificity, performance with negative samples, linearity, and performance with hepatitis C virus (HCV) genotypes were evaluated. Precision for all sites showed a standard deviation (SD) of 0.22 log10 IU/ml or lower for each level tested. Analytical sensitivity determined by probit analysis was between 6.2 and 9.0 IU/ml. Specificity on 94 unique patient samples was 100%, and performance with 1,089 negative samples demonstrated 100% not-detected results. Linearity using patient samples was shown from 1.34 to 6.94 log10 IU/ml. The assay demonstrated linearity upon dilution with all HCV genotypes. The Veris HCV Assay demonstrated an analytical performance comparable to that of currently marketed HCV assays when tested across multiple European sites. Copyright © 2017 American Society for Microbiology.
ERIC Educational Resources Information Center
Jurs, Stephen; And Others
The scree test and its linear regression technique are reviewed, and results of its use in factor analysis and Delphi data sets are described. The scree test was originally a visual approach for making judgments about eigenvalues, which considered the relationships of the eigenvalues to one another as well as their actual values. The graph that is…
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike classification algorithms, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Logistic regression extends linear regression analysis to a category-type dependent variable, modeling the outcome through a linear combination of the independent variables; it is a statistical approach used to predict the possibility of occurrence of an event. However, classifying all data by means of logistic regression analysis alone cannot guarantee the accuracy of the results. In this paper, logistic regression analysis is applied to EM clusters and to the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
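Minimal one-dimensional sketches of the two clustering algorithms named above, run on synthetic two-cluster data rather than the red-wine dataset (Gaussian components are assumed for EM):

```python
import numpy as np

# Two well-separated synthetic clusters.
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(6, 1, 300)])

# --- K-means: hard assignment of each point to the nearest centroid ---
centers = np.array([x.min(), x.max()])
for _ in range(50):
    labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
    centers = np.array([x[labels == k].mean() for k in (0, 1)])

# --- EM for a two-component Gaussian mixture: soft (probabilistic) assignment ---
mu = np.array([x.min(), x.max()])
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])
for _ in range(100):
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
    resp = dens / dens.sum(axis=1, keepdims=True)           # E-step
    nk = resp.sum(axis=0)                                    # M-step
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(x)

print("k-means centers ~", np.sort(centers).round(1))
print("EM means       ~", np.sort(mu).round(1))
```

The key contrast the abstract draws is visible in the code: K-means uses hard labels (`argmin`), whereas EM carries per-point responsibilities (`resp`) that weight every update.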
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but unimportant differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regressions. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model was attempted. The latter could only be fitted for the grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information.
For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
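The recommended analysis, linear regression with robust standard errors, can be sketched on synthetic LMUP-like scores. The predictor (`parity`) and all effect sizes are invented, and the hand-rolled HC1 "sandwich" estimator below stands in for the robust-variance option of a statistics package:

```python
import numpy as np

# Synthetic 0-12 LMUP-like scores with heteroskedastic noise.
rng = np.random.default_rng(4)
n = 500
parity = rng.integers(0, 4, n)     # hypothetical predictor
score = np.clip(9 - 1.2 * parity + rng.normal(0, 2 + parity, n), 0, 12)

# Ordinary least squares fit of the full score.
X = np.column_stack([np.ones(n), parity])
beta = np.linalg.lstsq(X, score, rcond=None)[0]
resid = score - X @ beta

# HC1 heteroskedasticity-robust covariance: (X'X)^-1 X'diag(e^2)X (X'X)^-1,
# scaled by n/(n-k).
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * resid[:, None] ** 2)
cov_hc1 = n / (n - X.shape[1]) * XtX_inv @ meat @ XtX_inv
robust_se = np.sqrt(np.diag(cov_hc1))
print(f"slope = {beta[1]:.2f}, robust SE = {robust_se[1]:.2f}")
```

Using robust standard errors keeps the full-scale linear model valid for inference even when the residual variance is not constant, which is the violation the abstract reports.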
Determining Predictor Importance in Hierarchical Linear Models Using Dominance Analysis
ERIC Educational Resources Information Center
Luo, Wen; Azen, Razia
2013-01-01
Dominance analysis (DA) is a method used to evaluate the relative importance of predictors that was originally proposed for linear regression models. This article proposes an extension of DA that allows researchers to determine the relative importance of predictors in hierarchical linear models (HLM). Commonly used measures of model adequacy in…
Inoue, Tomoaki; Sonoda, Noriyuki; Hiramatsu, Shinsuke; Kimura, Shinichiro; Ogawa, Yoshihiro; Inoguchi, Toyoshi
2018-02-01
Previous studies have shown that serum bilirubin concentration is inversely associated with the risk of cardiovascular disease. The relationship between serum bilirubin concentration and left ventricular geometry, however, has not been investigated in patients with diabetes mellitus. In this cohort study, 158 asymptomatic patients with type 2 diabetes mellitus without overt heart disease were enrolled. Left ventricular structure and function were assessed using echocardiography. Serum bilirubin concentration, glycemic control, lipid profile, and other clinical characteristics were evaluated, and their association with left ventricular geometry was determined. Patients with New York Heart Association Functional Classification greater than I, left ventricular ejection fraction less than 50%, history of coronary artery disease, severe valvulopathy, chronic atrial fibrillation, or creatinine clearance less than 30 ml/min, and those receiving insulin treatment, were excluded. Univariate analyses showed that relative wall thickness (RWT) was significantly correlated with diastolic blood pressure (P = 0.003), HbA1c (P = 0.024), total cholesterol (P = 0.043), urinary albumin (P = 0.023), and serum bilirubin concentration (P = 0.009). There was no association between left ventricular mass index and serum bilirubin concentration. Multivariate linear regression analysis showed that log RWT was positively correlated with diastolic blood pressure (P = 0.010) and that log RWT was inversely correlated with log bilirubin (P = 0.003). In addition, the patients with bilirubin less than 0.8 mg/dl had a higher prevalence of concentric left ventricular remodeling compared with those with bilirubin 0.8 mg/dl or more. Our study shows that the serum bilirubin concentration may be associated with the progression of concentric left ventricular remodeling in patients with type 2 diabetes mellitus.
He, Jingjing; Shen, Xin; Fang, Aiping; Song, Jie; Li, He; Guo, Meihan; Li, Keji
2016-11-01
Current evidence of the relationship between diets and Fe status is mostly derived from studies in developed countries with Western diets, which may not be translatable to Chinese with a predominantly plant-based diet. We extracted data that were nationally sampled from the 2009 wave of the China Health and Nutrition Survey; dietary information was collected using 24-h recalls combined with a food inventory for 3 consecutive days. Blood samples were collected to quantify Fe status, and log-ferritin, transferrin receptor and Hb were used as Fe status indicators. In total, 2905 (1360 males and 1545 females) adults aged 18-50 years were included for multiple linear regression and stratified analyses. The rates of Fe deficiency and Fe-deficiency anaemia were 1·6 and 0·7 % for males and 28·4 and 10·7 % for females, respectively. Although red meat and haem Fe consumption differed by about fifteen- to twenty-fold across the five groups, divided by quintiles of animal protein intake per 4·2 MJ/d, only Fe status as indicated by log-ferritin (P=0·019) and transferrin receptor (P=0·024) concentrations in males was shown to be higher as intakes of animal foods increased. Log-ferritin was positively associated with intakes of red meat (B=0·3 %, P=0·01) and haem Fe (B=12·3 %, P=0·010) in males and with intake of non-haem Fe in females (B=2·2 %, P=0·024). We conclude that diet has a very limited association with Fe status in Chinese adults consuming a traditional Chinese diet, and a predominantly plant-based diet may not necessarily be responsible for poor Fe status.
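An aside on the reporting convention used in abstracts like this one: a coefficient from a regression with a log-transformed outcome is conventionally converted to a percent change in the outcome. A generic helper (not the paper's computation; the correct base depends on which logarithm was fitted) looks like:

```python
import math

def pct_change(b, base=math.e):
    """Percent change in Y per one-unit increase in the predictor,
    given coefficient b from a regression of log_base(Y)."""
    return (base ** b - 1) * 100

# Natural-log outcome, b = 0.12 -> roughly a 12.7% increase in Y.
print(f"{pct_change(0.12):.1f}%")
# log10 outcome, b = 0.05 -> roughly a 12.2% increase in Y.
print(f"{pct_change(0.05, 10):.1f}%")
```

For small coefficients the percent change is close to 100*b for natural logs, which is why small B values are often read directly as percentages.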
Sabbioni, G
1993-01-01
Aromatic amines are important intermediates in industrial manufacturing. N-Oxidation to N-hydroxyarylamines is a key step in determining the genotoxic properties of aromatic amines. N-Hydroxyarylamines can form adducts with DNA, with tissue proteins, and with the blood proteins albumin and hemoglobin in a dose-dependent manner. The determination of hemoglobin adducts is a useful tool for biomonitoring exposed populations. We have established the hemoglobin binding index (HBI) [(mmole compound/mole hemoglobin)/(mmole compound/kg body weight)] of several aromatic amines in female Wistar rats. Including the values from other researchers obtained in the same rat strain, the logarithm of hemoglobin binding (log HBI) was plotted against the following parameters: the sum of the Hammett constants (Σσ = σp + σm), pKa, logP (octanol/water), the half-wave oxidation potential (E1/2), and the electronic descriptors of the amines and their corresponding nitrenium ions obtained by semi-empirical calculations (MNDO, AM1, and PM3), such as atomic charge densities, the energies of the highest occupied molecular orbital and lowest unoccupied molecular orbital and their coefficients, the bond order of C-N, the dipole moments, and the reaction enthalpy [MNDOHF, AM1HF or PM3HF = Hf(nitrenium) - Hf(amine)]. The correlation coefficients were determined from the plots of all parameters against log HBI for all amines by means of linear regression analysis. The amines were classified in three groups: group 1, all para-substituted amines (maximum, n = 9); group 2, all amines with halogens (maximum, n = 11); and group 3, all amines with alkyl groups (maximum, n = 13). (ABSTRACT TRUNCATED AT 250 WORDS) PMID:8319626
Elhakeem, Ahmed; Hannam, Kimberly; Deere, Kevin C; Hartley, April; Clark, Emma M; Moss, Charlotte; Edwards, Mark H; Dennison, Elaine; Gaysin, Tim; Kuh, Diana; Wong, Andrew; Fox, Kenneth R; Cooper, Cyrus; Cooper, Rachel; Tobias, Jon H
2017-12-01
High impact physical activity (PA) is thought to benefit bone. We examined associations of lifetime walking and weight bearing exercise with accelerometer-measured high impact and overall PA in later life. Data were from 848 participants (66.2% female, mean age = 72.4 years) from the Cohort for Skeletal Health in Bristol and Avon, Hertfordshire Cohort Study and MRC National Survey of Health and Development. Acceleration peaks from seven-day hip-worn accelerometer recordings were used to derive counts of high impact and overall PA. Walking and weight bearing exercise up to age 18, between 18-29, 30-49 and since age 50 were recalled using questionnaires. Responses in each age category were dichotomised and cumulative scores derived. Linear regression was used for analysis. Greater lifetime walking was related to higher overall, but not high impact PA, whereas greater lifetime weight bearing exercise was related to higher overall and high impact PA. For example, fully-adjusted differences in log-overall and log-high impact PA respectively for highest versus lowest lifetime scores were: walking [0.224 (0.087, 0.362) and 0.239 (-0.058, 0.536)], and weight bearing exercise [0.754 (0.432, 1.076) and 0.587 (0.270, 0.904)]. For both walking and weight bearing exercise, associations were strongest in the 'since age 50' category. Those reporting the most walking and weight bearing exercise since age 50 had highest overall and high impact PA, e.g. fully-adjusted difference in log-high impact PA versus least walking and weight bearing exercise = 0.588 (0.226, 0.951). Promoting walking and weight bearing exercise from midlife may help increase potentially osteogenic PA levels in later life.
Basta, Maria; Lin, Hung-Mo; Pejovic, Slobodanka; Sarrigiannidis, Alexios; Bixler, Edward; Vgontzas, Alexandros N
2008-02-15
Apnea, depression, and metabolic abnormalities are independent predictors of excessive daytime sleepiness (EDS) in patients with sleep apnea. Exercise is beneficial for apnea, depression, and metabolic abnormalities; however, its association with EDS is not known. To evaluate the contribution of lack of regular exercise, depression, and apnea severity on daytime sleepiness in patients with sleep apnea. One thousand one hundred six consecutive patients (741 men and 365 women) referred to the sleep disorders clinic for symptoms consistent with sleep apnea. Daytime sleepiness was assessed with the Epworth Sleepiness Scale and activity was evaluated with a quantifiable Physical Activity Questionnaire. Compared with women, men had a higher apnea hypopnea index (AHI) (40.4 +/- 1.2 vs 31.0 +/- 1.8), lower body mass index (BMI) (35.3 +/- 0.3 kg/m2 vs 39.6 +/- 0.5 kg/m2), and higher rate of regular exercise (39.1% vs 28.8%) (p < 0.05). Linear regression analysis of the total sample after adjusting for age, BMI, sex, central nervous system medication, and diabetes showed that logAHI, depression, and lack of regular exercise were significant predictors of sleepiness. Predictors of mild or moderate sleepiness for both sexes were depression and logAHI, whereas predictors of severe sleepiness for men were lack of regular exercise, depression, and minimum SaO2 and, for women, logAHI. In obese apneic patients, lack of regular exercise (only in men), depression, and degree of apnea are significant predictors of EDS. This association is modified by sex and degree of sleepiness. Assessment and management of depression and physical exercise should be part of a thorough evaluation of patients with sleep apnea.
NASA Astrophysics Data System (ADS)
Laha, Sibasish; Guainazzi, Matteo; Dewangan, Gulab C.; Chakravorty, Susmita; Kembhavi, Ajit K.
2014-07-01
We present results from a homogeneous analysis of the broad-band 0.3-10 keV CCD resolution as well as of the soft X-ray high-resolution grating spectra of a hard X-ray flux-limited sample of 26 Seyfert galaxies observed with XMM-Newton. Our goal is to characterize warm absorbers (WAs) along the line of sight to the active nucleus. We significantly detect WAs in 65 per cent of the sample sources. Our results are consistent with WAs being present in at least half of the Seyfert galaxies in the nearby Universe, in agreement with previous estimates. We find a gap in the distribution of the ionization parameter in the range 0.5 < log ξ < 1.5 which we interpret as a thermally unstable region for WA clouds. This may indicate that the WA flow is probably constituted by a clumpy distribution of discrete clouds rather than a continuous medium. The distribution of the WA column densities for the sources with broad Fe Kα lines is similar to that of the sources which do not have broadened emission lines. Therefore, the detected broad Fe Kα emission lines are bona fide and not artefacts of ionized absorption in the soft X-rays. The WA parameters show no correlation among themselves, with the exception of the ionization parameter versus column density. The shallow slope of the log ξ versus log v_out linear regression (0.12 ± 0.03) is inconsistent with the scaling laws predicted by radiation or magnetohydrodynamic-driven winds. Our results also suggest that WAs and ultra-fast outflows do not represent extreme manifestations of the same astrophysical system.
Threshold detection in an on-off binary communications channel with atmospheric scintillation
NASA Technical Reports Server (NTRS)
Webb, W. E.; Marino, J. T., Jr.
1974-01-01
The optimum detection threshold in an on-off binary optical communications system operating in the presence of atmospheric turbulence was investigated, assuming a Poisson detection process and log-normal scintillation. The dependence of the probability of bit error on log-amplitude variance and received signal strength was analyzed, and semi-empirical relationships to predict the optimum detection threshold were derived. On the basis of this analysis, a piecewise linear model for an adaptive threshold detection system is presented. Bit error probabilities for non-optimum threshold detection systems were also investigated.
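The threshold-optimisation problem can be illustrated with a simplified sketch that omits scintillation entirely (the paper additionally averages over log-normal fading); equal bit priors and illustrative mean photocounts are assumed:

```python
import math

# On-off keying with Poisson photodetection: decide "1" when the photon
# count reaches the threshold t. Find the integer t minimising the bit
# error probability 0.5*P(N_off >= t) + 0.5*P(N_on < t).
def poisson_cdf(k, lam):
    """P(N <= k) for N ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

lam_off, lam_on = 2.0, 20.0   # illustrative mean counts for "0" and "1"

def bit_error(t):
    false_alarm = 1 - poisson_cdf(t - 1, lam_off)  # "0" sent, count >= t
    miss = poisson_cdf(t - 1, lam_on)              # "1" sent, count < t
    return 0.5 * false_alarm + 0.5 * miss

best = min(range(1, 21), key=bit_error)
print(f"optimum threshold = {best} counts")
```

With scintillation included, the on-state mean count itself becomes a log-normal random variable, which is what shifts the optimum threshold and motivates the adaptive, piecewise linear scheme the abstract describes.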
Threshold detection in an on-off binary communications channel with atmospheric scintillation
NASA Technical Reports Server (NTRS)
Webb, W. E.
1975-01-01
The optimum detection threshold in an on-off binary optical communications system operating in the presence of atmospheric turbulence was investigated, assuming a Poisson detection process and log-normal scintillation. The dependence of the probability of bit error on log-amplitude variance and received signal strength was analyzed, and semi-empirical relationships to predict the optimum detection threshold were derived. On the basis of this analysis, a piecewise linear model for an adaptive threshold detection system is presented. The bit error probabilities for nonoptimum threshold detection systems were also investigated.
Matilla-Santander, Nuria; Valvi, Damaskini; Lopez-Espinosa, Maria-Jose; Manzano-Salgado, Cyntia B.; Ballester, Ferran; Ibarluzea, Jesús; Santa-Marina, Loreto; Schettgen, Thomas; Guxens, Mònica; Sunyer, Jordi
2017-01-01
Background: Exposure to perfluoroalkyl substances (PFASs) may increase risk for metabolic diseases; however, epidemiologic evidence is lacking at the present time. Pregnancy is a period of enhanced tissue plasticity for the fetus and the mother and may be a critical window of PFAS exposure susceptibility. Objective: We evaluated the associations between PFAS exposures and metabolic outcomes in pregnant women. Methods: We analyzed 1,240 pregnant women from the Spanish INMA [Environment and Childhood Project (INfancia y Medio Ambiente)] birth cohort study (recruitment period: 2003–2008) with measured first pregnancy trimester plasma concentrations of four PFASs (in nanograms/milliliter). We used logistic regression models to estimate associations of PFASs (log10-transformed and categorized into quartiles) with impaired glucose tolerance (IGT) and gestational diabetes mellitus (GDM), and we used linear regression models to estimate associations with first-trimester serum levels of triglycerides, total cholesterol, and C-reactive protein (CRP). Results: Perfluorooctane sulfonate (PFOS) and perfluorohexane sulfonate (PFHxS) were positively associated with IGT (137 cases) [OR per log10-unit increase=1.99 (95% CI: 1.06, 3.78) and OR=1.65 (95% CI: 0.99, 2.76), respectively]. PFOS and PFHxS associations with GDM (53 cases) were in a similar direction, but less precise. PFOS and perfluorononanoate (PFNA) were negatively associated with triglyceride levels [percent median change per log10-unit increase=−5.86% (95% CI: −9.91%, −1.63%) and percent median change per log10-unit increase=−4.75% (95% CI: −8.16%, −0.61%), respectively], whereas perfluorooctanoate (PFOA) was positively associated with total cholesterol [percent median change per log10-unit increase=1.26% (95% CI: 0.01%, 2.54%)]. PFASs were not associated with CRP in the subset of the population with available data (n=640).
Conclusions: Although further confirmation is required, the findings from this study suggest that PFAS exposures during pregnancy may influence lipid metabolism and glucose tolerance and thus may impact the health of the mother and her child. https://doi.org/10.1289/EHP1062 PMID:29135438
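The odds ratios above are reported per log10-unit of exposure; a minimal sketch of how such an estimate rescales to other exposure contrasts, using the PFOS-IGT figure from the abstract (the logistic-model framing is the standard one, not code from the study):

```python
import math

# The reported OR of 1.99 per log10-unit increase corresponds to a
# logistic-regression coefficient beta = ln(1.99) on the log10 exposure scale.
beta = math.log(1.99)

# One log10-unit is a 10-fold increase in concentration; a doubling of
# exposure is log10(2) ~ 0.301 log10-units, so the OR rescales as:
or_per_tenfold = math.exp(beta)                   # 1.99 by construction
or_per_doubling = math.exp(beta * math.log10(2))  # ~1.23

print(round(or_per_tenfold, 2), round(or_per_doubling, 2))
```

The same rescaling applies to any coefficient estimated on a log-transformed exposure.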
Liese, Angela D; Schulz, Mandy; Moore, Charity G; Mayer-Davis, Elizabeth J
2004-12-01
Epidemiological investigations increasingly employ dietary-pattern techniques to fully integrate dietary data. The present study evaluated the relationship of dietary patterns identified by cluster analysis with measures of insulin sensitivity (SI) and adiposity in the multi-ethnic, multi-centre Insulin Resistance Atherosclerosis Study (IRAS, 1992-94). Cross-sectional data from 980 middle-aged adults, of whom 67 % had normal and 33 % had impaired glucose tolerance, were analysed. Usual dietary intake was obtained by an interviewer-administered, validated food-frequency questionnaire. Outcomes included SI, fasting insulin (FI), BMI and waist circumference. The relationship of dietary patterns to log(SI+1), log(FI), BMI and waist circumference was modelled with multivariable linear regressions. Cluster analysis identified six distinct diet patterns--'dark bread', 'wine', 'fruits', 'low-frequency eaters', 'fries' and 'white bread'. The 'white bread' and the 'fries' patterns over-represented the Hispanic IRAS population predominantly from two centres, while the 'wine' and 'dark bread' groups were dominated by non-Hispanic whites. The dietary patterns were associated significantly with each of the outcomes first at the crude, clinical level (P<0.001). Furthermore, they were significantly associated with FI, BMI and waist circumference independent of age, sex, race or ethnicity, clinic, family history of diabetes, smoking and activity (P<0.004), whereas significance was lost for SI. Studying the total dietary behaviour via a pattern approach allowed us to focus both on the qualitative and quantitative dimensions of diet. The present study identified highly consistent associations of distinct dietary patterns with measures of insulin resistance and adiposity, which are risk factors for diabetes and heart disease.
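The regression step described above can be sketched with dummy-coded cluster membership; everything below (cluster count, effect sizes, covariates) is invented for illustration and is not the IRAS data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sketch (synthetic data, NOT the IRAS cohort): regress log(SI + 1) on
# dummy-coded dietary-pattern membership plus a covariate.
n = 600
pattern = rng.integers(0, 3, n)          # three hypothetical diet clusters
age = rng.normal(55, 8, n)
# build SI so that log(SI + 1) follows a linear model exactly
si = np.exp(0.5 - 0.2 * (pattern == 2) + 0.01 * (age - 55)
            + rng.normal(0, 0.2, n)) - 1
y = np.log(si + 1)                       # the transform used in the abstract

# design: intercept, two cluster dummies (cluster 0 = reference), age
X = np.column_stack([np.ones(n), pattern == 1, pattern == 2, age])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef.round(3))   # cluster-2 effect should recover about -0.2
```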
Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A
2016-08-01
The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods [Principal Component Analysis (PCA) and Cluster Analysis (CA)] as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods.
Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.
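The pairwise slope/intercept/correlation comparison described above can be sketched on synthetic profiles (the peak shapes, shift, and noise level below are invented):

```python
import numpy as np

rng = np.random.default_rng(6)

# Sketch of the pairwise comparison: regress one signal profile onto another;
# slope ~ 1, intercept ~ 0 and r ~ 1 indicate similar profiles.
def compare_profiles(a, b):
    slope, intercept = np.polyfit(a, b, 1)
    r = np.corrcoef(a, b)[0, 1]
    return slope, intercept, r

x = np.linspace(0, 10, 2000)                            # high acquisition rate
ref = np.exp(-(x - 4) ** 2) + 0.6 * np.exp(-(x - 7) ** 2 / 0.5)
same = 1.02 * ref + rng.normal(0, 0.01, x.size)         # replicate, small scale shift
diff = np.roll(ref, 150) + rng.normal(0, 0.01, x.size)  # peaks displaced in time

print([round(v, 2) for v in compare_profiles(ref, same)])
print([round(v, 2) for v in compare_profiles(ref, diff)])
```

The replicate pair yields a slope near 1 and a correlation near 1, while the shifted profile degrades all three parameters, which is the basis of the ellipsoidal-volume comparison in the abstract.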
Quantification of residual dose estimation error on log file-based patient dose calculation.
Katsuta, Yoshiyuki; Kadoya, Noriyuki; Fujita, Yukio; Shimizu, Eiji; Matsunaga, Kenichi; Matsushita, Haruo; Majima, Kazuhiro; Jingu, Keiichi
2016-05-01
The log file-based patient dose estimation includes a residual dose estimation error caused by leaf miscalibration, which cannot be reflected in the estimated dose. The purpose of this study is to determine this residual dose estimation error. Modified log files for seven head-and-neck and prostate volumetric modulated arc therapy (VMAT) plans simulating leaf miscalibration were generated by shifting both leaf banks (systematic leaf gap errors: ±2.0, ±1.0, and ±0.5 mm in opposite directions and systematic leaf shifts: ±1.0 mm in the same direction) using MATLAB-based (MathWorks, Natick, MA) in-house software. The generated modified and non-modified log files were imported back into the treatment planning system and recalculated. Subsequently, the generalized equivalent uniform dose (gEUD) was quantified for the planning target volume (PTV) and organs at risk. For MLC leaves calibrated within ±0.5 mm, the residual dose estimation errors, obtained from the slope of the linear regression of gEUD changes between non-modified and modified log file doses per unit leaf gap error, are 1.32±0.27% and 0.82±0.17 Gy for the PTV and spinal cord, respectively, in head-and-neck plans, and 1.22±0.36%, 0.95±0.14 Gy, and 0.45±0.08 Gy for the PTV, rectum, and bladder, respectively, in prostate plans. In this work, we determine the residual dose estimation errors for VMAT delivery using the log file-based patient dose calculation according to the MLC calibration accuracy. Copyright © 2016 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
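The gEUD used above is the standard generalized (power) mean of the voxel doses; a minimal sketch, with illustrative doses and volume-effect parameters `a` (not values from the study):

```python
import numpy as np

# gEUD as used above: gEUD = (mean(d_i ** a)) ** (1 / a) over the voxel
# doses d_i of a structure. The doses and `a` values are illustrative.
def geud(doses, a):
    doses = np.asarray(doses, dtype=float)
    return np.mean(doses ** a) ** (1.0 / a)

voxel_doses = [60.0, 62.0, 58.0, 61.0]      # hypothetical structure doses (Gy)
print(round(geud(voxel_doses, a=-10), 2))   # a < 0: target-like, cold spots dominate
print(round(geud(voxel_doses, a=12), 2))    # a >> 1: serial organ, hot spots dominate
```

With a = 1 the gEUD reduces to the mean dose; large positive `a` pulls it toward the maximum and negative `a` toward the minimum.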
Individual and Group-Based Engagement in an Online Physical Activity Monitoring Program in Georgia.
Smith, Matthew Lee; Durrett, Nicholas K; Bowie, Maria; Berg, Alison; McCullick, Bryan A; LoPilato, Alexander C; Murray, Deborah
2018-06-07
Given the rising prevalence of obesity in the United States, innovative methods are needed to increase physical activity (PA) in community settings. Evidence suggests that individuals are more likely to engage in PA if they are given a choice of activities and have support from others (for encouragement, motivation, and accountability). The objective of this study was to describe the use of the online Walk Georgia PA tracking platform according to whether the user was an individual user or group user. Walk Georgia is a free, interactive online tracking platform that enables users to log PA by duration, activity, and perceived difficulty, and then converts these data into points based on metabolic equivalents. Users join individually or in groups and are encouraged to set weekly PA goals. Data were examined for 6,639 users (65.8% were group users) over 28 months. We used independent sample t tests and Mann-Whitney U tests to compare means between individual and group users. Two linear regression models were fitted to identify factors associated with activity logging. Users logged 218,766 activities (15,119,249 minutes of PA spanning 592,714 miles [41,858,446 points]). On average, group users had created accounts more recently than individual users (P < .001); however, group users logged more activities (P < .001). On average, group users logged more minutes of PA (P < .001) and earned more points (P < .001). Being in a group was associated with a larger proportion of weeks in which 150 minutes or more of weekly PA was logged (B = 20.47, P < .001). Use of Walk Georgia was significantly higher among group users than among individual users. To expand use and dissemination of online tracking of PA, programs should target naturally occurring groups (eg, workplaces, schools, faith-based groups).
ERIC Educational Resources Information Center
Richter, Tobias
2006-01-01
Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…
Some Applied Research Concerns Using Multiple Linear Regression Analysis.
ERIC Educational Resources Information Center
Newman, Isadore; Fraas, John W.
The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…
Automating approximate Bayesian computation by local linear regression.
Thornton, Kevin R
2009-07-07
In several biological contexts, parameter inference often relies on computationally intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local-linear regression.
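The local linear-regression adjustment that ABCreg automates can be sketched in a few lines; the toy model below (one parameter, a uniform prior, a normal summary statistic) is an assumption for illustration, not ABCreg's code:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch of ABC with local linear-regression adjustment: simulate from the
# prior, keep the draws whose summary statistic is closest to the observed
# one, then regression-adjust the accepted draws toward s_obs.
def abc_reg(s_obs, n_sims=20000, accept=500):
    theta = rng.uniform(0, 10, n_sims)            # draws from the prior
    s = rng.normal(theta, 1.0)                    # summary stat of simulated data
    idx = np.argsort(np.abs(s - s_obs))[:accept]  # rejection step: keep closest
    th, ss = theta[idx], s[idx]
    # local linear regression of theta on s, evaluated at the observed stat
    A = np.column_stack([np.ones(accept), ss])
    (a, b), *_ = np.linalg.lstsq(A, th, rcond=None)
    return th - b * (ss - s_obs)                  # adjusted posterior sample

posterior = abc_reg(s_obs=4.0)
print(round(float(posterior.mean()), 2))   # centred near the observed statistic
```

The adjustment step shrinks the spread caused by the finite acceptance tolerance, which is exactly the benefit of the regression flavor of ABC over plain rejection.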
NASA Astrophysics Data System (ADS)
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were cleansed, completed with secondary variables, and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were developed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and response variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than the constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
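The multivariate linear regression step can be sketched as follows; the predictors, coefficients, and data below are synthetic stand-ins, not the Abu Hamour records:

```python
import numpy as np

rng = np.random.default_rng(2)

# Sketch of the MLR step (invented variables and effect sizes).
n = 300
thrust = rng.uniform(5, 20, n)     # operational: total thrust (MN)
rpm = rng.uniform(2, 8, n)         # operational: cutterhead speed (rev/min)
ucs = rng.uniform(10, 60, n)       # geotechnical: rock strength (MPa)
pr = 0.8 + 0.15 * thrust + 0.3 * rpm - 0.04 * ucs + rng.normal(0, 0.4, n)

X = np.column_stack([np.ones(n), thrust, rpm, ucs])
beta, *_ = np.linalg.lstsq(X, pr, rcond=None)
r2 = 1 - np.sum((pr - X @ beta) ** 2) / np.sum((pr - pr.mean()) ** 2)
print(beta.round(3), round(r2, 2))   # stronger rock -> lower penetration rate
```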
Blakely, William F; Bolduc, David L; Debad, Jeff; Sigal, George; Port, Matthias; Abend, Michael; Valente, Marco; Drouet, Michel; Hérodin, Francis
2018-07-01
Use of plasma proteomic and hematological biomarkers represents a promising approach to provide useful diagnostic information for assessment of the severity of hematopoietic acute radiation syndrome. Eighteen baboons in a radiation model underwent total-body and partial-body irradiation at doses of 60Co gamma rays from 2.5 to 15 Gy at dose rates of 6.25 cGy/min and 32 cGy/min. Hematopoietic acute radiation syndrome severity levels determined by an analysis of blood count changes measured up to 60 d after irradiation were used to gauge overall hematopoietic acute radiation syndrome severity classifications. A panel of protein biomarkers was measured on plasma samples collected at 0 to 28 d after exposure using electrochemiluminescence-detection technology. The database was split into two distinct groups (i.e., "calibration," n = 11; "validation," n = 7). The calibration database was used in an initial stepwise regression multivariate model-fitting approach followed by down-selection of biomarkers for identification of subpanels of hematopoietic acute radiation syndrome-responsive biomarkers for three time windows (i.e., 0-2 d, 2-7 d, 7-28 d). Model 1 (0-2 d) includes log C-reactive protein (p < 0.0001), log interleukin-13 (p < 0.0054), and procalcitonin (p < 0.0316) biomarkers; model 2 (2-7 d) includes log CD27 (p < 0.0001), log FMS-related tyrosine kinase 3 ligand (p < 0.0001), log serum amyloid A (p < 0.0007), and log interleukin-6 (p < 0.0002); and model 3 (7-28 d) includes log CD27 (p < 0.0012), log serum amyloid A (p < 0.0002), log erythropoietin (p < 0.0001), and log CD177 (p < 0.0001). The predicted risk of radiation injury categorization values, representing the hematopoietic acute radiation syndrome severity outcome for the three models, produced least squares multiple regression fit confidences of R = 0.73, 0.82, and 0.75, respectively.
The resultant algorithms support the proof of concept that plasma proteomic biomarkers can supplement clinical signs and symptoms to assess hematopoietic acute radiation syndrome risk severity.
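The stepwise model-fitting approach mentioned above can be sketched as greedy forward selection; the biomarker matrix and effect sizes below are synthetic, not the baboon data:

```python
import numpy as np

rng = np.random.default_rng(7)

# Sketch of forward stepwise selection (invented biomarker panel): greedily
# add the predictor that most reduces the residual sum of squares.
n, p = 120, 6
X = rng.normal(size=(n, p))                                # candidate log-biomarkers
y = 1.0 * X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 0.5, n)  # severity score

def rss(cols):
    Z = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return np.sum((y - Z @ beta) ** 2)

selected, remaining = [], list(range(p))
for _ in range(2):                              # pick two predictors
    best = min(remaining, key=lambda c: rss(selected + [c]))
    selected.append(best)
    remaining.remove(best)
print(sorted(selected))   # -> [0, 3]
```

A real stepwise procedure would also test whether each added term is significant (and consider dropping terms), but the greedy RSS criterion is the core of the approach.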
Sample Introduction Using the Hildebrand Grid Nebulizer for Plasma Spectrometry
1988-01-01
Flow injection analysis (FIA) with ICP-OES detection was evaluated using the grid nebulizer for sample introduction. Detection limits, linear dynamic ranges, precision, and peak width were determined for elements in methanol and acetonitrile solutions. [Remnant figure captions: log concentration vs. log peak area for Mn, Cd, Zn, Au, and Ni in methanol.]
The association of genetic variants of type 2 diabetes with kidney function.
Franceschini, Nora; Shara, Nawar M; Wang, Hong; Voruganti, V Saroja; Laston, Sandy; Haack, Karin; Lee, Elisa T; Best, Lyle G; Maccluer, Jean W; Cochran, Barbara J; Dyer, Thomas D; Howard, Barbara V; Cole, Shelley A; North, Kari E; Umans, Jason G
2012-07-01
Type 2 diabetes is highly prevalent and is the major cause of progressive chronic kidney disease in American Indians. Genome-wide association studies identified several loci associated with diabetes but their impact on susceptibility to diabetic complications is unknown. We studied the association of 18 type 2 diabetes genome-wide association single-nucleotide polymorphisms (SNPs) with estimated glomerular filtration rate (eGFR; MDRD equation) and urine albumin-to-creatinine ratio in 6958 Strong Heart Study family and cohort participants. Center-specific residuals of eGFR and log urine albumin-to-creatinine ratio, obtained from linear regression models adjusted for age, sex, and body mass index, were regressed onto SNP dosage using variance component models in family data and linear regression in unrelated individuals. Estimates were then combined across centers. Four diabetic loci were associated with eGFR and one locus with urine albumin-to-creatinine ratio. A SNP in the WFS1 gene (rs10010131) was associated with higher eGFR in younger individuals and with increased albuminuria. SNPs in the FTO, KCNJ11, and TCF7L2 genes were associated with lower eGFR, but not albuminuria, and were not significant in prospective analyses. Our findings suggest a shared genetic risk for type 2 diabetes and its kidney complications, and a potential role for WFS1 in early-onset diabetic nephropathy in American Indian populations.
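The two-stage analysis above (residualize the phenotype on covariates, then regress the residuals on SNP dosage) can be sketched with synthetic data; all effect sizes and covariates below are invented:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch of the two-stage approach: 1) residualize eGFR on age, sex and BMI;
# 2) regress the residuals on allele dosage. Synthetic data, invented effects.
n = 3000
age = rng.normal(55, 10, n)
sex = rng.integers(0, 2, n)
bmi = rng.normal(30, 5, n)
dosage = rng.binomial(2, 0.3, n).astype(float)   # 0/1/2 copies of the risk allele
egfr = 90 - 0.5 * age + 2 * sex - 0.3 * bmi - 1.5 * dosage + rng.normal(0, 8, n)

# stage 1: residuals from the covariate-only model
C = np.column_stack([np.ones(n), age, sex, bmi])
resid = egfr - C @ np.linalg.lstsq(C, egfr, rcond=None)[0]

# stage 2: regress residuals on SNP dosage
G = np.column_stack([np.ones(n), dosage])
snp_beta = np.linalg.lstsq(G, resid, rcond=None)[0][1]
print(round(float(snp_beta), 2))   # recovers roughly the simulated -1.5 per-allele effect
```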
Johnston, J L; Leong, M S; Checkland, E G; Zuberbuhler, P C; Conger, P R; Quinney, H A
1988-12-01
Body density and skinfold thickness at four sites were measured in 140 normal boys, 168 normal girls, and 6 boys and 7 girls with cystic fibrosis, all aged 8-14 y. Prediction equations for the normal boys and girls for the estimation of body-fat content from skinfold measurements were derived from linear regression of body density vs the log of the sum of the skinfold thickness. The relationship between body density and the log of the sum of the skinfold measurements differed from normal for the boys and girls with cystic fibrosis because of their high body density even though their large residual volume was corrected for. However the sum of skinfold measurements in the children with cystic fibrosis did not differ from normal. Thus body fat percent of these children with cystic fibrosis was underestimated when calculated from body density and invalid when calculated from skinfold thickness.
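The prediction chain above (skinfolds to body density to percent fat) can be sketched as follows; the regression coefficients are illustrative placeholders rather than the study's fitted values, and the density-to-fat step uses the standard Siri equation:

```python
import math

# Sketch: predict body density from log10 of the summed skinfolds, then
# convert density to percent fat. Intercept and slope are placeholders
# (the study fit its own); the second step is the standard Siri equation.
def body_density(sum_skinfolds_mm, intercept=1.1533, slope=0.0643):
    return intercept - slope * math.log10(sum_skinfolds_mm)

def percent_fat_siri(density):
    return 495.0 / density - 450.0

d = body_density(40.0)   # 40 mm summed over four sites (hypothetical)
print(round(d, 4), round(percent_fat_siri(d), 1))
```

The abstract's point is that the first equation breaks down in cystic fibrosis (high density at normal skinfolds), so percent fat computed this way is biased in that group.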
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reinert, K.H.
1987-12-01
Recent EPA scrutiny of acrylate and methacrylate monomers has resulted in restrictive consent orders and Significant New Use Rules under the Toxic Substances Control Act, based on structure-activity relationships using mouse skin painting studies. The concern is centered on human health issues regarding worker and consumer exposure. Environmental issues, such as aquatic toxicity, are still of concern. Understanding the relationships and environmental risks to aquatic organisms may improve the understanding of the potential risks to human health. This study evaluates the quantitative structure-activity relationships from measured log Kow's and log LC50's for Pimephales promelas (fathead minnow) and Carassius auratus (goldfish). Scientific support of the current regulations is also addressed. Two monomer classes were designated: acrylates and methacrylates. Spearman rank correlation and linear regression were run. Based on this study, an ecotoxicological difference exists between acrylates and methacrylates. Regulatory activities and scientific study should reflect this difference.
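The Spearman rank correlation used above can be computed directly from double-argsort ranks (no tie handling); the log Kow / log LC50 values below are invented for illustration:

```python
import numpy as np

# Spearman rank correlation: Pearson correlation of the ranks.
def spearman(x, y):
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float(np.sum(rx * ry) / np.sqrt(np.sum(rx ** 2) * np.sum(ry ** 2)))

log_kow = np.array([1.2, 1.9, 2.4, 3.1, 3.8])    # hypothetical monomers
log_lc50 = np.array([1.5, 1.1, 0.8, 0.2, -0.3])  # toxicity increasing with Kow
print(spearman(log_kow, log_lc50))               # -1.0: perfectly monotone decrease
```

Because it depends only on ranks, Spearman's coefficient is robust to the skewed distributions typical of Kow and LC50 data, which is presumably why it was run alongside ordinary linear regression.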
Classical Testing in Functional Linear Models.
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression (Wald, score, likelihood ratio, and F tests) to functional linear regression, for testing the null hypothesis that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
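The re-expression at the heart of the paper (functional covariate reduced to a few principal component scores, then a standard linear model) can be sketched with synthetic curves; the basis functions, coefficient function, and sizes below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)

# Sketch: express a functional covariate through its first K principal
# component scores and fit an ordinary linear model.
n, t = 150, 50
grid = np.linspace(0, 1, t)
basis = np.vstack([np.sin(np.pi * grid), np.cos(np.pi * grid), grid])
X = rng.normal(size=(n, 3)) @ basis          # smooth random functions
beta_fun = np.sin(np.pi * grid)              # true coefficient function
y = X @ beta_fun / t + rng.normal(0, 0.1, n)

Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
K = 3
scores = U[:, :K] * s[:K]                    # functional PC scores
Z = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
r2 = 1 - np.sum((y - Z @ coef) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r2, 2))   # K = 3 scores capture essentially all of the signal
```

Once the model is in this finite-dimensional form, the classical Wald, score, likelihood ratio, and F tests apply to the score coefficients in the usual way.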
Enumeration of verocytotoxigenic Escherichia coli (VTEC) O157 and O26 in milk by quantitative PCR.
Mancusi, Rocco; Trevisani, Marcello
2014-08-01
Quantitative real-time polymerase chain reaction (qPCR) can be a convenient alternative to the Most Probable Number (MPN) methods to count VTEC in milk. The number of VTEC is normally very low in milk; therefore, with the aim of increasing the method sensitivity, a qPCR protocol that relies on preliminary enrichment was developed. The growth pattern of six VTEC strains (serogroups O157 and O26) was studied using enrichment in Buffered Peptone Water (BPW) with or without acriflavine for 4-24 h. Milk samples were inoculated with these strains over a five-log concentration range between 0.24-0.50 and 4.24-4.50 log CFU/ml. DNA was extracted from the enriched samples in duplicate and each extract was analysed in duplicate by qPCR using pairs of primers specific for the serogroups O157 and O26. When samples were pre-enriched in BPW at 37°C for 8 h, the relationship between threshold cycles (CT values) and VTEC log numbers was linear over a five-log concentration range. The regression of PCR threshold cycle numbers on VTEC log CFU/ml had a slope coefficient equal to -3.10 (R²=0.96), which is indicative of a 10-fold difference of the gene copy numbers between samples (with a 100 ± 10% PCR efficiency). The same 10-fold proportion used for inoculating the milk samples with VTEC was observed, therefore, also in the enriched samples at 8 h. A comparison of the CT values of milk samples and controls revealed that the strains inoculated in milk grew with 3-log increments in the 8 h enrichment period. Regression lines that fitted the qPCR and MPN data revealed that the error of the qPCR estimates is lower than the error of the estimated MPN (r=0.982, R²=0.965 vs. r=0.967, R²=0.935). The growth rates of VTEC strains isolated from milk should be comparatively assessed before qPCR estimates based on the regression model are considered valid.
Comparative assessment of the growth rates can be done using spectrophotometric measurements of standardized cultures of isolates and reference strains cultured in BPW at 37°C for 8 h. The method developed for the serogroups O157 and O26 can be easily adapted to the other VTEC serogroups that are relevant for human health. The qPCR method is less laborious and faster than the standard MPN method and has been shown to be a good technique for quantifying VTEC in milk. Copyright © 2014 Elsevier B.V. All rights reserved.
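The link between the calibration slope and PCR amplification efficiency quoted above follows the standard relation E = 10^(-1/slope) - 1:

```python
import math

# The slope of the CT vs. log10(CFU/ml) line determines the amplification
# efficiency through the standard relation E = 10 ** (-1 / slope) - 1.
def pcr_efficiency(slope):
    return 10 ** (-1.0 / slope) - 1.0

ideal = -1 / math.log10(2)                       # ~ -3.32: perfect doubling per cycle
print(round(100 * pcr_efficiency(ideal), 1))     # 100.0 %
print(round(100 * pcr_efficiency(-3.10), 1))     # ~110 %, consistent with 100 +/- 10%
```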
Advanced microwave soil moisture studies. [Big Sioux River Basin, Iowa
NASA Technical Reports Server (NTRS)
Dalsted, K. J.; Harlan, J. C.
1983-01-01
Comparisons of low-level L-band brightness temperature (TB) and thermal infrared (TIR) data, as well as the following data sets: soil map and land cover data; direct soil moisture measurement; and a computer generated contour map, were statistically evaluated using regression analysis and linear discriminant analysis. Regression analysis of footprint data shows that statistical groupings of ground variables (soil features and land cover) hold promise for qualitative assessment of soil moisture and for reducing variance within the sampling space. Dry conditions appear to be more conducive to producing meaningful statistics than wet conditions. Regression analysis using field-averaged TB and TIR data did not approach the higher R² values obtained using within-field variations. The linear discriminant analysis indicates some capacity to distinguish categories, with the results being somewhat better on a field basis than on a footprint basis.
Andrić, Filip; Šegan, Sandra; Dramićanin, Aleksandra; Majstorović, Helena; Milojković-Opsenica, Dušanka
2016-08-05
Soil-water partition coefficient normalized to the organic carbon content (KOC) is one of the crucial properties influencing the fate of organic compounds in the environment. Chromatographic methods are a well-established alternative to the direct sorption techniques used for KOC determination. The present work proposes reversed-phase thin-layer chromatography (RP-TLC) as a simpler, yet equally accurate, method as the officially recommended HPLC technique. Several TLC systems were studied, including octadecyl-(RP18) and cyano-(CN) modified silica layers in combination with methanol-water and acetonitrile-water mixtures as mobile phases. In total, 50 compounds of different molecular shape and size and varying ability to establish specific interactions were selected (phenols, benzodiazepines, triazine herbicides, and polyaromatic hydrocarbons). A calibration set of 29 compounds with known logKOC values determined by sorption experiments was used to build simple univariate calibrations, Principal Component Regression (PCR) and Partial Least Squares (PLS) models between logKOC and TLC retention parameters. The models exhibit good statistical performance, indicating that CN-layers contribute better to logKOC modeling than RP18-silica. The most promising TLC methods, the officially recommended HPLC method, and four in silico estimation approaches were compared by the non-parametric Sum of Ranking Differences (SRD) approach. The best estimations of logKOC values were achieved by simple univariate calibration of TLC retention data involving CN-silica layers and a moderate content of methanol (40-50% v/v); they ranked far better than the officially recommended HPLC method, which ranked in the middle. The worst estimates were obtained from in silico computations based on the octanol-water partition coefficient.
Linear Solvation Energy Relationship study revealed that increased polarity of CN-layers over RP18 in combination with methanol-water mixtures is the key to better modeling of logKOC through significant diminishing of dipolar and proton accepting influence of the mobile phase as well as enhancing molar refractivity in excess of the chromatographic systems. Copyright © 2016 Elsevier B.V. All rights reserved.
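The best-performing approach above, simple univariate calibration of retention data, amounts to a one-variable least-squares fit; the retention values below are invented (and contrived to lie exactly on a line for clarity):

```python
import numpy as np

# Univariate calibration sketch: logKOC of calibration compounds fitted
# against a TLC retention parameter, then used to predict a new compound.
rm = np.array([0.10, 0.35, 0.55, 0.80, 1.10, 1.40])   # hypothetical RM values
log_koc = np.array([1.8, 2.3, 2.7, 3.2, 3.8, 4.4])    # from sorption experiments

slope, intercept = np.polyfit(rm, log_koc, 1)
predict = lambda r: slope * r + intercept
print(round(float(predict(0.95)), 2))   # -> 3.5, estimated logKOC of a new compound
```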
Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.
Ritz, Christian; Van der Vliet, Leana
2009-09-01
The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions, variance homogeneity and normality, that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although commonly used with linear regression analysis, transformation of the response variable only is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the deprecation of less desirable and less flexible analytical techniques, such as linear interpolation.
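The Box-Cox transformation recommended above has a one-line form; the sketch below fixes the power parameter lambda rather than estimating it by profile likelihood, as a real analysis would:

```python
import numpy as np

# One-line Box-Cox transform (lambda fixed here for illustration).
def box_cox(y, lam):
    y = np.asarray(y, dtype=float)
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1.0) / lam

responses = np.array([2.0, 7.0, 15.0, 40.0, 110.0])   # hypothetical positive data
print(box_cox(responses, 0.5).round(2))   # square-root-like variance stabilization
print(box_cox(responses, 0.0).round(2))   # log transform, the lambda -> 0 limit
```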
Estimating erosion risk on forest lands using improved methods of discriminant analysis
J. Lewis; R. M. Rice
1990-01-01
A population of 638 timber harvest areas in northwestern California was sampled for data related to the occurrence of critical amounts of erosion (>153 m3 within 0.81 ha). Separate analyses were done for forest roads and logged areas. Linear discriminant functions were computed in each analysis to contrast site conditions at critical plots with randomly selected...
Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi
2013-09-01
Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given on how to evaluate linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis, and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
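The comparison at the heart of this report can be mimicked on synthetic data: fit a calibration line through all levels, refit through only the lowest and highest levels, and back-calculate a mid-range QC from each line. The concentrations, response function, and 2% noise level below are hypothetical, not taken from the cited projects.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 8-level calibration curve (ng/ml) with 2% multiplicative noise
conc = np.array([1.0, 2.0, 5.0, 10.0, 50.0, 100.0, 500.0, 1000.0])
resp = (0.05 * conc + 0.002) * (1 + rng.normal(0, 0.02, conc.size))

# Multi-concentration fit vs. two-concentration fit (lowest + highest only)
b_all, a_all = np.polyfit(conc, resp, 1)                # slope, intercept
b_two, a_two = np.polyfit(conc[[0, -1]], resp[[0, -1]], 1)

qc_resp = 0.05 * 250.0                                  # noise-free mid-range QC response
qc_all = (qc_resp - a_all) / b_all                      # back-calculated concentrations
qc_two = (qc_resp - a_two) / b_two
rel_diff = abs(qc_all - qc_two) / qc_all
```

A regulated-bioanalysis implementation would normally use 1/x or 1/x² weighting for the multi-level fit; the unweighted fit here keeps the sketch short.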
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method, developed by Breiman in 1984, is non-parametric and non-linear, and works by repeatedly partitioning a sample into subgroups according to a splitting criterion. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have been published to date. Two examples using CART analysis and previously published in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
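The core step of a regression tree, the repeated partitioning mentioned above, is just an exhaustive search for the split that most reduces within-subgroup variability. A minimal numpy sketch of one such split, on a hypothetical covariate and outcome (the variable names are illustrative, not from the paper):

```python
import numpy as np

def best_split(x, y):
    """Search the threshold on x that minimizes the total within-node
    sum of squared errors of y -- the core CART partitioning step."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_sse, best_thr = np.inf, None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue
        left, right = ys[:i], ys[i:]
        sse = ((left - left.mean())**2).sum() + ((right - right.mean())**2).sum()
        if sse < best_sse:
            best_sse, best_thr = sse, (xs[i - 1] + xs[i]) / 2.0
    return best_thr

rng = np.random.default_rng(2)
age = rng.uniform(0, 60, 400)                    # hypothetical covariate
outcome = np.where(age < 30, 5.0, 9.0) + rng.normal(0, 0.5, 400)
threshold = best_split(age, outcome)             # should recover a cut near 30
```

A full CART implementation applies this recursively to each subgroup and then prunes; libraries such as scikit-learn or rpart do this out of the box.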
Kumar, K Vasanth; Sivanesan, S
2006-08-25
Pseudo second order kinetic expressions of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie were fitted to the experimental kinetic data of malachite green adsorption onto activated carbon by non-linear and linear methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie pseudo second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of experimental data in Ho's pseudo second order expression by linear and non-linear regression methods showed that Ho's pseudo second order model was a better kinetic expression when compared to the other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, q(e), was predicted from Ho's pseudo second order expression and fitted to the Langmuir, Freundlich and Redlich-Peterson expressions by both linear and non-linear methods to obtain the pseudo isotherms. The best-fitting pseudo isotherms were found to be the Langmuir and Redlich-Peterson isotherms; Redlich-Peterson is a special case of Langmuir when the constant g equals unity.
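Ho's pseudo second order model admits the well-known linearized form t/qt = 1/(k·qe²) + t/qe, so a straight-line fit of t/qt against t yields both parameters. A numpy sketch on synthetic data generated from assumed "true" parameters (not the malachite green measurements):

```python
import numpy as np

# Hypothetical true kinetic parameters
qe_true, k_true = 50.0, 0.01                 # mg/g and g/(mg min)
t = np.arange(1.0, 61.0)                     # minutes
qt = (k_true * qe_true**2 * t) / (1 + k_true * qe_true * t)  # Ho's model

rng = np.random.default_rng(3)
qt_obs = qt * (1 + rng.normal(0, 0.01, t.size))   # 1% measurement noise

# Linearized Ho fit: t/qt = 1/(k*qe^2) + t/qe
slope, intercept = np.polyfit(t, t / qt_obs, 1)
qe_est = 1.0 / slope                         # recovered equilibrium uptake
k_est = slope**2 / intercept                 # since intercept = 1/(k*qe^2)
```

The paper's point is that this linearization distorts the error structure, which is why direct non-linear least squares on qt itself gave better parameter estimates.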
Mor, Orna; Gozlan, Yael; Wax, Marina; Mileguir, Fernando; Rakovsky, Avia; Noy, Bina; Mendelson, Ella; Levy, Itzchak
2015-11-01
HIV-1 RNA monitoring, both before and during antiretroviral therapy, is an integral part of HIV management worldwide. Measurements of HIV-1 viral loads are expected to assess the copy numbers of all common HIV-1 subtypes accurately and to be equally sensitive at different viral loads. In this study, we compared for the first time the performance of the NucliSens v2.0, RealTime HIV-1, Aptima HIV-1 Quant Dx, and Xpert HIV-1 viral load assays. Plasma samples (n = 404) were selected on the basis of their NucliSens v2.0 viral load results and HIV-1 subtypes. Concordance, linear regression, and Bland-Altman plots were assessed, and mixed-model analysis was utilized to compare the analytical performance of the assays for different HIV-1 subtypes and for low and high HIV-1 copy numbers. Overall, high concordance (>83.89%), high correlation values (Pearson r values of >0.89), and good agreement were observed among all assays, although the Xpert and Aptima assays, which provided the most similar outputs (estimated mean viral loads of 2.67 log copies/ml [95% confidence interval [CI], 2.50 to 2.84 log copies/ml] and 2.68 log copies/ml [95% CI, 2.49 to 2.86 log copies/ml], respectively), correlated best with the RealTime assay (89.8% concordance, with Pearson r values of 0.97 to 0.98). These three assays exhibited greater precision than the NucliSens v2.0 assay. All assays were equally sensitive for subtype B and AG/G samples and for samples with viral loads of 1.60 to 3.00 log copies/ml. The NucliSens v2.0 assay underestimated A1 samples and those with viral loads of >3.00 log copies/ml. The RealTime assay tended to underquantify subtype C (compared to the Xpert and Aptima assays) and subtype A1 samples. The Xpert and Aptima assays were equally efficient for detection of all subtypes and viral loads, which renders these new assays most suitable for clinical HIV laboratories. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
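The Bland-Altman agreement analysis used to compare the assays reduces to a mean difference (bias) and 95% limits of agreement on paired measurements. A minimal sketch on simulated paired log10 viral loads from two hypothetical assays with an assumed +0.10 log offset (the data and offset are illustrative, not the study's):

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman statistics for paired measurements: bias and
    95% limits of agreement of the differences a - b."""
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

rng = np.random.default_rng(4)
truth = rng.uniform(1.6, 7.0, 300)                 # log10 copies/ml
assay1 = truth + rng.normal(0.00, 0.15, 300)       # unbiased assay
assay2 = truth + rng.normal(0.10, 0.15, 300)       # assumed +0.10 log offset

bias, loa_low, loa_high = bland_altman(assay1, assay2)
```

The recovered bias should sit near -0.10 log copies/ml, with limits of agreement reflecting the combined assay imprecision.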
Magnus, Maria C.; Stigum, Hein; Håberg, Siri E.; Nafstad, Per; London, Stephanie J.; Nystad, Wenche
2015-01-01
Background The immediate postnatal period is the period of the fastest growth in the entire life span and a critical period for lung development. Therefore, it is interesting to examine the association between growth during this period and childhood respiratory disorders. Methods We examined the association of peak weight and height velocity to age 36 months with maternal report of current asthma at 36 months (n = 50,311), recurrent lower respiratory tract infections (LRTIs) by 36 months (n = 47,905) and current asthma at 7 years (n = 24,827) in the Norwegian Mother and Child Cohort Study. Peak weight and height velocity was calculated using the Reed1 model through multilevel mixed-effects linear regression. Multivariable log-binomial regression was used to calculate adjusted relative risks (adj.RR) and 95% confidence intervals (CI). We also conducted a sibling pair analysis using conditional logistic regression. Results Peak weight velocity was positively associated with current asthma at 36 months [adj.RR 1.22 (95%CI: 1.18, 1.26) per standard deviation (SD) increase], recurrent LRTIs by 36 months [adj.RR 1.14 (1.10, 1.19) per SD increase] and current asthma at 7 years [adj.RR 1.13 (95%CI: 1.07, 1.19) per SD increase]. Peak height velocity was not associated with any of the respiratory disorders. The positive association of peak weight velocity and asthma at 36 months remained in the sibling pair analysis. Conclusions Higher peak weight velocity, achieved during the immediate postnatal period, increased the risk of respiratory disorders. This might be explained by an influence on neonatal lung development, shared genetic/epigenetic mechanisms and/or environmental factors. PMID:25635872
Inactivation of Mycobacterium avium subsp. paratuberculosis during cooking of hamburger patties.
Hammer, Philipp; Walte, Hans-Georg C; Matzen, Sönke; Hensel, Jann; Kiesner, Christian
2013-07-01
The role of Mycobacterium avium subsp. paratuberculosis (MAP) in Crohn's disease in humans has been debated for many years. Milk and milk products have been suggested as possible vectors for transmission since the beginning of this debate, whereas recent publications show that slaughtered cattle and their carcasses, meat, and organs can also serve as reservoirs for MAP transmission. The objective of this study was to generate heat-inactivation data for MAP during the cooking of hamburger patties. Hamburger patties of lean ground beef weighing 70 and 50 g, sterilized by irradiation and spiked with three different MAP strains at levels between 10² and 10⁶ CFU/ml, were cooked for 2, 3, 4, 5, and 6 min. Single-sided cooking with one flip was applied, and the temperatures within the patties were recorded by seven thermocouples. Counting of the surviving bacteria was performed by direct plating onto Herrold's egg yolk medium and a three-vial most-probable-number method using modified Dubos medium. There was considerable variability in temperature throughout the patties during frying. In addition, the log reduction in MAP numbers showed strong variations. In patties weighing 70 g, a considerable bacterial reduction of 4 log or more could only be achieved after 6 min of cooking. For all other cooking times, the bacterial reduction was less than 2 log. Patties weighing 50 g showed a 5-log or greater reduction after cooking times of 5 and 6 min. To determine the inactivation kinetics, a log-linear regression model was used, showing a constant decrease of MAP numbers over cooking time.
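A log-linear inactivation model of the kind fitted here regresses log10 survivor counts on cooking time; the negative reciprocal of the slope is the D-value, the minutes needed for one log10 reduction. The sketch below uses hypothetical counts generated from an assumed D-value, not the study's data:

```python
import numpy as np

# Hypothetical log-linear survival curve with a 1.2-min D-value:
# each 1.2 min of cooking removes one log10 of viable MAP
D_true = 1.2
t = np.array([0.0, 2.0, 3.0, 4.0, 5.0, 6.0])      # cooking times, min
log_n = 6.0 - t / D_true                          # log10 CFU/ml

rng = np.random.default_rng(5)
log_obs = log_n + rng.normal(0, 0.1, t.size)      # plating variability

slope, intercept = np.polyfit(t, log_obs, 1)
D_est = -1.0 / slope                              # recovered D-value, min
```

With real patty data the strong temperature variability reported above would show up as scatter around (or curvature away from) this straight line.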
Structure-activity relationships for chloro- and nitrophenol toxicity in the pollen tube growth test
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schueuermann, G.; Somashekar, R.K.; Kristen, U.
Acute toxicity of 10 chlorophenols and 10 nitrophenols with identical substitution patterns is analyzed with the pollen tube growth (PTG) test. Concentration values of 50% growth inhibition (IC50) between 0.1 and 300 mg/L indicate that the absolute sensitivity of this alternative biotest is comparable to conventional aquatic test systems. Analysis of quantitative structure-activity relationships using lipophilicity (log Kow), acidity (pKa), and quantum chemical parameters to model intrinsic acidity, solvation interactions, and nucleophilicity reveals substantial differences between the intraseries trends of log IC50. With chlorophenols, a narcotic-type relationship is derived, which, however, shows marked differences in slope and intercept when compared to reference regression equations for polar narcosis. Regression analysis of nitrophenol toxicity suggests interpretation in terms of two modes of action: oxidative uncoupling activity is associated with a pKa window from 3.8 to 8.5, and more acidic congeners with diortho-substitution show a transition from uncoupling to a narcotic mode of action with decreasing pKa and log Kow. Model calculations for phenol nucleophilicity suggest that differences in the phenol readiness for glucuronic acid conjugation as a major phase-II detoxication pathway have no direct influence on acute PTG toxicity of the compounds.
Schistosomiasis Breeding Environment Situation Analysis in Dongting Lake Area
NASA Astrophysics Data System (ADS)
Li, Chuanrong; Jia, Yuanyuan; Ma, Lingling; Liu, Zhaoyan; Qian, Yonggang
2013-01-01
Monitoring the environmental characteristics, such as vegetation and soil moisture, that shape the spatial/temporal distribution of Oncomelania hupensis (O. hupensis) is of vital importance to schistosomiasis prevention and control. In this study, the relationship between environmental factors derived from remotely sensed data and the density of O. hupensis was first analyzed by a multiple linear regression model. Second, spatial analysis of the regression residual was investigated by the semi-variogram method. Third, spatial analysis of the regression residual and the multiple linear regression model were both employed to estimate the spatial variation of O. hupensis density. Finally, the approach was used to monitor and predict the spatial and temporal variations of O. hupensis in the Dongting Lake region, China. The areas of potential O. hupensis habitats were predicted, and the influence of the Three Gorges Dam (TGD) project on the density of O. hupensis was analyzed.
Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.
2013-01-01
Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
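The additive quantile regression used in this study minimizes the check (pinball) loss rather than squared error; for a constant model that minimizer is exactly the empirical quantile. A numpy sketch of this building block, on simulated Z-scores standing in for height-for-age (not the survey data):

```python
import numpy as np

def pinball_loss(y, q, tau):
    # Check (pinball) loss: the objective minimized in quantile regression
    e = y - q
    return np.mean(np.where(e >= 0, tau * e, (tau - 1) * e))

rng = np.random.default_rng(6)
z = rng.normal(0, 1, 5000)          # stand-in for height-for-age Z-scores

tau = 0.35                          # the paper's 35% quantile
grid = np.linspace(-2, 2, 401)
losses = [pinball_loss(z, q, tau) for q in grid]
q_hat = grid[int(np.argmin(losses))]     # minimizer of the check loss
q_emp = np.quantile(z, tau)              # empirical 35% quantile
```

Full additive quantile regression replaces the constant q with a sum of smooth covariate effects (e.g. via boosting or the R package qgam/gamboostLSS-style fitting) while keeping this same loss.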
Obesity and age at diagnosis of endometrial cancer.
Nevadunsky, Nicole S; Van Arsdale, Anne; Strickler, Howard D; Moadel, Alyson; Kaur, Gurpreet; Levitt, Joshua; Girda, Eugenia; Goldfinger, Mendel; Goldberg, Gary L; Einstein, Mark H
2014-08-01
Obesity is an established risk factor for development of endometrial cancer. We hypothesized that obesity might also be associated with an earlier age at endometrial cancer diagnosis, because mechanisms that drive the obesity-endometrial cancer association might also accelerate tumorigenesis. A retrospective chart review was conducted of all cases of endometrial cancer diagnosed from 1999 to 2009 at a large medical center in New York City. The association of body mass index (BMI) with age at endometrial cancer diagnosis, comorbidities, stage, grade, and radiation treatment was examined using analysis of variance and linear regression. Overall survival by BMI category was assessed using the Kaplan-Meier method and the log-rank test. A total of 985 cases of endometrial cancer were identified. The mean age at endometrial cancer diagnosis was 67.1 years (±11.9 standard deviation) in women with a normal BMI, whereas it was 56.3 years (±10.3 standard deviation) in women with a BMI greater than 50. Age at diagnosis of endometrioid-type cancer decreased linearly with increasing BMI (y=67.89-1.86x, R=0.049, P<.001). This association persisted after multivariable adjustment (R=0.181, P<.02). A linear association between BMI and age of nonendometrioid cancers was not found (P=.12). There were no differences in overall survival by BMI category. Obesity is associated with earlier age at diagnosis of endometrioid-type endometrial cancers. Similar associations were not, however, observed with nonendometrioid cancers, consistent with different pathways of tumorigenesis. Level of Evidence: II.
Genetic Polymorphisms in RNA Binding Proteins Contribute to Breast Cancer Survival
Upadhyay, Rohit; Sanduja, Sandhya; Kaza, Vimala; Dixon, Dan A.
2012-01-01
The RNA-binding proteins TTP and HuR control expression of numerous genes associated with breast cancer pathogenesis by regulating mRNA stability. However, the role of genetic variation in the TTP (ZFP36) and HuR (ELAVL1) genes in breast cancer prognosis is unknown. A total of 251 breast cancer patients (170 Caucasians and 81 African-Americans) were enrolled and followed up from 2001 to 2011 (or until death). Genotyping was performed for 10 SNPs in the ZFP36 gene and 7 in the ELAVL1 gene. On comparing the two races with one another, significant differences were found for clinical and genetic variables. The influence of genetic polymorphisms on survival was analyzed by using Cox regression, Kaplan-Meier analysis, and the log-rank test. Univariate (Kaplan-Meier/Cox regression) and multivariate (Cox regression) analyses showed that the TTP gene polymorphism ZFP36*2 A>G was significantly associated with poor prognosis of Caucasian patients (HR = 2.03; 95% CI = 1.09–3.76; P = 0.025; log-rank P = 0.022). None of the haplotypes was associated with survival, but the presence of more than six risk genotypes in Caucasian patients was significantly associated with poor prognosis (HR = 2.42; 95% CI = 1.17–4.99; P = 0.017; log-rank P = 0.007). The effect of ZFP36*2 A>G on gene expression was evaluated in patients' tissue samples. Both TTP mRNA and protein expression were significantly decreased in ZFP36*2 G allele carriers compared to A allele homozygotes. Conversely, upregulation of the TTP-target gene COX-2 was observed in ZFP36*2 G allele carriers. Through its ability to attenuate TTP gene expression, the ZFP36*2 A>G polymorphism emerges as a novel prognostic breast cancer marker in Caucasian patients. PMID:22907529
Breivik, Cathrine Nansdal; Nilsen, Roy Miodini; Myrseth, Erling; Pedersen, Paal Henning; Varughese, Jobin K; Chaudhry, Aqeel Asghar; Lund-Johansen, Morten
2013-07-01
There are few reports about the course of vestibular schwannoma (VS) patients following gamma knife radiosurgery (GKRS) compared with the course following conservative management (CM). In this study, we present prospectively collected data of 237 patients with unilateral VS extending outside the internal acoustic canal who received either GKRS (113) or CM (124). The aim was to measure the effect of GKRS compared with the natural course on tumor growth rate and hearing loss. Secondary end points were postinclusion additional treatment, quality of life (QoL), and symptom development. The patients underwent magnetic resonance imaging scans, clinical examination, and QoL assessment by SF-36 questionnaire. Statistics were performed by using Spearman correlation coefficient, Kaplan-Meier plot, Poisson regression model, mixed linear regression models, and mixed logistic regression models. Mean follow-up time was 55.0 months (26.1 standard deviation, range 10-132). Thirteen patients were lost to follow-up. Serviceable hearing was lost in 54 of 71 (76%) (CM) and 34 of 53 (64%) (GKRS) patients during the study period (not significant, log-rank test). There was a significant reduction in tumor volume over time in the GKRS group. The need for treatment following initial GKRS or CM differed at highly significant levels (log-rank test, P < .001). Symptom and QoL development did not differ significantly between the groups. In VS patients, GKRS reduces the tumor growth rate and thereby the incidence rate of new treatment about tenfold. Hearing is lost at similar rates in both groups. Symptoms and QoL seem not to be significantly affected by GKRS.
Predictive and mechanistic multivariate linear regression models for reaction development
Santiago, Celine B.; Guo, Jing-Yao
2018-01-01
Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
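Multivariate linear calibration of composition from spectra, the approach described above, amounts to regressing a known reference value on many wavelength intensities and validating on held-out samples. The sketch below simulates spectra as linear mixtures of hypothetical pure-component profiles (none of the numbers come from the algal data set):

```python
import numpy as np

rng = np.random.default_rng(7)
n_samples, n_wavelengths = 200, 20

# Hypothetical pure-component "spectra" for lipid, protein, carbohydrate
pure = rng.uniform(0.1, 1.0, (3, n_wavelengths))
frac = rng.dirichlet(np.ones(3), n_samples)       # compositions summing to 1

# Mixture spectra with a little instrument noise
spectra = frac @ pure + rng.normal(0, 0.005, (n_samples, n_wavelengths))

# Multivariate linear calibration: regress lipid fraction on the spectrum,
# training on 150 samples and validating on the held-out 50
X = np.hstack([np.ones((n_samples, 1)), spectra])
coef, *_ = np.linalg.lstsq(X[:150], frac[:150, 0], rcond=None)
pred = X[150:] @ coef
resid = frac[150:, 0] - pred
r2 = 1 - np.sum(resid**2) / np.sum((frac[150:, 0] - frac[150:, 0].mean())**2)
```

Real NIR workflows typically use PLS rather than ordinary least squares because the number of wavelengths usually exceeds the number of samples; the validation-set R² plays the same role either way.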
L-O-S-T: Logging Optimization Selection Technique
Jerry L. Koger; Dennis B. Webster
1984-01-01
L-O-S-T is a FORTRAN computer program developed to systematically quantify, analyze, and improve user selected harvesting methods. Harvesting times and costs are computed for road construction, landing construction, system move between landings, skidding, and trucking. A linear programming formulation utilizing the relationships among marginal analysis, isoquants, and...
Devos, Stefanie; Cox, Bianca; van Lier, Tom; Nawrot, Tim S; Putman, Koen
2016-09-01
We used log-linear and log-log exposure-response (E-R) functions to model the association between PM2.5 exposure and non-elective hospitalizations for pneumonia, and estimated the attributable hospital costs by using the effect estimates obtained from both functions. We used hospital discharge data on 3519 non-elective pneumonia admissions from UZ Brussels between 2007 and 2012 and we combined a case-crossover design with distributed lag models. The annual averted pneumonia hospitalization costs for a reduction in PM2.5 exposure from the mean (21.4 μg/m³) to the WHO guideline for annual mean PM2.5 (10 μg/m³) were estimated and extrapolated for Belgium. Non-elective hospitalizations for pneumonia were significantly associated with PM2.5 exposure in both models. Using a log-linear E-R function, the estimated risk reduction for pneumonia hospitalization associated with a decrease in mean PM2.5 exposure to 10 μg/m³ was 4.9%. The corresponding estimate for the log-log model was 10.7%. These estimates translate to an annual pneumonia hospital cost saving in Belgium of €15.5 million and almost €34 million for the log-linear and log-log E-R function, respectively. Although further research is required to assess the shape of the association between PM2.5 exposure and pneumonia hospitalizations, we demonstrated that estimates for health effects and associated costs heavily depend on the assumed E-R function. These results are important for policy making, as supra-linear E-R associations imply that significant health benefits may still be obtained from additional pollution control measures in areas where PM levels have already been reduced. Copyright © 2016 Elsevier Ltd. All rights reserved.
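The two E-R shapes compared above are risk ∝ exp(β·C) (log-linear) and risk ∝ C^γ (log-log). The sketch below back-solves illustrative coefficients from the reported 4.9% and 10.7% reductions, so the round trip is exact by construction; the point is the functional forms, not the coefficient values, which are assumptions here:

```python
import numpy as np

c0, c1 = 21.4, 10.0   # mean exposure -> WHO guideline, ug/m3

# Coefficients chosen to reproduce the reported reductions (illustrative)
beta = -np.log(1 - 0.049) / (c0 - c1)         # log-linear: risk ~ exp(beta*C)
gamma = np.log(1 - 0.107) / np.log(c1 / c0)   # log-log:    risk ~ C**gamma

red_loglinear = 1 - np.exp(-beta * (c0 - c1))   # fractional risk reduction
red_loglog = 1 - (c1 / c0)**gamma
```

Note why the log-log (supra-linear) form gives the larger estimate: on a power curve the relative risk change per μg/m³ grows as concentration falls, so the same coefficient fit attributes more of the burden to the low-concentration range being cleaned up.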
Log-linear human chorionic gonadotropin elimination in cases of retained placenta percreta.
Stitely, Michael L; Gerard Jackson, M; Holls, William H
2014-02-01
To describe the human chorionic gonadotropin (hCG) elimination rate in patients with intentionally retained placenta percreta. Medical records for cases of placenta percreta with intentional retention of the placenta were reviewed. The natural log of the hCG levels was plotted versus time, and the elimination rate equations were then derived. The hCG elimination rate equations were log-linear in three cases individually (R² = 0.96-0.99) and in aggregate (R² = 0.92). The mean half-life of hCG elimination was 146.3 h (6.1 days). The elimination of hCG in patients with intentionally retained placenta percreta is consistent with a two-compartment elimination model. The hCG elimination in retained placenta percreta is predictable in a log-linear manner that is similar to other reports of retained abnormally adherent placentae treated with or without methotrexate.
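Log-linear elimination means ln(hCG) falls as a straight line in time, with half-life ln(2)/k read off the slope. The sketch below simulates serial hCG values from the reported 146.3-h half-life (the starting level, sampling schedule, and noise are hypothetical) and recovers the half-life by regression:

```python
import numpy as np

rng = np.random.default_rng(8)
half_life = 146.3                      # hours, as reported
k = np.log(2) / half_life              # first-order elimination constant

t = np.arange(0, 1000, 48.0)           # hypothetical sampling every 2 days
hcg = 50000.0 * np.exp(-k * t) * (1 + rng.normal(0, 0.05, t.size))

# Log-linear fit: ln(hCG) vs. time; slope = -k
slope, intercept = np.polyfit(t, np.log(hcg), 1)
t_half_est = np.log(2) / -slope        # recovered half-life, hours
```

A true two-compartment profile would show an initial steeper distribution phase before settling onto this terminal log-linear slope; the single-exponential simulation here only illustrates the terminal phase.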
Local linear regression for function learning: an analysis based on sample discrepancy.
Cervellera, Cristiano; Macciò, Danilo
2014-11-01
Local linear regression models, a kind of nonparametric structure that locally performs a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and to compute the empirical risk is considered. This makes it possible to treat uniformly the case in which the samples come from a random external source and the one in which the input space can be freely explored. Both consistency of the ERM procedure and the approximating capabilities of the estimator are analyzed, with conditions proved to ensure convergence. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided.
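A local linear estimator solves a small weighted least-squares problem around each query point; the intercept of the local line is the fitted value there. A minimal one-dimensional sketch (the kernel, bandwidth, and target function are illustrative choices, not the paper's setup):

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Locally weighted linear fit at query point x0 with a Gaussian
    kernel of bandwidth h; returns the local estimate of the target."""
    w = np.exp(-0.5 * ((x - x0) / h)**2)
    X = np.column_stack([np.ones_like(x), x - x0])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0]                       # intercept = fitted value at x0

# Evenly spaced design, in the spirit of a low-discrepancy point set
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

est = local_linear(x, y, np.pi / 2, h=0.1)   # true value: sin(pi/2) = 1
```

The paper's discrepancy result matches the intuition here: the more evenly the observation points cover the input space, the better each local neighborhood is populated and the smaller the estimation error.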
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
Violanti, John M; Fekedulegn, Desta; Andrew, Michael E; Hartley, Tara A; Charles, Luenda E; Miller, Diane B; Burchfiel, Cecil M
2017-01-01
Police officers encounter unpredictable, evolving, and escalating stressful demands in their work. Utilizing the Spielberger Police Stress Survey (60-item instrument for assessing specific conditions or events considered to be stressors in police work), the present study examined the association of the top five highly rated and bottom five least rated work stressors among police officers with their awakening cortisol pattern. Participants were police officers enrolled in the Buffalo Cardio-Metabolic Occupational Police Stress (BCOPS) study (n=338). For each group, the total stress index (product of rating and frequency of the stressor) was calculated. Participants collected saliva by means of Salivettes at four time points: on awakening, 15, 30 and 45min after waking to examine the cortisol awakening response (CAR). Saliva samples were analyzed for free cortisol concentrations. A slope reflecting the awakening pattern of cortisol over time was estimated by fitting a linear regression model relating cortisol in log-scale to time of collection. The slope served as the outcome variable. Analysis of covariance, regression, and repeated measures models were used to determine if there was an association of the stress index with the waking cortisol pattern. There was a significant negative linear association between total stress index of the five highest stressful events and slope of the awakening cortisol regression line (trend p-value=0.0024). As the stress index increased, the pattern of the awakening cortisol regression line tended to flatten. Officers with a zero stress index showed a steep and steady increase in cortisol from baseline (which is often observed) while officers with a moderate or high stress index showed a dampened or flatter response over time. Conversely, the total stress index of the five least rated events was not significantly associated with the awakening cortisol pattern. 
The study suggests that police events or conditions considered highly stressful by the officers may be associated with disturbances of the typical awakening cortisol pattern. The results are consistent with previous research where chronic exposure to stressors is associated with a diminished awakening cortisol response pattern. Copyright © 2016 Elsevier Ltd. All rights reserved.
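The slope outcome described above is a least-squares fit of log cortisol against collection time. A minimal sketch in Python; the cortisol values are hypothetical, chosen only to contrast a typical awakening rise with a dampened, flatter response:

```python
import math

def car_slope(minutes, cortisol):
    """Least-squares slope of log(cortisol) regressed on collection time.

    A near-zero slope corresponds to the dampened awakening response
    described for high-stress officers; a positive slope to the
    typical rising pattern.
    """
    y = [math.log(c) for c in cortisol]
    n = len(minutes)
    mx = sum(minutes) / n
    my = sum(y) / n
    sxy = sum((x - mx) * (yi - my) for x, yi in zip(minutes, y))
    sxx = sum((x - mx) ** 2 for x in minutes)
    return sxy / sxx

# Hypothetical salivary cortisol (nmol/L) at 0, 15, 30, 45 min after waking
times = [0, 15, 30, 45]
rising = [8.0, 12.0, 15.0, 16.0]   # steep, steady awakening rise
flat = [8.0, 8.2, 8.1, 7.9]        # dampened response

print(car_slope(times, rising) > car_slope(times, flat))  # True
```

The per-subject slopes computed this way would then serve as the outcome in the covariance and regression models the abstract describes.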
ERIC Educational Resources Information Center
Hicks, Catherine
2018-01-01
Purpose: This paper aims to explore predicting employee learning activity via employee characteristics and usage for two online learning tools. Design/methodology/approach: Statistical analysis focused on observational data collected from user logs. Data are analyzed via regression models. Findings: Findings are presented for over 40,000…
Schønning, Kristian; Pedersen, Martin Schou; Johansen, Kim; Landt, Bodil; Nielsen, Lone Gilmor; Weis, Nina; Westh, Henrik
2017-10-01
Chronic hepatitis C virus (HCV) infection can be effectively treated with directly acting antiviral (DAA) therapy. Measurement of HCV RNA is used to evaluate patient compliance and virological response during and after treatment. To compare the analytical performance of the Aptima HCV Quant Dx Assay (Aptima) and the COBAS Ampliprep/COBAS TaqMan HCV Test v2.0 (CAPCTMv2) for the quantification of HCV RNA in plasma samples, and compare the clinical utility of the two tests in patients undergoing treatment with DAA therapy. Analytical performance was evaluated on two sets of plasma samples: 125 genotyped samples and 172 samples referred for quantification of HCV RNA. Furthermore, performance was evaluated using dilutions series of four samples containing HCV genotype 1a, 2b, 3a, and 4a, respectively. Clinical utility was evaluated on 118 plasma samples obtained from 13 patients undergoing treatment with DAAs. Deming regression of results from 187 plasma samples with HCV RNA >2 Log IU/mL indicated that the Aptima assay quantified higher than the CAPCTMv2 test for HCV RNA >4.9 Log IU/mL. The linearity of the Aptima assay was excellent across dilution series of four HCV genotypes (slope of the regression line: 1.00-1.02). The Aptima assay detected significantly more replicates below targeted 2 Log IU/mL than the CAPCTMv2 test, and yielded clearly interpretable results when used to analyze samples from patients treated with DAAs. The analytical performance of the Aptima assay makes it well suited for monitoring patients with chronic HCV infection undergoing antiviral treatment. Copyright © 2017 Elsevier B.V. All rights reserved.
Log and tree sawing times for hardwood mills
Everette D. Rast
1974-01-01
Data on 6,850 logs and 1,181 trees were analyzed to predict sawing times. For both logs and trees, regression equations were derived that express (in minutes) sawing time per log or tree and per Mbf. For trees, merchantable height is expressed in number of logs as well as in feet. One of the major uses for the tables of average sawing times is as a benchmark against...
Gut microbiota interactions with the immunomodulatory role of vitamin D in normal individuals.
Luthold, Renata V; Fernandes, Gabriel R; Franco-de-Moraes, Ana Carolina; Folchetti, Luciana G D; Ferreira, Sandra Roberta G
2017-04-01
Due to immunomodulatory properties, vitamin D status has been implicated in several diseases beyond the skeletal disorders. There is evidence that its deficiency deteriorates the gut barrier, favoring translocation of endotoxins into the circulation and systemic inflammation. Few studies have investigated whether the relationship between vitamin D status and metabolic disorders is mediated by the gut microbiota composition. We examined the association of vitamin D intake and circulating levels of 25(OH)D with gut microbiota composition, inflammatory markers and biochemical profile in healthy individuals. In this cross-sectional analysis, 150 young healthy adults were stratified into tertiles of intake and concentrations of vitamin D, and their clinical and inflammatory profiles were compared. DESeq2 was used for comparisons of microbiota composition, and the log2 fold changes (log2FC) represented the comparison against the reference level. The association between 25(OH)D and fecal microbiota (16S rRNA sequencing, V4 region) was tested by multiple linear regression. Vitamin D intake was associated with its concentration (r=0.220, p=0.008). There were no significant differences in clinical and inflammatory variables across tertiles of intake. However, lipopolysaccharides increased with the reduction of 25(OH)D (p-trend <0.05). Prevotella was more abundant (log2FC 1.67, p<0.01), while Haemophilus and Veillonella were less abundant (log2FC -2.92 and -1.46, p<0.01, respectively) in the subset with the highest vitamin D intake (reference) than in the other subset (first plus second tertiles). CRP (r=-0.170, p=0.039), E-selectin (r=-0.220, p=0.007) and abundances of Coprococcus (r=-0.215, p=0.008) and Bifidobacterium (r=-0.269, p=0.001) were inversely correlated with 25(OH)D. 
After adjusting for age, sex, season and BMI, 25(OH)D remained inversely associated with Coprococcus (β=-9.414, p=0.045) and Bifidobacterium (β=-1.881, p=0.051), but significance disappeared after the addition of inflammatory markers to the regression models. The role of vitamin D in the maintenance of immune homeostasis seems to occur in part through interaction with the gut microbiota. The attenuation of the associations with bacterial genera by inflammatory markers suggests that inflammation participates in part in the relationship between the gut microbiota and vitamin D concentration. Studies with an appropriate design are necessary to address the hypotheses raised in the current study. Copyright © 2017 Elsevier Inc. All rights reserved.
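A covariate-adjusted association of the kind reported above (25(OH)D vs. a genus abundance, adjusting for e.g. BMI) reduces to multiple linear regression. The solver below is a minimal pure-Python sketch of ordinary least squares via the normal equations; the design matrix and outcome in the usage example are entirely hypothetical:

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y,
    solved by Gaussian elimination; adequate for a handful of covariates."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    for col in range(k):  # forward elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k  # back substitution
    for r in reversed(range(k)):
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    return beta

# Hypothetical rows: [intercept, 25(OH)D in ng/ml, BMI]; outcome = abundance.
X = [[1, 20, 22], [1, 25, 24], [1, 30, 21], [1, 35, 27], [1, 40, 23]]
y = [5.0, 4.4, 3.6, 3.4, 2.6]
beta = ols(X, y)
print(round(beta[1], 3))  # adjusted slope for 25(OH)D
```

In practice one would also add the inflammatory markers as columns, mirroring the attenuation step the authors describe.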
Vilar, Santiago; Chakrabarti, Mayukh; Costanzi, Stefano
2010-01-01
The distribution of compounds between blood and brain is a very important consideration for new candidate drug molecules. In this paper, we describe the derivation of two linear discriminant analysis (LDA) models for the prediction of passive blood-brain partitioning, expressed in terms of log BB values. The models are based on computationally derived physicochemical descriptors, namely the octanol/water partition coefficient (log P), the topological polar surface area (TPSA) and the total number of acidic and basic atoms, and were obtained using a homogeneous training set of 307 compounds, for all of which the published experimental log BB data had been determined in vivo. In particular, since molecules with log BB > 0.3 cross the blood-brain barrier (BBB) readily while molecules with log BB < −1 are poorly distributed to the brain, on the basis of these thresholds we derived two distinct models, both of which show a percentage of good classification of about 80%. Notably, the predictive power of our models was confirmed by the analysis of a large external dataset of compounds with reported activity on the central nervous system (CNS) or lack thereof. The calculation of straightforward physicochemical descriptors is the only requirement for the prediction of the log BB of novel compounds through our models, which can be conveniently applied in conjunction with drug design and virtual screenings. PMID:20427217
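The classification step above can be illustrated with a minimal two-descriptor Fisher LDA. This is only a sketch: the descriptor pairs below are hypothetical, and the study's actual models use log P, TPSA, and acidic/basic atom counts fitted on 307 compounds:

```python
def fisher_lda(class0, class1):
    """Two-class Fisher LDA on two descriptors (e.g. logP and TPSA).

    Returns (w, threshold); classify x as class 1 when dot(w, x) > threshold.
    """
    def mean(pts):
        n = len(pts)
        return [sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n]

    def scatter(pts, m):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for p in pts:
            d = [p[0] - m[0], p[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    Sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = Sw[0][0] * Sw[1][1] - Sw[0][1] * Sw[1][0]
    inv = [[Sw[1][1] / det, -Sw[0][1] / det],
           [-Sw[1][0] / det, Sw[0][0] / det]]
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * dm[0] + inv[0][1] * dm[1],
         inv[1][0] * dm[0] + inv[1][1] * dm[1]]
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]

# Hypothetical (logP, TPSA) pairs: poorly distributed vs. BBB-permeant
bbb_neg = [(0.5, 120.0), (1.0, 140.0), (0.2, 110.0), (0.8, 130.0)]
bbb_pos = [(3.0, 40.0), (2.5, 30.0), (3.5, 50.0), (2.8, 35.0)]
w, thr = fisher_lda(bbb_neg, bbb_pos)
score = lambda x: w[0] * x[0] + w[1] * x[1]
print(all(score(x) > thr for x in bbb_pos))  # True
print(all(score(x) < thr for x in bbb_neg))  # True
```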
Nualkaekul, Sawaminee; Salmeron, Ivan; Charalampopoulos, Dimitris
2011-12-01
The survival of Bifidobacterium longum NCIMB 8809 was studied during refrigerated storage for 6 weeks in model solutions, based on which a mathematical model was constructed describing cell survival as a function of pH, citric acid, protein and dietary fibre. A Central Composite Design (CCD) was developed studying the influence of four factors at three levels, i.e., pH (3.2-4), citric acid (2-15 g/l), protein (0-10 g/l), and dietary fibre (0-8 g/l). In total, 31 experimental runs were carried out. Analysis of variance (ANOVA) of the regression model demonstrated that the model fitted the data well. From the regression coefficients it was deduced that all four factors had a statistically significant (P<0.05) negative effect on the log decrease [log10 N(0 weeks) - log10 N(6 weeks)], with the pH and citric acid being the most influential ones. Cell survival during storage was also investigated in various types of juices, including orange, grapefruit, blackcurrant, pineapple, pomegranate and strawberry. The highest cell survival (less than 0.4 log decrease) after 6 weeks of storage was observed in orange and pineapple, both of which had a pH of about 3.8. Although the pH of grapefruit and blackcurrant was similar (pH ∼3.2), the log decrease of the former was ∼0.5 log, whereas that of the latter was ∼0.7 log. One reason for this could be the fact that grapefruit contained a high amount of citric acid (15.3 g/l). The log decrease in pomegranate and strawberry juices was extremely high (∼8 logs). The mathematical model was able to predict adequately the cell survival in orange, grapefruit, blackcurrant, and pineapple juices. However, the model failed to predict the cell survival in pomegranate and strawberry, most likely due to the very high levels of phenolic compounds in these two juices. Copyright © 2011 Elsevier Ltd. All rights reserved.
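The "log decrease" outcome modeled above is simply the drop in log10 viable counts over storage. A short sketch with hypothetical CFU counts chosen to match the magnitudes reported:

```python
import math

def log_decrease(n_start, n_end):
    """Viability loss over storage: log10(N at 0 weeks) - log10(N at 6 weeks)."""
    return math.log10(n_start) - math.log10(n_end)

# Hypothetical counts (CFU/ml) before and after 6 weeks of refrigerated storage
print(round(log_decrease(1e9, 5e8), 2))  # 0.3 -> good survival, orange-like
print(round(log_decrease(1e9, 10), 2))   # 8.0 -> pomegranate/strawberry-like
```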
NASA Astrophysics Data System (ADS)
Abunama, Taher; Othman, Faridah
2017-06-01
Analysing the fluctuations of wastewater inflow rates in sewage treatment plants (STPs) is essential to guarantee a sufficient treatment of wastewater before discharging it to the environment. The main objectives of this study are to statistically analyze and forecast the wastewater inflow rates into the Bandar Tun Razak STP in Kuala Lumpur, Malaysia. A time series analysis of three years' weekly influent data (156 weeks) was conducted using the Auto-Regressive Integrated Moving Average (ARIMA) model. Various combinations of ARIMA orders (p, d, q) were tried to select the best-fitting model, which was then used to forecast the wastewater inflow rates. Linear regression analysis was applied to test the correlation between the observed and predicted influents. The ARIMA (3, 1, 3) model was selected for its highest R-square and lowest normalized Bayesian Information Criterion (BIC) value, and the wastewater inflow rates were accordingly forecast for an additional 52 weeks. The linear regression analysis between the observed and predicted values of the wastewater inflow rates showed a positive linear correlation with a coefficient of 0.831.
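The final observed-vs-predicted check above reduces to a correlation coefficient. A minimal sketch of that step only (the ARIMA fit itself would normally come from a statistics library, and the inflow series below is hypothetical):

```python
import math

def pearson_r(obs, pred):
    """Pearson correlation between observed and model-predicted inflows."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    sxy = sum((a - mo) * (b - mp) for a, b in zip(obs, pred))
    sxx = sum((a - mo) ** 2 for a in obs)
    syy = sum((b - mp) ** 2 for b in pred)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical weekly inflow rates (m3/day) and model predictions
observed = [410, 395, 430, 450, 420, 405, 445]
predicted = [400, 405, 425, 440, 430, 400, 450]
print(round(pearson_r(observed, predicted), 2))
```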
Dinç, Erdal; Ozdemir, Abdil
2005-01-01
A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of the multivariate chromatographic calibration technique is based on linear regression equations constructed using the relationship between concentration and peak area at a set of five wavelengths. The algorithm of this calibration model, having a simple mathematical content, is briefly described. This approach is a powerful mathematical tool for optimal chromatographic multivariate calibration and for eliminating fluctuations arising from instrumental and experimental conditions. The multivariate chromatographic calibration involves the reduction of multivariate linear regression functions to a univariate data set. The validation of the model was carried out by analyzing various synthetic binary mixtures and using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The obtained results were compared with those obtained by a classical HPLC method. It was observed that the proposed multivariate chromatographic calibration gives better results than classical HPLC.
Paillet, Frederick L.; Crowder, R.E.
1996-01-01
Quantitative analysis of geophysical logs in ground-water studies often involves at least as broad a range of applications and variation in lithology as is typically encountered in petroleum exploration, making such logs difficult to calibrate and complicating inversion problem formulation. At the same time, data inversion and analysis depend on inversion model formulation and refinement, so that log interpretation cannot be deferred to a geophysical log specialist unless active involvement with interpretation can be maintained by such an expert over the lifetime of the project. We propose a generalized log-interpretation procedure designed to guide hydrogeologists in the interpretation of geophysical logs, and in the integration of log data into ground-water models that may be systematically refined and improved in an iterative way. The procedure is designed to maximize the effective use of three primary contributions from geophysical logs: (1) The continuous depth scale of the measurements along the well bore; (2) The in situ measurement of lithologic properties and the correlation with hydraulic properties of the formations over a finite sample volume; and (3) Multiple independent measurements that can potentially be inverted for multiple physical or hydraulic properties of interest. The approach is formulated in the context of geophysical inversion theory, and is designed to be interfaced with surface geophysical soundings and conventional hydraulic testing. The step-by-step procedures given in our generalized interpretation and inversion technique are based on both qualitative analysis designed to assist formulation of the interpretation model, and quantitative analysis used to assign numerical values to model parameters. The approach bases a decision as to whether quantitative inversion is statistically warranted by formulating an over-determined inversion. 
If no such inversion is consistent with the inversion model, quantitative inversion is judged not possible with the given data set. Additional statistical criteria such as the statistical significance of regressions are used to guide the subsequent calibration of geophysical data in terms of hydraulic variables in those situations where quantitative data inversion is considered appropriate.
Zivlas, Christos; Triposkiadis, Filippos; Psarras, Stelios; Giamouzis, Gregory; Skoularigis, Ioannis; Chryssanthopoulos, Stavros; Kapelouzou, Alkistis; Ramcharitar, Steve; Barnes, Edward; Papasteriadis, Evangelos; Cokkinos, Dennis
2017-11-01
Background: Left atrial (LA) enlargement plays an important role in the development of heart failure (HF) and is a robust prognostic factor. Fibrotic processes have also been advocated to evoke HF through finite signalling proteins. We examined the association of two such proteins, cystatin C (CysC) and galectin-3 (Gal-3), and other clinical, echocardiographic and biochemical parameters with LA volume index (LAVi) in patients with HF with severely impaired left ventricular ejection fraction (LVEF). Severe renal, liver, autoimmune disease and cancer were exclusion criteria. A total of 40 patients with HF (31 men, age 66.6 ± 1.7) with LVEF = 25.4 ± 0.9% were divided into two groups according to the mean LAVi (51.03 ± 2.9 ml/m2) calculated by two-dimensional transthoracic echocardiography. Greater LAVi was positively associated with LV end-diastolic volume (p = 0.017), LV end-systolic volume (p = 0.025), mitral regurgitant volume (MRV) (p = 0.001), right ventricular systolic pressure (RVSP) (p < 0.001), restrictive diastolic filling pattern (p = 0.003) and atrial fibrillation (p = 0.005). Plasma CysC was positively correlated with LAVi (R2 = 0.135, p = 0.019) and log-transformed plasma Gal-3 (R2 = 0.109, p = 0.042) by simple linear regression analysis. Stepwise multiple linear regression analysis showed that only MRV (t = 2.236, p = 0.032), CysC (t = 2.467, p = 0.019) and RVSP (t = 2.155, p = 0.038) were significant predictors of LAVi. Apart from known determinants of LAVi, circulating CysC and Gal-3 were associated with greater LA dilatation in patients with HF with reduced LVEF. Interestingly, the correlation between these two fibrotic proteins was positive.
Multivariate regression model for predicting lumber grade volumes of northern red oak sawlogs
Daniel A. Yaussy; Robert L. Brisbin
1983-01-01
A multivariate regression model was developed to predict green board-foot yields for the seven common factory lumber grades processed from northern red oak (Quercus rubra L.) factory grade logs. The model uses the standard log measurements of grade, scaling diameter, length, and percent defect. It was validated with an independent data set. The model...
Xu, Feng; Liang, Xinmiao; Lin, Bingcheng; Su, Fan; Schramm, Karl-Werner; Kettrup, Antonius
2002-08-01
The capacity factors of a series of hydrophobic organic compounds (HOCs) were measured in soil leaching column chromatography (SLCC) on a soil column, and in reversed-phase liquid chromatography on a C18 column with different volumetric fractions (φ) of methanol in methanol-water mixtures. A general equation of linear solvation energy relationships, log(XYZ) = XYZ0 + mV_I/100 + sπ* + bβ_m + aα_m, was applied to analyze capacity factors (k'), soil organic partition coefficients (Koc) and octanol-water partition coefficients (P). The analyses exhibited high accuracy. The chief solute factors that control log Koc, log P, and log k' (on soil and on C18) are the solute size (V_I/100) and hydrogen-bond basicity (β_m). Less important solute factors are the dipolarity/polarizability (π*) and hydrogen-bond acidity (α_m). Log k' on soil and log Koc have similar signs in the four fitting coefficients (m, s, b and a) and similar ratios (m:s:b:a), while log k' on C18 and log P have similar signs in coefficients (m, s, b and a) and similar ratios (m:s:b:a). Consequently, log k' values on C18 correlate well with log P (r > 0.97), while log k' values on soil correlate well with log Koc (r > 0.98). Two Koc estimation methods were developed, one through solute solvatochromic parameters, and the other through correlations with k' on soil. For HOCs, a linear relationship between the logarithmic capacity factor and methanol composition in methanol-water mixtures could also be derived in SLCC.
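The LSER is a simple linear form and can be evaluated directly once coefficients are fitted. In this sketch all coefficient and descriptor values are hypothetical placeholders, not the fitted values from the study:

```python
def lser(xyz0, m, s, b, a, v_i, pi_star, beta_m, alpha_m):
    """Linear solvation energy relationship:
    log(XYZ) = XYZ0 + m*V_I/100 + s*pi* + b*beta_m + a*alpha_m
    """
    return xyz0 + m * v_i / 100 + s * pi_star + b * beta_m + a * alpha_m

# Hypothetical coefficients (xyz0, m, s, b, a) and solute descriptors:
print(round(lser(0.2, 3.0, -0.5, -2.0, -0.1,
                 v_i=80.0, pi_star=0.5, beta_m=0.3, alpha_m=0.0), 6))  # 1.75
```

The dominant positive m (solute size) and negative b (hydrogen-bond basicity) terms mirror the sign pattern reported for log Koc and log k' on soil.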
Liu, Yan; Salvendy, Gavriel
2009-05-01
This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies, and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on the five most widely used statistical analysis tools are discussed and illustrated: correlation; ANOVA; linear regression; factor analysis; linear discriminant analysis. It is shown that measurement errors can greatly attenuate correlations between variables, reduce the statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, underrate the explanatory contributions of the most important factors in factor analysis, and depreciate the significance of the discriminant function and the discrimination abilities of individual variables in discriminant analysis. The discussion is restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement errors, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experimental results, which the authors believe is critical to progress in theory development and cumulative knowledge in the ergonomics field.
Catching errors with patient-specific pretreatment machine log file analysis.
Rangaraj, Dharanipathy; Zhu, Mingyao; Yang, Deshan; Palaniswaamy, Geethpriya; Yaddanapudi, Sridhar; Wooten, Omar H; Brame, Scott; Mutic, Sasa
2013-01-01
A robust, efficient, and reliable quality assurance (QA) process is highly desired for modern external beam radiation therapy treatments. Here, we report the results of a semiautomatic, pretreatment, patient-specific QA process based on dynamic machine log file analysis clinically implemented for intensity modulated radiation therapy (IMRT) treatments delivered by high energy linear accelerators (Varian 2100/2300 EX, Trilogy, iX-D, Varian Medical Systems Inc, Palo Alto, CA). The multileaf collimator (MLC) machine log files are called Dynalog by Varian. Using an in-house developed computer program called "Dynalog QA," we automatically compare the beam delivery parameters in the log files that are generated during pretreatment point dose verification measurements, with the treatment plan to determine any discrepancies in IMRT deliveries. Fluence maps are constructed and compared between the delivered and planned beams. Since clinical introduction in June 2009, 912 machine log file QA analyses were performed by the end of 2010. Among these, 14 errors causing dosimetric deviation were detected and required further investigation and intervention. These errors were the result of human operating mistakes, flawed treatment planning, and data modification during plan file transfer. Minor errors were also reported in 174 other log file analyses, some of which stemmed from false positives and unreliable results; the origins of these are discussed herein. It has been demonstrated that machine log file analysis is a robust, efficient, and reliable QA process capable of detecting errors originating from human mistakes, flawed planning, and data transfer problems. The possibility of detecting these errors is low using point and planar dosimetric measurements. Copyright © 2013 American Society for Radiation Oncology. Published by Elsevier Inc. All rights reserved.
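At its core, such a log-file QA check is a tolerance comparison between planned and delivered machine parameters. A toy sketch of that pattern; the leaf names, positions, and 0.5 mm tolerance are illustrative, not Varian's actual Dynalog format:

```python
def check_delivery(planned, delivered, tol=0.5):
    """Flag leaves whose delivered position deviates from the plan by more
    than `tol` (mm), or that are missing from the delivery log. A stand-in
    for the plan-vs-log comparison a QA program would perform."""
    errors = []
    for leaf, plan_pos in planned.items():
        actual = delivered.get(leaf)
        if actual is None or abs(actual - plan_pos) > tol:
            errors.append(leaf)
    return errors

plan = {"A1": 10.0, "A2": 12.5, "A3": 15.0}
measured = {"A1": 10.1, "A2": 14.0, "A3": 15.2}
print(check_delivery(plan, measured))  # ['A2'] deviates by 1.5 mm
```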
Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data.
Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard
2017-04-01
To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field in the elderly. When refractive error from both eyes was analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI -0.03 to 0.32D, p = 0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with a narrower 95% CI (0.01 to 0.28D, p = 0.03). Standard regression for visual field data from both eyes provided biased estimates of the standard error (generally underestimated) and smaller p-values, while analysis of the worse eye provided larger p-values than mixed effects models and marginal models. In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision.
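The penalty for ignoring inter-eye correlation can be illustrated with the standard design-effect approximation for subject-level covariates. This is a simplification of the mixed-model machinery the tutorial describes, and the numbers are hypothetical:

```python
import math

def design_effect_se(naive_se, n_eyes_per_subject, rho):
    """Cluster-adjusted standard error when both eyes of a subject are
    analyzed: SE_adj = SE_naive * sqrt(1 + (m - 1) * rho), where m is the
    cluster size (2 eyes) and rho is the inter-eye correlation."""
    return naive_se * math.sqrt(1 + (n_eyes_per_subject - 1) * rho)

# With strong inter-eye correlation (rho = 0.7), the naive SE understates
# the true uncertainty by about 30%:
print(round(design_effect_se(0.10, 2, 0.7), 3))  # 0.13
```

This is why the naive CIs above were too narrow in the wrong direction to trust, and why the mixed/marginal models are the appropriate analysis.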
Effect of Contact Damage on the Strength of Ceramic Materials.
1982-10-01
Variables that are important to erosion are identified, and a multivariate linear regression analysis is used to fit the data to the dimensional analysis. The exponents of Equations 7 and 8 are determined by a multivariable regression analysis of room-temperature data, together with the standard error of each computed regression coefficient.
Brown, A M
2001-06-01
The objective of the present study was to introduce a simple, easily understood method for carrying out non-linear regression analysis based on user-input functions. While it is relatively straightforward to fit data with simple functions such as linear or logarithmic functions, fitting data with more complicated non-linear functions is more difficult. Commercial specialist programmes are available that will carry out this analysis, but these programmes are expensive and are not intuitive to learn. An alternative method described here is to use the SOLVER function of the ubiquitous spreadsheet programme Microsoft Excel, which employs an iterative least squares fitting routine to produce the optimal goodness of fit between data and function. The intent of this paper is to lead the reader through an easily understood step-by-step guide to implementing this method, which can be applied to any function in the form y=f(x), and is well suited to fast, reliable analysis of data in all fields of biology.
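The same iterative least-squares idea can be reproduced outside a spreadsheet. The routine below is a deliberately crude pure-Python stand-in (a pattern search with step halving, not Excel's actual GRG solver), shown recovering the parameters of a user-supplied function:

```python
def fit_least_squares(f, xs, ys, params, step=0.5, iters=200):
    """Crude iterative least squares in the spirit of SOLVER: nudge each
    parameter in whichever direction lowers the sum of squared errors,
    halving the step size whenever no move helps."""
    def sse(p):
        return sum((y - f(x, p)) ** 2 for x, y in zip(xs, ys))

    best = sse(params)
    for _ in range(iters):
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params[:]
                trial[i] += delta
                s = sse(trial)
                if s < best:
                    best, params, improved = s, trial, True
        if not improved:
            step /= 2
    return params

# Example: recover a and b in y = a*x + b from exact data (a = 2, b = 1)
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a, b = fit_least_squares(lambda x, p: p[0] * x + p[1], xs, ys, [0.0, 0.0])
print(round(a, 2), round(b, 2))  # 2.0 1.0
```

Any function of the form y=f(x) can be passed in, which is the point of the paper's spreadsheet approach; for serious work a library optimizer would be preferable.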
Concentration-response of short-term ozone exposure and hospital admissions for asthma in Texas.
Zu, Ke; Liu, Xiaobin; Shi, Liuhua; Tao, Ge; Loftus, Christine T; Lange, Sabine; Goodman, Julie E
2017-07-01
Short-term exposure to ozone has been associated with asthma hospital admissions (HA) and emergency department (ED) visits, but the shape of the concentration-response (C-R) curve is unclear. We conducted a time series analysis of asthma HAs and ambient ozone concentrations in six metropolitan areas in Texas from 2001 to 2013. Using generalized linear regression models, we estimated the effect of daily 8-hour maximum ozone concentrations on asthma HAs for all ages combined, and for those aged 5-14, 15-64, and 65+ years. We fit penalized regression splines to evaluate the shape of the C-R curves. Using a log-linear model, the estimated risk per 10 ppb increase in average daily 8-hour maximum ozone concentrations was highest for children (relative risk [RR]=1.047, 95% confidence interval [CI]: 1.025-1.069), lower for younger adults (RR=1.018, 95% CI: 1.005-1.032), and null for older adults (RR=1.002, 95% CI: 0.981-1.023). However, penalized spline models demonstrated significant nonlinear C-R relationships for all ages combined, children, and younger adults, indicating the existence of thresholds. We did not observe an increased risk of asthma HAs until average daily 8-hour maximum ozone concentrations exceeded approximately 40 ppb. Ozone and asthma HAs are significantly associated with each other; susceptibility to ozone is age-dependent, with children at highest risk. C-R relationships between average daily 8-hour maximum ozone concentrations and asthma HAs are significantly curvilinear for all ages combined, children, and younger adults. These nonlinear relationships, as well as the lack of relationship between average daily 8-hour maximum and peak ozone concentrations, have important implications for assessing risks to human health in regulatory settings. Copyright © 2017. Published by Elsevier Ltd.
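The per-10 ppb relative risks above follow directly from the log-linear (Poisson) coefficient: RR = exp(beta × increment). The coefficient value in this sketch is back-calculated from the reported children's RR purely for illustration:

```python
import math

def rr_per_increment(beta, increment=10):
    """Relative risk per `increment` exposure units from a log-linear
    (Poisson) regression coefficient: RR = exp(beta * increment)."""
    return math.exp(beta * increment)

# A coefficient of ~0.00459 per ppb corresponds to RR ~1.047 per 10 ppb,
# the value reported here for children.
print(round(rr_per_increment(0.00459), 3))  # 1.047
```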
Lamm, Ryan; Mathews, Steven N; Yang, Jie; Park, Jihye; Talamini, Mark; Pryor, Aurora D; Telem, Dana
2017-05-01
This study sought to characterize in-hospital post-colectomy mortality in New York State. One hundred sixty thousand seven hundred ninety-two patients who underwent colectomy from 1995 to 2014 were analyzed from the all-payer New York Statewide Planning and Research Cooperative System (SPARCS) database. Linear trends of in-hospital mortality rate over 20 years were calculated using log-linear regression models. Chi-square tests were used to compare categorical variables between patients. Multivariable regression models were further used to calculate risk of in-hospital mortality associated with specific demographics, co-morbidities, and perioperative complications. From 1995 to 2014, 7308 (4.5%) in-hospital mortalities occurred within 30 days of surgery. Over this time period, the rate of overall in-hospital post-colectomy mortality decreased by 3.3% (6.3 to 3%, p < 0.0001). The risk of in-hospital mortality for patients receiving emergent and elective surgery decreased by 1% (RR 0.99 [0.98-1.00], p = 0.0005) and 5% (RR 0.95 [0.94-0.96], p < 0.0001) each year, respectively. Patients who underwent open surgeries were more likely to experience in-hospital mortality (adjusted OR 3.65 [3.16-4.21], p < 0.0001), with an increased risk of in-hospital mortality each year (RR 1.01 [1.00-1.03], p = 0.0387). Numerous other risk factors were identified. In-hospital post-colectomy mortality decreased at a slower rate in emergent versus elective surgeries. The risk of in-hospital mortality has increased in open colectomies.
Seasonal Effect on Ocular Sun Exposure and Conjunctival UV Autofluorescence.
Haworth, Kristina M; Chandler, Heather L
2017-02-01
To evaluate feasibility and repeatability of measures for ocular sun exposure and conjunctival ultraviolet autofluorescence (UVAF), and to test for relationships between the outcomes. Fifty volunteers were seen for two visits 14 ± 2 days apart. Ocular sun exposure was estimated over a 2-week time period using questionnaires that quantified time outdoors and ocular protection habits. Conjunctival UVAF was imaged using a Nikon D7000 camera system equipped with an appropriate flash and filter system; image analysis was done using ImageJ software. Repeatability estimates were made using Bland-Altman plots with mean differences and 95% limits of agreement calculated. Non-normally distributed data were transformed by either log10 or square root methods. Linear regression was conducted to evaluate relationships between measures. Mean (±SD) values for ocular sun exposure and conjunctival UVAF were 8.86 (±11.97) hours and 9.15 (±9.47) mm², respectively. Repeatability was found to be acceptable for both ocular sun exposure and conjunctival UVAF. Univariate linear regression showed outdoor occupation to be a predictor of higher ocular sun exposure; outdoor occupation and winter season of collection both predicted higher total UVAF. Furthermore, an increased portion of the day spent outdoors while working was associated with increased total conjunctival UVAF. We demonstrate feasibility and repeatability of estimating ocular sun exposure using a previously unreported method and for conjunctival UVAF in a group of subjects residing in Ohio. Seasonal temperature variation may have influenced time outdoors and ultimately the calculation of ocular sun exposure. As winter season of collection and outdoor occupation both predicted higher total UVAF, our data suggest that ocular sun exposure is associated with conjunctival UVAF and, possibly, that UVAF remains for at least several months after sun exposure.
Assessing the role of pavement macrotexture in preventing crashes on highways.
Pulugurtha, Srinivas S; Kusam, Prasanna R; Patel, Kuvleshay J
2010-02-01
The objective of this article is to assess the role of pavement macrotexture in preventing crashes on highways in the State of North Carolina. Laser profilometer data obtained from the North Carolina Department of Transportation (NCDOT) for highways comprising four corridors were processed to calculate pavement macrotexture at 100-m (approximately 330-ft) sections according to American Society for Testing and Materials (ASTM) standards. Crash data collected over the same lengths of the corridors were integrated with the calculated pavement macrotexture for each section. Scatterplots were generated to assess the effect of pavement macrotexture on crashes and on the logarithm of crashes. Regression analyses were conducted with predictor variables such as million vehicle miles of travel (a function of traffic volume and length), the number of interchanges, the number of at-grade intersections, the number of grade-separated interchanges, and the number of bridges, culverts, and overhead signs, along with pavement macrotexture, to study the statistical significance of the relationship between pavement macrotexture and crashes (both linear and log-linear) compared to the other predictor variables. The scatterplots and regression analyses indicate a more statistically significant relationship between pavement macrotexture and the logarithm of crashes than between pavement macrotexture and crashes. The coefficient for pavement macrotexture is, in general, negative, indicating that the number of crashes or the logarithm of crashes decreases as macrotexture increases. The relation between pavement macrotexture and the logarithm of crashes is generally stronger than between most other predictor variables and crashes or the logarithm of crashes. Based on the results obtained, it can be concluded that maintaining pavement macrotexture at or above a threshold of 1.524 mm (0.06 in.) would likely reduce crashes and provide safer transportation for road users on highways.
Seasonal and temporal patterns of NDMA formation potentials in surface waters.
Uzun, Habibullah; Kim, Daekyun; Karanfil, Tanju
2015-02-01
The seasonal and temporal patterns of N-nitrosodimethylamine (NDMA) formation potentials (FPs) were examined with water samples collected monthly over a 21-month period in 12 surface waters. This long-term study allowed monitoring of NDMA FP patterns under dynamic weather conditions (e.g., rainy and dry periods) covering several seasons. Anthropogenically impacted waters, identified by high sucralose levels (>100 ng/L), had higher NDMA FPs than sources with limited impact (<100 ng/L). In most sources, NDMA FP showed more variability in spring months, while seasonal mean values remained relatively consistent. The study also showed that watershed characteristics played an important role in the seasonal and temporal patterns. In the two dam-controlled river systems (SW A and G), the NDMA FP levels at the downstream sampling locations were controlled by the NDMA levels in the dams, independent of either the increases in discharge rates due to water releases from the dams prior to or during heavy rain events or the intermittent high NDMA FP levels observed upstream of the dams. The large reservoirs and impoundments on the rivers examined in this study appeared to serve as equalization basins for NDMA precursors. On the other hand, in a river without an upstream reservoir (SW E), the NDMA levels were influenced by the ratio of an upstream wastewater treatment plant (WWTP) effluent discharge to the river discharge rate. The impact of WWTP effluent decreased during high river flow periods due to rain events. Linear regression with independent variables DOC, DON, and sucralose yielded poor correlations with NDMA FP (R² < 0.27). Multiple linear regression analysis using DOC and log [sucralose] yielded a better correlation with NDMA FP (R² = 0.53). Copyright © 2014 Elsevier Ltd. All rights reserved.
Choi, Bryan Y; Kobayashi, Leo; Pathania, Shivany; Miller, Courtney B; Locke, Emma R; Stearns, Branden C; Hudepohl, Nathan J; Patefield, Scott S; Suner, Selim; Williams, Kenneth A; Machan, Jason T; Jay, Gregory D
2015-01-01
To measure unhealthy aerosol materials in an Emergency Department (ED) and identify their sources for mitigation efforts. Based on pilot findings of elevated ED particulate matter (PM) levels, investigators hypothesized that unhealthy aerosol materials derive from exogenous (vehicular) sources at ambulance receiving entrances. The Aerosol Environmental Toxicity in Healthcare-related Exposure and Risk program was conducted as an observational study. Calibrated sensors monitored PM and toxic gases at Ambulance Triage Exterior (ATE), Ambulance Triage Desk (ATD), and control Public Triage Desk (PTD) on a 3/3/3-day cycle. Cassette sampling characterized PM; meteorological and ambulance traffic data were logged. Descriptive and multiple linear regression analyses assessed for interactions between aerosol material levels, location, temporal variables, ambulance activity, and meteorological factors. Sensors acquired 93,682 PM0.3, 90,250 PM2.5, and 93,768 PM5 measurements over 366 days to generate a data set representing at least 85.6% of planned measurements. PM0.3, PM2.5, and PM5 mean counts were lowest in PTD; 56%, 224%, and 223% higher in ATD; and 996%, 200%, and 63% higher in ATE, respectively (all p < .001). Qualitative analyses showed similar PM compositions in ATD and ATE. On multiple linear regression analysis, PM0.3 counts correlated primarily with location; PM2.5 and PM5 counts correlated most strongly with location and ambulance presence. PM < 2.5 and toxic gas concentrations at ATD and PTD patient care areas did not exceed hazard levels; PM0.3 counts did not have formal safety thresholds for comparison. Higher levels of PM were linked with ED ambulance areas, although their health impact is unclear. © The Author(s) 2015.
NASA Astrophysics Data System (ADS)
Bourke, Sarah A.; Hermann, Kristian J.; Hendry, M. Jim
2017-11-01
Elevated groundwater salinity associated with produced water, leaching from landfills or secondary salinity can degrade arable soils and potable water resources. Direct-push electrical conductivity (EC) profiling enables rapid, relatively inexpensive, high-resolution in-situ measurements of subsurface salinity, without requiring core collection or installation of groundwater wells. However, because the direct-push tool measures the bulk EC of both solid and liquid phases (ECa), incorporation of ECa data into regional or historical groundwater data sets requires the prediction of pore water EC (ECw) or chloride (Cl-) concentrations from measured ECa. Statistical linear regression and physically based models for predicting ECw and Cl- from ECa profiles were tested on a brine plume in central Saskatchewan, Canada. A linear relationship between ECa/ECw and porosity was more accurate for predicting ECw and Cl- concentrations than a power-law relationship (Archie's Law). Despite clay contents of up to 96%, the addition of terms to account for electrical conductance in the solid phase did not improve model predictions. In the absence of porosity data, statistical linear regression models adequately predicted ECw and Cl- concentrations from direct-push ECa profiles (ECw = 5.48 ECa + 0.78, R² = 0.87; Cl- = 1,978 ECa - 1,398, R² = 0.73). These statistical models can be used to predict ECw in the absence of lithologic data and will be particularly useful for initial site assessments. The more accurate linear physically based model can be used to predict ECw and Cl- as porosity data become available and the site-specific ECw-Cl- relationship is determined.
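The two statistical models quoted above are simple affine maps from measured bulk EC; a sketch of applying them (coefficients as reported in the source; the ECa reading below is illustrative):

```python
def predict_ecw(eca):
    """Pore-water EC from bulk EC: ECw = 5.48*ECa + 0.78 (R² = 0.87)."""
    return 5.48 * eca + 0.78

def predict_cl(eca):
    """Chloride from bulk EC: Cl = 1978*ECa - 1398 (R² = 0.73)."""
    return 1978.0 * eca - 1398.0

# Illustrative bulk-EC reading from a direct-push log.
eca = 2.0
print(round(predict_ecw(eca), 2), round(predict_cl(eca)))  # 11.74 2558
```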
Job strain and resting heart rate: a cross-sectional study in a Swedish random working sample.
Eriksson, Peter; Schiöler, Linus; Söderberg, Mia; Rosengren, Annika; Torén, Kjell
2016-03-05
Numerous studies have reported an association between stressful working conditions and cardiovascular disease. However, more evidence is needed, and the etiological mechanisms are unknown. Elevated resting heart rate has emerged as a possible risk factor for cardiovascular disease, but little is known about the relation to work-related stress. This study therefore investigated the association between job strain, job control, and job demands and resting heart rate. We conducted a cross-sectional survey of randomly selected men and women in Västra Götalandsregionen, Sweden (West county of Sweden) (n = 1552). Information about job strain, job demands, job control, heart rate, and covariates was collected during the period 2001-2004 as part of the INTERGENE/ADONIX research project. Six different linear regression models were used, with adjustments for gender, age, BMI, smoking, education, and physical activity in the fully adjusted model. Job strain was operationalized as the log-transformed ratio of job demands over job control in the statistical analyses. No associations were seen between resting heart rate and job demands. Job strain was associated with elevated resting heart rate in the unadjusted model (linear regression coefficient 1.26, 95% CI 0.14 to 2.38), but not in any of the extended models. Low job control was associated with elevated resting heart rate after adjustments for gender, age, BMI, and smoking (linear regression coefficient -0.18, 95% CI -0.30 to -0.02). However, there were no significant associations in the fully adjusted model. Low job control and job strain, but not job demands, were associated with elevated resting heart rate. However, the observed associations were modest and may be explained by confounding effects.
Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki
2017-05-01
This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Porcaro, Antonio B; Ghimenton, Claudio; Petrozziello, Aldo; Migliorini, Filippo; Romano, Mario; Sava, Teodoro; Caruso, Beatrice; Cocco, Claudio; Antoniolli, Stefano Zecchinini; Lacola, Vincenzo; Rubilotta, Emanuele; Monaco, Carmelo; Comunale, Luigi
2012-04-01
To evaluate the prolactin hormone (PRL) physiopathology along the pituitary-testis-prostate axis at the time of initial diagnosis of prostate cancer and the subsequent cluster selection of the patient population after radical prostatectomy in relation to clinical and pathological variables. Ninety-two operated prostate cancer patients were retrospectively reviewed. No patient had previously received hormonal treatment. The investigated variables included PRL, follicle stimulating hormone (FSH), luteinizing hormone (LH), total testosterone (TT), free testosterone (FT), total prostate specific antigen (PSA), percentage of positive cores at transrectal ultrasound scan biopsy (TRUSB) (P+), biopsy Gleason score (bGS), pathology Gleason score (pGS), estimated tumor volume in relation to percentage of prostate volume (V+), overall prostate weight (Wi) and age. Empirical PRL correlations and multiple linear predictions were investigated along the pituitary-testis-prostate axis in the different groups of the prostate cancer population and clustered according to pT (2a/b, 3a, 3b/4) status. The patient population was classified according to the log10(PRL/V+) ratio and clustered as follows: group A (log10(PRL/V+) ≤ 1.5), B (1.5 < log10(PRL/V+) ≤ 2.0), and C (log10(PRL/V+) > 2.0). Simple linear regression analysis of V+ predicting PRL was computed to assess the clustered model, and analysis of variance was performed to assess significant differences between the groups. PRL was independently predicted by FSH (p=0.01), LH (p=0.008) and P+ (p=0.06) in low-stage prostate cancer (pT2a/b). Interestingly, PRL was independently predicted by LH (p=0.03) and FSH, TT, FT, PSA, bGS, pGS, V+, Wi and age (all at p=0.01) in advanced-stage disease (pT3b/4). V+ was also significantly correlated (r=0.47) and predicted by P+ (p<0.0001) in the prostate cancer population. 
PRL was significantly correlated and predicted by V+ when the patient population was clustered according to the log10(PRL/V+) ratio in group A (p=0.008), B (p<0.0001) and C (p<0.0001). Moreover, the three groups had significantly different mean values of PRL (p<0.0001), PSA (p=0.007), P+ (p=0.0001), V+ (p<0.0001), Wi (p=0.03), bGS (p=0.008), pGS (p=0.003); also, groups A, B and C had significantly different pGS (p=0.03), pT (p=0.0008) and pR (p=0.01) frequency distributions. At diagnosis, in an operated prostate cancer population, PRL was significantly correlated and independently predicted along the pituitary-testis-prostate axis in high-stage disease; V+ was also significantly correlated and predicted by P+. Because of the high correlation and prediction of PRL by both V+ and P+, the prostate cancer population at diagnosis was clustered according to the log10(PRL/V+) ratio into groups A, B and C that, in theory, might be models with prognostic potential and clinical applications in the prostate cancer population. However, confirmatory studies are needed.
A Linearized Model for Flicker and Contrast Thresholds at Various Retinal Illuminances
NASA Technical Reports Server (NTRS)
Ahumada, Albert; Watson, Andrew
2015-01-01
We previously proposed a flicker visibility metric for bright displays, based on psychophysical data collected at a high mean luminance. Here we extend the metric to other mean luminances. This extension relies on a linear relation between log sensitivity and critical fusion frequency, and a linear relation between critical fusion frequency and log retinal illuminance. Consistent with our previous metric, the extended flicker visibility metric is measured in just-noticeable differences (JNDs).
Method for nonlinear exponential regression analysis
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1972-01-01
Two computer programs, developed according to two general types of exponential models, for conducting nonlinear exponential regression analysis are described. A least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. The programs are written in FORTRAN 5 for the Univac 1108 computer.
Effect of Stress Corrosion and Cyclic Fatigue on Fluorapatite Glass-Ceramic
NASA Astrophysics Data System (ADS)
Joshi, Gaurav V.
2011-12-01
Objective: The objective of this study was to test the following hypotheses: 1. Both cyclic degradation and stress corrosion mechanisms result in subcritical crack growth in a fluorapatite glass-ceramic. 2. There is an interactive effect of stress corrosion and cyclic fatigue to cause subcritical crack growth (SCG) for this material. 3. The material that exhibits rising toughness curve (R-curve) behavior also exhibits a cyclic degradation mechanism. Materials and Methods: The material tested was a fluorapatite glass-ceramic (IPS e.max ZirPress, Ivoclar-Vivadent). Rectangular beam specimens with dimensions of 25 mm x 4 mm x 1.2 mm were fabricated using the press-on technique. Two groups of specimens (N=30) with polished (15 μm) or air-abraded surfaces were tested under rapid monotonic loading. Additional polished specimens were subjected to cyclic loading at two frequencies, 2 Hz (N=44) and 10 Hz (N=36), and at different stress amplitudes. All tests were performed using a fully articulating four-point flexure fixture in deionized water at 37°C. The SCG parameters were determined using the statistical approach of Munz and Fett (1999). The fatigue lifetime data were fit to a general log-linear model in ALTA PRO software (Reliasoft). Fractographic techniques were used to determine the critical flaw sizes to estimate fracture toughness. To determine the presence of R-curve behavior, non-linear regression was used. Results: Increasing the frequency of cycling did not cause a significant decrease in lifetime. The parameters of the general log-linear model showed that only stress corrosion has a significant effect on lifetime. The parameters are presented in the following table.* SCG parameters (n=19-21) were similar for both frequencies. The regression model showed that the fracture toughness was significantly dependent (p<0.05) on critical flaw size. Conclusions: 1. 
Cyclic fatigue does not have a significant effect on the SCG in the fluorapatite glass-ceramic IPS e.max ZirPress. 2. There was no interactive effect between cyclic degradation and stress corrosion for this material. 3. The material exhibited a low level of R-curve behavior. It did not exhibit cyclic degradation. *Please refer to dissertation for table.
Three-Dimensional City Determinants of the Urban Heat Island: A Statistical Approach
NASA Astrophysics Data System (ADS)
Chun, Bum Seok
There is no doubt that the Urban Heat Island (UHI) is a mounting problem in built-up environments, due to the energy retention by the surface materials of dense buildings, leading to increased temperatures, air pollution, and energy consumption. Much of the earlier research on the UHI has used two-dimensional (2-D) information, such as land uses and the distribution of vegetation. In the case of homogeneous land uses, it is possible to predict surface temperatures with reasonable accuracy with 2-D information. However, three-dimensional (3-D) information is necessary to analyze more complex sites, including dense building clusters. Recent research on the UHI has started to consider multi-dimensional models. The purpose of this research is to explore the urban determinants of the UHI, using 2-D/3-D urban information with statistical modeling. The research includes the following stages: (a) estimating urban temperature, using satellite images, (b) developing a 3-D city model from LiDAR data, (c) generating geometric parameters with regard to 2-/3-D geospatial information, and (d) conducting different statistical analyses: OLS and spatial regressions. The research area is part of the City of Columbus, Ohio. To effectively and systematically analyze the UHI, hierarchical grid scales (480m, 240m, 120m, 60m, and 30m) are proposed, together with linear and log-linear regression models. The non-linear OLS models with Log(AST) as the dependent variable have the highest R² among all the OLS-estimated models. However, both SAR and GSM models are estimated for the 480m, 240m, 120m, and 60m grids to reduce their spatial dependency. Most GSM models have R² values higher than 0.9, except for the 240m grid. Overall, the urban characteristics having high impacts in all grids are embodied in solar radiation, 3-D open space, greenery, and water streams. These results demonstrate that it is possible to mitigate the UHI, providing guidelines for policies aiming to reduce the UHI.
Green lumber grade yields from factory grade logs of three oak species
Daniel A. Yaussy
1986-01-01
Multivariate regression models were developed to predict green board foot yields for the seven common factory lumber grades processed from white, black, and chestnut oak factory grade logs. These models use the standard log measurements of grade, scaling diameter, log length, and proportion of scaling defect. Any combination of lumber grades (such as 1 Common and...
Assessment of Uncertainty in the Determination of Activation Energy for Polymeric Materials
NASA Technical Reports Server (NTRS)
Darby, Stephania P.; Landrum, D. Brian; Coleman, Hugh W.
1998-01-01
An assessment of the experimental uncertainty in obtaining the kinetic activation energy from thermogravimetric analysis (TGA) data is presented. A neat phenolic resin, Borden SC1008, was heated at three heating rates to obtain weight loss vs temperature data. Activation energy was calculated by two methods: the traditional Flynn and Wall method based on the slope of log(q) versus 1/T, and a modification of this method in which the ordinate and abscissa are reversed in the linear regression. The modified method produced a more accurate curve fit of the data, was more sensitive to data nonlinearity, and gave an activation energy value 75 percent greater than the original method. An uncertainty analysis using the modified method yielded a 60 percent uncertainty in the average activation energy. Based on this result, the activation energy for a carbon-phenolic material was doubled and used to calculate the ablation rate in a typical solid rocket environment. Doubling the activation energy increased surface recession by 3 percent. Current TGA data reduction techniques that use the traditional Flynn and Wall approach to calculate activation energy should be changed to the modified method.
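The modified method described above simply swaps ordinate and abscissa in the straight-line fit of log(q) versus 1/T. Because ordinary least squares minimizes error in the dependent variable only, the two fitted slopes (to which the activation energy is proportional) differ whenever the data are not perfectly linear. A sketch with synthetic values (not the paper's TGA data):

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)

# Traditional Flynn and Wall: slope of log(q) on 1/T.
inv_T = [0.00150, 0.00155, 0.00160]
log_q = [1.02, 0.49, 0.01]          # synthetic, slightly noisy
slope_trad = ols_slope(inv_T, log_q)

# Modified method: regress 1/T on log(q), then invert the slope.
slope_mod = 1.0 / ols_slope(log_q, inv_T)

# The two estimates coincide only for perfectly linear data.
print(slope_trad != slope_mod)  # True
```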
Effect of temperature and humidity on formaldehyde emissions in temporary housing units.
Parthasarathy, Srinandini; Maddalena, Randy L; Russell, Marion L; Apte, Michael G
2011-06-01
The effect of temperature and humidity on formaldehyde emissions from samples collected from temporary housing units (THUs) was studied. The THUs were supplied by the U.S. Federal Emergency Management Administration (FEMA) to families that lost their homes in Louisiana and Mississippi during the Hurricane Katrina and Rita disasters. On the basis of a previous study, four of the composite wood surface materials that dominated contributions to indoor formaldehyde were selected to analyze the effects of temperature and humidity on the emission factors. Humidity equilibration experiments were carried out on two of the samples to determine how long the samples take to equilibrate with the surrounding environmental conditions. Small chamber experiments were then conducted to measure emission factors for the four surface materials at various temperature and humidity conditions. The samples were analyzed for formaldehyde via high-performance liquid chromatography. The experiments showed that increases in temperature or humidity contributed to an increase in emission factors. A linear regression model was built using the natural log of the percent relative humidity (RH) and inverse of temperature (in K) as independent variables and the natural log of emission factors as the dependent variable. The coefficients for the inverse of temperature and log RH with log emission factor were found to be statistically significant for all of the samples at the 95% confidence level. This study should assist in retrospectively estimating indoor formaldehyde exposure of occupants of THUs.
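The regression form described above, ln(EF) = b0 + b1*(1/T) + b2*ln(RH), implies emission factors that rise with temperature (b1 < 0) and with relative humidity (b2 > 0). A minimal sketch with assumed illustrative coefficients (the study's fitted values are not reproduced here):

```python
import math

# Illustrative coefficients (assumed, not the study's estimates): the
# signs encode the reported finding that emissions rise with T and RH.
B0, B1, B2 = 20.0, -5000.0, 0.8

def emission_factor(T_kelvin, rh_percent):
    """ln(EF) = B0 + B1*(1/T) + B2*ln(RH), back-transformed to EF."""
    return math.exp(B0 + B1 / T_kelvin + B2 * math.log(rh_percent))

low = emission_factor(293.15, 40.0)    # cooler, drier
hot = emission_factor(303.15, 40.0)    # warmer, same RH
humid = emission_factor(293.15, 70.0)  # cooler, more humid
print(hot > low and humid > low)  # True
```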
User's manual for SEDCALC, a computer program for computation of suspended-sediment discharge
Koltun, G.F.; Gray, John R.; McElhone, T.J.
1994-01-01
Sediment-Record Calculations (SEDCALC), a menu-driven set of interactive computer programs, was developed to facilitate computation of suspended-sediment records. The programs comprising SEDCALC were developed independently in several District offices of the U.S. Geological Survey (USGS) to minimize the intensive labor associated with various aspects of sediment-record computations. SEDCALC operates on suspended-sediment-concentration data stored in American Standard Code for Information Interchange (ASCII) files in a predefined card-image format. Program options within SEDCALC can be used to assist in creating and editing the card-image files, as well as to reformat card-image files to and from formats used by the USGS Water-Quality System. SEDCALC provides options for creating card-image files containing time series of equal-interval suspended-sediment concentrations from (1) digitized suspended-sediment-concentration traces, (2) linear interpolation between log-transformed instantaneous suspended-sediment-concentration data stored at unequal time intervals, and (3) nonlinear interpolation between log-transformed instantaneous suspended-sediment-concentration data stored at unequal time intervals. Suspended-sediment discharge can be computed from the streamflow and suspended-sediment-concentration data or by application of transport relations derived by regressing log-transformed instantaneous streamflows on log-transformed instantaneous suspended-sediment concentrations or discharges. The computed suspended-sediment discharge data are stored in card-image files that can be either directly imported to the USGS Automated Data Processing System or used to generate plots by means of other SEDCALC options.
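Option (2) above, linear interpolation between log-transformed concentrations, can be sketched as follows (the back-transform to concentration units is my assumption about how the interpolated values are used, not a detail stated in the abstract):

```python
import math

def log_interp(t, t0, c0, t1, c1):
    """Linearly interpolate between log-transformed concentrations at
    times t0 and t1, then back-transform to concentration units."""
    f = (t - t0) / (t1 - t0)
    return math.exp((1.0 - f) * math.log(c0) + f * math.log(c1))

# Midway between 10 and 1000 mg/L the log-linear estimate is the
# geometric mean, 100 mg/L, rather than the arithmetic mean, 505 mg/L.
print(round(log_interp(12.0, 0.0, 10.0, 24.0, 1000.0)))  # 100
```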
Log-Linear Models for Gene Association
Hu, Jianhua; Joshi, Adarsh; Johnson, Valen E.
2009-01-01
We describe a class of log-linear models for the detection of interactions in high-dimensional genomic data. This class of models leads to a Bayesian model selection algorithm that can be applied to data that have been reduced to contingency tables using ranks of observations within subjects, and discretization of these ranks within gene/network components. Many normalization issues associated with the analysis of genomic data are thereby avoided. A prior density based on Ewens’ sampling distribution is used to restrict the number of interacting components assigned high posterior probability, and the calculation of posterior model probabilities is expedited by approximations based on the likelihood ratio statistic. Simulation studies are used to evaluate the efficiency of the resulting algorithm for known interaction structures. Finally, the algorithm is validated in a microarray study for which it was possible to obtain biological confirmation of detected interactions. PMID:19655032
Solving Graph Laplacian Systems Through Recursive Bisections and Two-Grid Preconditioning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ponce, Colin; Vassilevski, Panayot S.
2016-02-18
We present a parallelizable direct method for computing the solution to graph Laplacian-based linear systems derived from graphs that can be hierarchically bipartitioned with small edge cuts. For a graph of size n with constant-size edge cuts, our method decomposes a graph Laplacian in time O(n log n), and then uses that decomposition to perform a linear solve in time O(n log n). We then use the developed technique to design a preconditioner for graph Laplacians that do not have this property. Finally, we augment this preconditioner with a two-grid method that accounts for much of the preconditioner's weaknesses. We present an analysis of this method, as well as a general theorem for the condition number of a general class of two-grid support graph-based preconditioners. Numerical experiments illustrate the performance of the studied methods.
A method for nonlinear exponential regression analysis
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves a least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining initial nominal estimates of the unknown exponential model parameters is included as an integral part of the technique. A correction matrix is derived and applied to the nominal estimates to produce an improved set of model parameters. The solution cycle is repeated until some predetermined convergence criterion is satisfied.
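The linearize-and-iterate procedure described above can be sketched as a Gauss-Newton loop. This is a minimal illustration, not the original program: the single-exponential decay model y = A·exp(-k·t), the variable names, and the fixed iteration count are all assumptions made for the example.

```python
import numpy as np

def fit_exponential(t, y, n_iter=20):
    """Fit y = A * exp(-k * t) to decay data by Gauss-Newton iteration.

    Initial nominal estimates come from a linear fit of log(y) on t,
    mirroring the linear curve fitting step described in the abstract
    (requires y > 0).
    """
    # Linear curve fit on log(y): slope = -k, intercept = log(A).
    k0, logA0 = np.polyfit(t, np.log(y), 1)
    A, k = np.exp(logA0), -k0
    for _ in range(n_iter):
        pred = A * np.exp(-k * t)
        resid = y - pred
        # Jacobian of the model with respect to the parameters (A, k).
        J = np.column_stack([np.exp(-k * t), -A * t * np.exp(-k * t)])
        # Correction from the linearized least squares problem,
        # applied to the nominal estimates.
        delta, *_ = np.linalg.lstsq(J, resid, rcond=None)
        A, k = A + delta[0], k + delta[1]
    return A, k
```

In practice the loop would stop on a convergence criterion (e.g. a small correction norm) rather than a fixed iteration count.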
Poverty and prevalence of antimicrobial resistance in invasive isolates.
Alvarez-Uria, Gerardo; Gandra, Sumanth; Laxminarayan, Ramanan
2016-11-01
To evaluate the association between the income status of a country and the prevalence of antimicrobial resistance (AMR) in the three most common bacteria causing infections in hospitals and in the community: third-generation cephalosporin (3GC)-resistant Escherichia coli, methicillin-resistant Staphylococcus aureus (MRSA), and 3GC-resistant Klebsiella species. Using 2013-2014 country-specific data from the ResistanceMap repository and the World Bank, the association between the prevalence of AMR in invasive samples and the gross national income (GNI) per capita was investigated through linear regression with robust standard errors. To account for a non-linear association with the dependent variable, GNI per capita was log-transformed. The models predicted an 11.3% (95% confidence interval (CI) 6.5-16.2%), 18.2% (95% CI 11-25.5%), and 12.3% (95% CI 5.5-19.1%) decrease in the prevalence of 3GC-resistant E. coli, 3GC-resistant Klebsiella species, and MRSA, respectively, for each unit increase in log GNI per capita. The association was stronger for 3GC-resistant E. coli and Klebsiella species than for MRSA. A significant negative association between GNI per capita and the prevalence of MRSA and 3GC-resistant E. coli and Klebsiella species was found. These results underscore the urgent need for new policies aimed at reducing AMR in resource-poor settings. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
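The study's core regression, resistance prevalence on log-transformed GNI per capita, can be sketched as below. The country values are invented for illustration; only the log-transform-then-OLS structure follows the abstract.

```python
import numpy as np

# Hypothetical illustration: prevalence (%) of resistant isolates versus
# GNI per capita (USD) for a handful of made-up countries.
gni = np.array([1500.0, 4000.0, 12000.0, 30000.0, 55000.0])
prevalence = np.array([62.0, 48.0, 35.0, 21.0, 14.0])

# Log-transform GNI to capture the non-linear association, as in the study;
# the slope is then the predicted change in prevalence per unit of log GNI.
slope, intercept = np.polyfit(np.log(gni), prevalence, 1)
```

With real data a negative slope corresponds to the reported decrease in prevalence as national income rises; robust standard errors, used in the study, are omitted from this sketch.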
Simplified large African carnivore density estimators from track indices.
Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J
2016-01-01
The range, population size and trend of large carnivores are important parameters for assessing their status globally and planning conservation strategies. Linear models can be used to assess population size and trends of large carnivores from track-based surveys on suitable substrates. A conventional linear model with an intercept may not pass through zero, but may fit the data better than a linear model through the origin. We assess whether linear regression through the origin is more appropriate than linear regression with an intercept for modelling large African carnivore densities from track indices. We fitted simple linear regressions with an intercept and simple linear regressions through the origin, and used the confidence interval for β in the linear model y = αx + β, the Standard Error of Estimate, the Mean Squares Residual and the Akaike Information Criterion to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with an intercept all included zero in the confidence interval for β, so the null hypothesis that β = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. The Akaike Information Criterion showed that the linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results show that linear regression through the origin is justified over the more typical linear regression with an intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas.
The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km² or higher. To improve the current models, we need independent data to validate them and data to test for a non-linear relationship between track indices and true density at low densities.
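The through-origin versus with-intercept comparison described above can be sketched as follows. This is a minimal illustration using a Gaussian AIC; the track and density numbers are fabricated (seeded around the reported 3.26 slope), not the study's data.

```python
import numpy as np

def fit_and_aic(x, y, through_origin):
    """OLS fit with a Gaussian AIC; optionally force the line through zero."""
    X = x[:, None] if through_origin else np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    n, k = len(y), X.shape[1]
    rss = float(resid @ resid)
    # AIC up to an additive constant: n*log(RSS/n) + 2*(number of parameters).
    aic = n * np.log(rss / n) + 2 * k
    return coef, aic

# Hypothetical data: true carnivore density and observed track density,
# built around the reported proportionality constant of 3.26.
density = np.array([0.3, 0.8, 1.5, 2.2, 3.0])
tracks = 3.26 * density + np.array([0.1, -0.2, 0.15, -0.1, 0.05])

coef_o, aic_o = fit_and_aic(density, tracks, through_origin=True)
coef_i, aic_i = fit_and_aic(density, tracks, through_origin=False)
```

With data of this kind, the intercept penalty makes the through-origin model the AIC-preferred choice, mirroring the study's conclusion.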
ERIC Educational Resources Information Center
Wang, Tianyou
2009-01-01
Holland and colleagues derived a formula for analytical standard error of equating using the delta-method for the kernel equating method. Extending their derivation, this article derives an analytical standard error of equating procedure for the conventional percentile rank-based equipercentile equating with log-linear smoothing. This procedure is…
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
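The first stage of such a two-stage analysis produces, per study, a coefficient vector (e.g. spline coefficients) with its covariance matrix. A minimal fixed-effect version of the second-stage multivariate pooling is sketched below; the mvmeta package referenced in the abstract also supports random-effects models and meta-regression, which this sketch omits.

```python
import numpy as np

def mv_fixed_effect(estimates, covariances):
    """Fixed-effect multivariate meta-analysis: the inverse-variance
    weighted average of multi-parameter estimates across studies.

    estimates   -- list of 1-D coefficient vectors, one per study
    covariances -- list of matching covariance matrices
    Returns the pooled vector and its covariance matrix.
    """
    W = [np.linalg.inv(S) for S in covariances]   # per-study weight matrices
    V = np.linalg.inv(sum(W))                     # pooled covariance
    pooled = V @ sum(w @ b for w, b in zip(W, estimates))
    return pooled, V
```

For two studies with equal (identity) covariances this reduces to a simple elementwise average, which makes the weighting easy to verify by hand.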
ERIC Educational Resources Information Center
Bates, Reid A.; Holton, Elwood F., III; Burnett, Michael F.
1999-01-01
A case study of learning transfer demonstrates the possible effect of influential observation on linear regression analysis. A diagnostic method that tests for violation of assumptions, multicollinearity, and individual and multiple influential observations helps determine which observation to delete to eliminate bias. (SK)
Linear models for assessing mechanisms of sperm competition: the trouble with transformations.
Eggert, Anne-Katrin; Reinhardt, Klaus; Sakaluk, Scott K
2003-01-01
Although sperm competition is a pervasive selective force shaping the reproductive tactics of males, the mechanisms underlying different patterns of sperm precedence remain obscure. Parker et al. (1990) developed a series of linear models designed to identify two of the more basic mechanisms: sperm lotteries and sperm displacement; the models can be tested experimentally by manipulating the relative numbers of sperm transferred by rival males and determining the paternity of offspring. Here we show that tests of the model derived for sperm lotteries can result in misleading inferences about the underlying mechanism of sperm precedence because the required inverse transformations may lead to a violation of fundamental assumptions of linear regression. We show that this problem can be remedied by reformulating the model using the actual numbers of offspring sired by each male, and log-transforming both sides of the resultant equation. Reassessment of data from a previous study (Sakaluk and Eggert 1996) using the corrected version of the model revealed that we should not have excluded a simple sperm lottery as a possible mechanism of sperm competition in decorated crickets, Gryllodes sigillatus.
Automating linear accelerator quality assurance.
Eckhause, Tobias; Al-Hallaq, Hania; Ritter, Timothy; DeMarco, John; Farrey, Karl; Pawlicki, Todd; Kim, Gwe-Ya; Popple, Richard; Sharma, Vijeshwar; Perez, Mario; Park, SungYong; Booth, Jeremy T; Thorwarth, Ryan; Moran, Jean M
2015-10-01
The purpose of this study was 2-fold. One purpose was to develop an automated, streamlined quality assurance (QA) program for use by multiple centers. The second purpose was to evaluate machine performance over time for multiple centers using linear accelerator (Linac) log files and electronic portal images. The authors sought to evaluate variations in Linac performance to establish a reference for other centers. The authors developed analytical software tools for a QA program using both log files and electronic portal imaging device (EPID) measurements. The first tool is a general analysis tool which can read and visually represent data in the log file. This tool, which can be used to automatically analyze patient treatment or QA log files, examines the files for Linac deviations which exceed thresholds. The second set of tools consists of a test suite of QA fields, a standard phantom, and software to collect information from the log files on deviations from the expected values. The test suite was designed to focus on the mechanical tests of the Linac, including jaw, MLC, and collimator positions during static, IMRT, and volumetric modulated arc therapy delivery. A consortium of eight institutions delivered the test suite at monthly or weekly intervals on each Linac using a standard phantom. The behavior of various components was analyzed for eight TrueBeam Linacs. For the EPID and trajectory log file analysis, all observed deviations which exceeded established thresholds for Linac behavior resulted in a beam hold-off. In the absence of an interlock-triggering event, the maximum observed log file deviations between the expected and actual component positions (such as MLC leaves) varied from less than 1% to 26% of published tolerance thresholds. The maximum and standard deviations of the variations due to gantry sag, collimator angle, jaw position, and MLC positions are presented. Gantry sag among Linacs was 0.336 ± 0.072 mm.
The standard deviation in MLC position, as determined by EPID measurements, across the consortium was 0.33 mm for IMRT fields. With respect to the log files, the deviations between expected and actual positions for parameters were small (<0.12 mm) for all Linacs. Considering both log files and EPID measurements, all parameters were well within published tolerance values. Variations in collimator angle, MLC position, and gantry sag were also evaluated for all Linacs. The performance of the TrueBeam Linac model was shown to be consistent based on automated analysis of trajectory log files and EPID images acquired during delivery of a standardized test suite. The results can be compared directly to tolerance thresholds. In addition, sharing of results from standard tests across institutions can facilitate the identification of QA process and Linac changes. These reference values are presented along with the standard deviation for common tests so that the test suite can be used by other centers to evaluate their Linac performance against those in this consortium.
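The threshold check applied to log-file deviations can be sketched as a simple filter over expected and actual component positions. The function, the position values, and the tolerance below are hypothetical illustrations, not the consortium's actual tools or thresholds.

```python
def flag_deviations(expected, actual, tolerance):
    """Return the indices of components whose |expected - actual|
    deviation exceeds the tolerance, mimicking the log-file threshold
    check described above."""
    return [i for i, (e, a) in enumerate(zip(expected, actual))
            if abs(e - a) > tolerance]

# Hypothetical MLC leaf positions in mm: leaf 1 is within a 0.12 mm
# tolerance, leaf 2 is not and would be flagged for review.
flagged = flag_deviations([10.0, 20.0], [10.05, 20.5], 0.12)
```

A production tool would additionally parse the trajectory log format and report deviations as a fraction of the published tolerance, as the study does.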
NASA Technical Reports Server (NTRS)
Asner, Gregory P.; Keller, Michael M.; Silva, Jose Natalino; Zweede, Johan C.; Pereira, Rodrigo, Jr.
2002-01-01
Major uncertainties exist regarding the rate and intensity of logging in tropical forests worldwide: these uncertainties severely limit economic, ecological, and biogeochemical analyses of these regions. Recent sawmill surveys in the Amazon region of Brazil show that the area logged is nearly equal to the total area deforested annually, but conversion of survey data to forest area, forest structural damage, and biomass estimates requires multiple assumptions about logging practices. Remote sensing could provide an independent means to monitor logging activity and to estimate the biophysical consequences of this land use. Previous studies have demonstrated that the detection of logging in Amazon forests is difficult, and no studies have developed either the quantitative physical basis or remote sensing approaches needed to estimate the effects of various logging regimes on forest structure. A major reason for these limitations has been a lack of sufficient, well-calibrated optical satellite data, which, in turn, has impeded the development and use of physically-based, quantitative approaches for detection and structural characterization of forest logging regimes. We propose to use data from the EO-1 Hyperion imaging spectrometer to greatly increase our ability to estimate the presence and structural attributes of selective logging in the Amazon Basin. Our approach is based on four "biogeophysical indicators" not yet derived simultaneously from any satellite sensor: 1) green canopy leaf area index; 2) degree of shadowing; 3) presence of exposed soil; and 4) non-photosynthetic vegetation material. Airborne, field and modeling studies have shown that the optical reflectance continuum (400-2500 nm) contains sufficient information to derive estimates of each of these indicators. Our ongoing studies in the eastern Amazon basin also suggest that these four indicators are sensitive to logging intensity.
Satellite-based estimates of these indicators should provide a means to quantify both the presence and degree of structural disturbance caused by various logging regimes. Our quantitative assessment of Hyperion hyperspectral and ALI multi-spectral data for the detection and structural characterization of selective logging in Amazonia will benefit from data collected through an ongoing project run by the Tropical Forest Foundation, within which we have developed a study of the canopy and landscape biophysics of conventional and reduced-impact logging. We will add to our base of forest structural information in concert with an EO-1 overpass. Using a photon transport model inversion technique that accounts for non-linear mixing of the four biogeophysical indicators, we will estimate these parameters across a gradient of selective logging intensity provided by conventional and reduced impact logging sites. We will also compare our physically-based approach to both conventional (e.g., NDVI) and novel (e.g., SWIR-channel) vegetation indices as well as to linear mixture modeling methods. We will cross-compare these approaches using Hyperion and ALI imagers to determine the strengths and limitations of these two sensors for applications of forest biophysics. This effort will yield the first physically-based, quantitative analysis of the detection and intensity of selective logging in Amazonia, comparing hyperspectral and improved multi-spectral approaches as well as inverse modeling, linear mixture modeling, and vegetation index techniques.
ERIC Educational Resources Information Center
Jeske, Debora; Roßnagell, Christian Stamov; Backhaus, Joy
2014-01-01
We examined the role of learner characteristics as predictors of four aspects of e-learning performance, including knowledge test performance, learning confidence, learning efficiency, and navigational effectiveness. We used both self reports and log file records to compute the relevant statistics. Regression analyses showed that both need for…
Online Courses Assessment through Measuring and Archetyping of Usage Data
ERIC Educational Resources Information Center
Kazanidis, Ioannis; Theodosiou, Theodosios; Petasakis, Ioannis; Valsamidis, Stavros
2016-01-01
Database files and additional log files of Learning Management Systems (LMSs) contain an enormous volume of data which usually remain unexploited. A new methodology is proposed in order to analyse these data on the level of both the courses and the learners. Specifically, "regression analysis" is proposed as a first step in the…
Ferrarini, Luca; Veer, Ilya M; van Lew, Baldur; Oei, Nicole Y L; van Buchem, Mark A; Reiber, Johan H C; Rombouts, Serge A R B; Milles, J
2011-06-01
In recent years, graph theory has been successfully applied to study functional and anatomical connectivity networks in the human brain. Most of these networks have shown small-world topological characteristics: high efficiency in long distance communication between nodes, combined with highly interconnected local clusters of nodes. Moreover, functional studies performed at high resolutions have presented convincing evidence that resting-state functional connectivity networks exhibit (exponentially truncated) scale-free behavior. Such evidence, however, was mostly presented qualitatively, in terms of linear regressions of the degree distributions on log-log plots. Even when quantitative measures were given, these were usually limited to the r² correlation coefficient. However, the r² statistic is not an optimal estimator of explained variance when dealing with (truncated) power-law models. Recent developments in statistics have introduced new non-parametric approaches, based on the Kolmogorov-Smirnov test, for the problem of model selection. In this work, we have built on this idea to statistically tackle the issue of model selection for the degree distribution of functional connectivity at rest. The analysis, performed at voxel level and in a subject-specific fashion, confirmed the superiority of a truncated power-law model, showing high consistency across subjects. Moreover, the most highly connected voxels were found to be consistently part of the default mode network. Our results provide statistically sound support to the evidence previously presented in the literature for a truncated power-law model of resting-state functional connectivity. Copyright © 2010 Elsevier Inc. All rights reserved.
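The MLE-plus-Kolmogorov-Smirnov approach the abstract alludes to can be sketched for a pure (untruncated) continuous power law, using the standard maximum-likelihood estimator for the exponent; extending it to the exponentially truncated case requires numerical likelihood maximization, which this sketch omits, and the synthetic data below stand in for real degree distributions.

```python
import numpy as np

def fit_powerlaw_mle(x, xmin):
    """Continuous power-law exponent via maximum likelihood:
    alpha = 1 + n / sum(log(x / xmin)), for samples x >= xmin."""
    x = x[x >= xmin]
    return 1.0 + len(x) / np.sum(np.log(x / xmin))

def ks_distance(x, xmin, alpha):
    """Kolmogorov-Smirnov distance between the empirical CDF and the
    fitted power-law CDF, the basis of the model-selection approach."""
    x = np.sort(x[x >= xmin])
    ecdf = np.arange(1, len(x) + 1) / len(x)
    model = 1.0 - (x / xmin) ** (1.0 - alpha)
    return float(np.max(np.abs(ecdf - model)))

# Synthetic sample from a power law with alpha = 2.5, xmin = 1,
# drawn by inverse-transform sampling.
rng = np.random.default_rng(0)
u = rng.uniform(size=5000)
x = (1.0 - u) ** (-1.0 / (2.5 - 1.0))

alpha_hat = fit_powerlaw_mle(x, xmin=1.0)
d = ks_distance(x, 1.0, alpha_hat)
```

Comparing such KS distances across candidate models (power law, truncated power law, exponential) gives the non-parametric model selection the study applies, in place of the r² comparisons it criticizes.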
Sorption and pH determine the long-term partitioning of cadmium in natural soils.
Ardestani, Masoud M; van Gestel, Cornelis A M
2016-09-01
The bioavailability of metals in soil is a dynamic process. For a proper extrapolation to the field of laboratory studies on fate and effects, it is important to understand the dynamics of metal bioavailability and the way it is influenced by soil properties. The aim of this study was to assess the parallel (concurrent) effect of pH and aging time on the partitioning of cadmium in natural LUFA 2.2 soil. Cadmium nitrate-spiked, pH-amended LUFA 2.2 soils were incubated under laboratory conditions for up to 30 weeks. Measured pore-water pH (pHpw) was lower after 3 weeks and decreased only slightly toward the end of the test. Cadmium concentrations in the pore water increased with time for all soil pH levels, while they decreased with increasing pH. Freundlich Kf values ranged between 4.26 and 934 L kg⁻¹ (n = 0.79 to 1.36) and were highest at the highest pH tested (pH = 6.5). Multiple linear regression analysis, based on a soil ligand modeling approach, resulted in affinity constants of 2.61 for Ca²⁺ (log KCa-SL) and 5.05 for H⁺ (log KH-SL) for their binding to the active sites on the soil surface. The results showed that pH and aging time are two important factors which together affect cadmium partitioning and mobility in spiked natural soils.
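Freundlich parameters such as those reported above are typically obtained by log-log linearization of the isotherm Cs = Kf·Cw^n. The sketch below uses invented sorption data, with Kf and n chosen arbitrarily within the reported ranges; it is not the study's data or code.

```python
import numpy as np

# Hypothetical sorption data: pore-water concentration Cw (mg/L) and
# sorbed concentration Cs (mg/kg), generated from a Freundlich isotherm
# with illustrative parameters Kf = 50 L/kg and n = 0.9.
Cw = np.array([0.01, 0.05, 0.1, 0.5, 1.0])
Cs = 50.0 * Cw ** 0.9

# Linearize: log10(Cs) = log10(Kf) + n * log10(Cw), then fit by OLS.
n_hat, logKf_hat = np.polyfit(np.log10(Cw), np.log10(Cs), 1)
Kf_hat = 10 ** logKf_hat
```

The slope of the log-log fit recovers the Freundlich exponent n and the intercept recovers log10(Kf); with real measurements the fit would carry scatter and confidence intervals.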
Arano, Ichiro; Sugimoto, Tomoyuki; Hamasaki, Toshimitsu; Ohno, Yuko
2010-04-23
Survival analysis methods such as the Kaplan-Meier method, log-rank test, and Cox proportional hazards regression (Cox regression) are commonly used to analyze data from randomized withdrawal studies in patients with major depressive disorder. Unfortunately, such methods may be inappropriate when a long-term censored relapse-free time appears in the data, as they assume that if complete follow-up were possible for all individuals, each would eventually experience the event of interest. In this paper, to analyse data including such a long-term censored relapse-free time, we discuss a semi-parametric cure regression (Cox cure regression), which combines a logistic formulation for the probability of occurrence of an event with a Cox proportional hazards specification for the time of occurrence of the event. In specifying the treatment's effect on disease-free survival, we consider the fraction of long-term survivors and the risks associated with a relapse of the disease. In addition, we develop a tree-based method for time to event data to identify groups of patients with differing prognoses (cure survival CART). Although analysis methods typically adapt the log-rank statistic for recursive partitioning procedures, the method applied here uses a likelihood ratio (LR) test statistic from fitting a cure survival regression assuming exponential and Weibull distributions for the latency time of relapse. The method is illustrated using data from a sertraline randomized withdrawal study in patients with major depressive disorder.
We concluded that Cox cure regression reveals who may be cured and how the treatment and other factors affect the cure incidence and the relapse time of uncured patients, and that the cure survival CART output provides easily understandable and interpretable information, useful both in identifying groups of patients with differing prognoses and in utilizing Cox cure regression models that lead to meaningful interpretations.
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions
Fernandes, Bruno J. T.; Roque, Alexandre
2018-01-01
Height and weight are measurements used to track nutritional disease, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and these factors may prevent accurate measurement; in those cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process regression, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all cases, the predictions are insensitive to ethnicity, and to gender if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
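Of the three learning models, Gaussian-process regression is compact enough to sketch directly. This zero-mean RBF version with invented single-feature data (arm span predicting height) is an illustration under stated assumptions, not the paper's implementation, which uses multiple anthropometric features.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential (RBF) kernel matrix between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict(X, y, X_new, length_scale=1.0, noise=1e-6):
    """Gaussian-process regression posterior mean with a zero prior mean.

    X      -- training inputs, shape (n, d)
    y      -- training targets, shape (n,)
    X_new  -- query inputs, shape (m, d)
    """
    K = rbf_kernel(X, X, length_scale) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, y)          # weights on training targets
    return rbf_kernel(X_new, X, length_scale) @ alpha
```

With a small noise term the posterior mean interpolates the training data almost exactly; real anthropometric data would call for a fitted noise level and length scale.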
NASA Astrophysics Data System (ADS)
Dhakal, Y. P.; Kunugi, T.; Suzuki, W.; Aoi, S.
2013-12-01
The Mw 9.1 Tohoku-oki earthquake caused strong shaking of super-high-rise and high-rise buildings constructed on deep sedimentary basins in Japan. Many people felt difficulty in moving inside high-rise buildings even on the Osaka basin, located at distances as far as 800 km from the epicentral area. Several empirical equations have been proposed to estimate peak ground motions and absolute acceleration response spectra, applicable mainly within 300 to 500 km of the source area. On the other hand, the Japan Meteorological Agency has recently proposed four classes of absolute velocity response spectra as suitable indices to qualitatively describe the intensity of long-period ground motions, based on observed earthquake records, human experiences, and actual damage that occurred in high-rise and super-high-rise buildings. The empirical prediction equations have been used in disaster mitigation planning as well as earthquake early warning. In this study, we discuss the results of our preliminary analysis of the attenuation relation of absolute velocity response spectra calculated from observed strong motion records, including those from the Mw 9.1 Tohoku-oki earthquake, using simple regression models with various model parameters. We used earthquakes of Mw 6.5 or greater with focal depths shallower than 50 km that occurred in and around the Japanese archipelago. We selected earthquakes for which good-quality records are available at more than 50 observation sites combined from K-NET and KiK-net. After a visual inspection of approximately 21,000 three-component records from 36 earthquakes, we used about 15,000 good-quality records in the period range of 1 to 10 s within a hypocentral distance (R) of 800 km. We performed regression analyses assuming the following five regression models.
(1) log10 Y(T) = c + a·Mw - log10 R - b·R
(2) log10 Y(T) = c + a·Mw - log10 R - b·R + g·S
(3) log10 Y(T) = c + a·Mw - log10 R - b·R + h·D
(4) log10 Y(T) = c + a·Mw - log10 R - b·R + g·S + h·D
(5) log10 Y(T) = c + a·Mw - log10 R - b·R + Σ gi·Si + h·D
where Y(T) is the 5% damped peak vector response in cm/s derived from the two horizontal component records for a natural period T in seconds. In (2), S is a dummy variable which is one if a site is located inside a sedimentary basin and zero otherwise. In (3), D is the depth to the top of the layer having a particular S-wave velocity; we used the deep underground S-wave velocity model available from the Japan Seismic Hazard Information Station (J-SHIS). In (5), sites are classified into various sedimentary basins, each basin i having its own coefficient gi and dummy variable Si. The analyses show that the standard deviations decrease in the order of the models listed and that all coefficients are significant. Interestingly, the coefficients g are found to differ from basin to basin at most periods, and the depth to the top of the layer having an S-wave velocity of 1.7 km/s gives the smallest standard deviation, 0.31 at T = 4.4 s, in model (5). This study shows the possibility of describing the observed peak absolute velocity response values using simple model parameters such as site location and sedimentary depth soon after the location and magnitude of an earthquake are known.
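Model (1) can be fitted by moving the fixed geometric-spreading term -log10 R to the left-hand side and applying ordinary least squares to the remaining coefficients. The sketch below uses synthetic records with invented coefficients, not the K-NET/KiK-net data.

```python
import numpy as np

def fit_attenuation(Mw, R, Y):
    """Fit model (1): log10 Y = c + a*Mw - log10 R - b*R.

    The geometric-spreading term -log10 R has a fixed unit coefficient,
    so it is moved to the left-hand side before the least squares fit.
    Returns the estimated (c, a, b).
    """
    lhs = np.log10(Y) + np.log10(R)
    X = np.column_stack([np.ones_like(Mw), Mw, -R])
    (c, a, b), *_ = np.linalg.lstsq(X, lhs, rcond=None)
    return c, a, b

# Synthetic records generated from invented coefficients
# c = -2.0, a = 0.7, b = 0.002, over plausible Mw and distance ranges.
rng = np.random.default_rng(1)
Mw = rng.uniform(6.5, 9.0, 200)
R = rng.uniform(50.0, 800.0, 200)
logY = -2.0 + 0.7 * Mw - np.log10(R) - 0.002 * R
c, a, b = fit_attenuation(Mw, R, 10.0 ** logY)
```

Models (2)-(5) extend the design matrix with the basin dummy variables S (or Si) and the sediment-depth term D in the same way.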