Ridge: a computer program for calculating ridge regression estimates
Donald E. Hilt; Donald W. Seegrist
1977-01-01
Least-squares coefficients for multiple-regression models may be unstable when the independent variables are highly correlated. Ridge regression is a biased estimation procedure that produces stable estimates of the coefficients. Ridge regression is discussed, and a computer program for calculating the ridge coefficients is presented.
Farno, E; Coventry, K; Slatter, P; Eshtiaghi, N
2018-06-15
Sludge pumps in wastewater treatment plants are often oversized due to uncertainty in calculation of pressure drop. This issue costs millions of dollars for industry to purchase and operate the oversized pumps. Besides costs, higher electricity consumption is associated with extra CO 2 emission which creates huge environmental impacts. Calculation of pressure drop via current pipe flow theory requires model estimation of flow curve data which depends on regression analysis and also varies with natural variation of rheological data. This study investigates impact of variation of rheological data and regression analysis on variation of pressure drop calculated via current pipe flow theories. Results compare the variation of calculated pressure drop between different models and regression methods and suggest on the suitability of each method. Copyright © 2018 Elsevier Ltd. All rights reserved.
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
2014-01-01
Background Meta-regression is becoming increasingly used to model study level covariate effects. However this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
Experimental and computational prediction of glass transition temperature of drugs.
Alzghoul, Ahmad; Alhalaweh, Amjad; Mahlin, Denny; Bergström, Christel A S
2014-12-22
Glass transition temperature (Tg) is an important inherent property of an amorphous solid material which is usually determined experimentally. In this study, the relation between Tg and melting temperature (Tm) was evaluated using a data set of 71 structurally diverse druglike compounds. Further, in silico models for prediction of Tg were developed based on calculated molecular descriptors and linear (multilinear regression, partial least-squares, principal component regression) and nonlinear (neural network, support vector regression) modeling techniques. The models based on Tm predicted Tg with an RMSE of 19.5 K for the test set. Among the five computational models developed herein the support vector regression gave the best result with RMSE of 18.7 K for the test set using only four chemical descriptors. Hence, two different models that predict Tg of drug-like molecules with high accuracy were developed. If Tm is available, a simple linear regression can be used to predict Tg. However, the results also suggest that support vector regression and calculated molecular descriptors can predict Tg with equal accuracy, already before compound synthesis.
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Lu, Lin; Chang, Yunlong; Li, Yingmin; He, Youyou
2013-05-01
A transverse magnetic field was introduced to the arc plasma in the process of welding stainless steel tubes by high-speed Tungsten Inert Gas Arc Welding (TIG for short) without filler wire. The influence of external magnetic field on welding quality was investigated. 9 sets of parameters were designed by the means of orthogonal experiment. The welding joint tensile strength and form factor of weld were regarded as the main standards of welding quality. A binary quadratic nonlinear regression equation was established with the conditions of magnetic induction and flow rate of Ar gas. The residual standard deviation was calculated to adjust the accuracy of regression model. The results showed that, the regression model was correct and effective in calculating the tensile strength and aspect ratio of weld. Two 3D regression models were designed respectively, and then the impact law of magnetic induction on welding quality was researched.
Application of linear regression analysis in accuracy assessment of rolling force calculations
NASA Astrophysics Data System (ADS)
Poliak, E. I.; Shim, M. K.; Kim, G. S.; Choo, W. Y.
1998-10-01
Efficient operation of the computational models employed in process control systems require periodical assessment of the accuracy of their predictions. Linear regression is proposed as a tool which allows separate systematic and random prediction errors from those related to measurements. A quantitative characteristic of the model predictive ability is introduced in addition to standard statistical tests for model adequacy. Rolling force calculations are considered as an example for the application. However, the outlined approach can be used to assess the performance of any computational model.
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional logistic regression analysis and unconditional logistic regression analysis are commonly used in case control study, but Cox proportional hazard model is often used in survival data analysis. Most literature only refer to main effect model, however, generalized linear model differs from general linear model, and the interaction was composed of multiplicative interaction and additive interaction. The former is only statistical significant, but the latter has biological significance. In this paper, macros was written by using SAS 9.4 and the contrast ratio, attributable proportion due to interaction and synergy index were calculated while calculating the items of logistic and Cox regression interactions, and the confidence intervals of Wald, delta and profile likelihood were used to evaluate additive interaction for the reference in big data analysis in clinical epidemiology and in analysis of genetic multiplicative and additive interactions.
Nishii, Takashi; Genkawa, Takuma; Watari, Masahiro; Ozaki, Yukihiro
2012-01-01
A new selection procedure of an informative near-infrared (NIR) region for regression model building is proposed that uses an online NIR/mid-infrared (mid-IR) dual-region spectrometer in conjunction with two-dimensional (2D) NIR/mid-IR heterospectral correlation spectroscopy. In this procedure, both NIR and mid-IR spectra of a liquid sample are acquired sequentially during a reaction process using the NIR/mid-IR dual-region spectrometer; the 2D NIR/mid-IR heterospectral correlation spectrum is subsequently calculated from the obtained spectral data set. From the calculated 2D spectrum, a NIR region is selected that includes bands of high positive correlation intensity with mid-IR bands assigned to the analyte, and used for the construction of a regression model. To evaluate the performance of this procedure, a partial least-squares (PLS) regression model of the ethanol concentration in a fermentation process was constructed. During fermentation, NIR/mid-IR spectra in the 10000 - 1200 cm(-1) region were acquired every 3 min, and a 2D NIR/mid-IR heterospectral correlation spectrum was calculated to investigate the correlation intensity between the NIR and mid-IR bands. NIR regions that include bands at 4343, 4416, 5778, 5904, and 5955 cm(-1), which result from the combinations and overtones of the C-H group of ethanol, were selected for use in the PLS regression models, by taking the correlation intensity of a mid-IR band at 2985 cm(-1) arising from the CH(3) asymmetric stretching vibration mode of ethanol as a reference. The predicted results indicate that the ethanol concentrations calculated from the PLS regression models fit well to those obtained by high-performance liquid chromatography. Thus, it can be concluded that the selection procedure using the NIR/mid-IR dual-region spectrometer combined with 2D NIR/mid-IR heterospectral correlation spectroscopy is a powerful method for the construction of a reliable regression model.
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
NASA Astrophysics Data System (ADS)
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model is the representation of relationship between independent variable and dependent variable. The dependent variable has categories used in the logistic regression model to calculate odds on. The logistic regression model for dependent variable has levels in the logistics regression model is ordinal. GWOLR model is an ordinal logistic regression model influenced the geographical location of the observation site. Parameters estimation in the model needed to determine the value of a population based on sample. The purpose of this research is to parameters estimation of GWOLR model using R software. Parameter estimation uses the data amount of dengue fever patients in Semarang City. Observation units used are 144 villages in Semarang City. The results of research get GWOLR model locally for each village and to know probability of number dengue fever patient categories.
[Calculating Pearson residual in logistic regressions: a comparison between SPSS and SAS].
Xu, Hao; Zhang, Tao; Li, Xiao-song; Liu, Yuan-yuan
2015-01-01
To compare the results of Pearson residual calculations in logistic regression models using SPSS and SAS. We reviewed Pearson residual calculation methods, and used two sets of data to test logistic models constructed by SPSS and STATA. One model contained a small number of covariates compared to the number of observed. The other contained a similar number of covariates as the number of observed. The two software packages produced similar Pearson residual estimates when the models contained a similar number of covariates as the number of observed, but the results differed when the number of observed was much greater than the number of covariates. The two software packages produce different results of Pearson residuals, especially when the models contain a small number of covariates. Further studies are warranted.
Quantitative Structure Retention Relationships of Polychlorinated Dibenzodioxins and Dibenzofurans
1991-08-01
be a projection onto the X-Y plane. The algorithm for this calculation can be found in Stouch and Jurs (22), but was further refined by Rohrbaugh and...throughspace distances. WPSA2 (c) Weighted positive charged surface area. MOMH2 (c) Second major moment of inertia with hydrogens attached. CSTR 3 (d) Sum...of the models. The robust regression analysis method calculates a regression model using a least median squares algorithm which is not as susceptible
Li, Ji; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPC obtained using Laplace and penalized quasilikehood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends an immediate support to wider applications of VPC in scientific data analysis.
Neural Network and Regression Soft Model Extended for PAX-300 Aircraft Engine
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Hopkins, Dale A.
2002-01-01
In fiscal year 2001, the neural network and regression capabilities of NASA Glenn Research Center's COMETBOARDS design optimization testbed were extended to generate approximate models for the PAX-300 aircraft engine. The analytical model of the engine is defined through nine variables: the fan efficiency factor, the low pressure of the compressor, the high pressure of the compressor, the high pressure of the turbine, the low pressure of the turbine, the operating pressure, and three critical temperatures (T(sub 4), T(sub vane), and T(sub metal)). Numerical Propulsion System Simulation (NPSS) calculations of the specific fuel consumption (TSFC), as a function of the variables can become time consuming, and numerical instabilities can occur during these design calculations. "Soft" models can alleviate both deficiencies. These approximate models are generated from a set of high-fidelity input-output pairs obtained from the NPSS code and a design of the experiment strategy. A neural network and a regression model with 45 weight factors were trained for the input/output pairs. Then, the trained models were validated through a comparison with the original NPSS code. Comparisons of TSFC versus the operating pressure and of TSFC versus the three temperatures (T(sub 4), T(sub vane), and T(sub metal)) are depicted in the figures. The overall performance was satisfactory for both the regression and the neural network model. The regression model required fewer calculations than the neural network model, and it produced marginally superior results. Training the approximate methods is time consuming. Once trained, the approximate methods generated the solution with only a trivial computational effort, reducing the solution time from hours to less than a minute.
Granato, Gregory E.
2006-01-01
The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and data in subsequent rows. The user may choose the columns that contain the independent (X) and dependent (Y) variable. A third column, if present, may contain metadata such as the sample-collection location and date. The program screens the input files and plots the data. The KTRLine software is a graphical tool that facilitates development of regression models by use of graphs of the regression line with data, the regression residuals (with X or Y), and percentile plots of the cumulative frequency of the X variable, Y variable, and the regression residuals. The user may individually transform the independent and dependent variables to reduce heteroscedasticity and to linearize data. The program plots the data and the regression line. The program also prints model specifications and regression statistics to the screen. The user may save and print the regression results. The program can accept data sets that contain up to about 15,000 XY data points, but because the program must sort the array of all pairwise slopes, the program may be perceptibly slow with data sets that contain more than about 1,000 points.
Element enrichment factor calculation using grain-size distribution and functional data regression.
Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R
2015-01-01
In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
A Regression Framework for Effect Size Assessments in Longitudinal Modeling of Group Differences
Feingold, Alan
2013-01-01
The use of growth modeling analysis (GMA)--particularly multilevel analysis and latent growth modeling--to test the significance of intervention effects has increased exponentially in prevention science, clinical psychology, and psychiatry over the past 15 years. Model-based effect sizes for differences in means between two independent groups in GMA can be expressed in the same metric (Cohen’s d) commonly used in classical analysis and meta-analysis. This article first reviews conceptual issues regarding calculation of d for findings from GMA and then introduces an integrative framework for effect size assessments that subsumes GMA. The new approach uses the structure of the linear regression model, from which effect sizes for findings from diverse cross-sectional and longitudinal analyses can be calculated with familiar statistics, such as the regression coefficient, the standard deviation of the dependent measure, and study duration. PMID:23956615
Combustion performance and scale effect from N2O/HTPB hybrid rocket motor simulations
NASA Astrophysics Data System (ADS)
Shan, Fanli; Hou, Lingyun; Piao, Ying
2013-04-01
HRM code for the simulation of N2O/HTPB hybrid rocket motor operation and scale effect analysis has been developed. This code can be used to calculate motor thrust and distributions of physical properties inside the combustion chamber and nozzle during the operational phase by solving the unsteady Navier-Stokes equations using a corrected compressible difference scheme and a two-step, five species combustion model. A dynamic fuel surface regression technique and a two-step calculation method together with the gas-solid coupling are applied in the calculation of fuel regression and the determination of combustion chamber wall profile as fuel regresses. Both the calculated motor thrust from start-up to shut-down mode and the combustion chamber wall profile after motor operation are in good agreements with experimental data. The fuel regression rate equation and the relation between fuel regression rate and axial distance have been derived. Analysis of results suggests improvements in combustion performance to the current hybrid rocket motor design and explains scale effects in the variation of fuel regression rate with combustion chamber diameter.
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
NASA Astrophysics Data System (ADS)
Skrzypek, Grzegorz; Sadler, Rohan; Wiśniewski, Andrzej
2017-04-01
The stable oxygen isotope composition of phosphates (δ18O) extracted from mammalian bone and teeth material is commonly used as a proxy for paleotemperature. Historically, several different analytical and statistical procedures for determining air paleotemperatures from the measured δ18O of phosphates have been applied. This inconsistency in both stable isotope data processing and the application of statistical procedures has led to large and unwanted differences between calculated results. This study presents the uncertainty associated with two of the most commonly used regression methods: least squares inverted fit and transposed fit. We assessed the performance of these methods by designing and applying calculation experiments to multiple real-life data sets, calculating in reverse temperatures, and comparing them with true recorded values. Our calculations clearly show that the mean absolute errors are always substantially higher for the inverted fit (a causal model), with the transposed fit (a predictive model) returning mean values closer to the measured values (Skrzypek et al. 2015). The predictive models always performed better than causal models, with 12-65% lower mean absolute errors. Moreover, the least-squares regression (LSM) model is more appropriate than Reduced Major Axis (RMA) regression for calculating the environmental water stable oxygen isotope composition from phosphate signatures, as well as for calculating air temperature from the δ18O value of environmental water. The transposed fit introduces a lower overall error than the inverted fit for both the δ18O of environmental water and Tair calculations; therefore, the predictive models are more statistically efficient than the causal models in this instance. The direct comparison of paleotemperature results from different laboratories and studies may only be achieved if a single method of calculation is applied. Reference Skrzypek G., Sadler R., Wiśniewski A., 2016. Reassessment of recommendations for processing mammal phosphate δ18O data for paleotemperature reconstruction. Palaeogeography, Palaeoclimatology, Palaeoecology 446, 162-167.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
USE OF LETHALITY DATA DURING CATEGORICAL REGRESSION MODELING OF ACUTE REFERENCE EXPOSURES
Categorical regression is being considered by the U.S. EPA as an additional tool for derivation of acute reference exposures (AREs) to be used for human health risk assessment for exposure to inhaled chemicals. Categorical regression is used to calculate probability-response fun...
UCODE, a computer code for universal inverse modeling
Poeter, E.P.; Hill, M.C.
1999-01-01
This article presents the US Geological Survey computer program UCODE, which was developed in collaboration with the US Army Corps of Engineers Waterways Experiment Station and the International Ground Water Modeling Center of the Colorado School of Mines. UCODE performs inverse modeling, posed as a parameter-estimation problem, using nonlinear regression. Any application model or set of models can be used; the only requirement is that they have numerical (ASCII or text only) input and output files and that the numbers in these files have sufficient significant digits. Application models can include preprocessors and postprocessors as well as models related to the processes of interest (physical, chemical and so on), making UCODE extremely powerful for model calibration. Estimated parameters can be defined flexibly with user-specified functions. Observations to be matched in the regression can be any quantity for which a simulated equivalent value can be produced, thus simulated equivalent values are calculated using values that appear in the application model output files and can be manipulated with additive and multiplicative functions, if necessary. Prior, or direct, information on estimated parameters also can be included in the regression. The nonlinear regression problem is solved by minimizing a weighted least-squares objective function with respect to the parameter values using a modified Gauss-Newton method. Sensitivities needed for the method are calculated approximately by forward or central differences and problems and solutions related to this approximation are discussed. Statistics are calculated and printed for use in (1) diagnosing inadequate data or identifying parameters that probably cannot be estimated with the available data, (2) evaluating estimated parameter values, (3) evaluating the model representation of the actual processes and (4) quantifying the uncertainty of model simulated values. UCODE is intended for use on any computer operating system: it consists of algorithms programmed in perl, a freeware language designed for text manipulation and Fortran90, which efficiently performs numerical calculations.
Comparing The Effectiveness of a90/95 Calculations (Preprint)
2006-09-01
Nachtsheim, John Neter, William Li, Applied Linear Statistical Models , 5th ed., McGraw-Hill/Irwin, 2005 5. Mood, Graybill and Boes, Introduction...curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model There... linear . For example is a linear model ; is not. 2. Uniform variance (homoscedasticity
Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H
2009-01-01
This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial bias function (RBF) kernels the non-parametric models of relative blood volume (RBV) change with time as well as percentage change in HR with respect to RBV were obtained. The e-insensitivity based loss function was used for SVR modeling. Selection of the design parameters which includes capacity (C), insensitivity region (e) and the RBF kernel parameter (sigma) was made based on a grid search approach and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training as well as testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
Hybrid Rocket Performance Prediction with Coupling Method of CFD and Thermal Conduction Calculation
NASA Astrophysics Data System (ADS)
Funami, Yuki; Shimada, Toru
The final purpose of this study is to develop a design tool for hybrid rocket engines. This tool is a computer code which will be used in order to investigate rocket performance characteristics and unsteady phenomena lasting through the burning time, such as fuel regression or combustion oscillation. When phenomena inside a combustion chamber, namely boundary layer combustion, are described, it is difficult to use rigorous models for this target. It is because calculation cost may be too expensive. Therefore simple models are required for this calculation. In this study, quasi-one-dimensional compressible Euler equations for flowfields inside a chamber and the equation for thermal conduction inside a solid fuel are numerically solved. The energy balance equation at the solid fuel surface is solved to estimate fuel regression rate. Heat feedback model is Karabeyoglu's model dependent on total mass flux. Combustion model is global single step reaction model for 4 chemical species or chemical equilibrium model for 9 chemical species. As a first step, steady-state solutions are reported.
ERIC Educational Resources Information Center
Fraas, John W.; Newman, Isadore
1996-01-01
In a conjoint-analysis consumer-preference study, researchers must determine whether the product factor estimates, which measure consumer preferences, should be calculated and interpreted for each respondent or collectively. Multiple regression models can determine whether to aggregate data by examining factor-respondent interaction effects. This…
Kayser, W; Glaze, J B; Welch, C M; Kerley, M; Hill, R A
2015-07-01
The objective of this study was to determine the effects of alternative-measurements of body weight and DMI used to evaluate residual feed intake (RFI). Weaning weight (WW), ADG, and DMI were recorded on 970 growing purebred Charolais bulls (n = 519) and heifers (n = 451) and 153 Red Angus growing steers (n = 69) and heifers (n = 84) using a GrowSafe (GrowSafe, Airdrie, Alberta, Canada) system. Averages of individual DMI were calculated in 10-d increments and compared to the overall DMI to identify the magnitude of the errors associated with measuring DMI. These incremental measurements were also used in calculation of RFI, computed from the linear regression of DMI on ADG and midtest body weight0.75 (MMWT). RFI_Regress was calculated using ADG_Regress (ADG calculated as the response of BW gain and DOF) and MMWT_PWG (metabolic midweight calculated throughout the postweaning gain test), considered the control in Red Angus. A similar calculation served as control for Charolais; RFI was calculated using 2-d consecutive start and finish weights (RFI_Calc). The RFI weaning weight (RFI_WW) was calculated using ADG_WW (ADG from weaning till the final out weight of the postweaning gain test) and MMWT_WW, calculated similarly. Overall average estimated DMI was highly correlated to the measurements derived over shorter periods, with 10 d being the least correlated and 60 d being the most correlated. The ADG_Calc (calculated using 2-d consecutive start and finish weight/DOF) and ADG_WW were highly correlated in Charolais. The ADG_Regress and ADG_Calc were highly correlated, and ADG_Regress and ADG_WW were moderately correlated in Red Angus. The control measures of RFI were highly correlated with the RFI_WW in Charolais and Red Angus. The outcomes of including abbreviated period DMI in the model with the weaning weight gain measurements showed that the model using 10 d of intake (RFI WW_10) was the least correlated with the control measures. The model with 60 d of intake had the largest correlation with the control measures. The fewest measured intake days coupled with the weaning weight values providing acceptable predictive value was RFI_WW_40, being highly correlated with the control measures. As established in the literature, at least 70 d is required to accurately measure ADG. However, we conclude that a shorter period, possibly as few as 40 d is needed to accurately estimate DMI for a reliable calculation of RFI.
Espelt, Albert; Marí-Dell'Olmo, Marc; Penelo, Eva; Bosque-Prous, Marina
2016-06-14
To examine the differences between Prevalence Ratio (PR) and Odds Ratio (OR) in a cross-sectional study and to provide tools to calculate PR using two statistical packages widely used in substance use research (STATA and R). We used cross-sectional data from 41,263 participants of 16 European countries participating in the Survey on Health, Ageing and Retirement in Europe (SHARE). The dependent variable, hazardous drinking, was calculated using the Alcohol Use Disorders Identification Test - Consumption (AUDIT-C). The main independent variable was gender. Other variables used were: age, educational level and country of residence. PR of hazardous drinking in men with relation to women was estimated using Mantel-Haenszel method, log-binomial regression models and poisson regression models with robust variance. These estimations were compared to the OR calculated using logistic regression models. Prevalence of hazardous drinkers varied among countries. Generally, men have higher prevalence of hazardous drinking than women [PR=1.43 (1.38-1.47)]. Estimated PR was identical independently of the method and the statistical package used. However, OR overestimated PR, depending on the prevalence of hazardous drinking in the country. In cross-sectional studies, where comparisons between countries with differences in the prevalence of the disease or condition are made, it is advisable to use PR instead of OR.
Obtaining Predictions from Models Fit to Multiply Imputed Data
ERIC Educational Resources Information Center
Miles, Andrew
2016-01-01
Obtaining predictions from regression models fit to multiply imputed data can be challenging because treatments of multiple imputation seldom give clear guidance on how predictions can be calculated, and because available software often does not have built-in routines for performing the necessary calculations. This research note reviews how…
Linear models for calculating digestibile energy for sheep diets.
Fonnesbeck, P V; Christiansen, M L; Harris, L E
1981-05-01
Equations for estimating the digestible energy (DE) content of sheep diets were generated from the chemical contents and a factorial description of diets fed to lambs in digestion trials. The diet factors were two forages (alfalfa and grass hay), harvested at three stages of maturity (late vegetative, early bloom and full bloom), fed in two ingredient combinations (all hay or a 50:50 hay and corn grain mixture) and prepared by two forage texture processes (coarsely chopped or finely chopped and pelleted). The 2 x 3 x 2 x 2 factorial arrangement produced 24 diet treatments. These were replicated twice, for a total of 48 lamb digestion trials. In model 1 regression equations, DE was calculated directly from chemical composition of the diet. In model 2, regression equations predicted the percentage of digested nutrient from the chemical contents of the diet and then DE of the diet was calculated as the sum of the gross energy of the digested organic components. Expanded forms of model 1 and model 2 were also developed that included diet factors as qualitative indicator variables to adjust the regression constant and regression coefficients for the diet description. The expanded forms of the equations accounted for significantly more variation in DE than did the simple models and more accurately estimated DE of the diet. Information provided by the diet description proved as useful as chemical analyses for the prediction of digestibility of nutrients. The statistics indicate that, with model 1, neutral detergent fiber and plant cell wall analyses provided as much information for the estimation of DE as did model 2 with the combined information from crude protein, available carbohydrate, total lipid, cellulose and hemicellulose. Regression equations are presented for estimating DE with the most currently analyzed organic components, including linear and curvilinear variables and diet factors that significantly reduce the standard error of the estimate. To estimate De of a diet, the user utilizes the equation that uses the chemical analysis information and diet description most effectively.
Chai, Rui; Xu, Li-Sheng; Yao, Yang; Hao, Li-Ling; Qi, Lin
2017-01-01
This study analyzed ascending branch slope (A_slope), dicrotic notch height (Hn), diastolic area (Ad) and systolic area (As) diastolic blood pressure (DBP), systolic blood pressure (SBP), pulse pressure (PP), subendocardial viability ratio (SEVR), waveform parameter (k), stroke volume (SV), cardiac output (CO), and peripheral resistance (RS) of central pulse wave invasively and non-invasively measured. Invasively measured parameters were compared with parameters measured from brachial pulse waves by regression model and transfer function model. Accuracy of parameters estimated by regression and transfer function model, was compared too. Findings showed that k value, central pulse wave and brachial pulse wave parameters invasively measured, correlated positively. Regression model parameters including A_slope, DBP, SEVR, and transfer function model parameters had good consistency with parameters invasively measured. They had same effect of consistency. SBP, PP, SV, and CO could be calculated through the regression model, but their accuracies were worse than that of transfer function model.
Nakatsuka, Haruo; Chiba, Keiko; Watanabe, Takao; Sawatari, Hideyuki; Seki, Takako
2016-11-01
Iodine intake by adults in farming districts in Northeastern Japan was evaluated by two methods: (1) government-approved food composition tables based calculation and (2) instrumental measurement. The correlation between these two values and a regression model for the calibration of calculated values was presented. Iodine intake was calculated, using the values in the Japan Standard Tables of Food Composition (FCT), through the analysis of duplicate samples of complete 24-h food consumption for 90 adult subjects. In cases where the value for iodine content was not available in the FCT, it was assumed to be zero for that food item (calculated values). Iodine content was also measured by ICP-MS (measured values). Calculated and measured values rendered geometric means (GM) of 336 and 279 μg/day, respectively. There was no statistically significant (p > 0.05) difference between calculated and measured values. The correlation coefficient was 0.646 (p < 0.05). With this high correlation coefficient, a simple regression line can be applied to estimate measured value from calculated value. A survey of the literature suggests that the values in this study were similar to values that have been reported to date for Japan, and higher than those for other countries in Asia. Iodine intake of Japanese adults was 336 μg/day (GM, calculated) and 279 μg/day (GM, measured). Both values correlated so well, with a correlation coefficient of 0.646, that a regression model (Y = 130.8 + 1.9479X, where X and Y are measured and calculated values, respectively) could be used to calibrate calculated values.
Mapping diffuse photosynthetically active radiation from satellite data in Thailand
NASA Astrophysics Data System (ADS)
Choosri, P.; Janjai, S.; Nunez, M.; Buntoung, S.; Charuchittipan, D.
2017-12-01
In this paper, calculation of monthly average hourly diffuse photosynthetically active radiation (PAR) using satellite data is proposed. Diffuse PAR was analyzed at four stations in Thailand. A radiative transfer model was used for calculating the diffuse PAR for cloudless sky conditions. Differences between the diffuse PAR under all sky conditions obtained from the ground-based measurements and those from the model are representative of cloud effects. Two models are developed, one describing diffuse PAR only as a function of solar zenith angle, and the second one as a multiple linear regression with solar zenith angle and satellite reflectivity acting linearly and aerosol optical depth acting in logarithmic functions. When tested with an independent data set, the multiple regression model performed best with a higher coefficient of variance R2 (0.78 vs. 0.70), lower root mean square difference (RMSD) (12.92% vs. 13.05%) and the same mean bias difference (MBD) of -2.20%. Results from the multiple regression model are used to map diffuse PAR throughout the country as monthly averages of hourly data.
A kinetic energy model of two-vehicle crash injury severity.
Sobhani, Amir; Young, William; Logan, David; Bahrololoom, Sareh
2011-05-01
An important part of any model of vehicle crashes is the development of a procedure to estimate crash injury severity. After reviewing existing models of crash severity, this paper outlines the development of a modelling approach aimed at measuring the injury severity of people in two-vehicle road crashes. This model can be incorporated into a discrete event traffic simulation model, using simulation model outputs as its input. The model can then serve as an integral part of a simulation model estimating the crash potential of components of the traffic system. The model is developed using Newtonian Mechanics and Generalised Linear Regression. The factors contributing to the speed change (ΔV(s)) of a subject vehicle are identified using the law of conservation of momentum. A Log-Gamma regression model is fitted to measure speed change (ΔV(s)) of the subject vehicle based on the identified crash characteristics. The kinetic energy applied to the subject vehicle is calculated by the model, which in turn uses a Log-Gamma Regression Model to estimate the Injury Severity Score of the crash from the calculated kinetic energy, crash impact type, presence of airbag and/or seat belt and occupant age. Copyright © 2010 Elsevier Ltd. All rights reserved.
New approach to probability estimate of femoral neck fracture by fall (Slovak regression model).
Wendlova, J
2009-01-01
3,216 Slovak women with primary or secondary osteoporosis or osteopenia, aged 20-89 years, were examined with the bone densitometer DXA (dual energy X-ray absorptiometry, GE, Prodigy - Primo), x = 58.9, 95% C.I. (58.42; 59.38). The values of the following variables for each patient were measured: FSI (femur strength index), T-score total hip left, alpha angle - left, theta angle - left, HAL (hip axis length) left, BMI (body mass index) was calculated from the height and weight of the patients. Regression model determined the following order of independent variables according to the intensity of their influence upon the occurrence of values of dependent FSI variable: 1. BMI, 2. theta angle, 3. T-score total hip, 4. alpha angle, 5. HAL. The regression model equation, calculated from the variables monitored in the study, enables a doctor in praxis to determine the probability magnitude (absolute risk) for the occurrence of pathological value of FSI (FSI < 1) in the femoral neck area, i. e., allows for probability estimate of a femoral neck fracture by fall for Slovak women. 1. The Slovak regression model differs from regression models, published until now, in chosen independent variables and a dependent variable, belonging to biomechanical variables, characterising the bone quality. 2. The Slovak regression model excludes the inaccuracies of other models, which are not able to define precisely the current and past clinical condition of tested patients (e.g., to define the length and dose of exposure to risk factors). 3. The Slovak regression model opens the way to a new method of estimating the probability (absolute risk) or the odds for a femoral neck fracture by fall, based upon the bone quality determination. 4. It is assumed that the development will proceed by improving the methods enabling to measure the bone quality, determining the probability of fracture by fall (Tab. 6, Fig. 3, Ref. 22). Full Text (Free, PDF) www.bmj.sk.
Schell, Greggory J; Lavieri, Mariel S; Stein, Joshua D; Musch, David C
2013-12-21
Open-angle glaucoma (OAG) is a prevalent, degenerate ocular disease which can lead to blindness without proper clinical management. The tests used to assess disease progression are susceptible to process and measurement noise. The aim of this study was to develop a methodology which accounts for the inherent noise in the data and improve significant disease progression identification. Longitudinal observations from the Collaborative Initial Glaucoma Treatment Study (CIGTS) were used to parameterize and validate a Kalman filter model and logistic regression function. The Kalman filter estimates the true value of biomarkers associated with OAG and forecasts future values of these variables. We develop two logistic regression models via generalized estimating equations (GEE) for calculating the probability of experiencing significant OAG progression: one model based on the raw measurements from CIGTS and another model based on the Kalman filter estimates of the CIGTS data. Receiver operating characteristic (ROC) curves and associated area under the ROC curve (AUC) estimates are calculated using cross-fold validation. The logistic regression model developed using Kalman filter estimates as data input achieves higher sensitivity and specificity than the model developed using raw measurements. The mean AUC for the Kalman filter-based model is 0.961 while the mean AUC for the raw measurements model is 0.889. Hence, using the probability function generated via Kalman filter estimates and GEE for logistic regression, we are able to more accurately classify patients and instances as experiencing significant OAG progression. A Kalman filter approach for estimating the true value of OAG biomarkers resulted in data input which improved the accuracy of a logistic regression classification model compared to a model using raw measurements as input. This methodology accounts for process and measurement noise to enable improved discrimination between progression and nonprogression in chronic diseases.
Regression-based model of skin diffuse reflectance for skin color analysis
NASA Astrophysics Data System (ADS)
Tsumura, Norimichi; Kawazoe, Daisuke; Nakaguchi, Toshiya; Ojima, Nobutoshi; Miyake, Yoichi
2008-11-01
A simple regression-based model of skin diffuse reflectance is developed based on reflectance samples calculated by Monte Carlo simulation of light transport in a two-layered skin model. This reflectance model includes the values of spectral reflectance in the visible spectra for Japanese women. The modified Lambert Beer law holds in the proposed model with a modified mean free path length in non-linear density space. The averaged RMS and maximum errors of the proposed model were 1.1 and 3.1%, respectively, in the above range.
NASA Astrophysics Data System (ADS)
Wiley, E. O.
2010-07-01
Relative motion studies of visual double stars can be investigated using least squares regression techniques and readily accessible programs such as Microsoft Excel and a calculator. Optical pairs differ from physical pairs under most geometries in both their simple scatter plots and their regression models. A step-by-step protocol for estimating the rectilinear elements of an optical pair is presented. The characteristics of physical pairs using these techniques are discussed.
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-25
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by the Grubb's test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and the plain partial least squares regression. Both R² and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916 with bitterness values ranging from 0.63 to 4.78 were obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV, which was calculated using the models constructed by other methods, was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data.
OPC modeling by genetic algorithm
NASA Astrophysics Data System (ADS)
Huang, W. C.; Lai, C. M.; Luo, B.; Tsai, C. K.; Tsay, C. S.; Lai, C. W.; Kuo, C. C.; Liu, R. G.; Lin, H. T.; Lin, B. J.
2005-05-01
Optical proximity correction (OPC) is usually used to pre-distort mask layouts to make the printed patterns as close to the desired shapes as possible. For model-based OPC, a lithographic model to predict critical dimensions after lithographic processing is needed. The model is usually obtained via a regression of parameters based on experimental data containing optical proximity effects. When the parameters involve a mix of the continuous (optical and resist models) and the discrete (kernel numbers) sets, the traditional numerical optimization method may have difficulty handling model fitting. In this study, an artificial-intelligent optimization method was used to regress the parameters of the lithographic models for OPC. The implemented phenomenological models were constant-threshold models that combine diffused aerial image models with loading effects. Optical kernels decomposed from Hopkin"s equation were used to calculate aerial images on the wafer. Similarly, the numbers of optical kernels were treated as regression parameters. This way, good regression results were obtained with different sets of optical proximity effect data.
Oceanic Extreme Model Atmospheres for Aerothermodynamic Calculations,
Atmospheric temperature, Atmospheric sounding, Regression analysis, Aerothermodynamics, Marine meteorology, Radiosondes, Weather stations, Newfoundland(Province), Marshall Islands , Arabia, Iran, Coastal regions
NASA Astrophysics Data System (ADS)
Oh, Hyun-Joo; Lee, Saro; Chotikasathien, Wisut; Kim, Chang Hwan; Kwon, Ju Hyoung
2009-04-01
For predictive landslide susceptibility mapping, this study applied and verified probability model, the frequency ratio and statistical model, logistic regression at Pechabun, Thailand, using a geographic information system (GIS) and remote sensing. Landslide locations were identified in the study area from interpretation of aerial photographs and field surveys, and maps of the topography, geology and land cover were constructed to spatial database. The factors that influence landslide occurrence, such as slope gradient, slope aspect and curvature of topography and distance from drainage were calculated from the topographic database. Lithology and distance from fault were extracted and calculated from the geology database. Land cover was classified from Landsat TM satellite image. The frequency ratio and logistic regression coefficient were overlaid for landslide susceptibility mapping as each factor’s ratings. Then the landslide susceptibility map was verified and compared using the existing landslide location. As the verification results, the frequency ratio model showed 76.39% and logistic regression model showed 70.42% in prediction accuracy. The method can be used to reduce hazards associated with landslides and to plan land cover.
NASA Astrophysics Data System (ADS)
Beriro, D. J.; Abrahart, R. J.; Nathanail, C. P.
2012-04-01
Data-driven modelling is most commonly used to develop predictive models that will simulate natural processes. This paper, in contrast, uses Gene Expression Programming (GEP) to construct two alternative models of different pan evaporation estimations by means of symbolic regression: a simulator, a model of a real-world process developed on observed records, and an emulator, an imitator of some other model developed on predicted outputs calculated by that source model. The solutions are compared and contrasted for the purposes of determining whether any substantial differences exist between either option. This analysis will address recent arguments over the impact of using downloaded hydrological modelling datasets originating from different initial sources i.e. observed or calculated. These differences can be easily be overlooked by modellers, resulting in a model of a model developed on estimations derived from deterministic empirical equations and producing exceptionally high goodness-of-fit. This paper uses different lines-of-evidence to evaluate model output and in so doing paves the way for a new protocol in machine learning applications. Transparent modelling tools such as symbolic regression offer huge potential for explaining stochastic processes, however, the basic tenets of data quality and recourse to first principles with regard to problem understanding should not be trivialised. GEP is found to be an effective tool for the prediction of observed and calculated pan evaporation, with results supported by an understanding of the records, and of the natural processes concerned, evaluated using one-at-a-time response function sensitivity analysis. The results show that both architectures and response functions are very similar, implying that previously observed differences in goodness-of-fit can be explained by whether models are applied to observed or calculated data.
Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.
Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo
2016-01-01
In this work we present our efforts in building a model able to forecast patients' changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows taking into account individual and population variability in model parameters estimate. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than the ones gathered with standard logistic regression model, based on data pooling.
Garabedian, Stephen P.
1986-01-01
A nonlinear, least-squares regression technique for the estimation of ground-water flow model parameters was applied to the regional aquifer underlying the eastern Snake River Plain, Idaho. The technique uses a computer program to simulate two-dimensional, steady-state ground-water flow. Hydrologic data for the 1980 water year were used to calculate recharge rates, boundary fluxes, and spring discharges. Ground-water use was estimated from irrigated land maps and crop consumptive-use figures. These estimates of ground-water withdrawal, recharge rates, and boundary flux, along with leakance, were used as known values in the model calibration of transmissivity. Leakance values were adjusted between regression solutions by comparing model-calculated to measured spring discharges. In other simulations, recharge and leakance also were calibrated as prior-information regression parameters, which limits the variation of these parameters using a normalized standard error of estimate. Results from a best-fit model indicate a wide areal range in transmissivity from about 0.05 to 44 feet squared per second and in leakance from about 2.2x10 -9 to 6.0 x 10 -8 feet per second per foot. Along with parameter values, model statistics also were calculated, including the coefficient of correlation between calculated and observed head (0.996), the standard error of the estimates for head (40 feet), and the parameter coefficients of variation (about 10-40 percent). Additional boundary flux was added in some areas during calibration to achieve proper fit to ground-water flow directions. Model fit improved significantly when areas that violated model assumptions were removed. It also improved slightly when y-direction (northwest-southeast) transmissivity values were larger than x-direction (northeast-southwest) transmissivity values. The model was most sensitive to changes in recharge, and in some areas, to changes in transmissivity, particularly near the spring discharge area from Milner Dam to King Hill.
González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O
2013-07-01
There is extensive evidence of the negative impacts on health linked to the rise of the regional background of particulate matter (PM) 10 levels. These levels are often increased over urban areas becoming one of the main air pollution concerns. This is the case on the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7-year historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative-log [NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity-and qualitative-working days/weekends, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation. Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms following the Sawa's Bayesian Information Criteria (INT-BIC). Each type of model is calculated selecting two different periods: the training (it consists of 6 years) and the testing dataset (it consists of 1 year). The results of each type of model show that the INT-BIC-based model (R(2) = 0.42) is the best. Results were R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to the state-of-the art methodology. The related error calculated for longer time intervals (monthly or seasonal means) diminished significantly (R of 0.75-0.80 for monthly means and R of 0.80 to 0.98 at seasonally means) with respect to shorter periods.
Knol, Mirjam J; van der Tweel, Ingeborg; Grobbee, Diederick E; Numans, Mattijs E; Geerlings, Mirjam I
2007-10-01
To determine the presence of interaction in epidemiologic research, typically a product term is added to the regression model. In linear regression, the regression coefficient of the product term reflects interaction as departure from additivity. However, in logistic regression it refers to interaction as departure from multiplicativity. Rothman has argued that interaction estimated as departure from additivity better reflects biologic interaction. So far, literature on estimating interaction on an additive scale using logistic regression only focused on dichotomous determinants. The objective of the present study was to provide the methods to estimate interaction between continuous determinants and to illustrate these methods with a clinical example. and results From the existing literature we derived the formulas to quantify interaction as departure from additivity between one continuous and one dichotomous determinant and between two continuous determinants using logistic regression. Bootstrapping was used to calculate the corresponding confidence intervals. To illustrate the theory with an empirical example, data from the Utrecht Health Project were used, with age and body mass index as risk factors for elevated diastolic blood pressure. The methods and formulas presented in this article are intended to assist epidemiologists to calculate interaction on an additive scale between two variables on a certain outcome. The proposed methods are included in a spreadsheet which is freely available at: http://www.juliuscenter.nl/additive-interaction.xls.
NASA Astrophysics Data System (ADS)
Sultana, R.; Mroczek, M.; Dallman, S.; Sengupta, A.; Stein, E. D.
2016-12-01
The portion of the Total Impervious Area (TIA) that is hydraulically connected to the storm drainage network is called the Effective Impervious Area (EIA). The remaining fraction of impervious area, called the non-effective impervious area, drains onto pervious surfaces which do not contribute to runoff for smaller events. Using the TIA instead of EIA in models and calculations can lead to overestimates of runoff volumes peak discharges and oversizing of drainage system since it is assumed all impervious areas produce urban runoff that is directly connected to storm drains. This makes EIA a better predictor of actual runoff from urban catchments for hydraulic design of storm drain systems and modeling non-point source pollution. Compared to TIA, determining the EIA is considerably more difficult to calculate since it cannot be found by using remote sensing techniques, readily available EIA datasets, or aerial imagery interpretation alone. For this study, EIA percentages were calculated by two successive regression methods for five watersheds (with areas of 8.38 - 158mi2) located in Southern California using rainfall-runoff event data for the years 2004 - 2007. Runoff generated from the smaller storm events are considered to be emanating only from the effective impervious areas. Therefore, larger events that were considered to have runoff from both impervious and pervious surfaces were successively removed in the regression methods using a criterion of (1) 1mm and (2) a max (2 , 1mm) above the regression line. MSE is calculated from actual runoff and runoff predicted by the regression. Analysis of standard deviations showed that criterion of max (2 , 1mm) better fit the regression line and is the preferred method in predicting the EIA percentage. The estimated EIAs have shown to be approximately 78% to 43% of the TIA which shows use of EIA instead of TIA can have significant impact on the cost building urban hydraulic systems and stormwater capture devices.
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-01
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by the Grubb’s test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and the plain partial least squares regression. Both R2 and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916 with bitterness values ranging from 0.63 to 4.78 were obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV, which was calculated using the models constructed by other methods, was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data. PMID:26821026
NASA Astrophysics Data System (ADS)
Künne, A.; Fink, M.; Kipka, H.; Krause, P.; Flügel, W.-A.
2012-06-01
In this paper, a method is presented to estimate excess nitrogen on large scales considering single field processes. The approach was implemented by using the physically based model J2000-S to simulate the nitrogen balance as well as the hydrological dynamics within meso-scale test catchments. The model input data, the parameterization, the results and a detailed system understanding were used to generate the regression tree models with GUIDE (Loh, 2002). For each landscape type in the federal state of Thuringia a regression tree was calibrated and validated using the model data and results of excess nitrogen from the test catchments. Hydrological parameters such as precipitation and evapotranspiration were also used to predict excess nitrogen by the regression tree model. Hence they had to be calculated and regionalized as well for the state of Thuringia. Here the model J2000g was used to simulate the water balance on the macro scale. With the regression trees the excess nitrogen was regionalized for each landscape type of Thuringia. The approach allows calculating the potential nitrogen input into the streams of the drainage area. The results show that the applied methodology was able to transfer the detailed model results of the meso-scale catchments to the entire state of Thuringia by low computing time without losing the detailed knowledge from the nitrogen transport modeling. This was validated with modeling results from Fink (2004) in a catchment lying in the regionalization area. The regionalized and modeled excess nitrogen correspond with 94%. The study was conducted within the framework of a project in collaboration with the Thuringian Environmental Ministry, whose overall aim was to assess the effect of agro-environmental measures regarding load reduction in the water bodies of Thuringia to fulfill the requirements of the European Water Framework Directive (Bäse et al., 2007; Fink, 2006; Fink et al., 2007).
Simple linear and multivariate regression models.
Rodríguez del Águila, M M; Benítez-Parejo, N
2011-01-01
In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
Chai Rui; Li Si-Man; Xu Li-Sheng; Yao Yang; Hao Li-Ling
2017-07-01
This study mainly analyzed the parameters such as ascending branch slope (A_slope), dicrotic notch height (Hn), diastolic area (Ad) and systolic area (As) diastolic blood pressure (DBP), systolic blood pressure (SBP), pulse pressure (PP), subendocardial viability ratio (SEVR), waveform parameter (k), stroke volume (SV), cardiac output (CO) and peripheral resistance (RS) of central pulse wave invasively and non-invasively measured. These parameters extracted from the central pulse wave invasively measured were compared with the parameters measured from the brachial pulse waves by a regression model and a transfer function model. The accuracy of the parameters which were estimated by the regression model and the transfer function model was compared too. Our findings showed that in addition to the k value, the above parameters of the central pulse wave and the brachial pulse wave invasively measured had positive correlation. Both the regression model parameters including A_slope, DBP, SEVR and the transfer function model parameters had good consistency with the parameters invasively measured, and they had the same effect of consistency. The regression equations of the three parameters were expressed by Y'=a+bx. The SBP, PP, SV, CO of central pulse wave could be calculated through the regression model, but their accuracies were worse than that of transfer function model.
Sunkara, Vasu; Hébert, James R.
2015-01-01
BACKGROUND Disparities in cancer screening, incidence, treatment, and survival are worsening globally. The mortality-to-incidence ratio (MIR) has been used previously to evaluate such disparities. METHODS The MIR for colorectal cancer is calculated for all Organisation for Economic Cooperation and Development (OECD) countries using the 2012 GLOBOCAN incidence and mortality statistics. Health system rankings were obtained from the World Health Organization. Two linear regression models were fit with the MIR as the dependent variable and health system ranking as the independent variable; one included all countries and one model had the “divergents” removed. RESULTS The regression model for all countries explained 24% of the total variance in the MIR. Nine countries were found to have regression-calculated MIRs that differed from the actual MIR by >20%. Countries with lower-than-expected MIRs were found to have strong national health systems characterized by formal colorectal cancer screening programs. Conversely, countries with higher-than-expected MIRs lack screening programs. When these divergent points were removed from the data set, the recalculated regression model explained 60% of the total variance in the MIR. CONCLUSIONS The MIR proved useful for identifying disparities in cancer screening and treatment internationally. It has potential as an indicator of the long-term success of cancer surveillance programs and may be extended to other cancer types for these purposes. PMID:25572676
Sunkara, Vasu; Hébert, James R
2015-05-15
Disparities in cancer screening, incidence, treatment, and survival are worsening globally. The mortality-to-incidence ratio (MIR) has been used previously to evaluate such disparities. The MIR for colorectal cancer is calculated for all Organisation for Economic Cooperation and Development (OECD) countries using the 2012 GLOBOCAN incidence and mortality statistics. Health system rankings were obtained from the World Health Organization. Two linear regression models were fit with the MIR as the dependent variable and health system ranking as the independent variable; one included all countries and one model had the "divergents" removed. The regression model for all countries explained 24% of the total variance in the MIR. Nine countries were found to have regression-calculated MIRs that differed from the actual MIR by >20%. Countries with lower-than-expected MIRs were found to have strong national health systems characterized by formal colorectal cancer screening programs. Conversely, countries with higher-than-expected MIRs lack screening programs. When these divergent points were removed from the data set, the recalculated regression model explained 60% of the total variance in the MIR. The MIR proved useful for identifying disparities in cancer screening and treatment internationally. It has potential as an indicator of the long-term success of cancer surveillance programs and may be extended to other cancer types for these purposes. © 2015 American Cancer Society.
Yu, Yi-Lin; Yang, Yun-Ju; Lin, Chin; Hsieh, Chih-Chuan; Li, Chiao-Zhu; Feng, Shao-Wei; Tang, Chi-Tun; Chung, Tzu-Tsao; Ma, Hsin-I; Chen, Yuan-Hao; Ju, Da-Tong; Hueng, Dueng-Yuan
2017-01-01
Tumor control rates of pituitary adenomas (PAs) receiving adjuvant CyberKnife stereotactic radiosurgery (CK SRS) are high. However, there is currently no uniform way to estimate the time course of the disease. The aim of this study was to analyze the volumetric responses of PAs after CK SRS and investigate the application of an exponential decay model in calculating an accurate time course and estimation of the eventual outcome.A retrospective review of 34 patients with PAs who received adjuvant CK SRS between 2006 and 2013 was performed. Tumor volume was calculated using the planimetric method. The percent change in tumor volume and tumor volume rate of change were compared at median 4-, 10-, 20-, and 36-month intervals. Tumor responses were classified as: progression for >15% volume increase, regression for ≤15% decrease, and stabilization for ±15% of the baseline volume at the time of last follow-up. For each patient, the volumetric change versus time was fitted with an exponential model.The overall tumor control rate was 94.1% in the 36-month (range 18-87 months) follow-up period (mean volume change of -43.3%). Volume regression (mean decrease of -50.5%) was demonstrated in 27 (79%) patients, tumor stabilization (mean change of -3.7%) in 5 (15%) patients, and tumor progression (mean increase of 28.1%) in 2 (6%) patients (P = 0.001). Tumors that eventually regressed or stabilized had a temporary volume increase of 1.07% and 41.5% at 4 months after CK SRS, respectively (P = 0.017). The tumor volume estimated using the exponential fitting equation demonstrated high positive correlation with the actual volume calculated by magnetic resonance imaging (MRI) as tested by Pearson correlation coefficient (0.9).Transient progression of PAs post-CK SRS was seen in 62.5% of the patients receiving CK SRS, and it was not predictive of eventual volume regression or progression. A three-point exponential model is of potential predictive value according to relative distribution. An exponential decay model can be used to calculate the time course of tumors that are ultimately controlled.
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
The paper includes results of scientific researches. These researches are devoted to the application of statistical techniques, namely, regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, the gestational age, vascular-endothelial growth factor) measured at 3-5 days of children's life. In this paper a detailed description of the studied medical data is given. A binary logistic regression procedure is discussed in the paper. Basic results of the research are presented. A classification table of predicted values and factual observed values is shown, the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed, sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow carrying out diagnostics of health of children providing a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have a high practical importance in the professional activity of the author.
Applicability of Cameriere's and Drusini's age estimation methods to a sample of Turkish adults.
Hatice, Boyacioglu Dogru; Nihal, Avcu; Nursel, Akkaya; Humeyra Ozge, Yilanci; Goksuluk, Dincer
2017-10-01
The aim of this study was to investigate the applicability of Drusini's and Cameriere's methods to a sample of Turkish people. Panoramic images of 200 individuals were allocated into two groups as study and test groups and examined by two observers. Tooth coronal indexes (TCI), which is the ratio between coronal pulp cavity height and crown height, were calculated in the mandibular first and second premolars and molars. Pulp/tooth area ratios (ARs) were calculated in the maxillary and mandibular canine teeth. Study group measurements were used to derive a regression model. Test group measurements were used to evaluate the accuracy of the regression model. Pearson's correlation coefficients and regression analysis were used. The correlations between TCIs and age were -0.230, -0.301, -0.344 and -0.257 for mandibular first premolar, second premolar, first molar and second molar, respectively. Those for the maxillary canine (MX) and mandibular canine (MN) ARs were -0.716 and -0.514, respectively. The MX ARs were used to build the linear regression model that explained 51.2% of the total variation, with a standard error of 9.23 years. The mean error of the estimates in test group was 8 years and age of 64% of the individuals were estimated with an error of <±10 years which is acceptable in forensic age prediction. The low correlation coefficients between age and TCI indicate that Drusini's method was not applicable to the estimation of age in a Turkish population. Using Cameriere's method, we derived a regression model.
Evaluation of Cation Hydrolysis Schemes with a Pocket Calculator.
ERIC Educational Resources Information Center
Clare, Brian W.
1979-01-01
Described is the use of two models of pocket calculators. The Hewlett-Packard HP67 and the Texas Instruments TI59, to solve problems arising in connection with ionic equilibria in solution. A three-parameter regression program is described and listed as a specific example, the hydrolysis of hexavalent uranium, is provided. (BT)
Reflexion on linear regression trip production modelling method for ensuring good model quality
NASA Astrophysics Data System (ADS)
Suprayitno, Hitapriya; Ratnasari, Vita
2017-11-01
Transport Modelling is important. For certain cases, the conventional model still has to be used, in which having a good trip production model is capital. A good model can only be obtained from a good sample. Two of the basic principles of a good sampling is having a sample capable to represent the population characteristics and capable to produce an acceptable error at a certain confidence level. It seems that this principle is not yet quite understood and used in trip production modeling. Therefore, investigating the Trip Production Modelling practice in Indonesia and try to formulate a better modeling method for ensuring the Model Quality is necessary. This research result is presented as follows. Statistics knows a method to calculate span of prediction value at a certain confidence level for linear regression, which is called Confidence Interval of Predicted Value. The common modeling practice uses R2 as the principal quality measure, the sampling practice varies and not always conform to the sampling principles. An experiment indicates that small sample is already capable to give excellent R2 value and sample composition can significantly change the model. Hence, good R2 value, in fact, does not always mean good model quality. These lead to three basic ideas for ensuring good model quality, i.e. reformulating quality measure, calculation procedure, and sampling method. A quality measure is defined as having a good R2 value and a good Confidence Interval of Predicted Value. Calculation procedure must incorporate statistical calculation method and appropriate statistical tests needed. A good sampling method must incorporate random well distributed stratified sampling with a certain minimum number of samples. These three ideas need to be more developed and tested.
Yobbi, D.K.
2000-01-01
A nonlinear least-squares regression technique for estimation of ground-water flow model parameters was applied to an existing model of the regional aquifer system underlying west-central Florida. The regression technique minimizes the differences between measured and simulated water levels. Regression statistics, including parameter sensitivities and correlations, were calculated for reported parameter values in the existing model. Optimal parameter values for selected hydrologic variables of interest are estimated by nonlinear regression. Optimal estimates of parameter values are about 140 times greater than and about 0.01 times less than reported values. Independently estimating all parameters by nonlinear regression was impossible, given the existing zonation structure and number of observations, because of parameter insensitivity and correlation. Although the model yields parameter values similar to those estimated by other methods and reproduces the measured water levels reasonably accurately, a simpler parameter structure should be considered. Some possible ways of improving model calibration are to: (1) modify the defined parameter-zonation structure by omitting and/or combining parameters to be estimated; (2) carefully eliminate observation data based on evidence that they are likely to be biased; (3) collect additional water-level data; (4) assign values to insensitive parameters, and (5) estimate the most sensitive parameters first, then, using the optimized values for these parameters, estimate the entire data set.
Determining factors influencing survival of breast cancer by fuzzy logistic regression model.
Nikbakht, Roya; Bahrampour, Abbas
2017-01-01
Fuzzy logistic regression model can be used for determining influential factors of disease. This study explores the important factors of actual predictive survival factors of breast cancer's patients. We used breast cancer data which collected by cancer registry of Kerman University of Medical Sciences during the period of 2000-2007. The variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were applied in the fuzzy logistic regression model. Performance of model was determined in terms of mean degree of membership (MDM). The study results showed that almost 41% of patients were in neoplasm and malignant group and more than two-third of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, respectively. Furthermore, the MDM criteria show that the fuzzy logistic regression have a good fit on the data (MDM = 0.86). Fuzzy logistic regression model showed that chemotherapy is more important than radiotherapy in survival of patients with breast cancer. In addition, another ability of this model is calculating possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Furthermore, there are few studies which applied the fuzzy logistic models. Furthermore, we recommend using this model in various research areas.
A web-based normative calculator for the uniform data set (UDS) neuropsychological test battery.
Shirk, Steven D; Mitchell, Meghan B; Shaughnessy, Lynn W; Sherman, Janet C; Locascio, Joseph J; Weintraub, Sandra; Atri, Alireza
2011-11-11
With the recent publication of new criteria for the diagnosis of preclinical Alzheimer's disease (AD), there is a need for neuropsychological tools that take premorbid functioning into account in order to detect subtle cognitive decline. Using demographic adjustments is one method for increasing the sensitivity of commonly used measures. We sought to provide a useful online z-score calculator that yields estimates of percentile ranges and adjusts individual performance based on sex, age and/or education for each of the neuropsychological tests of the National Alzheimer's Coordinating Center Uniform Data Set (NACC, UDS). In addition, we aimed to provide an easily accessible method of creating norms for other clinical researchers for their own, unique data sets. Data from 3,268 clinically cognitively-normal older UDS subjects from a cohort reported by Weintraub and colleagues (2009) were included. For all neuropsychological tests, z-scores were estimated by subtracting the raw score from the predicted mean and then dividing this difference score by the root mean squared error term (RMSE) for a given linear regression model. For each neuropsychological test, an estimated z-score was calculated for any raw score based on five different models that adjust for the demographic predictors of SEX, AGE and EDUCATION, either concurrently, individually or without covariates. The interactive online calculator allows the entry of a raw score and provides five corresponding estimated z-scores based on predictions from each corresponding linear regression model. The calculator produces percentile ranks and graphical output. An interactive, regression-based, normative score online calculator was created to serve as an additional resource for UDS clinical researchers, especially in guiding interpretation of individual performances that appear to fall in borderline realms and may be of particular utility for operationalizing subtle cognitive impairment present according to the newly proposed criteria for Stage 3 preclinical Alzheimer's disease.
Bayesian Regression of Thermodynamic Models of Redox Active Materials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnston, Katherine
Finding a suitable functional redox material is a critical challenge to achieving scalable, economically viable technologies for storing concentrated solar energy in the form of a defected oxide. Demonstrating e ectiveness for thermal storage or solar fuel is largely accomplished by using a thermodynamic model derived from experimental data. The purpose of this project is to test the accuracy of our regression model on representative data sets. Determining the accuracy of the model includes parameter tting the model to the data, comparing the model using di erent numbers of param- eters, and analyzing the entropy and enthalpy calculated from themore » model. Three data sets were considered in this project: two demonstrating materials for solar fuels by wa- ter splitting and the other of a material for thermal storage. Using Bayesian Inference and Markov Chain Monte Carlo (MCMC), parameter estimation was preformed on the three data sets. Good results were achieved, except some there was some deviations on the edges of the data input ranges. The evidence values were then calculated in a variety of ways and used to compare models with di erent number of parameters. It was believed that at least one of the parameters was unnecessary and comparing evidence values demonstrated that the parameter was need on one data set and not signi cantly helpful on another. The entropy was calculated by taking the derivative in one variable and integrating over another. and its uncertainty was also calculated by evaluating the entropy over multiple MCMC samples. Afterwards, all the parts were written up as a tutorial for the Uncertainty Quanti cation Toolkit (UQTk).« less
NASA Astrophysics Data System (ADS)
Staley, Dennis; Negri, Jacquelyn; Kean, Jason
2016-04-01
Population expansion into fire-prone steeplands has resulted in an increase in post-fire debris-flow risk in the western United States. Logistic regression methods for determining debris-flow likelihood and the calculation of empirical rainfall intensity-duration thresholds for debris-flow initiation represent two common approaches for characterizing hazard and reducing risk. Logistic regression models are currently being used to rapidly assess debris-flow hazard in response to design storms of known intensities (e.g. a 10-year recurrence interval rainstorm). Empirical rainfall intensity-duration thresholds comprise a major component of the United States Geological Survey (USGS) and the National Weather Service (NWS) debris-flow early warning system at a regional scale in southern California. However, these two modeling approaches remain independent, with each approach having limitations that do not allow for synergistic local-scale (e.g. drainage-basin scale) characterization of debris-flow hazard during intense rainfall. The current logistic regression equations consider rainfall a unique independent variable, which prevents the direct calculation of the relation between rainfall intensity and debris-flow likelihood. Regional (e.g. mountain range or physiographic province scale) rainfall intensity-duration thresholds fail to provide insight into the basin-scale variability of post-fire debris-flow hazard and require an extensive database of historical debris-flow occurrence and rainfall characteristics. Here, we present a new approach that combines traditional logistic regression and intensity-duration threshold methodologies. This method allows for local characterization of both the likelihood that a debris-flow will occur at a given rainfall intensity, the direct calculation of the rainfall rates that will result in a given likelihood, and the ability to calculate spatially explicit rainfall intensity-duration thresholds for debris-flow generation in recently burned areas. Our approach synthesizes the two methods by incorporating measured rainfall intensity into each model variable (based on measures of topographic steepness, burn severity and surface properties) within the logistic regression equation. This approach provides a more realistic representation of the relation between rainfall intensity and debris-flow likelihood, as likelihood values asymptotically approach zero when rainfall intensity approaches 0 mm/h, and increase with more intense rainfall. Model performance was evaluated by comparing predictions to several existing regional thresholds. The model, based upon training data collected in southern California, USA, has proven to accurately predict rainfall intensity-duration thresholds for other areas in the western United States not included in the original training dataset. In addition, the improved logistic regression model shows promise for emergency planning purposes and real-time, site-specific early warning. With further validation, this model may permit the prediction of spatially-explicit intensity-duration thresholds for debris-flow generation in areas where empirically derived regional thresholds do not exist. This improvement would permit the expansion of the early-warning system into other regions susceptible to post-fire debris flow.
Gu, Yingxin; Wylie, Bruce K.; Boyte, Stephen; Picotte, Joshua J.; Howard, Danny; Smith, Kelcy; Nelson, Kurtis
2016-01-01
Regression tree models have been widely used for remote sensing-based ecosystem mapping. Improper use of the sample data (model training and testing data) may cause overfitting and underfitting effects in the model. The goal of this study is to develop an optimal sampling data usage strategy for any dataset and identify an appropriate number of rules in the regression tree model that will improve its accuracy and robustness. Landsat 8 data and Moderate-Resolution Imaging Spectroradiometer-scaled Normalized Difference Vegetation Index (NDVI) were used to develop regression tree models. A Python procedure was designed to generate random replications of model parameter options across a range of model development data sizes and rule number constraints. The mean absolute difference (MAD) between the predicted and actual NDVI (scaled NDVI, value from 0–200) and its variability across the different randomized replications were calculated to assess the accuracy and stability of the models. In our case study, a six-rule regression tree model developed from 80% of the sample data had the lowest MAD (MADtraining = 2.5 and MADtesting = 2.4), which was suggested as the optimal model. This study demonstrates how the training data and rule number selections impact model accuracy and provides important guidance for future remote-sensing-based ecosystem modeling.
Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.
Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam
2017-11-01
The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). This modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.
A critical re-evaluation of the regression model specification in the US D1 EQ-5D value function
2012-01-01
Background The EQ-5D is a generic health-related quality of life instrument (five dimensions with three levels, 243 health states), used extensively in cost-utility/cost-effectiveness analyses. EQ-5D health states are assigned values on a scale anchored in perfect health (1) and death (0). The dominant procedure for defining values for EQ-5D health states involves regression modeling. These regression models have typically included a constant term, interpreted as the utility loss associated with any movement away from perfect health. The authors of the United States EQ-5D valuation study replaced this constant with a variable, D1, which corresponds to the number of impaired dimensions beyond the first. The aim of this study was to illustrate how the use of the D1 variable in place of a constant is problematic. Methods We compared the original D1 regression model with a mathematically equivalent model with a constant term. Comparisons included implications for the magnitude and statistical significance of the coefficients, multicollinearity (variance inflation factors, or VIFs), number of calculation steps needed to determine tariff values, and consequences for tariff interpretation. Results Using the D1 variable in place of a constant shifted all dummy variable coefficients away from zero by the value of the constant, greatly increased the multicollinearity of the model (maximum VIF of 113.2 vs. 21.2), and increased the mean number of calculation steps required to determine health state values. Discussion Using the D1 variable in place of a constant constitutes an unnecessary complication of the model, obscures the fact that at least two of the main effect dummy variables are statistically nonsignificant, and complicates and biases interpretation of the tariff algorithm. PMID:22244261
A critical re-evaluation of the regression model specification in the US D1 EQ-5D value function.
Rand-Hendriksen, Kim; Augestad, Liv A; Dahl, Fredrik A
2012-01-13
The EQ-5D is a generic health-related quality of life instrument (five dimensions with three levels, 243 health states), used extensively in cost-utility/cost-effectiveness analyses. EQ-5D health states are assigned values on a scale anchored in perfect health (1) and death (0).The dominant procedure for defining values for EQ-5D health states involves regression modeling. These regression models have typically included a constant term, interpreted as the utility loss associated with any movement away from perfect health. The authors of the United States EQ-5D valuation study replaced this constant with a variable, D1, which corresponds to the number of impaired dimensions beyond the first. The aim of this study was to illustrate how the use of the D1 variable in place of a constant is problematic. We compared the original D1 regression model with a mathematically equivalent model with a constant term. Comparisons included implications for the magnitude and statistical significance of the coefficients, multicollinearity (variance inflation factors, or VIFs), number of calculation steps needed to determine tariff values, and consequences for tariff interpretation. Using the D1 variable in place of a constant shifted all dummy variable coefficients away from zero by the value of the constant, greatly increased the multicollinearity of the model (maximum VIF of 113.2 vs. 21.2), and increased the mean number of calculation steps required to determine health state values. Using the D1 variable in place of a constant constitutes an unnecessary complication of the model, obscures the fact that at least two of the main effect dummy variables are statistically nonsignificant, and complicates and biases interpretation of the tariff algorithm.
A Novel Degradation Identification Method for Wind Turbine Pitch System
NASA Astrophysics Data System (ADS)
Guo, Hui-Dong
2018-04-01
It’s difficult for traditional threshold value method to identify degradation of operating equipment accurately. An novel degradation evaluation method suitable for wind turbine condition maintenance strategy implementation was proposed in this paper. Based on the analysis of typical variable-speed pitch-to-feather control principle and monitoring parameters for pitch system, a multi input multi output (MIMO) regression model was applied to pitch system, where wind speed, power generation regarding as input parameters, wheel rotation speed, pitch angle and motor driving currency for three blades as output parameters. Then, the difference between the on-line measurement and the calculated value from the MIMO regression model applying least square support vector machines (LSSVM) method was defined as the Observed Vector of the system. The Gaussian mixture model (GMM) was applied to fitting the distribution of the multi dimension Observed Vectors. Applying the model established, the Degradation Index was calculated using the SCADA data of a wind turbine damaged its pitch bearing retainer and rolling body, which illustrated the feasibility of the provided method.
Yoneoka, Daisuke; Henmi, Masayuki
2017-06-01
Recently, the number of regression models has dramatically increased in several academic fields. However, within the context of meta-analysis, synthesis methods for such models have not been developed in a commensurate trend. One of the difficulties hindering the development is the disparity in sets of covariates among literature models. If the sets of covariates differ across models, interpretation of coefficients will differ, thereby making it difficult to synthesize them. Moreover, previous synthesis methods for regression models, such as multivariate meta-analysis, often have problems because covariance matrix of coefficients (i.e. within-study correlations) or individual patient data are not necessarily available. This study, therefore, proposes a brief explanation regarding a method to synthesize linear regression models under different covariate sets by using a generalized least squares method involving bias correction terms. Especially, we also propose an approach to recover (at most) threecorrelations of covariates, which is required for the calculation of the bias term without individual patient data. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used in selecting important predictor variables, diagnostics are used to check that assumptions are valid which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographics profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nimbalkar, Sachin U.; Wenning, Thomas J.; Guo, Wei
In the United States, manufacturing facilities account for about 32% of total domestic energy consumption in 2014. Robust energy tracking methodologies are critical to understanding energy performance in manufacturing facilities. Due to its simplicity and intuitiveness, the classic energy intensity method (i.e. the ratio of total energy use over total production) is the most widely adopted. However, the classic energy intensity method does not take into account the variation of other relevant parameters (i.e. product type, feed stock type, weather, etc.). Furthermore, the energy intensity method assumes that the facilities’ base energy consumption (energy use at zero production) is zero,more » which rarely holds true. Therefore, it is commonly recommended to utilize regression models rather than the energy intensity approach for tracking improvements at the facility level. Unfortunately, many energy managers have difficulties understanding why regression models are statistically better than utilizing the classic energy intensity method. While anecdotes and qualitative information may convince some, many have major reservations about the accuracy of regression models and whether it is worth the time and effort to gather data and build quality regression models. This paper will explain why regression models are theoretically and quantitatively more accurate for tracking energy performance improvements. Based on the analysis of data from 114 manufacturing plants over 12 years, this paper will present quantitative results on the importance of utilizing regression models over the energy intensity methodology. This paper will also document scenarios where regression models do not have significant relevance over the energy intensity method.« less
Influence of soil pH on the sorption of ionizable chemicals: modeling advances.
Franco, Antonio; Fu, Wenjing; Trapp, Stefan
2009-03-01
The soil-water distribution coefficient of ionizable chemicals (K(d)) depends on the soil acidity, mainly because the pH governs speciation. Using pH-specific K(d) values normalized to organic carbon (K(OC)) from the literature, a method was developed to estimate the K(OC) of monovalent organic acids and bases. The regression considers pH-dependent speciation and species-specific partition coefficients, calculated from the dissociation constant (pK(a)) and the octanol-water partition coefficient of the neutral molecule (log P(n)). Probably because of the lower pH near the organic colloid-water interface, the optimal pH to model dissociation was lower than the bulk soil pH. The knowledge of the soil pH allows calculation of the fractions of neutral and ionic molecules in the system, thus improving the existing regression for acids. The same approach was not successful with bases, for which the impact of pH on the total sorption is contrasting. In fact, the shortcomings of the model assumptions affect the predictive power for acids and for bases differently. We evaluated accuracy and limitations of the regressions for their use in the environmental fate assessment of ionizable chemicals.
Detecting isotopic ratio outliers
NASA Astrophysics Data System (ADS)
Bayne, C. K.; Smith, D. H.
An alternative method is proposed for improving isotopic ratio estimates. This method mathematically models pulse-count data and uses iterative reweighted Poisson regression to estimate model parameters to calculate the isotopic ratios. This computer-oriented approach provides theoretically better methods than conventional techniques to establish error limits and to identify outliers.
Independent contrasts and PGLS regression estimators are equivalent.
Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary
2012-05-01
We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
Wherry, Susan A.; Wood, Tamara M.
2018-04-27
A whole lake eutrophication (WLE) model approach for phosphorus and cyanobacterial biomass in Upper Klamath Lake, south-central Oregon, is presented here. The model is a successor to a previous model developed to inform a Total Maximum Daily Load (TMDL) for phosphorus in the lake, but is based on net primary production (NPP), which can be calculated from dissolved oxygen, rather than scaling up a small-scale description of cyanobacterial growth and respiration rates. This phase 3 WLE model is a refinement of the proof-of-concept developed in phase 2, which was the first attempt to use NPP to simulate cyanobacteria in the TMDL model. The calibration of the calculated NPP WLE model was successful, with performance metrics indicating a good fit to calibration data, and the calculated NPP WLE model was able to simulate mid-season bloom decreases, a feature that previous models could not reproduce.In order to use the model to simulate future scenarios based on phosphorus load reduction, a multivariate regression model was created to simulate NPP as a function of the model state variables (phosphorus and chlorophyll a) and measured meteorological and temperature model inputs. The NPP time series was split into a low- and high-frequency component using wavelet analysis, and regression models were fit to the components separately, with moderate success.The regression models for NPP were incorporated in the WLE model, referred to as the “scenario” WLE (SWLE), and the fit statistics for phosphorus during the calibration period were mostly unchanged. The fit statistics for chlorophyll a, however, were degraded. These statistics are still an improvement over prior models, and indicate that the SWLE is appropriate for long-term predictions even though it misses some of the seasonal variations in chlorophyll a.The complete whole lake SWLE model, with multivariate regression to predict NPP, was used to make long-term simulations of the response to 10-, 20-, and 40-percent reductions in tributary nutrient loads. The long-term mean water column concentration of total phosphorus was reduced by 9, 18, and 36 percent, respectively, in response to these load reductions. The long-term water column chlorophyll a concentration was reduced by 4, 13, and 44 percent, respectively. The adjustment to a new equilibrium between the water column and sediments occurred over about 30 years.
Simulation of land use change in the three gorges reservoir area based on CART-CA
NASA Astrophysics Data System (ADS)
Yuan, Min
2018-05-01
This study proposes a new method to simulate spatiotemporal complex multiple land uses by using classification and regression tree algorithm (CART) based CA model. In this model, we use classification and regression tree algorithm to calculate land class conversion probability, and combine neighborhood factor, random factor to extract cellular transformation rules. The overall Kappa coefficient is 0.8014 and the overall accuracy is 0.8821 in the land dynamic simulation results of the three gorges reservoir area from 2000 to 2010, and the simulation results are satisfactory.
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
Geographically weighted regression model on poverty indicator
NASA Astrophysics Data System (ADS)
Slamet, I.; Nugroho, N. F. T. A.; Muslich
2017-12-01
In this research, we applied geographically weighted regression (GWR) for analyzing the poverty in Central Java. We consider Gaussian Kernel as weighted function. The GWR uses the diagonal matrix resulted from calculating kernel Gaussian function as a weighted function in the regression model. The kernel weights is used to handle spatial effects on the data so that a model can be obtained for each location. The purpose of this paper is to model of poverty percentage data in Central Java province using GWR with Gaussian kernel weighted function and to determine the influencing factors in each regency/city in Central Java province. Based on the research, we obtained geographically weighted regression model with Gaussian kernel weighted function on poverty percentage data in Central Java province. We found that percentage of population working as farmers, population growth rate, percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. In this research, we found the determination coefficient R2 are 68.64%. There are two categories of district which are influenced by different of significance factors.
Yu, Yi-Lin; Yang, Yun-Ju; Lin, Chin; Hsieh, Chih-Chuan; Li, Chiao-Zhu; Feng, Shao-Wei; Tang, Chi-Tun; Chung, Tzu-Tsao; Ma, Hsin-I; Chen, Yuan-Hao; Ju, Da-Tong; Hueng, Dueng-Yuan
2017-01-01
Abstract Tumor control rates of pituitary adenomas (PAs) receiving adjuvant CyberKnife stereotactic radiosurgery (CK SRS) are high. However, there is currently no uniform way to estimate the time course of the disease. The aim of this study was to analyze the volumetric responses of PAs after CK SRS and investigate the application of an exponential decay model in calculating an accurate time course and estimation of the eventual outcome. A retrospective review of 34 patients with PAs who received adjuvant CK SRS between 2006 and 2013 was performed. Tumor volume was calculated using the planimetric method. The percent change in tumor volume and tumor volume rate of change were compared at median 4-, 10-, 20-, and 36-month intervals. Tumor responses were classified as: progression for >15% volume increase, regression for ≤15% decrease, and stabilization for ±15% of the baseline volume at the time of last follow-up. For each patient, the volumetric change versus time was fitted with an exponential model. The overall tumor control rate was 94.1% in the 36-month (range 18–87 months) follow-up period (mean volume change of −43.3%). Volume regression (mean decrease of −50.5%) was demonstrated in 27 (79%) patients, tumor stabilization (mean change of −3.7%) in 5 (15%) patients, and tumor progression (mean increase of 28.1%) in 2 (6%) patients (P = 0.001). Tumors that eventually regressed or stabilized had a temporary volume increase of 1.07% and 41.5% at 4 months after CK SRS, respectively (P = 0.017). The tumor volume estimated using the exponential fitting equation demonstrated high positive correlation with the actual volume calculated by magnetic resonance imaging (MRI) as tested by Pearson correlation coefficient (0.9). Transient progression of PAs post-CK SRS was seen in 62.5% of the patients receiving CK SRS, and it was not predictive of eventual volume regression or progression. A three-point exponential model is of potential predictive value according to relative distribution. An exponential decay model can be used to calculate the time course of tumors that are ultimately controlled. PMID:28121913
Nonlinear-regression flow model of the Gulf Coast aquifer systems in the south-central United States
Kuiper, L.K.
1994-01-01
A multiple-regression methodology was used to help answer questions concerning model reliability, and to calibrate a time-dependent variable-density ground-water flow model of the gulf coast aquifer systems in the south-central United States. More than 40 regression models with 2 to 31 regressions parameters are used and detailed results are presented for 12 of the models. More than 3,000 values for grid-element volume-averaged head and hydraulic conductivity are used for the regression model observations. Calculated prediction interval half widths, though perhaps inaccurate due to a lack of normality of the residuals, are the smallest for models with only four regression parameters. In addition, the root-mean weighted residual decreases very little with an increase in the number of regression parameters. The various models showed considerable overlap between the prediction inter- vals for shallow head and hydraulic conductivity. Approximate 95-percent prediction interval half widths for volume-averaged freshwater head exceed 108 feet; for volume-averaged base 10 logarithm hydraulic conductivity, they exceed 0.89. All of the models are unreliable for the prediction of head and ground-water flow in the deeper parts of the aquifer systems, including the amount of flow coming from the underlying geopressured zone. Truncating the domain of solution of one model to exclude that part of the system having a ground-water density greater than 1.005 grams per cubic centimeter or to exclude that part of the systems below a depth of 3,000 feet, and setting the density to that of freshwater does not appreciably change the results for head and ground-water flow, except for locations close to the truncation surface.
ASR4. Anelastic Strain Recovery Analysis Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Warpinski, N.R.
ASR4 is a nonlinear least-squares regression of Anelastic Strain Recovery (ASR) data for the purpose of determining in situ stress orientations and magnitudes. ASR4 fits the viscoelastic model of Warpinski and Teufel to measure ASR data, calculates the stress orientations directly, and stress magnitudes if sufficient input data are available. The code also calculates the stress orientation using strain-rosette equations, and it calculates stress magnitudes using Blanton`s approach, assuming sufficient input data are available.
Anelastic Strain Recovery Analysis Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
ASR4 is a nonlinear least-squares regression of Anelastic Strain Recovery (ASR) data for the purpose of determining in situ stress orientations and magnitudes. ASR4 fits the viscoelastic model of Warpinski and Teufel to measure ASR data, calculates the stress orientations directly, and stress magnitudes if sufficient input data are available. The code also calculates the stress orientation using strain-rosette equations, and it calculates stress magnitudes using Blanton''s approach, assuming sufficient input data are available.
Hart, Carl R; Reznicek, Nathan J; Wilson, D Keith; Pettit, Chris L; Nykaza, Edward T
2016-05-01
Many outdoor sound propagation models exist, ranging from highly complex physics-based simulations to simplified engineering calculations, and more recently, highly flexible statistical learning methods. Several engineering and statistical learning models are evaluated by using a particular physics-based model, namely, a Crank-Nicholson parabolic equation (CNPE), as a benchmark. Narrowband transmission loss values predicted with the CNPE, based upon a simulated data set of meteorological, boundary, and source conditions, act as simulated observations. In the simulated data set sound propagation conditions span from downward refracting to upward refracting, for acoustically hard and soft boundaries, and low frequencies. Engineering models used in the comparisons include the ISO 9613-2 method, Harmonoise, and Nord2000 propagation models. Statistical learning methods used in the comparisons include bagged decision tree regression, random forest regression, boosting regression, and artificial neural network models. Computed skill scores are relative to sound propagation in a homogeneous atmosphere over a rigid ground. Overall skill scores for the engineering noise models are 0.6%, -7.1%, and 83.8% for the ISO 9613-2, Harmonoise, and Nord2000 models, respectively. Overall skill scores for the statistical learning models are 99.5%, 99.5%, 99.6%, and 99.6% for bagged decision tree, random forest, boosting, and artificial neural network regression models, respectively.
Chen, Rui; Xie, Liping; Xue, Wei; Ye, Zhangqun; Ma, Lulin; Gao, Xu; Ren, Shancheng; Wang, Fubo; Zhao, Lin; Xu, Chuanliang; Sun, Yinghao
2016-09-01
Substantial differences exist in the relationship of prostate cancer (PCa) detection rate and prostate-specific antigen (PSA) level between Western and Asian populations. Classic Western risk calculators, European Randomized Study for Screening of Prostate Cancer Risk Calculator, and Prostate Cancer Prevention Trial Risk Calculator, were shown to be not applicable in Asian populations. We aimed to develop and validate a risk calculator for predicting the probability of PCa and high-grade PCa (defined as Gleason Score sum 7 or higher) at initial prostate biopsy in Chinese men. Urology outpatients who underwent initial prostate biopsy according to the inclusion criteria were included. The multivariate logistic regression-based Chinese Prostate Cancer Consortium Risk Calculator (CPCC-RC) was constructed with cases from 2 hospitals in Shanghai. Discriminative ability, calibration and decision curve analysis were externally validated in 3 CPCC member hospitals. Of the 1,835 patients involved, PCa was identified in 338/924 (36.6%) and 294/911 (32.3%) men in the development and validation cohort, respectively. Multivariate logistic regression analyses showed that 5 predictors (age, logPSA, logPV, free PSA ratio, and digital rectal examination) were associated with PCa (Model 1) or high-grade PCa (Model 2), respectively. The area under the curve of Model 1 and Model 2 was 0.801 (95% CI: 0.771-0.831) and 0.826 (95% CI: 0.796-0.857), respectively. Both models illustrated good calibration and substantial improvement in decision curve analyses than any single predictors at all threshold probabilities. Higher predicting accuracy, better calibration, and greater clinical benefit were achieved by CPCC-RC, compared with European Randomized Study for Screening of Prostate Cancer Risk Calculator and Prostate Cancer Prevention Trial Risk Calculator in predicting PCa. CPCC-RC performed well in discrimination and calibration and decision curve analysis in external validation compared with Western risk calculators. CPCC-RC may aid in decision-making of prostate biopsy in Chinese or in other Asian populations with similar genetic and environmental backgrounds. Copyright © 2016 Elsevier Inc. All rights reserved.
Regression analysis of current-status data: an application to breast-feeding.
Grummer-strawn, L M
1993-09-01
"Although techniques for calculating mean survival time from current-status data are well known, their use in multiple regression models is somewhat troublesome. Using data on current breast-feeding behavior, this article considers a number of techniques that have been suggested in the literature, including parametric, nonparametric, and semiparametric models as well as the application of standard schedules. Models are tested in both proportional-odds and proportional-hazards frameworks....I fit [the] models to current status data on breast-feeding from the Demographic and Health Survey (DHS) in six countries: two African (Mali and Ondo State, Nigeria), two Asian (Indonesia and Sri Lanka), and two Latin American (Colombia and Peru)." excerpt
Models for predicting the mass of lime fruits by some engineering properties.
Miraei Ashtiani, Seyed-Hassan; Baradaran Motie, Jalal; Emadi, Bagher; Aghkhani, Mohammad-Hosein
2014-11-01
Grading fruits based on mass is important in packaging and reduces the waste, also increases the marketing value of agricultural produce. The aim of this study was mass modeling of two major cultivars of Iranian limes based on engineering attributes. Models were classified into three: 1-Single and multiple variable regressions of lime mass and dimensional characteristics. 2-Single and multiple variable regressions of lime mass and projected areas. 3-Single regression of lime mass based on its actual volume and calculated volume assumed as ellipsoid and prolate spheroid shapes. All properties considered in the current study were found to be statistically significant (ρ < 0.01). The results indicated that mass modeling of lime based on minor diameter and first projected area are the most appropriate models in the first and the second classifications, respectively. In third classification, the best model was obtained on the basis of the prolate spheroid volume. It was finally concluded that the suitable grading system of lime mass is based on prolate spheroid volume.
Modeling and forecasting US presidential election using learning algorithms
NASA Astrophysics Data System (ADS)
Zolghadr, Mohammad; Niaki, Seyed Armin Akhavan; Niaki, S. T. A.
2017-09-01
The primary objective of this research is to obtain an accurate forecasting model for the US presidential election. To identify a reliable model, artificial neural networks (ANN) and support vector regression (SVR) models are compared based on some specified performance measures. Moreover, six independent variables such as GDP, unemployment rate, the president's approval rate, and others are considered in a stepwise regression to identify significant variables. The president's approval rate is identified as the most significant variable, based on which eight other variables are identified and considered in the model development. Preprocessing methods are applied to prepare the data for the learning algorithms. The proposed procedure significantly increases the accuracy of the model by 50%. The learning algorithms (ANN and SVR) proved to be superior to linear regression based on each method's calculated performance measures. The SVR model is identified as the most accurate model among the other models as this model successfully predicted the outcome of the election in the last three elections (2004, 2008, and 2012). The proposed approach significantly increases the accuracy of the forecast.
Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc
2015-08-01
The third molar development (TMD) has been widely utilized as one of the radiographic method for dental age estimation. By using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated to the TMD regression model. This study aims to evaluate the performance of dental age estimation in individual method models and the combined model (TMD and TME) based on the classic regressions of multiple linear and principal component analysis. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and Olze were employed to stage the TMD and TME, respectively. The data was divided to develop three respective models based on the two regressions of multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between each model. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² yielded an increment in the linear regressions of combined model as compared to the individual models. The overall decrease in RMSE was detected in combined model as compared to TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, low value of adjusted R(2) and high RMSE except in male were exhibited in combined model. Dental age estimation is better predicted using combined model in multiple linear regression models. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
A Method for Calculating the Probability of Successfully Completing a Rocket Propulsion Ground Test
NASA Technical Reports Server (NTRS)
Messer, Bradley
2007-01-01
Propulsion ground test facilities face the daily challenge of scheduling multiple customers into limited facility space and successfully completing their propulsion test projects. Over the last decade NASA s propulsion test facilities have performed hundreds of tests, collected thousands of seconds of test data, and exceeded the capabilities of numerous test facility and test article components. A logistic regression mathematical modeling technique has been developed to predict the probability of successfully completing a rocket propulsion test. A logistic regression model is a mathematical modeling approach that can be used to describe the relationship of several independent predictor variables X(sub 1), X(sub 2),.., X(sub k) to a binary or dichotomous dependent variable Y, where Y can only be one of two possible outcomes, in this case Success or Failure of accomplishing a full duration test. The use of logistic regression modeling is not new; however, modeling propulsion ground test facilities using logistic regression is both a new and unique application of the statistical technique. Results from this type of model provide project managers with insight and confidence into the effectiveness of rocket propulsion ground testing.
How is the weather? Forecasting inpatient glycemic control
Saulnier, George E; Castro, Janna C; Cook, Curtiss B; Thompson, Bithika M
2017-01-01
Aim: Apply methods of damped trend analysis to forecast inpatient glycemic control. Method: Observed and calculated point-of-care blood glucose data trends were determined over 62 weeks. Mean absolute percent error was used to calculate differences between observed and forecasted values. Comparisons were drawn between model results and linear regression forecasting. Results: The forecasted mean glucose trends observed during the first 24 and 48 weeks of projections compared favorably to the results provided by linear regression forecasting. However, in some scenarios, the damped trend method changed inferences compared with linear regression. In all scenarios, mean absolute percent error values remained below the 10% accepted by demand industries. Conclusion: Results indicate that forecasting methods historically applied within demand industries can project future inpatient glycemic control. Additional study is needed to determine if forecasting is useful in the analyses of other glucometric parameters and, if so, how to apply the techniques to quality improvement. PMID:29134125
Doran, Kara S.; Howd, Peter A.; Sallenger,, Asbury H.
2016-01-04
Recent studies, and most of their predecessors, use tide gage data to quantify SL acceleration, ASL(t). In the current study, three techniques were used to calculate acceleration from tide gage data, and of those examined, it was determined that the two techniques based on sliding a regression window through the time series are more robust compared to the technique that fits a single quadratic form to the entire time series, particularly if there is temporal variation in the magnitude of the acceleration. The single-fit quadratic regression method has been the most commonly used technique in determining acceleration in tide gage data. The inability of the single-fit method to account for time-varying acceleration may explain some of the inconsistent findings between investigators. Properly quantifying ASL(t) from field measurements is of particular importance in evaluating numerical models of past, present, and future SLR resulting from anticipated climate change.
Healthy life expectancy in Hong Kong Special Administrative Region of China.
Law, C. K.; Yip, P. S. F.
2003-01-01
Sullivan's method and a regression model were used to calculate healthy life expectancy (HALE) for men and women in Hong Kong Special Administrative Region (Hong Kong SAR) of China. These methods need estimates of the prevalence and information on disability distributions of 109 diseases and HALE for 191 countries by age, sex and region of the world from the WHO's health assessment of 2000. The population of Hong Kong SAR has one of the highest healthy life expectancies in the world. Sullivan's method gives higher estimates than the classic linear regression method. Although Sullivan's method accurately calculates the influence of disease prevalence within small areas and regions, the regression method can approximate HALE for all economies for which information on life expectancy is available. This paper identifies some problems of the two methods and discusses the accuracy of estimates of HALE that rely on data from the WHO assessment. PMID:12640475
Nonlinear Simulation of the Tooth Enamel Spectrum for EPR Dosimetry
NASA Astrophysics Data System (ADS)
Kirillov, V. A.; Dubovsky, S. V.
2016-07-01
Software was developed where initial EPR spectra of tooth enamel were deconvoluted based on nonlinear simulation, line shapes and signal amplitudes in the model initial spectrum were calculated, the regression coefficient was evaluated, and individual spectra were summed. Software validation demonstrated that doses calculated using it agreed excellently with the applied radiation doses and the doses reconstructed by the method of additive doses.
NASA Astrophysics Data System (ADS)
Plegnière, Sabrina; Casper, Markus; Hecker, Benjamin; Müller-Fürstenberger, Georg
2014-05-01
The basis of many models to calculate and assess climate change and its consequences are annual means of temperature and precipitation. This method leads to many uncertainties especially at the regional or local level: the results are not realistic or too coarse. Particularly in agriculture, single events and the distribution of precipitation and temperature during the growing season have enormous influences on plant growth. Therefore, the temporal distribution of climate variables should not be ignored. To reach this goal, a high-resolution ecological-economic model was developed which combines a complex plant growth model (STICS) and an economic model. In this context, input data of the plant growth model are daily climate values for a specific climate station calculated by the statistical climate model (WETTREG). The economic model is deduced from the results of the plant growth model STICS. The chosen plant is corn because corn is often cultivated and used in many different ways. First of all, a sensitivity analysis showed that the plant growth model STICS is suitable to calculate the influences of different cultivation methods and climate on plant growth or yield as well as on soil fertility, e.g. by nitrate leaching, in a realistic way. Additional simulations helped to assess a production function that is the key element of the economic model. Thereby the problems when using mean values of temperature and precipitation in order to compute a production function by linear regression are pointed out. Several examples show why a linear regression to assess a production function based on mean climate values or smoothed natural distribution leads to imperfect results and why it is not possible to deduce a unique climate factor in the production function. One solution for this problem is the additional consideration of stress indices that show the impairment of plants by water or nitrate shortage. Thus, the resulting model takes into account not only the ecological factors (e.g. the plant growth) or the economical factors as a simple monetary calculation, but also their mutual influences. Finally, the ecological-economic model enables us to make a risk assessment or evaluate adaptation strategies.
Zhang, Hong-guang; Lu, Jian-gang
2016-02-01
Abstract To overcome the problems of significant difference among samples and nonlinearity between the property and spectra of samples in spectral quantitative analysis, a local regression algorithm is proposed in this paper. In this algorithm, net signal analysis method(NAS) was firstly used to obtain the net analyte signal of the calibration samples and unknown samples, then the Euclidean distance between net analyte signal of the sample and net analyte signal of calibration samples was calculated and utilized as similarity index. According to the defined similarity index, the local calibration sets were individually selected for each unknown sample. Finally, a local PLS regression model was built on each local calibration sets for each unknown sample. The proposed method was applied to a set of near infrared spectra of meat samples. The results demonstrate that the prediction precision and model complexity of the proposed method are superior to global PLS regression method and conventional local regression algorithm based on spectral Euclidean distance.
NASA Astrophysics Data System (ADS)
Pascuet, M. I.; Castin, N.; Becquart, C. S.; Malerba, L.
2011-05-01
An atomistic kinetic Monte Carlo (AKMC) method has been applied to study the stability and mobility of copper-vacancy clusters in Fe. This information, which cannot be obtained directly from experimental measurements, is needed to parameterise models describing the nanostructure evolution under irradiation of Fe alloys (e.g. model alloys for reactor pressure vessel steels). The physical reliability of the AKMC method has been improved by employing artificial intelligence techniques for the regression of the activation energies required by the model as input. These energies are calculated allowing for the effects of local chemistry and relaxation, using an interatomic potential fitted to reproduce them as accurately as possible and the nudged-elastic-band method. The model validation was based on comparison with available ab initio calculations for verification of the used cohesive model, as well as with other models and theories.
Predicting the demand of physician workforce: an international model based on "crowd behaviors".
Tsai, Tsuen-Chiuan; Eliasziw, Misha; Chen, Der-Fang
2012-03-26
Appropriateness of physician workforce greatly influences the quality of healthcare. When facing the crisis of physician shortages, the correction of manpower always takes an extended time period, and both the public and health personnel suffer. To calculate an appropriate number of Physician Density (PD) for a specific country, this study was designed to create a PD prediction model, based on health-related data from many countries. Twelve factors that could possibly impact physicians' demand were chosen, and data of these factors from 130 countries (by reviewing 195) were extracted. Multiple stepwise-linear regression was used to derive the PD prediction model, and a split-sample cross-validation procedure was performed to evaluate the generalizability of the results. Using data from 130 countries, with the consideration of the correlation between variables, and preventing multi-collinearity, seven out of the 12 predictor variables were selected for entry into the stepwise regression procedure. The final model was: PD = (5.014 - 0.128 × proportion under age 15 years + 0.034 × life expectancy)2, with R2 of 80.4%. Using the prediction equation, 70 countries had PDs with "negative discrepancy", while 58 had PDs with "positive discrepancy". This study provided a regression-based PD model to calculate a "norm" number of PD for a specific country. A large PD discrepancy in a country indicates the needs to examine physician's workloads and their well-being, the effectiveness/efficiency of medical care, the promotion of population health and the team resource management.
Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo
2015-05-12
To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.
Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S
2017-06-01
The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LET d ) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LET d based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum- and linear LET d based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.
Benchmark dose analysis via nonparametric regression modeling
Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen
2013-01-01
Estimation of benchmark doses (BMDs) in quantitative risk assessment traditionally is based upon parametric dose-response modeling. It is a well-known concern, however, that if the chosen parametric model is uncertain and/or misspecified, inaccurate and possibly unsafe low-dose inferences can result. We describe a nonparametric approach for estimating BMDs with quantal-response data based on an isotonic regression method, and also study use of corresponding, nonparametric, bootstrap-based confidence limits for the BMD. We explore the confidence limits’ small-sample properties via a simulation study, and illustrate the calculations with an example from cancer risk assessment. It is seen that this nonparametric approach can provide a useful alternative for BMD estimation when faced with the problem of parametric model uncertainty. PMID:23683057
Classical Testing in Functional Linear Models.
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.
Classical Testing in Functional Linear Models
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
Forecasting municipal solid waste generation using prognostic tools and regression analysis.
Ghinea, Cristina; Drăgoi, Elena Niculina; Comăniţă, Elena-Diana; Gavrilescu, Marius; Câmpean, Teofil; Curteanu, Silvia; Gavrilescu, Maria
2016-11-01
For an adequate planning of waste management systems the accurate forecast of waste generation is an essential step, since various factors can affect waste trends. The application of predictive and prognosis models are useful tools, as reliable support for decision making processes. In this paper some indicators such as: number of residents, population age, urban life expectancy, total municipal solid waste were used as input variables in prognostic models in order to predict the amount of solid waste fractions. We applied Waste Prognostic Tool, regression analysis and time series analysis to forecast municipal solid waste generation and composition by considering the Iasi Romania case study. Regression equations were determined for six solid waste fractions (paper, plastic, metal, glass, biodegradable and other waste). Accuracy Measures were calculated and the results showed that S-curve trend model is the most suitable for municipal solid waste (MSW) prediction. Copyright © 2016 Elsevier Ltd. All rights reserved.
Forecast model applications of retrieved three dimensional liquid water fields
NASA Technical Reports Server (NTRS)
Raymond, William H.; Olson, William S.
1990-01-01
Forecasts are made for tropical storm Emily using heating rates derived from the SSM/I physical retrievals described in chapters 2 and 3. Average values of the latent heating rates from the convective and stratiform cloud simulations, used in the physical retrieval, are obtained for individual 1.1 km thick vertical layers. Then, the layer-mean latent heating rates are regressed against the slant path-integrated liquid and ice precipitation water contents to determine the best fit two parameter regression coefficients for each layer. The regression formulae and retrieved precipitation water contents are utilized to infer the vertical distribution of heating rates for forecast model applications. In the forecast model, diabatic temperature contributions are calculated and used in a diabatic initialization, or in a diabatic initialization combined with a diabatic forcing procedure. Our forecasts show that the time needed to spin-up precipitation processes in tropical storm Emily is greatly accelerated through the application of the data.
New insights into faster computation of uncertainties
NASA Astrophysics Data System (ADS)
Bhattacharya, Atreyee
2012-11-01
Heavy computation power, lengthy simulations, and an exhaustive number of model runs—often these seem like the only statistical tools that scientists have at their disposal when computing uncertainties associated with predictions, particularly in cases of environmental processes such as groundwater movement. However, calculation of uncertainties need not be as lengthy, a new study shows. Comparing two approaches—the classical Bayesian “credible interval” and a less commonly used regression-based “confidence interval” method—Lu et al. show that for many practical purposes both methods provide similar estimates of uncertainties. The advantage of the regression method is that it demands 10-1000 model runs, whereas the classical Bayesian approach requires 10,000 to millions of model runs.
2011-01-01
Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook’s distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards. PMID:21966586
Keithley, Richard B; Wightman, R Mark
2011-06-07
Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook's distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards.
Kontic, Dean; Zenic, Natasa; Uljevic, Ognjen; Sekulic, Damir; Lesnik, Blaz
2017-06-01
Swimming capacities are hypothesized to be important determinants of water polo performance but there is an evident lack of studies examining different swimming capacities in relation to specific offensive and defensive performance variables in this sport. The aim of this study was to determine the relationship between five swimming capacities and six performance determinants in water polo. The sample comprised 79 high-level youth water polo players (all males, 17-18 years of age). The variables included six performance-related variables (agility in offence and defense, efficacy in offence and defense, polyvalence in offence and defense), and five swimming-capacity tests (water polo sprint test [15 m], swimming sprint test [25 m], short-distance [100 m], aerobic endurance [400 m] and an anaerobic lactate endurance test [4× 50 m]). First, multiple regressions were calculated for one-half of the sample of subjects which were then validated with the remaining half of the sample. The 25-m swim was not included in the regression analyses due to the multicollinearity with other predictors. The originally calculated regression models were validated for defensive agility (R=0.67 and R=0.55 for the original regression calculation and validation subsample, respectively) offensive agility (R=0.59 and R=0.61), and offensive efficacy (R=0.64 and R=0.58). Anaerobic lactate endurance is a significant predictor of offensive and defensive agility, while 15 m sprint significantly contributes to offensive efficacy. Swimming capacities are not found to be related to the polyvalence of the players. The most superior offensive performance can be expected from those players with a high level of anaerobic lactate endurance and advanced sprinting capacity, while anaerobic lactate endurance is recognized as most important quality in defensive duties. Future studies should observe players' polyvalence in relation to (theoretical) knowledge of technical and tactical tasks. Results reinforce the need for the cross-validation of the prediction-models in sport and exercise sciences.
Regression Models of Quarterly Overhead Costs for Six Government Aerospace Contractors.
1986-03-01
34 Testing ,, for Serial Correlation After Least Squares %Regression, Econometrica, Vol. 36, No. 1, pp. 133-150, January 1968. Intrili8ator M.D., Econometric ...to be superior. These two estimators are both two-stage estimators that are calculated utilizing Wallis’s test statistic for fourth-order...utilizing Wallis’s test statistic for fourth-order autocorrelation. NTIS C F’,& D tI1C T - .1 I -. . . ..- rJ ,. *p J • - DA 3
Effects of Atmospheric and Surface Dust on the Sublimation Rates of CO2 on Mars
NASA Technical Reports Server (NTRS)
Bonev, B. P.; James, P. B.; Bjorkman, J. E.; Hansen, G. B.; Wolff, M. J.
2003-01-01
We present an overview of our modeling work dedicated to study the effects of atmospheric dust on the sublimation of CO2 on Mars. The purpose of this study is to better understand the extent to which dust storm activity can be a root cause for interannual variability in the planetary CO2 seasonal cycle, through modifying the springtime regression rates of the south polar cap. We obtain calculations of the sublimation fluxes for various types of polar surfaces and different amounts of atmospheric dust. These calculations have been compared qualitatively with the regression patterns observed by Mars Global Surveyor (MGS) in both visible and infrared wavelengths, for two years of very different dust histories (1999, and 2001).
Genkawa, Takuma; Shinzawa, Hideyuki; Kato, Hideaki; Ishikawa, Daitaro; Murayama, Kodai; Komiyama, Makoto; Ozaki, Yukihiro
2015-12-01
An alternative baseline correction method for diffuse reflection near-infrared (NIR) spectra, searching region standard normal variate (SRSNV), was proposed. Standard normal variate (SNV) is an effective pretreatment method for baseline correction of diffuse reflection NIR spectra of powder and granular samples; however, its baseline correction performance depends on the NIR region used for SNV calculation. To search for an optimal NIR region for baseline correction using SNV, SRSNV employs moving window partial least squares regression (MWPLSR), and an optimal NIR region is identified based on the root mean square error (RMSE) of cross-validation of the partial least squares regression (PLSR) models with the first latent variable (LV). The performance of SRSNV was evaluated using diffuse reflection NIR spectra of mixture samples consisting of wheat flour and granular glucose (0-100% glucose at 5% intervals). From the obtained NIR spectra of the mixture in the 10 000-4000 cm(-1) region at 4 cm intervals (1501 spectral channels), a series of spectral windows consisting of 80 spectral channels was constructed, and then SNV spectra were calculated for each spectral window. Using these SNV spectra, a series of PLSR models with the first LV for glucose concentration was built. A plot of RMSE versus the spectral window position obtained using the PLSR models revealed that the 8680–8364 cm(-1) region was optimal for baseline correction using SNV. In the SNV spectra calculated using the 8680–8364 cm(-1) region (SRSNV spectra), a remarkable relative intensity change between a band due to wheat flour at 8500 cm(-1) and that due to glucose at 8364 cm(-1) was observed owing to successful baseline correction using SNV. A PLSR model with the first LV based on the SRSNV spectra yielded a determination coefficient (R2) of 0.999 and an RMSE of 0.70%, while a PLSR model with three LVs based on SNV spectra calculated in the full spectral region gave an R2 of 0.995 and an RMSE of 2.29%. Additional evaluation of SRSNV was carried out using diffuse reflection NIR spectra of marzipan and corn samples, and PLSR models based on SRSNV spectra showed good prediction results. These evaluation results indicate that SRSNV is effective in baseline correction of diffuse reflection NIR spectra and provides regression models with good prediction accuracy.
Linear Modeling and Evaluation of Controls on Flow Response in Western Post-Fire Watersheds
NASA Astrophysics Data System (ADS)
Saxe, S.; Hogue, T. S.; Hay, L.
2015-12-01
This research investigates the impact of wildfires on watershed flow regimes throughout the western United States, specifically focusing on evaluation of fire events within specified subregions and determination of the impact of climate and geophysical variables in post-fire flow response. Fire events were collected through federal and state-level databases and streamflow data were collected from U.S. Geological Survey stream gages. 263 watersheds were identified with at least 10 years of continuous pre-fire daily streamflow records and 5 years of continuous post-fire daily flow records. For each watershed, percent changes in runoff ratio (RO), annual seven day low-flows (7Q2) and annual seven day high-flows (7Q10) were calculated from pre- to post-fire. Numerous independent variables were identified for each watershed and fire event, including topographic, land cover, climate, burn severity, and soils data. The national watersheds were divided into five regions through K-clustering and a lasso linear regression model, applying the Leave-One-Out calibration method, was calculated for each region. Nash-Sutcliffe Efficiency (NSE) was used to determine the accuracy of the resulting models. The regions encompassing the United States along and west of the Rocky Mountains, excluding the coastal watersheds, produced the most accurate linear models. The Pacific coast region models produced poor and inconsistent results, indicating that the regions need to be further subdivided. Presently, RO and HF response variables appear to be more easily modeled than LF. Results of linear regression modeling showed varying importance of watershed and fire event variables, with conflicting correlation between land cover types and soil types by region. The addition of further independent variables and constriction of current variables based on correlation indicators is ongoing and should allow for more accurate linear regression modeling.
NASA Astrophysics Data System (ADS)
Li, Wang; Niu, Zheng; Gao, Shuai; Wang, Cheng
2014-11-01
Light Detection and Ranging (LiDAR) and Synthetic Aperture Radar (SAR) are two competitive active remote sensing techniques in forest above ground biomass estimation, which is important for forest management and global climate change study. This study aims to further explore their capabilities in temperate forest above ground biomass (AGB) estimation by emphasizing the spatial auto-correlation of variables obtained from these two remote sensing tools, which is a usually overlooked aspect in remote sensing applications to vegetation studies. Remote sensing variables including airborne LiDAR metrics, backscattering coefficient for different SAR polarizations and their ratio variables for Radarsat-2 imagery were calculated. First, simple linear regression models (SLR) was established between the field-estimated above ground biomass and the remote sensing variables. Pearson's correlation coefficient (R2) was used to find which LiDAR metric showed the most significant correlation with the regression residuals and could be selected as co-variable in regression co-kriging (RCoKrig). Second, regression co-kriging was conducted by choosing the regression residuals as dependent variable and the LiDAR metric (Hmean) with highest R2 as co-variable. Third, above ground biomass over the study area was estimated using SLR model and RCoKrig model, respectively. The results for these two models were validated using the same ground points. Results showed that both of these two methods achieved satisfactory prediction accuracy, while regression co-kriging showed the lower estimation error. It is proved that regression co-kriging model is feasible and effective in mapping the spatial pattern of AGB in the temperate forest using Radarsat-2 data calibrated by airborne LiDAR metrics.
A primer on marginal effects-part II: health services research applications.
Onukwugha, E; Bergtold, J; Jain, R
2015-02-01
Marginal analysis evaluates changes in a regression function associated with a unit change in a relevant variable. The primary statistic of marginal analysis is the marginal effect (ME). The ME facilitates the examination of outcomes for defined patient profiles or individuals while measuring the change in original units (e.g., costs, probabilities). The ME has a long history in economics; however, it is not widely used in health services research despite its flexibility and ability to provide unique insights. This article, the second in a two-part series, discusses practical issues that arise in the estimation and interpretation of the ME for a variety of regression models often used in health services research. Part one provided an overview of prior studies discussing ME followed by derivation of ME formulas for various regression models relevant for health services research studies examining costs and utilization. The current article illustrates the calculation and interpretation of ME in practice and discusses practical issues that arise during the implementation, including: understanding differences between software packages in terms of functionality available for calculating the ME and its confidence interval, interpretation of average marginal effect versus marginal effect at the mean, and the difference between ME and relative effects (e.g., odds ratio). Programming code to calculate ME using SAS, STATA, LIMDEP, and MATLAB are also provided. The illustration, discussion, and application of ME in this two-part series support the conduct of future studies applying the concept of marginal analysis.
Caruso, Rosario; Scordino, Monica; Traulo, Pasqualino; Gagliano, Giacomo
2012-01-01
A capillary GC-flame ionization detection (FID) method to determine volatile compounds (ethyl acetate, 1,1-diethoxyethane, methyl alcohol, 1-propanol, 2-methyl-1-propanol, 2-methyl-1-butanol, 3-methyl-1-butanol, 1-butanol, and 2-butanol) in wine was investigated in terms of calculation of detection limits and calibration method. The main objectives were: (1) calculation of regression coefficient parameters by ordinary least-squares (OLS) and bivariate least-squares (BLS) regression models, taking into account errors in both axes; (2) estimation of linear dynamic range (LDR) according to International Conference on Harmonization recommendations; (3) performance evaluation of a method by using three different internal standards (ISs) such as acetonitrile, acetone, and 1-pentanol; (4) evaluation of LODs according to the U.S. Environmental Protection Agency (EPA) 3sigma approach and the Hubaux-Vos (H-V) method; (5) application of H-V theory to a gas chromatographic analytical method and to a food matrix; and (6) accuracy assessment of the method relative to methyl alcohol content through a Unione Italiana Vini (UIV) interlaboratory proficiency test. Calibration curves calculated via BLS and OLS show similar slopes, while intercepts are closer to zero in the first case, independent of the chosen IS. The studied ISs show a substantially equivalent behavior, even though the IS closer to the analyte retention time seems to be more appropriate in terms of LDR and LOD. Results indicate an underestimation of LODs using the EPA 3sigma approach instead of the more realistic H-V method, both with OLS and BLS regression models. Methanol contents compared with UIV average values indicate recovery between 90 and 110%.
Raines, G.L.; Mihalasky, M.J.
2002-01-01
The U.S. Geological Survey (USGS) is proposing to conduct a global mineral-resource assessment using geologic maps, significant deposits, and exploration history as minimal data requirements. Using a geologic map and locations of significant pluton-related deposits, the pluton-related-deposit tract maps from the USGS national mineral-resource assessment have been reproduced with GIS-based analysis and modeling techniques. Agreement, kappa, and Jaccard's C correlation statistics between the expert USGS and calculated tract maps of 87%, 40%, and 28%, respectively, have been achieved using a combination of weights-of-evidence and weighted logistic regression methods. Between the experts' and calculated maps, the ranking of states measured by total permissive area correlates at 84%. The disagreement between the experts and calculated results can be explained primarily by tracts defined by geophysical evidence not considered in the calculations, generalization of tracts by the experts, differences in map scales, and the experts' inclusion of large tracts that are arguably not permissive. This analysis shows that tracts for regional mineral-resource assessment approximating those delineated by USGS experts can be calculated using weights of evidence and weighted logistic regression, a geologic map, and the location of significant deposits. Weights of evidence and weighted logistic regression applied to a global geologic map could provide quickly a useful reconnaissance definition of tracts for mineral assessment that is tied to the data and is reproducible. ?? 2002 International Association for Mathematical Geology.
Prediction of cavity growth by solution of salt around boreholes. (Report No. IITRI-C--6313-14)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Snow, R.H. Chang, D.S.
1975-06-30
A mathematical model is developed to simulate the process of salt dissolution in a salt formation. The calibration of this model using Detroit Mine data is done systematically by the method of nonlinear regression. The brine concentrations calculated from the regression fit the measured data from Detroit Mine experiment within 10 percent. Because the Detroit data includes periods when the inlet flow is shut off, the agreement with Detroit data indicates that the model adequately represents natural convection effects to predict the cavity growth at very slow feed rates. The prediction has been done to calculate the cavity growth atmore » feed rate of one gal/h and one gal/day over a period of 10,000 y. Result shows that the cavity growth is a wide-flaring type and that the significant growth of the cavity only occurs at top layer. The prediction involves a very great extrapolation of time from the Detroit data, but it will be valid if the mechanism of solution does not change.« less
NASA Astrophysics Data System (ADS)
Lolli, Simone; Campbell, James R.; Lewis, Jasper R.; Gu, Yu; Welton, Ellsworth J.
2017-06-01
We compare, for the first time, the performance of a simplified atmospheric radiative transfer algorithm package, the Corti-Peter (CP) model, versus the more complex Fu-Liou-Gu (FLG) model, for resolving top-of-the-atmosphere radiative forcing characteristics from single-layer cirrus clouds obtained from the NASA Micro-Pulse Lidar Network database in 2010 and 2011 at Singapore and in Greenbelt, Maryland, USA, in 2012. Specifically, CP simplifies calculation of both clear-sky longwave and shortwave radiation through regression analysis applied to radiative calculations, which contributes significantly to differences between the two. The results of the intercomparison show that differences in annual net top-of-the-atmosphere (TOA) cloud radiative forcing can reach 65 %. This is particularly true when land surface temperatures are warmer than 288 K, where the CP regression analysis becomes less accurate. CP proves useful for first-order estimates of TOA cirrus cloud forcing, but may not be suitable for quantitative accuracy, including the absolute sign of cirrus cloud daytime TOA forcing that can readily oscillate around zero globally.
Evaluating and Improving the SAMA (Segmentation Analysis and Market Assessment) Recruiting Model
2015-06-01
and rewarding me with your love every day. xx THIS PAGE INTENTIONALLY LEFT BLANK 1 I. INTRODUCTION A. THE UNITED STATES ARMY RECRUITING...the relationship between the calculated SAMA potential and the actual 2014 performance. The scatterplot in Figure 8 shows a strong linear... relationship between the SAMA calculated potential and the contracting achievement for 2014, with an R-squared value of 0.871. Simple Linear Regression of
Can shoulder dystocia be reliably predicted?
Dodd, Jodie M; Catcheside, Britt; Scheil, Wendy
2012-06-01
To evaluate factors reported to increase the risk of shoulder dystocia, and to evaluate their predictive value at a population level. The South Australian Pregnancy Outcome Unit's population database from 2005 to 2010 was accessed to determine the occurrence of shoulder dystocia in addition to reported risk factors, including age, parity, self-reported ethnicity, presence of diabetes and infant birth weight. Odds ratios (and 95% confidence interval) of shoulder dystocia was calculated for each risk factor, which were then incorporated into a logistic regression model. Test characteristics for each variable in predicting shoulder dystocia were calculated. As a proportion of all births, the reported rate of shoulder dystocia increased significantly from 0.95% in 2005 to 1.38% in 2010 (P = 0.0002). Using a logistic regression model, induction of labour and infant birth weight greater than both 4000 and 4500 g were identified as significant independent predictors of shoulder dystocia. The value of risk factors alone and when incorporated into the logistic regression model was poorly predictive of the occurrence of shoulder dystocia. While there are a number of factors associated with an increased risk of shoulder dystocia, none are of sufficient sensitivity or positive predictive value to allow their use clinically to reliably and accurately identify the occurrence of shoulder dystocia. © 2012 The Authors ANZJOG © 2012 The Royal Australian and New Zealand College of Obstetricians and Gynaecologists.
Regression model for estimating inactivation of microbial aerosols by solar radiation.
Ben-David, Avishai; Sagripanti, Jose-Luis
2013-01-01
The inactivation of pathogenic aerosols by solar radiation is relevant to public health and biodefense. We investigated whether a relatively simple method to calculate solar diffuse and total irradiances could be developed and used in environmental photobiology estimations instead of complex atmospheric radiative transfer computer programs. The second-order regression model that we developed reproduced 13 radiation quantities calculated for equinoxes and solstices at 35(°) latitude with a computer-intensive and rather complex atmospheric radiative transfer program (MODTRAN) with a mean error <6% (2% for most radiation quantities). Extending the application of the regression model from a reference latitude and date (chosen as 35° latitude for 21 March) to different latitudes and days of the year was accomplished with variable success: usually with a mean error <15% (but as high as 150% for some combination of latitudes and days of year). This accuracy of the methodology proposed here compares favorably to photobiological experiments where the microbial survival is usually measured with an accuracy no better than ±0.5 log10 units. The approach and equations presented in this study should assist in estimating the maximum time during which microbial pathogens remain infectious after accidental or intentional aerosolization in open environments. © Published 2013. This article is a U.S. Government work and is in the public domain in the USA. Photochemistry and Photobiology © 2013 The American Society of Photobiology.
Aulenbach, Brent T.
2013-01-01
A regression-model based approach is a commonly used, efficient method for estimating streamwater constituent load when there is a relationship between streamwater constituent concentration and continuous variables such as streamwater discharge, season and time. A subsetting experiment using a 30-year dataset of daily suspended sediment observations from the Mississippi River at Thebes, Illinois, was performed to determine optimal sampling frequency, model calibration period length, and regression model methodology, as well as to determine the effect of serial correlation of model residuals on load estimate precision. Two regression-based methods were used to estimate streamwater loads, the Adjusted Maximum Likelihood Estimator (AMLE), and the composite method, a hybrid load estimation approach. While both methods accurately and precisely estimated loads at the model’s calibration period time scale, precisions were progressively worse at shorter reporting periods, from annually to monthly. Serial correlation in model residuals resulted in observed AMLE precision to be significantly worse than the model calculated standard errors of prediction. The composite method effectively improved upon AMLE loads for shorter reporting periods, but required a sampling interval of at least 15-days or shorter, when the serial correlations in the observed load residuals were greater than 0.15. AMLE precision was better at shorter sampling intervals and when using the shortest model calibration periods, such that the regression models better fit the temporal changes in the concentration–discharge relationship. The models with the largest errors typically had poor high flow sampling coverage resulting in unrepresentative models. Increasing sampling frequency and/or targeted high flow sampling are more efficient approaches to ensure sufficient sampling and to avoid poorly performing models, than increasing calibration period length.
Validation of Metrics as Error Predictors
NASA Astrophysics Data System (ADS)
Mendling, Jan
In this chapter, we test the validity of metrics that were defined in the previous chapter for predicting errors in EPC business process models. In Section 5.1, we provide an overview of how the analysis data is generated. Section 5.2 describes the sample of EPCs from practice that we use for the analysis. Here we discuss a disaggregation by the EPC model group and by error as well as a correlation analysis between metrics and error. Based on this sample, we calculate a logistic regression model for predicting error probability with the metrics as input variables in Section 5.3. In Section 5.4, we then test the regression function for an independent sample of EPC models from textbooks as a cross-validation. Section 5.5 summarizes the findings.
Shillcutt, Samuel D; LeFevre, Amnesty E; Fischer-Walker, Christa L; Taneja, Sunita; Black, Robert E; Mazumder, Sarmila
2017-01-01
This study evaluates the cost-effectiveness of the DAZT program for scaling up treatment of acute child diarrhea in Gujarat India using a net-benefit regression framework. Costs were calculated from societal and caregivers' perspectives and effectiveness was assessed in terms of coverage of zinc and both zinc and Oral Rehydration Salt. Regression models were tested in simple linear regression, with a specified set of covariates, and with a specified set of covariates and interaction terms using linear regression with endogenous treatment effects was used as the reference case. The DAZT program was cost-effective with over 95% certainty above $5.50 and $7.50 per appropriately treated child in the unadjusted and adjusted models respectively, with specifications including interaction terms being cost-effective with 85-97% certainty. Findings from this study should be combined with other evidence when considering decisions to scale up programs such as the DAZT program to promote the use of ORS and zinc to treat child diarrhea.
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
a Comparison Between Two Ols-Based Approaches to Estimating Urban Multifractal Parameters
NASA Astrophysics Data System (ADS)
Huang, Lin-Shan; Chen, Yan-Guang
Multifractal theory provides a new spatial analytical tool for urban studies, but many basic problems remain to be solved. Among various pending issues, the most significant one is how to obtain proper multifractal dimension spectrums. If an algorithm is improperly used, the parameter spectrums will be abnormal. This paper is devoted to investigating two ordinary least squares (OLS)-based approaches for estimating urban multifractal parameters. Using empirical study and comparative analysis, we demonstrate how to utilize the adequate linear regression to calculate multifractal parameters. The OLS regression analysis has two different approaches. One is that the intercept is fixed to zero, and the other is that the intercept is not limited. The results of comparative study show that the zero-intercept regression yields proper multifractal parameter spectrums within certain scale range of moment order, while the common regression method often leads to abnormal multifractal parameter values. A conclusion can be reached that fixing the intercept to zero is a more advisable regression method for multifractal parameters estimation, and the shapes of spectral curves and value ranges of fractal parameters can be employed to diagnose urban problems. This research is helpful for scientists to understand multifractal models and apply a more reasonable technique to multifractal parameter calculations.
NASA Astrophysics Data System (ADS)
Yilmaz, Işık
2009-06-01
The purpose of this study is to compare the landslide susceptibility mapping methods of frequency ratio (FR), logistic regression and artificial neural networks (ANN) applied in the Kat County (Tokat—Turkey). Digital elevation model (DEM) was first constructed using GIS software. Landslide-related factors such as geology, faults, drainage system, topographical elevation, slope angle, slope aspect, topographic wetness index (TWI) and stream power index (SPI) were used in the landslide susceptibility analyses. Landslide susceptibility maps were produced from the frequency ratio, logistic regression and neural networks models, and they were then compared by means of their validations. The higher accuracies of the susceptibility maps for all three models were obtained from the comparison of the landslide susceptibility maps with the known landslide locations. However, respective area under curve (AUC) values of 0.826, 0.842 and 0.852 for frequency ratio, logistic regression and artificial neural networks showed that the map obtained from ANN model is more accurate than the other models, accuracies of all models can be evaluated relatively similar. The results obtained in this study also showed that the frequency ratio model can be used as a simple tool in assessment of landslide susceptibility when a sufficient number of data were obtained. Input process, calculations and output process are very simple and can be readily understood in the frequency ratio model, however logistic regression and neural networks require the conversion of data to ASCII or other formats. Moreover, it is also very hard to process the large amount of data in the statistical package.
A statistical method for predicting seizure onset zones from human single-neuron recordings
NASA Astrophysics Data System (ADS)
Valdez, André B.; Hickman, Erin N.; Treiman, David M.; Smith, Kris A.; Steinmetz, Peter N.
2013-02-01
Objective. Clinicians often use depth-electrode recordings to localize human epileptogenic foci. To advance the diagnostic value of these recordings, we applied logistic regression models to single-neuron recordings from depth-electrode microwires to predict seizure onset zones (SOZs). Approach. We collected data from 17 epilepsy patients at the Barrow Neurological Institute and developed logistic regression models to calculate the odds of observing SOZs in the hippocampus, amygdala and ventromedial prefrontal cortex, based on statistics such as the burst interspike interval (ISI). Main results. Analysis of these models showed that, for a single-unit increase in burst ISI ratio, the left hippocampus was approximately 12 times more likely to contain a SOZ; and the right amygdala, 14.5 times more likely. Our models were most accurate for the hippocampus bilaterally (at 85% average sensitivity), and performance was comparable with current diagnostics such as electroencephalography. Significance. Logistic regression models can be combined with single-neuron recording to predict likely SOZs in epilepsy patients being evaluated for resective surgery, providing an automated source of clinically useful information.
Two-dimensional advective transport in ground-water flow parameter estimation
Anderman, E.R.; Hill, M.C.; Poeter, E.P.
1996-01-01
Nonlinear regression is useful in ground-water flow parameter estimation, but problems of parameter insensitivity and correlation often exist given commonly available hydraulic-head and head-dependent flow (for example, stream and lake gain or loss) observations. To address this problem, advective-transport observations are added to the ground-water flow, parameter-estimation model MODFLOWP using particle-tracking methods. The resulting model is used to investigate the importance of advective-transport observations relative to head-dependent flow observations when either or both are used in conjunction with hydraulic-head observations in a simulation of the sewage-discharge plume at Otis Air Force Base, Cape Cod, Massachusetts, USA. The analysis procedure for evaluating the probable effect of new observations on the regression results consists of two steps: (1) parameter sensitivities and correlations calculated at initial parameter values are used to assess the model parameterization and expected relative contributions of different types of observations to the regression; and (2) optimal parameter values are estimated by nonlinear regression and evaluated. In the Cape Cod parameter-estimation model, advective-transport observations did not significantly increase the overall parameter sensitivity; however: (1) inclusion of advective-transport observations decreased parameter correlation enough for more unique parameter values to be estimated by the regression; (2) realistic uncertainties in advective-transport observations had a small effect on parameter estimates relative to the precision with which the parameters were estimated; and (3) the regression results and sensitivity analysis provided insight into the dynamics of the ground-water flow system, especially the importance of accurate boundary conditions. In this work, advective-transport observations improved the calibration of the model and the estimation of ground-water flow parameters, and use of regression and related techniques produced significant insight into the physical system.
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. ?? 2007 National Ground Water Association.
Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States
NASA Astrophysics Data System (ADS)
Yang, J.; Astitha, M.; Schwartz, C. S.
2017-12-01
Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
An empirical study using permutation-based resampling in meta-regression
2012-01-01
Background In meta-regression, as the number of trials in the analyses decreases, the risk of false positives or false negatives increases. This is partly due to the assumption of normality that may not hold in small samples. Creation of a distribution from the observed trials using permutation methods to calculate P values may allow for less spurious findings. Permutation has not been empirically tested in meta-regression. The objective of this study was to perform an empirical investigation to explore the differences in results for meta-analyses on a small number of trials using standard large sample approaches verses permutation-based methods for meta-regression. Methods We isolated a sample of randomized controlled clinical trials (RCTs) for interventions that have a small number of trials (herbal medicine trials). Trials were then grouped by herbal species and condition and assessed for methodological quality using the Jadad scale, and data were extracted for each outcome. Finally, we performed meta-analyses on the primary outcome of each group of trials and meta-regression for methodological quality subgroups within each meta-analysis. We used large sample methods and permutation methods in our meta-regression modeling. We then compared final models and final P values between methods. Results We collected 110 trials across 5 intervention/outcome pairings and 5 to 10 trials per covariate. When applying large sample methods and permutation-based methods in our backwards stepwise regression the covariates in the final models were identical in all cases. The P values for the covariates in the final model were larger in 78% (7/9) of the cases for permutation and identical for 22% (2/9) of the cases. Conclusions We present empirical evidence that permutation-based resampling may not change final models when using backwards stepwise regression, but may increase P values in meta-regression of multiple covariates for relatively small amount of trials. PMID:22587815
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (ndvi), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficient in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), where as Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficient in other two areas, the case of Selangor based on logistic coefficient of Cameron showed highest (90%) prediction accuracy where as the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.
Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai
2017-04-01
This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits during certain pollution scenarios (environmental impact) was calculated by a mathematical model based on the binary logistic regression; it allows the identification of those environmental parameters from a multitude of possible parameters with a significant impact on exhibitions and ranks them according to their severity effect. Air quality (NO 2 , SO 2 , O 3 and PM 2.5 ) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was used on 794 data combinations (715 to develop of the model and 79 to validate it) by a Statistical Package for Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that from six parameters taken into consideration, four of them present a significant effect upon exhibits in the following order: O 3 >PM 2.5 >NO 2 >humidity followed at a significant distance by the effects of SO 2 and temperature. The mathematical model, developed in this study, correctly predicted 95.1 % of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space. The paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive preservation of exhibits.
Evaluation of confidence intervals for a steady-state leaky aquifer model
Christensen, S.; Cooley, R.L.
1999-01-01
The fact that dependent variables of groundwater models are generally nonlinear functions of model parameters is shown to be a potentially significant factor in calculating accurate confidence intervals for both model parameters and functions of the parameters, such as the values of dependent variables calculated by the model. The Lagrangian method of Vecchia and Cooley [Vecchia, A.V. and Cooley, R.L., Water Resources Research, 1987, 23(7), 1237-1250] was used to calculate nonlinear Scheffe-type confidence intervals for the parameters and the simulated heads of a steady-state groundwater flow model covering 450 km2 of a leaky aquifer. The nonlinear confidence intervals are compared to corresponding linear intervals. As suggested by the significant nonlinearity of the regression model, linear confidence intervals are often not accurate. The commonly made assumption that widths of linear confidence intervals always underestimate the actual (nonlinear) widths was not correct. Results show that nonlinear effects can cause the nonlinear intervals to be asymmetric and either larger or smaller than the linear approximations. Prior information on transmissivities helps reduce the size of the confidence intervals, with the most notable effects occurring for the parameters on which there is prior information and for head values in parameter zones for which there is prior information on the parameters.The fact that dependent variables of groundwater models are generally nonlinear functions of model parameters is shown to be a potentially significant factor in calculating accurate confidence intervals for both model parameters and functions of the parameters, such as the values of dependent variables calculated by the model. The Lagrangian method of Vecchia and Cooley was used to calculate nonlinear Scheffe-type confidence intervals for the parameters and the simulated heads of a steady-state groundwater flow model covering 450 km2 of a leaky aquifer. The nonlinear confidence intervals are compared to corresponding linear intervals. As suggested by the significant nonlinearity of the regression model, linear confidence intervals are often not accurate. The commonly made assumption that widths of linear confidence intervals always underestimate the actual (nonlinear) widths was not correct. Results show that nonlinear effects can cause the nonlinear intervals to be asymmetric and either larger or smaller than the linear approximations. Prior information on transmissivities helps reduce the size of the confidence intervals, with the most notable effects occurring for the parameters on which there is prior information and for head values in parameter zones for which there is prior information on the parameters.
An improved model for the combustion of AP composite propellants
NASA Technical Reports Server (NTRS)
Cohen, N. S.; Strand, L. D.
1981-01-01
This paper presents several improvements to the BDP model of steady-state burning of AP composite solid propellants. The Price-Boggs-Derr model of AP monopropellant burning is incorporated to represent the AP. A separate energy equation is written for the binder to permit a different surface temperature from the AP; this includes an analysis of the sharing of primary diffusion flame energy, and correction of a BDP model inconsistency in treating the binder regression rate. A method for assembling component contributions to calculate the burning rates of multimodal propellants is also presented. Results are shown in the form of representative burning rate curves, comparisons with data, and calculated internal details of interest. Ideas for future work are discussed in an Appendix.
lazar: a modular predictive toxicology framework
Maunz, Andreas; Gütlein, Martin; Rautenberg, Micha; Vorgrimmler, David; Gebele, Denis; Helma, Christoph
2013-01-01
lazar (lazy structure–activity relationships) is a modular framework for predictive toxicology. Similar to the read across procedure in toxicological risk assessment, lazar creates local QSAR (quantitative structure–activity relationship) models for each compound to be predicted. Model developers can choose between a large variety of algorithms for descriptor calculation and selection, chemical similarity indices, and model building. This paper presents a high level description of the lazar framework and discusses the performance of example classification and regression models. PMID:23761761
Calibration Model for Apnea-Hypopnea Indices: Impact of Alternative Criteria for Hypopneas
Ho, Vu; Crainiceanu, Ciprian M.; Punjabi, Naresh M.; Redline, Susan; Gottlieb, Daniel J.
2015-01-01
Study Objective: To characterize the association among apnea-hypopnea indices (AHIs) determined using three common metrics for defining hypopnea, and to develop a model to calibrate between these AHIs. Design: Cross-sectional analysis of Sleep Heart Health Study Data. Setting: Community-based. Participants: There were 6,441 men and women age 40 y or older. Measurement and Results: Three separate AHIs have been calculated, using all apneas (defined as a decrease in airflow greater than 90% from baseline for ≥ 10 sec) plus hypopneas (defined as a decrease in airflow or chest wall or abdominal excursion greater than 30% from baseline, but not meeting apnea definitions) associated with either: (1) a 4% or greater fall in oxyhemoglobin saturation—AHI4; (2) a 3% or greater fall in oxyhemoglobin saturation—AHI3; or (3) a 3% or greater fall in oxyhemoglobin saturation or an event-related arousal—AHI3a. Median values were 5.4, 9.7, and 13.4 for AHI4, AHI3, and AHI3a, respectively (P < 0.0001). Penalized spline regression models were used to compare AHI values across the three metrics and to calculate prediction intervals. Comparison of regression models demonstrates divergence in AHI scores among the three methods at low AHI values and gradual convergence at higher levels of AHI. Conclusions: The three methods of scoring hypopneas yielded significantly different estimates of the apnea-hypopnea index (AHI), although the relative difference is reduced in severe disease. The regression models presented will enable clinicians and researchers to more appropriately compare AHI values obtained using differing metrics for hypopnea. Citation: Ho V, Crainiceanu CM, Punjabi NM, Redline S, Gottlieb DJ. Calibration model for apnea-hypopnea indices: impact of alternative criteria for hypopneas. SLEEP 2015;38(12):1887–1892. PMID:26564122
Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X
2016-09-01
The paper highlights the use of the logistic regression (LR) method in the construction of acceptable statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics was done through the influence plot which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.
Regression dilution bias: tools for correction methods and sample size calculation.
Berglund, Lars
2012-08-01
Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
Zhang, J; Feng, J-Y; Ni, Y-L; Wen, Y-J; Niu, Y; Tamba, C L; Yue, C; Song, Q; Zhang, Y-M
2017-06-01
Multilocus genome-wide association studies (GWAS) have become the state-of-the-art procedure to identify quantitative trait nucleotides (QTNs) associated with complex traits. However, implementation of multilocus model in GWAS is still difficult. In this study, we integrated least angle regression with empirical Bayes to perform multilocus GWAS under polygenic background control. We used an algorithm of model transformation that whitened the covariance matrix of the polygenic matrix K and environmental noise. Markers on one chromosome were included simultaneously in a multilocus model and least angle regression was used to select the most potentially associated single-nucleotide polymorphisms (SNPs), whereas the markers on the other chromosomes were used to calculate kinship matrix as polygenic background control. The selected SNPs in multilocus model were further detected for their association with the trait by empirical Bayes and likelihood ratio test. We herein refer to this method as the pLARmEB (polygenic-background-control-based least angle regression plus empirical Bayes). Results from simulation studies showed that pLARmEB was more powerful in QTN detection and more accurate in QTN effect estimation, had less false positive rate and required less computing time than Bayesian hierarchical generalized linear model, efficient mixed model association (EMMA) and least angle regression plus empirical Bayes. pLARmEB, multilocus random-SNP-effect mixed linear model and fast multilocus random-SNP-effect EMMA methods had almost equal power of QTN detection in simulation experiments. However, only pLARmEB identified 48 previously reported genes for 7 flowering time-related traits in Arabidopsis thaliana.
NASA Astrophysics Data System (ADS)
Saleh, D.; Domagalski, J. L.
2012-12-01
Sources and factors affecting the transport of total nitrogen are being evaluated for a study area that covers most of California and some areas in Oregon and Nevada, by using the SPARROW model (SPAtially Referenced Regression On Watershed attributes) developed by the U.S. Geological Survey. Mass loads of total nitrogen calculated for monitoring sites at stream gauging stations are regressed against land-use factors affecting nitrogen transport, including fertilizer use, recharge, atmospheric deposition, stream characteristics, and other factors to understand how total nitrogen is transported under average conditions. SPARROW models have been used successfully in other parts of the country to understand how nutrients are transported, and how management strategies can be formulated, such as with Total Maximum Daily Load (TMDL) assessments. Fertilizer use, atmospheric deposition, and climatic data were obtained for 2002, and loads for that year were calculated for monitored streams and point sources (mostly from wastewater treatment plants). The stream loads were calculated by using the adjusted maximum likelihood estimation method (AMLE). River discharge and nitrogen concentrations were de-trended in these calculations in order eliminate the effect of temporal changes on stream load. Effluent discharge information as well as total nitrogen concentrations from point sources were obtained from USEPA databases and from facility records. The model indicates that atmospheric deposition and fertilizer use account for a large percentage of the total nitrogen load in many of the larger watersheds throughout the study area. Point sources, on the other hand, are generally localized around large cities, are considered insignificant sources, and account for a small percentage of the total nitrogen loads throughout the study area.
Improved estimation of PM2.5 using Lagrangian satellite-measured aerosol optical depth
NASA Astrophysics Data System (ADS)
Olivas Saunders, Rolando
Suspended particulate matter (aerosols) with aerodynamic diameters less than 2.5 mum (PM2.5) has negative effects on human health, plays an important role in climate change and also causes the corrosion of structures by acid deposition. Accurate estimates of PM2.5 concentrations are thus relevant in air quality, epidemiology, cloud microphysics and climate forcing studies. Aerosol optical depth (AOD) retrieved by the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite instrument has been used as an empirical predictor to estimate ground-level concentrations of PM2.5 . These estimates usually have large uncertainties and errors. The main objective of this work is to assess the value of using upwind (Lagrangian) MODIS-AOD as predictors in empirical models of PM2.5. The upwind locations of the Lagrangian AOD were estimated using modeled backward air trajectories. Since the specification of an arrival elevation is somewhat arbitrary, trajectories were calculated to arrive at four different elevations at ten measurement sites within the continental United States. A systematic examination revealed trajectory model calculations to be sensitive to starting elevation. With a 500 m difference in starting elevation, the 48-hr mean horizontal separation of trajectory endpoints was 326 km. When the difference in starting elevation was doubled and tripled to 1000 m and 1500m, the mean horizontal separation of trajectory endpoints approximately doubled and tripled to 627 km and 886 km, respectively. A seasonal dependence of this sensitivity was also found: the smallest mean horizontal separation of trajectory endpoints was exhibited during the summer and the largest separations during the winter. A daily average AOD product was generated and coupled to the trajectory model in order to determine AOD values upwind of the measurement sites during the period 2003-2007. Empirical models that included in situ AOD and upwind AOD as predictors of PM2.5 were generated by multivariate linear regressions using the least squares method. The multivariate models showed improved performance over the single variable regression (PM2.5 and in situ AOD) models. The statistical significance of the improvement of the multivariate models over the single variable regression models was tested using the extra sum of squares principle. In many cases, even when the R-squared was high for the multivariate models, the improvement over the single models was not statistically significant. The R-squared of these multivariate models varied with respect to seasons, with the best performance occurring during the summer months. A set of seasonal categorical variables was included in the regressions to exploit this variability. The multivariate regression models that included these categorical seasonal variables performed better than the models that didn't account for seasonal variability. Furthermore, 71% of these regressions exhibited improvement over the single variable models that was statistically significant at a 95% confidence level.
Acquisition Challenge: The Importance of Incompressibility in Comparing Learning Curve Models
2015-10-01
parameters for all four learning mod- els used in the study . The learning rate factor, b, is the slope of the linear regression line, which in this case is...incorporated within the DoD acquisition environment. This study tested three alternative learning models (the Stanford-B model, DeJong’s learning formula...appropriate tools to calculate accurate and reliable predictions. However, conventional learning curve methodology has been in practice since the pre
Prediction of sweetness and amino acid content in soybean crops from hyperspectral imagery
NASA Astrophysics Data System (ADS)
Monteiro, Sildomar Takahashi; Minekawa, Yohei; Kosugi, Yukio; Akazawa, Tsuneya; Oda, Kunio
Hyperspectral image data provides a powerful tool for non-destructive crop analysis. This paper investigates a hyperspectral image data-processing method to predict the sweetness and amino acid content of soybean crops. Regression models based on artificial neural networks were developed in order to calculate the level of sucrose, glucose, fructose, and nitrogen concentrations, which can be related to the sweetness and amino acid content of vegetables. A performance analysis was conducted comparing regression models obtained using different preprocessing methods, namely, raw reflectance, second derivative, and principal components analysis. This method is demonstrated using high-resolution hyperspectral data of wavelengths ranging from the visible to the near infrared acquired from an experimental field of green vegetable soybeans. The best predictions were achieved using a nonlinear regression model of the second derivative transformed dataset. Glucose could be predicted with greater accuracy, followed by sucrose, fructose and nitrogen. The proposed method provides the possibility to provide relatively accurate maps predicting the chemical content of soybean crop fields.
Group Additivity Determination for Oxygenates, Oxonium Ions, and Oxygen-Containing Carbenium Ions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dellon, Lauren D.; Sung, Chun-Yi; Robichaud, David J.
Bio-oil produced from biomass fast pyrolysis often requires catalytic upgrading to remove oxygen and acidic species over zeolite catalysts. The elementary reactions in the mechanism for this process involve carbenium and oxonium ions. In order to develop a detailed kinetic model for the catalytic upgrading of biomass, rate constants are required for these elementary reactions. The parameters in the Arrhenius equation can be related to thermodynamic properties through structure-reactivity relationships, such as the Evans-Polanyi relationship. For this relationship, enthalpies of formation of each species are required, which can be reasonably estimated using group additivity. However, the literature previously lacked groupmore » additivity values for oxygenates, oxonium ions, and oxygen-containing carbenium ions. In this work, 71 group additivity values for these types of groups were regressed, 65 of which had not been reported previously and six of which were newly estimated based on regression in the context of the 65 new groups. Heats of formation based on atomization enthalpy calculations for a set of reference molecules and isodesmic reactions for a small set of larger species for which experimental data was available were used to demonstrate the accuracy of the Gaussian-4 quantum mechanical method in estimating enthalpies of formation for species involving the moieties of interest. Isodesmic reactions for a total of 195 species were constructed from the reference molecules to calculate enthalpies of formation that were used to regress the group additivity values. The results showed an average deviation of 1.95 kcal/mol between the values calculated from Gaussian-4 and isodesmic reactions versus those calculated from the group additivity values that were newly regressed. Importantly, the new groups enhance the database for group additivity values, especially those involving oxonium ions.« less
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2018-07-01
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include both, model fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted, in one the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept using fortnight repeated measures for the variables. This method split the predicted NEI in two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows; all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. The variance of REI estimated with the lactation average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation average model or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation average model may therefore reflect model fitting errors or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
[Mapping environmental vulnerability from ETM + data in the Yellow River Mouth Area].
Wang, Rui-Yan; Yu, Zhen-Wen; Xia, Yan-Ling; Wang, Xiang-Feng; Zhao, Geng-Xing; Jiang, Shu-Qian
2013-10-01
The environmental vulnerability retrieval is important to support continuing data. The spatial distribution of regional environmental vulnerability was got through remote sensing retrieval. In view of soil and vegetation, the environmental vulnerability evaluation index system was built, and the environmental vulnerability of sampling points was calculated by the AHP-fuzzy method, then the correlation between the sampling points environmental vulnerability and ETM + spectral reflectance ratio including some kinds of conversion data was analyzed to determine the sensitive spectral parameters. Based on that, models of correlation analysis, traditional regression, BP neural network and support vector regression were taken to explain the quantitative relationship between the spectral reflectance and the environmental vulnerability. With this model, the environmental vulnerability distribution was retrieved in the Yellow River Mouth Area. The results showed that the correlation between the environmental vulnerability and the spring NDVI, the September NDVI and the spring brightness was better than others, so they were selected as the sensitive spectral parameters. The model precision result showed that in addition to the support vector model, the other model reached the significant level. While all the multi-variable regression was better than all one-variable regression, and the model accuracy of BP neural network was the best. This study will serve as a reliable theoretical reference for the large spatial scale environmental vulnerability estimation based on remote sensing data.
Sumner, Anne E; Luercio, Marcella F; Frempong, Barbara A; Ricks, Madia; Sen, Sabyasachi; Kushner, Harvey; Tulloch-Reid, Marshall K
2009-02-01
The disposition index, the product of the insulin sensitivity index (S(I)) and the acute insulin response to glucose, is linked in African Americans to chromosome 11q. This link was determined with S(I) calculated with the nonlinear regression approach to the minimal model and data from the reduced-sample insulin-modified frequently-sampled intravenous glucose tolerance test (Reduced-Sample-IM-FSIGT). However, the application of the nonlinear regression approach to calculate S(I) using data from the Reduced-Sample-IM-FSIGT has been challenged as being not only inaccurate but also having a high failure rate in insulin-resistant subjects. Our goal was to determine the accuracy and failure rate of the Reduced-Sample-IM-FSIGT using the nonlinear regression approach to the minimal model. With S(I) from the Full-Sample-IM-FSIGT considered the standard and using the nonlinear regression approach to the minimal model, we compared the agreement between S(I) from the Full- and Reduced-Sample-IM-FSIGT protocols. One hundred African Americans (body mass index, 31.3 +/- 7.6 kg/m(2) [mean +/- SD]; range, 19.0-56.9 kg/m(2)) had FSIGTs. Glucose (0.3 g/kg) was given at baseline. Insulin was infused from 20 to 25 minutes (total insulin dose, 0.02 U/kg). For the Full-Sample-IM-FSIGT, S(I) was calculated based on the glucose and insulin samples taken at -1, 1, 2, 3, 4, 5, 6, 7, 8,10, 12, 14, 16, 19, 22, 23, 24, 25, 27, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150, and 180 minutes. For the Reduced-Sample-FSIGT, S(I) was calculated based on the time points that appear in bold. Agreement was determined by Spearman correlation, concordance, and the Bland-Altman method. In addition, for both protocols, the population was divided into tertiles of S(I). Insulin resistance was defined by the lowest tertile of S(I) from the Full-Sample-IM-FSIGT. The distribution of subjects across tertiles was compared by rank order and kappa statistic. We found that the rate of failure of resolution of S(I) by the Reduced-Sample-IM-FSIGT was 3% (3/100). For the remaining 97 subjects, S(I) for the Full- and Reduced-Sample-IM-FSIGTs were as follows: 3.76 +/- 2.41 L mU(-1) min(-1) (range, 0.58-14.50) and 4.29 +/- 2.89 L mU(-1) min(-1) (range, 0.52-14.42); relative error, 21% +/- 18%; Spearman r = 0.97; and concordance, 0.94 (both P < .001). After log transformation, the Bland-Altman limits of agreement were -0.29 and 0.53. The exact agreement for distribution of the population in the insulin-resistant tertile vs the insulin-sensitive tertiles was 92%, kappa of 0.82 +/- 0.06. Using the nonlinear regression approach and data from the Reduced-Sample-IM-FSIGT in subjects with a wide range of insulin sensitivity, failure to resolve S(I) occurred in only 3% of subjects. The agreement and maintenance of rank order of S(I) between protocols support the use of the nonlinear regression approach to the minimal model and the Reduced-Sample-IM-FSIGT in clinical studies.
Who Will Win?: Predicting the Presidential Election Using Linear Regression
ERIC Educational Resources Information Center
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
Combined analysis of magnetic and gravity anomalies using normalized source strength (NSS)
NASA Astrophysics Data System (ADS)
Li, L.; Wu, Y.
2017-12-01
Gravity field and magnetic field belong to potential fields which lead inherent multi-solution. Combined analysis of magnetic and gravity anomalies based on Poisson's relation is used to determinate homology gravity and magnetic anomalies and decrease the ambiguity. The traditional combined analysis uses the linear regression of the reduction to pole (RTP) magnetic anomaly to the first order vertical derivative of the gravity anomaly, and provides the quantitative or semi-quantitative interpretation by calculating the correlation coefficient, slope and intercept. In the calculation process, due to the effect of remanent magnetization, the RTP anomaly still contains the effect of oblique magnetization. In this case the homology gravity and magnetic anomalies display irrelevant results in the linear regression calculation. The normalized source strength (NSS) can be transformed from the magnetic tensor matrix, which is insensitive to the remanence. Here we present a new combined analysis using NSS. Based on the Poisson's relation, the gravity tensor matrix can be transformed into the pseudomagnetic tensor matrix of the direction of geomagnetic field magnetization under the homologous condition. The NSS of pseudomagnetic tensor matrix and original magnetic tensor matrix are calculated and linear regression analysis is carried out. The calculated correlation coefficient, slope and intercept indicate the homology level, Poisson's ratio and the distribution of remanent respectively. We test the approach using synthetic model under complex magnetization, the results show that it can still distinguish the same source under the condition of strong remanence, and establish the Poisson's ratio. Finally, this approach is applied in China. The results demonstrated that our approach is feasible.
Factors associated with parasite dominance in fishes from Brazil.
Amarante, Cristina Fernandes do; Tassinari, Wagner de Souza; Luque, Jose Luis; Pereira, Maria Julia Salim
2016-06-14
The present study used regression models to evaluate the existence of factors that may influence the numerical parasite dominance with an epidemiological approximation. A database including 3,746 fish specimens and their respective parasites were used to evaluate the relationship between parasite dominance and biotic characteristics inherent to the studied hosts and the parasite taxa. Multivariate, classical, and mixed effects linear regression models were fitted. The calculations were performed using R software (95% CI). In the fitting of the classical multiple linear regression model, freshwater and planktivorous fish species and body length, as well as the species of the taxa Trematoda, Monogenea, and Hirudinea, were associated with parasite dominance. However, the fitting of the mixed effects model showed that the body length of the host and the species of the taxa Nematoda, Trematoda, Monogenea, Hirudinea, and Crustacea were significantly associated with parasite dominance. Studies that consider specific biological aspects of the hosts and parasites should expand the knowledge regarding factors that influence the numerical dominance of fish in Brazil. The use of a mixed model shows, once again, the importance of the appropriate use of a model correlated with the characteristics of the data to obtain consistent results.
Evaluating the perennial stream using logistic regression in central Taiwan
NASA Astrophysics Data System (ADS)
Ruljigaljig, T.; Cheng, Y. S.; Lin, H. I.; Lee, C. H.; Yu, T. T.
2014-12-01
This study produces a perennial stream head potential map, based on a logistic regression method with a Geographic Information System (GIS). Perennial stream initiation locations, indicates the location of the groundwater and surface contact, were identified in the study area from field survey. The perennial stream potential map in central Taiwan was constructed using the relationship between perennial stream and their causative factors, such as Catchment area, slope gradient, aspect, elevation, groundwater recharge and precipitation. Here, the field surveys of 272 streams were determined in the study area. The areas under the curve for logistic regression methods were calculated as 0.87. The results illustrate the importance of catchment area and groundwater recharge as key factors within the model. The results obtained from the model within the GIS were then used to produce a map of perennial stream and estimate the location of perennial stream head.
A hybrid neural network model for noisy data regression.
Lee, Eric W M; Lim, Chee Peng; Yuen, Richard K K; Lo, S M
2004-04-01
A hybrid neural network model, based on the fusion of fuzzy adaptive resonance theory (FA ART) and the general regression neural network (GRNN), is proposed in this paper. Both FA and the GRNN are incremental learning systems and are very fast in network training. The proposed hybrid model, denoted as GRNNFA, is able to retain these advantages and, at the same time, to reduce the computational requirements in calculating and storing information of the kernels. A clustering version of the GRNN is designed with data compression by FA for noise removal. An adaptive gradient-based kernel width optimization algorithm has also been devised. Convergence of the gradient descent algorithm can be accelerated by the geometric incremental growth of the updating factor. A series of experiments with four benchmark datasets have been conducted to assess and compare effectiveness of GRNNFA with other approaches. The GRNNFA model is also employed in a novel application task for predicting the evacuation time of patrons at typical karaoke centers in Hong Kong in the event of fire. The results positively demonstrate the applicability of GRNNFA in noisy data regression problems.
Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali
2016-01-01
Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.
FPGA implementation of predictive degradation model for engine oil lifetime
NASA Astrophysics Data System (ADS)
Idros, M. F. M.; Razak, A. H. A.; Junid, S. A. M. Al; Suliman, S. I.; Halim, A. K.
2018-03-01
This paper presents the implementation of linear regression model for degradation prediction on Register Transfer Logic (RTL) using QuartusII. A stationary model had been identified in the degradation trend for the engine oil in a vehicle in time series method. As for RTL implementation, the degradation model is written in Verilog HDL and the data input are taken at a certain time. Clock divider had been designed to support the timing sequence of input data. At every five data, a regression analysis is adapted for slope variation determination and prediction calculation. Here, only the negative value are taken as the consideration for the prediction purposes for less number of logic gate. Least Square Method is adapted to get the best linear model based on the mean values of time series data. The coded algorithm has been implemented on FPGA for validation purposes. The result shows the prediction time to change the engine oil.
Linear and nonlinear models for predicting fish bioconcentration factors for pesticides.
Yuan, Jintao; Xie, Chun; Zhang, Ting; Sun, Jinfang; Yuan, Xuejie; Yu, Shuling; Zhang, Yingbiao; Cao, Yunyuan; Yu, Xingchen; Yang, Xuan; Yao, Wu
2016-08-01
This work is devoted to the applications of the multiple linear regression (MLR), multilayer perceptron neural network (MLP NN) and projection pursuit regression (PPR) to quantitative structure-property relationship analysis of bioconcentration factors (BCFs) of pesticides tested on Bluegill (Lepomis macrochirus). Molecular descriptors of a total of 107 pesticides were calculated with the DRAGON Software and selected by inverse enhanced replacement method. Based on the selected DRAGON descriptors, a linear model was built by MLR, nonlinear models were developed using MLP NN and PPR. The robustness of the obtained models was assessed by cross-validation and external validation using test set. Outliers were also examined and deleted to improve predictive power. Comparative results revealed that PPR achieved the most accurate predictions. This study offers useful models and information for BCF prediction, risk assessment, and pesticide formulation. Copyright © 2016 Elsevier Ltd. All rights reserved.
The Application of Censored Regression Models in Low Streamflow Analyses
NASA Astrophysics Data System (ADS)
Kroll, C.; Luz, J.
2003-12-01
Estimation of low streamflow statistics at gauged and ungauged river sites is often a daunting task. This process is further confounded by the presence of intermittent streamflows, where streamflow is sometimes reported as zero, within a region. Streamflows recorded as zero may be zero, or may be less than the measurement detection limit. Such data is often referred to as censored data. Numerous methods have been developed to characterize intermittent streamflow series. Logit regression has been proposed to develop regional models of the probability annual lowflows series (such as 7-day lowflows) are zero. In addition, Tobit regression, a method of regression that allows for censored dependent variables, has been proposed for lowflow regional regression models in regions where the lowflow statistic of interest estimated as zero at some sites in the region. While these methods have been proposed, their use in practice has been limited. Here a delete-one jackknife simulation is presented to examine the performance of Logit and Tobit models of 7-day annual minimum flows in 6 USGS water resource regions in the United States. For the Logit model, an assessment is made of whether sites are correctly classified as having at least 10% of 7-day annual lowflows equal to zero. In such a situation, the 7-day, 10-year lowflow (Q710), a commonly employed low streamflow statistic, would be reported as zero. For the Tobit model, a comparison is made between results from the Tobit model, and from performing either ordinary least squares (OLS) or principal component regression (PCR) after the zero sites are dropped from the analysis. Initial results for the Logit model indicate this method to have a high probability of correctly classifying sites into groups with Q710s as zero and non-zero. Initial results also indicate the Tobit model produces better results than PCR and OLS when more than 5% of the sites in the region have Q710 values calculated as zero.
Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D
2017-07-01
Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism for the phenomenon under study. Motivated by fluorescence spectroscopy data in a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and potential multi-level structures in data that are induced by experimental design. The model is fitted by performing a two-stage linear transformation-a basis expansion to each functional variable followed by principal component analysis for the concatenated basis coefficients. This transformation effectively reduces the intra-and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expressions across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.
Predicting flight delay based on multiple linear regression
NASA Astrophysics Data System (ADS)
Ding, Yi
2017-08-01
Delay of flight has been regarded as one of the toughest difficulties in aviation control. How to establish an effective model to handle the delay prediction problem is a significant work. To solve the problem that the flight delay is difficult to predict, this study proposes a method to model the arriving flights and a multiple linear regression algorithm to predict delay, comparing with Naive-Bayes and C4.5 approach. Experiments based on a realistic dataset of domestic airports show that the accuracy of the proposed model approximates 80%, which is further improved than the Naive-Bayes and C4.5 approach approaches. The result testing shows that this method is convenient for calculation, and also can predict the flight delays effectively. It can provide decision basis for airport authorities.
Powell, David E; Suganuma, Noriyuki; Kobayashi, Keiji; Nakamura, Tsutomu; Ninomiya, Kouzo; Matsumura, Kozaburo; Omura, Naoki; Ushioka, Satoshi
2017-02-01
Bioaccumulation and trophic transfer of cyclic volatile methylsiloxanes (cVMS), specifically octamethylcyclotetrasiloxane (D4), decamethylcyclopentasiloxane (D5), and dodecamethylcyclohexasiloxane (D6), were evaluated in the pelagic marine food web of Tokyo Bay, Japan. Polychlorinated biphenyl (PCB) congeners that are "legacy" chemicals known to bioaccumulate in aquatic organisms and biomagnify across aquatic food webs were used as a benchmark chemical (CB-180) to calibrate the sampled food web and as a reference chemical (CB-153) to validate the results. Trophic magnification factors (TMFs) were calculated from slopes of ordinary least-squares (OLS) regression models and slopes of bootstrap regression models, which were used as robust alternatives to the OLS models. Various regression models were developed that incorporated benchmarking to control bias associated with experimental design, food web dynamics, and trophic level structure. There was no evidence from any of the regression models to suggest biomagnification of cVMS in Tokyo Bay. Rather, the regression models indicated that trophic dilution of cVMS, not trophic magnification, occurred across the sampled food web. Comparison of results for Tokyo Bay to results from other studies indicated that bioaccumulation of cVMS was not related to type of food web (pelagic vs demersal), environment (marine vs freshwater), species composition, or location. Rather, results suggested that differences between study areas was likely related to food web dynamics and variable conditions of exposure resulting from non-uniform patterns of organism movement across spatial concentration gradients. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Potential pitfalls when denoising resting state fMRI data using nuisance regression.
Bright, Molly G; Tench, Christopher R; Murphy, Kevin
2017-07-01
In resting state fMRI, it is necessary to remove signal variance associated with noise sources, leaving cleaned fMRI time-series that more accurately reflect the underlying intrinsic brain fluctuations of interest. This is commonly achieved through nuisance regression, in which the fit is calculated of a noise model of head motion and physiological processes to the fMRI data in a General Linear Model, and the "cleaned" residuals of this fit are used in further analysis. We examine the statistical assumptions and requirements of the General Linear Model, and whether these are met during nuisance regression of resting state fMRI data. Using toy examples and real data we show how pre-whitening, temporal filtering and temporal shifting of regressors impact model fit. Based on our own observations, existing literature, and statistical theory, we make the following recommendations when employing nuisance regression: pre-whitening should be applied to achieve valid statistical inference of the noise model fit parameters; temporal filtering should be incorporated into the noise model to best account for changes in degrees of freedom; temporal shifting of regressors, although merited, should be achieved via optimisation and validation of a single temporal shift. We encourage all readers to make simple, practical changes to their fMRI denoising pipeline, and to regularly assess the appropriateness of the noise model used. By negotiating the potential pitfalls described in this paper, and by clearly reporting the details of nuisance regression in future manuscripts, we hope that the field will achieve more accurate and precise noise models for cleaning the resting state fMRI time-series. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Numerical modeling on carbon fiber composite material in Gaussian beam laser based on ANSYS
NASA Astrophysics Data System (ADS)
Luo, Ji-jun; Hou, Su-xia; Xu, Jun; Yang, Wei-jun; Zhao, Yun-fang
2014-02-01
Based on the heat transfer theory and finite element method, the macroscopic ablation model of Gaussian beam laser irradiated surface is built and the value of temperature field and thermal ablation development is calculated and analyzed rationally by using finite element software of ANSYS. Calculation results show that the ablating form of the materials in different irritation is of diversity. The laser irradiated surface is a camber surface rather than a flat surface, which is on the lowest point and owns the highest power density. Research shows that the higher laser power density absorbed by material surface, the faster the irritation surface regressed.
Zlotnik, Alexander; Gallardo-Antolín, Ascensión; Cuchí Alfaro, Miguel; Pérez Pérez, María Carmen; Montero Martínez, Juan Manuel
2015-08-01
Although emergency department visit forecasting can be of use for nurse staff planning, previous research has focused on models that lacked sufficient resolution and realistic error metrics for these predictions to be applied in practice. Using data from a 1100-bed specialized care hospital with 553,000 patients assigned to its healthcare area, forecasts with different prediction horizons, from 2 to 24 weeks ahead, with an 8-hour granularity, using support vector regression, M5P, and stratified average time-series models were generated with an open-source software package. As overstaffing and understaffing errors have different implications, error metrics and potential personnel monetary savings were calculated with a custom validation scheme, which simulated subsequent generation of predictions during a 4-year period. Results were then compared with a generalized estimating equation regression. Support vector regression and M5P models were found to be superior to the stratified average model with a 95% confidence interval. Our findings suggest that medium and severe understaffing situations could be reduced in more than an order of magnitude and average yearly savings of up to €683,500 could be achieved if dynamic nursing staff allocation was performed with support vector regression instead of the static staffing levels currently in use.
Power and sample size for multivariate logistic modeling of unmatched case-control studies.
Gail, Mitchell H; Haneuse, Sebastien
2017-01-01
Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we outline the theory needed to detect scalar exposure effects or scalar interactions while controlling for other covariates in logistic regression. Both analytical and simulation methods are presented, together with links to the corresponding software.
A Method for Calculating the Probability of Successfully Completing a Rocket Propulsion Ground Test
NASA Technical Reports Server (NTRS)
Messer, Bradley P.
2004-01-01
Propulsion ground test facilities face the daily challenges of scheduling multiple customers into limited facility space and successfully completing their propulsion test projects. Due to budgetary and schedule constraints, NASA and industry customers are pushing to test more components, for less money, in a shorter period of time. As these new rocket engine component test programs are undertaken, the lack of technology maturity in the test articles, combined with pushing the test facilities capabilities to their limits, tends to lead to an increase in facility breakdowns and unsuccessful tests. Over the last five years Stennis Space Center's propulsion test facilities have performed hundreds of tests, collected thousands of seconds of test data, and broken numerous test facility and test article parts. While various initiatives have been implemented to provide better propulsion test techniques and improve the quality, reliability, and maintainability of goods and parts used in the propulsion test facilities, unexpected failures during testing still occur quite regularly due to the harsh environment in which the propulsion test facilities operate. Previous attempts at modeling the lifecycle of a propulsion component test project have met with little success. Each of the attempts suffered form incomplete or inconsistent data on which to base the models. By focusing on the actual test phase of the tests project rather than the formulation, design or construction phases of the test project, the quality and quantity of available data increases dramatically. A logistic regression model has been developed form the data collected over the last five years, allowing the probability of successfully completing a rocket propulsion component test to be calculated. A logistic regression model is a mathematical modeling approach that can be used to describe the relationship of several independent predictor variables X(sub 1), X(sub 2),..,X(sub k) to a binary or dichotomous dependent variable Y, where Y can only be one of two possible outcomes, in this case Success or Failure. Logistic regression has primarily been used in the fields of epidemiology and biomedical research, but lends itself to many other applications. As indicated the use of logistic regression is not new, however, modeling propulsion ground test facilities using logistic regression is both a new and unique application of the statistical technique. Results from the models provide project managers with insight and confidence into the affectivity of rocket engine component ground test projects. The initial success in modeling rocket propulsion ground test projects clears the way for more complex models to be developed in this area.
van Mil, Anke C C M; Greyling, Arno; Zock, Peter L; Geleijnse, Johanna M; Hopman, Maria T; Mensink, Ronald P; Reesink, Koen D; Green, Daniel J; Ghiadoni, Lorenzo; Thijssen, Dick H
2016-09-01
Brachial artery flow-mediated dilation (FMD) is a popular technique to examine endothelial function in humans. Identifying volunteer and methodological factors related to variation in FMD is important to improve measurement accuracy and applicability. Volunteer-related and methodology-related parameters were collected in 672 volunteers from eight affiliated centres worldwide who underwent repeated measures of FMD. All centres adopted contemporary expert-consensus guidelines for FMD assessment. After calculating the coefficient of variation (%) of the FMD for each individual, we constructed quartiles (n = 168 per quartile). Based on two regression models (volunteer-related factors and methodology-related factors), statistically significant components of these two models were added to a final regression model (calculated as β-coefficient and R). This allowed us to identify factors that independently contributed to the variation in FMD%. Median coefficient of variation was 17.5%, with healthy volunteers demonstrating a coefficient of variation 9.3%. Regression models revealed age (β = 0.248, P < 0.001), hypertension (β = 0.104, P < 0.001), dyslipidemia (β = 0.331, P < 0.001), time between measurements (β = 0.318, P < 0.001), lab experience (β = -0.133, P < 0.001) and baseline FMD% (β = 0.082, P < 0.05) as contributors to the coefficient of variation. After including all significant factors in the final model, we found that time between measurements, hypertension, baseline FMD% and lab experience with FMD independently predicted brachial artery variability (total R = 0.202). Although FMD% showed good reproducibility, larger variation was observed in conditions with longer time between measurements, hypertension, less experience and lower baseline FMD%. Accounting for these factors may improve FMD% variability.
NASA Astrophysics Data System (ADS)
Ozdemir, Adnan
2011-07-01
SummaryThe purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km 2 (28.99%), 74.271 km 2 (19.906%), 101.203 km 2 (27.14%), and 90.05 km 2 (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. The evolved model was found to be in strong agreement with the available groundwater spring test data. Hence, this method can be used routinely in groundwater exploration under favourable conditions.
Climate patterns as predictors of amphibians species richness and indicators of potential stress
Battaglin, W.; Hay, L.; McCabe, G.; Nanjappa, P.; Gallant, Alisa L.
2005-01-01
Amphibians occupy a range of habitats throughout the world, but species richness is greatest in regions with moist, warm climates. We modeled the statistical relations of anuran and urodele species richness with mean annual climate for the conterminous United States, and compared the strength of these relations at national and regional levels. Model variables were calculated for county and subcounty mapping units, and included 40-year (1960-1999) annual mean and mean annual climate statistics, mapping unit average elevation, mapping unit land area, and estimates of anuran and urodele species richness. Climate data were derived from more than 7,500 first-order and cooperative meteorological stations and were interpolated to the mapping units using multiple linear regression models. Anuran and urodele species richness were calculated from the United States Geological Survey's Amphibian Research and Monitoring Initiative (ARMI) National Atlas for Amphibian Distributions. The national multivariate linear regression (MLR) model of anuran species richness had an adjusted coefficient of determination (R2) value of 0.64 and the national MLR model for urodele species richness had an R2 value of 0.45. Stratifying the United States by coarse-resolution ecological regions provided models for anUrans that ranged in R2 values from 0.15 to 0.78. Regional models for urodeles had R2 values. ranging from 0.27 to 0.74. In general, regional models for anurans were more strongly influenced by temperature variables, whereas precipitation variables had a larger influence on urodele models.
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
NASA Technical Reports Server (NTRS)
Carder, K. L.; Lee, Z. P.; Marra, John; Steward, R. G.; Perry, M. J.
1995-01-01
The quantum yield of photosynthesis (mol C/mol photons) was calculated at six depths for the waters of the Marine Light-Mixed Layer (MLML) cruise of May 1991. As there were photosynthetically available radiation (PAR) but no spectral irradiance measurements for the primary production incubations, three ways are presented here for the calculation of the absorbed photons (AP) by phytoplankton for the purpose of calculating phi. The first is based on a simple, nonspectral model; the second is based on a nonlinear regression using measured PAR values with depth; and the third is derived through remote sensing measurements. We show that the results of phi calculated using the nonlinear regreesion method and those using remote sensing are in good agreement with each other, and are consistent with the reported values of other studies. In deep waters, however, the simple nonspectral model may cause quantum yield values much higher than theoretically possible.
Wang, Shan-Shan; Zhang, Yu-Qi; Chen, Shu-Bao; Huang, Guo-Ying; Zhang, Hong-Yan; Zhang, Zhi-Fang; Wu, Lan-Ping; Hong, Wen-Jing; Shen, Rong; Liu, Yi-Qing; Zhu, Jun-Xue
2017-06-01
Clinical decision making in children with congenital and acquired heart disease relies on measurements of cardiac structures using two-dimensional echocardiography. We aimed to establish z-score regression equations for right heart structures in healthy Chinese Han children. Two-dimensional and M-mode echocardiography was performed in 515 patients. We measured the dimensions of the pulmonary valve annulus (PVA), main pulmonary artery (MPA), left pulmonary artery (LPA), right pulmonary artery (RPA), right ventricular outflow tract at end-diastole (RVOTd) and at end-systole (RVOTs), tricuspid valve annulus (TVA), right ventricular inflow tract at end-diastole (RVIDd) and at end-systole (RVIDs), and right atrium (RA). Regression analyses were conducted to relate the measurements of right heart structures to 4body surface area (BSA). Right ventricular outflow-tract fractional shortening (RVOTFS) was also calculated. Several models were used, and the best model was chosen to establish a z-score calculator. PVA, MPA, LPA, RPA, RVOTd, RVOTs, TVA, RVIDd, RVIDs, and RA (R 2 = 0.786, 0.705, 0.728, 0.701, 0.706, 0.824, 0.804, 0.663, 0.626, and 0.793, respectively) had a cubic polynomial relationship with BSA; specifically, measurement (M) = β0 + β1 × BSA + β2 × BSA 2 + β3 × BSA. 3 RVOTFS (0.28 ± 0.02) fell within a narrow range (0.12-0.51). Our results provide reference values for z scores and regression equations for right heart structures in Han Chinese children. These data may help interpreting the routine clinical measurement of right heart structures in children with congenital or acquired heart disease. © 2016 Wiley Periodicals, Inc. J Clin Ultrasound 45:293-303, 2017. © 2017 Wiley Periodicals, Inc.
Terziotti, Silvia; Capel, Paul D.; Tesoriero, Anthony J.; Hopple, Jessica A.; Kronholm, Scott C.
2018-03-07
The water quality of the Chesapeake Bay may be adversely affected by dissolved nitrate carried in groundwater discharge to streams. To estimate the concentrations, loads, and yields of nitrate from groundwater to streams for the Chesapeake Bay watershed, a regression model was developed based on measured nitrate concentrations from 156 small streams with watersheds less than 500 square miles (mi2 ) at baseflow. The regression model has three predictive variables: geologic unit, percent developed land, and percent agricultural land. Comparisons of estimated and actual values within geologic units were closely matched. The coefficient of determination (R2 ) for the model was 0.6906. The model was used to calculate baseflow nitrate concentrations at over 83,000 National Hydrography Dataset Plus Version 2 catchments and aggregated to 1,966 total 12-digit hydrologic units in the Chesapeake Bay watershed. The modeled output geospatial data layers provided estimated annual loads and yields of nitrate from groundwater into streams. The spatial distribution of annual nitrate yields from groundwater estimated by this method was compared to the total watershed yields of all sources estimated from a Chesapeake Bay SPAtially Referenced Regressions On Watershed attributes (SPARROW) water-quality model. The comparison showed similar spatial patterns. The regression model for groundwater contribution had similar but lower yields, suggesting that groundwater is an important source of nitrogen for streams in the Chesapeake Bay watershed.
The allometry of coarse root biomass: log-transformed linear regression or nonlinear regression?
Lai, Jiangshan; Yang, Bo; Lin, Dunmei; Kerkhoff, Andrew J; Ma, Keping
2013-01-01
Precise estimation of root biomass is important for understanding carbon stocks and dynamics in forests. Traditionally, biomass estimates are based on allometric scaling relationships between stem diameter and coarse root biomass calculated using linear regression (LR) on log-transformed data. Recently, it has been suggested that nonlinear regression (NLR) is a preferable fitting method for scaling relationships. But while this claim has been contested on both theoretical and empirical grounds, and statistical methods have been developed to aid in choosing between the two methods in particular cases, few studies have examined the ramifications of erroneously applying NLR. Here, we use direct measurements of 159 trees belonging to three locally dominant species in east China to compare the LR and NLR models of diameter-root biomass allometry. We then contrast model predictions by estimating stand coarse root biomass based on census data from the nearby 24-ha Gutianshan forest plot and by testing the ability of the models to predict known root biomass values measured on multiple tropical species at the Pasoh Forest Reserve in Malaysia. Based on likelihood estimates for model error distributions, as well as the accuracy of extrapolative predictions, we find that LR on log-transformed data is superior to NLR for fitting diameter-root biomass scaling models. More importantly, inappropriately using NLR leads to grossly inaccurate stand biomass estimates, especially for stands dominated by smaller trees.
Alishiri, Gholam Hossein; Bayat, Noushin; Fathi Ashtiani, Ali; Tavallaii, Seyed Abbas; Assari, Shervin; Moharamzad, Yashar
2008-01-01
The aim of this work was to develop two logistic regression models capable of predicting physical and mental health related quality of life (HRQOL) among rheumatoid arthritis (RA) patients. In this cross-sectional study which was conducted during 2006 in the outpatient rheumatology clinic of our university hospital, Short Form 36 (SF-36) was used for HRQOL measurements in 411 RA patients. A cutoff point to define poor versus good HRQOL was calculated using the first quartiles of SF-36 physical and mental component scores (33.4 and 36.8, respectively). Two distinct logistic regression models were used to derive predictive variables including demographic, clinical, and psychological factors. The sensitivity, specificity, and accuracy of each model were calculated. Poor physical HRQOL was positively associated with pain score, disease duration, monthly family income below 300 US$, comorbidity, patient global assessment of disease activity or PGA, and depression (odds ratios: 1.1; 1.004; 15.5; 1.1; 1.02; 2.08, respectively). The variables that entered into the poor mental HRQOL prediction model were monthly family income below 300 US$, comorbidity, PGA, and bodily pain (odds ratios: 6.7; 1.1; 1.01; 1.01, respectively). Optimal sensitivity and specificity were achieved at a cutoff point of 0.39 for the estimated probability of poor physical HRQOL and 0.18 for mental HRQOL. Sensitivity, specificity, and accuracy of the physical and mental models were 73.8, 87, 83.7% and 90.38, 70.36, 75.43%, respectively. The results show that the suggested models can be used to predict poor physical and mental HRQOL separately among RA patients using simple variables with acceptable accuracy. These models can be of use in the clinical decision-making of RA patients and to recognize patients with poor physical or mental HRQOL in advance, for better management.
Ebtehaj, Isa; Bonakdari, Hossein
2016-01-01
Sediment transport without deposition is an essential consideration in the optimum design of sewer pipes. In this study, a novel method based on a combination of support vector regression (SVR) and the firefly algorithm (FFA) is proposed to predict the minimum velocity required to avoid sediment settling in pipe channels, which is expressed as the densimetric Froude number (Fr). The efficiency of support vector machine (SVM) models depends on the suitable selection of SVM parameters. In this particular study, FFA is used by determining these SVM parameters. The actual effective parameters on Fr calculation are generally identified by employing dimensional analysis. The different dimensionless variables along with the models are introduced. The best performance is attributed to the model that employs the sediment volumetric concentration (C(V)), ratio of relative median diameter of particles to hydraulic radius (d/R), dimensionless particle number (D(gr)) and overall sediment friction factor (λ(s)) parameters to estimate Fr. The performance of the SVR-FFA model is compared with genetic programming, artificial neural network and existing regression-based equations. The results indicate the superior performance of SVR-FFA (mean absolute percentage error = 2.123%; root mean square error =0.116) compared with other methods.
Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki
2014-12-01
This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.
Risk factors for lesions of the knee menisci among workers in South Korea's national parks.
Shin, Donghee; Youn, Kanwoo; Lee, Eunja; Lee, Myeongjun; Chung, Hweemin; Kim, Deokweon
2016-01-01
This study was designed to investigate the prevalence of the menisci lesions in national park workers and work factors affecting this prevalence. The study subjects were 698 workers who worked in 20 Korean national parks in 2014. An orthopedist visited each national park and performed physical examinations. Knee MRI was performed if the McMurray test or Apley test was positive and there was a complaint of pain in knee area. An orthopedist and a radiologist respectively read these images of the menisci using a grading system based on the MRI signals. To calculate the cumulative intensity of trekking of the workers, the mean trail distance, the difficulty of the trail, the tenure at each national parks, and the number of treks per month for each worker from the start of work until the present were investigated. Chi-square tests was performed to see if there were differences in the menisci lesions grade according to the variables. The variables used in the Chi-square test were evaluated using simple logistic regression analysis to get crude odds ratios, and adjusted odds ratios and 95 % confidence intervals were calculated using multivariate logistic regression analysis after establishing three different models according to the adjusted variables. According to the MRI signal grades of menisci, 29 % were grade 0, 11.3 % were grade 1, 46.0 % were grade 2, and 13.7 % were grade 3. The differences in the MRI signal grades of menisci according to age and the intensity of trekking as calculated by the three different methods were statistically significant. Multiple logistic regression analysis was performed for three models. In model 1, there was no statistically significant factor affecting the menisci lesions. In model 2, among the factors affecting the menisci lesions, the OR of a high cumulative intensity of trekking was 4.08 (95 % CI 1.00-16.61), and in model 3, the OR of a high cumulative intensity of trekking was 5.84 (95 % CI 1.09-31.26). The factor that most affected the menisci lesions among the workers in Korean national park was a high cumulative intensity of trekking.
Applications of modern statistical methods to analysis of data in physical science
NASA Astrophysics Data System (ADS)
Wicker, James Eric
Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plagues this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcomes the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
QSAR Analysis of 2-Amino or 2-Methyl-1-Substituted Benzimidazoles Against Pseudomonas aeruginosa
Podunavac-Kuzmanović, Sanja O.; Cvetković, Dragoljub D.; Barna, Dijana J.
2009-01-01
A set of benzimidazole derivatives were tested for their inhibitory activities against the Gram-negative bacterium Pseudomonas aeruginosa and minimum inhibitory concentrations were determined for all the compounds. Quantitative structure activity relationship (QSAR) analysis was applied to fourteen of the abovementioned derivatives using a combination of various physicochemical, steric, electronic, and structural molecular descriptors. A multiple linear regression (MLR) procedure was used to model the relationships between molecular descriptors and the antibacterial activity of the benzimidazole derivatives. The stepwise regression method was used to derive the most significant models as a calibration model for predicting the inhibitory activity of this class of molecules. The best QSAR models were further validated by a leave one out technique as well as by the calculation of statistical parameters for the established theoretical models. To confirm the predictive power of the models, an external set of molecules was used. High agreement between experimental and predicted inhibitory values, obtained in the validation procedure, indicated the good quality of the derived QSAR models. PMID:19468332
Szekér, Szabolcs; Vathy-Fogarassy, Ágnes
2018-01-01
Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist as well, which are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based research in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon calls the attention of analysts to an important point, which must be taken into account when deducting conclusions.
Wang, Shan-Shan; Hong, Wen-Jing; Zhang, Yu-Qi; Chen, Shu-Bao; Huang, Guo-Ying; Zhang, Hong-Yan; Chen, Li-Jun; Wu, Lan-Ping; Shen, Rong; Liu, Yi-Qing; Zhu, Jun-Xue
2018-06-01
Clinical decision making in children with heart disease relies on detailed measurements of cardiac structures using two-dimensional and M-mode echocardiography. However, no echocardiographic reference values are available for the Chinese children. We aimed to establish z-score regression equations for left heart structures in a population-based cohort of healthy Chinese Han children. Echocardiography was performed in 545 children with a normal heart. The dimensions of the aortic valve annulus (AVA), aortic sinuses of Valsalva (ASV), sinotubular junction (STJ), ascending aorta (AAO), left atrium (LA), mitral valve annulus (MVA), interventricular septal end-diastolic thickness (IVSd), interventricular septal end-systolic thickness (IVSs), left ventricular end-diastolic diameter (LVIDd), left ventricular end-systolic diameter (LVIDs), left ventricular posterior wall end-diastolic thickness (LVPWd), left ventricular posterior wall end-systolic thickness (LVPWs) were measured. Regression analyses were conducted to relate the measurements of left heart structures to body surface area (BSA). Left ventricular ejection fraction (LVEF) and left ventricular fractional shortening (LVFS) were calculated. Several models were used, and the adjusted R2 values were compared for each model. AVA, ASV, STJ, AAO, LA, MVA, IVSd, IVSs, LVIDd, LVIDs, LVPWd, and LVPWs had a cubic relationship with BSA. LVEF and LVFS fell within a narrow range. Our results provide reference values for z scores and regression equations for left heart structures in Han Chinese children. These data may help make a quick and accurate judgment of the routine clinical measurement of left heart structures in children with heart disease. © 2018 Wiley Periodicals, Inc.
Wang, Wei; Griswold, Michael E
2016-11-30
The random effect Tobit model is a regression model that accommodates both left- and/or right-censoring and within-cluster dependence of the outcome variable. Regression coefficients of random effect Tobit models have conditional interpretations on a constructed latent dependent variable and do not provide inference of overall exposure effects on the original outcome scale. Marginalized random effects model (MREM) permits likelihood-based estimation of marginal mean parameters for the clustered data. For random effect Tobit models, we extend the MREM to marginalize over both the random effects and the normal space and boundary components of the censored response to estimate overall exposure effects at population level. We also extend the 'Average Predicted Value' method to estimate the model-predicted marginal means for each person under different exposure status in a designated reference group by integrating over the random effects and then use the calculated difference to assess the overall exposure effect. The maximum likelihood estimation is proposed utilizing a quasi-Newton optimization algorithm with Gauss-Hermite quadrature to approximate the integration of the random effects. We use these methods to carefully analyze two real datasets. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
ERIC Educational Resources Information Center
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
ERIC Educational Resources Information Center
Shieh, Gwowen
2006-01-01
This paper considers the problem of analysis of correlation coefficients from a multivariate normal population. A unified theorem is derived for the regression model with normally distributed explanatory variables and the general results are employed to provide useful expressions for the distributions of simple, multiple, and partial-multiple…
1981-04-28
on initial doses. Residual doses are determined through an automiated procedure that utilizes raw data in regression analyses to fit space-time models...show their relationship to the observer positions. The computer-calculated doses do not reflect the presence of the human body in the radiological
ERIC Educational Resources Information Center
Pallone, Nathaniel J.; Hennessy, James J.; Voelbel, Gerald T.
1998-01-01
A scientifically sound methodology for identifying offenders about whose presence the community should be notified is demonstrated. A stepwise multiple regression was calculated among incarcerated pedophiles (N=52) including both psychological and legal data; a precision-weighted equation produced 90.4% "true positives." This methodology can be…
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reporting of surgical complications is common, but few provide information about the severity and estimate risk factors of complications. If have, but lack of specificity. We retrospectively analyzed data on 2795 gastric cancer patients underwent surgical procedure at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, established multivariate logistic regression model to predictive risk factors related to the postoperative complications according to the Clavien-Dindo classification system. Twenty-four out of 86 variables were identified statistically significant in univariate logistic regression analysis, 11 significant variables entered multivariate analysis were employed to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, introperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on logistic regression equation, p=Exp∑BiXi / (1+Exp∑BiXi), multivariate logistic regression predictive model that calculated the risk of postoperative morbidity was developed, p = 1/(1 + e((4.810-1.287X1-0.504X2-0.500X3-0.474X4-0.405X5-0.318X6-0.316X7-0.305X8-0.278X9-0.255X10-0.138X11))). The accuracy, sensitivity and specificity of the model to predict the postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model based on Clavien-Dindo grading severity of complications system and logistic regression analysis can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool and may serve as a template for the development of risk models for other surgical groups.
Mehl, S.; Hill, M.C.
2002-01-01
Models with local grid refinement, as often required in groundwater models, pose special problems for model calibration. This work investigates the calculation of sensitivities and the performance of regression methods using two existing and one new method of grid refinement. The existing local grid refinement methods considered are: (a) a variably spaced grid in which the grid spacing becomes smaller near the area of interest and larger where such detail is not needed, and (b) telescopic mesh refinement (TMR), which uses the hydraulic heads or fluxes of a regional model to provide the boundary conditions for a locally refined model. The new method has a feedback between the regional and local grids using shared nodes, and thereby, unlike the TMR methods, balances heads and fluxes at the interfacing boundary. Results for sensitivities are compared for the three methods and the effect of the accuracy of sensitivity calculations are evaluated by comparing inverse modelling results. For the cases tested, results indicate that the inaccuracies of the sensitivities calculated using the TMR approach can cause the inverse model to converge to an incorrect solution.
Mehl, S.; Hill, M.C.
2002-01-01
Models with local grid refinement, as often required in groundwater models, pose special problems for model calibration. This work investigates the calculation of sensitivities and performance of regression methods using two existing and one new method of grid refinement. The existing local grid refinement methods considered are (1) a variably spaced grid in which the grid spacing becomes smaller near the area of interest and larger where such detail is not needed and (2) telescopic mesh refinement (TMR), which uses the hydraulic heads or fluxes of a regional model to provide the boundary conditions for a locally refined model. The new method has a feedback between the regional and local grids using shared nodes, and thereby, unlike the TMR methods, balances heads and fluxes at the interfacing boundary. Results for sensitivities are compared for the three methods and the effect of the accuracy of sensitivity calculations are evaluated by comparing inverse modelling results. For the cases tested, results indicate that the inaccuracies of the sensitivities calculated using the TMR approach can cause the inverse model to converge to an incorrect solution.
[Hungarian health resource allocation from the viewpoint of the English methodology].
Fadgyas-Freyler, Petra
2018-02-01
This paper describes both the English health resource allocation and the attempt of its Hungarian adaptation. We describe calculations for a Hungarian regression model using the English 'weighted capitation formula'. The model has proven statistically correct. New independent variables and homogenous regional units have to be found for Hungary. The English method can be used with adequate variables. Hungarian patient-level health data can support a much more sophisticated model. Further research activities are needed. Orv Hetil. 2018; 159(5): 183-191.
Yu, Yuanyuan; Li, Hongkai; Sun, Xiaoru; Su, Ping; Wang, Tingting; Liu, Yi; Yuan, Zhongshang; Liu, Yanxun; Xue, Fuzhong
2017-12-28
Confounders can produce spurious associations between exposure and outcome in observational studies. For majority of epidemiologists, adjusting for confounders using logistic regression model is their habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and search the alternative method. Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which logistic regression model and inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The "do-calculus" was used to calculate the true causal effect of exposure on outcome, then the bias and standard error were used to evaluate the performances of different strategies. Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets satisfied G-admissibility, adjusting for the set including all the confounders reduced the equivalent bias to the one containing the parent nodes of the outcome, while the bias after adjusting for the parent nodes of exposure was not equivalent to them. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM. Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility and the optimal strategy was to adjust for the parent nodes of outcome, which obtained the highest precision. All adjustment strategies through logistic regression were biased for causal effect estimation, while IPW-based-MSM could always obtain unbiased estimation when the adjusted set satisfied G-admissibility. Thus, IPW-based-MSM was recommended to adjust for confounders set.
Riley, Richard D; Ensor, Joie; Jackson, Dan; Burke, Danielle L
2017-01-01
Many meta-analysis models contain multiple parameters, for example due to multiple outcomes, multiple treatments or multiple regression coefficients. In particular, meta-regression models may contain multiple study-level covariates, and one-stage individual participant data meta-analysis models may contain multiple patient-level covariates and interactions. Here, we propose how to derive percentage study weights for such situations, in order to reveal the (otherwise hidden) contribution of each study toward the parameter estimates of interest. We assume that studies are independent, and utilise a decomposition of Fisher's information matrix to decompose the total variance matrix of parameter estimates into study-specific contributions, from which percentage weights are derived. This approach generalises how percentage weights are calculated in a traditional, single parameter meta-analysis model. Application is made to one- and two-stage individual participant data meta-analyses, meta-regression and network (multivariate) meta-analysis of multiple treatments. These reveal percentage study weights toward clinically important estimates, such as summary treatment effects and treatment-covariate interactions, and are especially useful when some studies are potential outliers or at high risk of bias. We also derive percentage study weights toward methodologically interesting measures, such as the magnitude of ecological bias (difference between within-study and across-study associations) and the amount of inconsistency (difference between direct and indirect evidence in a network meta-analysis).
Mandel, Micha; Gauthier, Susan A; Guttmann, Charles R G; Weiner, Howard L; Betensky, Rebecca A
2007-12-01
The expanded disability status scale (EDSS) is an ordinal score that measures progression in multiple sclerosis (MS). Progression is defined as reaching EDSS of a certain level (absolute progression) or increasing of one point of EDSS (relative progression). Survival methods for time to progression are not adequate for such data since they do not exploit the EDSS level at the end of follow-up. Instead, we suggest a Markov transitional model applicable for repeated categorical or ordinal data. This approach enables derivation of covariate-specific survival curves, obtained after estimation of the regression coefficients and manipulations of the resulting transition matrix. Large sample theory and resampling methods are employed to derive pointwise confidence intervals, which perform well in simulation. Methods for generating survival curves for time to EDSS of a certain level, time to increase of EDSS of at least one point, and time to two consecutive visits with EDSS greater than three are described explicitly. The regression models described are easily implemented using standard software packages. Survival curves are obtained from the regression results using packages that support simple matrix calculation. We present and demonstrate our method on data collected at the Partners MS center in Boston, MA. We apply our approach to progression defined by time to two consecutive visits with EDSS greater than three, and calculate crude (without covariates) and covariate-specific curves.
Marrero-Ponce, Yovani; Medina-Marrero, Ricardo; Castillo-Garit, Juan A; Romero-Zaldivar, Vicente; Torrens, Francisco; Castro, Eduardo A
2005-04-15
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole protein) and local (one or more amino acid) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid level biochemical descriptors are based on the calculation of linear maps on Rn[f k(xmi):Rn-->Rn] in canonical basis. These bio-macromolecular indices are calculated from the kth power of the macromolecular pseudograph alpha-carbon atom adjacency matrix. Total linear indices are linear functional on Rn. That is, the kth total linear indices are linear maps from Rn to the scalar R[f k(xm):Rn-->R]. Thus, the kth total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model that discriminates near wild-type stability alanine mutants from the reduced-stability ones in a training series was obtained. This model permitted the correct classification of 97.56% (40/41) and 91.67% (11/12) of proteins in the training and test set, respectively. It shows a high Matthews correlation coefficient (MCC=0.952) for the training set and an MCC=0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model (Rcanc=0.824). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. On the other hand, the linear piecewise regression model compared favorably with respect to the linear regression one on predicting the melting temperature (tm) of the Arc alanine mutants. The linear model explains almost 81% of the variance of the experimental tm (R=0.90 and s=4.29) and the LOO press statistics evidenced its predictive ability (q2=0.72 and scv=4.79). Moreover, the TOMOCOMD-CAMPS method produced a linear piecewise regression (R=0.97) between protein backbone descriptors and tm values for alanine mutants of the Arc repressor. A break-point value of 51.87 degrees C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted the interpretation of the driving forces of such folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants.
Kemper, Claudia; Koller, Daniela; Glaeske, Gerd; van den Bussche, Hendrik
2011-01-01
Aphasia, dementia, and depression are important and common neurological and neuropsychological disorders after ischemic stroke. We estimated the frequency of these comorbidities and their impact on mortality and nursing care dependency. Data of a German statutory health insurance were analyzed for people aged 50 years and older with first ischemic stroke. Aphasia, dementia, and depression were defined on the basis of outpatient medical diagnoses within 1 year after stroke. Logistic regression models for mortality and nursing care dependency were calculated and were adjusted for age, sex, and other relevant comorbidity. Of 977 individuals with a first ischemic stroke, 14.8% suffered from aphasia, 12.5% became demented, and 22.4% became depressed. The regression model for mortality showed a significant influence of age, aphasia, and other relevant comorbidity. In the regression model for nursing care dependency, the factors age, aphasia, dementia, depression, and other relevant comorbidity were significant. Aphasia has a high impact on mortality and nursing care dependency after ischemic stroke, while dementia and depression are strongly associated with increasing nursing care dependency.
A prediction model for lift-fan simulator performance. M.S. Thesis - Cleveland State Univ.
NASA Technical Reports Server (NTRS)
Yuska, J. A.
1972-01-01
The performance characteristics of a model VTOL lift-fan simulator installed in a two-dimensional wing are presented. The lift-fan simulator consisted of a 15-inch diameter fan driven by a turbine contained in the fan hub. The performance of the lift-fan simulator was measured in two ways: (1) the calculated momentum thrust of the fan and turbine (total thrust loading), and (2) the axial-force measured on a load cell force balance (axial-force loading). Tests were conducted over a wide range of crossflow velocities, corrected tip speeds, and wing angle of attack. A prediction modeling technique was developed to help in analyzing the performance characteristics of lift-fan simulators. A multiple linear regression analysis technique is presented which calculates prediction model equations for the dependent variables.
Acidity in DMSO from the embedded cluster integral equation quantum solvation model.
Heil, Jochen; Tomazic, Daniel; Egbers, Simon; Kast, Stefan M
2014-04-01
The embedded cluster reference interaction site model (EC-RISM) is applied to the prediction of acidity constants of organic molecules in dimethyl sulfoxide (DMSO) solution. EC-RISM is based on a self-consistent treatment of the solute's electronic structure and the solvent's structure by coupling quantum-chemical calculations with three-dimensional (3D) RISM integral equation theory. We compare available DMSO force fields with reference calculations obtained using the polarizable continuum model (PCM). The results are evaluated statistically using two different approaches to eliminating the proton contribution: a linear regression model and an analysis of pK(a) shifts for compound pairs. Suitable levels of theory for the integral equation methodology are benchmarked. The results are further analyzed and illustrated by visualizing solvent site distribution functions and comparing them with an aqueous environment.
NASA Astrophysics Data System (ADS)
Guan, Yafu; Yang, Shuo; Zhang, Dong H.
2018-04-01
Gaussian process regression (GPR) is an efficient non-parametric method for constructing multi-dimensional potential energy surfaces (PESs) for polyatomic molecules. Since not only the posterior mean but also the posterior variance can be easily calculated, GPR provides a well-established model for active learning, through which PESs can be constructed more efficiently and accurately. We propose a strategy of active data selection for the construction of PESs with emphasis on low energy regions. Through three-dimensional (3D) example of H3, the validity of this strategy is verified. The PESs for two prototypically reactive systems, namely, H + H2O ↔ H2 + OH reaction and H + CH4 ↔ H2 + CH3 reaction are reconstructed. Only 920 and 4000 points are assembled to reconstruct these two PESs respectively. The accuracy of the GP PESs is not only tested by energy errors but also validated by quantum scattering calculations.
Homogenization Issues in the Combustion of Heterogeneous Solid Propellants
NASA Technical Reports Server (NTRS)
Chen, M.; Buckmaster, J.; Jackson, T. L.; Massa, L.
2002-01-01
We examine random packs of discs or spheres, models for ammonium-perchlorate-in-binder propellants, and discuss their average properties. An analytical strategy is described for calculating the mean or effective heat conduction coefficient in terms of the heat conduction coefficients of the individual components, and the results are verified by comparison with those of direct numerical simulations (dns) for both 2-D (disc) and 3-D (sphere) packs across which a temperature difference is applied. Similarly, when the surface regression speed of each component is related to the surface temperature via a simple Arrhenius law, an analytical strategy is developed for calculating an effective Arrhenius law for the combination, and these results are verified using dns in which a uniform heat flux is applied to the pack surface, causing it to regress. These results are needed for homogenization strategies necessary for fully integrated 2-D or 3-D simulations of heterogeneous propellant combustion.
Data mining for materials design: A computational study of single molecule magnet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dam, Hieu Chi; Faculty of Physics, Vietnam National University, 334 Nguyen Trai, Hanoi; Pham, Tien Lam
2014-01-28
We develop a method that combines data mining and first principles calculation to guide the designing of distorted cubane Mn{sup 4+} Mn {sub 3}{sup 3+} single molecule magnets. The essential idea of the method is a process consisting of sparse regressions and cross-validation for analyzing calculated data of the materials. The method allows us to demonstrate that the exchange coupling between Mn{sup 4+} and Mn{sup 3+} ions can be predicted from the electronegativities of constituent ligands and the structural features of the molecule by a linear regression model with high accuracy. The relations between the structural features and magnetic propertiesmore » of the materials are quantitatively and consistently evaluated and presented by a graph. We also discuss the properties of the materials and guide the material design basing on the obtained results.« less
Vesicular stomatitis forecasting based on Google Trends
Lu, Yi; Zhou, GuangYa; Chen, Qin
2018-01-01
Background Vesicular stomatitis (VS) is an important viral disease of livestock. The main feature of VS is irregular blisters that occur on the lips, tongue, oral mucosa, hoof crown and nipple. Humans can also be infected with vesicular stomatitis and develop meningitis. This study analyses 2014 American VS outbreaks in order to accurately predict vesicular stomatitis outbreak trends. Methods American VS outbreaks data were collected from OIE. The data for VS keywords were obtained by inputting 24 disease-related keywords into Google Trends. After calculating the Pearson and Spearman correlation coefficients, it was found that there was a relationship between outbreaks and keywords derived from Google Trends. Finally, the predicted model was constructed based on qualitative classification and quantitative regression. Results For the regression model, the Pearson correlation coefficients between the predicted outbreaks and actual outbreaks are 0.953 and 0.948, respectively. For the qualitative classification model, we constructed five classification predictive models and chose the best classification predictive model as the result. The results showed, SN (sensitivity), SP (specificity) and ACC (prediction accuracy) values of the best classification predictive model are 78.52%,72.5% and 77.14%, respectively. Conclusion This study applied Google search data to construct a qualitative classification model and a quantitative regression model. The results show that the method is effective and that these two models obtain more accurate forecast. PMID:29385198
Finite Element Vibration Modeling and Experimental Validation for an Aircraft Engine Casing
NASA Astrophysics Data System (ADS)
Rabbitt, Christopher
This thesis presents a procedure for the development and validation of a theoretical vibration model, applies this procedure to a pair of aircraft engine casings, and compares select parameters from experimental testing of those casings to those from a theoretical model using the Modal Assurance Criterion (MAC) and linear regression coefficients. A novel method of determining the optimal MAC between axisymmetric results is developed and employed. It is concluded that the dynamic finite element models developed as part of this research are fully capable of modelling the modal parameters within the frequency range of interest. Confidence intervals calculated in this research for correlation coefficients provide important information regarding the reliability of predictions, and it is recommended that these intervals be calculated for all comparable coefficients. The procedure outlined for aligning mode shapes around an axis of symmetry proved useful, and the results are promising for the development of further optimization techniques.
Microbial Communities Model Parameter Calculation for TSPA/SR
DOE Office of Scientific and Technical Information (OSTI.GOV)
D. Jolley
2001-07-16
This calculation has several purposes. First the calculation reduces the information contained in ''Committed Materials in Repository Drifts'' (BSC 2001a) to useable parameters required as input to MING V1.O (CRWMS M&O 1998, CSCI 30018 V1.O) for calculation of the effects of potential in-drift microbial communities as part of the microbial communities model. The calculation is intended to replace the parameters found in Attachment II of the current In-Drift Microbial Communities Model revision (CRWMS M&O 2000c) with the exception of Section 11-5.3. Second, this calculation provides the information necessary to supercede the following DTN: M09909SPAMING1.003 and replace it with a newmore » qualified dataset (see Table 6.2-1). The purpose of this calculation is to create the revised qualified parameter input for MING that will allow {Delta}G (Gibbs Free Energy) to be corrected for long-term changes to the temperature of the near-field environment. Calculated herein are the quadratic or second order regression relationships that are used in the energy limiting calculations to potential growth of microbial communities in the in-drift geochemical environment. Third, the calculation performs an impact review of a new DTN: M00012MAJIONIS.000 that is intended to replace the currently cited DTN: GS9809083 12322.008 for water chemistry data used in the current ''In-Drift Microbial Communities Model'' revision (CRWMS M&O 2000c). Finally, the calculation updates the material lifetimes reported on Table 32 in section 6.5.2.3 of the ''In-Drift Microbial Communities'' AMR (CRWMS M&O 2000c) based on the inputs reported in BSC (2001a). Changes include adding new specified materials and updating old materials information that has changed.« less
Burning rate for steel-cased, pressed binderless HMX
NASA Technical Reports Server (NTRS)
Fifer, R. A.; Cole, J. E.
1980-01-01
The burning behavior of pressed binderless HMX laterally confined in 6.4 mm i.d. steel cases was measured over the pressure range 1.45 to 338 MPa in a constant pressure strand burner. The measured regression rates are compared to those reported previously for unconfined samples. It is shown that lateral confinement results in a several-fold decrease in the regression rate for the coarse particle size HMX above the transition to super fast regression. For class E samples, confinement shifts the transition to super fast regression from low pressure to high pressure. These results are interpreted in terms of the previously proposed progressive deconsolidation mechanism. Preliminary holographic photography and closed bomb tests are also described. Theoretical one dimensional modeling calculations were carried out to predict the expected flame height (particle burn out distance) as a function of particle size and pressure for binderless HMX burning by a progressive deconsolidation mechanism.
Fananapazir, Ghaneh; Benzl, Robert; Corwin, Michael T; Chen, Ling-Xin; Sageshima, Junichiro; Stewart, Susan L; Troppmann, Christoph
2018-07-01
Purpose To determine whether the predonation computed tomography (CT)-based volume of the future remnant kidney is predictive of postdonation renal function in living kidney donors. Materials and Methods This institutional review board-approved, retrospective, HIPAA-compliant study included 126 live kidney donors who had undergone predonation renal CT between January 2007 and December 2014 as well as 2-year postdonation measurement of estimated glomerular filtration rate (eGFR). The whole kidney volume and cortical volume of the future remnant kidney were measured and standardized for body surface area (BSA). Bivariate linear associations between the ratios of whole kidney volume to BSA and cortical volume to BSA were obtained. A linear regression model for 2-year postdonation eGFR that incorporated donor age, sex, and either whole kidney volume-to-BSA ratio or cortical volume-to-BSA ratio was created, and the coefficient of determination (R 2 ) for the model was calculated. Factors not statistically additive in assessing 2-year eGFR were removed by using backward elimination, and the coefficient of determination for this parsimonious model was calculated. Results Correlation was slightly better for cortical volume-to-BSA ratio than for whole kidney volume-to-BSA ratio (r = 0.48 vs r = 0.44, respectively). The linear regression model incorporating all donor factors had an R 2 of 0.66. The only factors that were significantly additive to the equation were cortical volume-to-BSA ratio and predonation eGFR (P = .01 and P < .01, respectively), and the final parsimonious linear regression model incorporating these two variables explained almost the same amount of variance (R 2 = 0.65) as did the full model. Conclusion The cortical volume of the future remnant kidney helped predict postdonation eGFR at 2 years. The cortical volume-to-BSA ratio should thus be considered for addition as an important variable to living kidney donor evaluation and selection guidelines. © RSNA, 2018.
Seasonal Variation in Physical Activity among Preschool Children in a Northern Canadian City
ERIC Educational Resources Information Center
Carson, Valerie; Spence, John C.; Cutumisu, Nicoleta; Boule, Normand; Edwards, Joy
2010-01-01
Little research has examined seasonal differences in physical activity (PA) levels among children. Proxy reports of PA were completed by 1,715 parents on their children in Edmonton, Alberta, Canada. Total PA (TPA) minutes were calculated, and each participant was classified as active, somewhat active, or inactive. Logistic regression models were…
A Meta-Analysis of Peer-Mediated Interventions for Young Children with Autism Spectrum Disorders
ERIC Educational Resources Information Center
Zhang, Jie; Wheeler, John J.
2011-01-01
This meta-analysis investigated the efficacy of peer-mediated interventions for promoting social interactions among children from birth to eight years of age diagnosed with ASD. Forty-five single-subject design studies were analyzed and the effect sizes were calculated by the regression model developed by Allison and Gorman (1993). The overall…
NASA Astrophysics Data System (ADS)
Cai, Jun; Wang, Kuaishe; Shi, Jiamin; Wang, Wen; Liu, Yingying
2018-01-01
Constitutive analysis for hot working of BFe10-1-2 alloy was carried out by using experimental stress-strain data from isothermal hot compression tests, in a wide range of temperature of 1,023 1,273 K, and strain rate range of 0.001 10 s-1. A constitutive equation based on modified double multiple nonlinear regression was proposed considering the independent effects of strain, strain rate, temperature and their interrelation. The predicted flow stress data calculated from the developed equation was compared with the experimental data. Correlation coefficient (R), average absolute relative error (AARE) and relative errors were introduced to verify the validity of the developed constitutive equation. Subsequently, a comparative study was made on the capability of strain-compensated Arrhenius-type constitutive model. The results showed that the developed constitutive equation based on modified double multiple nonlinear regression could predict flow stress of BFe10-1-2 alloy with good correlation and generalization.
Numerical investigation on the regression rate of hybrid rocket motor with star swirl fuel grain
NASA Astrophysics Data System (ADS)
Zhang, Shuai; Hu, Fan; Zhang, Weihua
2016-10-01
Although hybrid rocket motor is prospected to have distinct advantages over liquid and solid rocket motor, low regression rate and insufficient efficiency are two major disadvantages which have prevented it from being commercially viable. In recent years, complex fuel grain configurations are attractive in overcoming the disadvantages with the help of Rapid Prototyping technology. In this work, an attempt has been made to numerically investigate the flow field characteristics and local regression rate distribution inside the hybrid rocket motor with complex star swirl grain. A propellant combination with GOX and HTPB has been chosen. The numerical model is established based on the three dimensional Navier-Stokes equations with turbulence, combustion, and coupled gas/solid phase formulations. The calculated fuel regression rate is compared with the experimental data to validate the accuracy of numerical model. The results indicate that, comparing the star swirl grain with the tube grain under the conditions of the same port area and the same grain length, the burning surface area rises about 200%, the spatially averaged regression rate rises as high as about 60%, and the oxidizer can combust sufficiently due to the big vortex around the axis in the aft-mixing chamber. The combustion efficiency of star swirl grain is better and more stable than that of tube grain.
Paul, Christoph; Heun, Christine; Müller, Hans-Helge; Hoerauf, Hans; Feltgen, Nicolas; Wachtlin, Joachim; Kaymak, Hakan; Mennel, Stefan; Koss, Michael Janusz; Fauser, Sascha; Maier, Mathias M; Schumann, Ricarda G; Mueller, Simone; Chang, Petrus; Schmitz-Valckenberg, Steffen; Kazerounian, Sara; Szurman, Peter; Lommatzsch, Albrecht; Bertelmann, Thomas
2017-10-31
To evaluate predictive factors for the treatment success of ocriplasmin and to use these factors to generate a multivariate model to calculate the individual probability of successful treatment. Data were collected in a retrospective, multicentre cohort study. Patients with vitreomacular traction (VMT) syndrome without a full-thickness macular hole were included if they received an intravitreal injection (IVI) of ocriplasmin. Five factors (age, gender, lens status, presence of epiretinal membrane (ERM) formation and horizontal diameter of VMT) were assessed on their association with VMT resolution. A multivariable logistic regression model was employed to further analyse these factors and calculate the individual probability of successful treatment. 167 eyes of 167 patients were included. Univariate analysis revealed a significant correlation to VMT resolution for all analysed factors: age (years) (OR 0.9208; 95% CI 0.8845 to 0.9586; p<0.0001), gender (male) (OR 0.480; 95% CI 0.241 to 0.957; p=0.0371), lens status (phakic) (OR 2.042; 95% CI 1.054 to 3.958; p=0.0344), ERM formation (present) (OR 0.384; 95% CI 0.179 to 0.821; p=0.0136) and horizontal VMT diameter (µm) (OR 0.99812; 95% CI 0.99684 to 0.99941, p=0.0042). A significant multivariable logistic regression model was established with age and VMT diameter. Known predictive factors for VMT resolution after ocriplasmin IVI were confirmed in our study. We were able to combine them into a formula, ultimately allowing the calculation of an individual probability of treatment success with ocriplasmin in patients with VMT syndrome without FTHM. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Wang, X; Xu, Y H; Du, Z Y; Qian, Y J; Xu, Z H; Chen, R; Shi, M H
2018-02-23
Objective: This study aims to analyze the relationship among the clinical features, radiologic characteristics and pathological diagnosis in patients with solitary pulmonary nodules, and establish a prediction model for the probability of malignancy. Methods: Clinical data of 372 patients with solitary pulmonary nodules who underwent surgical resection with definite postoperative pathological diagnosis were retrospectively analyzed. In these cases, we collected clinical and radiologic features including gender, age, smoking history, history of tumor, family history of cancer, the location of lesion, ground-glass opacity, maximum diameter, calcification, vessel convergence sign, vacuole sign, pleural indentation, speculation and lobulation. The cases were divided to modeling group (268 cases) and validation group (104 cases). A new prediction model was established by logistic regression analying the data from modeling group. Then the data of validation group was planned to validate the efficiency of the new model, and was compared with three classical models(Mayo model, VA model and LiYun model). With the calculated probability values for each model from validation group, SPSS 22.0 was used to draw the receiver operating characteristic curve, to assess the predictive value of this new model. Results: 112 benign SPNs and 156 malignant SPNs were included in modeling group. Multivariable logistic regression analysis showed that gender, age, history of tumor, ground -glass opacity, maximum diameter, and speculation were independent predictors of malignancy in patients with SPN( P <0.05). We calculated a prediction model for the probability of malignancy as follow: p =e(x)/(1+ e(x)), x=-4.8029-0.743×gender+ 0.057×age+ 1.306×history of tumor+ 1.305×ground-glass opacity+ 0.051×maximum diameter+ 1.043×speculation. When the data of validation group was added to the four-mathematical prediction model, The area under the curve of our mathematical prediction model was 0.742, which is greater than other models (Mayo 0.696, VA 0.634, LiYun 0.681), while the differences between any two of the four models were not significant ( P >0.05). Conclusions: Age of patient, gender, history of tumor, ground-glass opacity, maximum diameter and speculation are independent predictors of malignancy in patients with solitary pulmonary nodule. This logistic regression prediction mathematic model is not inferior to those classical models in estimating the prognosis of SPNs.
Hill, Mary Catherine
1992-01-01
This report documents a new version of the U.S. Geological Survey modular, three-dimensional, finite-difference, ground-water flow model (MODFLOW) which, with the new Parameter-Estimation Package that also is documented in this report, can be used to estimate parameters by nonlinear regression. The new version of MODFLOW is called MODFLOWP (pronounced MOD-FLOW*P), and functions nearly identically to MODFLOW when the ParameterEstimation Package is not used. Parameters are estimated by minimizing a weighted least-squares objective function by the modified Gauss-Newton method or by a conjugate-direction method. Parameters used to calculate the following MODFLOW model inputs can be estimated: Transmissivity and storage coefficient of confined layers; hydraulic conductivity and specific yield of unconfined layers; vertical leakance; vertical anisotropy (used to calculate vertical leakance); horizontal anisotropy; hydraulic conductance of the River, Streamflow-Routing, General-Head Boundary, and Drain Packages; areal recharge rates; maximum evapotranspiration; pumpage rates; and the hydraulic head at constant-head boundaries. Any spatial variation in parameters can be defined by the user. Data used to estimate parameters can include existing independent estimates of parameter values, observed hydraulic heads or temporal changes in hydraulic heads, and observed gains and losses along head-dependent boundaries (such as streams). Model output includes statistics for analyzing the parameter estimates and the model; these statistics can be used to quantify the reliability of the resulting model, to suggest changes in model construction, and to compare results of models constructed in different ways.
Isotani, Shuji; Shimoyama, Hirofumi; Yokota, Isao; Noma, Yasuhiro; Kitamura, Kousuke; China, Toshiyuki; Saito, Keisuke; Hisasue, Shin-ichi; Ide, Hisamitsu; Muto, Satoru; Yamaguchi, Raizo; Ukimura, Osamu; Gill, Inderbir S; Horie, Shigeo
2015-10-01
The predictive model of postoperative renal function may impact on planning nephrectomy. To develop the novel predictive model using combination of clinical indices with computer volumetry to measure the preserved renal cortex volume (RCV) using multidetector computed tomography (MDCT), and to prospectively validate performance of the model. Total 60 patients undergoing radical nephrectomy from 2011 to 2013 participated, including a development cohort of 39 patients and an external validation cohort of 21 patients. RCV was calculated by voxel count using software (Vincent, FUJIFILM). Renal function before and after radical nephrectomy was assessed via the estimated glomerular filtration rate (eGFR). Factors affecting postoperative eGFR were examined by regression analysis to develop the novel model for predicting postoperative eGFR with a backward elimination method. The predictive model was externally validated and the performance of the model was compared with that of the previously reported models. The postoperative eGFR value was associated with age, preoperative eGFR, preserved renal parenchymal volume (RPV), preserved RCV, % of RPV alteration, and % of RCV alteration (p < 0.01). The significant correlated variables for %eGFR alteration were %RCV preservation (r = 0.58, p < 0.01) and %RPV preservation (r = 0.54, p < 0.01). We developed our regression model as follows: postoperative eGFR = 57.87 - 0.55(age) - 15.01(body surface area) + 0.30(preoperative eGFR) + 52.92(%RCV preservation). Strong correlation was seen between postoperative eGFR and the calculated estimation model (r = 0.83; p < 0.001). The external validation cohort (n = 21) showed our model outperformed previously reported models. Combining MDCT renal volumetry and clinical indices might yield an important tool for predicting postoperative renal function.
A forecasting method to reduce estimation bias in self-reported cell phone data.
Redmayne, Mary; Smith, Euan; Abramson, Michael J
2013-01-01
There is ongoing concern that extended exposure to cell phone electromagnetic radiation could be related to an increased risk of negative health effects. Epidemiological studies seek to assess this risk, usually relying on participants' recalled use, but recall is notoriously poor. Our objectives were primarily to produce a forecast method, for use by such studies, to reduce estimation bias in the recalled extent of cell phone use. The method we developed, using Bayes' rule, is modelled with data we collected in a cross-sectional cluster survey exploring cell phone user-habits among New Zealand adolescents. Participants recalled their recent extent of SMS-texting and retrieved from their provider the current month's actual use-to-date. Actual use was taken as the gold standard in the analyses. Estimation bias arose from a large random error, as observed in all cell phone validation studies. We demonstrate that this seriously exaggerates upper-end forecasts of use when used in regression models. This means that calculations using a regression model will lead to underestimation of heavy-users' relative risk. Our Bayesian method substantially reduces estimation bias. In cases where other studies' data conforms to our method's requirements, application should reduce estimation bias, leading to a more accurate relative risk calculation for mid-to-heavy users.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2015-09-14
This package contains statistical routines for extracting features from multivariate time-series data which can then be used for subsequent multivariate statistical analysis to identify patterns and anomalous behavior. It calculates local linear or quadratic regression model fits to moving windows for each series and then summarizes the model coefficients across user-defined time intervals for each series. These methods are domain agnostic-but they have been successfully applied to a variety of domains, including commercial aviation and electric power grid data.
Development and validation of a mortality risk model for pediatric sepsis.
Chen, Mengshi; Lu, Xiulan; Hu, Li; Liu, Pingping; Zhao, Wenjiao; Yan, Haipeng; Tang, Liang; Zhu, Yimin; Xiao, Zhenghui; Chen, Lizhang; Tan, Hongzhuan
2017-05-01
Pediatric sepsis is a burdensome public health problem. Assessing the mortality risk of pediatric sepsis patients, offering effective treatment guidance, and improving prognosis to reduce mortality rates, are crucial.We extracted data derived from electronic medical records of pediatric sepsis patients that were collected during the first 24 hours after admission to the pediatric intensive care unit (PICU) of the Hunan Children's hospital from January 2012 to June 2014. A total of 788 children were randomly divided into a training (592, 75%) and validation group (196, 25%). The risk factors for mortality among these patients were identified by conducting multivariate logistic regression in the training group. Based on the established logistic regression equation, the logit probabilities for all patients (in both groups) were calculated to verify the model's internal and external validities.According to the training group, 6 variables (brain natriuretic peptide, albumin, total bilirubin, D-dimer, lactate levels, and mechanical ventilation in 24 hours) were included in the final logistic regression model. The areas under the curves of the model were 0.854 (0.826, 0.881) and 0.844 (0.816, 0.873) in the training and validation groups, respectively.The Mortality Risk Model for Pediatric Sepsis we established in this study showed acceptable accuracy to predict the mortality risk in pediatric sepsis patients.
Development and validation of a mortality risk model for pediatric sepsis
Chen, Mengshi; Lu, Xiulan; Hu, Li; Liu, Pingping; Zhao, Wenjiao; Yan, Haipeng; Tang, Liang; Zhu, Yimin; Xiao, Zhenghui; Chen, Lizhang; Tan, Hongzhuan
2017-01-01
Abstract Pediatric sepsis is a burdensome public health problem. Assessing the mortality risk of pediatric sepsis patients, offering effective treatment guidance, and improving prognosis to reduce mortality rates, are crucial. We extracted data derived from electronic medical records of pediatric sepsis patients that were collected during the first 24 hours after admission to the pediatric intensive care unit (PICU) of the Hunan Children's hospital from January 2012 to June 2014. A total of 788 children were randomly divided into a training (592, 75%) and validation group (196, 25%). The risk factors for mortality among these patients were identified by conducting multivariate logistic regression in the training group. Based on the established logistic regression equation, the logit probabilities for all patients (in both groups) were calculated to verify the model's internal and external validities. According to the training group, 6 variables (brain natriuretic peptide, albumin, total bilirubin, D-dimer, lactate levels, and mechanical ventilation in 24 hours) were included in the final logistic regression model. The areas under the curves of the model were 0.854 (0.826, 0.881) and 0.844 (0.816, 0.873) in the training and validation groups, respectively. The Mortality Risk Model for Pediatric Sepsis we established in this study showed acceptable accuracy to predict the mortality risk in pediatric sepsis patients. PMID:28514310
Three-parameter modeling of the soil sorption of acetanilide and triazine herbicide derivatives.
Freitas, Mirlaine R; Matias, Stella V B G; Macedo, Renato L G; Freitas, Matheus P; Venturin, Nelson
2014-02-01
Herbicides have widely variable toxicity and many of them are persistent soil contaminants. Acetanilide and triazine family of herbicides have widespread use, but increasing interest for the development of new herbicides has been rising to increase their effectiveness and to diminish environmental hazard. The environmental risk of new herbicides can be accessed by estimating their soil sorption (logKoc), which is usually correlated to the octanol/water partition coefficient (logKow). However, earlier findings have shown that this correlation is not valid for some acetanilide and triazine herbicides. Thus, easily accessible quantitative structure-property relationship models are required to predict logKoc of analogues of the these compounds. Octanol/water partition coefficient, molecular weight and volume were calculated and then regressed against logKoc for two series of acetanilide and triazine herbicides using multiple linear regression, resulting in predictive and validated models.
NASA Astrophysics Data System (ADS)
Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza
2014-10-01
The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of these numerical calculations, using the multivariate adaptive regression splines (MARS) technique, conclusions of this research work are exposed.
Schüle, Steffen Andreas; Gabriel, Katharina M A; Bolte, Gabriele
2017-06-01
The environmental justice framework states that besides environmental burdens also resources may be social unequally distributed both on the individual and on the neighbourhood level. This ecological study investigated whether neighbourhood socioeconomic position (SEP) was associated with neighbourhood public green space availability in a large German city with more than 1 million inhabitants. Two different measures were defined for green space availability. Firstly, percentage of green space within neighbourhoods was calculated with the additional consideration of various buffers around the boundaries. Secondly, percentage of green space was calculated based on various radii around the neighbourhood centroid. An index of neighbourhood SEP was calculated with principal component analysis. Log-gamma regression from the group of generalized linear models was applied in order to consider the non-normal distribution of the response variable. All models were adjusted for population density. Low neighbourhood SEP was associated with decreasing neighbourhood green space availability including 200m up to 1000m buffers around the neighbourhood boundaries. Low neighbourhood SEP was also associated with decreasing green space availability based on catchment areas measured from neighbourhood centroids with different radii (1000m up to 3000 m). With an increasing radius the strength of the associations decreased. Social unequally distributed green space may amplify environmental health inequalities in an urban context. Thus, the identification of vulnerable neighbourhoods and population groups plays an important role for epidemiological research and healthy city planning. As a methodical aspect, log-gamma regression offers an adequate parametric modelling strategy for positively distributed environmental variables. Copyright © 2017 Elsevier GmbH. All rights reserved.
NASA Astrophysics Data System (ADS)
Hadley, Brian Christopher
This dissertation assessed remotely sensed data and geospatial modeling technique(s) to map the spatial distribution of total above-ground biomass present on the surface of the Savannah River National Laboratory's (SRNL) Mixed Waste Management Facility (MWMF) hazardous waste landfill. Ordinary least squares (OLS) regression, regression kriging, and tree-structured regression were employed to model the empirical relationship between in-situ measured Bahia (Paspalum notatum Flugge) and Centipede [Eremochloa ophiuroides (Munro) Hack.] grass biomass against an assortment of explanatory variables extracted from fine spatial resolution passive optical and LIDAR remotely sensed data. Explanatory variables included: (1) discrete channels of visible, near-infrared (NIR), and short-wave infrared (SWIR) reflectance, (2) spectral vegetation indices (SVI), (3) spectral mixture analysis (SMA) modeled fractions, (4) narrow-band derivative-based vegetation indices, and (5) LIDAR derived topographic variables (i.e. elevation, slope, and aspect). Results showed that a linear combination of the first- (1DZ_DGVI), second- (2DZ_DGVI), and third-derivative of green vegetation indices (3DZ_DGVI) calculated from hyperspectral data recorded over the 400--960 nm wavelengths of the electromagnetic spectrum explained the largest percentage of statistical variation (R2 = 0.5184) in the total above-ground biomass measurements. In general, the topographic variables did not correlate well with the MWMF biomass data, accounting for less than five percent of the statistical variation. It was concluded that tree-structured regression represented the optimum geospatial modeling technique due to a combination of model performance and efficiency/flexibility factors.
Modified physiologically equivalent temperature—basics and applications for western European climate
NASA Astrophysics Data System (ADS)
Chen, Yung-Chang; Matzarakis, Andreas
2018-05-01
A new thermal index, the modified physiologically equivalent temperature (mPET) has been developed for universal application in different climate zones. The mPET has been improved against the weaknesses of the original physiologically equivalent temperature (PET) by enhancing evaluation of the humidity and clothing variability. The principles of mPET and differences between original PET and mPET are introduced and discussed in this study. Furthermore, this study has also evidenced the usability of mPET with climatic data in Freiburg, which is located in Western Europe. Comparisons of PET, mPET, and Universal Thermal Climate Index (UTCI) have shown that mPET gives a more realistic estimation of human thermal sensation than the other two thermal indices (PET, UTCI) for the thermal conditions in Freiburg. Additionally, a comparison of physiological parameters between mPET model and PET model (Munich Energy Balance Model for Individual, namely MEMI) is proposed. The core temperatures and skin temperatures of PET model vary more violently to a low temperature during cold stress than the mPET model. It can be regarded as that the mPET model gives a more realistic core temperature and mean skin temperature than the PET model. Statistical regression analysis of mPET based on the air temperature, mean radiant temperature, vapor pressure, and wind speed has been carried out. The R square (0.995) has shown a well co-relationship between human biometeorological factors and mPET. The regression coefficient of each factor represents the influence of the each factor on changing mPET (i.e., ±1 °C of T a = ± 0.54 °C of mPET). The first-order regression has been considered predicting a more realistic estimation of mPET at Freiburg during 2003 than the other higher order regression model, because the predicted mPET from the first-order regression has less difference from mPET calculated from measurement data. Statistic tests recognize that mPET can effectively evaluate the influences of all human biometeorological factors on thermal environments. Moreover, a first-order regression function can also predict the thermal evaluations of the mPET by using human biometeorological factors in Freiburg.
Sano, Yuko; Kandori, Akihiko; Shima, Keisuke; Yamaguchi, Yuki; Tsuji, Toshio; Noda, Masafumi; Higashikawa, Fumiko; Yokoe, Masaru; Sakoda, Saburo
2016-06-01
We propose a novel index of Parkinson's disease (PD) finger-tapping severity, called "PDFTsi," for quantifying the severity of symptoms related to the finger tapping of PD patients with high accuracy. To validate the efficacy of PDFTsi, the finger-tapping movements of normal controls and PD patients were measured by using magnetic sensors, and 21 characteristics were extracted from the finger-tapping waveforms. To distinguish motor deterioration due to PD from that due to aging, the aging effect on finger tapping was removed from these characteristics. Principal component analysis (PCA) was applied to the age-normalized characteristics, and principal components that represented the motion properties of finger tapping were calculated. Multiple linear regression (MLR) with stepwise variable selection was applied to the principal components, and PDFTsi was calculated. The calculated PDFTsi indicates that PDFTsi has a high estimation ability, namely a mean square error of 0.45. The estimation ability of PDFTsi is higher than that of the alternative method, MLR with stepwise regression selection without PCA, namely a mean square error of 1.30. This result suggests that PDFTsi can quantify PD finger-tapping severity accurately. Furthermore, the result of interpreting a model for calculating PDFTsi indicated that motion wideness and rhythm disorder are important for estimating PD finger-tapping severity.
Fernández-Fernández, Mario; Rodríguez-González, Pablo; García Alonso, J Ignacio
2016-10-01
We have developed a novel, rapid and easy calculation procedure for Mass Isotopomer Distribution Analysis based on multiple linear regression which allows the simultaneous calculation of the precursor pool enrichment and the fraction of newly synthesized labelled proteins (fractional synthesis) using linear algebra. To test this approach, we used the peptide RGGGLK as a model tryptic peptide containing three subunits of glycine. We selected glycine labelled in two 13 C atoms ( 13 C 2 -glycine) as labelled amino acid to demonstrate that spectral overlap is not a problem in the proposed methodology. The developed methodology was tested first in vitro by changing the precursor pool enrichment from 10 to 40% of 13 C 2 -glycine. Secondly, a simulated in vivo synthesis of proteins was designed by combining the natural abundance RGGGLK peptide and 10 or 20% 13 C 2 -glycine at 1 : 1, 1 : 3 and 3 : 1 ratios. Precursor pool enrichments and fractional synthesis values were calculated with satisfactory precision and accuracy using a simple spreadsheet. This novel approach can provide a relatively rapid and easy means to measure protein turnover based on stable isotope tracers. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
The MSFC Solar Activity Future Estimation (MSAFE) Model
NASA Technical Reports Server (NTRS)
Suggs, Ron
2017-01-01
The MSAFE model provides forecasts for the solar indices SSN, F10.7, and Ap. These solar indices are used as inputs to space environment models used in orbital spacecraft operations and space mission analysis. Forecasts from the MSAFE model are provided on the MSFC Natural Environments Branch's solar web page and are updated as new monthly observations become available. The MSAFE prediction routine employs a statistical technique that calculates deviations of past solar cycles from the mean cycle and performs a regression analysis to calculate the deviation from the mean cycle of the solar index at the next future time interval. The forecasts are initiated for a given cycle after about 8 to 9 monthly observations from the start of the cycle are collected. A forecast made at the beginning of cycle 24 using the MSAFE program captured the cycle fairly well with some difficulty in discerning the double peak that occurred at solar cycle maximum.
Indriect Measurement Of Nitrogen In A Mult-Component Natural Gas By Heating The Gas
Morrow, Thomas B.; Behring, II, Kendricks A.
2004-06-22
Methods of indirectly measuring the nitrogen concentration in a natural gas by heating the gas. In two embodiments, the heating energy is correlated to the speed of sound in the gas, the diluent concentrations in the gas, and constant values, resulting in a model equation. Regression analysis is used to calculate the constant values, which can then be substituted into the model equation. If the diluent concentrations other than nitrogen (typically carbon dioxide) are known, the model equation can be solved for the nitrogen concentration.
TI-59 Programs for Multiple Regression.
1980-05-01
general linear hypothesis model of full rank [ Graybill , 19611 can be written as Y = x 8 + C , s-N(O,o 2I) nxl nxk kxl nxl where Y is the vector of n...a "reduced model " solution, and confidence intervals for linear functions of the coefficients can be obtained using (x’x) and a2, based on the t...O107)l UA.LLL. Library ModuIe NASTER -Puter 0NTINA Cards 1 PROGRAM DESCRIPTION (s s 2 ror the general linear hypothesis model Y - XO + C’ calculates
Armas, Cecilia María; Santana, Bayanor; Mora, Juan Luis; Notario, Jesús Santiago; Arbelo, Carmen Dolores; Rodríguez-Rodríguez, Antonio
2007-05-25
The aim of this work is to identify indicators of biological activity in soils from the Canary Islands, by studying the variation of selected biological parameters related to the processes of deforestation and accelerated soil degradation affecting the Canarian natural ecosystems. Ten plots with different degrees of maturity/degradation have been selected in three typical habitats in the Canary Islands: laurel forest, pine forest and xerophytic scrub with Andisols and Aridisols as the most common soils. The studied characteristics in each case include total organic carbon, field soil respiration, mineralized carbon after laboratory incubation, microbial biomass carbon, hot water-extractable carbon and carboxymethylcellulase, beta-d-glucosidase and dehydrogenase activities. A Biological Quality Index (BQI) has been designed on the basis of a regression model using these variables, assuming that the total soil organic carbon content is quite stable in nearly mature ecosystems. Total carbon in mature ecosystems has been related to significant biological variables (hot water-extractable carbon, soil respiration and carboxymethylcellulase, beta-d-glucosidase and dehydrogenase activities), accounting for nearly 100% of the total variance by a multiple regression analysis. The index has been calculated as the ratio of the value calculated from the regression model and the actual measured value. The obtained results show that soils in nearly mature ecosystems have BQI values close to unit, whereas those in degraded ecosystems range between 0.24 and 0.97, depending on the degradation degree.
Live Donor Renal Anatomic Asymmetry and Posttransplant Renal Function.
Tanriover, Bekir; Fernandez, Sonalis; Campenot, Eric S; Newhouse, Jeffrey H; Oyfe, Irina; Mohan, Prince; Sandikci, Burhaneddin; Radhakrishnan, Jai; Wexler, Jennifer J; Carroll, Maureen A; Sharif, Sairah; Cohen, David J; Ratner, Lloyd E; Hardy, Mark A
2015-08-01
Relationship between live donor renal anatomic asymmetry and posttransplant recipient function has not been studied extensively. We analyzed 96 live kidney donors, who had anatomical asymmetry (>10% renal length and/or volume difference calculated from computerized tomography angiograms) and their matching recipients. Split function differences (SFD) were quantified with technetium-dimercaptosuccinic acid renography. Implantation biopsies at time 0 were semiquantitatively scored. A comprehensive model using donor renal volume adjusted to recipient weight (Vol/Wgt), SFD, and biopsy score was used to predict recipient estimated glomerular filtration rate (eGFR) at 1 year. Primary analysis consisted of a logistic regression model of outcome (odds of developing eGFR>60 mL/min/1.73 m(2) at 1 year), a linear regression model of outcome (predicting recipient eGFR at one-year, using the chronic kidney disease-epidemiology collaboration formula), and a Monte Carlo simulation based on the linear regression model (N=10,000 iterations). In the study cohort, the mean Vol/Wgt and eGFR at 1 year were 2.04 mL/kg and 60.4 mL/min/1.73 m(2), respectively. Volume and split ratios between 2 donor kidneys were strongly correlated (r = 0.79, P < 0.001). The biopsy scores among SFD categories (<5%, 5%-10%, >10%) were not different (P = 0.190). On multivariate models, only Vol/Wgt was significantly associated with higher odds of having eGFR > 60 mL/min/1.73 m (odds ratio, 8.94, 95% CI 2.47-32.25, P = 0.001) and had a strong discriminatory power in predicting the risk of eGFR less than 60 mL/min/1.73 m(2) at 1 year [receiver operating curve (ROC curve), 0.78, 95% CI, 0.68-0.89]. In the presence of donor renal anatomic asymmetry, Vol/Wgt appears to be a major determinant of recipient renal function at 1 year after transplantation. Renography can be replaced with CT volume calculation in estimating split renal function.
Live Donor Renal Anatomic Asymmetry and Post-Transplant Renal Function
Tanriover, Bekir; Fernandez, Sonalis; Campenot, Eric S.; Newhouse, Jeffrey H.; Oyfe, Irina; Mohan, Prince; Sandikci, Burhaneddin; Radhakrishnan, Jai; Wexler, Jennifer J.; Carroll, Maureen A.; Sharif, Sairah; Cohen, David J.; Ratner, Lloyd E.; Hardy, Mark A.
2014-01-01
Background Relationship between live donor renal anatomic asymmetry and post-transplant recipient function has not been studied extensively. Methods We analyzed 96 live-kidney donors, who had anatomical asymmetry (>10% renal length and/or volume difference calculated from CT angiograms) and their matching recipients. Split function differences (SFD) were quantified with 99mTc-DMSA renography. Implantation biopsies at time-zero were semi-quantitatively scored. A comprehensive model utilizing donor renal volume adjusted to recipient weight (Vol/Wgt), SFD, and biopsy score was used to predict recipient estimated glomerular filtration rate (eGFR) at one-year. Primary analysis consisted of a logistic regression model of outcome (odds of developing eGFR>60ml/min/1.73 m2 at one-year), a linear regression model of outcome (predicting recipient eGFR at one-year, using the CKD-EPI formula), and a Monte Carlo simulation based on the linear regression model (N=10,000 iterations). Results In the study cohort, the mean Vol/Wgt and eGFR at one-year were 2.04 ml/kg and 60.4 ml/min/1.73m2, respectively. Volume and split ratios between two donor kidneys were strongly correlated (r=0.79, p-value<0.001). The biopsy scores among SFD categories (<5%, 5–10%, >10%) were not different (p=0.190). On multivariate models, only Vol/Wgt was significantly associated with higher odds of having eGFR>60ml/min/1.73 m2 (OR=8.94, 95% CI 2.47–32.25, p=0.001) and had a strong discriminatory power in predicting the risk of eGFR<60ml/min/1.73m2 at one-year (ROC curve=0.78, 95% CI 0.68–0.89). Conclusion In the presence of donor renal anatomic asymmetry, Vol/Wgt appears to be a major determinant of recipient renal function at one-year post-transplantation. Renography can be replaced with CT volume calculation in estimating split renal function. PMID:25719258
A Proposal for Phase 4 of the Forest Inventory and Analysis Program
Ronald E. McRoberts
2005-01-01
Maps of forest cover were constructed using observations from forest inventory plots, Landsat Thematic Mapper satellite imagery, and a logistic regression model. Estimates of mean proportion forest area and the variance of the mean were calculated for circular study areas with radii ranging from 1 km to 15 km. The spatial correlation among pixel predictions was...
The influence of changes in land use and landscape patterns on soil erosion in a watershed.
Zhang, Shanghong; Fan, Weiwei; Li, Yueqiang; Yi, Yujun
2017-01-01
It is very important to have a good understanding of the relation between soil erosion and landscape patterns so that soil and water conservation in river basins can be optimized. In this study, this relationship was explored, using the Liusha River Watershed, China, as a case study. A distributed water and sediment model based on the Soil and Water Assessment Tool (SWAT) was developed to simulate soil erosion from different land use types in each sub-basin of the Liusha River Watershed. Observed runoff and sediment data from 1985 to 2005 and land use maps from 1986, 1995, and 2000 were used to calibrate and validate the model. The erosion modulus for each sub-basin was calculated from SWAT model results using the different land use maps and 12 landscape indices were chosen and calculated to describe the land use in each sub-basin for the different years. The variations in instead of the absolute amounts of the erosion modulus and the landscape indices for each sub-basin were used as the dependent and independent variables, respectively, for the regression equations derived from multiple linear regression. The results indicated that the variations in the erosion modulus were closely related to changes in the large patch index, patch cohesion index, modified Simpson's evenness index, and the aggregation index. From the regression equation and the corresponding landscape indices, it was found that watershed erosion can be reduced by decreasing the physical connectivity between patches, improving the evenness of the landscape patch types, enriching landscape types, and enhancing the degree of aggregation between the landscape patches. These findings will be useful for water and soil conservation and for optimizing the management of watershed landscapes. Copyright © 2016 Elsevier B.V. All rights reserved.
Predictive ability of a comprehensive incremental test in mountain bike marathon.
Ahrend, Marc-Daniel; Schneeweiss, Patrick; Martus, Peter; Niess, Andreas M; Krauss, Inga
2018-01-01
Traditional performance tests in mountain bike marathon (XCM) primarily quantify aerobic metabolism and may not describe the relevant capacities in XCM. We aimed to validate a comprehensive test protocol quantifying its intermittent demands. Forty-nine athletes (38.8±9.1 years; 38 male; 11 female) performed a laboratory performance test, including an incremental test, to determine individual anaerobic threshold (IAT), peak power output (PPO) and three maximal efforts (10 s all-out sprint, 1 min maximal effort and 5 min maximal effort). Within 2 weeks, the athletes participated in one of three XCM races (n=15, n=9 and n=25). Correlations between test variables and race times were calculated separately. In addition, multiple regression models of the predictive value of laboratory outcomes were calculated for race 3 and across all races (z-transformed data). All variables were correlated with race times 1, 2 and 3: 10 s all-out sprint (r=-0.72; r=-0.59; r=-0.61), 1 min maximal effort (r=-0.85; r=-0.84; r=-0.82), 5 min maximal effort (r=-0.57; r=-0.85; r=-0.76), PPO (r=-0.77; r=-0.73; r=-0.76) and IAT (r=-0.71; r=-0.67; r=-0.68). The best-fitting multiple regression models for race 3 (r 2 =0.868) and across all races (r 2 =0.757) comprised 1 min maximal effort, IAT and body weight. Aerobic and intermittent variables correlated least strongly with race times. Their use in a multiple regression model confirmed additional explanatory power to predict XCM performance. These findings underline the usefulness of the comprehensive incremental test to predict performance in that sport more precisely.
Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J
2015-09-01
This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using (1)H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.
2018-05-01
Bayesian method is a method that can be used to estimate the parameters of multivariate multiple regression model. Bayesian method has two distributions, there are prior and posterior distributions. Posterior distribution is influenced by the selection of prior distribution. Jeffreys’ prior distribution is a kind of Non-informative prior distribution. This prior is used when the information about parameter not available. Non-informative Jeffreys’ prior distribution is combined with the sample information resulting the posterior distribution. Posterior distribution is used to estimate the parameter. The purposes of this research is to estimate the parameters of multivariate regression model using Bayesian method with Non-informative Jeffreys’ prior distribution. Based on the results and discussion, parameter estimation of β and Σ which were obtained from expected value of random variable of marginal posterior distribution function. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart. However, in calculation of the expected value involving integral of a function which difficult to determine the value. Therefore, approach is needed by generating of random samples according to the posterior distribution characteristics of each parameter using Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
Zhong-xiang, Feng; Shi-sheng, Lu; Wei-hua, Zhang; Nan-nan, Zhang
2014-01-01
In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability. PMID:25610454
Feng, Zhong-xiang; Lu, Shi-sheng; Zhang, Wei-hua; Zhang, Nan-nan
2014-01-01
In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability.
Polynomial order selection in random regression models via penalizing adaptively the likelihood.
Corrales, J D; Munilla, S; Cantet, R J C
2015-08-01
Orthogonal Legendre polynomials (LP) are used to model the shape of additive genetic and permanent environmental effects in random regression models (RRM). Frequently, the Akaike (AIC) and the Bayesian (BIC) information criteria are employed to select LP order. However, it has been theoretically shown that neither AIC nor BIC is simultaneously optimal in terms of consistency and efficiency. Thus, the goal was to introduce a method, 'penalizing adaptively the likelihood' (PAL), as a criterion to select LP order in RRM. Four simulated data sets and real data (60,513 records, 6675 Colombian Holstein cows) were employed. Nested models were fitted to the data, and AIC, BIC and PAL were calculated for all of them. Results showed that PAL and BIC identified with probability of one the true LP order for the additive genetic and permanent environmental effects, but AIC tended to favour over parameterized models. Conversely, when the true model was unknown, PAL selected the best model with higher probability than AIC. In the latter case, BIC never favoured the best model. To summarize, PAL selected a correct model order regardless of whether the 'true' model was within the set of candidates. © 2015 Blackwell Verlag GmbH.
Motion patterns in acupuncture needle manipulation.
Seo, Yoonjeong; Lee, In-Seon; Jung, Won-Mo; Ryu, Ho-Sun; Lim, Jinwoong; Ryu, Yeon-Hee; Kang, Jung-Won; Chae, Younbyoung
2014-10-01
In clinical practice, acupuncture manipulation is highly individualised for each practitioner. Before we establish a standard for acupuncture manipulation, it is important to understand completely the manifestations of acupuncture manipulation in the actual clinic. To examine motion patterns during acupuncture manipulation, we generated a fitted model of practitioners' motion patterns and evaluated their consistencies in acupuncture manipulation. Using a motion sensor, we obtained real-time motion data from eight experienced practitioners while they conducted acupuncture manipulation using their own techniques. We calculated the average amplitude and duration of a sampled motion unit for each practitioner and, after normalisation, we generated a true regression curve of motion patterns for each practitioner using a generalised additive mixed modelling (GAMM). We observed significant differences in rotation amplitude and duration in motion samples among practitioners. GAMM showed marked variations in average regression curves of motion patterns among practitioners but there was strong consistency in motion parameters for individual practitioners. The fitted regression model showed that the true regression curve accounted for an average of 50.2% of variance in the motion pattern for each practitioner. Our findings suggest that there is great inter-individual variability between practitioners, but remarkable intra-individual consistency within each practitioner. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Development of a Bayesian model to estimate health care outcomes in the severely wounded
Stojadinovic, Alexander; Eberhardt, John; Brown, Trevor S; Hawksworth, Jason S; Gage, Frederick; Tadaki, Douglas K; Forsberg, Jonathan A; Davis, Thomas A; Potter, Benjamin K; Dunne, James R; Elster, E A
2010-01-01
Background: Graphical probabilistic models have the ability to provide insights as to how clinical factors are conditionally related. These models can be used to help us understand factors influencing health care outcomes and resource utilization, and to estimate morbidity and clinical outcomes in trauma patient populations. Study design: Thirty-two combat casualties with severe extremity injuries enrolled in a prospective observational study were analyzed using step-wise machine-learned Bayesian belief network (BBN) and step-wise logistic regression (LR). Models were evaluated using 10-fold cross-validation to calculate area-under-the-curve (AUC) from receiver operating characteristics (ROC) curves. Results: Our BBN showed important associations between various factors in our data set that could not be developed using standard regression methods. Cross-validated ROC curve analysis showed that our BBN model was a robust representation of our data domain and that LR models trained on these findings were also robust: hospital-acquired infection (AUC: LR, 0.81; BBN, 0.79), intensive care unit length of stay (AUC: LR, 0.97; BBN, 0.81), and wound healing (AUC: LR, 0.91; BBN, 0.72) showed strong AUC. Conclusions: A BBN model can effectively represent clinical outcomes and biomarkers in patients hospitalized after severe wounding, and is confirmed by 10-fold cross-validation and further confirmed through logistic regression modeling. The method warrants further development and independent validation in other, more diverse patient populations. PMID:21197361
NASA Astrophysics Data System (ADS)
Wang, Hongliang; Liu, Baohua; Ding, Zhongjun; Wang, Xiangxin
2017-02-01
Absorption-based optical sensors have been developed for the determination of water pH. In this paper, based on the preparation of a transparent sol-gel thin film with a phenol red (PR) indicator, several calculation methods, including simple linear regression analysis, quadratic regression analysis and dual-wavelength absorbance ratio analysis, were used to calculate water pH. Results of MSSRR show that dual-wavelength absorbance ratio analysis can improve the calculation accuracy of water pH in long-term measurement.
NASA Astrophysics Data System (ADS)
Press, J.; Broughton, J.; Kudela, R. M.
2014-12-01
Suspended and dissolved trace elements are key determinants of water quality in estuarine and coastal waters. High concentrations of trace element pollutants in the San Francisco Bay estuary necessitate consistent and thorough monitoring to mitigate adverse effects on biological systems and the contamination of water and food resources. Although existing monitoring programs collect annual in situ samples from fixed locations, models proposed by Benoit, Kudela, & Flegal (2010) enable calculation of the water column total concentration (WCT) and the water column dissolved concentration (WCD) of 14 trace elements in the San Francisco Bay from a more frequently sampled metric—suspended solids concentration (SSC). This study tests the application of these models with SSC calculated from remote sensing data, with the aim of validating a tool for continuous synoptic monitoring of trace elements in the San Francisco Bay. Using HICO imagery, semi-analytical and empirical SSC algorithms were tested against a USGS dataset. A single-band method with statistically significant linear fit (p < 0.001) was chosen as the proxy for SSC values. The numerical models for WCT and the distribution ratio D were applied in MATLAB with terms to account for regional and seasonal effects, and results were used to calculate WCD. The modeled results were assessed against in situ data from the San Francisco Estuary Regional Monitoring Program. Quantile regression was used to evaluate model sensitivity to the distribution of regions, and outliers displaying regional aberrations were removed before robust regression was applied. Statistically significant and highly correlated results for WCT were found for 10 elements, with goodness of fit greater than or equal to that of the original models of seven elements. WCD was successfully modeled for six elements, with goodness of fit for each exceeding that of the original models. Concentrations of Arsenic, Iron, and Lead in the southern region of the Bay were found to exceed EPA water quality criteria for human health and aquatic life. The results of this study demonstrate the potential of monitoring programs using remote observation of trace element concentrations, and provide the foundation for investigation of pollutant sources and pathways over time.
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. © 2013, The International Biometric Society.
Third molar development: measurements versus scores as age predictor.
Thevissen, P W; Fieuws, S; Willems, G
2011-10-01
Human third molar development is widely used to predict chronological age of sub adult individuals with unknown or doubted age. For these predictions, classically, the radiologically observed third molar growth and maturation is registered using a staging and related scoring technique. Measures of lengths and widths of the developing wisdom tooth and its adjacent second molar can be considered as an alternative registration. The aim of this study was to verify relations between mandibular third molar developmental stages or measurements of mandibular second molar and third molars and age. Age related performance of stages and measurements were compared to assess if measurements added information to age predictions from third molar formation stage. The sample was 340 orthopantomograms (170 females, 170 males) of individuals homogenously distributed in age between 7 and 24 years. Mandibular lower right, third and second molars, were staged following Gleiser and Hunt, length and width measurements were registered, and various ratios of these measurements were calculated. Univariable regression models with age as response and third molar stage, measurements and ratios of second and third molars as predictors, were considered. Multivariable regression models assessed if measurements or ratios added information to age prediction from third molar stage. Coefficients of determination (R(2)) and root mean squared errors (RMSE) obtained from all regression models were compared. The univariable regression model using stages as predictor yielded most accurate age predictions (males: R(2) 0.85, RMSE between 0.85 and 1.22 year; females: R(2) 0.77, RMSE between 1.19 and 2.11 year) compared to all models including measurements and ratios. The multivariable regression models indicated that measurements and ratios added no clinical relevant information to the age prediction from third molar stage. Ratios and measurements of second and third molars are less accurate age predictors than stages of developing third molars. Copyright © 2011 Elsevier Ltd. All rights reserved.
Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan
2017-01-01
This study aimed at assessing the incidence of pulmonary hypertension (PH) at newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by linear or non-linear function. By applying the linear regression method described by a first-degree equation the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of AIC criterion (criterion 3) and use of F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known- level of free thyroxin, pulmonary vascular resistance, cardiac output; new factors identified in this study- pretreatment period, age, systolic blood pressure. According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola- type) is better than the linear one. The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.
Predicting major element mineral/melt equilibria - A statistical approach
NASA Technical Reports Server (NTRS)
Hostetler, C. J.; Drake, M. J.
1980-01-01
Empirical equations have been developed for calculating the mole fractions of NaO0.5, MgO, AlO1.5, SiO2, KO0.5, CaO, TiO2, and FeO in a solid phase of initially unknown identity given only the composition of the coexisting silicate melt. The approach involves a linear multivariate regression analysis in which solid composition is expressed as a Taylor series expansion of the liquid compositions. An internally consistent precision of approximately 0.94 is obtained, that is, the nature of the liquidus phase in the input data set can be correctly predicted for approximately 94% of the entries. The composition of the liquidus phase may be calculated to better than 5 mol % absolute. An important feature of this 'generalized solid' model is its reversibility; that is, the dependent and independent variables in the linear multivariate regression may be inverted to permit prediction of the composition of a silicate liquid produced by equilibrium partial melting of a polymineralic source assemblage.
Temperature-viscosity models reassessed.
Peleg, Micha
2017-05-04
The temperature effect on viscosity of liquid and semi-liquid foods has been traditionally described by the Arrhenius equation, a few other mathematical models, and more recently by the WLF and VTF (or VFT) equations. The essence of the Arrhenius equation is that the viscosity is proportional to the absolute temperature's reciprocal and governed by a single parameter, namely, the energy of activation. However, if the absolute temperature in K in the Arrhenius equation is replaced by T + b where both T and the adjustable b are in °C, the result is a two-parameter model, which has superior fit to experimental viscosity-temperature data. This modified version of the Arrhenius equation is also mathematically equal to the WLF and VTF equations, which are known to be equal to each other. Thus, despite their dissimilar appearances all three equations are essentially the same model, and when used to fit experimental temperature-viscosity data render exactly the same very high regression coefficient. It is shown that three new hybrid two-parameter mathematical models, whose formulation bears little resemblance to any of the conventional models, can also have excellent fit with r 2 ∼ 1. This is demonstrated by comparing the various models' regression coefficients to published viscosity-temperature relationships of 40% sucrose solution, soybean oil, and 70°Bx pear juice concentrate at different temperature ranges. Also compared are reconstructed temperature-viscosity curves using parameters calculated directly from 2 or 3 data points and fitted curves obtained by nonlinear regression using a larger number of experimental viscosity measurements.
Stata Modules for Calculating Novel Predictive Performance Indices for Logistic Models.
Barkhordari, Mahnaz; Padyab, Mojgan; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza
2016-01-01
Prediction is a fundamental part of prevention of cardiovascular diseases (CVD). The development of prediction algorithms based on the multivariate regression models loomed several decades ago. Parallel with predictive models development, biomarker researches emerged in an impressively great scale. The key question is how best to assess and quantify the improvement in risk prediction offered by new biomarkers or more basically how to assess the performance of a risk prediction model. Discrimination, calibration, and added predictive value have been recently suggested to be used while comparing the predictive performances of the predictive models' with and without novel biomarkers. Lack of user-friendly statistical software has restricted implementation of novel model assessment methods while examining novel biomarkers. We intended, thus, to develop a user-friendly software that could be used by researchers with few programming skills. We have written a Stata command that is intended to help researchers obtain cut point-free and cut point-based net reclassification improvement index and (NRI) and relative and absolute Integrated discriminatory improvement index (IDI) for logistic-based regression analyses.We applied the commands to a real data on women participating the Tehran lipid and glucose study (TLGS) to examine if information of a family history of premature CVD, waist circumference, and fasting plasma glucose can improve predictive performance of the Framingham's "general CVD risk" algorithm. The command is addpred for logistic regression models. The Stata package provided herein can encourage the use of novel methods in examining predictive capacity of ever-emerging plethora of novel biomarkers.
Groundwater depth prediction in a shallow aquifer in north China by a quantile regression model
NASA Astrophysics Data System (ADS)
Li, Fawen; Wei, Wan; Zhao, Yong; Qiao, Jiale
2017-01-01
There is a close relationship between groundwater level in a shallow aquifer and the surface ecological environment; hence, it is important to accurately simulate and predict the groundwater level in eco-environmental construction projects. The multiple linear regression (MLR) model is one of the most useful methods to predict groundwater level (depth); however, the predicted values by this model only reflect the mean distribution of the observations and cannot effectively fit the extreme distribution data (outliers). The study reported here builds a prediction model of groundwater-depth dynamics in a shallow aquifer using the quantile regression (QR) method on the basis of the observed data of groundwater depth and related factors. The proposed approach was applied to five sites in Tianjin city, north China, and the groundwater depth was calculated in different quantiles, from which the optimal quantile was screened out according to the box plot method and compared to the values predicted by the MLR model. The results showed that the related factors in the five sites did not follow the standard normal distribution and that there were outliers in the precipitation and last-month (initial state) groundwater-depth factors because the basic assumptions of the MLR model could not be achieved, thereby causing errors. Nevertheless, these conditions had no effect on the QR model, as it could more effectively describe the distribution of original data and had a higher precision in fitting the outliers.
Banzato, Tommaso; Fiore, Enrico; Morgante, Massimo; Manuali, Elisabetta; Zotti, Alessandro
2016-10-01
Hepatic lipidosis is the most diffused hepatic disease in the lactating cow. A new methodology to estimate the degree of fatty infiltration of the liver in lactating cows by means of texture analysis of B-mode ultrasound images is proposed. B-mode ultrasonography of the liver was performed in 48 Holstein Friesian cows using standardized ultrasound parameters. Liver biopsies to determine the triacylglycerol content of the liver (TAGqa) were obtained from each animal. A large number of texture parameters were calculated on the ultrasound images by means of a free software. Based on the TAGqa content of the liver, 29 samples were classified as mild (TAGqa<50mg/g), 6 as moderate (50mg/g
Phobic Anxiety and Plasma Levels of Global Oxidative Stress in Women.
Hagan, Kaitlin A; Wu, Tianying; Rimm, Eric B; Eliassen, A Heather; Okereke, Olivia I
2015-01-01
Psychological distress has been hypothesized to be associated with adverse biologic states such as higher oxidative stress and inflammation. Yet, little is known about associations between a common form of distress - phobic anxiety - and global oxidative stress. Thus, we related phobic anxiety to plasma fluorescent oxidation products (FlOPs), a global oxidative stress marker. We conducted a cross-sectional analysis among 1,325 women (aged 43-70 years) from the Nurses' Health Study. Phobic anxiety was measured using the Crown-Crisp Index (CCI). Adjusted least-squares mean log-transformed FlOPs were calculated across phobic categories. Logistic regression models were used to calculate odds ratios (OR) comparing the highest CCI category (≥6 points) vs. lower scores, across FlOPs quartiles. No association was found between phobic anxiety categories and mean FlOP levels in multivariable adjusted linear models. Similarly, in multivariable logistic regression models there were no associations between FlOPs quartiles and likelihood of being in the highest phobic category. Comparing women in the highest vs. lowest FlOPs quartiles: FlOP_360: OR=0.68 (95% CI: 0.40-1.15); FlOP_320: OR=0.99 (95% CI: 0.61-1.61); FlOP_400: OR=0.92 (95% CI: 0.52, 1.63). No cross-sectional association was found between phobic anxiety and a plasma measure of global oxidative stress in this sample of middle-aged and older women.
NASA Astrophysics Data System (ADS)
Fomina, E. V.; Kozhukhova, N. I.; Sverguzova, S. V.; Fomin, A. E.
2018-05-01
In this paper, the regression equations method for design of construction material was studied. Regression and polynomial equations representing the correlation between the studied parameters were proposed. The logic design and software interface of the regression equations method focused on parameter optimization to provide the energy saving effect at the stage of autoclave aerated concrete design considering the replacement of traditionally used quartz sand by coal mining by-product such as argillite. The mathematical model represented by a quadric polynomial for the design of experiment was obtained using calculated and experimental data. This allowed the estimation of relationship between the composition and final properties of the aerated concrete. The surface response graphically presented in a nomogram allowed the estimation of concrete properties in response to variation of composition within the x-space. The optimal range of argillite content was obtained leading to a reduction of raw materials demand, development of target plastic strength of aerated concrete as well as a reduction of curing time before autoclave treatment. Generally, this method allows the design of autoclave aerated concrete with required performance without additional resource and time costs.
NASA Astrophysics Data System (ADS)
McCulley, Jonathan M.
This research investigates the application of additive manufacturing techniques for fabricating hybrid rocket fuel grains composed of porous Acrylonitrile-butadiene-styrene impregnated with paraffin wax. The digitally manufactured ABS substrate provides mechanical support for the paraffin fuel material and serves as an additional fuel component. The embedded paraffin provides an enhanced fuel regression rate while having no detrimental effect on the thermodynamic burn properties of the fuel grain. Multiple fuel grains with various ABS-to-Paraffin mass ratios were fabricated and burned with nitrous oxide. Analytical predictions for end-to-end motor performance and fuel regression are compared against static test results. Baseline fuel grain regression calculations use an enthalpy balance energy analysis with the material and thermodynamic properties based on the mean paraffin/ABS mass fractions within the fuel grain. In support of these analytical comparisons, a novel method for propagating the fuel port burn surface was developed. In this modeling approach the fuel cross section grid is modeled as an image with white pixels representing the fuel and black pixels representing empty or burned grid cells.
The validation of a human force model to predict dynamic forces resulting from multi-joint motions
NASA Technical Reports Server (NTRS)
Pandya, Abhilash K.; Maida, James C.; Aldridge, Ann M.; Hasson, Scott M.; Woolford, Barbara J.
1992-01-01
The development and validation is examined of a dynamic strength model for humans. This model is based on empirical data. The shoulder, elbow, and wrist joints were characterized in terms of maximum isolated torque, or position and velocity, in all rotational planes. This data was reduced by a least squares regression technique into a table of single variable second degree polynomial equations determining torque as a function of position and velocity. The isolated joint torque equations were then used to compute forces resulting from a composite motion, in this case, a ratchet wrench push and pull operation. A comparison of the predicted results of the model with the actual measured values for the composite motion indicates that forces derived from a composite motion of joints (ratcheting) can be predicted from isolated joint measures. Calculated T values comparing model versus measured values for 14 subjects were well within the statistically acceptable limits and regression analysis revealed coefficient of variation between actual and measured to be within 0.72 and 0.80.
NASA Astrophysics Data System (ADS)
Filho, Edilson B. A.; Moraes, Ingrid A.; Weber, Karen C.; Rocha, Gerd B.; Vasconcellos, Mário L. A. A.
2012-08-01
Morita-Baylis-Hillman Adducts (MBHA) has been recently synthesized and bio-evaluated by our research group against Leishmania amazonensis, parasite that causes cutaneous and mucocutaneous leishmaniasis. We present here a theoretical conformational study of thirty-two leismanicidal MBHA by B3LYP/6-31+g(d) calculations with Polarized Continuum Model (PCM) to simulate water influence. Intramolecular Hydrogen Bonds (IHBs) indicated to control the most conformational preferences of MBHA. Quantum Theory Atoms in Molecules (QTAIM) calculations were able to characterize these interactions at Bond Critical Point level. Compounds presenting an unusual seven member IHB between NO2 group and hydroxyl moiety, supported by experimental spectroscopic data, showed a considerable improvement of biological activity (lower IC50 values). These results are in accordance to redox NO2 mechanism of action. Based on structural observations, some molecular descriptors were calculated and submitted to Quantitative Structure-Activity Relationship (QSAR) studies through the PLS Regression Method. These studies provided a model with good validation parameters values (R2 = 0.71, Q2 = 0.61 and Qext2 = 0.92).
Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha
2012-05-01
Estimation of stature is an important parameter in identification of human remains in forensic examinations. The present study is aimed to compare the reliability and accuracy of stature estimation and to demonstrate the variability in estimated stature and actual stature using multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements; hand length, hand breadth, foot length and foot breadth taken on the left side in each subject were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. Derived multiplication factors and regression formula were applied to the hand and foot measurements in the study sample. The estimated stature from the multiplication factors and regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from regression analysis method is less than that of multiplication factor method thus, confirming that the regression analysis method is better than multiplication factor analysis in stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
The potential of composite cognitive scores for tracking progression in Huntington's disease.
Jones, Rebecca; Stout, Julie C; Labuschagne, Izelle; Say, Miranda; Justo, Damian; Coleman, Allison; Dumas, Eve M; Hart, Ellen; Owen, Gail; Durr, Alexandra; Leavitt, Blair R; Roos, Raymund; O'Regan, Alison; Langbehn, Doug; Tabrizi, Sarah J; Frost, Chris
2014-01-01
Composite scores derived from joint statistical modelling of individual risk factors are widely used to identify individuals who are at increased risk of developing disease or of faster disease progression. We investigated the ability of composite measures developed using statistical models to differentiate progressive cognitive deterioration in Huntington's disease (HD) from natural decline in healthy controls. Using longitudinal data from TRACK-HD, the optimal combinations of quantitative cognitive measures to differentiate premanifest and early stage HD individuals respectively from controls was determined using logistic regression. Composite scores were calculated from the parameters of each statistical model. Linear regression models were used to calculate effect sizes (ES) quantifying the difference in longitudinal change over 24 months between premanifest and early stage HD groups respectively and controls. ES for the composites were compared with ES for individual cognitive outcomes and other measures used in HD research. The 0.632 bootstrap was used to eliminate biases which result from developing and testing models in the same sample. In early HD, the composite score from the HD change prediction model produced an ES for difference in rate of 24-month change relative to controls of 1.14 (95% CI: 0.90 to 1.39), larger than the ES for any individual cognitive outcome and UHDRS Total Motor Score and Total Functional Capacity. In addition, this composite gave a statistically significant difference in rate of change in premanifest HD compared to controls over 24-months (ES: 0.24; 95% CI: 0.04 to 0.44), even though none of the individual cognitive outcomes produced statistically significant ES over this period. Composite scores developed using appropriate statistical modelling techniques have the potential to materially reduce required sample sizes for randomised controlled trials.
Sturm, Marc; Quinten, Sascha; Huber, Christian G.; Kohlbacher, Oliver
2007-01-01
We propose a new model for predicting the retention time of oligonucleotides. The model is based on ν support vector regression using features derived from base sequence and predicted secondary structure of oligonucleotides. Because of the secondary structure information, the model is applicable even at relatively low temperatures where the secondary structure is not suppressed by thermal denaturing. This makes the prediction of oligonucleotide retention time for arbitrary temperatures possible, provided that the target temperature lies within the temperature range of the training data. We describe different possibilities of feature calculation from base sequence and secondary structure, present the results and compare our model to existing models. PMID:17567619
Estimating boundary shear stress along vegetated streambanks with turbulent kinetic energy
NASA Astrophysics Data System (ADS)
Hopkinson, L. C.; Wynn, T.
2010-12-01
Boundary shear stress (BSS) is critical to correctly predict streambank erosion rates and stable channel design and has been estimated using turbulent kinetic energy (TKE). Typically TKE is used in ocean and fluvial environments to determine bed shear stress where the proportionality coefficient (C1) ranges from 0.19 to 0.21. TKE has only recently been used to estimate boundary shear stress along sloping streambanks. This study examined the relationship between boundary shear stress and turbulent kinetic energy along vegetated streambanks for three vegetation treatments: bare, tree, and shrub. A flume study was conducted, modeling a second order prototype stream (Tom’s Creek in Blacksburg, VA) with individual reaches dominated by two vegetation types (trees and shrubs). Boundary shear stress was measured using a flush-mount hot-film anemometer, and three-dimensional velocity was measured using an acoustic Doppler velocimeter 0.5 cm from the boundary. The relationship between TKE and BSS (BSS=C1TKE) was examined by calculating a no-intercept regression model. The calculated C1 ranged from 0.11 to 0.53 for all vegetation types (MRes=0.018-0.086). No-intercept regression models with TKE and the Reynolds stresses (τuv, τuw, and τvw) were also examined as Reynolds stresses have been used to calculate C1. There was better agreement with the reported C1 range for the TKE and Reynolds stress relationship (C1=0.17-0.21 and MRes<0.0072 for the τvw relationship) than with the measured values of shear stress, likely due to the dominance of turbulence generation. While these results are consistent with previously reported values, the relationship should be further explored with measured values of shear stress to determine the trends along hydraulically rough boundaries.
Anastasopoulou, Panagiota; Tubic, Mirnes; Schmidt, Steffen; Neumann, Rainer; Woll, Alexander; Härtel, Sascha
2014-01-01
The measurement of activity energy expenditure (AEE) via accelerometry is the most commonly used objective method for assessing human daily physical activity and has gained increasing importance in the medical, sports and psychological science research in recent years. The purpose of this study was to determine which of the following procedures is more accurate to determine the energy cost during the most common everyday life activities; a single regression or an activity based approach. For this we used a device that utilizes single regression models (GT3X, ActiGraph Manufacturing Technology Inc., FL., USA) and a device using activity-dependent calculation models (move II, movisens GmbH, Karlsruhe, Germany). Nineteen adults (11 male, 8 female; 30.4±9.0 years) wore the activity monitors attached to the waist and a portable indirect calorimeter (IC) as reference measure for AEE while performing several typical daily activities. The accuracy of the two devices for estimating AEE was assessed as the mean differences between their output and the reference and evaluated using Bland-Altman analysis. The GT3X overestimated the AEE of walking (GT3X minus reference, 1.26 kcal/min), walking fast (1.72 kcal/min), walking up-/downhill (1.45 kcal/min) and walking upstairs (1.92 kcal/min) and underestimated the AEE of jogging (-1.30 kcal/min) and walking upstairs (-2.46 kcal/min). The errors for move II were smaller than those for GT3X for all activities. The move II overestimated AEE of walking (move II minus reference, 0.21 kcal/min), walking up-/downhill (0.06 kcal/min) and stair walking (upstairs: 0.13 kcal/min; downstairs: 0.29 kcal/min) and underestimated AEE of walking fast (-0.11 kcal/min) and jogging (-0.93 kcal/min). Our data suggest that the activity monitor using activity-dependent calculation models is more appropriate for predicting AEE in daily life than the activity monitor using a single regression model.
Mookprom, S; Boonkum, W; Kunhareang, S; Siripanya, S; Duangjinda, M
2017-02-01
The objective of this research is to investigate appropriate random regression models with various covariance functions, for the genetic evaluation of test-day egg production. Data included 7,884 monthly egg production records from 657 Thai native chickens (Pradu Hang Dam) that were obtained during the first to sixth generation and were born during 2007 to 2014 at the Research and Development Network Center for Animal Breeding (Native Chickens), Khon Kaen University. Average annual and monthly egg productions were 117 ± 41 and 10.20 ± 6.40 eggs, respectively. Nine random regression models were analyzed using the Wilmink function (WM), Koops and Grossman function (KG), Legendre polynomials functions with second, third, and fourth orders (LG2, LG3, LG4), and spline functions with 4, 5, 6, and 8 knots (SP4, SP5, SP6, and SP8). All covariance functions were nested within the same additive genetic and permanent environmental random effects, and the variance components were estimated by Restricted Maximum Likelihood (REML). In model comparisons, mean square error (MSE) and the coefficient of detemination (R 2 ) calculated the goodness of fit; and the correlation between observed and predicted values [Formula: see text] was used to calculate the cross-validated predictive abilities. We found that the covariance functions of SP5, SP6, and SP8 proved appropriate for the genetic evaluation of the egg production curves for Thai native chickens. The estimated heritability of monthly egg production ranged from 0.07 to 0.39, and the highest heritability was found during the first to third months of egg production. In conclusion, the spline functions within monthly egg production can be applied to breeding programs for the improvement of both egg number and persistence of egg production. © 2016 Poultry Science Association Inc.
NASA Astrophysics Data System (ADS)
Ghadiriyan Arani, M.; Pahlavani, P.; Effati, M.; Noori Alamooti, F.
2017-09-01
Today, one of the social problems influencing on the lives of many people is the road traffic crashes especially the highway ones. In this regard, this paper focuses on highway of capital and the most populous city in the U.S. state of Georgia and the ninth largest metropolitan area in the United States namely Atlanta. Geographically weighted regression and general centrality criteria are the aspects of traffic used for this article. In the first step, in order to estimate of crash intensity, it is needed to extract the dual graph from the status of streets and highways to use general centrality criteria. With the help of the graph produced, the criteria are: Degree, Pageranks, Random walk, Eccentricity, Closeness, Betweenness, Clustering coefficient, Eigenvector, and Straightness. The intensity of crash point is counted for every highway by dividing the number of crashes in that highway to the total number of crashes. Intensity of crash point is calculated for each highway. Then, criteria and crash point were normalized and the correlation between them was calculated to determine the criteria that are not dependent on each other. The proposed hybrid approach is a good way to regression issues because these effective measures result to a more desirable output. R2 values for geographically weighted regression using the Gaussian kernel was 0.539 and also 0.684 was obtained using a triple-core cube. The results showed that the triple-core cube kernel is better for modeling the crash intensity.
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-11-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatlands sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model which is considered to be caused by violations of the underlying model assumptions. Especially the effects of turbulence and pressure disturbances by the chamber deployment are suspected to have caused unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rupšys, P.
A system of stochastic differential equations (SDE) with mixed-effects parameters and multivariate normal copula density function were used to develop tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside bark diameter at breast height, and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.
A Subsonic Aircraft Design Optimization With Neural Network and Regression Approximators
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.; Haller, William J.
2004-01-01
The Flight-Optimization-System (FLOPS) code encountered difficulty in analyzing a subsonic aircraft. The limitation made the design optimization problematic. The deficiencies have been alleviated through use of neural network and regression approximations. The insight gained from using the approximators is discussed in this paper. The FLOPS code is reviewed. Analysis models are developed and validated for each approximator. The regression method appears to hug the data points, while the neural network approximation follows a mean path. For an analysis cycle, the approximate model required milliseconds of central processing unit (CPU) time versus seconds by the FLOPS code. Performance of the approximators was satisfactory for aircraft analysis. A design optimization capability has been created by coupling the derived analyzers to the optimization test bed CometBoards. The approximators were efficient reanalysis tools in the aircraft design optimization. Instability encountered in the FLOPS analyzer was eliminated. The convergence characteristics were improved for the design optimization. The CPU time required to calculate the optimum solution, measured in hours with the FLOPS code was reduced to minutes with the neural network approximation and to seconds with the regression method. Generation of the approximators required the manipulation of a very large quantity of data. Design sensitivity with respect to the bounds of aircraft constraints is easily generated.
Chan, King-Pan; Chan, Kwok-Hung; Wong, Wilfred Hing-Sang; Peiris, J. S. Malik; Wong, Chit-Ming
2011-01-01
Background Reliable estimates of disease burden associated with respiratory viruses are keys to deployment of preventive strategies such as vaccination and resource allocation. Such estimates are particularly needed in tropical and subtropical regions where some methods commonly used in temperate regions are not applicable. While a number of alternative approaches to assess the influenza associated disease burden have been recently reported, none of these models have been validated with virologically confirmed data. Even fewer methods have been developed for other common respiratory viruses such as respiratory syncytial virus (RSV), parainfluenza and adenovirus. Methods and Findings We had recently conducted a prospective population-based study of virologically confirmed hospitalization for acute respiratory illnesses in persons <18 years residing in Hong Kong Island. Here we used this dataset to validate two commonly used models for estimation of influenza disease burden, namely the rate difference model and Poisson regression model, and also explored the applicability of these models to estimate the disease burden of other respiratory viruses. The Poisson regression models with different link functions all yielded estimates well correlated with the virologically confirmed influenza associated hospitalization, especially in children older than two years. The disease burden estimates for RSV, parainfluenza and adenovirus were less reliable with wide confidence intervals. The rate difference model was not applicable to RSV, parainfluenza and adenovirus and grossly underestimated the true burden of influenza associated hospitalization. Conclusion The Poisson regression model generally produced satisfactory estimates in calculating the disease burden of respiratory viruses in a subtropical region such as Hong Kong. PMID:21412433
Space, race, and poverty: Spatial inequalities in walkable neighborhood amenities?
Aldstadt, Jared; Whalen, John; White, Kellee; Castro, Marcia C.; Williams, David R.
2017-01-01
BACKGROUND Multiple and varied benefits have been suggested for increased neighborhood walkability. However, spatial inequalities in neighborhood walkability likely exist and may be attributable, in part, to residential segregation. OBJECTIVE Utilizing a spatial demographic perspective, we evaluated potential spatial inequalities in walkable neighborhood amenities across census tracts in Boston, MA (US). METHODS The independent variables included minority racial/ethnic population percentages and percent of families in poverty. Walkable neighborhood amenities were assessed with a composite measure. Spatial autocorrelation in key study variables were first calculated with the Global Moran’s I statistic. Then, Spearman correlations between neighborhood socio-demographic characteristics and walkable neighborhood amenities were calculated as well as Spearman correlations accounting for spatial autocorrelation. We fit ordinary least squares (OLS) regression and spatial autoregressive models, when appropriate, as a final step. RESULTS Significant positive spatial autocorrelation was found in neighborhood socio-demographic characteristics (e.g. census tract percent Black), but not walkable neighborhood amenities or in the OLS regression residuals. Spearman correlations between neighborhood socio-demographic characteristics and walkable neighborhood amenities were not statistically significant, nor were neighborhood socio-demographic characteristics significantly associated with walkable neighborhood amenities in OLS regression models. CONCLUSIONS Our results suggest that there is residential segregation in Boston and that spatial inequalities do not necessarily show up using a composite measure. COMMENTS Future research in other geographic areas (including international contexts) and using different definitions of neighborhoods (including small-area definitions) should evaluate if spatial inequalities are found using composite measures but also should use measures of specific neighborhood amenities. PMID:29046612
NASA Astrophysics Data System (ADS)
Shen, S. S.
2014-12-01
This presentation describes a suite of global precipitation products reconstructed by a multivariate regression method using an empirical orthogonal function (EOF) expansion. The sampling errors of the reconstruction are estimated for each product datum entry. The maximum temporal coverage is 1850-present and the spatial coverage is quasi-global (75S, 75N). The temporal resolution ranges from 5-day, monthly, to seasonal and annual. The Global Precipitation Climatology Project (GPCP) precipitation data from 1979-2008 are used to calculate the EOFs. The Global Historical Climatology Network (GHCN) gridded data are used to calculate the regression coefficients for reconstructions. The sampling errors of the reconstruction are analyzed in detail for different EOF modes. Our reconstructed 1900-2011 time series of the global average annual precipitation shows a 0.024 (mm/day)/100a trend, which is very close to the trend derived from the mean of 25 models of the CMIP5 (Coupled Model Intercomparison Project Phase 5). Our reconstruction examples of 1983 El Niño precipitation and 1917 La Niña precipitation (Figure 1) demonstrate that the El Niño and La Niña precipitation patterns are well reflected in the first two EOFs. The validation of our reconstruction results with GPCP makes it possible to use the reconstruction as the benchmark data for climate models. This will help the climate modeling community to improve model precipitation mechanisms and reduce the systematic difference between observed global precipitation, which hovers at around 2.7 mm/day for reconstructions and GPCP, and model precipitations, which have a range of 2.6-3.3 mm/day for CMIP5. Our precipitation products are publically available online, including digital data, precipitation animations, computer codes, readme files, and the user manual. This work is a joint effort between San Diego State University (Sam Shen, Nancy Tafolla, Barbara Sperberg, and Melanie Thorn) and University of Maryland (Phil Arkin, Tom Smith, Li Ren, and Li Dai) and supported in part by the U.S. National Science Foundation (Awards No. AGS-1015926 and AGS-1015957).
Determination of riverbank erosion probability using Locally Weighted Logistic Regression
NASA Astrophysics Data System (ADS)
Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos
2015-04-01
Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially, therefore, a non-stationary regression model is preferred instead of a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested. The most straightforward measure for goodness of fit is the G statistic. It is a simple and effective way to study and evaluate the Logistic Regression model efficiency and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of river bank slope, river cross-section width and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along the riverbanks and can be used to assist managing erosion and flooding events. Acknowledgements This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
Logistic regression model for diagnosis of transition zone prostate cancer on multi-parametric MRI.
Dikaios, Nikolaos; Alkalbani, Jokha; Sidhu, Harbir Singh; Fujiwara, Taiki; Abd-Alazeez, Mohamed; Kirkham, Alex; Allen, Clare; Ahmed, Hashim; Emberton, Mark; Freeman, Alex; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit
2015-02-01
We aimed to develop logistic regression (LR) models for classifying prostate cancer within the transition zone on multi-parametric magnetic resonance imaging (mp-MRI). One hundred and fifty-five patients (training cohort, 70 patients; temporal validation cohort, 85 patients) underwent mp-MRI and transperineal-template-prostate-mapping (TPM) biopsy. Positive cores were classified by cancer definitions: (1) any-cancer; (2) definition-1 [≥Gleason 4 + 3 or ≥ 6 mm cancer core length (CCL)] [high risk significant]; and (3) definition-2 (≥Gleason 3 + 4 or ≥ 4 mm CCL) cancer [intermediate-high risk significant]. For each, logistic-regression mp-MRI models were derived from the training cohort and validated internally and with the temporal cohort. Sensitivity/specificity and the area under the receiver operating characteristic (ROC-AUC) curve were calculated. LR model performance was compared to radiologists' performance. Twenty-eight of 70 patients from the training cohort, and 25/85 patients from the temporal validation cohort had significant cancer on TPM. The ROC-AUC of the LR model for classification of cancer was 0.73/0.67 at internal/temporal validation. The radiologist A/B ROC-AUC was 0.65/0.74 (temporal cohort). For patients scored by radiologists as Prostate Imaging Reporting and Data System (Pi-RADS) score 3, sensitivity/specificity of radiologist A 'best guess' and LR model was 0.14/0.54 and 0.71/0.61, respectively; and radiologist B 'best guess' and LR model was 0.40/0.34 and 0.50/0.76, respectively. LR models can improve classification of Pi-RADS score 3 lesions similar to experienced radiologists. • MRI helps find prostate cancer in the anterior of the gland • Logistic regression models based on mp-MRI can classify prostate cancer • Computers can help confirm cancer in areas doctors are uncertain about.
Burns, Douglas A.; Smith, Martyn J.; Freehafer, Douglas A.
2015-12-31
The application uses predictions of future annual precipitation from five climate models and two future greenhouse gas emissions scenarios and provides results that are averaged over three future periods—2025 to 2049, 2050 to 2074, and 2075 to 2099. Results are presented in ensemble form as the mean, median, maximum, and minimum values among the five climate models for each greenhouse gas emissions scenario and period. These predictions of future annual precipitation are substituted into either the precipitation variable or a water balance equation for runoff to calculate potential future peak flows. This application is intended to be used only as an exploratory tool because (1) the regression equations on which the application is based have not been adequately tested outside the range of the current climate and (2) forecasting future precipitation with climate models and downscaling these results to a fine spatial resolution have a high degree of uncertainty. This report includes a discussion of the assumptions, uncertainties, and appropriate use of this exploratory application.
Online quantitative analysis of multispectral images of human body tissues
NASA Astrophysics Data System (ADS)
Lisenko, S. A.
2013-08-01
A method is developed for online monitoring of structural and morphological parameters of biological tissues (haemoglobin concentration, degree of blood oxygenation, average diameter of capillaries and the parameter characterising the average size of tissue scatterers), which involves multispectral tissue imaging, image normalisation to one of its spectral layers and determination of unknown parameters based on their stable regression relation with the spectral characteristics of the normalised image. Regression is obtained by simulating numerically the diffuse reflectance spectrum of the tissue by the Monte Carlo method at a wide variation of model parameters. The correctness of the model calculations is confirmed by the good agreement with the experimental data. The error of the method is estimated under conditions of general variability of structural and morphological parameters of the tissue. The method developed is compared with the traditional methods of interpretation of multispectral images of biological tissues, based on the solution of the inverse problem for each pixel of the image in the approximation of different analytical models.
The value of a statistical life: a meta-analysis with a mixed effects regression model.
Bellavance, François; Dionne, Georges; Lebeau, Martin
2009-03-01
The value of a statistical life (VSL) is a very controversial topic, but one which is essential to the optimization of governmental decisions. We see a great variability in the values obtained from different studies. The source of this variability needs to be understood, in order to offer public decision-makers better guidance in choosing a value and to set clearer guidelines for future research on the topic. This article presents a meta-analysis based on 39 observations obtained from 37 studies (from nine different countries) which all use a hedonic wage method to calculate the VSL. Our meta-analysis is innovative in that it is the first to use the mixed effects regression model [Raudenbush, S.W., 1994. Random effects models. In: Cooper, H., Hedges, L.V. (Eds.), The Handbook of Research Synthesis. Russel Sage Foundation, New York] to analyze studies on the value of a statistical life. We conclude that the variability found in the values studied stems in large part from differences in methodologies.
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
Zhang, Xingyu; Kim, Joyce; Patzer, Rachel E; Pitts, Stephen R; Patzer, Aaron; Schrager, Justin D
2017-10-26
To describe and compare logistic regression and neural network modeling strategies to predict hospital admission or transfer following initial presentation to Emergency Department (ED) triage with and without the addition of natural language processing elements. Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network models (MLNN) which included natural language processing (NLP) and principal component analysis from the patient's reason for visit. Ten-fold cross validation was used to test the predictive capacity of each model and receiver operating curves (AUC) were then calculated for each model. Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason for visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated AUC of 0.742 (95% CI 0.731- 0.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN. The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free text data from a patient's reason for visit regardless of modeling approach. Natural language processing and neural networks that incorporate patient-reported outcome free text may increase predictive accuracy for hospital admission.
Inferring Aquifer Transmissivity from River Flow Data
NASA Astrophysics Data System (ADS)
Trichakis, Ioannis; Pistocchi, Alberto
2016-04-01
Daily streamflow data is the measurable result of many different hydrological processes within a basin; therefore, it includes information about all these processes. In this work, recession analysis applied to a pan-European dataset of measured streamflow was used to estimate hydrogeological parameters of the aquifers that contribute to the stream flow. Under the assumption that base-flow in times of no precipitation is mainly due to groundwater, we estimated parameters of European shallow aquifers connected with the stream network, and identified on the basis of the 1:1,500,000 scale Hydrogeological map of Europe. To this end, Master recession curves (MRCs) were constructed based on the RECESS model of the USGS for 1601 stream gauge stations across Europe. The process consists of three stages. Firstly, the model analyses the stream flow time-series. Then, it uses regression to calculate the recession index. Finally, it infers characteristics of the aquifer from the recession index. During time-series analysis, the model identifies those segments, where the number of successive recession days is above a certain threshold. The reason for this pre-processing lies in the necessity for an adequate number of points when performing regression at a later stage. The recession index derives from the semi-logarithmic plot of stream flow over time, and the post processing involves the calculation of geometrical parameters of the watershed through a GIS platform. The program scans the full stream flow dataset of all the stations. For each station, it identifies the segments with continuous recession that exceed a predefined number of days. When the algorithm finds all the segments of a certain station, it analyses them and calculates the best linear fit between time and the logarithm of flow. The algorithm repeats this procedure for the full number of segments, thus it calculates many different values of recession index for each station. After the program has found all the recession segments, it performs calculations to determine the expression for the MRC. Further processing of the MRCs can yield estimates of transmissivity or response time representative of the aquifers upstream of the station. These estimates can be useful for large scale (e.g. continental) groundwater modelling. The above procedure allowed calculating values of transmissivity for a large share of European aquifers, ranging from Tmin = 4.13E-04 m²/d to Tmax = 8.12E+03 m²/d, with an average value Taverage = 9.65E+01 m²/d. These results are in line with the literature, indicating that the procedure may provide realistic results for large-scale groundwater modelling. In this contribution we present the results in the perspective of their application for the parameterization of a pan-European bi-dimensional shallow groundwater flow model.
NASA Astrophysics Data System (ADS)
Lubkin, S. H.; Morgan, C.
2015-12-01
Harmful algal bloom species have had an increasing ecological impact on the Chesapeake Bay Watershed where they disrupt water chemistry, kill fish and cause human illness. In Virginia, scientists from Virginia Institute of Marine Science and Old Dominion University monitor HABs and their effect on water quality; however, these groups lack a method to monitor HABs in real time. This limits the ability to document associated water quality conditions and predict future blooms. Band reflectance values from Landsat 8 Surface Reflectance data (USGS Earth Explorer) and MODIS Chlorophyll imagery (NOAA CoastWatch) were cross calibrated to create a regression model that calculated concentrations of chlorophyll. Calculations were verified with in situ measurements from the Virginia Estuarine and Coastal Observing System. Imagery produced with the Chlorophyll-A calculation model will allow VIMS and ODU scientists to assess the timing, magnitude, duration and frequency of HABs in Virginia's Chesapeake watershed and to predict the environmental and water quality conditions that favor bloom development.
Time trend of polycyclic aromatic hydrocarbon emission factors from motor vehicles
NASA Astrophysics Data System (ADS)
Tao, Shu; Shen, Huizhong; Wang, Rong; Sun, Kang
2010-05-01
Motor vehicle is an important emission source of polycyclic aromatic hydrocarbons (PAHs) and this is particularly true in urban areas. Motor vehicle emission factors (EFs) for individual PAH compound reported in the literature varied for 4 to 5 orders of magnitude, leading to high uncertainty in emission estimation. In this study, the major factors affecting EFs were investigated and characterized by regression models. Based on the model developed, a motor vehicle PAH emission inventory at country level was developed. It was found that country and model year are the most important factors affecting EFs for PAHs. The influence of the two factors can be quantified by a single parameter of per capita gross domestic production (purchasing power parity), which was used as the independent variables of the regression models. The models developed using randomly selected 80% of measurements and tested with the remained data accounted for 28 to 48% of the variations in EFs for PAHs measured in 16 countries over 50 years. The regression coefficients of the EF prediction models were molecular weight dependent. Motor vehicle emission of PAHs from individual countries in the world in 1985, 1995, 2005, 2015, and 2025 were calculated and the global emission of total PAHs were 470, 390, and 430 Gg in 1985, 1995, and 2005 and will be 290 and 130 Gg in 2015 and 2025, respectively. The emission is currently passing its peak and will decrease due to significant decrease in China and other developing countries.
Reynolds, Penny S; Tamariz, Francisco J; Barbee, Robert Wayne
2010-04-01
Exploratory pilot studies are crucial to best practice in research but are frequently conducted without a systematic method for maximizing the amount and quality of information obtained. We describe the use of response surface regression models and simultaneous optimization methods to develop a rat model of hemorrhagic shock in the context of chronic hypertension, a clinically relevant comorbidity. Response surface regression model was applied to determine optimal levels of two inputs--dietary NaCl concentration (0.49%, 4%, and 8%) and time on the diet (4, 6, 8 weeks)--to achieve clinically realistic and stable target measures of systolic blood pressure while simultaneously maximizing critical oxygen delivery (a measure of vulnerability to hemorrhagic shock) and body mass M. Simultaneous optimization of the three response variables was performed though a dimensionality reduction strategy involving calculation of a single aggregate measure, the "desirability" function. Optimal conditions for inducing systolic blood pressure of 208 mmHg, critical oxygen delivery of 4.03 mL/min, and M of 290 g were determined to be 4% [NaCl] for 5 weeks. Rats on the 8% diet did not survive past 7 weeks. Response surface regression model and simultaneous optimization method techniques are commonly used in process engineering but have found little application to date in animal pilot studies. These methods will ensure both the scientific and ethical integrity of experimental trials involving animals and provide powerful tools for the development of novel models of clinically interacting comorbidities with shock.
QSAR modeling of flotation collectors using principal components extracted from topological indices.
Natarajan, R; Nirdosh, Inderjit; Basak, Subhash C; Mills, Denise R
2002-01-01
Several topological indices were calculated for substituted-cupferrons that were tested as collectors for the froth flotation of uranium. The principal component analysis (PCA) was used for data reduction. Seven principal components (PC) were found to account for 98.6% of the variance among the computed indices. The principal components thus extracted were used in stepwise regression analyses to construct regression models for the prediction of separation efficiencies (Es) of the collectors. A two-parameter model with a correlation coefficient of 0.889 and a three-parameter model with a correlation coefficient of 0.913 were formed. PCs were found to be better than partition coefficient to form regression equations, and inclusion of an electronic parameter such as Hammett sigma or quantum mechanically derived electronic charges on the chelating atoms did not improve the correlation coefficient significantly. The method was extended to model the separation efficiencies of mercaptobenzothiazoles (MBT) and aminothiophenols (ATP) used in the flotation of lead and zinc ores, respectively. Five principal components were found to explain 99% of the data variability in each series. A three-parameter equation with correlation coefficient of 0.985 and a two-parameter equation with correlation coefficient of 0.926 were obtained for MBT and ATP, respectively. The amenability of separation efficiencies of chelating collectors to QSAR modeling using PCs based on topological indices might lead to the selection of collectors for synthesis and testing from a virtual database.
Ciura, Krzesimir; Belka, Mariusz; Kawczak, Piotr; Bączek, Tomasz; Markuszewski, Michał J; Nowakowska, Joanna
2017-09-05
The objective of this paper is to build QSRR/QSAR model for predicting the blood-brain barrier (BBB) permeability. The obtained models are based on salting-out thin layer chromatography (SOTLC) constants and calculated molecular descriptors. Among chromatographic methods SOTLC was chosen, since the mobile phases are free of organic solvent. As consequences, there are less toxic, and have lower environmental impact compared to classical reserved phases liquid chromatography (RPLC). During the study three stationary phase silica gel, cellulose plates and neutral aluminum oxide were examined. The model set of solutes presents a wide range of log BB values, containing compounds which cross the BBB readily and molecules poorly distributed to the brain including drugs acting on the nervous system as well as peripheral acting drugs. Additionally, the comparison of three regression models: multiple linear regression (MLR), partial least-squares (PLS) and orthogonal partial least squares (OPLS) were performed. The designed QSRR/QSAR models could be useful to predict BBB of systematically synthesized newly compounds in the drug development pipeline and are attractive alternatives of time-consuming and demanding directed methods for log BB measurement. The study also shown that among several regression techniques, significant differences can be obtained in models performance, measured by R 2 and Q 2 , hence it is strongly suggested to evaluate all available options as MLR, PLS and OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
Burgess, Robert; Davis, Robert; Edwards, Deborah
2018-03-01
Uptake of lead from soil was examined in order to establish a site-specific ecological protective concentration level for the Texas Horned Lizard (Phrynosoma cornutum) at the Former Humble Refinery in San Antonio, Texas. Soils, harvester ants, and rinse water from the ants were analyzed at 11 Texas Harvester Ant (Pogonomyrmex barbatus) mounds. Soil concentrations at the harvester ant mounds ranged from 13 to 7474 mg/kg of lead dry weight. Ant tissue sample concentrations ranged from < 0.82 to 21.17 mg/kg dry weight. Rinse water concentrations were below the reporting limit in the majority of samples. Two uptake models were developed for the ants. A bioaccumulation factor model did not fit the data, as there was a strong decay in the calculated value with rising soil concentrations. A univariate natural log-transformed regression model produced a significant regression (p < .0001) with a high coefficient of determination (0.82), indicating a good fit to the data. Other diagnostic regression statistics indicated that the regression model could be reliably used to predict concentrations of lead in harvester ants from soil concentrations. Estimates of protective levels for P. cornutum were developed using published sub-chronic toxicological findings for the Western Fence Lizard that were allometricly adapted and compared to the Dose oral equation, which estimated lead consumed through ants plus incidental soil ingestion. The no observed adverse effect level toxicological limit for P. cornutum was estimated to be 5500 mg/kg.
Desai, Rishi J; Solomon, Daniel H; Weinblatt, Michael E; Shadick, Nancy; Kim, Seoyoung C
2015-04-13
We conducted an external validation study to examine the correlation of a previously published claims-based index for rheumatoid arthritis severity (CIRAS) with disease activity score in 28 joints calculated by using C-reactive protein (DAS28-CRP) and the multi-dimensional health assessment questionnaire (MD-HAQ) physical function score. Patients enrolled in the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study (BRASS) and Medicare were identified and their data from these two sources were linked. For each patient, DAS28-CRP measurement and MD-HAQ physical function scores were extracted from BRASS, and CIRAS was calculated from Medicare claims for the period of 365 days prior to the DAS28-CRP measurement. Pearson correlation coefficient between CIRAS and DAS28-CRP as well as MD-HAQ physical function scores were calculated. Furthermore, we considered several additional pharmacy and medical claims-derived variables as predictors for DAS28-CRP in a multivariable linear regression model in order to assess improvement in the performance of the original CIRAS algorithm. In total, 315 patients with enrollment in both BRASS and Medicare were included in this study. The majority (81%) of the cohort was female, and the mean age was 70 years. The correlation between CIRAS and DAS28-CRP was low (Pearson correlation coefficient = 0.07, P = 0.24). The correlation between the calculated CIRAS and MD-HAQ physical function scores was also found to be low (Pearson correlation coefficient = 0.08, P = 0.17). The linear regression model containing additional claims-derived variables yielded model R(2) of 0.23, suggesting limited ability of this model to explain variation in DAS28-CRP. In a cohort of Medicare-enrolled patients with established RA, CIRAS showed low correlation with DAS28-CRP as well as MD-HAQ physical function scores. Claims-based algorithms for disease activity should be rigorously tested in distinct populations in order to establish their generalizability before widespread adoption.
Hypothesis Testing Using Factor Score Regression
Devlieger, Ines; Mayer, Axel; Rosseel, Yves
2015-01-01
In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias avoiding method of Skrondal and Laake, and the bias correcting method of Croon. The bias correcting method is extended to include a reliable standard error. The four methods are compared with each other and with structural equation modeling (SEM) by using analytic calculations and two Monte Carlo simulation studies to examine their finite sample characteristics. Several performance criteria are used, such as the bias using the unstandardized and standardized parameterization, efficiency, mean square error, standard error bias, type I error rate, and power. The results show that the bias correcting method, with the newly developed standard error, is the only suitable alternative for SEM. While it has a higher standard error bias than SEM, it has a comparable bias, efficiency, mean square error, power, and type I error rate. PMID:29795886
Analysis and selection of magnitude relations for the Working Group on Utah Earthquake Probabilities
Duross, Christopher; Olig, Susan; Schwartz, David
2015-01-01
Prior to calculating time-independent and -dependent earthquake probabilities for faults in the Wasatch Front region, the Working Group on Utah Earthquake Probabilities (WGUEP) updated a seismic-source model for the region (Wong and others, 2014) and evaluated 19 historical regressions on earthquake magnitude (M). These regressions relate M to fault parameters for historical surface-faulting earthquakes, including linear fault length (e.g., surface-rupture length [SRL] or segment length), average displacement, maximum displacement, rupture area, seismic moment (Mo ), and slip rate. These regressions show that significant epistemic uncertainties complicate the determination of characteristic magnitude for fault sources in the Basin and Range Province (BRP). For example, we found that M estimates (as a function of SRL) span about 0.3–0.4 units (figure 1) owing to differences in the fault parameter used; age, quality, and size of historical earthquake databases; and fault type and region considered.
Influence of landscape-scale factors in limiting brook trout populations in Pennsylvania streams
Kocovsky, P.M.; Carline, R.F.
2006-01-01
Landscapes influence the capacity of streams to produce trout through their effect on water chemistry and other factors at the reach scale. Trout abundance also fluctuates over time; thus, to thoroughly understand how spatial factors at landscape scales affect trout populations, one must assess the changes in populations over time to provide a context for interpreting the importance of spatial factors. We used data from the Pennsylvania Fish and Boat Commission's fisheries management database to investigate spatial factors that affect the capacity of streams to support brook trout Salvelinus fontinalis and to provide models useful for their management. We assessed the relative importance of spatial and temporal variation by calculating variance components and comparing relative standard errors for spatial and temporal variation. We used binary logistic regression to predict the presence of harvestable-length brook trout and multiple linear regression to assess the mechanistic links between landscapes and trout populations and to predict population density. The variance in trout density among streams was equal to or greater than the temporal variation for several streams, indicating that differences among sites affect population density. Logistic regression models correctly predicted the absence of harvestable-length brook trout in 60% of validation samples. The r 2-value for the linear regression model predicting density was 0.3, indicating low predictive ability. Both logistic and linear regression models supported buffering capacity against acid episodes as an important mechanistic link between landscapes and trout populations. Although our models fail to predict trout densities precisely, their success at elucidating the mechanistic links between landscapes and trout populations, in concert with the importance of spatial variation, increases our understanding of factors affecting brook trout abundance and will help managers and private groups to protect and enhance populations of wild brook trout. ?? Copyright by the American Fisheries Society 2006.
Harvey, H Benjamin; Liu, Catherine; Ai, Jing; Jaworsky, Cristina; Guerrier, Claude Emmanuel; Flores, Efren; Pianykh, Oleg
2017-10-01
To test whether data elements available in the electronic medical record (EMR) can be effectively leveraged to predict failure to attend a scheduled radiology examination. Using data from a large academic medical center, we identified all patients with a diagnostic imaging examination scheduled from January 1, 2016, to April 1, 2016, and determined whether the patient successfully attended the examination. Demographic, clinical, and health services utilization variables available in the EMR potentially relevant to examination attendance were recorded for each patient. We used descriptive statistics and logistic regression models to test whether these data elements could predict failure to attend a scheduled radiology examination. The predictive accuracy of the regression models were determined by calculating the area under the receiver operator curve. Among the 54,652 patient appointments with radiology examinations scheduled during the study period, 6.5% were no-shows. No-show rates were highest for the modalities of mammography and CT and lowest for PET and MRI. Logistic regression indicated that 16 of the 27 demographic, clinical, and health services utilization factors were significantly associated with failure to attend a scheduled radiology examination (P ≤ .05). Stepwise logistic regression analysis demonstrated that previous no-shows, days between scheduling and appointments, modality type, and insurance type were most strongly predictive of no-show. A model considering all 16 data elements had good ability to predict radiology no-shows (area under the receiver operator curve = 0.753). The predictive ability was similar or improved when these models were analyzed by modality. Patient and examination information readily available in the EMR can be successfully used to predict radiology no-shows. Moving forward, this information can be proactively leveraged to identify patients who might benefit from additional patient engagement through appointment reminders or other targeted interventions to avoid no-shows. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Arcenegui, V.; Morugán, A.; García-Orenes, F.; Zornoza, R.; Mataix-Solera, J.; Navarro, M. A.; Guerrero, C.; Mataix-Beneyto, J.
2009-04-01
The use of treated wastewater for the irrigation of agricultural soils is an alternative to utilizing better-quality water, especially in semiarid regions where water shortage is a very serious problem. However, this practise can modify the soil equilibrium and affect its quality. In this work two soil quality indices (models) are used to evaluate the effects of long-term irrigation with treated wastewater in soil. The models were developed studying different soil properties in undisturbed forest soils in SE Spain, and the relationships between soil parameters were established using multiple linear regressions. Model 1, that explained 92% of the variance in soil organic carbon (SOC) showed that the SOC can be calculated by the linear combination of 6 physical, chemical and biochemical properties (acid phosphatase, water holding capacity (WHC), electrical conductivity (EC), available phosphorus (P), cation exchange capacity (CEC) and aggregate stability (AS)). Model 2 explains 89% of the SOC variance, which can be calculated by means of 7 chemical and biochemical properties (urease, phosphatase, and
Gene set analysis using variance component tests.
Huang, Yen-Tsung; Lin, Xihong
2013-06-28
Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.
Establishing endangered species recovery criteria using predictive simulation modeling
McGowan, Conor P.; Catlin, Daniel H.; Shaffer, Terry L.; Gratto-Trevor, Cheri L.; Aron, Carol
2014-01-01
Listing a species under the Endangered Species Act (ESA) and developing a recovery plan requires U.S. Fish and Wildlife Service to establish specific and measurable criteria for delisting. Generally, species are listed because they face (or are perceived to face) elevated risk of extinction due to issues such as habitat loss, invasive species, or other factors. Recovery plans identify recovery criteria that reduce extinction risk to an acceptable level. It logically follows that the recovery criteria, the defined conditions for removing a species from ESA protections, need to be closely related to extinction risk. Extinction probability is a population parameter estimated with a model that uses current demographic information to project the population into the future over a number of replicates, calculating the proportion of replicated populations that go extinct. We simulated extinction probabilities of piping plovers in the Great Plains and estimated the relationship between extinction probability and various demographic parameters. We tested the fit of regression models linking initial abundance, productivity, or population growth rate to extinction risk, and then, using the regression parameter estimates, determined the conditions required to reduce extinction probability to some pre-defined acceptable threshold. Binomial regression models with mean population growth rate and the natural log of initial abundance were the best predictors of extinction probability 50 years into the future. For example, based on our regression models, an initial abundance of approximately 2400 females with an expected mean population growth rate of 1.0 will limit extinction risk for piping plovers in the Great Plains to less than 0.048. Our method provides a straightforward way of developing specific and measurable recovery criteria linked directly to the core issue of extinction risk. Published by Elsevier Ltd.
Model parameter uncertainty analysis for an annual field-scale P loss model
NASA Astrophysics Data System (ADS)
Bolster, Carl H.; Vadas, Peter A.; Boykin, Debbie
2016-08-01
Phosphorous (P) fate and transport models are important tools for developing and evaluating conservation practices aimed at reducing P losses from agricultural fields. Because all models are simplifications of complex systems, there will exist an inherent amount of uncertainty associated with their predictions. It is therefore important that efforts be directed at identifying, quantifying, and communicating the different sources of model uncertainties. In this study, we conducted an uncertainty analysis with the Annual P Loss Estimator (APLE) model. Our analysis included calculating parameter uncertainties and confidence and prediction intervals for five internal regression equations in APLE. We also estimated uncertainties of the model input variables based on values reported in the literature. We then predicted P loss for a suite of fields under different management and climatic conditions while accounting for uncertainties in the model parameters and inputs and compared the relative contributions of these two sources of uncertainty to the overall uncertainty associated with predictions of P loss. Both the overall magnitude of the prediction uncertainties and the relative contributions of the two sources of uncertainty varied depending on management practices and field characteristics. This was due to differences in the number of model input variables and the uncertainties in the regression equations associated with each P loss pathway. Inspection of the uncertainties in the five regression equations brought attention to a previously unrecognized limitation with the equation used to partition surface-applied fertilizer P between leaching and runoff losses. As a result, an alternate equation was identified that provided similar predictions with much less uncertainty. Our results demonstrate how a thorough uncertainty and model residual analysis can be used to identify limitations with a model. Such insight can then be used to guide future data collection and model development and evaluation efforts.
Yamagata, Tetsuo; Zanelli, Ugo; Gallemann, Dieter; Perrin, Dominique; Dolgos, Hugues; Petersson, Carl
2017-09-01
1. We compared direct scaling, regression model equation and the so-called "Poulin et al." methods to scale clearance (CL) from in vitro intrinsic clearance (CL int ) measured in human hepatocytes using two sets of compounds. One reference set comprised of 20 compounds with known elimination pathways and one external evaluation set based on 17 compounds development in Merck (MS). 2. A 90% prospective confidence interval was calculated using the reference set. This interval was found relevant for the regression equation method. The three outliers identified were justified on the basis of their elimination mechanism. 3. The direct scaling method showed a systematic underestimation of clearance in both the reference and evaluation sets. The "Poulin et al." and the regression equation methods showed no obvious bias in either the reference or evaluation sets. 4. The regression model equation was slightly superior to the "Poulin et al." method in the reference set and showed a better absolute average fold error (AAFE) of value 1.3 compared to 1.6. A larger difference was observed in the evaluation set were the regression method and "Poulin et al." resulted in an AAFE of 1.7 and 2.6, respectively (removing the three compounds with known issues mentioned above). A similar pattern was observed for the correlation coefficient. Based on these data we suggest the regression equation method combined with a prospective confidence interval as the first choice for the extrapolation of human in vivo hepatic metabolic clearance from in vitro systems.
Breaker, Brian K.
2015-01-01
Equations for two regions were found to be statistically significant for developing regression equations for estimating harmonic mean flows at ungaged basins; thus, equations are applicable only to streams in those respective regions in Arkansas. Regression equations for dry season mean monthly flows are applicable only to streams located throughout Arkansas. All regression equations are applicable only to unaltered streams where flows were not significantly affected by regulation, diversion, or urbanization. The median number of years used for dry season mean monthly flow calculation was 43, and the median number of years used for harmonic mean flow calculations was 34 for region 1 and 43 for region 2.
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.
Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P
2015-11-01
To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
Wang, Bing; Shen, Hao; Fang, Aiqin; Huang, De-Shuang; Jiang, Changjun; Zhang, Jun; Chen, Peng
2016-06-17
Comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC/TOF-MS) system has become a key analytical technology in high-throughput analysis. Retention index has been approved to be helpful for compound identification in one-dimensional gas chromatography, which is also true for two-dimensional gas chromatography. In this work, a novel regression model was proposed for calculating the second dimension retention index of target components where n-alkanes were used as reference compounds. This model was developed to depict the relationship among adjusted second dimension retention time, temperature of the second dimension column and carbon number of n-alkanes by an exponential nonlinear function with only five parameters. Three different criteria were introduced to find the optimal values of parameters. The performance of this model was evaluated using experimental data of n-alkanes (C7-C31) at 24 temperatures which can cover all 0-6s adjusted retention time area. The experimental results show that the mean relative error between predicted adjusted retention time and experimental data of n-alkanes was only 2%. Furthermore, our proposed model demonstrates a good extrapolation capability for predicting adjusted retention time of target compounds which located out of the range of the reference compounds in the second dimension adjusted retention time space. Our work shows the deviation was less than 9 retention index units (iu) while the number of alkanes were added up to 5. The performance of our proposed model has also been demonstrated by analyzing a mixture of compounds in temperature programmed experiments. Copyright © 2016 Elsevier B.V. All rights reserved.
Zelle, Sten G; Baltussen, Rob; Otten, Johannes D M; Heijnsdijk, Eveline A M; van Schoor, Guido; Broeders, Mireille J M
2015-03-01
To provide proof of concept for a simple model to estimate the stage shift as a result of breast cancer screening in low- and middle-income countries (LMICs). Stage shift is an essential early detection indicator and an important proxy for the performance and possible further impact of screening programmes. Our model could help LIMCs to choose appropriate control strategies. We assessed our model concept in three steps. First, we calculated the proportional performance rates (i.e. index number Z) based on 16 screening rounds of the Nijmegen Screening Program (384,884 screened women). Second, we used linear regression to assess the association between Z and the amount of stage shift observed in the programme. Third, we hypothesized how Z could be used to estimate the stage shift as a result of breast cancer screening in LMICs. Stage shifts can be estimated by the proportional performance rates (Zs) using linear regression. Zs calculated for each screening round are highly associated with the observed stage shifts in the Nijmegen Screening Program (Pearson's R: 0.798, R square: 0.637). Our model can predict the stage shifts in the Nijmegen Screening Program, and could be applied to settings with different characteristics, although it should not be straightforwardly used to estimate the impact on mortality. Further research should investigate the extrapolation of our model to other settings. As stage shift is an essential screening performance indicator, our model could provide important information on the performance of breast cancer screening programmes that LMICs consider implementing. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Sheehan, Kenneth R.; Strager, Michael P.; Welsh, Stuart A.
2013-01-01
Stream habitat assessments are commonplace in fish management, and often involve nonspatial analysis methods for quantifying or predicting habitat, such as ordinary least squares regression (OLS). Spatial relationships, however, often exist among stream habitat variables. For example, water depth, water velocity, and benthic substrate sizes within streams are often spatially correlated and may exhibit spatial nonstationarity or inconsistency in geographic space. Thus, analysis methods should address spatial relationships within habitat datasets. In this study, OLS and a recently developed method, geographically weighted regression (GWR), were used to model benthic substrate from water depth and water velocity data at two stream sites within the Greater Yellowstone Ecosystem. For data collection, each site was represented by a grid of 0.1 m2 cells, where actual values of water depth, water velocity, and benthic substrate class were measured for each cell. Accuracies of regressed substrate class data by OLS and GWR methods were calculated by comparing maps, parameter estimates, and determination coefficient r 2. For analysis of data from both sites, Akaike’s Information Criterion corrected for sample size indicated the best approximating model for the data resulted from GWR and not from OLS. Adjusted r 2 values also supported GWR as a better approach than OLS for prediction of substrate. This study supports GWR (a spatial analysis approach) over nonspatial OLS methods for prediction of habitat for stream habitat assessments.
A nonparametric multiple imputation approach for missing categorical data.
Zhou, Muhan; He, Yulei; Yu, Mandi; Hsu, Chiu-Hsieh
2017-06-06
Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities. We propose a nearest-neighbour multiple imputation approach to impute a missing at random categorical outcome and to estimate the proportion of each category. The donor set for imputation is formed by measuring distances between each missing value with other non-missing values. The distance function is calculated based on a predictive score, which is derived from two working models: one fits a multinomial logistic regression for predicting the missing categorical outcome (the outcome model) and the other fits a logistic regression for predicting missingness probabilities (the missingness model). A weighting scheme is used to accommodate contributions from two working models when generating the predictive score. A missing value is imputed by randomly selecting one of the non-missing values with the smallest distances. We conduct a simulation to evaluate the performance of the proposed method and compare it with several alternative methods. A real-data application is also presented. The simulation study suggests that the proposed method performs well when missingness probabilities are not extreme under some misspecifications of the working models. However, the calibration estimator, which is also based on two working models, can be highly unstable when missingness probabilities for some observations are extremely high. In this scenario, the proposed method produces more stable and better estimates. In addition, proper weights need to be chosen to balance the contributions from the two working models and achieve optimal results for the proposed method. We conclude that the proposed multiple imputation method is a reasonable approach to dealing with missing categorical outcome data with more than two levels for assessing the distribution of the outcome. In terms of the choices for the working models, we suggest a multinomial logistic regression for predicting the missing outcome and a binary logistic regression for predicting the missingness probability.
Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree.
Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-Koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid
2016-10-01
Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process for detection of hidden patterns. In the case of ESD, data mining help us to predict the diseases. Different algorithms were developed for this purpose. we aimed to use the Classification and Regression Tree (CART) to predict differential diagnosis of ESD. we used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set from machine learning repository, UCI was obtained. The Clementine 12.0 software from IBM Company was used for modelling. In order to evaluation of the model we calculate the accuracy, sensitivity and specificity of the model. The proposed model had an accuracy of 94.84% (. 24.42) in order to correct prediction of the ESD disease. Results indicated that using of this classifier could be useful. But, it would be strongly recommended that the combination of machine learning methods could be more useful in terms of prediction of ESD.
On statistical analysis of factors affecting anthocyanin extraction from Ixora siamensis
NASA Astrophysics Data System (ADS)
Mat Nor, N. A.; Arof, A. K.
2016-10-01
This study focused on designing an experimental model in order to evaluate the influence of operative extraction parameters employed for anthocyanin extraction from Ixora siamensis on CIE color measurements (a*, b* and color saturation). Extractions were conducted at temperatures of 30, 55 and 80°C, soaking time of 60, 120 and 180 min using acidified methanol solvent with different trifluoroacetic acid (TFA) contents of 0.5, 1.75 and 3% (v/v). The statistical evaluation was performed by running analysis of variance (ANOVA) and regression calculation to investigate the significance of the generated model. Results show that the generated regression models adequately explain the data variation and significantly represented the actual relationship between the independent variables and the responses. Analysis of variance (ANOVA) showed high coefficient determination values (R2) of 0.9687 for a*, 0.9621 for b* and 0.9758 for color saturation, thus ensuring a satisfactory fit of the developed models with the experimental data. Interaction between TFA content and extraction temperature exhibited to the highest significant influence on CIE color parameter.
Determination of streamflow of the Arkansas River near Bentley in south-central Kansas
Perry, Charles A.
2012-01-01
The Kansas Department of Agriculture, Division of Water Resources, requires that the streamflow of the Arkansas River just upstream from Bentley in south-central Kansas be measured or calculated before groundwater can be pumped from the well field. When the daily streamflow of the Arkansas River near Bentley is less than 165 cubic feet per second (ft3/s), pumping must be curtailed. Daily streamflow near Bentley was calculated by determining the relations between streamflow data from two reference streamgages with a concurrent record of 24 years, one located 17.2 miles (mi) upstream and one located 10.9 mi downstream, and streamflow at a temporary gage located just upstream from Bentley (Arkansas River near Bentley, Kansas). Flow-duration curves for the two reference streamgages indicate that during 1988?2011, the mean daily streamflow was less than 165 ft3/s 30 to 35 percent of the time. During extreme low-flow (drought) conditions, the reach of the Arkansas River between Hutchinson and Maize can lose flow to the adjacent alluvial aquifer, with streamflow losses as much as 1.6 cubic feet per second per mile. Three models were developed to calculate the streamflow of the Arkansas River near Bentley, Kansas. The model chosen depends on the data available and on whether the reach of the Arkansas River between Hutchinson and Maize is gaining or losing groundwater from or to the adjacent alluvial aquifer. The first model was a pair of equations developed from linear regressions of the relation between daily streamflow data from the Bentley streamgage and daily streamflow data from either the Arkansas River near Hutchinson, Kansas, station (station number 07143330) or the Arkansas River near Maize, Kansas, station (station number 07143375). The standard error of the Hutchinson-only equation was 22.8 ft3/s, and the standard error of the Maize-only equation was 22.3 ft3/s. The single-station model would be used if only one streamgage was available. In the second model, the flow gradient between the streamflow near Hutchinson and the streamflow near Maize was used to calculate the streamflow at the Bentley streamgage. This equation resulted in a standard error of 26.7 ft3/s. In the third model, a multiple regression analysis between both the daily streamflow of the Arkansas River near Hutchinson, Kansas, and the daily streamflow of the Arkansas River near Maize, Kansas, was used to calculate the streamflow at the Bentley streamgage. The multiple regression equation had a standard error of 21.2 ft3/s, which was the smallest of the standard errors for all the models. An analysis of the number of low-flow days and the number of days when the reach between Hutchinson and Maize loses flow to the adjacent alluvial aquifer indicates that the long-term trend is toward fewer days of losing conditions. This trend may indicate a long-term increase in water levels in the alluvial aquifer, which could be caused by one or more of several conditions, including an increase in rainfall, a decrease in pumping, a decrease in temperature, and an increase in streamflow upstream from the Hutchinson-to-Maize reach of the Arkansas River.
Variability of pCO2 in surface waters and development of prediction model.
Chung, Sewoong; Park, Hyungseok; Yoo, Jisu
2018-05-01
Inland waters are substantial sources of atmospheric carbon, but relevant data are rare in Asian monsoon regions including Korea. Emissions of CO 2 to the atmosphere depend largely on the partial pressure of CO 2 (pCO 2 ) in water; however, measured pCO 2 data are scarce and calculated pCO 2 can show large uncertainty. This study had three objectives: 1) to examine the spatial variability of pCO 2 in diverse surface water systems in Korea; 2) to compare pCO 2 calculated using pH-total alkalinity (Alk) and pH-dissolved inorganic carbon (DIC) with pCO 2 measured by an in situ submersible nondispersive infrared detector; and 3) to characterize the major environmental variables determining the variation of pCO 2 based on physical, chemical, and biological data collected concomitantly. Of 30 samples, 80% were found supersaturated in CO 2 with respect to the overlying atmosphere. Calculated pCO 2 using pH-Alk and pH-DIC showed weak prediction capability and large variations with respect to measured pCO 2 . Error analysis indicated that calculated pCO 2 is highly sensitive to the accuracy of pH measurements, particularly at low pH. Stepwise multiple linear regression (MLR) and random forest (RF) techniques were implemented to develop the most parsimonious model based on 10 potential predictor variables (pH, Alk, DIC, Uw, Cond, Turb, COD, DOC, TOC, Chla) by optimizing model performance. The RF model showed better performance than the MLR model, and the most parsimonious RF model (pH, Turb, Uw, Chla) improved pCO 2 prediction capability considerably compared with the simple calculation approach, reducing the RMSE from 527-544 to 105μatm at the study sites. Copyright © 2017 Elsevier B.V. All rights reserved.
Zhao, Yu Xi; Xie, Ping; Sang, Yan Fang; Wu, Zi Yi
2018-04-01
Hydrological process evaluation is temporal dependent. Hydrological time series including dependence components do not meet the data consistency assumption for hydrological computation. Both of those factors cause great difficulty for water researches. Given the existence of hydrological dependence variability, we proposed a correlationcoefficient-based method for significance evaluation of hydrological dependence based on auto-regression model. By calculating the correlation coefficient between the original series and its dependence component and selecting reasonable thresholds of correlation coefficient, this method divided significance degree of dependence into no variability, weak variability, mid variability, strong variability, and drastic variability. By deducing the relationship between correlation coefficient and auto-correlation coefficient in each order of series, we found that the correlation coefficient was mainly determined by the magnitude of auto-correlation coefficient from the 1 order to p order, which clarified the theoretical basis of this method. With the first-order and second-order auto-regression models as examples, the reasonability of the deduced formula was verified through Monte-Carlo experiments to classify the relationship between correlation coefficient and auto-correlation coefficient. This method was used to analyze three observed hydrological time series. The results indicated the coexistence of stochastic and dependence characteristics in hydrological process.
77 FR 3121 - Program Integrity: Gainful Employment-Debt Measures; Correction
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-23
...On June 13, 2011, the Secretary of Education (Secretary) published a notice of final regulations in the Federal Register for Program Integrity: Gainful Employment--Debt Measures (Gainful Employment--Debt Measures) (76 FR 34386). In the preamble of the final regulations, we used the wrong data to calculate the percent of total variance in institutions' repayment rates that may be explained by race/ethnicity. Our intent was to use the data that included all minority students per institution. However, we mistakenly used the data for a subset of minority students per institution. We have now recalculated the total variance using the data that includes all minority students. Through this document, we correct, in the preamble of the Gainful Employment--Debt Measures final regulations, the errors resulting from this misapplication. We do not change the regression analysis model itself; we are using the same model with the appropriate data. Through this notice we also correct, in the preamble of the Gainful Employment--Debt Measures final regulations, our description of one component of the regression analysis. The preamble referred to use of an institutional variable measuring acceptance rates. This description was incorrect; in fact we used an institutional variable measuring retention rates. Correcting this language does not change the regression analysis model itself or the variance explained by the model. The text of the final regulations remains unchanged.
Carbon emissions risk map from deforestation in the tropical Amazon
NASA Astrophysics Data System (ADS)
Ometto, J.; Soler, L. S.; Assis, T. D.; Oliveira, P. V.; Aguiar, A. P.
2011-12-01
Assis, Pedro Valle This work aims to estimate the carbon emissions from tropical deforestation in the Brazilian Amazon associated to the risk assessment of future land use change. The emissions are estimated by incorporating temporal deforestation dynamics, accounting for the biophysical and socioeconomic heterogeneity in the region, as well secondary forest growth dynamic in abandoned areas. The land cover change model that supported the risk assessment of deforestation, was run based on linear regressions. This method takes into account spatial heterogeneity of deforestation as the spatial variables adopted to fit the final regression model comprise: environmental aspects, economic attractiveness, accessibility and land tenure structure. After fitting a suitable regression models for each land cover category, the potential of each cell to be deforested (25x25km and 5x5 km of resolution) in the near future was used to calculate the risk assessment of land cover change. The carbon emissions model combines high-resolution new forest clear-cut mapping and four alternative sources of spatial information on biomass distribution for different vegetation types. The risk assessment map of CO2 emissions, was obtained by crossing the simulation results of the historical land cover changes to a map of aboveground biomass contained in the remaining forest. This final map represents the risk of CO2 emissions at 25x25km and 5x5 km until 2020, under a scenario of carbon emission reduction target.
Du, Hongying; Wang, Jie; Yao, Xiaojun; Hu, Zhide
2009-01-01
The heuristic method (HM) and support vector machine (SVM) were used to construct quantitative structure-retention relationship models by a series of compounds to predict the gradient retention times of reversed-phase high-performance liquid chromatography (HPLC) in three different columns. The aims of this investigation were to predict the retention times of multifarious compounds, to find the main properties of the three columns, and to indicate the theory of separation procedures. In our method, we correlated the retention times of many diverse structural analytes in three columns (Symmetry C18, Chromolith, and SG-MIX) with their representative molecular descriptors, calculated from the molecular structures alone. HM was used to select the most important molecular descriptors and build linear regression models. Furthermore, non-linear regression models were built using the SVM method; the performance of the SVM models were better than that of the HM models, and the prediction results were in good agreement with the experimental values. This paper could give some insights into the factors that were likely to govern the gradient retention process of the three investigated HPLC columns, which could theoretically supervise the practical experiment.
Barros, L M; Martins, R T; Ferreira-Keppler, R L; Gutjahr, A L N
2017-08-04
Information on biomass is substantial for calculating growth rates and may be employed in the medicolegal and economic importance of Hermetia illucens (Linnaeus, 1758). Although biomass is essential to understanding many ecological processes, it is not easily measured. Biomass may be determined by directly weighing or indirectly through regression models of fresh/dry mass versus body dimensions. In this study, we evaluated the association between morphometry and fresh/dry mass of immature H. illucens using linear, exponential, and power regression models. We measured width and length of the cephalic capsule, overall body length, and width of the largest abdominal segment of 280 larvae. Overall body length and width of the largest abdominal segment were the best predictors for biomass. Exponential models best fitted body dimensions and biomass (both fresh and dry), followed by power and linear models. In all models, fresh and dry biomass were strongly correlated (>75%). Values estimated by the models did not differ from observed ones, and prediction power varied from 27 to 79%. Accordingly, the correspondence between biomass and body dimensions should facilitate and motivate the development of applied studies involving H. illucens in the Amazon region.
Brügemann, K; Gernand, E; von Borstel, U U; König, S
2011-08-01
Data used in the present study included 1,095,980 first-lactation test-day records for protein yield of 154,880 Holstein cows housed on 196 large-scale dairy farms in Germany. Data were recorded between 2002 and 2009 and merged with meteorological data from public weather stations. The maximum distance between each farm and its corresponding weather station was 50 km. Hourly temperature-humidity indexes (THI) were calculated using the mean of hourly measurements of dry bulb temperature and relative humidity. On the phenotypic scale, an increase in THI was generally associated with a decrease in daily protein yield. For genetic analyses, a random regression model was applied using time-dependent (d in milk, DIM) and THI-dependent covariates. Additive genetic and permanent environmental effects were fitted with this random regression model and Legendre polynomials of order 3 for DIM and THI. In addition, the fixed curve was modeled with Legendre polynomials of order 3. Heterogeneous residuals were fitted by dividing DIM into 5 classes, and by dividing THI into 4 classes, resulting in 20 different classes. Additive genetic variances for daily protein yield decreased with increasing degrees of heat stress and were lowest at the beginning of lactation and at extreme THI. Due to higher additive genetic variances, slightly higher permanent environment variances, and similar residual variances, heritabilities were highest for low THI in combination with DIM at the end of lactation. Genetic correlations among individual values for THI were generally >0.90. These trends from the complex random regression model were verified by applying relatively simple bivariate animal models for protein yield measured in 2 THI environments; that is, defining a THI value of 60 as a threshold. These high correlations indicate the absence of any substantial genotype × environment interaction for protein yield. However, heritabilities and additive genetic variances from the random regression model tended to be slightly higher in the THI range corresponding to cows' comfort zone. Selecting such superior environments for progeny testing can contribute to an accurate genetic differentiation among selection candidates. Copyright © 2011 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Shi, Huilan; Jia, Junya; Li, Dong; Wei, Li; Shang, Wenya; Zheng, Zhenfeng
2018-02-09
Precise renal histopathological diagnosis will guide therapy strategy in patients with lupus nephritis. Blood oxygen level dependent (BOLD) magnetic resonance imaging (MRI) has been applicable noninvasive technique in renal disease. This current study was performed to explore whether BOLD MRI could contribute to diagnose renal pathological pattern. Adult patients with lupus nephritis renal pathological diagnosis were recruited for this study. Renal biopsy tissues were assessed based on the lupus nephritis ISN/RPS 2003 classification. The Blood oxygen level dependent magnetic resonance imaging (BOLD-MRI) was used to obtain functional magnetic resonance parameter, R2* values. Several functions of R2* values were calculated and used to construct algorithmic models for renal pathological patterns. In addition, the algorithmic models were compared as to their diagnostic capability. Both Histopathology and BOLD MRI were used to examine a total of twelve patients. Renal pathological patterns included five classes III (including 3 as class III + V) and seven classes IV (including 4 as class IV + V). Three algorithmic models, including decision tree, line discriminant, and logistic regression, were constructed to distinguish the renal pathological pattern of class III and class IV. The sensitivity of the decision tree model was better than that of the line discriminant model (71.87% vs 59.48%, P < 0.001) and inferior to that of the Logistic regression model (71.87% vs 78.71%, P < 0.001). The specificity of decision tree model was equivalent to that of the line discriminant model (63.87% vs 63.73%, P = 0.939) and higher than that of the logistic regression model (63.87% vs 38.0%, P < 0.001). The Area under the ROC curve (AUROCC) of the decision tree model was greater than that of the line discriminant model (0.765 vs 0.629, P < 0.001) and logistic regression model (0.765 vs 0.662, P < 0.001). BOLD MRI is a useful non-invasive imaging technique for the evaluation of lupus nephritis. Decision tree models constructed using functions of R2* values may facilitate the prediction of renal pathological patterns.
NASA Astrophysics Data System (ADS)
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits, Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x)—including: (1) major axis ( λ = 1); (2) reduced major axis ( λ = variance of y/variance of x); (3) Y on Xλ = infinity; or (4) X on Y ( λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
A CNN Regression Approach for Real-Time 2D/3D Registration.
Shun Miao; Wang, Z Jane; Rui Liao
2016-05-01
In this paper, we present a Convolutional Neural Network (CNN) regression approach to address the two major limitations of existing intensity-based 2-D/3-D registration technology: 1) slow computation and 2) small capture range. Different from optimization-based methods, which iteratively optimize the transformation parameters over a scalar-valued metric function representing the quality of the registration, the proposed method exploits the information embedded in the appearances of the digitally reconstructed radiograph and X-ray images, and employs CNN regressors to directly estimate the transformation parameters. An automatic feature extraction step is introduced to calculate 3-D pose-indexed features that are sensitive to the variables to be regressed while robust to other factors. The CNN regressors are then trained for local zones and applied in a hierarchical manner to break down the complex regression task into multiple simpler sub-tasks that can be learned separately. Weight sharing is furthermore employed in the CNN regression model to reduce the memory footprint. The proposed approach has been quantitatively evaluated on 3 potential clinical applications, demonstrating its significant advantage in providing highly accurate real-time 2-D/3-D registration with a significantly enlarged capture range when compared to intensity-based methods.
González-Madroño, A; Mancha, A; Rodríguez, F J; Culebras, J; de Ulibarri, J I
2012-01-01
To ratify previous validations of the CONUT nutritional screening tool by the development of two probabilistic models using the parameters included in the CONUT, to see if the CONUT´s effectiveness could be improved. It is a two step prospective study. In Step 1, 101 patients were randomly selected, and SGA and CONUT was made. With data obtained an unconditional logistic regression model was developed, and two variants of CONUT were constructed: Model 1 was made by a method of logistic regression. Model 2 was made by dividing the probabilities of undernutrition obtained in model 1 in seven regular intervals. In step 2, 60 patients were selected and underwent the SGA, the original CONUT and the new models developed. The diagnostic efficacy of the original CONUT and the new models was tested by means of ROC curves. Both samples 1 and 2 were put together to measure the agreement degree between the original CONUT and SGA, and diagnostic efficacy parameters were calculated. No statistically significant differences were found between sample 1 and 2, regarding age, sex and medical/surgical distribution and undernutrition rates were similar (over 40%). The AUC for the ROC curves were 0.862 for the original CONUT, and 0.839 and 0.874, for model 1 and 2 respectively. The kappa index for the CONUT and SGA was 0.680. The CONUT, with the original scores assigned by the authors is equally good than mathematical models and thus is a valuable tool, highly useful and efficient for the purpose of Clinical Undernutrition screening.
Ghasemi, Jahan B; Safavi-Sohi, Reihaneh; Barbosa, Euzébio G
2012-02-01
A quasi 4D-QSAR has been carried out on a series of potent Gram-negative LpxC inhibitors. This approach makes use of the molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. This new methodology is based on the generation of a conformational ensemble profile, CEP, for each compound instead of only one conformation, followed by the calculation intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from MD simulations. These interaction energies are independent variables employed in a QSAR analysis. The comparison of the proposed methodology to comparative molecular field analysis (CoMFA) formalism was performed. This methodology explores jointly the main features of CoMFA and 4D-QSAR models. Step-wise multiple linear regression was used for the selection of the most informative variables. After variable selection, multiple linear regression (MLR) and partial least squares (PLS) methods used for building the regression models. Leave-N-out cross-validation (LNO), and Y-randomization were performed in order to confirm the robustness of the model in addition to analysis of the independent test set. Best models provided the following statistics: [Formula in text] (PLS) and [Formula in text] (MLR). Docking study was applied to investigate the major interactions in protein-ligand complex with CDOCKER algorithm. Visualization of the descriptors of the best model helps us to interpret the model from the chemical point of view, supporting the applicability of this new approach in rational drug design.
NASA Astrophysics Data System (ADS)
Hajigeorgiou, Photos G.
2016-12-01
An analytical model for the diatomic potential energy function that was recently tested as a universal function (Hajigeorgiou, 2010) has been further modified and tested as a suitable model for direct-potential-fit analysis. Applications are presented for the ground electronic states of three diatomic molecules: oxygen, carbon monoxide, and hydrogen fluoride. The adjustable parameters of the extended Lennard-Jones potential model are determined through nonlinear regression by fits to calculated rovibrational energy term values or experimental spectroscopic line positions. The model is shown to lead to reliable, compact and simple representations for the potential energy functions of these systems and could therefore be classified as a suitable and attractive model for direct-potential-fit analysis.
Validity of Treadmill-Derived Critical Speed on Predicting 5000-Meter Track-Running Performance.
Nimmerichter, Alfred; Novak, Nina; Triska, Christoph; Prinz, Bernhard; Breese, Brynmor C
2017-03-01
Nimmerichter, A, Novak, N, Triska, C, Prinz, B, and Breese, BC. Validity of treadmill-derived critical speed on predicting 5,000-meter track-running performance. J Strength Cond Res 31(3): 706-714, 2017-To evaluate 3 models of critical speed (CS) for the prediction of 5,000-m running performance, 16 trained athletes completed an incremental test on a treadmill to determine maximal aerobic speed (MAS) and 3 randomly ordered runs to exhaustion at the [INCREMENT]70% intensity, at 110% and 98% of MAS. Critical speed and the distance covered above CS (D') were calculated using the hyperbolic speed-time (HYP), the linear distance-time (LIN), and the linear speed inverse-time model (INV). Five thousand meter performance was determined on a 400-m running track. Individual predictions of 5,000-m running time (t = [5,000-D']/CS) and speed (s = D'/t + CS) were calculated across the 3 models in addition to multiple regression analyses. Prediction accuracy was assessed with the standard error of estimate (SEE) from linear regression analysis and the mean difference expressed in units of measurement and coefficient of variation (%). Five thousand meter running performance (speed: 4.29 ± 0.39 m·s; time: 1,176 ± 117 seconds) was significantly better than the predictions from all 3 models (p < 0.0001). The mean difference was 65-105 seconds (5.7-9.4%) for time and -0.22 to -0.34 m·s (-5.0 to -7.5%) for speed. Predictions from multiple regression analyses with CS and D' as predictor variables were not significantly different from actual running performance (-1.0 to 1.1%). The SEE across all models and predictions was approximately 65 seconds or 0.20 m·s and is therefore considered as moderate. The results of this study have shown the importance of aerobic and anaerobic energy system contribution to predict 5,000-m running performance. Using estimates of CS and D' is valuable for predicting performance over race distances of 5,000 m.
Validity of VO(2 max) in predicting blood volume: implications for the effect of fitness on aging
NASA Technical Reports Server (NTRS)
Convertino, V. A.; Ludwig, D. A.
2000-01-01
A multiple regression model was constructed to investigate the premise that blood volume (BV) could be predicted using several anthropometric variables, age, and maximal oxygen uptake (VO(2 max)). To test this hypothesis, age, calculated body surface area (height/weight composite), percent body fat (hydrostatic weight), and VO(2 max) were regressed on to BV using data obtained from 66 normal healthy men. Results from the evaluation of the full model indicated that the most parsimonious result was obtained when age and VO(2 max) were regressed on BV expressed per kilogram body weight. The full model accounted for 52% of the total variance in BV per kilogram body weight. Both age and VO(2 max) were related to BV in the positive direction. Percent body fat contributed <1% to the explained variance in BV when expressed in absolute BV (ml) or as BV per kilogram body weight. When the model was cross validated on 41 new subjects and BV per kilogram body weight was reexpressed as raw BV, the results indicated that the statistical model would be stable under cross validation (e.g., predictive applications) with an accuracy of +/- 1,200 ml at 95% confidence. Our results support the hypothesis that BV is an increasing function of aerobic fitness and to a lesser extent the age of the subject. The results may have implication as to a mechanism by which aerobic fitness and activity may be protective against reduced BV associated with aging.
Fridman, M; Hodgkins, P S; Kahle, J S; Erder, M H
2015-06-01
There are few approved therapies for adults with attention-deficit/hyperactivity disorder (ADHD) in Europe. Lisdexamfetamine (LDX) is an effective treatment for ADHD; however, no clinical trials examining the efficacy of LDX specifically in European adults have been conducted. Therefore, to estimate the efficacy of LDX in European adults we performed a meta-regression of existing clinical data. A systematic review identified US- and Europe-based randomized efficacy trials of LDX, atomoxetine (ATX), or osmotic-release oral system methylphenidate (OROS-MPH) in children/adolescents and adults. A meta-regression model was then fitted to the published/calculated effect sizes (Cohen's d) using medication, geographical location, and age group as predictors. The LDX effect size in European adults was extrapolated from the fitted model. Sensitivity analyses performed included using adult-only studies and adding studies with placebo designs other than a standard pill-placebo design. Twenty-two of 2832 identified articles met inclusion criteria. The model-estimated effect size of LDX for European adults was 1.070 (95% confidence interval: 0.738, 1.401), larger than the 0.8 threshold for large effect sizes. The overall model fit was adequate (80%) and stable in the sensitivity analyses. This model predicts that LDX may have a large treatment effect size in European adults with ADHD. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Batterham, Philip J; Bunce, David; Mackinnon, Andrew J; Christensen, Helen
2014-01-01
very few studies have examined the association between intra-individual reaction time variability and subsequent mortality. Furthermore, the ability of simple measures of variability to predict mortality has not been compared with more complex measures. a prospective cohort study of 896 community-based Australian adults aged 70+ were interviewed up to four times from 1990 to 2002, with vital status assessed until June 2007. From this cohort, 770-790 participants were included in Cox proportional hazards regression models of survival. Vital status and time in study were used to conduct survival analyses. The mean reaction time and three measures of intra-individual reaction time variability were calculated separately across 20 trials of simple and choice reaction time tasks. Models were adjusted for a range of demographic, physical health and mental health measures. greater intra-individual simple reaction time variability, as assessed by the raw standard deviation (raw SD), coefficient of variation (CV) or the intra-individual standard deviation (ISD), was strongly associated with an increased hazard of all-cause mortality in adjusted Cox regression models. The mean reaction time had no significant association with mortality. intra-individual variability in simple reaction time appears to have a robust association with mortality over 17 years. Health professionals such as neuropsychologists may benefit in their detection of neuropathology by supplementing neuropsychiatric testing with the straightforward process of testing simple reaction time and calculating raw SD or CV.
Phobic Anxiety and Plasma Levels of Global Oxidative Stress in Women
Hagan, Kaitlin A.; Wu, Tianying; Rimm, Eric B.; Eliassen, A. Heather; Okereke, Olivia I.
2015-01-01
Background and Objectives Psychological distress has been hypothesized to be associated with adverse biologic states such as higher oxidative stress and inflammation. Yet, little is known about associations between a common form of distress – phobic anxiety – and global oxidative stress. Thus, we related phobic anxiety to plasma fluorescent oxidation products (FlOPs), a global oxidative stress marker. Methods We conducted a cross-sectional analysis among 1,325 women (aged 43-70 years) from the Nurses’ Health Study. Phobic anxiety was measured using the Crown-Crisp Index (CCI). Adjusted least-squares mean log-transformed FlOPs were calculated across phobic categories. Logistic regression models were used to calculate odds ratios (OR) comparing the highest CCI category (≥6 points) vs. lower scores, across FlOPs quartiles. Results No association was found between phobic anxiety categories and mean FlOP levels in multivariable adjusted linear models. Similarly, in multivariable logistic regression models there were no associations between FlOPs quartiles and likelihood of being in the highest phobic category. Comparing women in the highest vs. lowest FlOPs quartiles: FlOP_360: OR=0.68 (95% CI: 0.40-1.15); FlOP_320: OR=0.99 (95% CI: 0.61-1.61); FlOP_400: OR=0.92 (95% CI: 0.52, 1.63). Conclusions No cross-sectional association was found between phobic anxiety and a plasma measure of global oxidative stress in this sample of middle-aged and older women. PMID:26635425
Hwang, Bosun; Han, Jonghee; Choi, Jong Min; Park, Kwang Suk
2008-11-01
The purpose of this study was to develop an unobtrusive energy expenditure (EE) measurement system using an infrared (IR) sensor-based activity monitoring system to measure indoor activities and to estimate individual quantitative EE. IR-sensor activation counts were measured with a Bluetooth-based monitoring system and the standard EE was calculated using an established regression equation. Ten male subjects participated in the experiment and three different EE measurement systems (gas analyzer, accelerometer, IR sensor) were used simultaneously in order to determine the regression equation and evaluate the performance. As a standard measurement, oxygen consumption was simultaneously measured by a portable metabolic system (Metamax 3X, Cortex, Germany). A single room experiment was performed to develop a regression model of the standard EE measurement from the proposed IR sensor-based measurement system. In addition, correlation and regression analyses were done to compare the performance of the IR system with that of the Actigraph system. We determined that our proposed IR-based EE measurement system shows a similar correlation to the Actigraph system with the standard measurement system.
Franesqui, Miguel A; Yepes, Jorge; García-González, Cándida
2017-08-01
This article outlines the ultrasound data employed to calibrate in the laboratory an analytical model that permits the calculation of the depth of partial-depth surface-initiated cracks on bituminous pavements using this non-destructive technique. This initial calibration is required so that the model provides sufficient precision during practical application. The ultrasonic pulse transit times were measured on beam samples of different asphalt mixtures (semi-dense asphalt concrete AC-S; asphalt concrete for very thin layers BBTM; and porous asphalt PA). The cracks on the laboratory samples were simulated by means of notches of variable depths. With the data of ultrasound transmission time ratios, curve-fittings were carried out on the analytical model, thus determining the regression parameters and their statistical dispersion. The calibrated models obtained from laboratory datasets were subsequently applied to auscultate the evolution of the crack depth after microwaves exposure in the research article entitled "Top-down cracking self-healing of asphalt pavements with steel filler from industrial waste applying microwaves" (Franesqui et al., 2017) [1].
Approximate Dynamic Programming Algorithms for United States Air Force Officer Sustainment
2015-03-26
level of correction needed. While paying bonuses has an easily calculable cost, RIFs have more subtle costs. Mone (1994) discovered that in a steady...a regression is performed utilizing instrumental variables to minimize Bellman error. This algorithm uses a set of basis functions to approximate the...transitioned to an all-volunteer force. Charnes et al. (1972) utilize a goal programming model for General Schedule civilian manpower management in the
Content Coding of Psychotherapy Transcripts Using Labeled Topic Models.
Gaut, Garren; Steyvers, Mark; Imel, Zac E; Atkins, David C; Smyth, Padhraic
2017-03-01
Psychotherapy represents a broad class of medical interventions received by millions of patients each year. Unlike most medical treatments, its primary mechanisms are linguistic; i.e., the treatment relies directly on a conversation between a patient and provider. However, the evaluation of patient-provider conversation suffers from critical shortcomings, including intensive labor requirements, coder error, nonstandardized coding systems, and inability to scale up to larger data sets. To overcome these shortcomings, psychotherapy analysis needs a reliable and scalable method for summarizing the content of treatment encounters. We used a publicly available psychotherapy corpus from Alexander Street press comprising a large collection of transcripts of patient-provider conversations to compare coding performance for two machine learning methods. We used the labeled latent Dirichlet allocation (L-LDA) model to learn associations between text and codes, to predict codes in psychotherapy sessions, and to localize specific passages of within-session text representative of a session code. We compared the L-LDA model to a baseline lasso regression model using predictive accuracy and model generalizability (measured by calculating the area under the curve (AUC) from the receiver operating characteristic curve). The L-LDA model outperforms the lasso logistic regression model at predicting session-level codes with average AUC scores of 0.79, and 0.70, respectively. For fine-grained level coding, L-LDA and logistic regression are able to identify specific talk-turns representative of symptom codes. However, model performance for talk-turn identification is not yet as reliable as human coders. We conclude that the L-LDA model has the potential to be an objective, scalable method for accurate automated coding of psychotherapy sessions that perform better than comparable discriminative methods at session-level coding and can also predict fine-grained codes.
Content Coding of Psychotherapy Transcripts Using Labeled Topic Models
Gaut, Garren; Steyvers, Mark; Imel, Zac E; Atkins, David C; Smyth, Padhraic
2016-01-01
Psychotherapy represents a broad class of medical interventions received by millions of patients each year. Unlike most medical treatments, its primary mechanisms are linguistic; i.e., the treatment relies directly on a conversation between a patient and provider. However, the evaluation of patient-provider conversation suffers from critical shortcomings, including intensive labor requirements, coder error, non-standardized coding systems, and inability to scale up to larger data sets. To overcome these shortcomings, psychotherapy analysis needs a reliable and scalable method for summarizing the content of treatment encounters. We used a publicly-available psychotherapy corpus from Alexander Street press comprising a large collection of transcripts of patient-provider conversations to compare coding performance for two machine learning methods. We used the Labeled Latent Dirichlet Allocation (L-LDA) model to learn associations between text and codes, to predict codes in psychotherapy sessions, and to localize specific passages of within-session text representative of a session code. We compared the L-LDA model to a baseline lasso regression model using predictive accuracy and model generalizability (measured by calculating the area under the curve (AUC) from the receiver operating characteristic (ROC) curve). The L-LDA model outperforms the lasso logistic regression model at predicting session-level codes with average AUC scores of .79, and .70, respectively. For fine-grained level coding, L-LDA and logistic regression are able to identify specific talk-turns representative of symptom codes. However, model performance for talk-turn identification is not yet as reliable as human coders. We conclude that the L-LDA model has the potential to be an objective, scaleable method for accurate automated coding of psychotherapy sessions that performs better than comparable discriminative methods at session-level coding and can also predict fine-grained codes. PMID:26625437
NASA Astrophysics Data System (ADS)
Aulenbach, B. T.; Burns, D. A.; Shanley, J. B.; Yanai, R. D.; Bae, K.; Wild, A.; Yang, Y.; Dong, Y.
2013-12-01
There are many sources of uncertainty in estimates of streamwater solute flux. Flux is the product of discharge and concentration (summed over time), each of which has measurement uncertainty of its own. Discharge can be measured almost continuously, but concentrations are usually determined from discrete samples, which increases uncertainty dependent on sampling frequency and how concentrations are assigned for the periods between samples. Gaps between samples can be estimated by linear interpolation or by models that that use the relations between concentration and continuously measured or known variables such as discharge, season, temperature, and time. For this project, developed in cooperation with QUEST (Quantifying Uncertainty in Ecosystem Studies), we evaluated uncertainty for three flux estimation methods and three different sampling frequencies (monthly, weekly, and weekly plus event). The constituents investigated were dissolved NO3, Si, SO4, and dissolved organic carbon (DOC), solutes whose concentration dynamics exhibit strongly contrasting behavior. The evaluation was completed for a 10-year period at five small, forested watersheds in Georgia, New Hampshire, New York, Puerto Rico, and Vermont. Concentration regression models were developed for each solute at each of the three sampling frequencies for all five watersheds. Fluxes were then calculated using (1) a linear interpolation approach, (2) a regression-model method, and (3) the composite method - which combines the regression-model method for estimating concentrations and the linear interpolation method for correcting model residuals to the observed sample concentrations. We considered the best estimates of flux to be derived using the composite method at the highest sampling frequencies. We also evaluated the importance of sampling frequency and estimation method on flux estimate uncertainty; flux uncertainty was dependent on the variability characteristics of each solute and varied for different reporting periods (e.g. 10-year, study period vs. annually vs. monthly). The usefulness of the two regression model based flux estimation approaches was dependent upon the amount of variance in concentrations the regression models could explain. Our results can guide the development of optimal sampling strategies by weighing sampling frequency with improvements in uncertainty in stream flux estimates for solutes with particular characteristics of variability. The appropriate flux estimation method is dependent on a combination of sampling frequency and the strength of concentration regression models. Sites: Biscuit Brook (Frost Valley, NY), Hubbard Brook Experimental Forest and LTER (West Thornton, NH), Luquillo Experimental Forest and LTER (Luquillo, Puerto Rico), Panola Mountain (Stockbridge, GA), Sleepers River Research Watershed (Danville, VT)
Zhang, Fang; Wagner, Anita K; Soumerai, Stephen B; Ross-Degnan, Dennis
2009-02-01
Interrupted time series (ITS) is a strong quasi-experimental research design, which is increasingly applied to estimate the effects of health services and policy interventions. We describe and illustrate two methods for estimating confidence intervals (CIs) around absolute and relative changes in outcomes calculated from segmented regression parameter estimates. We used multivariate delta and bootstrapping methods (BMs) to construct CIs around relative changes in level and trend, and around absolute changes in outcome based on segmented linear regression analyses of time series data corrected for autocorrelated errors. Using previously published time series data, we estimated CIs around the effect of prescription alerts for interacting medications with warfarin on the rate of prescriptions per 10,000 warfarin users per month. Both the multivariate delta method (MDM) and the BM produced similar results. BM is preferred for calculating CIs of relative changes in outcomes of time series studies, because it does not require large sample sizes when parameter estimates are obtained correctly from the model. Caution is needed when sample size is small.
The effect of clouds on the earth's radiation budget
NASA Technical Reports Server (NTRS)
Ziskin, Daniel; Strobel, Darrell F.
1991-01-01
The radiative fluxes from the Earth Radiation Budget Experiment (ERBE) and the cloud properties from the International Satellite Cloud Climatology Project (ISCCP) over Indonesia for the months of June and July of 1985 and 1986 were analyzed to determine the cloud sensitivity coefficients. The method involved a linear least squares regression between co-incident flux and cloud coverage measurements. The calculated slope is identified as the cloud sensitivity. It was found that the correlations between the total cloud fraction and radiation parameters were modest. However, correlations between cloud fraction and IR flux were improved by separating clouds by height. Likewise, correlations between the visible flux and cloud fractions were improved by distinguishing clouds based on optical depth. Calculating correlations between the net fluxes and either height or optical depth segregated cloud fractions were somewhat improved. When clouds were classified in terms of their height and optical depth, correlations among all the radiation components were improved. Mean cloud sensitivities based on the regression of radiative fluxes against height and optical depth separated cloud types are presented. Results are compared to a one-dimensional radiation model with a simple cloud parameterization scheme.
Likhvantseva, V G; Sokolov, V A; Levanova, O N; Kovelenova, I V
2018-01-01
Prediction of the clinical course of primary open-angle glaucoma (POAG) is one of the main directions in solving the problem of vision loss prevention and stabilization of the pathological process. Simple statistical methods of correlation analysis show the extent of each risk factor's impact, but do not indicate the total impact of these factors in personalized combinations. The relationships between the risk factors is subject to correlation and regression analysis. The regression equation represents the dependence of the mathematical expectation of the resulting sign on the combination of factor signs. To develop a technique for predicting the probability of development and progression of primary open-angle glaucoma based on a personalized combination of risk factors by linear multivariate regression analysis. The study included 66 patients (23 female and 43 male; 132 eyes) with newly diagnosed primary open-angle glaucoma. The control group consisted of 14 patients (8 male and 6 female). Standard ophthalmic examination was supplemented with biochemical study of lacrimal fluid. Concentration of matrix metalloproteinase MMP-2 and MMP-9 in tear fluid in both eyes was determined using 'sandwich' enzyme-linked immunosorbent assay (ELISA) method. The study resulted in the development of regression equations and step-by-step multivariate logistic models that can help calculate the risk of development and progression of POAG. Those models are based on expert evaluation of clinical and instrumental indicators of hydrodynamic disturbances (coefficient of outflow ease - C, volume of intraocular fluid secretion - F, fluctuation of intraocular pressure), as well as personalized morphometric parameters of the retina (central retinal thickness in the macular area) and concentration of MMP-2 and MMP-9 in the tear film. The newly developed regression equations are highly informative and can be a reliable tool for studying of the influence vector and assessment of pathogenic potential of the independent risk factors in specific personalized combinations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dominick, F.; Lockwood, R.A.
1986-07-01
The US Army Aviation Engineering Flight Activity conducted an evaluation of Flight Management Calculator for the UH-1H. The calculator was a Hewlett-Packard HP-41CV. The performance calculator was evaluated for flight planning and in-flight use during 14 mission flights simulating operational conditions. The calculator was much easier to use in-flight than the operator's manual data. The calculator program needs improvement in the areas of pre-flight planning and execution speed. The mission flights demonstrated a 19% fuel saving using optimum over normal flight profiles in warm temperatures (15/sup 0/C above standard). Savings would be greater at colder temperatures because of increasing compressibilitymore » effects. Acceptable accuracy for individual aircraft under operational conditions may require a regressive analog model in which individual aircraft data are used to update the program. The performance data base for the UH-1H was expanded with level flight and hover data to thrust coefficients and Mach numbers to the practical limits of aircraft operation.« less
Activity-based differentiation of pathologists' workload in surgical pathology.
Meijer, G A; Oudejans, J J; Koevoets, J J M; Meijer, C J L M
2009-06-01
Adequate budget control in pathology practice requires accurate allocation of resources. Any changes in types and numbers of specimens handled or protocols used will directly affect the pathologists' workload and consequently the allocation of resources. The aim of the present study was to develop a model for measuring the pathologists' workload that can take into account the changes mentioned above. The diagnostic process was analyzed and broken up into separate activities. The time needed to perform these activities was measured. Based on linear regression analysis, for each activity, the time needed was calculated as a function of the number of slides or blocks involved. The total pathologists' time required for a range of specimens was calculated based on standard protocols and validated by comparing to actually measured workload. Cutting up, microscopic procedures and dictating turned out to be highly correlated to number of blocks and/or slides per specimen. Calculated workload per type of specimen was significantly correlated to the actually measured workload. Modeling pathologists' workload based on formulas that calculate workload per type of specimen as a function of the number of blocks and slides provides a basis for a comprehensive, yet flexible, activity-based costing system for pathology.
NASA Astrophysics Data System (ADS)
Vieira, Daniel; Krems, Roman
2017-04-01
Fine-structure transitions in collisions of O(3Pj) with atomic hydrogen are an important cooling mechanism in the interstellar medium; knowledge of the rate coefficients for these transitions has a wide range of astrophysical applications. The accuracy of the theoretical calculation is limited by inaccuracy in the ab initio interaction potentials used in the coupled-channel quantum scattering calculations from which the rate coefficients can be obtained. In this work we use the latest ab initio results for the O(3Pj) + H interaction potentials to improve on previous calculations of the rate coefficients. We further present a machine-learning technique based on Gaussian Process regression to determine the sensitivity of the rate coefficients to variations of the underlying adiabatic interaction potentials. To account for the inaccuracy inherent in the ab initio calculations we compute error bars for the rate coefficients corresponding to 20% variation in each of the interaction potentials. We obtain these error bars by fitting a Gaussian Process model to a data set of potential curves and rate constants. We use the fitted model to do sensitivity analysis, determining the relative importance of individual adiabatic potential curves to a given fine-structure transition. NSERC.
Price Analysis of Railway Freight Transport under Marketing Mechanism
NASA Astrophysics Data System (ADS)
Shi, Ying; Fang, Xiaoping; Chen, Zhiya
Regarding the problems in the reform of the railway tariff system and the pricing of the transport, by means of assaying the influence of the price elasticity on the artifice used for price, this article proposed multiple regressive model which analyzed price elasticity quantitatively. This model conclude multi-factors which influences on the price elasticity, such as the averagely railway freight charge, the averagely freight haulage of proximate supersede transportation mode, the GDP per capita in the point of origin, and a series of dummy variable which can reflect the features of some productive and consume demesne. It can calculate the price elasticity of different classes in different domains, and predict the freight traffic volume on different rate levels. It can calculate confidence-level, and evaluate the relevance of each parameter to get rid of irrelevant or little relevant variables. It supplied a good theoretical basis for directing the pricing of transport enterprises in market economic conditions, which is suitable for railway freight, passenger traffic and other transportation manner as well. SPSS (Statistical Package for the Social Science) software was used to calculate and analysis the example. This article realized the calculation by HYFX system(Ministry of Railways fund).
Washington, Simon; Haque, Md Mazharul; Oh, Jutaek; Lee, Dongmin
2014-05-01
Hot spot identification (HSID) aims to identify potential sites-roadway segments, intersections, crosswalks, interchanges, ramps, etc.-with disproportionately high crash risk relative to similar sites. An inefficient HSID methodology might result in either identifying a safe site as high risk (false positive) or a high risk site as safe (false negative), and consequently lead to the misuse the available public funds, to poor investment decisions, and to inefficient risk management practice. Current HSID methods suffer from issues like underreporting of minor injury and property damage only (PDO) crashes, challenges of accounting for crash severity into the methodology, and selection of a proper safety performance function to model crash data that is often heavily skewed by a preponderance of zeros. Addressing these challenges, this paper proposes a combination of a PDO equivalency calculation and quantile regression technique to identify hot spots in a transportation network. In particular, issues related to underreporting and crash severity are tackled by incorporating equivalent PDO crashes, whilst the concerns related to the non-count nature of equivalent PDO crashes and the skewness of crash data are addressed by the non-parametric quantile regression technique. The proposed method identifies covariate effects on various quantiles of a population, rather than the population mean like most methods in practice, which more closely corresponds with how black spots are identified in practice. The proposed methodology is illustrated using rural road segment data from Korea and compared against the traditional EB method with negative binomial regression. Application of a quantile regression model on equivalent PDO crashes enables identification of a set of high-risk sites that reflect the true safety costs to the society, simultaneously reduces the influence of under-reported PDO and minor injury crashes, and overcomes the limitation of traditional NB model in dealing with preponderance of zeros problem or right skewed dataset. Copyright © 2014 Elsevier Ltd. All rights reserved.
Inferring gene regression networks with model trees
2010-01-01
Background Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful in order to determine whether two genes have a strong global similarity but do not detect local similarities. Results We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph from all the relationships among output and input genes is built taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: Saccharomyces Cerevisiae and E.coli data set. First, the biological coherence of the results are tested. Second the E.coli transcriptional network (in the Regulon database) is used as control to compare the results to that of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear regressions to separate areas of the search space favoring to infer localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. PMID:20950452
Malignant testicular tumour incidence and mortality trends
Wojtyła-Buciora, Paulina; Więckowska, Barbara; Krzywinska-Wiewiorowska, Małgorzata; Gromadecka-Sutkiewicz, Małgorzata
2016-01-01
Aim of the study In Poland testicular tumours are the most frequent cancer among men aged 20–44 years. Testicular tumour incidence since the 1980s and 1990s has been diversified geographically, with an increased risk of mortality in Wielkopolska Province, which was highlighted at the turn of the 1980s and 1990s. The aim of the study was the comparative analysis of the tendencies in incidence and death rates due to malignant testicular tumours observed among men in Poland and in Wielkopolska Province. Material and methods Data from the National Cancer Registry were used for calculations. The incidence/mortality rates among men due to malignant testicular cancer as well as the tendencies in incidence/death ratio observed in Poland and Wielkopolska were established based on regression equation. The analysis was deepened by adopting the multiple linear regression model. A p-value < 0.05 was arbitrarily adopted as the criterion of statistical significance, and for multiple comparisons it was modified according to the Bonferroni adjustment to a value of p < 0.0028. Calculations were performed with the use of PQStat v1.4.8 package. Results The incidence of malignant testicular neoplasms observed among men in Poland and in Wielkopolska Province indicated a significant rising tendency. The multiple linear regression model confirmed that the year variable is a strong incidence forecast factor only within the territory of Poland. A corresponding analysis of mortality rates among men in Poland and in Wielkopolska Province did not show any statistically significant correlations. Conclusions Late diagnosis of Polish patients calls for undertaking appropriate educational activities that would facilitate earlier reporting of the patients, thus increasing their chances for recovery. Introducing preventive examinations in the regions of increased risk of testicular tumour may allow earlier diagnosis. PMID:27095941
Timmermans, Luc; Falez, Freddy; Mélot, Christian; Wespes, Eric
2013-09-01
A urinary incontinence impairment rating must be a highly accurate, non-invasive exploration of the condition using International Classification of Functioning (ICF)-based assessment tools. The objective of this study was to identify the best evaluation test and to determine an impairment rating model of urinary incontinence. In performing a cross-sectional study comparing successive urodynamic tests using both the International Consultation on Incontinence Questionnaire-Urinary Incontinence-Short Form (ICIQ-UI-SF) and the 1-hr pad-weighing test in 120 patients, we performed statistical likelihood ratio analysis and used logistic regression to calculate the probability of urodynamic incontinence using the most significant independent predictors. Subsequently, we created a template that was based on the significant predictors and the probability of urodynamic incontinence. The mean ICIQ-UI-SF score was 13.5 ± 4.6, and the median pad test value was 8 g. The discrimination statistic (receiver operating characteristic) described how well the urodynamic observations matched the ICIQ-UI-SF scores (under curve area (UDA):0.689) and the pad test data (UDA: 0.693). Using logistic regression analysis, we demonstrated that the best independent predictors of urodynamic incontinence were the patient's age and the ICIQ-UI-SF score. The logistic regression model permitted us to construct an equation to determine the probability of urodynamic incontinence. Using these tools, we created a template to generate a probability index of urodynamic urinary incontinence. Using this probability index, relative to the patient and to the maximum impairment of the whole person (MIWP) relative to urinary incontinence, we were able to calculate a patient's permanent impairment. Copyright © 2012 Wiley Periodicals, Inc.
Villarrasa-Sapiña, Israel; Serra-Añó, Pilar; Pardo-Ibáñez, Alberto; Gonzalez, Luis-Millán; García-Massó, Xavier
2017-01-01
Obesity is now a serious worldwide challenge, especially in children. This condition can cause a number of different health problems, including musculoskeletal disorders, some of which are due to mechanical stress caused by excess body weight. The aim of this study was to determine the association between body composition and the vertical ground reaction force produced during walking in obese children. Sixteen children participated in the study, six females and ten males [11.5 (1.2) years old, 69.8 (15.5) kg, 1.56 (0.09) m, and 28.36 (3.74) kg/m 2 of body mass index (BMI)]. Total weight, lean mass and fat mass were measured by dual-energy X-ray absorptiometry and vertical forces while walking were obtained by a force platform. The vertical force variables analysed were impact and propulsive forces, and the rate of development of both. Multiple regression models for each vertical force parameter were calculated using the body composition variables as input. The impact force regression model was found to be positively related to the weight of obese children and negatively related to lean mass. The regression model showed lean mass was positively related to the propulsive rate. Finally, regression models for impact and propulsive force showed a direct relationship with body weight. Impact force is positively related to the weight of obese children, but lean mass helps to reduce the impact force in this population. Exercise could help obese persons to reduce their total body weight and increase their lean mass, thus reducing impact forces during sports and other activities. Copyright © 2016 Elsevier Ltd. All rights reserved.
Predictive ability of a comprehensive incremental test in mountain bike marathon
Schneeweiss, Patrick; Martus, Peter; Niess, Andreas M; Krauss, Inga
2018-01-01
Objectives Traditional performance tests in mountain bike marathon (XCM) primarily quantify aerobic metabolism and may not describe the relevant capacities in XCM. We aimed to validate a comprehensive test protocol quantifying its intermittent demands. Methods Forty-nine athletes (38.8±9.1 years; 38 male; 11 female) performed a laboratory performance test, including an incremental test, to determine individual anaerobic threshold (IAT), peak power output (PPO) and three maximal efforts (10 s all-out sprint, 1 min maximal effort and 5 min maximal effort). Within 2 weeks, the athletes participated in one of three XCM races (n=15, n=9 and n=25). Correlations between test variables and race times were calculated separately. In addition, multiple regression models of the predictive value of laboratory outcomes were calculated for race 3 and across all races (z-transformed data). Results All variables were correlated with race times 1, 2 and 3: 10 s all-out sprint (r=−0.72; r=−0.59; r=−0.61), 1 min maximal effort (r=−0.85; r=−0.84; r=−0.82), 5 min maximal effort (r=−0.57; r=−0.85; r=−0.76), PPO (r=−0.77; r=−0.73; r=−0.76) and IAT (r=−0.71; r=−0.67; r=−0.68). The best-fitting multiple regression models for race 3 (r2=0.868) and across all races (r2=0.757) comprised 1 min maximal effort, IAT and body weight. Conclusion Aerobic and intermittent variables correlated least strongly with race times. Their use in a multiple regression model confirmed additional explanatory power to predict XCM performance. These findings underline the usefulness of the comprehensive incremental test to predict performance in that sport more precisely. PMID:29387445
Estimation of average annual streamflows and power potentials for Alaska and Hawaii
DOE Office of Scientific and Technical Information (OSTI.GOV)
Verdin, Kristine L.
2004-05-01
This paper describes the work done to develop average annual streamflow estimates and power potential for the states of Alaska and Hawaii. The Elevation Derivatives for National Applications (EDNA) database was used, along with climatic datasets, to develop flow and power estimates for every stream reach in the EDNA database. Estimates of average annual streamflows were derived using state-specific regression equations, which were functions of average annual precipitation, precipitation intensity, drainage area, and other elevation-derived parameters. Power potential was calculated through the use of the average annual streamflow and the hydraulic head of each reach, which is calculated from themore » EDNA digital elevation model. In all, estimates of streamflow and power potential were calculated for over 170,000 stream segments in the Alaskan and Hawaiian datasets.« less
Hermes, Ilarraza-Lomelí; Marianna, García-Saldivia; Jessica, Rojano-Castillo; Carlos, Barrera-Ramírez; Rafael, Chávez-Domínguez; María Dolores, Rius-Suárez; Pedro, Iturralde
2016-10-01
Mortality due to cardiovascular disease is often associated with ventricular arrhythmias. Nowadays, patients with cardiovascular disease are more encouraged to take part in physical training programs. Nevertheless, high-intensity exercise is associated to a higher risk for sudden death, even in apparently healthy people. During an exercise testing (ET), health care professionals provide patients, in a controlled scenario, an intense physiological stimulus that could precipitate cardiac arrhythmia in high risk individuals. There is still no clinical or statistical tool to predict this incidence. The aim of this study was to develop a statistical model to predict the incidence of exercise-induced potentially life-threatening ventricular arrhythmia (PLVA) during high intensity exercise. 6415 patients underwent a symptom-limited ET with a Balke ramp protocol. A multivariate logistic regression model where the primary outcome was PLVA was performed. Incidence of PLVA was 548 cases (8.5%). After a bivariate model, thirty one clinical or ergometric variables were statistically associated with PLVA and were included in the regression model. In the multivariate model, 13 of these variables were found to be statistically significant. A regression model (G) with a X(2) of 283.987 and a p<0.001, was constructed. Significant variables included: heart failure, antiarrhythmic drugs, myocardial lower-VD, age and use of digoxin, nitrates, among others. This study allows clinicians to identify patients at risk of ventricular tachycardia or couplets during exercise, and to take preventive measures or appropriate supervision. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Stata Modules for Calculating Novel Predictive Performance Indices for Logistic Models
Barkhordari, Mahnaz; Padyab, Mojgan; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza
2016-01-01
Background Prediction is a fundamental part of prevention of cardiovascular diseases (CVD). The development of prediction algorithms based on the multivariate regression models loomed several decades ago. Parallel with predictive models development, biomarker researches emerged in an impressively great scale. The key question is how best to assess and quantify the improvement in risk prediction offered by new biomarkers or more basically how to assess the performance of a risk prediction model. Discrimination, calibration, and added predictive value have been recently suggested to be used while comparing the predictive performances of the predictive models’ with and without novel biomarkers. Objectives Lack of user-friendly statistical software has restricted implementation of novel model assessment methods while examining novel biomarkers. We intended, thus, to develop a user-friendly software that could be used by researchers with few programming skills. Materials and Methods We have written a Stata command that is intended to help researchers obtain cut point-free and cut point-based net reclassification improvement index and (NRI) and relative and absolute Integrated discriminatory improvement index (IDI) for logistic-based regression analyses.We applied the commands to a real data on women participating the Tehran lipid and glucose study (TLGS) to examine if information of a family history of premature CVD, waist circumference, and fasting plasma glucose can improve predictive performance of the Framingham’s “general CVD risk” algorithm. Results The command is addpred for logistic regression models. Conclusions The Stata package provided herein can encourage the use of novel methods in examining predictive capacity of ever-emerging plethora of novel biomarkers. PMID:27279830
NASA Astrophysics Data System (ADS)
Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.
2007-07-01
Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
NASA Astrophysics Data System (ADS)
Faucci, Maria Teresa; Melani, Fabrizio; Mura, Paola
2002-06-01
Molecular modeling was used to investigate factors influencing complex formation between cyclodextrins and guest molecules and predict their stability through a theoretical model based on the search for a correlation between experimental stability constants ( Ks) and some theoretical parameters describing complexation (docking energy, host-guest contact surfaces, intermolecular interaction fields) calculated from complex structures at a minimum conformational energy, obtained through stochastic methods based on molecular dynamic simulations. Naproxen, ibuprofen, ketoprofen and ibuproxam were used as model drug molecules. Multiple Regression Analysis allowed identification of the significant factors for the complex stability. A mathematical model ( r=0.897) related log Ks with complex docking energy and lipophilic molecular fields of cyclodextrin and drug.
Ground temperature measurement by PRT-5 for maps experiment
NASA Technical Reports Server (NTRS)
Gupta, S. K.; Tiwari, S. N.
1978-01-01
A simple algorithm and computer program were developed for determining the actual surface temperature from the effective brightness temperature as measured remotely by a radiation thermometer called PRT-5. This procedure allows the computation of atmospheric correction to the effective brightness temperature without performing detailed radiative transfer calculations. Model radiative transfer calculations were performed to compute atmospheric corrections for several values of the surface and atmospheric parameters individually and in combination. Polynomial regressions were performed between the magnitudes or deviations of these parameters and the corresponding computed corrections to establish simple analytical relations between them. Analytical relations were also developed to represent combined correction for simultaneous variation of parameters in terms of their individual corrections.
Standardization of domestic frying processes by an engineering approach.
Franke, K; Strijowski, U
2011-05-01
An approach was developed to enable a better standardization of domestic frying of potato products. For this purpose, 5 domestic fryers differing in heating power and oil capacity were used. A very defined frying process using a highly standardized model product and a broad range of frying conditions was carried out in these fryers and the development of browning representing an important quality parameter was measured. Product-to-oil ratio, oil temperature, and frying time were varied. Quite different color changes were measured in the different fryers although the same frying process parameters were applied. The specific energy consumption for water evaporation (spECWE) during frying related to product amount was determined for all frying processes to define an engineering parameter for characterizing the frying process. A quasi-linear regression approach was applied to calculate this parameter from frying process settings and fryer properties. The high significance of the regression coefficients and a coefficient of determination close to unity confirmed the suitability of this approach. Based on this regression equation, curves for standard frying conditions (SFC curves) were calculated which describe the frying conditions required to obtain the same level of spECWE in the different domestic fryers. Comparison of browning results from the different fryers operated at conditions near the SFC curves confirmed the applicability of the approach. © 2011 Institute of Food Technologists®
Factors associated with developing a fear of falling in subjects with primary open-angle glaucoma.
Adachi, Sayaka; Yuki, Kenya; Awano-Tanabe, Sachiko; Ono, Takeshi; Shiba, Daisuke; Murata, Hiroshi; Asaoka, Ryo; Tsubota, Kazuo
2018-02-13
To investigate the relationship between clinical risk factors, including visual field (VF) defects and visual acuity, and a fear of falling, among patients with primary open-angle glaucoma (POAG). All participants answered the following question at a baseline ophthalmic examination: Are you afraid of falling? The same question was then answered every 12 months for 3 years. A binocular integrated visual field was calculated by merging a patient's monocular Humphrey field analyzer VFs, using the 'best sensitivity' method. The means of total deviation values in the whole, superior peripheral, superior central, inferior central, and inferior peripheral VFs were calculated. The relationship between these mean VF measurements, and various clinical factors, against patients' baseline fear of falling and future fear of falling was analyzed using multiple logistic regression. Among 392 POAG subjects, 342 patients (87.2%) responded to the fear of falling question at least twice in the 3 years study period. The optimal regression model for patients' baseline fear of falling included age, gender, mean of total deviation values in the inferior peripheral VF and number of previous falls. The optimal regression equation for future fear of falling included age, gender, mean of total deviation values in the inferior peripheral VF and number of previous falls. Defects in the inferior peripheral VF area are significantly related to the development of a fear of falling.
Characteristics and Impact of Imperviousness From a GIS-based Hydrological Perspective
NASA Astrophysics Data System (ADS)
Moglen, G. E.; Kim, S.
2005-12-01
With the concern that imperviousness can be differently quantified depending on data sources and methods, this study assessed imperviousness estimates using two different data sources: land use and land cover. Year 2000 land use developed by the Maryland Department of Planning was utilized to estimate imperviousness by assigning imperviousness coefficients to unique land use categories. These estimates were compared with imperviousness estimates based on satellite-derived land cover from the 2001 National Land Cover Dataset. Our study developed the relationships between these two estimates in the form of regression equations to convert imperviousness derived from one data source to the other. The regression equations are considered reliable, based on goodness-of-fit measures. Furthermore, this study examined how quantitatively different imperviousness estimates affect the prediction of hydrological response both in the flow regime and in the thermal regime. We assessed the relationships between indicators of hydrological response and imperviousness-descriptors. As indicators of flow variability, coefficient of variance, lag-one autocorrelation, and mean daily flow change were calculated based on measured mean daily stream flow from the water year 1997 to 2003. For thermal variability, indicators such as percent-days of surge, degree-day, and mean daily temperature difference were calculated base on measured stream temperature over several basins in Maryland. To describe imperviousness through the hydrological process, GIS-based spatially distributed hydrological models were developed based on a water-balance method and the SCS-CN method. Imperviousness estimates from land use and land cover were used as predictors in these models to examine the effect of imperviousness using different data sources on the prediction of hydrological response. Indicators of hydrological response were also regressed on aggregate imperviousness. This allowed for identifying if hydrological response is more sensitive to spatially distributed imperviousness or aggregate (lumped) imperviousness. The regressions between indicators of hydrological response and imperviousness-descriptors were evaluated by examining goodness-of-fit measures such as explained variance or relative standard error. The results show that imperviousness estimates using land use are better predictors of flow variability and thermal variability than imperviousness estimates using land cover. Also, this study reveals that flow variability is more sensitive to spatially distributed models than lumped models, while thermal variability is equally responsive to both models. The findings from this study can be further examined from a policy perspective with regard to policies that are based on a threshold concept for imperviousness impacts on the ecological and hydrological system.
Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru
2017-09-01
Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis had a higher degree of predictive accuracy of mFIM at discharge than that directly predicted. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
Liu, Ruijie Rachel; Erwin, William D
2006-08-01
An algorithm was developed to estimate noncircular orbit (NCO) single-photon emission computed tomography (SPECT) detector radius on a SPECT/CT imaging system using the CT images, for incorporation into collimator resolution modeling for iterative SPECT reconstruction. Simulated male abdominal (arms up), male head and neck (arms down) and female chest (arms down) anthropomorphic phantom, and ten patient, medium-energy SPECT/CT scans were acquired on a hybrid imaging system. The algorithm simulated inward SPECT detector radial motion and object contour detection at each projection angle, employing the calculated average CT image and a fixed Hounsfield unit (HU) threshold. Calculated radii were compared to the observed true radii, and optimal CT threshold values, corresponding to patient bed and clothing surfaces, were found to be between -970 and -950 HU. The algorithm was constrained by the 45 cm CT field-of-view (FOV), which limited the detected radii to < or = 22.5 cm and led to occasional radius underestimation in the case of object truncation by CT. Two methods incorporating the algorithm were implemented: physical model (PM) and best fit (BF). The PM method computed an offset that produced maximum overlap of calculated and true radii for the phantom scans, and applied that offset as a calculated-to-true radius transformation. For the BF method, the calculated-to-true radius transformation was based upon a linear regression between calculated and true radii. For the PM method, a fixed offset of +2.75 cm provided maximum calculated-to-true radius overlap for the phantom study, which accounted for the camera system's object contour detect sensor surface-to-detector face distance. For the BF method, a linear regression of true versus calculated radius from a reference patient scan was used as a calculated-to-true radius transform. Both methods were applied to ten patient scans. For -970 and -950 HU thresholds, the combined overall average root-mean-square (rms) error in radial position for eight patient scans without truncation were 3.37 cm (12.9%) for PM and 1.99 cm (8.6%) for BF, indicating BF is superior to PM in the absence of truncation. For two patient scans with truncation, the rms error was 3.24 cm (12.2%) for PM and 4.10 cm (18.2%) for BF. The slightly better performance of PM in the case of truncation is anomalous, due to FOV edge truncation artifacts in the CT reconstruction, and thus is suspect. The calculated NCO contour for a patient SPECT/CT scan was used with an iterative reconstruction algorithm that incorporated compensation for system resolution. The resulting image was qualitatively superior to the image obtained by reconstructing the data using the fixed radius stored by the scanner. The result was also superior to the image reconstructed using the iterative algorithm provided with the system, which does not incorporate resolution modeling. These results suggest that, under conditions of no or only mild lateral truncation of the CT scan, the algorithm is capable of providing radius estimates suitable for iterative SPECT reconstruction collimator geometric resolution modeling.
CATEGORICAL REGRESSION ANALYSIS OF ACUTE INHALATION TOXICITY DATA FOR HYDROGEN SULFIDE
Categorical regression is one of the tools offered by the U.S. EPA for derivation of acute reference exposures (AREs), which are dose-response assessments for acute exposures to inhaled chemicals. Categorical regression is used as a meta-analytical technique to calculate probabi...
Ließ, Mareike; Schmidt, Johannes; Glaser, Bruno
2016-01-01
Tropical forests are significant carbon sinks and their soils' carbon storage potential is immense. However, little is known about the soil organic carbon (SOC) stocks of tropical mountain areas whose complex soil-landscape and difficult accessibility pose a challenge to spatial analysis. The choice of methodology for spatial prediction is of high importance to improve the expected poor model results in case of low predictor-response correlations. Four aspects were considered to improve model performance in predicting SOC stocks of the organic layer of a tropical mountain forest landscape: Different spatial predictor settings, predictor selection strategies, various machine learning algorithms and model tuning. Five machine learning algorithms: random forests, artificial neural networks, multivariate adaptive regression splines, boosted regression trees and support vector machines were trained and tuned to predict SOC stocks from predictors derived from a digital elevation model and satellite image. Topographical predictors were calculated with a GIS search radius of 45 to 615 m. Finally, three predictor selection strategies were applied to the total set of 236 predictors. All machine learning algorithms-including the model tuning and predictor selection-were compared via five repetitions of a tenfold cross-validation. The boosted regression tree algorithm resulted in the overall best model. SOC stocks ranged between 0.2 to 17.7 kg m-2, displaying a huge variability with diffuse insolation and curvatures of different scale guiding the spatial pattern. Predictor selection and model tuning improved the models' predictive performance in all five machine learning algorithms. The rather low number of selected predictors favours forward compared to backward selection procedures. Choosing predictors due to their indiviual performance was vanquished by the two procedures which accounted for predictor interaction.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robertson, John M., E-mail: jrobertson@beaumont.ed; Soehn, Matthias; Yan Di
Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to developmore » a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p <0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.« less
Quantitative prediction of ionization effect on human skin permeability.
Baba, Hiromi; Ueno, Yusuke; Hashida, Mitsuru; Yamashita, Fumiyoshi
2017-04-30
Although skin permeability of an active ingredient can be severely affected by its ionization in a dose solution, most of the existing prediction models cannot predict such impacts. To provide reliable predictors, we curated a novel large dataset of in vitro human skin permeability coefficients for 322 entries comprising chemically diverse permeants whose ionization fractions can be calculated. Subsequently, we generated thousands of computational descriptors, including LogD (octanol-water distribution coefficient at a specific pH), and analyzed the dataset using nonlinear support vector regression (SVR) and Gaussian process regression (GPR) combined with greedy descriptor selection. The SVR model was slightly superior to the GPR model, with externally validated squared correlation coefficient, root mean square error, and mean absolute error values of 0.94, 0.29, and 0.21, respectively. These models indicate that Log D is effective for a comprehensive prediction of ionization effects on skin permeability. In addition, the proposed models satisfied the statistical criteria endorsed in recent model validation studies. These models can evaluate virtually generated compounds at any pH; therefore, they can be used for high-throughput evaluations of numerous active ingredients and optimization of their skin permeability with respect to permeant ionization. Copyright © 2017 Elsevier B.V. All rights reserved.
Factors associated with mouth breathing in children with -developmental -disabilities.
de Castilho, Lia Silva; Abreu, Mauro Henrique Nogueira Guimarães; de Oliveira, Renata Batista; Souza E Silva, Maria Elisa; Resende, Vera Lúcia Silva
2016-01-01
To investigate the prevalence and factors associated with mouth breathing among patients with developmental disabilities of a dental service. We analyzed 408 dental records. Mouth breathing was reported by the patients' parents and from direct observation. Other variables were as -follows: history of asthma, bronchitis, palate shape, pacifier use, thumb -sucking, nail biting, use of medications, gastroesophageal reflux, bruxism, gender, age, and diagnosis of the patient. Statistical analysis included descriptive analysis with ratio calculation and multiple logistic regression. Variables with p < 0.25 were included in the model to estimate the adjusted OR (95% CI), calculated by the forward stepwise method. Variables with p < 0.05 were kept in the model. Being male (p = 0.016) and use of centrally acting drugs (p = 0.001) were the variables that remained in the model. Among patients with -developmental disabilities, boys and psychotropic drug users had a greater chance of being mouth breathers. © 2016 Special Care Dentistry Association and Wiley Periodicals, Inc.
Ye, Jiang-Feng; Zhao, Yu-Xin; Ju, Jian; Wang, Wei
2017-10-01
To discuss the value of the Bedside Index for Severity in Acute Pancreatitis (BISAP), Modified Early Warning Score (MEWS), serum Ca2+, similarly hereinafter, and red cell distribution width (RDW) for predicting the severity grade of acute pancreatitis and to develop and verify a more accurate scoring system to predict the severity of AP. In 302 patients with AP, we calculated BISAP and MEWS scores and conducted regression analyses on the relationships of BISAP scoring, RDW, MEWS, and serum Ca2+ with the severity of AP using single-factor logistics. The variables with statistical significance in the single-factor logistic regression were used in a multi-factor logistic regression model; forward stepwise regression was used to screen variables and build a multi-factor prediction model. A receiver operating characteristic curve (ROC curve) was constructed, and the significance of multi- and single-factor prediction models in predicting the severity of AP using the area under the ROC curve (AUC) was evaluated. The internal validity of the model was verified through bootstrapping. Among 302 patients with AP, 209 had mild acute pancreatitis (MAP) and 93 had severe acute pancreatitis (SAP). According to single-factor logistic regression analysis, we found that BISAP, MEWS and serum Ca2+ are prediction indexes of the severity of AP (P-value<0.001), whereas RDW is not a prediction index of AP severity (P-value>0.05). The multi-factor logistic regression analysis showed that BISAP and serum Ca2+ are independent prediction indexes of AP severity (P-value<0.001), and MEWS is not an independent prediction index of AP severity (P-value>0.05); BISAP is negatively related to serum Ca2+ (r=-0.330, P-value<0.001). The constructed model is as follows: ln()=7.306+1.151*BISAP-4.516*serum Ca2+. The predictive ability of each model for SAP follows the order of the combined BISAP and serum Ca2+ prediction model>Ca2+>BISAP. There is no statistical significance for the predictive ability of BISAP and serum Ca2+ (P-value>0.05); however, there is remarkable statistical significance for the predictive ability using the newly built prediction model as well as BISAP and serum Ca2+ individually (P-value<0.01). Verification of the internal validity of the models by bootstrapping is favorable. BISAP and serum Ca2+ have high predictive value for the severity of AP. However, the model built by combining BISAP and serum Ca2+ is remarkably superior to those of BISAP and serum Ca2+ individually. Furthermore, this model is simple, practical and appropriate for clinical use. Copyright © 2016. Published by Elsevier Masson SAS.
Fischer, Thomas; Fischer, Susanne; Himmel, Wolfgang; Kochen, Michael M; Hummers-Pradier, Eva
2008-01-01
The influence of patient characteristics on family practitioners' (FPs') diagnostic decision making has mainly been investigated using indirect methods such as vignettes or questionnaires. Direct observation-borrowed from social and cultural anthropology-may be an alternative method for describing FPs' real-life behavior and may help in gaining insight into how FPs diagnose respiratory tract infections, which are frequent in primary care. To clarify FPs' diagnostic processes when treating patients suffering from symptoms of respiratory tract infection. This direct observation study was performed in 30 family practices using a checklist for patient complaints, history taking, physical examination, and diagnoses. The influence of patients' symptoms and complaints on the FPs' physical examination and diagnosis was calculated by logistic regression analyses. Dummy variables based on combinations of symptoms and complaints were constructed and tested against saturated (full) and backward regression models. In total, 273 patients (median age 37 years, 51% women) were included. The median number of symptoms described was 4 per patient, and most information was provided at the patients' own initiative. Multiple logistic regression analysis showed a strong association between patients' complaints and the physical examination. Frequent diagnoses were upper respiratory tract infection (URTI)/common cold (43%), bronchitis (26%), sinusitis (12%), and tonsillitis (11%). There were no significant statistical differences between "simple heuristic'' models and saturated regression models in the diagnoses of bronchitis, sinusitis, and tonsillitis, indicating that simple heuristics are probably used by the FPs, whereas "URTI/common cold'' was better explained by the full model. FPs tended to make their diagnosis based on a few patient symptoms and a limited physical examination. Simple heuristic models were almost as powerful in explaining most diagnoses as saturated models. Direct observation allowed for the study of decision making under real conditions, yielding both quantitative data and "qualitative'' information about the FPs' performance. It is important for investigators to be aware of the specific disadvantages of the method (e.g., a possible observer effect).
Austin, Peter C.; Stryhn, Henrik; Leckie, George; Merlo, Juan
2017-01-01
Multilevel data occur frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models. These models incorporate cluster‐specific random effects that allow one to partition the total variation in the outcome into between‐cluster variation and between‐individual variation. The magnitude of the effect of clustering provides a measure of the general contextual effect. When outcomes are binary or time‐to‐event in nature, the general contextual effect can be quantified by measures of heterogeneity like the median odds ratio or the median hazard ratio, respectively, which can be calculated from a multilevel regression model. Outcomes that are integer counts denoting the number of times that an event occurred are common in epidemiological and medical research. The median (incidence) rate ratio in multilevel Poisson regression for counts that corresponds to the median odds ratio or median hazard ratio for binary or time‐to‐event outcomes respectively is relatively unknown and is rarely used. The median rate ratio is the median relative change in the rate of the occurrence of the event when comparing identical subjects from 2 randomly selected different clusters that are ordered by rate. We also describe how the variance partition coefficient, which denotes the proportion of the variation in the outcome that is attributable to between‐cluster differences, can be computed with count outcomes. We illustrate the application and interpretation of these measures in a case study analyzing the rate of hospital readmission in patients discharged from hospital with a diagnosis of heart failure. PMID:29114926
Albuquerque, F S; Peso-Aguiar, M C; Assunção-Albuquerque, M J T; Gálvez, L
2009-08-01
The length-weight relationship and condition factor have been broadly investigated in snails to obtain the index of physical condition of populations and evaluate habitat quality. Herein, our goal was to describe the best predictors that explain Achatina fulica biometrical parameters and well being in a recently introduced population. From November 2001 to November 2002, monthly snail samples were collected in Lauro de Freitas City, Bahia, Brazil. Shell length and total weight were measured in the laboratory and the potential curve and condition factor were calculated. Five environmental variables were considered: temperature range, mean temperature, humidity, precipitation and human density. Multiple regressions were used to generate models including multiple predictors, via model selection approach, and then ranked with AIC criteria. Partial regressions were used to obtain the separated coefficients of determination of climate and human density models. A total of 1.460 individuals were collected, presenting a shell length range between 4.8 to 102.5 mm (mean: 42.18 mm). The relationship between total length and total weight revealed that Achatina fulica presented a negative allometric growth. Simple regression indicated that humidity has a significant influence on A. fulica total length and weight. Temperature range was the main variable that influenced the condition factor. Multiple regressions showed that climatic and human variables explain a small proportion of the variance in shell length and total weight, but may explain up to 55.7% of the condition factor variance. Consequently, we believe that the well being and biometric parameters of A. fulica can be influenced by climatic and human density factors.
Stevanović, Nikola R; Perušković, Danica S; Gašić, Uroš M; Antunović, Vesna R; Lolić, Aleksandar Đ; Baošić, Rada M
2017-03-01
The objectives of this study were to gain insights into structure-retention relationships and to propose the model to estimating their retention. Chromatographic investigation of series of 36 Schiff bases and their copper(II) and nickel(II) complexes was performed under both normal- and reverse-phase conditions. Chemical structures of the compounds were characterized by molecular descriptors which are calculated from the structure and related to the chromatographic retention parameters by multiple linear regression analysis. Effects of chelation on retention parameters of investigated compounds, under normal- and reverse-phase chromatographic conditions, were analyzed by principal component analysis, quantitative structure-retention relationship and quantitative structure-activity relationship models were developed on the basis of theoretical molecular descriptors, calculated exclusively from molecular structure, and parameters of retention and lipophilicity. Copyright © 2016 John Wiley & Sons, Ltd.
Multi-fidelity machine learning models for accurate bandgap predictions of solids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab
Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. In using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelities, we demonstrate the excellent learning performance of the method against actual high fidelitymore » quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.« less
Multi-fidelity machine learning models for accurate bandgap predictions of solids
Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab
2016-12-28
Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. In using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelities, we demonstrate the excellent learning performance of the method against actual high fidelitymore » quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.« less
Sleep Disruption Medical Intervention Forecasting (SDMIF) Module for the Integrated Medical Model
NASA Technical Reports Server (NTRS)
Lewandowski, Beth; Brooker, John; Mallis, Melissa; Hursh, Steve; Caldwell, Lynn; Myers, Jerry
2011-01-01
The NASA Integrated Medical Model (IMM) assesses the risk, including likelihood and impact of occurrence, of all credible in-flight medical conditions. Fatigue due to sleep disruption is a condition that could lead to operational errors, potentially resulting in loss of mission or crew. Pharmacological consumables are mitigation strategies used to manage the risks associated with sleep deficits. The likelihood of medical intervention due to sleep disruption was estimated with a well validated sleep model and a Monte Carlo computer simulation in an effort to optimize the quantity of consumables. METHODS: The key components of the model are the mission parameter program, the calculation of sleep intensity and the diagnosis and decision module. The mission parameter program was used to create simulated daily sleep/wake schedules for an ISS increment. The hypothetical schedules included critical events such as dockings and extravehicular activities and included actual sleep time and sleep quality. The schedules were used as inputs to the Sleep, Activity, Fatigue and Task Effectiveness (SAFTE) Model (IBR Inc., Baltimore MD), which calculated sleep intensity. Sleep data from an ISS study was used to relate calculated sleep intensity to the probability of sleep medication use, using a generalized linear model for binomial regression. A human yes/no decision process using a binomial random number was also factored into sleep medication use probability. RESULTS: These probability calculations were repeated 5000 times resulting in an estimate of the most likely amount of sleep aids used during an ISS mission and a 95% confidence interval. CONCLUSIONS: These results were transferred to the parent IMM for further weighting and integration with other medical conditions, to help inform operational decisions. This model is a potential planning tool for ensuring adequate sleep during sleep disrupted periods of a mission.
Interpreting Bivariate Regression Coefficients: Going beyond the Average
ERIC Educational Resources Information Center
Halcoussis, Dennis; Phillips, G. Michael
2010-01-01
Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…
Liu, S Z; Yu, L; Chen, Q; Quan, P L; Cao, X Q; Sun, X B
2017-05-06
Objective: To investigate the incidence and survival of esophageal cancer with different histological types and to understand the incidence trend and burden of esophageal cancer in Linzhou during 2003-2012. Methods: All incidence records of esophageal cancer and population reported were collected from Linzhou Cancer Registry during 2003-2012. Incidence rate was calculated using gender and histological types. Age standardized incidence rate was calculated according to world Segi's population and Chinese census data in 2000. Age standardized incidence rate by world population between 2003 and 2012 was analyzed with JoinPoint regression model and estimated annual percentage change (EAPC) was calculated. 5-year survival rate was calculated with Kaplan-Meier model. Results: There were 8 229 esophageal cancer cases in Linzhou during 2003-2012. The average annual incidence rate was 80.08/100 000 (8 229/10 276 481). Among all esophageal cancer cases, 7 019 (85.3%) were diagnosed as esophageal squamous cell carcinoma (ESCC). In Linzhou, the age standardized incidence rate by Chinese standard population and by world standard population was 80.92/100 000 and 81.85/100 000 in 2003, 67.97/100 000 and 68.63/100 000 in 2012. JoinPoint regression model showed that EAPC was-12.9% (95 %CI: -16.4%--9.1%) for other and unspecified histological type between 2003 and 2012. The EAPC was-5.5% (95 %CI: -9.2%--1.6%) for esophageal cancer between 2007 and 2012,-5.4% (95 %CI: -7.0%--3.9%) for esophageal cancer in female between 2006 and 2012,-4.9% (95 %CI: -9.5%--0.1%) for ESCC between 2007 and 2012. The 5-year prevalence of esophageal cancer was 215.49 per 100 000 (2 337/1 084 493), and 5 489 died within 5 years after incidence. 5-year survival rate of esophageal cancer was 34.6% (95 %CI: 33.5%-35.6%). Conclusion: Esophageal cancer had a decreasing trend in Linzhou. The survival rate was increasing. But, esophageal cancer was still a major burden in Linzhou. The major histological type was ESCC. ESCC had a similar decreasing trend with esophageal cancer.
A robust nonparametric framework for reconstruction of stochastic differential equation models
NASA Astrophysics Data System (ADS)
Rajabzadeh, Yalda; Rezaie, Amir Hossein; Amindavar, Hamidreza
2016-05-01
In this paper, we employ a nonparametric framework to robustly estimate the functional forms of drift and diffusion terms from discrete stationary time series. The proposed method significantly improves the accuracy of the parameter estimation. In this framework, drift and diffusion coefficients are modeled through orthogonal Legendre polynomials. We employ the least squares regression approach along with the Euler-Maruyama approximation method to learn coefficients of stochastic model. Next, a numerical discrete construction of mean squared prediction error (MSPE) is established to calculate the order of Legendre polynomials in drift and diffusion terms. We show numerically that the new method is robust against the variation in sample size and sampling rate. The performance of our method in comparison with the kernel-based regression (KBR) method is demonstrated through simulation and real data. In case of real dataset, we test our method for discriminating healthy electroencephalogram (EEG) signals from epilepsy ones. We also demonstrate the efficiency of the method through prediction in the financial data. In both simulation and real data, our algorithm outperforms the KBR method.
Modelling soil salinity in Oued El Abid watershed, Morocco
NASA Astrophysics Data System (ADS)
Mouatassime Sabri, El; Boukdir, Ahmed; Karaoui, Ismail; Arioua, Abdelkrim; Messlouhi, Rachid; El Amrani Idrissi, Abdelkhalek
2018-05-01
Soil salinisation is a phenomenon considered to be a real threat to natural resources in semi-arid climates. The phenomenon is controlled by soil (texture, depth, slope etc.), anthropogenic factors (drainage system, irrigation, crops types, etc.), and climate factors. This study was conducted in the watershed of Oued El Abid in the region of Beni Mellal-Khenifra, aimed at localising saline soil using remote sensing and a regression model. The spectral indices were extracted from Landsat imagery (30 m resolution). A linear correlation of electrical conductivity, which was calculated based on soil samples (ECs), and the values extracted based on spectral bands showed a high accuracy with an R2 (Root square) of 0.80. This study proposes a new spectral salinity index using Landsat bands B1 and B4. This hydro-chemical and statistical study, based on a yearlong survey, showed a moderate amount of salinity, which threatens dam water quality. The results present an improved ability to use remote sensing and regression model integration to detect soil salinity with high accuracy and low cost, and permit intervention at an early stage of salinisation.
Hao, Z Q; Li, C M; Shen, M; Yang, X Y; Li, K H; Guo, L B; Li, X Y; Lu, Y F; Zeng, X Y
2015-03-23
Laser-induced breakdown spectroscopy (LIBS) with partial least squares regression (PLSR) has been applied to measuring the acidity of iron ore, which can be defined by the concentrations of oxides: CaO, MgO, Al₂O₃, and SiO₂. With the conventional internal standard calibration, it is difficult to establish the calibration curves of CaO, MgO, Al₂O₃, and SiO₂ in iron ore due to the serious matrix effects. PLSR is effective to address this problem due to its excellent performance in compensating the matrix effects. In this work, fifty samples were used to construct the PLSR calibration models for the above-mentioned oxides. These calibration models were validated by the 10-fold cross-validation method with the minimum root-mean-square errors (RMSE). Another ten samples were used as a test set. The acidities were calculated according to the estimated concentrations of CaO, MgO, Al₂O₃, and SiO₂ using the PLSR models. The average relative error (ARE) and RMSE of the acidity achieved 3.65% and 0.0048, respectively, for the test samples.
Baker, Ronald J.; Chepiga, Mary M.; Cauller, Stephen J.
2015-01-01
The Kaplan-Meier method of estimating summary statistics from left-censored data was applied in order to include nondetects (left-censored data) in median nitrate-concentration calculations. Median concentrations also were determined using three alternative methods of handling nondetects. Treatment of the 23 percent of samples that were nondetects had little effect on estimated median nitrate concentrations because method detection limits were mostly less than median values.
1992-05-01
regression analysis. The strength of any one variable can be estimated along with the strength of the entire model in explaining the variance of percent... applicable a set of damage functions is to a particular situation. Sometimes depth- damage functions are embedded in computer programs which calculate...functions. Chapter Six concludes with recommended policies on the development and application of depth-damage functions. 5 6 CHAPTER TWO CONSTRUCTION OF
Temporal and spatial trends in nutrient and sediment loading to Lake Tahoe, California-Nevada, USA
Coats, Robert; Lewis, Jack; Alvarez, Nancy L.; Arneson, Patricia
2016-01-01
Since 1980, the Lake Tahoe Interagency Monitoring Program (LTIMP) has provided stream-discharge and water quality data—nitrogen (N), phosphorus (P), and suspended sediment—at more than 20 stations in Lake Tahoe Basin streams. To characterize the temporal and spatial patterns in nutrient and sediment loading to the lake, and improve the usefulness of the program and the existing database, we have (1) identified and corrected for sources of bias in the water quality database; (2) generated synthetic datasets for sediments and nutrients, and resampled to compare the accuracy and precision of different load calculation models; (3) using the best models, recalculated total annual loads over the period of record; (4) regressed total loads against total annual and annual maximum daily discharge, and tested for time trends in the residuals; (5) compared loads for different forms of N and P; and (6) tested constituent loads against land use-land cover (LULC) variables using multiple regression. The results show (1) N and P loads are dominated by organic N and particulate P; (2) there are significant long-term downward trends in some constituent loads of some streams; and (3) anthropogenic impervious surface is the most important LULC variable influencing water quality in basin streams. Many of our recommendations for changes in water quality monitoring and load calculation methods have been adopted by the LTIMP.
NASA Astrophysics Data System (ADS)
Atıcı, Ramazan; Sağır, Selçuk
2017-07-01
In the present work, the relationship with QBO of difference (ΔfoE = foEmea - foEIRI) between critical frequency (foE) values of ionospheric E-region, measured at Darwin and Casos Island stations and calculated by IRI-2012 ionospheric model, is statistically investigated. A multiple regression model is used as statistical tool. The ;Dummy; variables (;DummyWest; and ;DummyEast; represent westerly QBO values and easterly QBO values, respectively) are added to model in order to see the effect of westerly and easterly QBO. In the result of calculations, it is observed that the changes in ΔfoE about 50-52% can be explained by QBO at both stations. The relationship between QBO and ΔfoE is negative at both stations. The change of 1 ms-1 in whole set of QBO leads to a decrease of 0.008 MHz at Casos Island station and 0.017 MHz at Darwin station in ΔfoE. Directions of QBO have an effect on ΔfoE at the Darwin station, but they've not any effect on ΔfoE at Casos Island station. It is thought that the difference values in the foE are due to not to be included in the IRI-model of all parameters affecting the critical frequency value. Thus, QBO which is not included to IRI-model can have an effect on foE and more accurate results can be obtained by IRI model if the QBO is included in this model calculations.
NASA Technical Reports Server (NTRS)
Wu, Man Li C.; Schubert, Siegfried; Lin, Ching I.; Stajner, Ivanka; Einaudi, Franco (Technical Monitor)
2000-01-01
A method is developed for validating model-based estimates of atmospheric moisture and ground temperature using satellite data. The approach relates errors in estimates of clear-sky longwave fluxes at the top of the Earth-atmosphere system to errors in geophysical parameters. The fluxes include clear-sky outgoing longwave radiation (CLR) and radiative flux in the window region between 8 and 12 microns (RadWn). The approach capitalizes on the availability of satellite estimates of CLR and RadWn and other auxiliary satellite data, and multiple global four-dimensional data assimilation (4-DDA) products. The basic methodology employs off-line forward radiative transfer calculations to generate synthetic clear-sky longwave fluxes from two different 4-DDA data sets. Simple linear regression is used to relate the clear-sky longwave flux discrepancies to discrepancies in ground temperature ((delta)T(sub g)) and broad-layer integrated atmospheric precipitable water ((delta)pw). The slopes of the regression lines define sensitivity parameters which can be exploited to help interpret mismatches between satellite observations and model-based estimates of clear-sky longwave fluxes. For illustration we analyze the discrepancies in the clear-sky longwave fluxes between an early implementation of the Goddard Earth Observing System Data Assimilation System (GEOS2) and a recent operational version of the European Centre for Medium-Range Weather Forecasts data assimilation system. The analysis of the synthetic clear-sky flux data shows that simple linear regression employing (delta)T(sub g)) and broad layer (delta)pw provides a good approximation to the full radiative transfer calculations, typically explaining more thin 90% of the 6 hourly variance in the flux differences. These simple regression relations can be inverted to "retrieve" the errors in the geophysical parameters, Uncertainties (normalized by standard deviation) in the monthly mean retrieved parameters range from 7% for (delta)T(sub g) to approx. 20% for the lower tropospheric moisture between 500 hPa and surface. The regression relationships developed from the synthetic flux data, together with CLR and RadWn observed with the Clouds and Earth Radiant Energy System instrument, ire used to assess the quality of the GEOS2 T(sub g) and pw. Results showed that the GEOS2 T(sub g) is too cold over land, and pw in upper layers is too high over the tropical oceans and too low in the lower atmosphere.
Remote sensing of PM2.5 from ground-based optical measurements
NASA Astrophysics Data System (ADS)
Li, S.; Joseph, E.; Min, Q.
2014-12-01
Remote sensing of particulate matter concentration with aerodynamic diameter smaller than 2.5 um(PM2.5) by using ground-based optical measurements of aerosols is investigated based on 6 years of hourly average measurements of aerosol optical properties, PM2.5, ceilometer backscatter coefficients and meteorological factors from Howard University Beltsville Campus facility (HUBC). The accuracy of quantitative retrieval of PM2.5 using aerosol optical depth (AOD) is limited due to changes in aerosol size distribution and vertical distribution. In this study, ceilometer backscatter coefficients are used to provide vertical information of aerosol. It is found that the PM2.5-AOD ratio can vary largely for different aerosol vertical distributions. The ratio is also sensitive to mode parameters of bimodal lognormal aerosol size distribution when the geometric mean radius for the fine mode is small. Using two Angstrom exponents calculated at three wavelengths of 415, 500, 860nm are found better representing aerosol size distributions than only using one Angstrom exponent. A regression model is proposed to assess the impacts of different factors on the retrieval of PM2.5. Compared to a simple linear regression model, the new model combining AOD and ceilometer backscatter can prominently improve the fitting of PM2.5. The contribution of further introducing Angstrom coefficients is apparent. Using combined measurements of AOD, ceilometer backscatter, Angstrom coefficients and meteorological parameters in the regression model can get a correlation coefficient of 0.79 between fitted and expected PM2.5.
Organic carbon stock modelling for the quantification of the carbon sinks in terrestrial ecosystems
NASA Astrophysics Data System (ADS)
Durante, Pilar; Algeet, Nur; Oyonarte, Cecilio
2017-04-01
Given the recent environmental policies derived from the serious threats caused by global change, practical measures to decrease net CO2 emissions have to be put in place. Regarding this, carbon sequestration is a major measure to reduce atmospheric CO2 concentrations within a short and medium term, where terrestrial ecosystems play a basic role as carbon sinks. Development of tools for quantification, assessment and management of organic carbon in ecosystems at different scales and management scenarios, it is essential to achieve these commitments. The aim of this study is to establish a methodological framework for the modeling of this tool, applied to a sustainable land use planning and management at spatial and temporal scale. The methodology for carbon stock estimation in ecosystems is based on merger techniques between carbon stored in soils and aerial biomass. For this purpose, both spatial variability map of soil organic carbon (SOC) and algorithms for calculation of forest species biomass will be created. For the modelling of the SOC spatial distribution at different map scales, it is necessary to fit in and screen the available information of soil database legacy. Subsequently, SOC modelling will be based on the SCORPAN model, a quantitative model use to assess the correlation among soil-forming factors measured at the same site location. These factors will be selected from both static (terrain morphometric variables) and dynamic variables (climatic variables and vegetation indexes -NDVI-), providing to the model the spatio-temporal characteristic. After the predictive model, spatial inference techniques will be used to achieve the final map and to extrapolate the data to unavailable information areas (automated random forest regression kriging). The estimated uncertainty will be calculated to assess the model performance at different scale approaches. Organic carbon modelling of aerial biomass will be estimate using LiDAR (Light Detection And Ranging) algorithms. The available LiDAR databases will be used. LiDAR statistics (which describe the LiDAR cloud point data to calculate forest stand parameters) will be correlated with different canopy cover variables. The regression models applied to the total area will produce a continuous geo-information map to each canopy variable. The CO2 estimation will be calculated by dry-mass conversion factors for each forest species (C kg-CO2 kg equivalent). The result is the organic carbon modelling at spatio-temporal scale with different levels of uncertainty associated to the predictive models and diverse detailed scales. However, one of the main expected problems is due to the heterogeneous spatial distribution of the soil information, which influences on the prediction of the models at different spatial scales and, consequently, at SOC map scale. Besides this, the variability and mixture of the forest species of the aerial biomass decrease the accuracy assessment of the organic carbon.
Residence times in river basins as determined by analysis of long-term tritium records
Michel, R.L.
1992-01-01
The US Geological Survey has maintained a network of stations to collect samples for the measurement of tritium concentrations in precipitation and streamflow since the early 1960s. Tritium data from outflow waters of river basins draining 4500-75000 km2 are used to determine average residence times of water within the basins. The basins studied are the Colorado River above Cisco, Utah; the Kissimmee River above Lake Okeechobee, Florida; the Mississippi River above Anoka, Minnesota; the Neuse River above Streets Ferry Bridge near Vanceboro, North Carolina; the Potomac River above Point of Rocks, Maryland; the Sacramento River above Sacramento, California; the Susquehanna River above Harrisburg, Pennsylvania. The basins are modeled with the assumption that the outflow in the river comes from two sources-prompt (within-year) runoff from precipitation, and flow from the long-term reservoirs of the basin. Tritium concentration in the outflow water of the basin is dependent on three factors: (1) tritium concentration in runoff from the long-term reservoir, which depends on the residence time for the reservoir and historical tritium concentrations in precipitation; (2) tritium concentrations in precipitation (the within-year runoff component); (3) relative contributions of flow from the long-term and within-year components. Predicted tritium concentrations for the outflow water in the river basins were calculated for different residence times and for different relative contributions from the two reservoirs. A box model was used to calculate tritium concentrations in the long-term reservoir. Calculated values of outflow tritium concentrations for the basin were regressed against the measured data to obtain a slope as close as possible to 1. These regressions assumed an intercept of zero and were carried out for different values of residence time and reservoir contribution to maximize the fit of modeled versus actual data for all the above rivers. The final slopes of the fitted regression lines ranged from 0.95 to 1.01 (correlation coefficient > 0.96) for the basins studied. Values for the residence time of waters within the basins and average relative contributions of the within-year and long-term reservoirs to outflow were obtained. Values for river basin residence times ranged from 2 years for the Kissimmee River basin to 20 years for the Potomac River basin. The residence times indicate the time scale in which the basin responds to anthropogenic inputs. The modeled tritium concentrations for the basins also furnish input data for urban and agricultural settings where these river waters are used. ?? 1992.
Schmidt, Johannes; Glaser, Bruno
2016-01-01
Tropical forests are significant carbon sinks and their soils’ carbon storage potential is immense. However, little is known about the soil organic carbon (SOC) stocks of tropical mountain areas whose complex soil-landscape and difficult accessibility pose a challenge to spatial analysis. The choice of methodology for spatial prediction is of high importance to improve the expected poor model results in case of low predictor-response correlations. Four aspects were considered to improve model performance in predicting SOC stocks of the organic layer of a tropical mountain forest landscape: Different spatial predictor settings, predictor selection strategies, various machine learning algorithms and model tuning. Five machine learning algorithms: random forests, artificial neural networks, multivariate adaptive regression splines, boosted regression trees and support vector machines were trained and tuned to predict SOC stocks from predictors derived from a digital elevation model and satellite image. Topographical predictors were calculated with a GIS search radius of 45 to 615 m. Finally, three predictor selection strategies were applied to the total set of 236 predictors. All machine learning algorithms—including the model tuning and predictor selection—were compared via five repetitions of a tenfold cross-validation. The boosted regression tree algorithm resulted in the overall best model. SOC stocks ranged between 0.2 to 17.7 kg m-2, displaying a huge variability with diffuse insolation and curvatures of different scale guiding the spatial pattern. Predictor selection and model tuning improved the models’ predictive performance in all five machine learning algorithms. The rather low number of selected predictors favours forward compared to backward selection procedures. Choosing predictors due to their indiviual performance was vanquished by the two procedures which accounted for predictor interaction. PMID:27128736
Auras, Silke; Ostermann, Thomas; de Cruppé, Werner; Bitzer, Eva-Maria; Diel, Franziska; Geraedts, Max
2016-12-01
The study aimed to illustrate the effect of the patients' sex, age, self-rated health and medical practice specialization on patient satisfaction. Secondary analysis of patient survey data using multilevel analysis (generalized linear mixed model, medical practice as random effect) using a sequential modelling strategy. We examined the effects of the patients' sex, age, self-rated health and medical practice specialization on four patient satisfaction dimensions: medical practice organization, information, interaction, professional competence. The study was performed in 92 German medical practices providing ambulatory care in general medicine, internal medicine or gynaecology. In total, 9888 adult patients participated in a patient survey using the validated 'questionnaire on satisfaction with ambulatory care-quality from the patient perspective [ZAP]'. We calculated four models for each satisfaction dimension, revealing regression coefficients with 95% confidence intervals (CIs) for all independent variables, and using Wald Chi-Square statistic for each modelling step (model validity) and LR-Tests to compare the models of each step with the previous model. The patients' sex and age had a weak effect (maximum regression coefficient 1.09, CI 0.39; 1.80), and the patients' self-rated health had the strongest positive effect (maximum regression coefficient 7.66, CI 6.69; 8.63) on satisfaction ratings. The effect of medical practice specialization was heterogeneous. All factors studied, specifically the patients' self-rated health, affected patient satisfaction. Adjustment should always be considered because it improves the comparability of patient satisfaction in medical practices with atypically varying patient populations and increases the acceptance of comparisons. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Investigation on the effect of diaphragm on the combustion characteristics of solid-fuel ramjet
NASA Astrophysics Data System (ADS)
Gong, Lunkun; Chen, Xiong; Yang, Haitao; Li, Weixuan; Zhou, Changsheng
2017-10-01
The flow field characteristics and the regression rate distribution of solid-fuel ramjet with three-hole diaphragm were investigated by numerical and experimental methods. The experimental data were obtained by burning high-density polyethylene using a connected-pipe facility to validate the numerical model and analyze the combustion efficiency of the solid-fuel ramjet. The three-dimensional code developed in the present study adopted three-order MUSCL and central difference schemes, AUSMPW + flux vector splitting method, and second-order moment turbulence-chemistry model, together with k-ω shear stress transport (SST) turbulence model. The solid fuel surface temperature was calculated with fluid-solid heat coupling method. The numerical results show that strong circumferential flow exists in the region upstream of the diaphragm. The diaphragm can enhance the regression rate of the solid fuel in the region downstream of the diaphragm significantly, which mainly results from the increase of turbulent viscosity. As the diaphragm port area decreases, the regression rate of the solid fuel downstream of the diaphragm increases. The diaphragm can result in more sufficient mixing between the incoming air and fuel pyrolysis gases, while inevitably producing some pressure loss. The experimental results indicate that the effect of the diaphragm on the combustion efficiency of hydrocarbon fuels is slightly negative. It is conjectured that the diaphragm may have some positive effects on the combustion efficiency of the solid fuel with metal particles.
Separation mechanism of nortriptyline and amytriptyline in RPLC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gritti, Fabrice; Guiochon, Georges A
2005-08-01
The single and the competitive equilibrium isotherms of nortriptyline and amytriptyline were acquired by frontal analysis (FA) on the C{sub 18}-bonded discovery column, using a 28/72 (v/v) mixture of acetonitrile and water buffered with phosphate (20 mM, pH 2.70). The adsorption energy distributions (AED) of each compound were calculated from the raw adsorption data. Both the fitting of the adsorption data using multi-linear regression analysis and the AEDs are consistent with a trimodal isotherm model. The single-component isotherm data fit well to the tri-Langmuir isotherm model. The extension to a competitive two-component tri-Langmuir isotherm model based on the best parametersmore » of the single-component isotherms does not account well for the breakthrough curves nor for the overloaded band profiles measured for mixtures of nortriptyline and amytriptyline. However, it was possible to derive adjusted parameters of a competitive tri-Langmuir model based on the fitting of the adsorption data obtained for these mixtures. A very good agreement was then found between the calculated and the experimental overloaded band profiles of all the mixtures injected.« less
Evaluation of the validity of the Bolton Index using cone-beam computed tomography (CBCT)
Llamas, José M.; Cibrián, Rosa; Gandía, José L.; Paredes, Vanessa
2012-01-01
Aims: To evaluate the reliability and reproducibility of calculating the Bolton Index using cone-beam computed tomography (CBCT), and to compare this with measurements obtained using the 2D Digital Method. Material and Methods: Traditional study models were obtained from 50 patients, which were then digitized in order to be able to measure them using the Digital Method. Likewise, CBCTs of those same patients were undertaken using the Dental Picasso Master 3D® and the images obtained were then analysed using the InVivoDental programme. Results: By determining the regression lines for both measurement methods, as well as the difference between both of their values, the two methods are shown to be comparable, despite the fact that the measurements analysed presented statistically significant differences. Conclusions: The three-dimensional models obtained from the CBCT are as accurate and reproducible as the digital models obtained from the plaster study casts for calculating the Bolton Index. The differences existing between both methods were clinically acceptable. Key words:Tooth-size, digital models, bolton index, CBCT. PMID:22549690
Lewis, Jason M.
2010-01-01
Peak-streamflow regression equations were determined for estimating flows with exceedance probabilities from 50 to 0.2 percent for the state of Oklahoma. These regression equations incorporate basin characteristics to estimate peak-streamflow magnitude and frequency throughout the state by use of a generalized least squares regression analysis. The most statistically significant independent variables required to estimate peak-streamflow magnitude and frequency for unregulated streams in Oklahoma are contributing drainage area, mean-annual precipitation, and main-channel slope. The regression equations are applicable for watershed basins with drainage areas less than 2,510 square miles that are not affected by regulation. The resulting regression equations had a standard model error ranging from 31 to 46 percent. Annual-maximum peak flows observed at 231 streamflow-gaging stations through water year 2008 were used for the regression analysis. Gage peak-streamflow estimates were used from previous work unless 2008 gaging-station data were available, in which new peak-streamflow estimates were calculated. The U.S. Geological Survey StreamStats web application was used to obtain the independent variables required for the peak-streamflow regression equations. Limitations on the use of the regression equations and the reliability of regression estimates for natural unregulated streams are described. Log-Pearson Type III analysis information, basin and climate characteristics, and the peak-streamflow frequency estimates for the 231 gaging stations in and near Oklahoma are listed. Methodologies are presented to estimate peak streamflows at ungaged sites by using estimates from gaging stations on unregulated streams. For ungaged sites on urban streams and streams regulated by small floodwater retarding structures, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow magnitude and frequency.
Predicting Ascospore Release of Monilinia vaccinii-corymbosi of Blueberry with Machine Learning.
Harteveld, Dalphy O C; Grant, Michael R; Pscheidt, Jay W; Peever, Tobin L
2017-11-01
Mummy berry, caused by Monilinia vaccinii-corymbosi, causes economic losses of highbush blueberry in the U.S. Pacific Northwest (PNW). Apothecia develop from mummified berries overwintering on soil surfaces and produce ascospores that infect tissue emerging from floral and vegetative buds. Disease control currently relies on fungicides applied on a calendar basis rather than inoculum availability. To establish a prediction model for ascospore release, apothecial development was tracked in three fields, one in western Oregon and two in northwestern Washington in 2015 and 2016. Air and soil temperature, precipitation, soil moisture, leaf wetness, relative humidity and solar radiation were monitored using in-field weather stations and Washington State University's AgWeatherNet stations. Four modeling approaches were compared: logistic regression, multivariate adaptive regression splines, artificial neural networks, and random forest. A supervised learning approach was used to train the models on two data sets: training (70%) and testing (30%). The importance of environmental factors was calculated for each model separately. Soil temperature, soil moisture, and solar radiation were identified as the most important factors influencing ascospore release. Random forest models, with 78% accuracy, showed the best performance compared with the other models. Results of this research helps PNW blueberry growers to optimize fungicide use and reduce production costs.
GTM-Based QSAR Models and Their Applicability Domains.
Gaspar, H A; Baskin, I I; Marcou, G; Horvath, D; Varnek, A
2015-06-01
In this paper we demonstrate that Generative Topographic Mapping (GTM), a machine learning method traditionally used for data visualisation, can be efficiently applied to QSAR modelling using probability distribution functions (PDF) computed in the latent 2-dimensional space. Several different scenarios of the activity assessment were considered: (i) the "activity landscape" approach based on direct use of PDF, (ii) QSAR models involving GTM-generated on descriptors derived from PDF, and, (iii) the k-Nearest Neighbours approach in 2D latent space. Benchmarking calculations were performed on five different datasets: stability constants of metal cations Ca(2+) , Gd(3+) and Lu(3+) complexes with organic ligands in water, aqueous solubility and activity of thrombin inhibitors. It has been shown that the performance of GTM-based regression models is similar to that obtained with some popular machine-learning methods (random forest, k-NN, M5P regression tree and PLS) and ISIDA fragment descriptors. By comparing GTM activity landscapes built both on predicted and experimental activities, we may visually assess the model's performance and identify the areas in the chemical space corresponding to reliable predictions. The applicability domain used in this work is based on data likelihood. Its application has significantly improved the model performances for 4 out of 5 datasets. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Time-domain electromagnetic soundings collected in Dawson County, Nebraska, 2007-09
Payne, Jason; Teeple, Andrew
2011-01-01
Between April 2007 and November 2009, the U.S. Geological Survey, in cooperation with the Central Platte Natural Resources District, collected time-domain electro-magnetic (TDEM) soundings at 14 locations in Dawson County, Nebraska. The TDEM soundings provide information pertaining to the hydrogeology at each of 23 sites at the 14 locations; 30 TDEM surface geophysical soundings were collected at the 14 locations to develop smooth and layered-earth resistivity models of the subsurface at each site. The soundings yield estimates of subsurface electrical resistivity; variations in subsurface electrical resistivity can be correlated with hydrogeologic and stratigraphic units. Results from each sounding were used to calculate resistivity to depths of approximately 90-130 meters (depending on loop size) below the land surface. Geonics Protem 47 and 57 systems, as well as the Alpha Geoscience TerraTEM, were used to collect the TDEM soundings (voltage data from which resistivity is calculated). For each sounding, voltage data were averaged and evaluated statistically before inversion (inverse modeling). Inverse modeling is the process of creating an estimate of the true distribution of subsurface resistivity from the mea-sured apparent resistivity obtained from TDEM soundings. Smooth and layered-earth models were generated for each sounding. A smooth model is a vertical delineation of calculated apparent resistivity that represents a non-unique estimate of the true resistivity. Ridge regression (Interpex Limited, 1996) was used by the inversion software in a series of iterations to create a smooth model consisting of 24-30 layers for each sounding site. Layered-earth models were then generated based on results of smooth modeling. The layered-earth models are simplified (generally 1 to 6 layers) to represent geologic units with depth. Throughout the area, the layered-earth models range from 2 to 4 layers, depending on observed inflections in the raw data and smooth model inversions. The TDEM data collected were considered good results on the basis of root mean square errors calculated after inversion modeling, comparisons with borehole geophysical logging, and repeatability.
Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.
Trninić, Marko; Jeličić, Mario; Papić, Vladan
2015-07-01
In kinesiology, medicine, biology and psychology, in which research focus is on dynamical self-organized systems, complex connections exist between variables. Non-linear nature of complex systems has been discussed and explained by the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models, and by (b) including all basketball athletes in the same sample of participants regardless of their playing position. In this paper the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players divided in three groups according to their playing time (8 minutes and more per game) and playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear (general model) and non-linear (general model) regression models were calculated simultaneously and separately for each group. The conclusion is viable: non-linear regressions are frequently superior to linear correlations when interpreting actual association logic among research variables.
Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree
Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid
2016-01-01
Introduction: Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process for detection of hidden patterns. In the case of ESD, data mining help us to predict the diseases. Different algorithms were developed for this purpose. Objective: we aimed to use the Classification and Regression Tree (CART) to predict differential diagnosis of ESD. Methods: we used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set from machine learning repository, UCI was obtained. The Clementine 12.0 software from IBM Company was used for modelling. In order to evaluation of the model we calculate the accuracy, sensitivity and specificity of the model. Results: The proposed model had an accuracy of 94.84% ( Standard Deviation: 24.42) in order to correct prediction of the ESD disease. Conclusions: Results indicated that using of this classifier could be useful. But, it would be strongly recommended that the combination of machine learning methods could be more useful in terms of prediction of ESD. PMID:28077889
Morrow, Thomas B.; Behring, II, Kendricks A.
2004-10-12
A methods of indirectly measuring the nitrogen concentration in a gas mixture. The molecular weight of the gas is modeled as a function of the speed of sound in the gas, the diluent concentrations in the gas, and constant values, resulting in a model equation. Regression analysis is used to calculate the constant values, which can then be substituted into the model equation. If the speed of sound in the gas is measured at two states and diluent concentrations other than nitrogen (typically carbon dioxide) are known, two equations for molecular weight can be equated and solved for the nitrogen concentration in the gas mixture.
Kobayashi, Tohru; Fuse, Shigeto; Sakamoto, Naoko; Mikami, Masashi; Ogawa, Shunichi; Hamaoka, Kenji; Arakaki, Yoshio; Nakamura, Tsuneyuki; Nagasawa, Hiroyuki; Kato, Taichi; Jibiki, Toshiaki; Iwashima, Satoru; Yamakawa, Masaru; Ohkubo, Takashi; Shimoyama, Shinya; Aso, Kentaro; Sato, Seiichi; Saji, Tsutomu
2016-08-01
Several coronary artery Z score models have been developed. However, a Z score model derived by the lambda-mu-sigma (LMS) method has not been established. Echocardiographic measurements of the proximal right coronary artery, left main coronary artery, proximal left anterior descending coronary artery, and proximal left circumflex artery were prospectively collected in 3,851 healthy children ≤18 years of age and divided into developmental and validation data sets. In the developmental data set, smooth curves were fitted for each coronary artery using linear, logarithmic, square-root, and LMS methods for both sexes. The relative goodness of fit of these models was compared using the Bayesian information criterion. The best-fitting model was tested for reproducibility using the validation data set. The goodness of fit of the selected model was visually compared with that of the previously reported regression models using a Q-Q plot. Because the internal diameter of each coronary artery was not similar between sexes, sex-specific Z score models were developed. The LMS model with body surface area as the independent variable showed the best goodness of fit; therefore, the internal diameter of each coronary artery was transformed into a sex-specific Z score on the basis of body surface area using the LMS method. In the validation data set, a Q-Q plot of each model indicated that the distribution of Z scores in the LMS models was closer to the normal distribution compared with previously reported regression models. Finally, the final models for each coronary artery in both sexes were developed using the developmental and validation data sets. A Microsoft Excel-based Z score calculator was also created, which is freely available online (http://raise.umin.jp/zsp/calculator/). Novel LMS models with which to estimate the sex-specific Z score of each internal coronary artery diameter were generated and validated using a large pediatric population. Copyright © 2016 American Society of Echocardiography. Published by Elsevier Inc. All rights reserved.
Predicting pork loin intramuscular fat using computer vision system.
Liu, J-H; Sun, X; Young, J M; Bachmeier, L A; Newman, D J
2018-09-01
The objective of this study was to investigate the ability of computer vision system to predict pork intramuscular fat percentage (IMF%). Center-cut loin samples (n = 85) were trimmed of subcutaneous fat and connective tissue. Images were acquired and pixels were segregated to estimate image IMF% and 18 image color features for each image. Subjective IMF% was determined by a trained grader. Ether extract IMF% was calculated using ether extract method. Image color features and image IMF% were used as predictors for stepwise regression and support vector machine models. Results showed that subjective IMF% had a correlation of 0.81 with ether extract IMF% while the image IMF% had a 0.66 correlation with ether extract IMF%. Accuracy rates for regression models were 0.63 for stepwise and 0.75 for support vector machine. Although subjective IMF% has shown to have better prediction, results from computer vision system demonstrates the potential of being used as a tool in predicting pork IMF% in the future. Copyright © 2018 Elsevier Ltd. All rights reserved.
[Willingness of Patients with Obesity to Use New Media in Rehabilitation Aftercare].
Dorow, M; Löbner, M; Stein, J; Kind, P; Markert, J; Keller, J; Weidauer, E; Riedel-Heller, S G
2017-06-01
Digital media offer new possibilities in rehabilitation aftercare. This study investigates the rehabilitants' willingness to use new media (sms, internet, social networks) in rehabilitation aftercare and factors that are associated with the willingness to use media-based aftercare. 92 rehabilitants (patients with obesity) filled in a questionnaire on the willingness to use new media in rehabilitation aftercare. In order to identify influencing factors, binary logistic regression models were calculated. 3 quarters of the rehabilitants (76.1%) reported that they would be willing to use new media in rehabilitation aftercare. The binary logistic regression model yielded two factors that were associated with the willingness to use media-based aftercare: the possession of a smartphone and the willingness to receive telephone counseling for aftercare. The majority of the rehabilitants was willing to use new media in rehabilitation aftercare. The reasons for refusal of media-based aftercare need to be examined more closely. © Georg Thieme Verlag KG Stuttgart · New York.
Using Quantile and Asymmetric Least Squares Regression for Optimal Risk Adjustment.
Lorenz, Normann
2017-06-01
In this paper, we analyze optimal risk adjustment for direct risk selection (DRS). Integrating insurers' activities for risk selection into a discrete choice model of individuals' health insurance choice shows that DRS has the structure of a contest. For the contest success function (csf) used in most of the contest literature (the Tullock-csf), optimal transfers for a risk adjustment scheme have to be determined by means of a restricted quantile regression, irrespective of whether insurers are primarily engaged in positive DRS (attracting low risks) or negative DRS (repelling high risks). This is at odds with the common practice of determining transfers by means of a least squares regression. However, this common practice can be rationalized for a new csf, but only if positive and negative DRSs are equally important; if they are not, optimal transfers have to be calculated by means of a restricted asymmetric least squares regression. Using data from German and Swiss health insurers, we find considerable differences between the three types of regressions. Optimal transfers therefore critically depend on which csf represents insurers' incentives for DRS and, if it is not the Tullock-csf, whether insurers are primarily engaged in positive or negative DRS. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Associations between dairy cow inter-service interval and probability of conception.
Remnant, J G; Green, M J; Huxley, J N; Hudson, C D
2018-07-01
Recent research has indicated that the interval between inseminations in modern dairy cattle is often longer than the commonly accepted cycle length of 18-24 days. This study analysed 257,396 inseminations in 75,745 cows from 312 herds in England and Wales. The interval between subsequent inseminations in the same cow in the same lactation (inter-service interval, ISI) were calculated and inseminations categorised as successful or unsuccessful depending on whether there was a corresponding calving event. Conception risk was calculated for each individual ISI between 16 and 28 days. A random effects logistic regression model was fitted to the data with pregnancy as the outcome variable and ISI (in days) included in the model as a categorical variable. The modal ISI was 22 days and the peak conception risk was 44% for ISIs of 21 days rising from 27% at 16 days. The logistic regression model revealed significant associations of conception risk with ISI as well as 305 day milk yield, insemination number, parity and days in milk. Predicted conception risk was lower for ISIs of 16, 17 and 18 days and higher for ISIs of 20, 21 and 22 days compared to 25 day ISIs. A mixture model was specified to identify clusters in insemination frequency and conception risk for ISIs between 3 and 50 days. A "high conception risk, high insemination frequency" cluster was identified between 19 and 26 days which indicated that this time period was the true latent distribution for ISI with optimal reproductive outcome. These findings suggest that the period of increased numbers of inseminations around 22 days identified in existing work coincides with the period of increased probability of conception and therefore likely represents true return estrus events. Copyright © 2018 Elsevier Inc. All rights reserved.
Ricard, Caroline A; Dammann, Christiane E L; Dammann, Olaf
2017-01-01
Retinopathy of prematurity (ROP) is a disorder of the preterm newborn characterized by neurovascular disruption in the immature retina that may cause visual impairment and blindness. To develop a clinical screening tool for early postnatal prediction of ROP in preterm newborns based on risk information available within the first 48 h of postnatal life. Using data submitted to the Vermont Oxford Network (VON) between 1995 and 2015, we created logistic regression models based on infants born <28 completed weeks gestational age. We developed a model with 60% of the data and identified birth weight, gestational age, respiratory distress syndrome, non-Hispanic ethnicity, and multiple gestation as predictors of ROP. We tested the model in the remaining 40%, performed tenfold cross-validation, and tested the score in ELGAN study data. Of the 1,052 newborns in the VON database, 627 recorded an ROP status. Forty percent had no ROP, 40% had mild ROP (stages 1 and 2), and 20% had severe ROP (stages 3-5). We created a weighted score to predict any ROP based on the multivariable regression model. A cutoff score of 5 had the best sensitivity (95%, 95% CI 93-97), while maintaining a strong positive predictive value (63%, 95% CI 57-68). When applied to the ELGAN data, sensitivity was lower (72%, 95% CI 69-75), but PPV was higher (80%, 95% CI 77-83). STEP-ROP is a promising screening tool. It is easy to calculate, does not rely on extensive postnatal data collection, and can be calculated early after birth. Early ROP screening may help physicians limit patient exposure to additional risk factors, and may be useful for risk stratification in clinical trials aimed at reducing ROP. © 2017 S. Karger AG, Basel.
NASA Technical Reports Server (NTRS)
Sumnall, Matthew; Peduzzi, Alicia; Fox, Thomas R.; Wynne, Randolph H.; Thomas, Valerie A.; Cook, Bruce
2016-01-01
Leaf area is an important forest structural variable which serves as the primary means of mass and energy exchange within vegetated ecosystems. The objective of the current study was to determine if leaf area index (LAI) could be estimated accurately and consistently in five intensively managed pine plantation forests using two multiple-return airborne LiDAR datasets. Field measurements of LAI were made using the LiCOR LAI2000 and LAI2200 instruments within 116 plots were established of varying size and within a variety of stand conditions (i.e. stand age, nutrient regime and stem density) in North Carolina and Virginia in 2008 and 2013. A number of common LiDAR return height and intensity distribution metrics were calculated (e.g. average return height), in addition to ten indices, with two additional variants, utilized in the surrounding literature which have been used to estimate LAI and fractional cover, were calculated from return heights and intensity, for each plot extent. Each of the indices was assessed for correlation with each other, and was used as independent variables in linear regression analysis with field LAI as the dependent variable. All LiDAR derived metrics were also entered into a forward stepwise linear regression. The results from each of the indices varied from an R2 of 0.33 (S.E. 0.87) to 0.89 (S.E. 0.36). Those indices calculated using ratios of all returns produced the strongest correlations, such as the Above and Below Ratio Index (ABRI) and Laser Penetration Index 1 (LPI1). The regression model produced from a combination of three metrics did not improve correlations greatly (R2 0.90; S.E. 0.35). The results indicate that LAI can be predicted over a range of intensively managed pine plantation forest environments accurately when using different LiDAR sensor designs. Those indices which incorporated counts of specific return numbers (e.g. first returns) or return intensity correlated poorly with field measurements. There were disparities between the number of different types of returns and intensity values when comparing the results from two LiDAR sensors, indicating that predictive models developed using such metrics are not transferable between datasets with different acquisition parameters. Each of the indices were significantly correlated with one another, with one exception (LAI proxy), in particular those indices calculated from all returns, which indicates similarities in information content for those indices. It can then be argued that LiDAR indices have reached a similar stage in development to those calculated from optical-spectral sensors, but which offer a number of advantages, such as the reduction or removal of saturation issues in areas of high biomass.
Ellison, Christopher A.; Groten, Joel T.; Lorenz, David L.; Koller, Karl S.
2016-10-27
Consistent and reliable sediment data are needed by Federal, State, and local government agencies responsible for monitoring water quality, planning river restoration, quantifying sediment budgets, and evaluating the effectiveness of sediment reduction strategies. Heightened concerns about excessive sediment in rivers and the challenge to reduce costs and eliminate data gaps has guided Federal and State interests in pursuing alternative methods for measuring suspended and bedload sediment. Simple and dependable data collection and estimation techniques are needed to generate hydraulic and water-quality information for areas where data are unavailable or difficult to collect.The U.S. Geological Survey, in cooperation with the Minnesota Pollution Control Agency and the Minnesota Department of Natural Resources, completed a study to evaluate the use of dimensionless sediment rating curves (DSRCs) to accurately predict suspended-sediment concentrations (SSCs), bedload, and annual sediment loads for selected rivers and streams in Minnesota based on data collected during 2007 through 2013. This study included the application of DSRC models developed for a small group of streams located in the San Juan River Basin near Pagosa Springs in southwestern Colorado to rivers in Minnesota. Regionally based DSRC models for Minnesota also were developed and compared to DSRC models from Pagosa Springs, Colorado, to evaluate which model provided more accurate predictions of SSCs and bedload in Minnesota.Multiple measures of goodness-of-fit were developed to assess the effectiveness of DSRC models in predicting SSC and bedload for rivers in Minnesota. More than 600 dimensionless ratio values of SSC, bedload, and streamflow were evaluated and delineated according to Pfankuch stream stability categories of “good/fair” and “poor” to develop four Minnesota-based DSRC models. The basis for Pagosa Springs and Minnesota DSRC model effectiveness was founded on measures of goodness-of-fit that included proximity of the model(s) fitted line to the 95-percent confidence intervals of the site-specific model, Nash-Sutcliffe Efficiency values, model biases, and deviation of annual sediment loads from each model to the annual sediment loads calculated from measured data.Composite plots comparing Pagosa Springs DSRCs, Minnesota DSRCs, site-specific regression models, and measured data indicated that regionally developed DSRCs (Minnesota DSRC models) more closely approximated measured data for nearly every site. Pagosa Springs DSRC models had markedly larger exponents (slopes) when compared to the Minnesota DSRC models and the site-specific regression models, and over-represent SSC and bedload at streamflows exceeding bankfull. The Nash-Sutcliffe Efficiency values for the Minnesota DSRC model for suspended-sediment concentrations closely matched Nash-Sutcliffe Efficiency values of the site-specific regression models for 12 out of 16 sites. Nash-Sutcliffe Efficiency values associated with Minnesota DSRCs were greater than those associated with Pagosa Springs DSRCs for every site except the Whitewater River near Beaver, Minnesota site. Pagosa Springs DSRC models were less accurate than the mean of the measured data at predicting SSC values for one-half of the sites for good/fair stability sites and one-half of the sites for poor stability sites. Relative model biases were calculated and determined to be substantial (greater than 5 percent) for Pagosa Springs and Minnesota models, with Minnesota models having a lower mean model bias. For predicted annual suspended-sediment loads (SSL), the Minnesota DSRC models for good/fair and poor stream stability sites more closely approximated the annual SSLs calculated from the measured data as compared to the Pagosa Springs DSRC model.Results of data analyses indicate that DSRC models developed using data collected in Minnesota were more effective at compensating for differences in individual stream characteristics across a variety of basin sizes and flow regimes than DSRC models developed using data collected for Pagosa Springs, Colorado. Minnesota DSRC models retained a substantial portion of the unique sediment signatures for most rivers, although deviations were observed for streams with limited sediment supply and for rivers in southeastern Minnesota, which had markedly larger regression exponents. Compared to Pagosa Springs DSRC models, Minnesota DSRC models had regression slopes that more closely matched the slopes of site-specific regression models, had greater Nash-Sutcliffe Efficiency values, had lower model biases, and approximated measured annual sediment loads more closely. The results presented in this report indicate that regionally based DSRCs can be used to estimate reasonably accurate values of SSC and bedload.Practitioners are cautioned that DSRC reliability is dependent on representative measures of bankfull streamflow, SSC, and bedload. It is, therefore, important that samples of SSC and bedload, which will be used for estimating SSC and bedload at the bankfull streamflow, are collected over a range of conditions that includes the ascending and descending limbs of the event hydrograph. The use of DSRC models may have substantial limitations for certain conditions. For example, DSRC models should not be used to predict SSC and sediment loads for extreme streamflows, such as those that exceed twice the bankfull streamflow value because this constitutes conditions beyond the realm of current (2016) empirical modeling capability. Also, if relations between SSC and streamflow and between bedload and streamflow are not statistically significant, DSRC models should not be used to predict SSC or bedload, as this could result in large errors. For streams that do not violate these conditions, DSRC estimates of SSC and bedload can be used for stream restoration planning and design, and for estimating annual sediment loads for streams where little or no sediment data are available.
Sanson, R L; Gloster, J; Burgin, L
2011-09-24
The aims of this study were to statistically reassess the likelihood that windborne spread of foot-and-mouth disease (FMD) virus (FMDV) occurred at the start of the UK 1967 to 1968 FMD epidemic at Oswestry, Shropshire, and to derive dose-response probability of infection curves for farms exposed to airborne FMDV. To enable this, data on all farms present in 1967 in the parishes near Oswestry were assembled. Cases were infected premises whose date of appearance of first clinical signs was within 14 days of the depopulation of the index farm. Logistic regression was used to evaluate the association between infection status and distance and direction from the index farm. The UK Met Office's NAME atmospheric dispersion model (ADM) was used to generate plumes for each day that FMDV was excreted from the index farm based on actual historical weather records from October 1967. Daily airborne FMDV exposure rates for all farms in the study area were calculated using a geographical information system. Probit analyses were used to calculate dose-response probability of infection curves to FMDV, using relative exposure rates on case and control farms. Both the logistic regression and probit analyses gave strong statistical support to the hypothesis that airborne spread occurred. There was some evidence that incubation period was inversely proportional to the exposure rate.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and determination of 'best' equation method. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0: including all calculating processes of the principal component regression and all operations of linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. The principal component regression analysis can be used to overcome disturbance of the multicollinearity. The simplified, speeded up and accurate statistical effect is reached through the principal component regression analysis with SPSS.
NASA Astrophysics Data System (ADS)
Dias, L. G.; Shimizu, K.; Farah, J. P. S.; Chaimovich, H.
2002-09-01
We propose and demonstrate the usefulness of a method, defined as generalized Born electronegativity equalization method (GBEEM) to estimate solvent-induced charge redistribution. The charges obtained by GBEEM, in a representative series of small organic molecules, were compared to PM3-CM1 charges in vacuum and in water. Linear regressions with appropriate correlation coefficients and standard deviations between GBEEM and PM3-CM1 methods were obtained ( R=0.94,SD=0.15, Ftest=234, N=32, in vacuum; R=0.94,SD=0.16, Ftest=218, N=29, in water). In order to test the GBEEM response when intermolecular interactions are involved we calculated a water dimer in dielectric water using both GBEEM and PM3-CM1 and the results were similar. Hence, the method developed here is comparable to established calculation methods.
Gifford, Katherine A; Phillips, Jeffrey S; Samuels, Lauren R; Lane, Elizabeth M; Bell, Susan P; Liu, Dandan; Hohman, Timothy J; Romano, Raymond R; Fritzsche, Laura R; Lu, Zengqi; Jefferson, Angela L
2015-07-01
A symptom of mild cognitive impairment (MCI) and Alzheimer's disease (AD) is a flat learning profile. Learning slope calculation methods vary, and the optimal method for capturing neuroanatomical changes associated with MCI and early AD pathology is unclear. This study cross-sectionally compared four different learning slope measures from the Rey Auditory Verbal Learning Test (simple slope, regression-based slope, two-slope method, peak slope) to structural neuroimaging markers of early AD neurodegeneration (hippocampal volume, cortical thickness in parahippocampal gyrus, precuneus, and lateral prefrontal cortex) across the cognitive aging spectrum [normal control (NC); (n=198; age=76±5), MCI (n=370; age=75±7), and AD (n=171; age=76±7)] in ADNI. Within diagnostic group, general linear models related slope methods individually to neuroimaging variables, adjusting for age, sex, education, and APOE4 status. Among MCI, better learning performance on simple slope, regression-based slope, and late slope (Trial 2-5) from the two-slope method related to larger parahippocampal thickness (all p-values<.01) and hippocampal volume (p<.01). Better regression-based slope (p<.01) and late slope (p<.01) were related to larger ventrolateral prefrontal cortex in MCI. No significant associations emerged between any slope and neuroimaging variables for NC (p-values ≥.05) or AD (p-values ≥.02). Better learning performances related to larger medial temporal lobe (i.e., hippocampal volume, parahippocampal gyrus thickness) and ventrolateral prefrontal cortex in MCI only. Regression-based and late slope were most highly correlated with neuroimaging markers and explained more variance above and beyond other common memory indices, such as total learning. Simple slope may offer an acceptable alternative given its ease of calculation.
NASA Astrophysics Data System (ADS)
Torréton, Jean-Pascal; Rochelle-Newall, Emma; Jouon, Aymeric; Faure, Vincent; Jacquet, Séverine; Douillet, Pascal
2007-09-01
Hydrodynamic modeling can be used to spatially characterize water renewal rates in coastal ecosystems. Using a hydrodynamic model implemented over the semi-enclosed Southwest coral lagoon of New Caledonia, a recent study computed the flushing lag as the minimum time required for a particle coming from outside the lagoon (open ocean) to reach a specific station [Jouon, A., Douillet, P., Ouillon, S., Fraunié, P., 2006. Calculations of hydrodynamic time parameters in a semi-opened coastal zone using a 3D hydrodynamic model. Continental Shelf Research 26, 1395-1415]. Local e -flushing time was calculated as the time requested to reach a local grid mesh concentration of 1/e from the precedent step. Here we present an attempt to connect physical forcing to biogeochemical functioning of this coastal ecosystem. An array of stations, located in the lagoonal channel as well as in several bays under anthropogenic influence, was sampled during three cruises. We then tested the statistical relationships between the distribution of flushing indices and those of biological and chemical variables. Among the variables tested, silicate, chlorophyll a and bacterial biomass production present the highest correlations with flushing indices. Correlations are higher with local e-flushing times than with flushing lags or the sum of these two indices. In the bays, these variables often deviate from the relationships determined in the main lagoon channel. In the three bays receiving significant riverine inputs, silicate is well above the regression line, whereas data from the bay receiving almost insignificant freshwater inputs generally fit the lagoon channel regressions. Moreover, in the three bays receiving important urban and industrial effluents, chlorophyll a and bacterial production of biomass generally display values exceeding the lagoon channel regression trends whereas in the bay under moderate anthropogenic influence values follow the regressions obtained in the lagoon channel. The South West lagoon of New Caledonia can hence be viewed as a coastal mesotrophic ecosystem that is flushed by oligotrophic oceanic waters which subsequently replace the lagoonal waters with water considerably impoverished in resources for microbial growth. This flushing was high enough during the periods of study to influence the distribution of phytoplankton biomass, bacterial production of biomass and silicate concentrations in the lagoon channel as well as in some of the bay areas.
Indian Ocean zonal mode activity in 20th century observations and simulations
NASA Astrophysics Data System (ADS)
Sendelbeck, Anja; Mölg, Thomas
2016-04-01
The Indian Ocean zonal mode (IOZM) is a coupled ocean-atmosphere system with anomalous cooling in the east, warming in the west and easterly wind anomalies, resulting in a complete reversal of the climatological zonal sea surface temperature (SST) gradient. The IOZM has a strong influence on East African climate by causing anomalously strong October - December (OND) precipitation. Using observational data and historical CMIP5 (Coupled Model Intercomparison Project phase 5) model output, the September - November (SON) dipole mode index (DMI), OND East African precipitation and SON zonal wind index (ZWI) are calculated. We pay particular attention to detrending SSTs for calculating the DMI, which seems to have been neglected in some published research. The ZWI is defined as the area-averaged zonal wind component at 850 hPa over the central Indian Ocean. Regression analysis is used to evaluate the models' capability to represent the IOZM and its impact on east African climate between 1948 and 2005. Simple correlations are calculated between SST, zonal wind and precipitation to show their interdependence. High correlation in models implies a good representation of the influence of IOZM on East African climate variability and our goal is to detect the models with the highest correlation coefficients. In future research, these model data might be used to investigate the impact of IOZM on the East African climate variability in the late 20's century with regard to anthropogenic causes and internal variability.
Effects of urban form on the urban heat island effect based on spatial regression model.
Yin, Chaohui; Yuan, Man; Lu, Youpeng; Huang, Yaping; Liu, Yanfang
2018-09-01
The urban heat island (UHI) effect is becoming more of a concern with the accelerated process of urbanization. However, few studies have examined the effect of urban form on land surface temperature (LST) especially from an urban planning perspective. This paper used spatial regression model to investigate the effects of both land use composition and urban form on LST in Wuhan City, China, based on the regulatory planning management unit. Landsat ETM+ image data was used to estimate LST. Land use composition was calculated by impervious surface area proportion, vegetated area proportion, and water proportion, while urban form indicators included sky view factor (SVF), building density, and floor area ratio (FAR). We first tested for spatial autocorrelation of urban LST, which confirmed that a traditional regression method would be invalid. A spatial error model (SEM) was chosen because its parameters were better than a spatial lag model (SLM). The results showed that urban form metrics should be the focus for mitigation efforts of UHI effects. In addition, analysis of the relationship between urban form and UHI effect based on the regulatory planning management unit was helpful for promoting corresponding UHI effect mitigation rules in practice. Finally, the spatial regression model was recommended to be an appropriate method for dealing with problems related to the urban thermal environment. Results suggested that the impact of urbanization on the UHI effect can be mitigated not only by balancing various land use types, but also by optimizing urban form, which is even more effective. This research expands the scientific understanding of effects of urban form on UHI by explicitly analyzing indicators closely related to urban detailed planning at the level of regulatory planning management unit. In addition, it may provide important insights and effective regulation measures for urban planners to mitigate future UHI effects. Copyright © 2018 Elsevier B.V. All rights reserved.
Efficient model learning methods for actor-critic control.
Grondman, Ivo; Vaandrager, Maarten; Buşoniu, Lucian; Babuska, Robert; Schuitema, Erik
2012-06-01
We propose two new actor-critic algorithms for reinforcement learning. Both algorithms use local linear regression (LLR) to learn approximations of the functions involved. A crucial feature of the algorithms is that they also learn a process model, and this, in combination with LLR, provides an efficient policy update for faster learning. The first algorithm uses a novel model-based update rule for the actor parameters. The second algorithm does not use an explicit actor but learns a reference model which represents a desired behavior, from which desired control actions can be calculated using the inverse of the learned process model. The two novel methods and a standard actor-critic algorithm are applied to the pendulum swing-up problem, in which the novel methods achieve faster learning than the standard algorithm.
NASA Astrophysics Data System (ADS)
Harris, Courtney K.; Wiberg, Patricia L.
1997-09-01
Modeling shelf sediment transport rates and bed reworking depths is problematic when the wave and current forcing conditions are not precisely known, as is usually the case when long-term sedimentation patterns are of interest. Two approaches to modeling sediment transport under such circumstances are considered. The first relies on measured or simulated time series of flow conditions to drive model calculations. The second approach uses as model input probability distribution functions of bottom boundary layer flow conditions developed from wave and current measurements. Sediment transport rates, frequency of bed resuspension by waves and currents, and bed reworking calculated using the two methods are compared at the mid-shelf STRESS (Sediment TRansport on Shelves and Slopes) site on the northern California continental shelf. Current, wave and resuspension measurements at the site are used to generate model inputs and test model results. An 11-year record of bottom wave orbital velocity, calculated from surface wave spectra measured by the National Data Buoy Center (NDBC) Buoy 46013 and verified against bottom tripod measurements, is used to characterize the frequency and duration of wave-driven transport events and to estimate the joint probability distribution of wave orbital velocity and period. A 109-day record of hourly current measurements 10 m above bottom is used to estimate the probability distribution of bottom boundary layer current velocity at this site and to develop an auto-regressive model to simulate current velocities for times when direct measurements of currents are not available. Frequency of transport, the maximum volume of suspended sediment, and average flux calculated using measured wave and simulated current time series agree well with values calculated using measured time series. A probabilistic approach is more amenable to calculations over time scales longer than existing wave records, but it tends to underestimate net transport because it does not capture the episodic nature of transport events. Both methods enable estimates to be made of the uncertainty in transport quantities that arise from an incomplete knowledge of the specific timing of wave and current conditions. 1997 Elsevier Science Ltd
Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan
2009-01-01
The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
Gardner, Lytt I.; Marks, Gary; Wilson, Tracey E.; Giordano, Thomas P.; Sullivan, Meg; Raper, James L.; Rodriguez, Allan E.; Keruly, Jeanne; Malitz, Faye
2016-01-01
We calculated the financial impact in 6 HIV clinics of a low-effort retention in care intervention involving brief motivational messages from providers, patient brochures, and posters. We used a linear regression model to calculate absolute changes in kept primary care visits from the preintervention year (2008–2009) to the intervention year (2009–2010). Revenue from patients’ insurance was also assessed by clinic. Kept visits improved significantly in the intervention year versus the preintervention year (P < 0.0001). We found a net-positive effect on clinic revenue of +$24,000/year for an average-size clinic (7400 scheduled visits/year). We encourage HIV clinic administrators to consider implementing this low-effort intervention. PMID:25559605
Jarvis, J; Seed, M; Elton, R; Sawyer, L; Agius, R
2005-01-01
Aims: To investigate quantitatively, relationships between chemical structure and reported occupational asthma hazard for low molecular weight (LMW) organic compounds; to develop and validate a model linking asthma hazard with chemical substructure; and to generate mechanistic hypotheses that might explain the relationships. Methods: A learning dataset used 78 LMW chemical asthmagens reported in the literature before 1995, and 301 control compounds with recognised occupational exposures and hazards other than respiratory sensitisation. The chemical structures of the asthmagens and control compounds were characterised by the presence of chemical substructure fragments. Odds ratios were calculated for these fragments to determine which were associated with a likelihood of being reported as an occupational asthmagen. Logistic regression modelling was used to identify the independent contribution of these substructures. A post-1995 set of 21 asthmagens and 77 controls were selected to externally validate the model. Results: Nitrogen or oxygen containing functional groups such as isocyanate, amine, acid anhydride, and carbonyl were associated with an occupational asthma hazard, particularly when the functional group was present twice or more in the same molecule. A logistic regression model using only statistically significant independent variables for occupational asthma hazard correctly assigned 90% of the model development set. The external validation showed a sensitivity of 86% and specificity of 99%. Conclusions: Although a wide variety of chemical structures are associated with occupational asthma, bifunctional reactivity is strongly associated with occupational asthma hazard across a range of chemical substructures. This suggests that chemical cross-linking is an important molecular mechanism leading to the development of occupational asthma. The logistic regression model is freely available on the internet and may offer a useful but inexpensive adjunct to the prediction of occupational asthma hazard. PMID:15778257
Total Ambient Dose Equivalent Buildup Factor Determination for Nbs04 Concrete.
Duckic, Paulina; Hayes, Robert B
2018-06-01
Buildup factors are dimensionless multiplicative factors required by the point kernel method to account for scattered radiation through a shielding material. The accuracy of the point kernel method is strongly affected by the correspondence of analyzed parameters to experimental configurations, which is attempted to be simplified here. The point kernel method has not been found to have widespread practical use for neutron shielding calculations due to the complex neutron transport behavior through shielding materials (i.e. the variety of interaction mechanisms that neutrons may undergo while traversing the shield) as well as non-linear neutron total cross section energy dependence. In this work, total ambient dose buildup factors for NBS04 concrete are calculated in terms of neutron and secondary gamma ray transmission factors. The neutron and secondary gamma ray transmission factors are calculated using MCNP6™ code with updated cross sections. Both transmission factors and buildup factors are given in a tabulated form. Practical use of neutron transmission and buildup factors warrants rigorously calculated results with all associated uncertainties. In this work, sensitivity analysis of neutron transmission factors and total buildup factors with varying water content has been conducted. The analysis showed significant impact of varying water content in concrete on both neutron transmission factors and total buildup factors. Finally, support vector regression, a machine learning technique, has been engaged to make a model based on the calculated data for calculation of the buildup factors. The developed model can predict most of the data with 20% relative error.
2004-03-01
Breusch - Pagan test for constant variance of the residuals. Using Microsoft Excel® we calculate a p-value of 0.841237. This high p-value, which is above...our alpha of 0.05, indicates that our residuals indeed pass the Breusch - Pagan test for constant variance. In addition to the assumption tests , we...Wilk Test for Normality – Support (Reduced) Model (OLS) Finally, we perform a Breusch - Pagan test for constant variance of the residuals. Using
Solar energy distribution over Egypt using cloudiness from Meteosat photos
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mosalam Shaltout, M.A.; Hassen, A.H.
1990-01-01
In Egypt, there are 10 ground stations for measuring the global solar radiation, and five stations for measuring the diffuse solar radiation. Every day at noon, the Meteorological Authority in Cairo receives three photographs of cloudiness over Egypt from the Meteosat satellite, one in the visible, and two in the infra-red bands (10.5-12.5 {mu}m) and (5.7-7.1 {mu}m). The monthly average cloudiness for 24 sites over Egypt are measured and calculated from Meteosat observations during the period 1985-1986. Correlation analysis between the cloudiness observed by Meteosat and global solar radiation measured from the ground stations is carried out. It is foundmore » that, the correlation coefficients are about 0.90 for the simple linear regression, and increase for the second and third degree regressions. Also, the correlation coefficients for the cloudiness with the diffuse solar radiation are about 0.80 for the simple linear regression, and increase for the second and third degree regression. Models and empirical relations for estimating the global and diffuse solar radiation from Meteosat cloudiness data over Egypt are deduced and tested. Seasonal maps for the global and diffuse radiation over Egypt are carried out.« less
An efficient surrogate-based simulation-optimization method for calibrating a regional MODFLOW model
NASA Astrophysics Data System (ADS)
Chen, Mingjie; Izady, Azizallah; Abdalla, Osman A.
2017-01-01
Simulation-optimization method entails a large number of model simulations, which is computationally intensive or even prohibitive if the model simulation is extremely time-consuming. Statistical models have been examined as a surrogate of the high-fidelity physical model during simulation-optimization process to tackle this problem. Among them, Multivariate Adaptive Regression Splines (MARS), a non-parametric adaptive regression method, is superior in overcoming problems of high-dimensions and discontinuities of the data. Furthermore, the stability and accuracy of MARS model can be improved by bootstrap aggregating methods, namely, bagging. In this paper, Bagging MARS (BMARS) method is integrated to a surrogate-based simulation-optimization framework to calibrate a three-dimensional MODFLOW model, which is developed to simulate the groundwater flow in an arid hardrock-alluvium region in northwestern Oman. The physical MODFLOW model is surrogated by the statistical model developed using BMARS algorithm. The surrogate model, which is fitted and validated using training dataset generated by the physical model, can approximate solutions rapidly. An efficient Sobol' method is employed to calculate global sensitivities of head outputs to input parameters, which are used to analyze their importance for the model outputs spatiotemporally. Only sensitive parameters are included in the calibration process to further improve the computational efficiency. Normalized root mean square error (NRMSE) between measured and simulated heads at observation wells is used as the objective function to be minimized during optimization. The reasonable history match between the simulated and observed heads demonstrated feasibility of this high-efficient calibration framework.
Mochizuki, Yumi; Omura, Ken; Nakamura, Shin; Harada, Hiroyuki; Shibuya, Hitoshi; Kurabayashi, Toru
2012-02-01
This study aimed to construct a preoperative predictive model of cervical lymph node metastasis using fluorine-18 fluorodeoxyglucose positron-emission tomography/computerized tomography ((18)F-FDG PET/CT) findings in patients with oral or oropharyngeal squamous cell carcinoma (SCC). Forty-nine such patients undergoing preoperative (18)F-FDG PET/CT and neck dissection or lymph node biopsy were enrolled. Retrospective comparisons with spatial correlation between PET/CT and the anatomical sites based on histopathological examinations of surgical specimens were performed. We calculated a logistic regression model, including the SUVmax-related variable. When using the optimal cutoff point criterion of probabilities calculated from the model that included either clinical factors and delayed-phase SUVmax ≥0.087 or clinical factors and maximum standardized uptake (SUV) increasing rate (SUV-IR) ≥ 0.100, it significantly increased the sensitivity, specificity, and accuracy (87.5%, 65.7%, and 75.2%, respectively). The use of predictive models that include clinical factors and delayed-phase SUVmax and SUV-IR improve preoperative nodal diagnosis. Copyright © 2012 Elsevier Inc. All rights reserved.
Study on the medical meteorological forecast of the number of hypertension inpatient based on SVR
NASA Astrophysics Data System (ADS)
Zhai, Guangyu; Chai, Guorong; Zhang, Haifeng
2017-06-01
The purpose of this study is to build a hypertension prediction model by discussing the meteorological factors for hypertension incidence. The research method is selecting the standard data of relative humidity, air temperature, visibility, wind speed and air pressure of Lanzhou from 2010 to 2012(calculating the maximum, minimum and average value with 5 days as a unit ) as the input variables of Support Vector Regression(SVR) and the standard data of hypertension incidence of the same period as the output dependent variables to obtain the optimal prediction parameters by cross validation algorithm, then by SVR algorithm learning and training, a SVR forecast model for hypertension incidence is built. The result shows that the hypertension prediction model is composed of 15 input independent variables, the training accuracy is 0.005, the final error is 0.0026389. The forecast accuracy based on SVR model is 97.1429%, which is higher than statistical forecast equation and neural network prediction method. It is concluded that SVR model provides a new method for hypertension prediction with its simple calculation, small error as well as higher historical sample fitting and Independent sample forecast capability.
Stochastic optimal operation of reservoirs based on copula functions
NASA Astrophysics Data System (ADS)
Lei, Xiao-hui; Tan, Qiao-feng; Wang, Xu; Wang, Hao; Wen, Xin; Wang, Chao; Zhang, Jing-wen
2018-02-01
Stochastic dynamic programming (SDP) has been widely used to derive operating policies for reservoirs considering streamflow uncertainties. In SDP, there is a need to calculate the transition probability matrix more accurately and efficiently in order to improve the economic benefit of reservoir operation. In this study, we proposed a stochastic optimization model for hydropower generation reservoirs, in which 1) the transition probability matrix was calculated based on copula functions; and 2) the value function of the last period was calculated by stepwise iteration. Firstly, the marginal distribution of stochastic inflow in each period was built and the joint distributions of adjacent periods were obtained using the three members of the Archimedean copulas, based on which the conditional probability formula was derived. Then, the value in the last period was calculated by a simple recursive equation with the proposed stepwise iteration method and the value function was fitted with a linear regression model. These improvements were incorporated into the classic SDP and applied to the case study in Ertan reservoir, China. The results show that the transition probability matrix can be more easily and accurately obtained by the proposed copula function based method than conventional methods based on the observed or synthetic streamflow series, and the reservoir operation benefit can also be increased.
NASA Astrophysics Data System (ADS)
Arruda, Daniela C. S.; Sobral, J. H. A.; Abdu, M. A.; Castilho, Vivian M.; Takahashi, H.; Medeiros, A. F.; Buriti, R. A.
2006-01-01
This work presents equatorial ionospheric plasma bubble zonal drift velocity observations and their comparison with model calculations. The bubble zonal velocities were measured using airglow OI630 nm all-sky digital images and the model calculations were performed taking into account flux-tube integrated Pedersen conductivity and conductivity weighted neutral zonal winds. The digital images were obtained from an all-sky imaging system operated over the low-latitude station Cachoeira Paulista (Geogr. 22.5S, 45W, dip angle 31.5S) during the period from October 1998 to August 2000. Out of the 138 nights of imager observation, 29 nights with the presence of plasma bubbles are used in this study. These 29 nights correspond to geomagnetically rather quiet days (∑K P < 24+) and were grouped according to season. During the early night hours, the calculated zonal drift velocities were found to be larger than the experimental values. The best matching between the calculated and observed zonal velocities were seen to be for a few hours around midnight. The model calculation showed two humps around 20 LT and 24 LT that were not present in the data. Average decelerations obtained from linear regression between 20 LT and 24 LT were found to be: (a) Spring 1998, -8.61 ms -1 h -1; (b) Summer 1999, -0.59 ms -1 h -1; (c) Spring 1999, -11.72 ms -1 h -1; and (d) Summer 2000, -8.59 ms -1 h -1. Notice that Summer and Winter here correspond to southern hemisphere Summer and Winter, not northern hemisphere.
Long Term Exposure to NO2 and Diabetes Incidence in the Black Women's Health Study
Coogan, Patricia F.; White, Laura F.; Yu, Jeffrey; Burnett, Richard T.; Marshall, Julian D.; Seto, Edmund; Brook, Robert D.; Palmer, Julie R.; Rosenberg, Lynn; Jerrett, Michael
2016-01-01
While laboratory studies show that air pollutants can potentiate insulin resistance, the epidemiologic evidence regarding the association of air pollution with diabetes incidence is conflicting. The purpose of the present study was to assess the association of the traffic-related nitrogen dioxide (NO2) with the incidence of diabetes in a longitudinal cohort study of African American women. We used Cox proportional hazards models to calculate hazard ratios and 95% confidence intervals (CI) for diabetes associated with exposure to NO2 among 43,003 participants in the Black Women's Health Study (BWHS). Pollutant levels at participant residential locations were estimated with 1) a land use regression model for participants living in 56 metropolitan areas, and 2) a dispersion model for participants living in 27 of the cities. From 1995-2011, 4387 cases of diabetes occurred. The hazard ratios per interquartile range of NO2 (9.7 ppb), adjusted for age, metropolitan area, education, vigorous exercise, body mass index, smoking, and diet, were 0.96 (95% CI 0.88-1.06) using the land use regression model estimates and 0.94 (95% CI 0.80, 1.10) using the dispersion model estimates. The present results do not support the hypothesis that exposure to NO2 contributes to diabetes incidence in African American women. PMID:27124624
Modeling potential habitats for alien species Dreissena polymorpha in continental USA
Mingyang, Li; Yunwei, Ju; Kumar, Sunil; Stohlgren, Thomas J.
2008-01-01
The effective measure to minimize the damage of invasive species is to block the potential invasive species to enter into suitable areas. 1864 occurrence points with GPS coordinates and 34 environmental variables from Daymet datasets were gathered, and 4 modeling methods, i.e., Logistic Regression (LR), Classification and Regression Trees (CART), Genetic Algorithm for Rule-Set Prediction (GARP), and maximum entropy method (Maxent), were introduced to generate potential geographic distributions for invasive species Dreissena polymorpha in Continental USA. Then 3 statistical criteria of the area under the Receiver Operating Characteristic curve (AUC), Pearson correlation (COR) and Kappa value were calculated to evaluate the performance of the models, followed by analyses on major contribution variables. Results showed that in terms of the 3 statistical criteria, the prediction results of the 4 ecological niche models were either excellent or outstanding, in which Maxent outperformed the others in 3 aspects of predicting current distribution habitats, selecting major contribution factors, and quantifying the influence of environmental variables on habitats. Distance to water, elevation, frequency of precipitation and solar radiation were 4 environmental forcing factors. The method suggested in the paper can have some reference meaning for modeling habitats of alien species in China and provide a direction to prevent Mytilopsis sallei on the Chinese coast line.
Modeling CO{sub 2} and H{sub 2}S solubility in MDEA and DEA: Design implications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rochelle, G.T.; Posey, M.
1996-12-31
The solubility of H{sub 2}S and CO{sub 2} in aqueous alkanolamines affects solution capacity and the required circulation rate for acid gas absorption. These thermodynamics also determine the relationship of steam rate and the lean loading of the solution which in turn sets the leak of acid gas from the top of the absorber. Finally, the mechanisms of mass transfer and the role of kinetics, especially in stripping, depend on the vapor/liquid equilibria. Published measurements of CO{sub 2} and H{sub 2}S solubility in methyldiethanolamine (MDEA) and diethanolamine (DEA) are not in general agreement, especially at low loading of acid gas.more » The available sets of solubility data have been regressed with the AspenPlus electrolyte/NRTL model. All of the parameters and constants that make up this model have been carefully evaluated. Independent thermodynamic data such as freezing point and heat of mixing have been included in the regression to strengthen the estimates of model parameters. The parameters for each set of solubility data have been evaluated in an attempt to determine which set is correct. Each evaluated model has been used to calculate the acid gas capacity and minimum stripping steam rate for several industrial cases of acid gas absorption/stripping.« less
Motlagh, Mohadeseh Ghanbari; Kafaky, Sasan Babaie; Mataji, Asadollah; Akhavan, Reza
2018-05-21
Hyrcanian forests of North of Iran are of great importance in terms of various economic and environmental aspects. In this study, Spot-6 satellite images and regression models were applied to estimate above-ground biomass in these forests. This research was carried out in six compartments in three climatic (semi-arid to humid) types and two altitude classes. In the first step, ground sampling methods at the compartment level were used to estimate aboveground biomass (Mg/ha). Then, by reviewing the results of other studies, the most appropriate vegetation indices were selected. In this study, three indices of NDVI, RVI, and TVI were calculated. We investigated the relationship between the vegetation indices and aboveground biomass measured at sample-plot level. Based on the results, the relationship between aboveground biomass values and vegetation indices was a linear regression with the highest level of significance for NDVI in all compartments. Since at the compartment level the correlation coefficient between NDVI and aboveground biomass was the highest, NDVI was used for mapping aboveground biomass. According to the results of this study, biomass values were highly different in various climatic and altitudinal classes with the highest biomass value observed in humid climate and high-altitude class.
Mathur, Praveen; Sharma, Sarita; Soni, Bhupendra
2010-01-01
In the present work, an attempt is made to formulate multiple regression equations using all possible regressions method for groundwater quality assessment of Ajmer-Pushkar railway line region in pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed highest correlation with Ca and Mg. TDS showed significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4, and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r2 values and F values (at 95% confidence limit, alpha = 0.05) for the modelled equations indicated high degree of linearity among independent and dependent variables. Also the error % between calculated and experimental values was contained within +/- 15% limit.
Jürgens, Julian H W; Schulz, Nadine; Wybranski, Christian; Seidensticker, Max; Streit, Sebastian; Brauner, Jan; Wohlgemuth, Walter A; Deuerling-Zheng, Yu; Ricke, Jens; Dudeck, Oliver
2015-02-01
The objective of this study was to compare the parameter maps of a new flat-panel detector application for time-resolved perfusion imaging in the angiography room (FD-CTP) with computed tomography perfusion (CTP) in an experimental tumor model. Twenty-four VX2 tumors were implanted into the hind legs of 12 rabbits. Three weeks later, FD-CTP (Artis zeego; Siemens) and CTP (SOMATOM Definition AS +; Siemens) were performed. The parameter maps for the FD-CTP were calculated using a prototype software, and those for the CTP were calculated with VPCT-body software on a dedicated syngo MultiModality Workplace. The parameters were compared using Pearson product-moment correlation coefficient and linear regression analysis. The Pearson product-moment correlation coefficient showed good correlation values for both the intratumoral blood volume of 0.848 (P < 0.01) and the blood flow of 0.698 (P < 0.01). The linear regression analysis of the perfusion between FD-CTP and CTP showed for the blood volume a regression equation y = 4.44x + 36.72 (P < 0.01) and for the blood flow y = 0.75x + 14.61 (P < 0.01). This preclinical study provides evidence that FD-CTP allows a time-resolved (dynamic) perfusion imaging of tumors similar to CTP, which provides the basis for clinical applications such as the assessment of tumor response to locoregional therapies directly in the angiography suite.
Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong
2015-01-01
Purpose This study was conducted to evaluate colic pain as a prognostic pretreatment factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. The colic pain as the chief complaint was defined as either subjective flank pain during history taking and physical examination. Propensity-scores for established for colic pain was calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as predictor for stone-free status by Bayesian and non-Bayesian logistic regression model. Results After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in variables used in propensity- score matching. One-session success and stone-free rate were also higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions Colic pain in patients with ureter calculi was one of the significant predicting factors including MSL and MSD for one-session stone-free status of SWL. PMID:25902059
Hung, Shih-Chiang; Kung, Chia-Te; Hung, Chih-Wei; Liu, Ber-Ming; Liu, Jien-Wei; Chew, Ghee; Chuang, Hung-Yi; Lee, Wen-Huei; Lee, Tzu-Chi
2014-08-23
The adverse effects of delayed admission to the intensive care unit (ICU) have been recognized in previous studies. However, the definitions of delayed admission varies across studies. This study proposed a model to define "delayed admission", and explored the effect of ICU-waiting time on patients' outcome. This retrospective cohort study included non-traumatic adult patients on mechanical ventilation in the emergency department (ED), from July 2009 to June 2010. The primary outcomes measures were 21-ventilator-day mortality and prolonged hospital stays (over 30 days). Models of Cox regression and logistic regression were used for multivariate analysis. The non-delayed ICU-waiting was defined as a period in which the time effect on mortality was not statistically significant in a Cox regression model. To identify a suitable cut-off point between "delayed" and "non-delayed", subsets from the overall data were made based on ICU-waiting time and the hazard ratio of ICU-waiting hour in each subset was iteratively calculated. The cut-off time was then used to evaluate the impact of delayed ICU admission on mortality and prolonged length of hospital stay. The final analysis included 1,242 patients. The time effect on mortality emerged after 4 hours, thus we deduced ICU-waiting time in ED > 4 hours as delayed. By logistic regression analysis, delayed ICU admission affected the outcomes of 21 ventilator-days mortality and prolonged hospital stay, with odds ratio of 1.41 (95% confidence interval, 1.05 to 1.89) and 1.56 (95% confidence interval, 1.07 to 2.27) respectively. For patients on mechanical ventilation at the ED, delayed ICU admission is associated with higher probability of mortality and additional resource expenditure. A benchmark waiting time of no more than 4 hours for ICU admission is recommended.
Di Donato, Violante; Kontopantelis, Evangelos; Aletti, Giovanni; Casorelli, Assunta; Piacenti, Ilaria; Bogani, Giorgio; Lecce, Francesca; Benedetti Panici, Pierluigi
2017-06-01
Primary cytoreductive surgery (PDS) followed by platinum-based chemotherapy is the cornerstone of treatment and the absence of residual tumor after PDS is universally considered the most important prognostic factor. The aim of the present analysis was to evaluate trend and predictors of 30-day mortality in patients undergoing primary cytoreduction for ovarian cancer. Literature was searched for records reporting 30-day mortality after PDS. All cohorts were rated for quality. Simple and multiple Poisson regression models were used to quantify the association between 30-day mortality and the following: overall or severe complications, proportion of patients with stage IV disease, median age, year of publication, and weighted surgical complexity index. Using the multiple regression model, we calculated the risk of perioperative mortality at different levels for statistically significant covariates of interest. Simple regression identified median age and proportion of patients with stage IV disease as statistically significant predictors of 30-day mortality. When included in the multiple Poisson regression model, both remained statistically significant, with an incidence rate ratio of 1.087 for median age and 1.017 for stage IV disease. Disease stage was a strong predictor, with the risk estimated to increase from 2.8% (95% confidence interval 2.02-3.66) for stage III to 16.1% (95% confidence interval 6.18-25.93) for stage IV, for a cohort with a median age of 65 years. Metaregression demonstrated that increased age and advanced clinical stage were independently associated with an increased risk of mortality, and the combined effects of both factors greatly increased the risk.
Demographic responses of Pinguicula ionantha to prescribed fire: a regression-design LTRE approach.
Kesler, Herbert C; Trusty, Jennifer L; Hermann, Sharon M; Guyer, Craig
2008-06-01
This study describes the use of periodic matrix analysis and regression-design life table response experiments (LTRE) to investigate the effects of prescribed fire on demographic responses of Pinguicula ionantha, a federally listed plant endemic to the herb bog/savanna community in north Florida. Multi-state mark-recapture models with dead recoveries were used to estimate survival and transition probabilities for over 2,300 individuals in 12 populations of P. ionantha. These estimates were applied to parameterize matrix models used in further analyses. P. ionantha demographics were found to be strongly dependent on prescribed fire events. Periodic matrix models were used to evaluate season of burn (either growing or dormant season) for fire return intervals ranging from 1 to 20 years. Annual growing and biannual dormant season fires maximized population growth rates for this species. A regression design LTRE was used to evaluate the effect of number of days since last fire on population growth. Maximum population growth rates calculated using standard asymptotic analysis were realized shortly following a burn event (<2 years), and a regression design LTRE showed that short-term fire-mediated changes in vital rates translated into observed increases in population growth. The LTRE identified fecundity and individual growth as contributing most to increases in post-fire population growth. Our analyses found that the current four-year prescribed fire return intervals used at the study sites can be significantly shortened to increase the population growth rates of this rare species. Understanding the role of fire frequency and season in creating and maintaining appropriate habitat for this species may aid in the conservation of this and other rare herb bog/savanna inhabitants.
Adachi, Daiki; Nishiguchi, Shu; Fukutani, Naoto; Hotta, Takayuki; Tashiro, Yuto; Morino, Saori; Shirooka, Hidehiko; Nozaki, Yuma; Hirata, Hinako; Yamaguchi, Moe; Yorozu, Ayanori; Takahashi, Masaki; Aoyama, Tomoki
2017-05-01
The purpose of this study was to investigate which spatial and temporal parameters of the Timed Up and Go (TUG) test are associated with motor function in elderly individuals. This study included 99 community-dwelling women aged 72.9 ± 6.3 years. Step length, step width, single support time, variability of the aforementioned parameters, gait velocity, cadence, reaction time from starting signal to first step, and minimum distance between the foot and a marker placed to 3 in front of the chair were measured using our analysis system. The 10-m walk test, five times sit-to-stand (FTSTS) test, and one-leg standing (OLS) test were used to assess motor function. Stepwise multivariate linear regression analysis was used to determine which TUG test parameters were associated with each motor function test. Finally, we calculated a predictive model for each motor function test using each regression coefficient. In stepwise linear regression analysis, step length and cadence were significantly associated with the 10-m walk test, FTSTS and OLS test. Reaction time was associated with the FTSTS test, and step width was associated with the OLS test. Each predictive model showed a strong correlation with the 10-m walk test and OLS test (P < 0.01), which was not significant higher correlation than TUG test time. We showed which TUG test parameters were associated with each motor function test. Moreover, the TUG test time regarded as the lower extremity function and mobility has strong predictive ability in each motor function test. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
The Study of Rain Specific Attenuation for the Prediction of Satellite Propagation in Malaysia
NASA Astrophysics Data System (ADS)
Mandeep, J. S.; Ng, Y. Y.; Abdullah, H.; Abdullah, M.
2010-06-01
Specific attenuation is the fundamental quantity in the calculation of rain attenuation for terrestrial path and slant paths representing as rain attenuation per unit distance (dB/km). Specific attenuation is an important element in developing the predicted rain attenuation model. This paper deals with the empirical determination of the power law coefficients which allow calculating the specific attenuation in dB/km from the knowledge of the rain rate in mm/h. The main purpose of the paper is to obtain the coefficients of k and α of power law relationship between specific attenuation. Three years (from 1st January 2006 until 31st December 2008) rain gauge and beacon data taken from USM, Nibong Tebal have been used to do the empirical procedure analysis of rain specific attenuation. The data presented are semi-empirical in nature. A year-to-year variation of the coefficients has been indicated and the empirical measured data was compared with ITU-R provided regression coefficient. The result indicated that the USM empirical measured data was significantly vary from ITU-R predicted value. Hence, ITU-R recommendation for regression coefficients of rain specific attenuation is not suitable for predicting rain attenuation at Malaysia.
NASA Astrophysics Data System (ADS)
Villas Boas, M. D.; Olivera, F.; Azevedo, J. S.
2013-12-01
The evaluation of water quality through 'indexes' is widely used in environmental sciences. There are a number of methods available for calculating water quality indexes (WQI), usually based on site-specific parameters. In Brazil, WQI were initially used in the 1970s and were adapted from the methodology developed in association with the National Science Foundation (Brown et al, 1970). Specifically, the WQI 'IQA/SCQA', developed by the Institute of Water Management of Minas Gerais (IGAM), is estimated based on nine parameters: Temperature Range, Biochemical Oxygen Demand, Fecal Coliforms, Nitrate, Phosphate, Turbidity, Dissolved Oxygen, pH and Electrical Conductivity. The goal of this study was to develop a model for calculating the IQA/SCQA, for the Piabanha River basin in the State of Rio de Janeiro (Brazil), using only the parameters measurable by a Multiparameter Water Quality Sonde (MWQS) available in the study area. These parameters are: Dissolved Oxygen, pH and Electrical Conductivity. The use of this model will allow to further the water quality monitoring network in the basin, without requiring significant increases of resources. The water quality measurement with MWQS is less expensive than the laboratory analysis required for the other parameters. The water quality data used in the study were obtained by the Geological Survey of Brazil in partnership with other public institutions (i.e. universities and environmental institutes) as part of the project "Integrated Studies in Experimental and Representative Watersheds". Two models were developed to correlate the values of the three measured parameters and the IQA/SCQA values calculated based on all nine parameters. The results were evaluated according to the following validation statistics: coefficient of determination (R2), Root Mean Square Error (RMSE), Akaike information criterion (AIC) and Final Prediction Error (FPE). The first model was a linear stepwise regression between three independent variables (input) and one dependent variable (output) to establish an equation relating input to output. This model produced the following statistics: R2 = 0.85, RMSE = 6.19, AIC =0.65 and FPE = 1.93. The second model was a Feedforward Neural Network with one tan-sigmoid hidden layer (4 neurons) and one linear output layer. The neural network was trained based on a backpropagation algorithm using the input as predictors and the output as target. The following statistics were found: R2 = 0.95, RMSE = 4.86, AIC= 0.33 and FPE = 1.39. The second model produced a better fit than the first one, having a greater R2 and smaller RMSE, AIC and FPE. The best performance of the second method can be attributed to the fact that the water quality parameters often exhibit nonlinear behaviors and neural networks are capable of representing nonlinear relationship efficiently, while the regression is limited to linear relationships. References: Brown, R.M., McLelland, N.I., Deininger, R.A., Tozer, R.G.1970. A Water Quality Index-Do we dare? Water & Sewage Works, October: 339-343.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mbah, Chamberlain, E-mail: chamberlain.mbah@ugent.be; Department of Mathematical Modeling, Statistics, and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent; Thierens, Hubert
Purpose: To identify the main causes underlying the failure of prediction models for radiation therapy toxicity to replicate. Methods and Materials: Data were used from two German cohorts, Individual Radiation Sensitivity (ISE) (n=418) and Mammary Carcinoma Risk Factor Investigation (MARIE) (n=409), of breast cancer patients with similar characteristics and radiation therapy treatments. The toxicity endpoint chosen was telangiectasia. The LASSO (least absolute shrinkage and selection operator) logistic regression method was used to build a predictive model for a dichotomized endpoint (Radiation Therapy Oncology Group/European Organization for the Research and Treatment of Cancer score 0, 1, or ≥2). Internal areas undermore » the receiver operating characteristic curve (inAUCs) were calculated by a naïve approach whereby the training data (ISE) were also used for calculating the AUC. Cross-validation was also applied to calculate the AUC within the same cohort, a second type of inAUC. Internal AUCs from cross-validation were calculated within ISE and MARIE separately. Models trained on one dataset (ISE) were applied to a test dataset (MARIE) and AUCs calculated (exAUCs). Results: Internal AUCs from the naïve approach were generally larger than inAUCs from cross-validation owing to overfitting the training data. Internal AUCs from cross-validation were also generally larger than the exAUCs, reflecting heterogeneity in the predictors between cohorts. The best models with largest inAUCs from cross-validation within both cohorts had a number of common predictors: hypertension, normalized total boost, and presence of estrogen receptors. Surprisingly, the effect (coefficient in the prediction model) of hypertension on telangiectasia incidence was positive in ISE and negative in MARIE. Other predictors were also not common between the 2 cohorts, illustrating that overcoming overfitting does not solve the problem of replication failure of prediction models completely. Conclusions: Overfitting and cohort heterogeneity are the 2 main causes of replication failure of prediction models across cohorts. Cross-validation and similar techniques (eg, bootstrapping) cope with overfitting, but the development of validated predictive models for radiation therapy toxicity requires strategies that deal with cohort heterogeneity.« less
An example of complex modelling in dentistry using Markov chain Monte Carlo (MCMC) simulation.
Helfenstein, Ulrich; Menghini, Giorgio; Steiner, Marcel; Murati, Francesca
2002-09-01
In the usual regression setting one regression line is computed for a whole data set. In a more complex situation, each person may be observed for example at several points in time and thus a regression line might be calculated for each person. Additional complexities, such as various forms of errors in covariables may make a straightforward statistical evaluation difficult or even impossible. During recent years methods have been developed allowing convenient analysis of problems where the data and the corresponding models show these and many other forms of complexity. The methodology makes use of a Bayesian approach and Markov chain Monte Carlo (MCMC) simulations. The methods allow the construction of increasingly elaborate models by building them up from local sub-models. The essential structure of the models can be represented visually by directed acyclic graphs (DAG). This attractive property allows communication and discussion of the essential structure and the substantial meaning of a complex model without needing algebra. After presentation of the statistical methods an example from dentistry is presented in order to demonstrate their application and use. The dataset of the example had a complex structure; each of a set of children was followed up over several years. The number of new fillings in permanent teeth had been recorded at several ages. The dependent variables were markedly different from the normal distribution and could not be transformed to normality. In addition, explanatory variables were assumed to be measured with different forms of error. Illustration of how the corresponding models can be estimated conveniently via MCMC simulation, in particular, 'Gibbs sampling', using the freely available software BUGS is presented. In addition, how the measurement error may influence the estimates of the corresponding coefficients is explored. It is demonstrated that the effect of the independent variable on the dependent variable may be markedly underestimated if the measurement error is not taken into account ('regression dilution bias'). Markov chain Monte Carlo methods may be of great value to dentists in allowing analysis of data sets which exhibit a wide range of different forms of complexity.
Røislien, Jo; Clausen, Thomas; Gran, Jon Michael; Bukten, Anne
2014-05-17
The reduction of crime is an important outcome of opioid maintenance treatment (OMT). Criminal intensity and treatment regimes vary among OMT patients, but this is rarely adjusted for in statistical analyses, which tend to focus on cohort incidence rates and rate ratios. The purpose of this work was to estimate the relationship between treatment and criminal convictions among OMT patients, adjusting for individual covariate information and timing of events, fitting time-to-event regression models of increasing complexity. National criminal records were cross linked with treatment data on 3221 patients starting OMT in Norway 1997-2003. In addition to calculating cohort incidence rates, criminal convictions was modelled as a recurrent event dependent variable, and treatment a time-dependent covariate, in Cox proportional hazards, Aalen's additive hazards, and semi-parametric additive hazards regression models. Both fixed and dynamic covariates were included. During OMT, the number of days with criminal convictions for the cohort as a whole was 61% lower than when not in treatment. OMT was associated with reduced number of days with criminal convictions in all time-to-event regression models, but the hazard ratio (95% CI) was strongly attenuated when adjusting for covariates; from 0.40 (0.35, 0.45) in a univariate model to 0.79 (0.72, 0.87) in a fully adjusted model. The hazard was lower for females and decreasing with older age, while increasing with high numbers of criminal convictions prior to application to OMT (all p < 0.001). The strongest predictors were level of criminal activity prior to entering into OMT, and having a recent criminal conviction (both p < 0.001). The effect of several predictors was significantly time-varying with their effects diminishing over time. Analyzing complex observational data regarding to fixed factors only overlooks important temporal information, and naïve cohort level incidence rates might result in biased estimates of the effect of interventions. Applying time-to-event regression models, properly adjusting for individual covariate information and timing of various events, allows for more precise and reliable effect estimates, as well as painting a more nuanced picture that can aid health care professionals and policy makers.
Baldwin, Austin K.; Robertson, Dale M.; Saad, David A.; Magruder, Christopher
2013-01-01
In 2008, the U.S. Geological Survey and the Milwaukee Metropolitan Sewerage District initiated a study to develop regression models to estimate real-time concentrations and loads of chloride, suspended solids, phosphorus, and bacteria in streams near Milwaukee, Wisconsin. To collect monitoring data for calibration of models, water-quality sensors and automated samplers were installed at six sites in the Menomonee River drainage basin. The sensors continuously measured four potential explanatory variables: water temperature, specific conductance, dissolved oxygen, and turbidity. Discrete water-quality samples were collected and analyzed for five response variables: chloride, total suspended solids, total phosphorus, Escherichia coli bacteria, and fecal coliform bacteria. Using the first year of data, regression models were developed to continuously estimate the response variables on the basis of the continuously measured explanatory variables. Those models were published in a previous report. In this report, those models are refined using 2 years of additional data, and the relative improvement in model predictability is discussed. In addition, a set of regression models is presented for a new site in the Menomonee River Basin, Underwood Creek at Wauwatosa. The refined models use the same explanatory variables as the original models. The chloride models all used specific conductance as the explanatory variable, except for the model for the Little Menomonee River near Freistadt, which used both specific conductance and turbidity. Total suspended solids and total phosphorus models used turbidity as the only explanatory variable, and bacteria models used water temperature and turbidity as explanatory variables. An analysis of covariance (ANCOVA), used to compare the coefficients in the original models to those in the refined models calibrated using all of the data, showed that only 3 of the 25 original models changed significantly. Root-mean-squared errors (RMSEs) calculated for both the original and refined models using the entire dataset showed a median improvement in RMSE of 2.1 percent, with a range of 0.0–13.9 percent. Therefore most of the original models did almost as well at estimating concentrations during the validation period (October 2009–September 2011) as the refined models, which were calibrated using those data. Application of these refined models can produce continuously estimated concentrations of chloride, total suspended solids, total phosphorus, E. coli bacteria, and fecal coliform bacteria that may assist managers in quantifying the effects of land-use changes and improvement projects, establish total maximum daily loads, and enable better informed decision making in the future.
New functions for estimating AOT40 from ozone passive sampling
NASA Astrophysics Data System (ADS)
De Marco, Alessandra; Vitale, Marcello; Kilic, Umit; Serengil, Yusuf; Paoletti, Elena
2014-10-01
AOT40 is the present European standard to assess whether ozone (O3) pollution is a risk for vegetation, and is calculated by using hourly O3 concentrations from automatic devices, i.e. by active monitoring. Passive O3 monitoring is widespread in remote environments. The Loibl function estimates the mean daily O3 profile and thus hourly O3 concentrations, and has been proposed to calculate AOT40 from passive samplers. We investigated whether this function performs well in inhomogeneous terrains such as over the Italian country. Data from 75 active monitoring stations (28 rural and 47 suburban) were analysed over two years. AOT40 was calculated from hourly O3 data either measured by active measurements or estimated by the Loibl function applied to biweekly averages of active-measurement hourly data. The latter approach simulated the data obtained from passive monitoring, as two weeks is the usual exposure window of passive samplers. Residuals between AOT40 estimated by applying the Loibl function and AOT40 calculated from active monitoring ranged from +241% to -107%, suggesting that the Loibl function is inadequate to accurately predict AOT40 in Italy. New statistical models were built for both rural and suburban areas by using non-linear models and including predictors that can be easily measured at forest sites. The modelled AOT40 values strongly depended on physical predictors (latitude and longitude), alone or in combination with other predictors, such as seasonal cumulated ozone and elevation. These results suggest that multi-variate, non-linear regression models work better than the Loibl-based approach in estimating AOT40.
NASA Astrophysics Data System (ADS)
Gong, Lunkun; Chen, Xiong; Musa, Omer; Yang, Haitao; Zhou, Changsheng
2017-12-01
Numerical and experimental investigation on the solid-fuel ramjet was carried out to study the effect of geometry on combustion characteristics. The two-dimensional axisymmetric program developed in the present study adopted finite rate chemistry and second-order moment turbulence-chemistry models, together with k-ω shear stress transport (SST) turbulence model. Experimental data were obtained by burning cylindrical polyethylene using a connected pipe facility. The simulation results show that a fuel-rich zone near the solid fuel surface and an air-rich zone in the core exist in the chamber, and the chemical reactions occur mainly in the interface of this two regions; The physical reasons for the effect of geometry on regression rate is the variation of turbulent viscosity due to the geometry change. Port-to-inlet diameter ratio is the main parameter influencing the turbulent viscosity, and a linear relationship between port-to-inlet diameter and regression rate were obtained. The air mass flow rate and air-fuel ratio are the main influencing factors on ramjet performances. Based on the simulation results, the correlations between geometry and air-fuel ratio were obtained, and the effect of geometry on ramjet performances was analyzed according to the correlation. Three-dimensional regression rate contour obtained experimentally indicates that the regression rate which shows axisymmetric distribution due to the symmetry structure increases sharply, followed by slow decrease in axial direction. The radiation heat transfer in recirculation zone cannot be ignored. Compared with the experimental results, the deviations of calculated average regression rate and characteristic velocity are about 5%. Concerning the effect of geometry on air-fuel ratio, the deviations between experimental and theoretical results are less than 10%.
Poisson Mixture Regression Models for Heart Disease Prediction.
Mufudza, Chipo; Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
Poisson Mixture Regression Models for Heart Disease Prediction
Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model. PMID:27999611
Isma’eel, Hussain A.; Sakr, George E.; Almedawar, Mohamad M.; Fathallah, Jihan; Garabedian, Torkom; Eddine, Savo Bou Zein
2015-01-01
Background High dietary salt intake is directly linked to hypertension and cardiovascular diseases (CVDs). Predicting behaviors regarding salt intake habits is vital to guide interventions and increase their effectiveness. We aim to compare the accuracy of an artificial neural network (ANN) based tool that predicts behavior from key knowledge questions along with clinical data in a high cardiovascular risk cohort relative to the least square models (LSM) method. Methods We collected knowledge, attitude and behavior data on 115 patients. A behavior score was calculated to classify patients’ behavior towards reducing salt intake. Accuracy comparison between ANN and regression analysis was calculated using the bootstrap technique with 200 iterations. Results Starting from a 69-item questionnaire, a reduced model was developed and included eight knowledge items found to result in the highest accuracy of 62% CI (58-67%). The best prediction accuracy in the full and reduced models was attained by ANN at 66% and 62%, respectively, compared to full and reduced LSM at 40% and 34%, respectively. The average relative increase in accuracy over all in the full and reduced models is 82% and 102%, respectively. Conclusions Using ANN modeling, we can predict salt reduction behaviors with 66% accuracy. The statistical model has been implemented in an online calculator and can be used in clinics to estimate the patient’s behavior. This will help implementation in future research to further prove clinical utility of this tool to guide therapeutic salt reduction interventions in high cardiovascular risk individuals. PMID:26090333
NASA Astrophysics Data System (ADS)
Haack, Lukas; Peniche, Ricardo; Sommer, Lutz; Kather, Alfons
2017-06-01
At early project stages, the main CSP plant design parameters such as turbine capacity, solar field size, and thermal storage capacity are varied during the techno-economic optimization to determine most suitable plant configurations. In general, a typical meteorological year with at least hourly time resolution is used to analyze each plant configuration. Different software tools are available to simulate the annual energy yield. Software tools offering a thermodynamic modeling approach of the power block and the CSP thermal cycle, such as EBSILONProfessional®, allow a flexible definition of plant topologies. In EBSILON, the thermodynamic equilibrium for each time step is calculated iteratively (quasi steady state), which requires approximately 45 minutes to process one year with hourly time resolution. For better presentation of gradients, 10 min time resolution is recommended, which increases processing time by a factor of 5. Therefore, analyzing a large number of plant sensitivities, as required during the techno-economic optimization procedure, the detailed thermodynamic simulation approach becomes impracticable. Suntrace has developed an in-house CSP-Simulation tool (CSPsim), based on EBSILON and applying predictive models, to approximate the CSP plant performance for central receiver and parabolic trough technology. CSPsim significantly increases the speed of energy yield calculations by factor ≥ 35 and has automated the simulation run of all predefined design configurations in sequential order during the optimization procedure. To develop the predictive models, multiple linear regression techniques and Design of Experiment methods are applied. The annual energy yield and derived LCOE calculated by the predictive model deviates less than ±1.5 % from the thermodynamic simulation in EBSILON and effectively identifies the optimal range of main design parameters for further, more specific analysis.
Stewart, James A.; Kohnert, Aaron A.; Capolungo, Laurent; ...
2018-03-06
The complexity of radiation effects in a material’s microstructure makes developing predictive models a difficult task. In principle, a complete list of all possible reactions between defect species being considered can be used to elucidate damage evolution mechanisms and its associated impact on microstructure evolution. However, a central limitation is that many models use a limited and incomplete catalog of defect energetics and associated reactions. Even for a given model, estimating its input parameters remains a challenge, especially for complex material systems. Here, we present a computational analysis to identify the extent to which defect accumulation, energetics, and irradiation conditionsmore » can be determined via forward and reverse regression models constructed and trained from large data sets produced by cluster dynamics simulations. A global sensitivity analysis, via Sobol’ indices, concisely characterizes parameter sensitivity and demonstrates how this can be connected to variability in defect evolution. Based on this analysis and depending on the definition of what constitutes the input and output spaces, forward and reverse regression models are constructed and allow for the direct calculation of defect accumulation, defect energetics, and irradiation conditions. Here, this computational analysis, exercised on a simplified cluster dynamics model, demonstrates the ability to design predictive surrogate and reduced-order models, and provides guidelines for improving model predictions within the context of forward and reverse engineering of mathematical models for radiation effects in a materials’ microstructure.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stewart, James A.; Kohnert, Aaron A.; Capolungo, Laurent
The complexity of radiation effects in a material’s microstructure makes developing predictive models a difficult task. In principle, a complete list of all possible reactions between defect species being considered can be used to elucidate damage evolution mechanisms and its associated impact on microstructure evolution. However, a central limitation is that many models use a limited and incomplete catalog of defect energetics and associated reactions. Even for a given model, estimating its input parameters remains a challenge, especially for complex material systems. Here, we present a computational analysis to identify the extent to which defect accumulation, energetics, and irradiation conditionsmore » can be determined via forward and reverse regression models constructed and trained from large data sets produced by cluster dynamics simulations. A global sensitivity analysis, via Sobol’ indices, concisely characterizes parameter sensitivity and demonstrates how this can be connected to variability in defect evolution. Based on this analysis and depending on the definition of what constitutes the input and output spaces, forward and reverse regression models are constructed and allow for the direct calculation of defect accumulation, defect energetics, and irradiation conditions. Here, this computational analysis, exercised on a simplified cluster dynamics model, demonstrates the ability to design predictive surrogate and reduced-order models, and provides guidelines for improving model predictions within the context of forward and reverse engineering of mathematical models for radiation effects in a materials’ microstructure.« less
Age estimation standards for a Western Australian population using the coronal pulp cavity index.
Karkhanis, Shalmira; Mack, Peter; Franklin, Daniel
2013-09-10
Age estimation is a vital aspect in creating a biological profile and aids investigators by narrowing down potentially matching identities from the available pool. In addition to routine casework, in the present global political scenario, age estimation in living individuals is required in cases of refugees, asylum seekers, human trafficking and to ascertain age of criminal responsibility. Thus robust methods that are simple, non-invasive and ethically viable are required. The aim of the present study is, therefore, to test the reliability and applicability of the coronal pulp cavity index method, for the purpose of developing age estimation standards for an adult Western Australian population. A total of 450 orthopantomograms (220 females and 230 males) of Australian individuals were analyzed. Crown and coronal pulp chamber heights were measured in the mandibular left and right premolars, and the first and second molars. These measurements were then used to calculate the tooth coronal index. Data was analyzed using paired sample t-tests to assess bilateral asymmetry followed by simple linear and multiple regressions to develop age estimation models. The most accurate age estimation based on simple linear regression model was with mandibular right first molar (SEE ±8.271 years). Multiple regression models improved age prediction accuracy considerably and the most accurate model was with bilateral first and second molars (SEE ±6.692 years). This study represents the first investigation of this method in a Western Australian population and our results indicate that the method is suitable for forensic application. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Srinivas, Nuggehally R; Syed, Muzeeb
2016-01-01
Limited pharmacokinetic sampling strategy may be useful for predicting the area under the curve (AUC) for triptans and may have clinical utility as a prospective tool for prediction. Using appropriate intranasal pharmacokinetic data, a Cmax vs. AUC relationship was established by linear regression models for sumatriptan and zolmitriptan. The predictions of the AUC values were performed using published mean/median Cmax data and appropriate regression lines. The quotient of observed and predicted values rendered fold-difference calculation. The mean absolute error (MAE), mean positive error (MPE), mean negative error (MNE), root mean square error (RMSE), correlation coefficient (r), and the goodness of the AUC fold prediction were used to evaluate the two triptans. Also, data from the mean concentration profiles at time points of 1 hour (sumatriptan) and 3 hours (zolmitriptan) were used for the AUC prediction. The Cmax vs. AUC models displayed excellent correlation for both sumatriptan (r = .9997; P < .001) and zolmitriptan (r = .9999; P < .001). Irrespective of the two triptans, the majority of the predicted AUCs (83%-85%) were within 0.76-1.25-fold difference using the regression model. The prediction of AUC values for sumatriptan or zolmitriptan using the concentration data that reflected the Tmax occurrence were in the proximity of the reported values. In summary, the Cmax vs. AUC models exhibited strong correlations for sumatriptan and zolmitriptan. The usefulness of the prediction of the AUC values was established by a rigorous statistical approach.
Hendriks, A Jan; Smítková, Hana; Huijbregts, Mark A J
2007-11-01
Exposure of humans to chemicals in beef or milk is part of almost all risk evaluation procedures carried out to reduce emissions or to remediate sites. Concentrations of substances in these livestock products are often estimated using log-log regressions that relate the biotransfer factor BTF to the octanol-water partition ratio K(ow). However, the correctness of these empirical correlations has been questioned. Here, we compare them to the mechanistic model OMEGA that describes the distribution of substances in organisms by integrating theory on chemical fugacity and biological allometry. OMEGA has been calibrated and validated on thousands of laboratory and field data, reflecting many chemical substances and biological species. Overall fluxes of water, food, tissue (growth), milk and stable substances calculated by OMEGA are within a factor of two from independent data obtained in experiments. Rate constants measured for elimination of individual compounds of a recalcitrant nature vary around the level expected from the model for output to faeces and milk. Both data and model suggest that biotransfer BTF of stable substances to beef and milk is independent of the octanol-water partition ratio K(ow) in the range of 10(3)-10(6). This contradicts empirical regressions including stable and labile compounds. As expected, levels of labile substances vary widely around a tentative indication derived from the model. Transformation and accumulation of labile substances remains highly specific for the chemical and organism concerned but depends weakly on the octanol-water partition ratio K(ow). Several possibilities for additional refinement are identified.
Boiret, Mathieu; Meunier, Loïc; Ginot, Yves-Michel
2011-02-20
A near infrared (NIR) method was developed for determination of tablet potency of active pharmaceutical ingredient (API) in a complex coated tablet matrix. The calibration set contained samples from laboratory and production scale batches. The reference values were obtained by high performance liquid chromatography (HPLC) and partial least squares (PLS) regression was used to establish a model. The model was challenged by calculating tablet potency of two external test sets. Root mean square errors of prediction were respectively equal to 2.0% and 2.7%. To use this model with a second spectrometer from the production field, a calibration transfer method called piecewise direct standardisation (PDS) was used. After the transfer, the root mean square error of prediction of the first test set was 2.4% compared to 4.0% without transferring the spectra. A statistical technique using bootstrap of PLS residuals was used to estimate confidence intervals of tablet potency calculations. This method requires an optimised PLS model, selection of the bootstrap number and determination of the risk. In the case of a chemical analysis, the tablet potency value will be included within the confidence interval calculated by the bootstrap method. An easy to use graphical interface was developed to easily determine if the predictions, surrounded by minimum and maximum values, are within the specifications defined by the regulatory organisation. Copyright © 2010 Elsevier B.V. All rights reserved.
Hartzell, S.; Leeds, A.; Frankel, A.; Williams, R.A.; Odum, J.; Stephenson, W.; Silva, W.
2002-01-01
The Seattle fault poses a significant seismic hazard to the city of Seattle, Washington. A hybrid, low-frequency, high-frequency method is used to calculate broadband (0-20 Hz) ground-motion time histories for a M 6.5 earthquake on the Seattle fault. Low frequencies (1 Hz) are calculated by a stochastic method that uses a fractal subevent size distribution to give an ω-2 displacement spectrum. Time histories are calculated for a grid of stations and then corrected for the local site response using a classification scheme based on the surficial geology. Average shear-wave velocity profiles are developed for six surficial geologic units: artificial fill, modified land, Esperance sand, Lawton clay, till, and Tertiary sandstone. These profiles together with other soil parameters are used to compare linear, equivalent-linear, and nonlinear predictions of ground motion in the frequency band 0-15 Hz. Linear site-response corrections are found to yield unreasonably large ground motions. Equivalent-linear and nonlinear calculations give peak values similar to the 1994 Northridge, California, earthquake and those predicted by regression relationships. Ground-motion variance is estimated for (1) randomization of the velocity profiles, (2) variation in source parameters, and (3) choice of nonlinear model. Within the limits of the models tested, the results are found to be most sensitive to the nonlinear model and soil parameters, notably the over consolidation ratio.
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
Forecast of future aviation fuels: The model
NASA Technical Reports Server (NTRS)
Ayati, M. B.; Liu, C. Y.; English, J. M.
1981-01-01
A conceptual models of the commercial air transportation industry is developed which can be used to predict trends in economics, demand, and consumption. The methodology is based on digraph theory, which considers the interaction of variables and propagation of changes. Air transportation economics are treated by examination of major variables, their relationships, historic trends, and calculation of regression coefficients. A description of the modeling technique and a compilation of historic airline industry statistics used to determine interaction coefficients are included. Results of model validations show negligible difference between actual and projected values over the twenty-eight year period of 1959 to 1976. A limited application of the method presents forecasts of air tranportation industry demand, growth, revenue, costs, and fuel consumption to 2020 for two scenarios of future economic growth and energy consumption.
Genetic evaluation of lactation persistency for five breeds of dairy cattle.
Cole, J B; Null, D J
2009-05-01
Cows with high lactation persistency tend to produce less milk than expected at the beginning of lactation and more than expected at the end. Best prediction of lactation persistency is calculated as a function of trait-specific standard lactation curves and linear regressions of test-day deviations on days in milk. Because regression coefficients are deviations from a tipping point selected to make yield and lactation persistency phenotypically uncorrelated it should be possible to use 305-d actual yield and lactation persistency to predict yield for lactations with later endpoints. The objectives of this study were to calculate (co)variance components and breeding values for best predictions of lactation persistency of milk (PM), fat (PF), protein (PP), and somatic cell score (PSCS) in breeds other than Holstein, and to demonstrate the calculation of prediction equations for 400-d actual milk yield. Data included lactations from Ayrshire, Brown Swiss, Guernsey (GU), Jersey (JE), and Milking Shorthorn (MS) cows calving since 1997. The number of sires evaluated ranged from 86 (MS) to 3,192 (JE), and mean sire estimated breeding value for PM ranged from 0.001 (Ayrshire) to 0.10 (Brown Swiss); mean estimated breeding value for PSCS ranged from -0.01 (MS) to -0.043 (JE). Heritabilities were generally highest for PM (0.09 to 0.15) and lowest for PSCS (0.03 to 0.06), with PF and PP having intermediate values (0.07 to 0.13). Repeatabilities varied considerably between breeds, ranging from 0.08 (PSCS in GU, JE, and MS) to 0.28 (PM in GU). Genetic correlations of PM, PF, and PP with PSCS were moderate and favorable (negative), indicating that increasing lactation persistency of yield traits is associated with decreases in lactation persistency of SCS, as expected. Genetic correlations among yield and lactation persistency were low to moderate and ranged from -0.55 (PP in GU) to 0.40 (PP in MS). Prediction equations for 400-d milk yield were calculated for each breed by regression of both 305-d yield and 305-d yield and lactation persistency on 400-d yield. Goodness-of-fit was very good for both models, but the addition of lactation persistency to the model significantly improved fit in all cases. Routine genetic evaluations for lactation persistency, as well as the development of prediction equations for several lactation end-points, may provide producers with tools to better manage their herds.
Survival Regression Modeling Strategies in CVD Prediction.
Barkhordari, Mahnaz; Padyab, Mojgan; Sardarinia, Mahsa; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza
2016-04-01
A fundamental part of prevention is prediction. Potential predictors are the sine qua non of prediction models. However, whether incorporating novel predictors to prediction models could be directly translated to added predictive value remains an area of dispute. The difference between the predictive power of a predictive model with (enhanced model) and without (baseline model) a certain predictor is generally regarded as an indicator of the predictive value added by that predictor. Indices such as discrimination and calibration have long been used in this regard. Recently, the use of added predictive value has been suggested while comparing the predictive performances of the predictive models with and without novel biomarkers. User-friendly statistical software capable of implementing novel statistical procedures is conspicuously lacking. This shortcoming has restricted implementation of such novel model assessment methods. We aimed to construct Stata commands to help researchers obtain the aforementioned statistical indices. We have written Stata commands that are intended to help researchers obtain the following. 1, Nam-D'Agostino X 2 goodness of fit test; 2, Cut point-free and cut point-based net reclassification improvement index (NRI), relative absolute integrated discriminatory improvement index (IDI), and survival-based regression analyses. We applied the commands to real data on women participating in the Tehran lipid and glucose study (TLGS) to examine if information relating to a family history of premature cardiovascular disease (CVD), waist circumference, and fasting plasma glucose can improve predictive performance of Framingham's general CVD risk algorithm. The command is adpredsurv for survival models. Herein we have described the Stata package "adpredsurv" for calculation of the Nam-D'Agostino X 2 goodness of fit test as well as cut point-free and cut point-based NRI, relative and absolute IDI, and survival-based regression analyses. We hope this work encourages the use of novel methods in examining predictive capacity of the emerging plethora of novel biomarkers.
Bioprosthetic Aortic Valve Durability: A Meta-Regression of Published Studies.
Wang, Mansen; Furnary, Anthony P; Li, Hsin-Fang; Grunkemeier, Gary L
2017-09-01
To compare structural valve deterioration (SVD) among bioprosthetic aortic valve types, a PubMed search found 54 papers containing SVD-free curves extending to at least 10 years. The curves were digitized and fit to Weibull distributions, and the mean times to valve failure (MTTF) were calculated. Twelve valve models were collapsed into four valve types: Medtronic (Medtronic, Minneapolis, MN) and Edwards (Edwards Lifesciences, Irvine, CA) porcine; and Sorin (Sorin Group [now LivaNova], London, United Kingdom) and Edwards pericardial. Meta-regression found MTTF was associated with the patient's age, publication year, SVD definition, and valve type. Sorin pericardial valves had significantly lower risk-adjusted MTTF (higher SVD risk), and there were no significant differences in MTTF among the other three valve types. Copyright © 2017 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
Ren, Jianqiang; Chen, Zhongxin; Tang, Huajun
2006-12-01
Taking Jining City of Shandong Province, one of the most important winter wheat production regions in Huanghuaihai Plain as an example, the winter wheat yield was estimated by using the 250 m MODIS-NDVI data smoothed by Savitzky-Golay filter. The NDVI values between 0. 20 and 0. 80 were selected, and the sum of NDVI value for each county was calculated to build its relation with winter wheat yield. By using stepwise regression method, the linear regression model between NDVI and winter wheat yield was established, with the precision validated by the ground survey data. The results showed that the relative error of predicted yield was between -3.6% and 3.9%, suggesting that the method was relatively accurate and feasible.
NASA Technical Reports Server (NTRS)
Shykoff, Barbara E.; Swanson, Harvey T.
1987-01-01
A new method for correction of mass spectrometer output signals is described. Response-time distortion is reduced independently of any model of mass spectrometer behavior. The delay of the system is found first from the cross-correlation function of a step change and its response. A two-sided time-domain digital correction filter (deconvolution filter) is generated next from the same step response data using a regression procedure. Other data are corrected using the filter and delay. The mean squared error between a step response and a step is reduced considerably more after the use of a deconvolution filter than after the application of a second-order model correction. O2 consumption and CO2 production values calculated from data corrupted by a simulated dynamic process return to near the uncorrupted values after correction. Although a clean step response or the ensemble average of several responses contaminated with noise is needed for the generation of the filter, random noise of magnitude not above 0.5 percent added to the response to be corrected does not impair the correction severely.
Methodology for the development of normative data for Spanish-speaking pediatric populations.
Rivera, D; Arango-Lasprilla, J C
2017-01-01
To describe the methodology utilized to calculate reliability and the generation of norms for 10 neuropsychological tests for children in Spanish-speaking countries. The study sample consisted of over 4,373 healthy children from nine countries in Latin America (Chile, Cuba, Ecuador, Guatemala, Honduras, Mexico, Paraguay, Peru, and Puerto Rico) and Spain. Inclusion criteria for all countries were to have between 6 to 17 years of age, an Intelligence Quotient of≥80 on the Test of Non-Verbal Intelligence (TONI-2), and score of <19 on the Children's Depression Inventory. Participants completed 10 neuropsychological tests. Reliability and norms were calculated for all tests. Test-retest analysis showed excellent or good- reliability on all tests (r's>0.55; p's<0.001) except M-WCST perseverative errors whose coefficient magnitude was fair. All scores were normed using multiple linear regressions and standard deviations of residual values. Age, age2, sex, and mean level of parental education (MLPE) were included as predictors in the models by country. The non-significant variables (p > 0.05) were removed and the analysis were run again. This is the largest Spanish-speaking children and adolescents normative study in the world. For the generation of normative data, the method based on linear regression models and the standard deviation of residual values was used. This method allows determination of the specific variables that predict test scores, helps identify and control for collinearity of predictive variables, and generates continuous and more reliable norms than those of traditional methods.
Carsin-Vu, Aline; Corouge, Isabelle; Commowick, Olivier; Bouzillé, Guillaume; Barillot, Christian; Ferré, Jean-Christophe; Proisy, Maia
2018-04-01
To investigate changes in cerebral blood flow (CBF) in gray matter (GM) between 6 months and 15 years of age and to provide CBF values for the brain, GM, white matter (WM), hemispheres and lobes. Between 2013 and 2016, we retrospectively included all clinical MRI examinations with arterial spin labeling (ASL). We excluded subjects with a condition potentially affecting brain perfusion. For each subject, mean values of CBF in the brain, GM, WM, hemispheres and lobes were calculated. GM CBF was fitted using linear, quadratic and cubic polynomial regression against age. Regression models were compared with Akaike's information criterion (AIC), and Likelihood Ratio tests. 84 children were included (44 females/40 males). Mean CBF values were 64.2 ± 13.8 mL/100 g/min in GM, and 29.3 ± 10.0 mL/100 g/min in WM. The best-fit model of brain perfusion was the cubic polynomial function (AIC = 672.7, versus respectively AIC = 673.9 and AIC = 674.1 with the linear negative function and the quadratic polynomial function). A statistically significant difference between the tested models demonstrating the superiority of the quadratic (p = 0.18) or cubic polynomial model (p = 0.06), over the negative linear regression model was not found. No effect of general anesthesia (p = 0.34) or of gender (p = 0.16) was found. we provided values for ASL CBF in the brain, GM, WM, hemispheres, and lobes over a wide pediatric age range, approximately showing inverted U-shaped changes in GM perfusion over the course of childhood. Copyright © 2018 Elsevier B.V. All rights reserved.
Parametric regression model for survival data: Weibull regression model as an example
2016-01-01
Weibull regression model is one of the most popular forms of parametric regression model that it provides estimate of baseline hazard function, as well as coefficients for covariates. Because of technical difficulties, Weibull regression model is seldom used in medical literature as compared to the semi-parametric proportional hazard model. To make clinical investigators familiar with Weibull regression model, this article introduces some basic knowledge on Weibull regression model and then illustrates how to fit the model with R software. The SurvRegCensCov package is useful in converting estimated coefficients to clinical relevant statistics such as hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by categorical variable. The eha package provides an alternative method to model Weibull regression model. The check.dist() function helps to assess goodness-of-fit of the model. Variable selection is based on the importance of a covariate, which can be tested using anova() function. Alternatively, backward elimination starting from a full model is an efficient way for model development. Visualization of Weibull regression model after model development is interesting that it provides another way to report your findings. PMID:28149846
Risk Factors for Suicidal Ideation in People at Risk for Huntington's Disease.
Anderson, Karen E; Eberly, Shirley; Groves, Mark; Kayson, Elise; Marder, Karen; Young, Anne B; Shoulson, Ira
2016-12-15
Suicidal ideation (SI) and attempts are increased in Huntington's disease (HD), making risk factor assessment a priority. To determine whether, hopelessness, irritability, aggression, anxiety, CAG expansion status, depression, and motor signs/symptoms were associated with Suicidal Ideation (SI) in those at risk for HD. Behavioral and neurological data were collected from subjects in an observational study. Subject characteristics were calculated by CAG status and SI. Logistic regression models were adjusted for demographics. Separate logistic regressions were used to compare SI and non-SI subjects. A combined logistic regression model, including 4 pre-specified predictors, (hopelessness, irritability, aggression, anxiety) was used to assess the relationship of SI to these predictors. 801 subjects were assessed, 40 were classified as having SI, 6.3% of CAG mutation expansion carriers had SI, compared with 4.3% of non- CAG mutation expansion carriers (p = 0.2275). SI subjects had significantly increased depression (p < 0.0001), hopelessness (p < 0.0001), irritability (p < 0.0001), aggression (p = 0.0089), and anxiety (p < 0.0001), and an elevated motor score (p = 0.0098). Impulsivity, assessed in a subgroup of subjects, was also associated with SI (p = 0.0267). Hopelessness and anxiety remained significant in combined model (p < 0.001; p < 0.0198, respectively) even when motor score was included. Behavioral symptoms were significantly higher in those reporting SI. Hopelessness and anxiety showed a particularly strong association with SI. Risk identification could assist in assessment of suicidality in this group.
Inflammation, homocysteine and carotid intima-media thickness.
Baptista, Alexandre P; Cacdocar, Sanjiva; Palmeiro, Hugo; Faísca, Marília; Carrasqueira, Herménio; Morgado, Elsa; Sampaio, Sandra; Cabrita, Ana; Silva, Ana Paula; Bernardo, Idalécio; Gome, Veloso; Neves, Pedro L
2008-01-01
Cardiovascular disease is the main cause of morbidity and mortality in chronic renal patients. Carotid intima-media thickness (CIMT) is one of the most accurate markers of atherosclerosis risk. In this study, the authors set out to evaluate a population of chronic renal patients to determine which factors are associated with an increase in intima-media thickness. We included 56 patients (F=22, M=34), with a mean age of 68.6 years, and an estimated glomerular filtration rate of 15.8 ml/min (calculated by the MDRD equation). Various laboratory and inflammatory parameters (hsCRP, IL-6 and TNF-alpha) were evaluated. All subjects underwent measurement of internal carotid artery intima-media thickness by high-resolution real-time B-mode ultrasonography using a 10 MHz linear transducer. Intima-media thickness was used as a dependent variable in a simple linear regression model, with the various laboratory parameters as independent variables. Only parameters showing a significant correlation with CIMT were evaluated in a multiple regression model: age (p=0.001), hemoglobin (p=00.3), logCRP (p=0.042), logIL-6 (p=0.004) and homocysteine (p=0.002). In the multiple regression model we found that age (p=0.001) and homocysteine (p=0.027) were independently correlated with CIMT. LogIL-6 did not reach statistical significance (p=0.057), probably due to the small population size. The authors conclude that age and homocysteine correlate with carotid intima-media thickness, and thus can be considered as markers/risk factors in chronic renal patients.
Hu, Hai; Zhou, Yangyang; Ren, Shujuan; Wu, Jiajin; Zhu, Meiying; Chen, Donghui; Yang, Haiyan; Wang, Liwei
2015-01-01
Background Colorectal cancer (CRC) is a major cause of cancer morbidity and mortality. In previous epidemiologic studies, the respective correlation between lifestyle factors and comorbidity and CRC has been extensively studied. However, little is known about their joint effects on CRC. Methods We conducted a retrospective case-control study of 1,144 diagnosed CRC patients and 60,549 community controls. A structured questionnaire was administered to the participants about their socio-demographic factors, anthropometric measures, comorbidity history and lifestyle factors. Logistic regression model was used to calculate the odds ratio (ORs) and 95% confidence intervals (95%CIs) for each factor. According to the results from logistic regression model, we further developed healthy lifestyle index (HLI) and comorbidity history index (CHI) to investigate their independent and joint effects on CRC risk. Results Four lifestyle factors (including physical activities, sleep, red meat and vegetable consumption) and four types of comorbidity (including diabetes, hyperlipidemia, history of inflammatory bowel disease and polyps) were found to be independently associated with the risk of CRC in multivariant logistic regression model. Intriguingly, their combined pattern- HLI and CHI demonstrated significant correlation with CRC risk independently (ORHLI: 3.91, 95%CI: 3.13–4.88; ORCHI: 2.49, 95%CI: 2.11–2.93) and jointly (OR: 10.33, 95%CI: 6.59–16.18). Conclusions There are synergistic effects of lifestyle factors and comorbidity on the risk of colorectal cancer in the Chinese population. PMID:26710070
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Deeb, Omar; Shaik, Basheerulla; Agrawal, Vijay K
2014-10-01
Quantitative Structure-Activity Relationship (QSAR) models for binding affinity constants (log Ki) of 78 flavonoid ligands towards the benzodiazepine site of GABA (A) receptor complex were calculated using the machine learning methods: artificial neural network (ANN) and support vector machine (SVM) techniques. The models obtained were compared with those obtained using multiple linear regression (MLR) analysis. The descriptor selection and model building were performed with 10-fold cross-validation using the training data set. The SVM and MLR coefficient of determination values are 0.944 and 0.879, respectively, for the training set and are higher than those of ANN models. Though the SVM model shows improvement of training set fitting, the ANN model was superior to SVM and MLR in predicting the test set. Randomization test is employed to check the suitability of the models.
Rabideau, Dustin J; Pei, Pamela P; Walensky, Rochelle P; Zheng, Amy; Parker, Robert A
2018-02-01
The expected value of sample information (EVSI) can help prioritize research but its application is hampered by computational infeasibility, especially for complex models. We investigated an approach by Strong and colleagues to estimate EVSI by applying generalized additive models (GAM) to results generated from a probabilistic sensitivity analysis (PSA). For 3 potential HIV prevention and treatment strategies, we estimated life expectancy and lifetime costs using the Cost-effectiveness of Preventing AIDS Complications (CEPAC) model, a complex patient-level microsimulation model of HIV progression. We fitted a GAM-a flexible regression model that estimates the functional form as part of the model fitting process-to the incremental net monetary benefits obtained from the CEPAC PSA. For each case study, we calculated the expected value of partial perfect information (EVPPI) using both the conventional nested Monte Carlo approach and the GAM approach. EVSI was calculated using the GAM approach. For all 3 case studies, the GAM approach consistently gave similar estimates of EVPPI compared with the conventional approach. The EVSI behaved as expected: it increased and converged to EVPPI for larger sample sizes. For each case study, generating the PSA results for the GAM approach required 3 to 4 days on a shared cluster, after which EVPPI and EVSI across a range of sample sizes were evaluated in minutes. The conventional approach required approximately 5 weeks for the EVPPI calculation alone. Estimating EVSI using the GAM approach with results from a PSA dramatically reduced the time required to conduct a computationally intense project, which would otherwise have been impractical. Using the GAM approach, we can efficiently provide policy makers with EVSI estimates, even for complex patient-level microsimulation models.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brink, Carsten, E-mail: carsten.brink@rsyd.dk; Laboratory of Radiation Physics, Odense University Hospital; Bernchou, Uffe
2014-07-15
Purpose: Large interindividual variations in volume regression of non-small cell lung cancer (NSCLC) are observable on standard cone beam computed tomography (CBCT) during fractionated radiation therapy. Here, a method for automated assessment of tumor volume regression is presented and its potential use in response adapted personalized radiation therapy is evaluated empirically. Methods and Materials: Automated deformable registration with calculation of the Jacobian determinant was applied to serial CBCT scans in a series of 99 patients with NSCLC. Tumor volume at the end of treatment was estimated on the basis of the first one third and two thirds of the scans.more » The concordance between estimated and actual relative volume at the end of radiation therapy was quantified by Pearson's correlation coefficient. On the basis of the estimated relative volume, the patients were stratified into 2 groups having volume regressions below or above the population median value. Kaplan-Meier plots of locoregional disease-free rate and overall survival in the 2 groups were used to evaluate the predictive value of tumor regression during treatment. Cox proportional hazards model was used to adjust for other clinical characteristics. Results: Automatic measurement of the tumor regression from standard CBCT images was feasible. Pearson's correlation coefficient between manual and automatic measurement was 0.86 in a sample of 9 patients. Most patients experienced tumor volume regression, and this could be quantified early into the treatment course. Interestingly, patients with pronounced volume regression had worse locoregional tumor control and overall survival. This was significant on patient with non-adenocarcinoma histology. Conclusions: Evaluation of routinely acquired CBCT images during radiation therapy provides biological information on the specific tumor. This could potentially form the basis for personalized response adaptive therapy.« less
Solving large mixed linear models using preconditioned conjugate gradient iteration.
Strandén, I; Lidauer, M
1999-12-01
Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were recorded to three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20 and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
Sabel, Michael S; Rice, John D; Griffith, Kent A; Lowe, Lori; Wong, Sandra L; Chang, Alfred E; Johnson, Timothy M; Taylor, Jeremy M G
2012-01-01
To identify melanoma patients at sufficiently low risk of nodal metastases who could avoid sentinel lymph node biopsy (SLNB), several statistical models have been proposed based upon patient/tumor characteristics, including logistic regression, classification trees, random forests, and support vector machines. We sought to validate recently published models meant to predict sentinel node status. We queried our comprehensive, prospectively collected melanoma database for consecutive melanoma patients undergoing SLNB. Prediction values were estimated based upon four published models, calculating the same reported metrics: negative predictive value (NPV), rate of negative predictions (RNP), and false-negative rate (FNR). Logistic regression performed comparably with our data when considering NPV (89.4 versus 93.6%); however, the model's specificity was not high enough to significantly reduce the rate of biopsies (SLN reduction rate of 2.9%). When applied to our data, the classification tree produced NPV and reduction in biopsy rates that were lower (87.7 versus 94.1 and 29.8 versus 14.3, respectively). Two published models could not be applied to our data due to model complexity and the use of proprietary software. Published models meant to reduce the SLNB rate among patients with melanoma either underperformed when applied to our larger dataset, or could not be validated. Differences in selection criteria and histopathologic interpretation likely resulted in underperformance. Statistical predictive models must be developed in a clinically applicable manner to allow for both validation and ultimately clinical utility.
Baqué, Michèle; Amendt, Jens
2013-01-01
Developmental data of juvenile blow flies (Diptera: Calliphoridae) are typically used to calculate the age of immature stages found on or around a corpse and thus to estimate a minimum post-mortem interval (PMI(min)). However, many of those data sets don't take into account that immature blow flies grow in a non-linear fashion. Linear models do not supply a sufficient reliability on age estimates and may even lead to an erroneous determination of the PMI(min). According to the Daubert standard and the need for improvements in forensic science, new statistic tools like smoothing methods and mixed models allow the modelling of non-linear relationships and expand the field of statistical analyses. The present study introduces into the background and application of these statistical techniques by analysing a model which describes the development of the forensically important blow fly Calliphora vicina at different temperatures. The comparison of three statistical methods (linear regression, generalised additive modelling and generalised additive mixed modelling) clearly demonstrates that only the latter provided regression parameters that reflect the data adequately. We focus explicitly on both the exploration of the data--to assure their quality and to show the importance of checking it carefully prior to conducting the statistical tests--and the validation of the resulting models. Hence, we present a common method for evaluating and testing forensic entomological data sets by using for the first time generalised additive mixed models.
Hemmateenejad, Bahram; Yazdani, Mahdieh
2009-02-16
Steroids are widely distributed in nature and are found in plants, animals, and fungi in abundance. A data set consists of a diverse set of steroids have been used to develop quantitative structure-electrochemistry relationship (QSER) models for their half-wave reduction potential. Modeling was established by means of multiple linear regression (MLR) and principle component regression (PCR) analyses. In MLR analysis, the QSPR models were constructed by first grouping descriptors and then stepwise selection of variables from each group (MLR1) and stepwise selection of predictor variables from the pool of all calculated descriptors (MLR2). Similar procedure was used in PCR analysis so that the principal components (or features) were extracted from different group of descriptors (PCR1) and from entire set of descriptors (PCR2). The resulted models were evaluated using cross-validation, chance correlation, application to prediction reduction potential of some test samples and accessing applicability domain. Both MLR approaches represented accurate results however the QSPR model found by MLR1 was statistically more significant. PCR1 approach produced a model as accurate as MLR approaches whereas less accurate results were obtained by PCR2 approach. In overall, the correlation coefficients of cross-validation and prediction of the QSPR models resulted from MLR1, MLR2 and PCR1 approaches were higher than 90%, which show the high ability of the models to predict reduction potential of the studied steroids.
[Using fractional polynomials to estimate the safety threshold of fluoride in drinking water].
Pan, Shenling; An, Wei; Li, Hongyan; Yang, Min
2014-01-01
To study the dose-response relationship between fluoride content in drinking water and prevalence of dental fluorosis on the national scale, then to determine the safety threshold of fluoride in drinking water. Meta-regression analysis was applied to the 2001-2002 national endemic fluorosis survey data of key wards. First, fractional polynomial (FP) was adopted to establish fixed effect model, determining the best FP structure, after that restricted maximum likelihood (REML) was adopted to estimate between-study variance, then the best random effect model was established. The best FP structure was first-order logarithmic transformation. Based on the best random effect model, the benchmark dose (BMD) of fluoride in drinking water and its lower limit (BMDL) was calculated as 0.98 mg/L and 0.78 mg/L. Fluoride in drinking water can only explain 35.8% of the variability of the prevalence, among other influencing factors, ward type was a significant factor, while temperature condition and altitude were not. Fractional polynomial-based meta-regression method is simple, practical and can provide good fitting effect, based on it, the safety threshold of fluoride in drinking water of our country is determined as 0.8 mg/L.
Personalized pseudophakic model
NASA Astrophysics Data System (ADS)
Ribeiro, F.; Castanheira-Dinis, A.; Dias, J. M.
2014-08-01
With the aim of taking into account all optical aberrations, a personalized pseudophakic optical model was designed for refractive evaluation using ray tracing software. Starting with a generic model, all clinically measurable data were replaced by personalized measurements. Data from corneal anterior and posterior surfaces were imported from a grid of elevation data obtained by topography, and a formula for the calculation of the intraocular lens (IOL) position was developed based on the lens equator. For the assessment of refractive error, a merit function minimized by the approximation of the Modulation Transfer Function values to diffraction limit values on the frequencies corresponding up to the discrimination limits of the human eye, weighted depending on the human contrast sensitivity function, was built. The model was tested on the refractive evaluation of 50 pseudophakic eyes. The developed model shows good correlation with subjective evaluation of a pseudophakic population, having the added advantage of being independent of corrective factors, allowing it to be immediately adaptable to new technological developments. In conclusion, this personalized model, which uses individual biometric values, allows for a precise refractive assessment and is a valuable tool for an accurate IOL power calculation, including in conditions to which population averages and the commonly used regression correction factors do not apply, thus achieving the goal of being both personalized and universally applicable.
Barrett, Bruce; Brown, Roger; Mundt, Marlon
2008-02-01
Evaluative health-related quality-of-life instruments used in clinical trials should be able to detect small but important changes in health status. Several approaches to minimal important difference (MID) and responsiveness have been developed. To compare anchor-based and distributional approaches to important difference and responsiveness for the Wisconsin Upper Respiratory Symptom Survey (WURSS), an illness-specific quality of life outcomes instrument. Participants with community-acquired colds self-reported daily using the WURSS-44. Distribution-based methods calculated standardized effect size (ES) and standard error of measurement (SEM). Anchor-based methods compared daily interval changes to global ratings of change, using: (1) standard MID methods based on correspondence to ratings of "a little better" or "somewhat better," and (2) two-level multivariate regression models. About 150 adults were monitored throughout their colds (1,681 sick days.): 88% were white, 69% were women, and 50% had completed college. The mean age was 35.5 years (SD = 14.7). WURSS scores increased 2.2 points from the first to second day, and then dropped by an average of 8.2 points per day from days 2 to 7. The SEM averaged 9.1 during these 7 days. Standard methods yielded a between day MID of 22 points. Regression models of MID projected 11.3-point daily changes. Dividing these estimates of small-but-important-difference by pooled SDs yielded coefficients of .425 for standard MID, .218 for regression model, .177 for SEM, and .157 for ES. These imply per-group sample sizes of 870 using ES, 616 for SEM, 302 for regression model, and 89 for standard MID, assuming alpha = .05, beta = .20 (80% power), and two-tailed testing. Distribution and anchor-based approaches provide somewhat different estimates of small but important difference, which in turn can have substantial impact on trial design.
Monte Carlo calculation of the maximum therapeutic gain of tumor antivascular alpha therapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, Chen-Yu; Oborn, Bradley M.; Guatelli, Susanna
Purpose: Metastatic melanoma lesions experienced marked regression after systemic targeted alpha therapy in a phase 1 clinical trial. This unexpected response was ascribed to tumor antivascular alpha therapy (TAVAT), in which effective tumor regression is achieved by killing endothelial cells (ECs) in tumor capillaries and, thus, depriving cancer cells of nutrition and oxygen. The purpose of this paper is to quantitatively analyze the therapeutic efficacy and safety of TAVAT by building up the testing Monte Carlo microdosimetric models. Methods: Geant4 was adapted to simulate the spatial nonuniform distribution of the alpha emitter {sup 213}Bi. The intraluminal model was designed tomore » simulate the background dose to normal tissue capillary ECs from the nontargeted activity in the blood. The perivascular model calculates the EC dose from the activity bound to the perivascular cancer cells. The key parameters are the probability of an alpha particle traversing an EC nucleus, the energy deposition, the lineal energy transfer, and the specific energy. These results were then applied to interpret the clinical trial. Cell survival rate and therapeutic gain were determined. Results: The specific energy for an alpha particle hitting an EC nucleus in the intraluminal and perivascular models is 0.35 and 0.37 Gy, respectively. As the average probability of traversal in these models is 2.7% and 1.1%, the mean specific energy per decay drops to 1.0 cGy and 0.4 cGy, which demonstrates that the source distribution has a significant impact on the dose. Using the melanoma clinical trial activity of 25 mCi, the dose to tumor EC nucleus is found to be 3.2 Gy and to a normal capillary EC nucleus to be 1.8 cGy. These data give a maximum therapeutic gain of about 180 and validate the TAVAT concept. Conclusions: TAVAT can deliver a cytotoxic dose to tumor capillaries without being toxic to normal tissue capillaries.« less
Otsuka, Eri; Abe, Hiroyuki; Aburada, Masaki; Otsuka, Makoto
2010-07-01
A suppository dosage form has a rapid effect on therapeutics, because it dissolves in the rectum, is absorbed in the bloodstream, and passes the hepatic metabolism. However, the dosage form is unstable, because a suppository is made in a semisolid form, and so it is not easy to mix the bulk drug powder in the base. This article describes a nondestructive method of determining the drug content of suppositories using near-infrared spectrometry (NIR) combined with chemometrics. Suppositories (aspirin content: 1.8, 2.7, 4.5, 7.3, and 9.1%, w/w) were produced by mixing an aspirin bulk powder with hard fat at 50 degrees C and pouring the melt mixture into a plastic mold (2.25 mL). NIR spectra of 12 calibration and 12 validation sample sets were recorded 5 times. A total of 60 spectral data were used as a calibration set to establish a calibration model to predict drug content with a partial least-squares (PLS) regression analysis. NIR data of the suppository samples were divided into two wave number ranges, 4000-12500 cm(-1) (LR), and 5900-6300 cm(-1) (SR). Calibration models for the aspirin content of the suppositories were calculated based on LR and SR ranges of second-derivative NIR spectra using PLS. The models for LR and SR consisted of five and one principal components (PC), respectively. The plots of predicted values against actual values gave a straight line with regression coefficient constants of 0.9531 and 0.9749, respectively. The mean bias and mean accuracy of the calibration models were calculated based on the SR of variation data sets, and were lower than those of LR, respectively. Limiting the wave number of spectral data sets is useful to help understand the calibration model because of noise cancellation and to measure objective functions.
Forsberg, Flemming; Ro, Raymond J.; Fox, Traci B; Liu, Ji-Bin; Chiou, See-Ying; Potoczek, Magdalena; Goldberg, Barry B
2010-01-01
The purpose of this study was to prospectively compare noninvasive, quantitative measures of vascularity obtained from 4 contrast enhanced ultrasound (US) techniques to 4 invasive immunohistochemical markers of tumor angiogenesis in a large group of murine xenografts. Glioma (C6) or breast cancer (NMU) cells were implanted in 144 rats. The contrast agent Optison (GE Healthcare, Princeton, NJ) was injected in a tail vein (dose: 0.4ml/kg). Power Doppler imaging (PDI), pulse-subtraction harmonic imaging (PSHI), flash-echo imaging (FEI), and Microflow imaging (MFI; a technique creating maximum intensity projection images over time) was performed with an Aplio scanner (Toshiba America Medical Systems, Tustin, CA) and a 7.5 MHz linear array. Fractional tumor neovascularity was calculated from digital clips of contrast US, while the relative area stained was calculated from specimens. Results were compared using a factorial, repeated measures ANOVA, linear regression and z-tests. The tortuous morphology of tumor neovessels was visualized better with MFI than with the other US modes. Cell line, implantation method and contrast US imaging technique were significant parameters in the ANOVA model (p<0.05). The strongest correlation determined by linear regression in the C6 model was between PSHI and percent area stained with CD31 (r=0.37, p<0.0001). In the NMU model the strongest correlation was between FEI and COX-2 (r=0.46, p<0.0001). There were no statistically significant differences between correlations obtained with the various US methods (p>0.05). In conclusion, the largest study of contrast US of murine xenografts to date has been conducted and quantitative contrast enhanced US measures of tumor neovascularity in glioma and breast cancer xenograft models appear to provide a noninvasive marker for angiogenesis; although the best method for monitoring angiogenesis was not conclusively established. PMID:21144542
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Verly, Eliseu; Steluti, Josiane; Fisberg, Regina Mara; Marchioni, Dirce Maria Lobo
2014-01-01
A reduction in homocysteine concentration due to the use of supplemental folic acid is well recognized, although evidence of the same effect for natural folate sources, such as fruits and vegetables (FV), is lacking. The traditional statistical analysis approaches do not provide further information. As an alternative, quantile regression allows for the exploration of the effects of covariates through percentiles of the conditional distribution of the dependent variable. To investigate how the associations of FV intake with plasma total homocysteine (tHcy) differ through percentiles in the distribution using quantile regression. A cross-sectional population-based survey was conducted among 499 residents of Sao Paulo City, Brazil. The participants provided food intake and fasting blood samples. Fruit and vegetable intake was predicted by adjusting for day-to-day variation using a proper measurement error model. We performed a quantile regression to verify the association between tHcy and the predicted FV intake. The predicted values of tHcy for each percentile model were calculated considering an increase of 200 g in the FV intake for each percentile. The results showed that tHcy was inversely associated with FV intake when assessed by linear regression whereas, the association was different when using quantile regression. The relationship with FV consumption was inverse and significant for almost all percentiles of tHcy. The coefficients increased as the percentile of tHcy increased. A simulated increase of 200 g in the FV intake could decrease the tHcy levels in the overall percentiles, but the higher percentiles of tHcy benefited more. This study confirms that the effect of FV intake on lowering the tHcy levels is dependent on the level of tHcy using an innovative statistical approach. From a public health point of view, encouraging people to increase FV intake would benefit people with high levels of tHcy.
Yue, Xu; Mickley, Loretta J.; Logan, Jennifer A.; Kaplan, Jed O.
2013-01-01
We estimate future wildfire activity over the western United States during the mid-21st century (2046–2065), based on results from 15 climate models following the A1B scenario. We develop fire prediction models by regressing meteorological variables from the current and previous years together with fire indexes onto observed regional area burned. The regressions explain 0.25–0.60 of the variance in observed annual area burned during 1980–2004, depending on the ecoregion. We also parameterize daily area burned with temperature, precipitation, and relative humidity. This approach explains ~0.5 of the variance in observed area burned over forest ecoregions but shows no predictive capability in the semi-arid regions of Nevada and California. By applying the meteorological fields from 15 climate models to our fire prediction models, we quantify the robustness of our wildfire projections at mid-century. We calculate increases of 24–124% in area burned using regressions and 63–169% with the parameterization. Our projections are most robust in the southwestern desert, where all GCMs predict significant (p<0.05) meteorological changes. For forested ecoregions, more GCMs predict significant increases in future area burned with the parameterization than with the regressions, because the latter approach is sensitive to hydrological variables that show large inter-model variability in the climate projections. The parameterization predicts that the fire season lengthens by 23 days in the warmer and drier climate at mid-century. Using a chemical transport model, we find that wildfire emissions will increase summertime surface organic carbon aerosol over the western United States by 46–70% and black carbon by 20–27% at midcentury, relative to the present day. The pollution is most enhanced during extreme episodes: above the 84th percentile of concentrations, OC increases by ~90% and BC by ~50%, while visibility decreases from 130 km to 100 km in 32 Federal Class 1 areas in Rocky Mountains Forest. PMID:24015109
NASA Astrophysics Data System (ADS)
Hartman, Joshua D.; Monaco, Stephen; Schatschneider, Bohdan; Beran, Gregory J. O.
2015-09-01
We assess the quality of fragment-based ab initio isotropic 13C chemical shift predictions for a collection of 25 molecular crystals with eight different density functionals. We explore the relative performance of cluster, two-body fragment, combined cluster/fragment, and the planewave gauge-including projector augmented wave (GIPAW) models relative to experiment. When electrostatic embedding is employed to capture many-body polarization effects, the simple and computationally inexpensive two-body fragment model predicts both isotropic 13C chemical shifts and the chemical shielding tensors as well as both cluster models and the GIPAW approach. Unlike the GIPAW approach, hybrid density functionals can be used readily in a fragment model, and all four hybrid functionals tested here (PBE0, B3LYP, B3PW91, and B97-2) predict chemical shifts in noticeably better agreement with experiment than the four generalized gradient approximation (GGA) functionals considered (PBE, OPBE, BLYP, and BP86). A set of recommended linear regression parameters for mapping between calculated chemical shieldings and observed chemical shifts are provided based on these benchmark calculations. Statistical cross-validation procedures are used to demonstrate the robustness of these fits.
Hartman, Joshua D; Monaco, Stephen; Schatschneider, Bohdan; Beran, Gregory J O
2015-09-14
We assess the quality of fragment-based ab initio isotropic (13)C chemical shift predictions for a collection of 25 molecular crystals with eight different density functionals. We explore the relative performance of cluster, two-body fragment, combined cluster/fragment, and the planewave gauge-including projector augmented wave (GIPAW) models relative to experiment. When electrostatic embedding is employed to capture many-body polarization effects, the simple and computationally inexpensive two-body fragment model predicts both isotropic (13)C chemical shifts and the chemical shielding tensors as well as both cluster models and the GIPAW approach. Unlike the GIPAW approach, hybrid density functionals can be used readily in a fragment model, and all four hybrid functionals tested here (PBE0, B3LYP, B3PW91, and B97-2) predict chemical shifts in noticeably better agreement with experiment than the four generalized gradient approximation (GGA) functionals considered (PBE, OPBE, BLYP, and BP86). A set of recommended linear regression parameters for mapping between calculated chemical shieldings and observed chemical shifts are provided based on these benchmark calculations. Statistical cross-validation procedures are used to demonstrate the robustness of these fits.
MDR-TB patients in KwaZulu-Natal, South Africa: Cost-effectiveness of 5 models of care
Wallengren, Kristina; Reddy, Tarylee; Besada, Donela; Brust, James C. M.; Voce, Anna; Desai, Harsha; Ngozo, Jacqueline; Radebe, Zanele; Master, Iqbal; Padayatchi, Nesri; Daviaud, Emmanuelle
2018-01-01
Background South Africa has a high burden of MDR-TB, and to provide accessible treatment the government has introduced different models of care. We report the most cost-effective model after comparing cost per patient successfully treated across 5 models of care: centralized hospital, district hospitals (2), and community-based care through clinics or mobile injection teams. Methods In an observational study five cohorts were followed prospectively. The cost analysis adopted a provider perspective and economic cost per patient successfully treated was calculated based on country protocols and length of treatment per patient per model of care. Logistic regression was used to calculate propensity score weights, to compare pairs of treatment groups, whilst adjusting for baseline imbalances between groups. Propensity score weighted costs and treatment success rates were used in the ICER analysis. Sensitivity analysis focused on varying treatment success and length of hospitalization within each model. Results In 1,038 MDR-TB patients 75% were HIV-infected and 56% were successfully treated. The cost per successfully treated patient was 3 to 4.5 times lower in the community-based models with no hospitalization. Overall, the Mobile model was the most cost-effective. Conclusion Reducing the length of hospitalization and following community-based models of care improves the affordability of MDR-TB treatment without compromising its effectiveness. PMID:29668748
The analysis of initial Juno magnetometer data using a sparse magnetic field representation
NASA Astrophysics Data System (ADS)
Moore, Kimberly M.; Bloxham, Jeremy; Connerney, John E. P.; Jørgensen, John L.; Merayo, José M. G.
2017-05-01
The Juno spacecraft, now in polar orbit about Jupiter, passes much closer to Jupiter's surface than any previous spacecraft, presenting a unique opportunity to study the largest and most accessible planetary dynamo in the solar system. Here we present an analysis of magnetometer observations from Juno's first perijove pass (PJ1; to within 1.06 RJ of Jupiter's center). We calculate the residuals between the vector magnetic field observations and that calculated using the VIP4 spherical harmonic model and fit these residuals using an elastic net regression. The resulting model demonstrates how effective Juno's near-surface observations are in improving the spatial resolution of the magnetic field within the immediate vicinity of the orbit track. We identify two features resulting from our analyses: the presence of strong, oppositely signed pairs of flux patches near the equator and weak, possibly reversed-polarity patches of magnetic field over the polar regions. Additional orbits will be required to assess how robust these intriguing features are.
Simulating sunflower canopy temperatures to infer root-zone soil water potential
NASA Technical Reports Server (NTRS)
Choudhury, B. J.; Idso, S. B.
1983-01-01
A soil-plant-atmosphere model for sunflower (Helianthus annuus L.), together with clear sky weather data for several days, is used to study the relationship between canopy temperature and root-zone soil water potential. Considering the empirical dependence of stomatal resistance on insolation, air temperature and leaf water potential, a continuity equation for water flux in the soil-plant-atmosphere system is solved for the leaf water potential. The transpirational flux is calculated using Monteith's combination equation, while the canopy temperature is calculated from the energy balance equation. The simulation shows that, at high soil water potentials, canopy temperature is determined primarily by air and dew point temperatures. These results agree with an empirically derived linear regression equation relating canopy-air temperature differential to air vapor pressure deficit. The model predictions of leaf water potential are also in agreement with observations, indicating that measurements of canopy temperature together with a knowledge of air and dew point temperatures can provide a reliable estimate of the root-zone soil water potential.
Braun, Sabine; Schindler, Christian; Leuzinger, Sebastian
2010-09-01
For a quantitative estimate of the ozone effect on vegetation reliable models for ozone uptake through the stomata are needed. Because of the analogy of ozone uptake and transpiration it is possible to utilize measurements of water loss such as sap flow for quantification of ozone uptake. This technique was applied in three beech (Fagus sylvatica) stands in Switzerland. A canopy conductance was calculated from sap flow velocity and normalized to values between 0 and 1. It represents mainly stomatal conductance as the boundary layer resistance in forests is usually small. Based on this relative conductance, stomatal functions to describe the dependence on light, temperature, vapour pressure deficit and soil moisture were derived using multivariate nonlinear regression. These functions were validated by comparison with conductance values directly estimated from sap flow. The results corroborate the current flux parameterization for beech used in the DO3SE model. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
Non-contact measurement of pulse wave velocity using RGB cameras
NASA Astrophysics Data System (ADS)
Nakano, Kazuya; Aoki, Yuta; Satoh, Ryota; Hoshi, Akira; Suzuki, Hiroyuki; Nishidate, Izumi
2016-03-01
Non-contact measurement of pulse wave velocity (PWV) using red, green, and blue (RGB) digital color images is proposed. Generally, PWV is used as the index of arteriosclerosis. In our method, changes in blood volume are calculated based on changes in the color information, and is estimated by combining multiple regression analysis (MRA) with a Monte Carlo simulation (MCS) model of the transit of light in human skin. After two pulse waves of human skins were measured using RGB cameras, and the PWV was calculated from the difference of the pulse transit time and the distance between two measurement points. The measured forehead-finger PWV (ffPWV) was on the order of m/s and became faster as the values of vital signs raised. These results demonstrated the feasibility of this method.
Estimating carbon and showing impacts of drought using satellite data in regression-tree models
Boyte, Stephen; Wylie, Bruce K.; Howard, Danny; Dahal, Devendra; Gilmanov, Tagir G.
2018-01-01
Integrating spatially explicit biogeophysical and remotely sensed data into regression-tree models enables the spatial extrapolation of training data over large geographic spaces, allowing a better understanding of broad-scale ecosystem processes. The current study presents annual gross primary production (GPP) and annual ecosystem respiration (RE) for 2000–2013 in several short-statured vegetation types using carbon flux data from towers that are located strategically across the conterminous United States (CONUS). We calculate carbon fluxes (annual net ecosystem production [NEP]) for each year in our study period, which includes 2012 when drought and higher-than-normal temperatures influence vegetation productivity in large parts of the study area. We present and analyse carbon flux dynamics in the CONUS to better understand how drought affects GPP, RE, and NEP. Model accuracy metrics show strong correlation coefficients (r) (r ≥ 94%) between training and estimated data for both GPP and RE. Overall, average annual GPP, RE, and NEP are relatively constant throughout the study period except during 2012 when almost 60% less carbon is sequestered than normal. These results allow us to conclude that this modelling method effectively estimates carbon dynamics through time and allows the exploration of impacts of meteorological anomalies and vegetation types on carbon dynamics.
NASA Astrophysics Data System (ADS)
Soja, G.; Soja, A.-M.
This study tested the usefulness of extremely simple meteorological models for the prediction of ozone indices. The models were developed with the input parameters of daily maximum temperature and sunshine duration and are based on a data collection period of three years. For a rural environment in eastern Austria, the meteorological and ozone data of three summer periods have been used to develop functions to describe three ozone exposure indices (daily maximum, 7 h mean 9.00-16.00 h, accumulated ozone dose AOT40). Data sets for other years or stations not included in the development of the models were used as test data to validate the performance of the models. Generally, optimized regression models performed better than simplest linear models, especially in the case of AOT40. For the description of the summer period from May to September, the mean absolute daily differences between observed and calculated indices were 8±6 ppb for the maximum half hour mean value, 6±5 ppb for the 7 h mean and 41±40 ppb h for the AOT40. When the parameters were further optimized to describe individual months separately, the mean absolute residuals decreased by ⩽10%. Neural network models did not always perform better than the regression models. This is attributed to the low number of inputs in this comparison and to the simple architecture of these models (2-2-1). Further factorial analyses of those days when the residuals were higher than the mean plus one standard deviation should reveal possible reasons why the models did not perform well on certain days. It was observed that overestimations by the models mainly occurred on days with partly overcast, hazy or very windy conditions. Underestimations more frequently occurred on weekdays than on weekends. It is suggested that the application of this kind of meteorological model will be more successful in topographically homogeneous regions and in rural environments with relatively constant rates of emission and long-range transport of ozone precursors. Under conditions too demanding for advanced physico/chemical models, the presented models may offer useful alternatives to derive ecologically relevant ozone indices directly from meteorological parameters.
Villeneuve, Paul J; Goldberg, Mark S; Krewski, Daniel; Burnett, Richard T; Chen, Yue
2002-11-01
We used Poisson regression methods to examine the relation between temporal changes in the levels of fine particulate air pollution (PM(2.5)) and the risk of mortality among participants of the Harvard Six Cities longitudinal study. Our analyses were based on 1430 deaths that occurred between 1974 and 1991 in a cohort that accumulated 105,714 person-years of follow-up. For each city, indices of PM(2.5) were derived using daily samples. Individual level data were collected on several risk factors including: smoking, education, body mass index (BMI), and occupational exposure to dusts. Time-dependent indices of PM(2.5) were created across 13 calendar periods (< 1979, 1979, 1980, em leader, 1989, >/= 1990) to explore whether recent or chronic exposures were more important predictors of mortality. The relative risk (RR) of mortality calculated using Poisson regression based on average city-specific exposures that remained constant during follow-up was 1.31 [95% confidence interval (CI) = 1.12-1.52] per 18.6 microg/m(3) of PM(2.5). This result was similar to the risk calculated using the Cox model (RR = 1.26, 95% CI = 1.08-1.46). The RR of mortality was attenuated when the Poisson regression model included a time-dependent estimate of exposure (RR = 1.19, 95% CI = 1.04-1.36). There was little variation in RR across time-dependent indices of PM(2.5). The attenuated risk of mortality that was observed with a time-dependent index of PM(2.5) is due to the combined influence of city-specific variations in mortality rates and decreasing levels of air pollution that occurred during follow-up. The RR of mortality associated with PM(2.5) did not depend on when exposure occurred in relation to death, possibly because of little variation between the time-dependent city-specific exposure indices.
Anodic microbial community diversity as a predictor of the power output of microbial fuel cells.
Stratford, James P; Beecroft, Nelli J; Slade, Robert C T; Grüning, André; Avignone-Rossa, Claudio
2014-03-01
The relationship between the diversity of mixed-species microbial consortia and their electrogenic potential in the anodes of microbial fuel cells was examined using different diversity measures as predictors. Identical microbial fuel cells were sampled at multiple time-points. Biofilm and suspension communities were analysed by denaturing gradient gel electrophoresis to calculate the number and relative abundance of species. Shannon and Simpson indices and richness were examined for association with power using bivariate and multiple linear regression, with biofilm DNA as an additional variable. In simple bivariate regressions, the correlation of Shannon diversity of the biofilm and power is stronger (r=0.65, p=0.001) than between power and richness (r=0.39, p=0.076), or between power and the Simpson index (r=0.5, p=0.018). Using Shannon diversity and biofilm DNA as predictors of power, a regression model can be constructed (r=0.73, p<0.001). Ecological parameters such as the Shannon index are predictive of the electrogenic potential of microbial communities. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Valizadeh, Maryam; Sohrabi, Mahmoud Reza
2018-03-01
In the present study, artificial neural networks (ANNs) and support vector regression (SVR) as intelligent methods coupled with UV spectroscopy for simultaneous quantitative determination of Dorzolamide (DOR) and Timolol (TIM) in eye drop. Several synthetic mixtures were analyzed for validating the proposed methods. At first, neural network time series, which one type of network from the artificial neural network was employed and its efficiency was evaluated. Afterwards, the radial basis network was applied as another neural network. Results showed that the performance of this method is suitable for predicting. Finally, support vector regression was proposed to construct the Zilomole prediction model. Also, root mean square error (RMSE) and mean recovery (%) were calculated for SVR method. Moreover, the proposed methods were compared to the high-performance liquid chromatography (HPLC) as a reference method. One way analysis of variance (ANOVA) test at the 95% confidence level applied to the comparison results of suggested and reference methods that there were no significant differences between them. Also, the effect of interferences was investigated in spike solutions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fave, X; Court, L; UT Health Science Center, Graduate School of Biomedical Sciences, Houston, TX
Purpose: To determine how radiomics features change during radiation therapy and whether those changes (delta-radiomics features) can improve prognostic models built with clinical factors. Methods: 62 radiomics features, including histogram, co-occurrence, run-length, gray-tone difference, and shape features, were calculated from pretreatment and weekly intra-treatment CTs for 107 stage III NSCLC patients (5–9 images per patient). Image preprocessing for each feature was determined using the set of pretreatment images: bit-depth resample and/or a smoothing filter were tested for their impact on volume-correlation and significance of each feature in univariate cox regression models to maximize their information content. Next, the optimized featuresmore » were calculated from the intratreatment images and tested in linear mixed-effects models to determine which features changed significantly with dose-fraction. The slopes in these significant features were defined as delta-radiomics features. To test their prognostic potential multivariate cox regression models were fitted, first using only clinical features and then clinical+delta-radiomics features for overall-survival, local-recurrence, and distant-metastases. Leave-one-out cross validation was used for model-fitting and patient predictions. Concordance indices(c-index) and p-values for the log-rank test with patients stratified at the median were calculated. Results: Approximately one-half of the 62 optimized features required no preprocessing, one-fourth required smoothing, and one-fourth required smoothing and resampling. From these, 54 changed significantly during treatment. For overall-survival, the c-index improved from 0.52 for clinical factors alone to 0.62 for clinical+delta-radiomics features. For distant-metastases, the c-index improved from 0.53 to 0.58, while for local-recurrence it did not improve. Patient stratification significantly improved (p-value<0.05) for overallsurvival and distant-metastases when delta-radiomics features were included. The delta-radiomics versions of autocorrelation, kurtosis, and compactness were selected most frequently in leave-one-out iterations. Conclusion: Weekly changes in radiomics features can potentially be used to evaluate treatment response and predict patient outcomes. High-risk patients could be recommended for dose escalation or consolidation chemotherapy. This project was funded in part by grants from the National Cancer Institute (NCI) and the Cancer Prevention Research Institute of Texas (CPRIT).« less
NASA Astrophysics Data System (ADS)
Cox, Courtney E.; Phifer, Jeremy R.; Ferreira da Silva, Larissa; Gonçalves Nogueira, Gabriel; Ley, Ryan T.; O'Loughlin, Elizabeth J.; Pereira Barbosa, Ana Karolyne; Rygelski, Brett T.; Paluch, Andrew S.
2017-02-01
Solubility parameter based methods have long been a valuable tool for solvent formulation and selection. Of these methods, the MOdified Separation of Cohesive Energy Density (MOSCED) has recently been shown to correlate well the equilibrium solubility of multifunctional non-electrolyte solids. However, before it can be applied to a novel solute, a limited amount of reference solubility data is required to regress the necessary MOSCED parameters. Here we demonstrate for the solutes methylparaben, ethylparaben, propylparaben, butylparaben, lidocaine and ephedrine how conventional molecular simulation free energy calculations or electronic structure calculations in a continuum solvent, here the SMD or SM8 solvation model, can instead be used to generate the necessary reference data, resulting in a predictive flavor of MOSCED. Adopting the melting point temperature and enthalpy of fusion of these compounds from experiment, we are able to predict equilibrium solubilities. We find the method is able to well correlate the (mole fraction) equilibrium solubility in non-aqueous solvents over four orders of magnitude with good quantitative agreement.
Cox, Courtney E; Phifer, Jeremy R; Ferreira da Silva, Larissa; Gonçalves Nogueira, Gabriel; Ley, Ryan T; O'Loughlin, Elizabeth J; Pereira Barbosa, Ana Karolyne; Rygelski, Brett T; Paluch, Andrew S
2017-02-01
Solubility parameter based methods have long been a valuable tool for solvent formulation and selection. Of these methods, the MOdified Separation of Cohesive Energy Density (MOSCED) has recently been shown to correlate well the equilibrium solubility of multifunctional non-electrolyte solids. However, before it can be applied to a novel solute, a limited amount of reference solubility data is required to regress the necessary MOSCED parameters. Here we demonstrate for the solutes methylparaben, ethylparaben, propylparaben, butylparaben, lidocaine and ephedrine how conventional molecular simulation free energy calculations or electronic structure calculations in a continuum solvent, here the SMD or SM8 solvation model, can instead be used to generate the necessary reference data, resulting in a predictive flavor of MOSCED. Adopting the melting point temperature and enthalpy of fusion of these compounds from experiment, we are able to predict equilibrium solubilities. We find the method is able to well correlate the (mole fraction) equilibrium solubility in non-aqueous solvents over four orders of magnitude with good quantitative agreement.
Sabel, Michael S.; Rice, John D.; Griffith, Kent A.; Lowe, Lori; Wong, Sandra L.; Chang, Alfred E.; Johnson, Timothy M.; Taylor, Jeremy M.G.
2013-01-01
Introduction To identify melanoma patients at sufficiently low risk of nodal metastases who could avoid SLN biopsy (SLNB). Several statistical models have been proposed based upon patient/tumor characteristics, including logistic regression, classification trees, random forests and support vector machines. We sought to validate recently published models meant to predict sentinel node status. Methods We queried our comprehensive, prospectively-collected melanoma database for consecutive melanoma patients undergoing SLNB. Prediction values were estimated based upon 4 published models, calculating the same reported metrics: negative predictive value (NPV), rate of negative predictions (RNP), and false negative rate (FNR). Results Logistic regression performed comparably with our data when considering NPV (89.4% vs. 93.6%); however the model’s specificity was not high enough to significantly reduce the rate of biopsies (SLN reduction rate of 2.9%). When applied to our data, the classification tree produced NPV and reduction in biopsies rates that were lower 87.7% vs. 94.1% and 29.8% vs. 14.3%, respectively. Two published models could not be applied to our data due to model complexity and the use of proprietary software. Conclusions Published models meant to reduce the SLNB rate among patients with melanoma either underperformed when applied to our larger dataset, or could not be validated. Differences in selection criteria and histopathologic interpretation likely resulted in underperformance. Development of statistical predictive models must be created in a clinically applicable manner to allow for both validation and ultimately clinical utility. PMID:21822550
Bioinactivation: Software for modelling dynamic microbial inactivation.
Garre, Alberto; Fernández, Pablo S; Lindqvist, Roland; Egea, Jose A
2017-03-01
This contribution presents the bioinactivation software, which implements functions for the modelling of isothermal and non-isothermal microbial inactivation. This software offers features such as user-friendliness, modelling of dynamic conditions, possibility to choose the fitting algorithm and generation of prediction intervals. The software is offered in two different formats: Bioinactivation core and Bioinactivation SE. Bioinactivation core is a package for the R programming language, which includes features for the generation of predictions and for the fitting of models to inactivation experiments using non-linear regression or a Markov Chain Monte Carlo algorithm (MCMC). The calculations are based on inactivation models common in academia and industry (Bigelow, Peleg, Mafart and Geeraerd). Bioinactivation SE supplies a user-friendly interface to selected functions of Bioinactivation core, namely the model fitting of non-isothermal experiments and the generation of prediction intervals. The capabilities of bioinactivation are presented in this paper through a case study, modelling the non-isothermal inactivation of Bacillus sporothermodurans. This study has provided a full characterization of the response of the bacteria to dynamic temperature conditions, including confidence intervals for the model parameters and a prediction interval of the survivor curve. We conclude that the MCMC algorithm produces a better characterization of the biological uncertainty and variability than non-linear regression. The bioinactivation software can be relevant to the food and pharmaceutical industry, as well as to regulatory agencies, as part of a (quantitative) microbial risk assessment. Copyright © 2017 Elsevier Ltd. All rights reserved.
Noizet, Odile; Leclerc, Francis; Sadik, Ahmed; Grandbastien, Bruno; Riou, Yvon; Dorkenoo, Aimée; Fourier, Catherine; Cremer, Robin; Leteurtre, Stephane
2005-01-01
Introduction We conducted the present study to determine whether a combination of the mechanical ventilation weaning predictors proposed by the collective Task Force of the American College of Chest Physicians (TF) and weaning endurance indices enhance prediction of weaning success. Method Conducted in a tertiary paediatric intensive care unit at a university hospital, this prospective study included 54 children receiving mechanical ventilation (≥6 hours) who underwent 57 episodes of weaning. We calculated the indices proposed by the TF (spontaneous respiratory rate, paediatric rapid shallow breathing, rapid shallow breathing occlusion pressure [ROP] and maximal inspiratory pressure during an occlusion test [Pimax]) and weaning endurance indices (pressure-time index, tension-time index obtained from P0.1 [TTI1] and from airway pressure [TTI2]) during spontaneous breathing. Performances of each TF index and combinations of them were calculated, and the best single index and combination were identified. Weaning endurance parameters (TTI1 and TTI2) were calculated and the best index was determined using a logistic regression model. Regression coefficients were estimated using the maximum likelihood ratio (LR) method. Hosmer–Lemeshow test was used to estimate goodness-of-fit of the model. An equation was constructed to predict weaning success. Finally, we calculated the performances of combinations of best TF indices and best endurance index. Results The best single TF index was ROP, the best TF combination was represented by the expression (0.66 × ROP) + (0.34 × Pimax), and the best endurance index was the TTI2, although their performance was poor. The best model resulting from the combination of these indices was defined by the following expression: (0.6 × ROP) – (0.1 × Pimax) + (0.5 × TTI2). This integrated index was a good weaning predictor (P < 0.01), with a LR+ of 6.4 and LR+/LR- ratio of 12.5. However, at a threshold value <1.3 it was only predictive of weaning success (LR- = 0.5). Conclusion The proposed combined index, incorporating endurance, was of modest value in predicting weaning outcome. This is the first report of the value of endurance parameters in predicting weaning success in children. Currently, clinical judgement associated with spontaneous breathing trials apparently remain superior. PMID:16356229
A Group Increment Scheme for Infrared Absorption Intensities of Greenhouse Gases
NASA Technical Reports Server (NTRS)
Kokkila, Sara I.; Bera, Partha P.; Francisco, Joseph S.; Lee, Timothy J.
2012-01-01
A molecule's absorption in the atmospheric infrared (IR) window (IRW) is an indicator of its efficiency as a greenhouse gas. A model for estimating the absorption of a fluorinated molecule within the IRW was developed to assess its radiative impact. This model will be useful in comparing different hydrofluorocarbons and hydrofluoroethers contribution to global warming. The absorption of radiation by greenhouse gases, in particular hydrofluoroethers and hydrofluorocarbons, was investigated using ab initio quantum mechanical methods. Least squares regression techniques were used to create a model based on this data. The placement and number of fluorines in the molecule were found to affect the absorption in the IR window and were incorporated into the model. Several group increment models are discussed. An additive model based on one-carbon groups is found to work satisfactorily in predicting the ab initio calculated vibrational intensities.
Development of Great Lakes algorithms for the Nimbus-G coastal zone color scanner
NASA Technical Reports Server (NTRS)
Tanis, F. J.; Lyzenga, D. R.
1981-01-01
A series of experiments in the Great Lakes designed to evaluate the application of the Nimbus G satellite Coastal Zone Color Scanner (CZCS) were conducted. Absorption and scattering measurement data were reduced to obtain a preliminary optical model for the Great Lakes. Available optical models were used in turn to calculate subsurface reflectances for expected concentrations of chlorophyll-a pigment and suspended minerals. Multiple nonlinear regression techniques were used to derive CZCS water quality prediction equations from Great Lakes simulation data. An existing atmospheric model was combined with a water model to provide the necessary simulation data for evaluation of the preliminary CZCS algorithms. A CZCS scanner model was developed which accounts for image distorting scanner and satellite motions. This model was used in turn to generate mapping polynomials that define the transformation from the original image to one configured in a polyconic projection. Four computer programs (FORTRAN IV) for image transformation are presented.
An empirical model for polarized and cross-polarized scattering from a vegetation layer
NASA Technical Reports Server (NTRS)
Liu, H. L.; Fung, A. K.
1988-01-01
An empirical model for scattering from a vegetation layer above an irregular ground surface is developed in terms of the first-order solution for like-polarized scattering and the second-order solution for cross-polarized scattering. The effects of multiple scattering within the layer and at the surface-volume boundary are compensated by using a correction factor based on the matrix doubling method. The major feature of this model is that all parameters in the model are physical parameters of the vegetation medium. There are no regression parameters. Comparisons of this empirical model with theoretical matrix-doubling method and radar measurements indicate good agreements in polarization, angular trends, and k sub a up to 4, where k is the wave number and a is the disk radius. The computational time is shortened by a factor of 8, relative to the theoretical model calculation.