Sample records for "regression analysis produced"

  1. Resting-state functional magnetic resonance imaging: the impact of regression analysis.

    PubMed

    Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi

    2015-01-01

    To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
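
    A minimal sketch of the preprocessing step evaluated here, using synthetic data (this is not the authors' pipeline): the global mean signal is regressed out of every voxel's time course by least squares, the operation that drives the connectivity differences reported above.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_timepoints, n_voxels = 200, 500
    bold = rng.standard_normal((n_timepoints, n_voxels))    # synthetic BOLD series
    global_mean = bold.mean(axis=1, keepdims=True)          # global signal regressor

    # Design matrix: intercept + global mean; fit all voxels at once
    X = np.hstack([np.ones_like(global_mean), global_mean])
    beta, *_ = np.linalg.lstsq(X, bold, rcond=None)
    residuals = bold - X @ beta                             # globally regressed data

    # Connectivity (correlation) between two voxels before and after global
    # regression; the regression shifts values downward and can produce
    # anticorrelations, as the abstract reports.
    before = np.corrcoef(bold[:, 0], bold[:, 1])[0, 1]
    after = np.corrcoef(residuals[:, 0], residuals[:, 1])[0, 1]
    print(f"connectivity before: {before:.3f}, after: {after:.3f}")
    ```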

  2. Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Beckstead, Jason W.

    2012-01-01

    The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…

  3. Assessing landslide susceptibility by statistical data analysis and GIS: the case of Daunia (Apulian Apennines, Italy)

    NASA Astrophysics Data System (ADS)

    Ceppi, C.; Mancini, F.; Ritrovato, G.

    2009-04-01

    This study aims at landslide susceptibility mapping within an area of Daunia (Apulian Apennines, Italy) by a multivariate statistical method and data manipulation in a Geographical Information System (GIS) environment. Among the variety of existing statistical data analysis techniques, logistic regression was chosen to produce a susceptibility map over an area where small settlements are historically threatened by landslide phenomena. In logistic regression, a best fit between the presence or absence of landslides (dependent variable) and the set of independent variables is found on the basis of a maximum likelihood criterion, leading to the estimation of regression coefficients. The reliability of such an analysis therefore rests on its ability to quantify proneness to landslide occurrence through the probability level it produces. The inventory of dependent and independent variables was managed in a GIS, where geometric properties and attributes were translated into raster cells in order to proceed with the logistic regression by means of the SPSS (Statistical Package for the Social Sciences) package. A landslide inventory was used to produce the binary dependent variable, whereas the independent variables comprised slope, aspect, elevation, curvature, drained area, lithology, and land use after their reduction to dummy variables. The effect of independent parameters on landslide occurrence was assessed by the corresponding coefficients in the logistic regression function, highlighting the major role played by the land-use variable in determining the occurrence and distribution of phenomena. Once the outcomes of the logistic regression were determined, the data were re-introduced into the GIS to produce a map reporting proneness to landslide as a predicted level of probability. To validate the results and the regression model, a cell-by-cell comparison between the susceptibility map and the initial inventory of landslide events was performed, achieving agreement at the 75% level.
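
    A hedged sketch of this workflow on invented stand-in data (variable choices and coefficients are illustrative, not the paper's): landslide presence/absence is regressed on terrain predictors with logistic regression, the fitted probabilities play the role of the susceptibility map, and the final line mimics the cell-by-cell validation.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n_cells = 1000
    X = np.column_stack([
        rng.uniform(0, 45, n_cells),         # slope (degrees)
        rng.uniform(100, 900, n_cells),      # elevation (m)
        rng.integers(0, 2, (n_cells, 3)),    # dummy variables (e.g., land-use classes)
    ])
    # Synthetic truth: steeper slopes and one land-use class raise the odds
    logits = 0.1 * X[:, 0] + 1.5 * X[:, 2] - 4.0
    y = rng.random(n_cells) < 1 / (1 + np.exp(-logits))

    model = LogisticRegression(max_iter=1000).fit(X, y)
    susceptibility = model.predict_proba(X)[:, 1]   # probability per raster cell

    # Cell-by-cell agreement with the inventory (analogous to the 75% check)
    print("agreement:", (model.predict(X) == y).mean())
    ```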

  4. Experimental variability and data pre-processing as factors affecting the discrimination power of some chemometric approaches (PCA, CA and a new algorithm based on linear regression) applied to (+/-)ESI/MS and RPLC/UV data: Application on green tea extracts.

    PubMed

    Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A

    2016-08-01

    The influence of experimental variability (instrumental repeatability, instrumental intermediate precision, and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis, PCA, and Cluster Analysis, CA), as well as of a new algorithm based on linear regression, was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed-phase liquid chromatography with UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were applied directly to mass spectra and chromatograms, involving a strictly holistic comparison of shapes, without assigning any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts, and correlation coefficients produced by linear regression applied to pairs of very large experimental data series successfully retain information resulting from high-frequency instrumental acquisition rates, better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept, and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between the compared data. Instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. Mass spectra produced through ionization from the liquid state at atmospheric pressure of bulk complex mixtures resulting from extracted materials of natural origin provided an excellent basis for multivariate analysis, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression produced information equivalent to the results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.

  5. Interpreting Regression Results: beta Weights and Structure Coefficients are Both Important.

    ERIC Educational Resources Information Center

    Thompson, Bruce

    Various realizations have led to less frequent use of the "OVA" methods (analysis of variance--ANOVA--among others) and to more frequent use of general linear model approaches such as regression. However, too few researchers understand all the various coefficients produced in regression. This paper explains these coefficients and their…

  6. Linear regression based on Minimum Covariance Determinant (MCD) and TELBS methods on the productivity of phytoplankton

    NASA Astrophysics Data System (ADS)

    Gusriani, N.; Firdaniza

    2018-03-01

    The existence of outliers in multiple linear regression analysis causes the Gaussian assumption to be unfulfilled. If the least squares method is nevertheless applied to such data, it will produce a model that cannot represent most of the data. What is needed instead is a regression method that is robust against outliers. This paper compares the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contain outliers. Based on the robust coefficient of determination, the MCD method produces a better model than the TELBS method.
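
    The TELBS method is not available in common libraries, but the MCD side of the comparison can be sketched with scikit-learn: robust location and scatter estimates from MinCovDet yield regression coefficients that resist injected outliers, unlike ordinary least squares. Data and coefficients below are invented for illustration.

    ```python
    import numpy as np
    from sklearn.covariance import MinCovDet
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(2)
    n = 100
    x = rng.uniform(0, 10, n)
    y = 2.0 + 0.8 * x + rng.normal(0, 0.5, n)
    y[:5] += 15                                  # inject gross outliers

    # Plug robust MCD covariance/location into the usual regression formulas
    mcd = MinCovDet(random_state=0).fit(np.column_stack([x, y]))
    cov, loc = mcd.covariance_, mcd.location_
    slope_mcd = cov[0, 1] / cov[0, 0]            # beta = S_xy / S_xx (robust)
    intercept_mcd = loc[1] - slope_mcd * loc[0]

    ols = LinearRegression().fit(x.reshape(-1, 1), y)
    print(f"OLS slope: {ols.coef_[0]:.2f}")      # pulled toward the outliers
    print(f"MCD slope: {slope_mcd:.2f}")         # near the true 0.8
    ```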

  7. A systematic review and meta-analysis of the effects of antibiotic consumption on antibiotic resistance

    PubMed Central

    2014-01-01

    Background Greater use of antibiotics during the past 50 years has exerted selective pressure on susceptible bacteria and may have favoured the survival of resistant strains. Existing information on antibiotic resistance patterns from pathogens circulating among community-based patients is substantially less than from hospitalized patients on whom guidelines are often based. We therefore chose to assess the relationship between the antibiotic resistance pattern of bacteria circulating in the community and the consumption of antibiotics in the community. Methods Both gray literature and published scientific literature in English and other European languages were examined. Multiple regression analysis was used to analyse whether studies found a positive relationship between antibiotic consumption and resistance. A subsequent meta-analysis and meta-regression was conducted for studies for which a common effect size measure (odds ratio) could be calculated. Results Electronic searches identified 974 studies but only 243 studies were considered eligible for inclusion by the two independent reviewers who extracted the data. A binomial test revealed a positive relationship between antibiotic consumption and resistance (p < .001) but multiple regression modelling did not produce any significant predictors of study outcome. The meta-analysis generated a significant pooled odds ratio of 2.3 (95% confidence interval 2.2 to 2.5) with a meta-regression producing several significant predictors (F(10,77) = 5.82, p < .01). Countries in southern Europe produced a stronger link between consumption and resistance than other regions. Conclusions Using a large set of studies we found that antibiotic consumption is associated with the development of antibiotic resistance. A subsequent meta-analysis, with a subsample of the studies, generated several significant predictors. Countries in southern Europe produced a stronger link between consumption and resistance than other regions so efforts at reducing antibiotic consumption may need to be strengthened in this area. Increased consumption of antibiotics may not only produce greater resistance at the individual patient level but may also produce greater resistance at the community, country, and regional levels, which can harm individual patients. PMID:24405683

  8. Review and statistical analysis of the use of ultrasonic velocity for estimating the porosity fraction in polycrystalline materials

    NASA Technical Reports Server (NTRS)

    Roth, D. J.; Swickard, S. M.; Stang, D. B.; Deguire, M. R.

    1991-01-01

    A review and statistical analysis of the ultrasonic velocity method for estimating the porosity fraction in polycrystalline materials is presented. Initially, a semiempirical model is developed showing the origin of the linear relationship between ultrasonic velocity and porosity fraction. Then, from a compilation of data produced by many researchers, scatter plots of velocity versus percent porosity data are shown for Al2O3, MgO, porcelain-based ceramics, PZT, SiC, Si3N4, steel, tungsten, UO2,(U0.30Pu0.70)C, and YBa2Cu3O(7-x). Linear regression analysis produces predicted slope, intercept, correlation coefficient, level of significance, and confidence interval statistics for the data. Velocity values predicted from regression analysis of fully-dense materials are in good agreement with those calculated from elastic properties.
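
    The statistical outputs listed above map directly onto a standard linear fit; a small scipy sketch on synthetic velocity-porosity data (values invented) reproduces the slope, intercept, correlation coefficient, significance level, and a confidence interval for the slope. Extrapolating to zero porosity gives the fully-dense velocity.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    porosity = rng.uniform(0.0, 0.3, 30)                         # porosity fraction
    velocity = 10.5 - 12.0 * porosity + rng.normal(0, 0.2, 30)   # km/s, synthetic

    res = stats.linregress(porosity, velocity)
    print(f"slope={res.slope:.2f}, intercept={res.intercept:.2f}")  # intercept ~ fully-dense velocity
    print(f"r={res.rvalue:.3f}, p={res.pvalue:.2e}")

    # 95% confidence interval for the slope from its standard error
    t = stats.t.ppf(0.975, df=porosity.size - 2)
    print(f"slope CI: [{res.slope - t * res.stderr:.2f}, {res.slope + t * res.stderr:.2f}]")
    ```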

  9. Review and statistical analysis of the ultrasonic velocity method for estimating the porosity fraction in polycrystalline materials

    NASA Technical Reports Server (NTRS)

    Roth, D. J.; Swickard, S. M.; Stang, D. B.; Deguire, M. R.

    1990-01-01

    A review and statistical analysis of the ultrasonic velocity method for estimating the porosity fraction in polycrystalline materials is presented. Initially, a semi-empirical model is developed showing the origin of the linear relationship between ultrasonic velocity and porosity fraction. Then, from a compilation of data produced by many researchers, scatter plots of velocity versus percent porosity data are shown for Al2O3, MgO, porcelain-based ceramics, PZT, SiC, Si3N4, steel, tungsten, UO2,(U0.30Pu0.70)C, and YBa2Cu3O(7-x). Linear regression analysis produced predicted slope, intercept, correlation coefficient, level of significance, and confidence interval statistics for the data. Velocity values predicted from regression analysis for fully-dense materials are in good agreement with those calculated from elastic properties.

  10. Advanced microwave soil moisture studies. [Big Sioux River Basin, Iowa]

    NASA Technical Reports Server (NTRS)

    Dalsted, K. J.; Harlan, J. C.

    1983-01-01

    Comparisons of low-level L-band brightness temperature (TB) and thermal infrared (TIR) data, as well as the following data sets: soil map and land cover data, direct soil moisture measurements, and a computer-generated contour map, were statistically evaluated using regression analysis and linear discriminant analysis. Regression analysis of footprint data shows that statistical groupings of ground variables (soil features and land cover) hold promise for qualitative assessment of soil moisture and for reducing variance within the sampling space. Dry conditions appear to be more conducive to producing meaningful statistics than wet conditions. Regression analysis using field-averaged TB and TIR data did not approach the higher R² values obtained using within-field variations. The linear discriminant analysis indicates some capacity to distinguish categories, with the results being somewhat better on a field basis than on a footprint basis.

  11. Incorporation of prior information on parameters into nonlinear regression groundwater flow models: 1. Theory

    USGS Publications Warehouse

    Cooley, Richard L.

    1982-01-01

    Prior information on the parameters of a groundwater flow model can be used to improve parameter estimates obtained from nonlinear regression solution of a modeling problem. Two scales of prior information can be available: (1) prior information having known reliability (that is, bias and random error structure) and (2) prior information consisting of best available estimates of unknown reliability. A regression method that incorporates the second scale of prior information assumes the prior information to be fixed for any particular analysis to produce improved, although biased, parameter estimates. Approximate optimization of two auxiliary parameters of the formulation is used to help minimize the bias, which is almost always much smaller than that resulting from standard ridge regression. It is shown that if both scales of prior information are available, then a combined regression analysis may be made.

  12. Development and evaluation of habitat models for herpetofauna and small mammals

    Treesearch

    William M. Block; Michael L. Morrison; Peter E. Scott

    1998-01-01

    We evaluated the ability of discriminant analysis (DA), logistic regression (LR), and multiple regression (MR) to describe habitat use by amphibians, reptiles, and small mammals found in California oak woodlands. We also compared models derived from pitfall and live trapping data for several species. Habitat relations modeled by DA and LR produced similar results,...

  13. Comparing lagged linear correlation, lagged regression, Granger causality, and vector autoregression for uncovering associations in EHR data.

    PubMed

    Levine, Matthew E; Albers, David J; Hripcsak, George

    2016-01-01

    Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
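
    A toy statsmodels sketch of the contrast drawn above, on a simulated drug-lab pair (the dynamics are assumed for illustration): the univariate model regresses the lab on lagged drug exposure only, while the multivariate model adds an autoregressive lab term, the ingredient the study found produced more robust signals.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 300
    drug = rng.integers(0, 2, n).astype(float)   # daily exposure indicator
    lab = np.zeros(n)
    for t in range(2, n):
        lab[t] = 0.6 * lab[t - 1] + 0.8 * drug[t - 1] + rng.normal(0, 0.3)

    # Univariate lagged regression: lab_t ~ drug_{t-1}
    fit_uni = sm.OLS(lab[2:], sm.add_constant(drug[1:-1])).fit()

    # Multivariate: add the autoregressive term lab_{t-1}
    X = sm.add_constant(np.column_stack([drug[1:-1], lab[1:-1]]))
    fit_multi = sm.OLS(lab[2:], X).fit()

    print("univariate:  ", np.round(fit_uni.params, 3))
    print("multivariate:", np.round(fit_multi.params, 3))
    ```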

  14. Regression analysis using dependent Polya trees.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J

    2013-11-30

    Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

  15. A comparative study on generating simulated Landsat NDVI images using data fusion and regression method-the case of the Korean Peninsula.

    PubMed

    Lee, Mi Hee; Lee, Soo Bong; Eo, Yang Dam; Kim, Sun Woong; Woo, Jung-Hun; Han, Soo Hee

    2017-07-01

    Landsat optical images have enough spatial and spectral resolution to analyze vegetation growth characteristics. However, clouds and water vapor quite often degrade image quality, which limits the availability of usable images for time-series vegetation vitality measurement. To overcome this shortcoming, simulated images are used as an alternative. In this study, the weighted average method, the spatial and temporal adaptive reflectance fusion model (STARFM) method, and multilinear regression analysis were tested to produce simulated Landsat normalized difference vegetation index (NDVI) images of the Korean Peninsula. The test results showed that the weighted average method produced the images most similar to the actual images, provided that images were available within 1 month before and after the target date. The STARFM method gives good results when the input image date is close to the target date. Careful regional and seasonal consideration is required in selecting input images. During the summer season, due to clouds, it is very difficult to get images close enough to the target date. Multilinear regression analysis gives meaningful results even when the input image date is not so close to the target date. Average R² values for the weighted average method, STARFM, and multilinear regression analysis were 0.741, 0.70, and 0.61, respectively.

  16. An Attempt at Quantifying Factors that Affect Efficiency in the Management of Solid Waste Produced by Commercial Businesses in the City of Tshwane, South Africa

    PubMed Central

    Worku, Yohannes; Muchie, Mammo

    2012-01-01

    Objective. The objective was to investigate factors that affect the efficient management of solid waste produced by commercial businesses operating in the city of Pretoria, South Africa. Methods. Data was gathered from 1,034 businesses. Efficiency in solid waste management was assessed by using a structural time-based model designed for evaluating efficiency as a function of the length of time required to manage waste. Data analysis was performed using statistical procedures such as frequency tables, Pearson's chi-square tests of association, and binary logistic regression analysis. Odds ratios estimated from logistic regression analysis were used for identifying key factors that affect efficiency in the proper disposal of waste. Results. The study showed that 857 of the 1,034 businesses selected for the study (83%) were found to be sufficiently efficient with regard to the proper collection and disposal of solid waste. Based on odds ratios estimated from binary logistic regression analysis, efficiency in the proper management of solid waste was significantly influenced by 4 predictor variables. These 4 influential predictor variables are lack of adherence to waste management regulations, wrong perception, failure to provide customers with enough trash cans, and operation of businesses by employed managers, in a decreasing order of importance. PMID:23209483

  17. A method for nonlinear exponential regression analysis

    NASA Technical Reports Server (NTRS)

    Junkin, B. G.

    1971-01-01

    A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix is derived and then applied to the nominal estimates to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
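
    A compact sketch of the procedure just described, for a decay model y = a·exp(-b·t) on synthetic data: the model is linearized by a first-order Taylor expansion around the current estimates, a correction is solved for by least squares, and the cycle repeats until the correction is negligible (a Gauss-Newton iteration).

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    t = np.linspace(0, 5, 50)
    y = 3.0 * np.exp(-1.2 * t) + rng.normal(0, 0.02, t.size)

    a, b = 1.0, 0.5                        # crude initial nominal estimates
    for _ in range(50):
        f = a * np.exp(-b * t)
        # Jacobian of the model with respect to (a, b): the Taylor linearization
        J = np.column_stack([np.exp(-b * t), -a * t * np.exp(-b * t)])
        delta, *_ = np.linalg.lstsq(J, y - f, rcond=None)   # correction step
        a, b = a + delta[0], b + delta[1]
        if np.max(np.abs(delta)) < 1e-10:  # predetermined convergence criterion
            break

    print(f"a = {a:.3f}, b = {b:.3f}")     # close to the true (3.0, 1.2)
    ```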

  18. The impact of meteorology on the occurrence of waterborne outbreaks of vero cytotoxin-producing Escherichia coli (VTEC): a logistic regression approach.

    PubMed

    O'Dwyer, Jean; Morris Downes, Margaret; Adley, Catherine C

    2016-02-01

    This study analyses the relationship between meteorological phenomena and outbreaks of waterborne-transmitted vero cytotoxin-producing Escherichia coli (VTEC) in the Republic of Ireland over an 8-year period (2005-2012). Data pertaining to the notification of waterborne VTEC outbreaks were extracted from the Computerised Infectious Disease Reporting system, which is administered through the national Health Protection Surveillance Centre as part of the Health Service Executive. Rainfall and temperature data were obtained from the national meteorological office and categorised as cumulative rainfall, heavy rainfall events in the previous 7 days, and mean temperature. Regression analysis was performed using logistic regression (LR) analysis. The LR model was significant (p < 0.001), with all independent variables: cumulative rainfall, heavy rainfall and mean temperature making a statistically significant contribution to the model. The study has found that rainfall, particularly heavy rainfall in the preceding 7 days of an outbreak, is a strong statistical indicator of a waterborne outbreak and that temperature also impacts waterborne VTEC outbreak occurrence.

  19. The Use of Linear Programming for Prediction.

    ERIC Educational Resources Information Center

    Schnittjer, Carl J.

    The purpose of the study was to develop a linear programming model to be used for prediction, test the accuracy of the predictions, and compare the accuracy with that produced by curvilinear multiple regression analysis. (Author)

  20. Sparse partial least squares regression for simultaneous dimension reduction and variable selection

    PubMed Central

    Chun, Hyonho; Keleş, Sündüz

    2010-01-01

    Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data. PMID:20107611

  21. Addressing the identification problem in age-period-cohort analysis: a tutorial on the use of partial least squares and principal components analysis.

    PubMed

    Tu, Yu-Kang; Krämer, Nicole; Lee, Wen-Chung

    2012-07-01

    In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
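
    The identification problem and the principal-components workaround can be seen in a few lines of numpy (a schematic, not the tutorial's analysis): because cohort = period - age, the centered design matrix has a zero singular value, so OLS is not unique; regressing on the nonzero-variance components yields coefficients that satisfy the same collinearity constraint.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    n = 500
    age = rng.uniform(20, 70, n)
    period = rng.uniform(1990, 2010, n)
    cohort = period - age                       # perfectly collinear by definition
    y = 0.05 * age + 0.02 * period + rng.normal(0, 0.1, n)

    Xc = np.column_stack([age, period, cohort])
    Xc = Xc - Xc.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    print("singular values:", np.round(s, 6))   # last one ~0: rank deficiency

    k = 2                                       # keep the informative components
    scores = Xc @ Vt[:k].T
    gamma, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    beta = Vt[:k].T @ gamma                     # constrained, unique solution
    print("PCR coefficients (age, period, cohort):", np.round(beta, 4))
    ```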

  22. Cross-sectional study of equol producer status and cognitive impairment in older adults.

    PubMed

    Igase, Michiya; Igase, Keiji; Tabara, Yasuharu; Ohyagi, Yasumasa; Kohara, Katsuhiko

    2017-11-01

    It is well known that consumption of isoflavones reduces the risk of cardiovascular disease. However, the effectiveness of isoflavones in preventing dementia is controversial. A number of intervention studies have produced conflicting results. One possible reason is that the ability to produce equol, a metabolite of a soy isoflavone, differs greatly in individuals. In addition to existing data, we sought to confirm whether an apparent beneficial effect in cognitive function is observed after soy consumption in equol producers compared with non-producers. The present study was a cross-sectional, observational study of 152 (male/female = 61/91, mean age 69.2 ± 9.2 years) individuals. Participants were divided into two groups according to equol production status, which was determined using urine samples collected after a soy challenge test. Cognitive function was assessed using two computer-based questionnaires (touch panel-type dementia assessment scale [TDAS] and mild cognitive impairment [MCI] screen). Overall, 60 (40%) of 152 participants were equol producers. Both TDAS and prevalence of MCI were significantly higher in the equol producer group than in the non-producer group. In univariate analyses, TDAS significantly correlated with age, serum creatinine, estimated glomerular filtration rate and low-density lipoprotein cholesterol. In multiple regression analysis using TDAS as a dependent variable, equol producer (β = 0.236, P = 0.005) was selected as an independent variable. In addition, multiple logistic regression analysis to assess the presence of MCI showed that being an equol producer was an independent risk factor for MCI (odds ratio 3.961). Compared with equol non-producers, equol producers showed an apparent beneficial effect in cognitive function after soy intake. Geriatr Gerontol Int 2017; 17: 2103-2108. © 2017 Japan Geriatrics Society.

  23. Epidemiologic Evaluation of Measurement Data in the Presence of Detection Limits

    PubMed Central

    Lubin, Jay H.; Colt, Joanne S.; Camann, David; Davis, Scott; Cerhan, James R.; Severson, Richard K.; Bernstein, Leslie; Hartge, Patricia

    2004-01-01

    Quantitative measurements of environmental factors greatly improve the quality of epidemiologic studies but can pose challenges because of the presence of upper or lower detection limits or interfering compounds, which do not allow for precise measured values. We consider the regression of an environmental measurement (dependent variable) on several covariates (independent variables). Various strategies are commonly employed to impute values for interval-measured data, including assignment of one-half the detection limit to nondetected values or of “fill-in” values randomly selected from an appropriate distribution. On the basis of a limited simulation study, we found that the former approach can be biased unless the percentage of measurements below detection limits is small (5–10%). The fill-in approach generally produces unbiased parameter estimates but may produce biased variance estimates and thereby distort inference when 30% or more of the data are below detection limits. Truncated data methods (e.g., Tobit regression) and multiple imputation offer two unbiased approaches for analyzing measurement data with detection limits. If interest resides solely on regression parameters, then Tobit regression can be used. If individualized values for measurements below detection limits are needed for additional analysis, such as relative risk regression or graphical display, then multiple imputation produces unbiased estimates and nominal confidence intervals unless the proportion of missing data is extreme. We illustrate various approaches using measurements of pesticide residues in carpet dust in control subjects from a case–control study of non-Hodgkin lymphoma. PMID:15579415
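
    A simulation sketch of the substitution-versus-fill-in comparison (all parameters assumed; full multiple imputation and Tobit fitting are not reproduced): lognormal values below a detection limit are replaced either by DL/2 or by draws from the distribution truncated above at the DL.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    mu, sigma = 0.0, 1.0                   # taken as known here; in practice they
    true = rng.lognormal(mu, sigma, 5000)  # would come from a truncated-data fit
    dl = np.quantile(true, 0.30)           # ~30% of values are nondetects
    below = true < dl

    # (a) one-half detection limit substitution
    half_dl = np.where(below, dl / 2, true)

    # (b) "fill-in": inverse-CDF draws restricted to the region below the DL
    p_dl = stats.norm.cdf((np.log(dl) - mu) / sigma)
    u = rng.uniform(0, p_dl, below.sum())
    filled = true.copy()
    filled[below] = np.exp(mu + sigma * stats.norm.ppf(u))

    print(f"complete-data mean: {true.mean():.3f}")
    print(f"DL/2 substitution:  {half_dl.mean():.3f}")
    print(f"fill-in approach:   {filled.mean():.3f}")
    ```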

  24. Development of a hybrid proximal sensing method for rapid identification of petroleum contaminated soils.

    PubMed

    Chakraborty, Somsubhra; Weindorf, David C; Li, Bin; Ali Aldabaa, Abdalsamad Abdalsatar; Ghosh, Rakesh Kumar; Paul, Sathi; Nasim Ali, Md

    2015-05-01

    Using 108 petroleum-contaminated soil samples, this pilot study proposed a new analytical approach combining visible near-infrared diffuse reflectance spectroscopy (VisNIR DRS) and portable X-ray fluorescence spectrometry (PXRF) for rapid and improved quantification of soil petroleum contamination. Results indicated that an advanced fused model, in which VisNIR DRS spectra-based penalized spline regression (PSR) was used to predict total petroleum hydrocarbon and PXRF elemental data-based random forest regression was then used to model the PSR residuals, outperformed (R² = 0.78, residual prediction deviation (RPD) = 2.19) all other models tested, even producing better generalization than VisNIR DRS alone (RPDs of 1.64, 1.86, and 1.96 for random forest, penalized spline regression, and partial least squares regression, respectively). Additionally, unsupervised principal component analysis using the PXRF + VisNIR DRS system qualitatively separated contaminated soils from control samples. Fusion of PXRF elemental data and VisNIR derivative spectra produced an optimized model for total petroleum hydrocarbon quantification in soils. Copyright © 2015 Elsevier B.V. All rights reserved.

  25. Bias due to two-stage residual-outcome regression analysis in genetic association studies.

    PubMed

    Demissie, Serkalem; Cupples, L Adrienne

    2011-11-01

    Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual outcome is calculated from a regression of the outcome variable on covariates, and then the relationship between the adjusted outcome and the SNP is evaluated by a simple linear regression of the adjusted outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in a biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared correlation between the SNP and the covariate (r²). For example, for r² = 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR when the SNP and covariates are uncorrelated, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided. © 2011 Wiley Periodicals, Inc.
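
    The attenuation has a simple simulation demonstration (a sketch with assumed effect sizes, not the article's derivation): with SNP-covariate correlation r = 0.5 (r² = 0.25), the two-stage estimate comes out roughly 25% below the true effect, while MLR recovers it.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    n = 100_000
    r = 0.5                                      # SNP-covariate correlation
    snp = rng.standard_normal(n)
    cov = r * snp + np.sqrt(1 - r**2) * rng.standard_normal(n)
    y = 1.0 * snp + 1.0 * cov + rng.standard_normal(n)

    # Two-stage: residual of y ~ covariate, then residual ~ SNP
    resid = sm.OLS(y, sm.add_constant(cov)).fit().resid
    b_two_stage = sm.OLS(resid, sm.add_constant(snp)).fit().params[1]

    # MLR: y ~ SNP + covariate jointly (unbiased)
    b_mlr = sm.OLS(y, sm.add_constant(np.column_stack([snp, cov]))).fit().params[1]

    print(f"two-stage: {b_two_stage:.3f}, MLR: {b_mlr:.3f}")   # ~0.75 vs ~1.00
    ```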

  26. A matching framework to improve causal inference in interrupted time-series analysis.

    PubMed

    Linden, Ariel

    2018-04-01

    Interrupted time-series analysis (ITSA) is a popular evaluation methodology in which a single treatment unit's outcome is studied over time and the intervention is expected to "interrupt" the level and/or trend of the outcome, subsequent to its introduction. When ITSA is implemented without a comparison group, the internal validity may be quite poor. Therefore, adding a comparable control group to serve as the counterfactual is always preferred. This paper introduces a novel matching framework, ITSAMATCH, to create a comparable control group by matching directly on covariates and then use these matches in the outcomes model. We evaluate the effect of California's Proposition 99 (passed in 1988) for reducing cigarette sales, by comparing California to other states not exposed to smoking reduction initiatives. We compare ITSAMATCH results to 2 commonly used matching approaches, synthetic controls (SYNTH), and regression adjustment; SYNTH reweights nontreated units to make them comparable to the treated unit, and regression adjusts covariates directly. Methods are compared by assessing covariate balance and treatment effects. Both ITSAMATCH and SYNTH achieved covariate balance and estimated similar treatment effects. The regression model found no treatment effect and produced inconsistent covariate adjustment. While the matching framework achieved results comparable to SYNTH, it has the advantage of being technically less complicated, while producing statistical estimates that are straightforward to interpret. Conversely, regression adjustment may "adjust away" a treatment effect. Given its advantages, ITSAMATCH should be considered as a primary approach for evaluating treatment effects in multiple-group time-series analysis. © 2017 John Wiley & Sons, Ltd.

  27. The association between short interpregnancy interval and preterm birth in Louisiana: a comparison of methods.

    PubMed

    Howard, Elizabeth J; Harville, Emily; Kissinger, Patricia; Xiong, Xu

    2013-07-01

    There is growing interest in the application of propensity scores (PS) in epidemiologic studies, especially within the field of reproductive epidemiology. This retrospective cohort study assesses the impact of a short interpregnancy interval (IPI) on preterm birth and compares the results of the conventional logistic regression analysis with analyses utilizing a PS. The study included 96,378 singleton infants from Louisiana birth certificate data (1995-2007). Five regression models designed for methods comparison are presented. Ten percent (10.17%) of all births were preterm; 26.83% of births were from a short IPI. The PS-adjusted model produced a more conservative estimate of the exposure variable compared to the conventional logistic regression method (β-coefficient: 0.21 vs. 0.43), as well as a smaller standard error (0.024 vs. 0.028), odds ratio, and 95% confidence interval [1.15 (1.09, 1.20) vs. 1.23 (1.17, 1.30)]. The inclusion of more covariate and interaction terms in the PS did not change the estimates of the exposure variable. This analysis indicates that PS-adjusted regression may be appropriate for validation of conventional methods in a large dataset with a fairly common outcome. PSs may be beneficial in producing more precise estimates, especially for models with many confounders and effect modifiers and where conventional adjustment with logistic regression is unsatisfactory. Short intervals between pregnancies are associated with preterm birth in this population, according to either technique. Birth spacing is an issue that women have some control over. Educational interventions, including birth control, should be applied during prenatal visits and following delivery.

  28. Analysis of Private Returns to Vocational Education and Training: Support Document

    ERIC Educational Resources Information Center

    Lee, Wang-Sheng; Coelli, Michael

    2010-01-01

    This document is an appendix that is meant to accompany the main report, "Analysis of Private Returns to Vocational Education and Training". Included here are the detailed regression results that correspond to Tables 4 to 59 of the main report. This document was produced by the authors based on their research for the main report, and is…

  29. Predictions of biochar production and torrefaction performance from sugarcane bagasse using interpolation and regression analysis.

    PubMed

    Chen, Wei-Hsin; Hsu, Hung-Jen; Kumar, Gopalakrishnan; Budzianowski, Wojciech M; Ong, Hwai Chyuan

    2017-12-01

    This study focuses on the biochar formation and torrefaction performance of sugarcane bagasse, predicted using bilinear interpolation (BLI), inverse distance weighting (IDW) interpolation, and regression analysis. It is found that biomass torrefied at 275°C for 60 min, or at 300°C for 30 min or longer, is appropriate for producing biochar as an alternative fuel to coal with a low carbon footprint, but the energy yield from torrefaction at 300°C is too low. From the biochar yield, the enhancement factor of HHV, and the energy yield, the results suggest that all three methods are feasible for predicting the performance, especially the enhancement factor. A power parameter of unity in the IDW method provides the best predictions, with error below 5%. The second order in the regression analysis gives a more reasonable approach than the first order and is recommended for the predictions. Copyright © 2017 Elsevier Ltd. All rights reserved.
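
    Of the three prediction methods, IDW is the simplest to sketch; the snippet below implements one-dimensional IDW with the power parameter of unity that the study found best. The temperature-yield values are invented placeholders, not the paper's measurements.

    ```python
    import numpy as np

    def idw(x_known, y_known, x_query, p=1.0):
        """Predict y at x_query as an inverse-distance-weighted average."""
        d = np.abs(x_known - x_query)
        if np.any(d == 0):                       # exact match: return measured value
            return y_known[np.argmin(d)]
        w = 1.0 / d**p
        return np.sum(w * y_known) / np.sum(w)

    temps = np.array([250.0, 275.0, 300.0])      # torrefaction temperatures (deg C)
    biochar_yield = np.array([80.0, 65.0, 45.0]) # hypothetical yields (%)

    print(idw(temps, biochar_yield, 290.0, p=1.0))  # interpolated yield at 290 deg C
    ```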

  30. Methods for estimating the magnitude and frequency of peak streamflows at ungaged sites in and near the Oklahoma Panhandle

    USGS Publications Warehouse

    Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.

    2015-09-28

    Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.

  31. Application of principal component regression and partial least squares regression in ultraviolet spectrum water quality detection

    NASA Astrophysics Data System (ADS)

    Li, Jiangtong; Luo, Yongdao; Dai, Honglin

    2018-01-01

    Water is the source of life and the essential foundation of all life. With the development of industrialization, water pollution is becoming more and more frequent, directly affecting human survival and development. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, in which the partial least squares regression (PLSR) analysis method is becoming the predominant technology; however, in some special cases, PLSR analysis produces considerable errors. In order to solve this problem, the traditional principal component regression (PCR) analysis method was improved by using the principle of PLSR in this paper. The experimental results show that for some special experimental data sets, the improved PCR analysis method performs better than PLSR. PCR and PLSR are the focus of this paper. Firstly, principal component analysis (PCA) is performed with MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal components are extracted using the principle of PLSR, carrying most of the original data information. Secondly, linear regression analysis of the principal components is carried out with the Statistical Package for the Social Sciences (SPSS), from which the coefficients and relations of the principal components can be obtained. Finally, the same water spectral data set is calculated by both PLSR and improved PCR; analysis and comparison of the two results show that they are similar for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in ultraviolet spectral analysis of water, but for data near the detection limit improved PCR produces better results than PLSR.
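
    Both methods being compared are available in scikit-learn, which makes the comparison easy to sketch on synthetic "spectra" (the paper's MATLAB/SPSS workflow and its optimized component selection are not reproduced): PCR is PCA followed by linear regression, set against PLSR with the same number of components.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(9)
    n_samples, n_wavelengths = 60, 200
    spectra = rng.standard_normal((n_samples, n_wavelengths))
    # Synthetic "concentration" driven by the first few wavelengths
    conc = spectra[:, :5].sum(axis=1) + rng.normal(0, 0.1, n_samples)

    pcr = make_pipeline(PCA(n_components=10), LinearRegression()).fit(spectra, conc)
    pls = PLSRegression(n_components=10).fit(spectra, conc)

    print("PCR  R^2:", round(pcr.score(spectra, conc), 3))
    print("PLSR R^2:", round(float(pls.score(spectra, conc)), 3))
    ```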

  32. A robust ridge regression approach in the presence of both multicollinearity and outliers in the data

    NASA Astrophysics Data System (ADS)

    Shariff, Nurul Sima Mohamad; Ferdaos, Nur Aqilah

    2017-08-01

    Multicollinearity often leads to inconsistent and unreliable parameter estimates in regression analysis. The situation is more severe in the presence of outliers, which cause fatter tails in the error distributions than the normal distribution. The well-known procedure that is robust to the multicollinearity problem is the ridge regression method. This method, however, is expected to be affected by the presence of outliers due to some assumptions imposed in the modeling procedure. Thus, a robust version of the existing ridge method, with some modification in the inverse matrix and the estimated response value, is introduced. The performance of the proposed method is discussed and comparisons are made with several existing estimators, namely ordinary least squares (OLS), ridge regression, and robust ridge regression based on GM-estimates. The proposed method is able to produce reliable parameter estimates in the presence of both multicollinearity and outliers in the data.

  33. Regression Models for the Analysis of Longitudinal Gaussian Data from Multiple Sources

    PubMed Central

    O’Brien, Liam M.; Fitzmaurice, Garrett M.

    2006-01-01

    We present a regression model for the joint analysis of longitudinal multiple source Gaussian data. Longitudinal multiple source data arise when repeated measurements are taken from two or more sources, and each source provides a measure of the same underlying variable and on the same scale. This type of data generally produces a relatively large number of observations per subject; thus estimation of an unstructured covariance matrix often may not be possible. We consider two methods by which parsimonious models for the covariance can be obtained for longitudinal multiple source data. The methods are illustrated with an example of multiple informant data arising from a longitudinal interventional trial in psychiatry. PMID:15726666

  34. Investigating bias in squared regression structure coefficients

    PubMed Central

    Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce

    2015-01-01

    The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273

  35. Analysis of Variables to Predict First Year Persistence Using Logistic Regression Analysis at the University of South Florida

    ERIC Educational Resources Information Center

    Miller, T. E.; Herreid, C. H.

    2008-01-01

    This article presents a project intended to produce a model for predicting the risk of attrition of individual students enrolled at the University of South Florida. The project is premised upon the principle that college student attrition is as highly individual and personal as any other aspect of the college-going experience. Students make…

  36. Multivariate functional response regression, with application to fluorescence spectroscopy in a cervical pre-cancer study.

    PubMed

    Zhu, Hongxiao; Morris, Jeffrey S; Wei, Fengrong; Cox, Dennis D

    2017-07-01

    Many scientific studies measure different types of high-dimensional signals or images from the same subject, producing multivariate functional data. These functional measurements carry different types of information about the scientific process, and a joint analysis that integrates information across them may provide new insights into the underlying mechanism for the phenomenon under study. Motivated by fluorescence spectroscopy data in a cervical pre-cancer study, a multivariate functional response regression model is proposed, which treats multivariate functional observations as responses and a common set of covariates as predictors. This novel modeling framework simultaneously accounts for correlations between functional variables and potential multi-level structures in data that are induced by experimental design. The model is fitted by performing a two-stage linear transformation: a basis expansion to each functional variable followed by principal component analysis for the concatenated basis coefficients. This transformation effectively reduces the intra- and inter-function correlations and facilitates fast and convenient calculation. A fully Bayesian approach is adopted to sample the model parameters in the transformed space, and posterior inference is performed after inverse-transforming the regression coefficients back to the original data domain. The proposed approach produces functional tests that flag local regions on the functional effects, while controlling the overall experiment-wise error rate or false discovery rate. It also enables functional discriminant analysis through posterior predictive calculation. Analysis of the fluorescence spectroscopy data reveals local regions with differential expressions across the pre-cancer and normal samples. These regions may serve as biomarkers for prognosis and disease assessment.

  37. Selection of vegetation indices for mapping the sugarcane condition around the oil and gas field of North West Java Basin, Indonesia

    NASA Astrophysics Data System (ADS)

    Muji Susantoro, Tri; Wikantika, Ketut; Saepuloh, Asep; Handoyo Harsolumakso, Agus

    2018-05-01

    Selection of vegetation indices for plant mapping is needed to provide the best information on plant conditions. The methods used in this research are standard deviation analysis and linear regression. This research aimed to determine the vegetation indices best suited for mapping sugarcane conditions around oil and gas fields. The data used in this study are Landsat 8 OLI/TIRS. Standard deviation analysis of the 23 vegetation indices with 27 samples yielded the six indices with the highest standard deviations, namely GRVI, SR, NLI, SIPI, GEMI, and LAI, with standard deviation values of 0.47, 0.43, 0.30, 0.17, 0.16, and 0.13. Regression correlation analysis of the 23 vegetation indices with 280 samples yielded six vegetation indices, namely NDVI, ENDVI, GDVI, VARI, LAI, and SIPI; this selection was based on the regression correlation, using an R² threshold of 0.8. The combined analysis of the standard deviation and the regression correlation yielded five vegetation indices, namely NDVI, ENDVI, GDVI, LAI, and SIPI. The results of both methods show that combining the two is needed to produce a good analysis of sugarcane conditions. This was verified through field surveys and showed good results for the prediction of microseepages.

  38. Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data

    ERIC Educational Resources Information Center

    Lee, Sik-Yum

    2006-01-01

    A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…

  39. Centering Effects in HLM Level-1 Predictor Variables.

    ERIC Educational Resources Information Center

    Schumacker, Randall E.; Bembry, Karen

    Research has suggested that important research questions can be addressed with meaningful interpretations using hierarchical linear modeling (HLM). The proper interpretation of results, however, is invariably linked to the choice of centering for the Level-1 predictor variables that produce the outcome measure for the Level-2 regression analysis.…

  40. The Impact of Consumer Credentialism on Employee and Entrepreneur Returns to Higher Education.

    ERIC Educational Resources Information Center

    Tucker, Irvin B., III

    1987-01-01

    Examines the relative importance of education credentials in consumer perceptions of self-employed business people. Using 1980 national cross-sectional data on goods- and service-producing occupations, the regression analysis shows that highly educated entrepreneurs are not influenced by consumer credentialism. Includes 17 references. (MLH)

  41. Academic Admission Requirements as Predictors of Counseling Knowledge, Personal Development, and Counseling Skills

    ERIC Educational Resources Information Center

    Smaby, Marlowe H.; Maddux, Cleborne D.; Richmond, Aaron S.; Lepkowski, William J.; Packman, Jill

    2005-01-01

    The authors investigated whether undergraduates' scores on the Verbal and Quantitative tests of the Graduate Record Examinations and their undergraduate grade point average can be used to predict knowledge, personal development, and skills of graduates of counseling programs. Multiple regression analysis produced significant models predicting…

  42. River flow prediction using hybrid models of support vector regression with the wavelet transform, singular spectrum analysis and chaotic approach

    NASA Astrophysics Data System (ADS)

    Baydaroğlu, Özlem; Koçak, Kasım; Duran, Kemal

    2018-06-01

    Prediction of water amount that will enter the reservoirs in the following month is of vital importance especially for semi-arid countries like Turkey. Climate projections emphasize that water scarcity will be one of the serious problems in the future. This study presents a methodology for predicting river flow for the subsequent month based on the time series of observed monthly river flow with hybrid models of support vector regression (SVR). Monthly river flow over the period 1940-2012 observed for the Kızılırmak River in Turkey has been used for training the method, which then has been applied for predictions over a period of 3 years. SVR is a specific implementation of support vector machines (SVMs), which transforms the observed input data time series into a high-dimensional feature space (input matrix) by way of a kernel function and performs a linear regression in this space. SVR requires a special input matrix. The input matrix was produced by wavelet transforms (WT), singular spectrum analysis (SSA), and a chaotic approach (CA) applied to the input time series. WT convolutes the original time series into a series of wavelets, and SSA decomposes the time series into a trend, an oscillatory and a noise component by singular value decomposition. CA uses a phase space formed by trajectories, which represent the dynamics producing the time series. These three methods for producing the input matrix for the SVR proved successful, while the SVR-WT combination resulted in the highest coefficient of determination and the lowest mean absolute error.
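
    Of the three input-matrix constructions, the chaotic (phase-space) approach is the most compact to sketch: a time-delay embedding of the flow series becomes the SVR input matrix for one-month-ahead prediction. The embedding dimension, SVR settings, and synthetic series below are assumptions, not the paper's tuned values.

    ```python
    import numpy as np
    from sklearn.svm import SVR

    rng = np.random.default_rng(10)
    t = np.arange(360)                             # 30 years of monthly flow
    flow = 100 + 40 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, t.size)

    m = 6                                          # embedding dimension (lag window)
    X = np.column_stack([flow[i:len(flow) - m + i] for i in range(m)])
    y = flow[m:]                                   # next-month target

    # Train on all but the last 3 years, then predict that held-out period
    model = SVR(kernel="rbf", C=100.0, epsilon=1.0).fit(X[:-36], y[:-36])
    pred = model.predict(X[-36:])
    print("MAE:", round(float(np.mean(np.abs(pred - y[-36:]))), 2))
    ```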

  43. Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

    PubMed

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2016-01-01

    Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.

  4. Fuzzy multinomial logistic regression analysis: A multi-objective programming approach

    NASA Astrophysics Data System (ADS)

    Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan

    2017-05-01

Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large, well-balanced datasets, maximum likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely, or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate the parameters of multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and on Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the ML approach. Results show that the newly proposed model outperforms ML for small datasets.

  5. The effect of working in an infection isolation room on hospital nurses' job satisfaction.

    PubMed

    Kagan, Ilya; Fridman, Shoshana; Shalom, Esther; Melnikov, Semyon

    2018-03-01

    To examine how the nature of working in a carbapenemase-producing Klebsiella pneumoniae infection isolation room affects nurses' job performance and job satisfaction. Job satisfaction is under intensive research as a factor in the retention of nursing staff. In a cross-sectional design study, a convenience sample of 87 registered nurses who had worked in carbapenemase-producing Klebsiella pneumoniae isolation rooms in a tertiary medical centre in Israel answered a self-administered questionnaire. Data were analysed by descriptive statistics, Pearson correlation coefficients, t tests, one-way ANOVA and multiple regression analysis. Job satisfaction was significantly correlated with perceived knowledge of carbapenemase-producing Klebsiella pneumoniae, with personal experience of working in an isolation room and the perceived level of professional functioning. Multiple regression analysis found that the quality of the nurses' personal experience of isolation room work and their perceived level of professional functioning there explained 33% of the variance in job satisfaction. Managers need to take into account that prolonged work in isolation can negatively impinge upon both performance and job satisfaction. Managers can consider refraining from lengthy nurse assignment to the isolation room. This would also apply to other areas of nursing practice where work is performed in isolation. © 2017 John Wiley & Sons Ltd.

  6. Estimating top-of-atmosphere thermal infrared radiance using MERRA-2 atmospheric data

    NASA Astrophysics Data System (ADS)

    Kleynhans, Tania; Montanaro, Matthew; Gerace, Aaron; Kanan, Christopher

    2017-05-01

Thermal infrared satellite images have been widely used in environmental studies. However, satellites have limited temporal resolution, e.g., 16 days for Landsat or 1 to 2 days for Terra MODIS. This paper investigates the use of the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) reanalysis data product, produced by NASA's Global Modeling and Assimilation Office (GMAO), to predict global top-of-atmosphere (TOA) thermal infrared radiance. The high temporal resolution of the MERRA-2 data product presents opportunities for novel research and applications. Various methods were applied to estimate TOA radiance from MERRA-2 variables, namely (1) a parameterized physics-based method, (2) linear regression models, and (3) non-linear support vector regression. Model prediction accuracy was evaluated using temporally and spatially coincident Moderate Resolution Imaging Spectroradiometer (MODIS) thermal infrared data as reference data. This research found that support vector regression with a radial basis function kernel produced the lowest error rates. Sources of error are discussed and defined. Further research is currently being conducted to train deep learning models to predict TOA thermal radiance.

  7. Nonparametric methods for drought severity estimation at ungauged sites

    NASA Astrophysics Data System (ADS)

    Sadri, S.; Burn, D. H.

    2012-12-01

The objective of frequency analysis is, given extreme events such as drought severity or duration, to estimate the relationship between an event and its associated return period at a catchment. Neural networks and other artificial intelligence approaches to function estimation and regression analysis are relatively new techniques in engineering, providing an attractive alternative to traditional statistical models. There are, however, few applications of neural networks and support vector machines in the area of severity quantile estimation for drought frequency analysis. In this paper, we compare three methods for this task: multiple linear regression, radial basis function neural networks, and least squares support vector regression (LS-SVR). The area selected for this study includes 32 catchments in the Canadian Prairies. Drought severities are extracted from each catchment and fitted to a Pearson type III distribution, which provides the observed quantile values. For each method-duration pair, we use a jackknife algorithm to produce estimated values at each site. The results from these three approaches are compared and analyzed, and it is found that LS-SVR provides the best quantile estimates and extrapolation capacity.
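The jackknife evaluation loop can be sketched as follows, with ordinary linear regression standing in for LS-SVR and synthetic catchment attributes; only the leave-one-site-out mechanics mirror the paper.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n_sites = 32
X = rng.normal(size=(n_sites, 3))        # synthetic catchment attributes (area, slope, climate index)
y = X @ np.array([1.0, -0.5, 0.3]) + 0.2 * rng.standard_normal(n_sites)  # synthetic severity quantiles

jackknife_preds = np.empty(n_sites)
for i in range(n_sites):
    mask = np.arange(n_sites) != i       # leave site i out, fit on the remaining 31
    model = LinearRegression().fit(X[mask], y[mask])
    jackknife_preds[i] = model.predict(X[i:i + 1])[0]

rmse = np.sqrt(np.mean((jackknife_preds - y) ** 2))
print(f"jackknife RMSE across sites: {rmse:.3f}")
```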

  8. Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression.

    PubMed

    Beckstead, Jason W

    2012-03-30

    The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic strategy to isolate, examine, and remove suppression effects has been offered. In this article such an approach, rooted in confirmatory factor analysis theory and employing matrix algebra, is developed. Suppression is viewed as the result of criterion-irrelevant variance operating among predictors. Decomposition of predictor variables into criterion-relevant and criterion-irrelevant components using structural equation modeling permits derivation of regression weights with the effects of criterion-irrelevant variance omitted. Three examples with data from applied research are used to illustrate the approach: the first assesses child and parent characteristics to explain why some parents of children with obsessive-compulsive disorder accommodate their child's compulsions more so than do others, the second examines various dimensions of personal health to explain individual differences in global quality of life among patients following heart surgery, and the third deals with quantifying the relative importance of various aptitudes for explaining academic performance in a sample of nursing students. The approach is offered as an analytic tool for investigators interested in understanding predictor-criterion relationships when complex patterns of intercorrelation among predictors are present and is shown to augment dominance analysis.

  9. GIS and statistical analysis for landslide susceptibility mapping in the Daunia area, Italy

    NASA Astrophysics Data System (ADS)

    Mancini, F.; Ceppi, C.; Ritrovato, G.

    2010-09-01

    This study focuses on landslide susceptibility mapping in the Daunia area (Apulian Apennines, Italy) and achieves this by using a multivariate statistical method and data processing in a Geographical Information System (GIS). The Logistic Regression (hereafter LR) method was chosen to produce a susceptibility map over an area of 130 000 ha where small settlements are historically threatened by landslide phenomena. By means of LR analysis, the tendency to landslide occurrences was, therefore, assessed by relating a landslide inventory (dependent variable) to a series of causal factors (independent variables) which were managed in the GIS, while the statistical analyses were performed by means of the SPSS (Statistical Package for the Social Sciences) software. The LR analysis produced a reliable susceptibility map of the investigated area and the probability level of landslide occurrence was ranked in four classes. The overall performance achieved by the LR analysis was assessed by local comparison between the expected susceptibility and an independent dataset extrapolated from the landslide inventory. Of the samples classified as susceptible to landslide occurrences, 85% correspond to areas where landslide phenomena have actually occurred. In addition, the consideration of the regression coefficients provided by the analysis demonstrated that a major role is played by the "land cover" and "lithology" causal factors in determining the occurrence and distribution of landslide phenomena in the Apulian Apennines.

  10. Modeling critical habitat for Flammulated Owls (Otus flammeolus)

    Treesearch

    David A. Christie; Astrid M. van Woudenberg

    1997-01-01

    Multiple logistic regression analysis was used to produce a prediction model for Flammulated Owl (Otus flammeolus) breeding habitat within the Kamloops Forest Region in south-central British Columbia. Using the model equation, a pilot habitat prediction map was created within a Geographic Information System (GIS) environment that had a 75.7 percent...

  11. Automating annotation of information-giving for analysis of clinical conversation.

    PubMed

    Mayfield, Elijah; Laws, M Barton; Wilson, Ira B; Penstein Rosé, Carolyn

    2014-02-01

Coding of clinical communication for fine-grained features such as speech acts has produced a substantial literature. However, annotation by humans is laborious and expensive, limiting the application of these methods. We aimed to show that, through machine learning, computers could code certain categories of speech acts with sufficient reliability to make useful distinctions among clinical encounters. The data were transcripts of 415 routine outpatient visits of HIV patients which had previously been coded for speech acts using the Generalized Medical Interaction Analysis System (GMIAS); 50 had also been coded for larger-scale features using the Comprehensive Analysis of the Structure of Encounters System (CASES). We aggregated selected speech acts into information-giving and information-requesting, then trained the machine to annotate automatically using logistic regression classification. We evaluated reliability by per-speech-act accuracy. We used multiple regression to predict patient reports of communication quality from post-visit surveys using the patient and provider information-giving to information-requesting ratio (briefly, information-giving ratio) and patient gender. Automated coding achieves moderate reliability relative to human coding (accuracy 71.2%, κ=0.57), with high correlation between machine and human estimates of the information-giving ratio (r=0.96). The regression significantly predicted four of five patient-reported measures of communication quality (r=0.263-0.344). The information-giving ratio is a useful and intuitive measure for predicting patient perception of provider-patient communication quality. These predictions can be made with automated annotation, which is a practical option for studying large collections of clinical encounters with objectivity, consistency, and low cost, providing greater opportunity for training and reflection for care providers.

  12. Differential gene expression detection and sample classification using penalized linear regression models.

    PubMed

    Wu, Baolin

    2006-02-15

Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and have proved useful in empirical studies. Recently Wu proposed penalized t/F-statistics with shrinkage by formally using L1-penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discuss the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using L1-penalized regression models, and we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
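A minimal sketch of the shrinkage idea with an L1 penalty, using scikit-learn's Lasso on synthetic "large p, small n" data rather than the penalized t/F-statistics themselves.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, p = 40, 1000                          # "large p, small n" microarray-style data
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0                           # only 5 genes are truly differential
y = X @ beta + rng.standard_normal(n)

# The L1 penalty shrinks most coefficients exactly to zero,
# guarding against the over-fitting described above.
fit = Lasso(alpha=0.5).fit(X, y)
selected = np.flatnonzero(fit.coef_)
print(f"genes with non-zero coefficients: {selected}")
```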

  13. Time Series Analysis of Soil Radon Data Using Multiple Linear Regression and Artificial Neural Network in Seismic Precursory Studies

    NASA Astrophysics Data System (ADS)

    Singh, S.; Jaishi, H. P.; Tiwari, R. P.; Tiwari, R. C.

    2017-07-01

This paper reports the analysis of soil radon data recorded in seismic zone-V, located in the northeastern part of India (latitude 23.73N, longitude 92.73E). Continuous measurements of soil-gas emission along the Chite fault in Mizoram (India) were carried out, with replacement of the solid-state nuclear track detectors at weekly intervals. The present study covers the period from March 2013 to May 2015 and used LR-115 Type II detectors, manufactured by Kodak Pathe, France. In order to reduce the influence of meteorological parameters, statistical analysis tools such as multiple linear regression and artificial neural networks have been used. A decrease in radon concentration was recorded prior to some earthquakes that occurred during the observation period. Some false anomalies were also recorded, which may be attributed to ongoing crustal deformation that was not substantial enough to produce an earthquake.
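One common way to reduce the influence of meteorological parameters is to regress the radon series on the meteorological variables and work with the residuals; the sketch below illustrates that idea on synthetic weekly data and is not the authors' exact procedure.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
weeks = 120
temp = 20 + 5 * np.sin(np.linspace(0, 12, weeks)) + rng.standard_normal(weeks)
pressure = 1010 + 3 * rng.standard_normal(weeks)
rain = rng.exponential(2.0, weeks)

# Radon driven partly by meteorology plus an unexplained component
radon = 50 + 1.5 * temp - 0.8 * (pressure - 1010) + 2 * rng.standard_normal(weeks)

X = sm.add_constant(np.column_stack([temp, pressure, rain]))
fit = sm.OLS(radon, X).fit()

# Residuals are the meteorology-corrected signal in which
# precursory anomalies would be sought.
corrected = fit.resid
print(fit.params, corrected[:5])
```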

  14. Large biases in regression-based constituent flux estimates: causes and diagnostic tools

    USGS Publications Warehouse

    Hirsch, Robert M.

    2014-01-01

It has been documented in the literature that, in some cases, widely used regression-based models can produce severely biased estimates of long-term mean river fluxes of various constituents. These models, estimated using sample values of concentration, discharge, and date, are used to compute estimated fluxes for a multiyear period at a daily time step. This study compares results of the LOADEST seven-parameter model, the LOADEST five-parameter model, and the Weighted Regressions on Time, Discharge, and Season (WRTDS) model using subsampling of six very large datasets to better understand this bias problem. This analysis considers sample datasets for dissolved nitrate and total phosphorus. The results show that LOADEST-7 and LOADEST-5, although they often produce very nearly unbiased results, can produce highly biased results. This study identifies three conditions that can give rise to these severe biases: (1) lack of fit of the log of concentration vs. log discharge relationship, (2) substantial differences in the shape of this relationship across seasons, and (3) severely heteroscedastic residuals. The WRTDS model is more resistant to the bias problem than the LOADEST models but is not immune to it. Understanding the causes of the bias problem is crucial to selecting an appropriate method for flux computations. Diagnostic tools for identifying the potential for bias problems are introduced, and strategies for resolving bias problems are described.

  15. Multivariate adaptive regression splines analysis to predict biomarkers of spontaneous preterm birth.

    PubMed

    Menon, Ramkumar; Bhat, Geeta; Saade, George R; Spratt, Heidi

    2014-04-01

To develop classification models of demographic/clinical factors and biomarker data from spontaneous preterm birth in African Americans and Caucasians. Secondary analysis of biomarker data using multivariate adaptive regression splines (MARS), a supervised machine learning algorithm. Analysis of data on 36 biomarkers from 191 women was reduced by MARS to develop predictive models for preterm birth in African Americans and Caucasians. Maternal plasma and cord plasma were collected at admission for preterm or term labor, and amniotic fluid at delivery. Data were partitioned into training and testing sets. Variable importance, a relative indicator (0-100%), and the area under the receiver operating characteristic curve (AUC) characterized the results. MARS generated models for combined and racially stratified biomarker data. Clinical and demographic data did not contribute to the model. Racial stratification of the data produced distinct models in all three compartments. In African American maternal plasma samples, IL-1RA, TNF-α, angiopoietin 2, TNFRI, IL-5, MIP1α, IL-1β and TGF-α modeled preterm birth (AUC train: 0.98, AUC test: 0.86). In Caucasians, TNFR1, ICAM-1 and IL-1RA contributed to the model (AUC train: 0.84, AUC test: 0.68). African American cord plasma samples produced a model with IL-12P70 and IL-8 (AUC train: 0.82, AUC test: 0.66). Cord plasma in Caucasians modeled IGFII, PDGFBB, TGF-β1, IL-12P70, and TIMP1 (AUC train: 0.99, AUC test: 0.82). Amniotic fluid in African Americans modeled FasL, TNFRII, RANTES, KGF, IGFI (AUC train: 0.95, AUC test: 0.89) and in Caucasians, TNF-α, MCP3, TGF-β3, TNFR1 and angiopoietin 2 (AUC train: 0.94, AUC test: 0.79). Multivariate adaptive regression splines modeled multiple biomarkers associated with preterm birth and demonstrated racial disparity. © 2014 Nordic Federation of Societies of Obstetrics and Gynecology.

  16. Modeling Outcomes with Floor or Ceiling Effects: An Introduction to the Tobit Model

    ERIC Educational Resources Information Center

    McBee, Matthew

    2010-01-01

    In gifted education research, it is common for outcome variables to exhibit strong floor or ceiling effects due to insufficient range of measurement of many instruments when used with gifted populations. Common statistical methods (e.g., analysis of variance, linear regression) produce biased estimates when such effects are present. In practice,…
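For concreteness, here is a standard textbook statement of the censored-regression likelihood a Tobit model maximizes for an outcome with a ceiling at c; the notation is ours, not the article's.

```latex
% Tobit model with a ceiling at c: the latent outcome y* is linear in x,
% but only y = min(y*, c) is observed.
\begin{align*}
y_i^* &= \mathbf{x}_i^\top \boldsymbol{\beta} + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \\
y_i &= \min(y_i^*, c), \\
L(\boldsymbol{\beta}, \sigma) &= \prod_{y_i < c}
  \frac{1}{\sigma}\,\phi\!\left(\frac{y_i - \mathbf{x}_i^\top \boldsymbol{\beta}}{\sigma}\right)
  \prod_{y_i = c}
  \left[1 - \Phi\!\left(\frac{c - \mathbf{x}_i^\top \boldsymbol{\beta}}{\sigma}\right)\right].
\end{align*}
```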

  17. Regression and Geostatistical Techniques: Considerations and Observations from Experiences in NE-FIA

    Treesearch

    Rachel Riemann; Andrew Lister

    2005-01-01

    Maps of forest variables improve our understanding of the forest resource by allowing us to view and analyze it spatially. The USDA Forest Service's Northeastern Forest Inventory and Analysis unit (NE-FIA) has used geostatistical techniques, particularly stochastic simulation, to produce maps and spatial data sets of FIA variables. That work underscores the...

  18. Advantages of continuous genotype values over genotype classes for GWAS in higher polyploids: a comparative study in hexaploid chrysanthemum.

    PubMed

    Grandke, Fabian; Singh, Priyanka; Heuven, Henri C M; de Haan, Jorn R; Metzler, Dirk

    2016-08-24

Association studies are an essential part of modern plant breeding, but are limited for polyploid crops. The increased number of possible genotype classes complicates the differentiation between them. Available methods are limited with respect to the ploidy level or the data-producing technologies. While genotype classification is an established noise reduction step in diploids, it gains complexity with increasing ploidy levels. Eventually, the errors produced by misclassifications exceed the benefits of genotype classes. Alternatively, continuous genotype values can be used for association analysis in higher polyploids. We associated continuous genotypes with three different traits and compared the results to the output of the genotype caller SuperMASSA. Linear, Bayesian and partial least squares regression were applied to determine whether the use of continuous genotypes is limited to a specific method. A disease, a flowering and a growth trait with h² of 0.51, 0.78 and 0.91 were associated with hexaploid chrysanthemum genotypes. The data set consisted of 55,825 probes and 228 samples. We were able to detect associating probes using continuous genotypes for multiple traits, using different regression methods. The identified probe sets were overlapping, but not identical, between the methods. Bayesian regression was the most restrictive method, resulting in ten probes for one trait and none for the others. Linear and partial least squares regression led to numerous associating probes. Association based on genotype classes resulted in similar values, but missed several significant probes. A simulation study successfully validated the number of associating markers. Association of various phenotypic traits with continuous genotypes is successful with both uni- and multivariate regression methods. Genotype calling does not improve the association and shows no advantages in this study. Instead, the use of continuous genotypes simplifies the analysis, saves computational time and yields more potential markers.
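A per-probe association scan with continuous genotype values can be sketched as a univariate regression over probes; the dosage scale (0-6 for a hexaploid), the synthetic data, and the Bonferroni threshold below are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_samples, n_probes = 228, 2000          # scaled-down stand-in for 55,825 probes
G = rng.uniform(0, 6, size=(n_samples, n_probes))       # continuous allele dosages (hexaploid: 0-6)
trait = 0.8 * G[:, 0] + rng.standard_normal(n_samples)  # probe 0 truly associates

pvals = np.array([stats.linregress(G[:, j], trait).pvalue for j in range(n_probes)])
hits = np.flatnonzero(pvals < 0.05 / n_probes)          # Bonferroni threshold
print(f"probes passing the genome-wide threshold: {hits}")
```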

  19. The effect of playing tactics and situational variables on achieving score-box possessions in a professional soccer team.

    PubMed

    Lago-Ballesteros, Joaquin; Lago-Peñas, Carlos; Rey, Ezequiel

    2012-01-01

The aim of this study was to analyse the influence of playing tactics, opponent interaction and situational variables on achieving score-box possessions in professional soccer. The sample consisted of 908 possessions obtained by a team from the Spanish soccer league in 12 matches played during the 2009-2010 season. Multidimensional qualitative data obtained from 12 ordered categorical variables were used. Sampled matches were registered by the AMISCO PRO system. Data were analysed using chi-square analysis and multiple logistic regression analysis. Of 908 possessions, 303 (33.4%) produced score-box possessions, 477 (52.5%) achieved progression and 128 (14.1%) failed to reach any sort of progression. Multiple logistic regression showed that, for the main variable "team possession type", direct attacks and counterattacks were three times more effective than elaborate attacks for producing a score-box possession (P < 0.05). Team possessions originating from the middle zones and played against fewer than six defending players (P < 0.001) registered higher success than those started in the defensive zone against a balanced defence. When the team was drawing or winning, the probability of reaching the score-box decreased by 43 and 53 percent, respectively, compared with the losing situation (P < 0.05). Accounting for opponent interactions and situational variables is critical to evaluating the effectiveness of offensive playing tactics in producing score-box possessions.

  20. A step-by-step guide to non-linear regression analysis of experimental data using a Microsoft Excel spreadsheet.

    PubMed

    Brown, A M

    2001-06-01

The objective of the present study was to introduce a simple, easily understood method for carrying out non-linear regression analysis based on user-input functions. While it is relatively straightforward to fit data with simple functions such as linear or logarithmic functions, fitting data with more complicated non-linear functions is more difficult. Commercial specialist programmes are available that will carry out this analysis, but these programmes are expensive and are not intuitive to learn. An alternative method described here is to use the SOLVER function of the ubiquitous spreadsheet programme Microsoft Excel, which employs an iterative least squares fitting routine to produce the optimal goodness of fit between data and function. The intent of this paper is to lead the reader through an easily understood step-by-step guide to implementing this method, which can be applied to any function of the form y=f(x), and is well suited to fast, reliable analysis of data in all fields of biology.
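For readers working outside Excel, the same iterative least-squares idea can be reproduced in a few lines of Python with scipy.optimize.curve_fit; the saturation function below is an arbitrary example of a user-defined y=f(x).

```python
import numpy as np
from scipy.optimize import curve_fit

# User-defined function, here a Michaelis-Menten-style saturation curve
def f(x, vmax, km):
    return vmax * x / (km + x)

rng = np.random.default_rng(6)
x = np.linspace(0.1, 10, 50)
y = f(x, 5.0, 2.0) + 0.1 * rng.standard_normal(50)   # noisy observations

# Iterative least squares: minimizes sum((y - f(x, *params))**2),
# just as SOLVER does for the spreadsheet's goodness-of-fit cell.
params, cov = curve_fit(f, x, y, p0=[1.0, 1.0])
print(f"vmax = {params[0]:.2f}, km = {params[1]:.2f}")
```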

  1. Utility-Based Instruments for People with Dementia: A Systematic Review and Meta-Regression Analysis.

    PubMed

    Li, Li; Nguyen, Kim-Huong; Comans, Tracy; Scuffham, Paul

    2018-04-01

Several utility-based instruments have been applied in cost-utility analysis to assess health state values for people with dementia. Nevertheless, concerns and uncertainty regarding their performance for people with dementia have been raised. To assess the performance of available utility-based instruments for people with dementia by comparing their psychometric properties and to explore factors that cause variations in the reported health state values generated from those instruments by conducting meta-regression analyses. A literature search was conducted and psychometric properties were synthesized to demonstrate the overall performance of each instrument. When available, health state values and variables such as the type of instrument and cognitive impairment levels were extracted from each article. A meta-regression analysis was undertaken and available covariates were included in the models. A total of 64 studies providing preference-based values were identified and included. The EuroQol five-dimension questionnaire demonstrated the best combination of feasibility, reliability, and validity. Meta-regression analyses suggested that significant differences exist between instruments, types of respondents, and modes of administration, and that the variations in estimated utility values influence incremental quality-adjusted life-year calculations. This review finds that the EuroQol five-dimension questionnaire is the most valid utility-based instrument for people with dementia, but should be replaced by others under certain circumstances. Although no utility estimates were reported in the article, the meta-regression analyses show that variations in the utility estimates produced by different instruments have an impact on cost-utility analysis, potentially altering the decision-making process in some circumstances. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  2. Evaluating the utility of companion animal tick surveillance practices for monitoring spread and occurrence of human Lyme disease in West Virginia, 2014-2016.

    PubMed

    Hendricks, Brian; Mark-Carew, Miguella; Conley, Jamison

    2017-11-13

    Domestic dogs and cats are potentially effective sentinel populations for monitoring occurrence and spread of Lyme disease. Few studies have evaluated the public health utility of sentinel programmes using geo-analytic approaches. Confirmed Lyme disease cases diagnosed by physicians and ticks submitted by veterinarians to the West Virginia State Health Department were obtained for 2014-2016. Ticks were identified to species, and only Ixodes scapularis were incorporated in the analysis. Separate ordinary least squares (OLS) and spatial lag regression models were conducted to estimate the association between average numbers of Ix. scapularis collected on pets and human Lyme disease incidence. Regression residuals were visualised using Local Moran's I as a diagnostic tool to identify spatial dependence. Statistically significant associations were identified between average numbers of Ix. scapularis collected from dogs and human Lyme disease in the OLS (β=20.7, P<0.001) and spatial lag (β=12.0, P=0.002) regression. No significant associations were identified for cats in either regression model. Statistically significant (P≤0.05) spatial dependence was identified in all regression models. Local Moran's I maps produced for spatial lag regression residuals indicated a decrease in model over- and under-estimation, but identified a higher number of statistically significant outliers than OLS regression. Results support previous conclusions that dogs are effective sentinel populations for monitoring risk of human exposure to Lyme disease. Findings reinforce the utility of spatial analysis of surveillance data, and highlight West Virginia's unique position within the eastern United States in regards to Lyme disease occurrence.

  3. A single determinant dominates the rate of yeast protein evolution.

    PubMed

    Drummond, D Allan; Raval, Alpan; Wilke, Claus O

    2006-02-01

    A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
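A minimal sketch of principal component regression under the setup the paper describes: several correlated predictors share one latent driver, and the response is regressed on the leading principal components; all data and loadings are synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n = 300
latent = rng.standard_normal(n)                      # e.g., "number of translation events"
# Seven noisy, mutually correlated predictors all loaded on the same latent variable
X = latent[:, None] + 0.5 * rng.standard_normal((n, 7))
rate = -1.0 * latent + 0.3 * rng.standard_normal(n)  # evolutionary rate

pcr = make_pipeline(StandardScaler(), PCA(n_components=3), LinearRegression())
pcr.fit(X, rate)

# The first component absorbs most of the shared variance, mirroring
# the single dominant determinant reported in the study.
print("explained variance ratios:", pcr.named_steps["pca"].explained_variance_ratio_)
```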

  4. Regression Is a Univariate General Linear Model Subsuming Other Parametric Methods as Special Cases.

    ERIC Educational Resources Information Center

    Vidal, Sherry

Although the concept of the general linear model (GLM) has existed since the 1960s, other univariate analyses such as the t-test and the analysis of variance models have remained popular. The GLM produces an equation that minimizes the squared differences between observed values of a dependent variable and the values predicted from the independent variables. From a computer printout…

  5. Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

    PubMed

    Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

    2016-01-01

Reporting of surgical complications is common, but few reports provide information about severity or estimate risk factors of complications, and those that do often lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgical procedures at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to identify risk factors for postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four out of 86 variables were identified as statistically significant in univariate logistic regression analysis; the 11 significant variables that entered the multivariate analysis were employed to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation p = exp(∑BiXi) / (1 + exp(∑BiXi)), a multivariate logistic regression model that calculates the risk of postoperative morbidity was developed: p = 1/(1 + e^(4.810 − 1.287X1 − 0.504X2 − 0.500X3 − 0.474X4 − 0.405X5 − 0.318X6 − 0.316X7 − 0.305X8 − 0.278X9 − 0.255X10 − 0.138X11)). The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo grading of complication severity and logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, can help estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
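The published equation can be wrapped in a small helper that converts a patient's covariate vector into a predicted complication probability. The intercept and weights below are taken from the equation above, but the covariate coding for the hypothetical patient is illustrative and would need to match the original study's definitions.

```python
import math

# Coefficients from the published model: intercept -4.810 and positive
# weights for the 11 risk factors (X1 = liver cirrhosis ... X11 = age).
INTERCEPT = -4.810
WEIGHTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
           0.316, 0.305, 0.278, 0.255, 0.138]

def complication_probability(x):
    """Logistic risk score: p = 1 / (1 + exp(-(b0 + sum(bi * xi)))).
    `x` is the 11-element covariate vector; its coding (0/1 indicators,
    scaled continuous values) must match the original study's."""
    z = INTERCEPT + sum(b * xi for b, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))

# A hypothetical patient with three risk factors present
print(f"predicted risk: {complication_probability([1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]):.2%}")
```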

  6. High regression rate hybrid rocket fuel grains with helical port structures

    NASA Astrophysics Data System (ADS)

    Walker, Sean D.

Hybrid rockets are popular in the aerospace industry due to their storage safety, simplicity, and controllability during rocket motor burn. However, they produce fuel regression rates typically 25% lower than solid fuel motors of the same thrust level. These lowered regression rates produce unacceptably high oxidizer-to-fuel (O/F) ratios that create a potential for motor instability, nozzle erosion, and reduced motor duty cycles. To achieve O/F ratios that produce acceptable combustion characteristics, traditional cylindrical fuel ports are fabricated with very long length-to-diameter ratios to increase the total burning area. These high aspect ratios produce further reduced fuel regression rates and thrust levels, poor volumetric efficiency, and a potential for lateral structural loading issues during high-thrust burns. In place of traditional cylindrical fuel ports, it is proposed that by researching the effects of the centrifugal flow patterns introduced by embedded helical fuel port structures, a significant increase in fuel regression rates can be achieved. The benefits of increasing volumetric efficiency by lengthening the internal flow path will also be observed. The mechanisms behind this increased fuel regression rate are the enhancement of surface skin friction and the reduction of boundary layer "blowing", both of which enhance convective heat transfer to the fuel surface. Preliminary results have been obtained using additive manufacturing to fabricate hybrid rocket fuel grains from acrylonitrile-butadiene-styrene (ABS) with embedded helical fuel port structures, with burn-rate amplifications up to 3.0 times those of cylindrical fuel ports.

  7. On The Impact of Climate Change to Agricultural Productivity in East Java

    NASA Astrophysics Data System (ADS)

    Kuswanto, Heri; Salamah, Mutiah; Mumpuni Retnaningsih, Sri; Dwi Prastyo, Dedy

    2018-03-01

Many studies have shown that climate change has a significant impact on the agricultural sector, threatening food security especially in developing countries. It has also been observed that climate change increases the intensity of extreme events. This research investigated the impact of climate on agricultural productivity in East Java, one of the main rice producers in Indonesia. Standard regression as well as panel regression models were fitted in order to find the model best able to describe the climate change impact. The analysis found that the fixed-effects panel regression model outperforms the others, showing that climate change has negatively impacted rice productivity in East Java. The effects in Malang and Pasuruan were almost the same, while the impact in Sumenep was the smallest among the districts.
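A fixed-effects (within) panel estimator of the kind the study favors can be sketched by demeaning each district's series and running OLS on the demeaned data; the districts are those named in the abstract, but all values are synthetic.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(8)
districts = ["Malang", "Pasuruan", "Sumenep"]
rows = []
for d_i, d in enumerate(districts):
    for t in range(2000, 2016):
        temp = 26 + 0.05 * (t - 2000) + rng.standard_normal()
        rice_yield = 5.0 + d_i - 0.3 * temp + 0.2 * rng.standard_normal()  # district-level intercepts
        rows.append((d, t, temp, rice_yield))
df = pd.DataFrame(rows, columns=["district", "year", "temp", "yield"])

# Within transformation: subtracting district means removes the fixed effects
demeaned = df.groupby("district")[["temp", "yield"]].transform(lambda s: s - s.mean())
fit = sm.OLS(demeaned["yield"], sm.add_constant(demeaned["temp"])).fit()
print(fit.params)   # the slope estimates the within-district climate effect
```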

  8. Prediction models for CO2 emission in Malaysia using best subsets regression and multi-linear regression

    NASA Astrophysics Data System (ADS)

    Tan, C. H.; Matjafri, M. Z.; Lim, H. S.

    2015-10-01

This paper presents prediction models that analyze and compute the CO2 emission in Malaysia. Each prediction model for CO2 emission addresses one of three main groups: transportation, electricity and heat production, and residential buildings together with commercial and public services. The prediction models were generated using data obtained from World Bank Open Data. The best-subsets method was used to remove irrelevant predictors, followed by multiple linear regression to produce the prediction models. From the results, high R-squared (prediction) values were obtained, implying that the models reliably predict CO2 emission from the specified data. In addition, the CO2 emissions from these three groups were forecasted using trend analysis plots for observation purposes.
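The best-subsets step can be sketched by exhaustively scoring every predictor subset with adjusted R²; the candidate variables below are stand-ins for the three emission groups plus one irrelevant series.

```python
import itertools
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 50
data = {
    "transport": rng.standard_normal(n),
    "electricity_heat": rng.standard_normal(n),
    "residential_services": rng.standard_normal(n),
    "noise": rng.standard_normal(n),                 # an irrelevant candidate
}
co2 = 2.0 * data["transport"] + 1.0 * data["electricity_heat"] + 0.3 * rng.standard_normal(n)

best = (None, -np.inf)
names = list(data)
for k in range(1, len(names) + 1):
    for subset in itertools.combinations(names, k):
        X = sm.add_constant(np.column_stack([data[v] for v in subset]))
        adj_r2 = sm.OLS(co2, X).fit().rsquared_adj
        if adj_r2 > best[1]:
            best = (subset, adj_r2)     # keep the subset with the highest adjusted R^2

print(f"best subset: {best[0]}, adjusted R^2 = {best[1]:.3f}")
```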

  9. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    USGS Publications Warehouse

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-06-30

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.

  10. [Establishment of multiple regression model for virulence factors of Saccharomyces albicans by random amplified polymorphic DNA bands].

    PubMed

    Liu, Qi; Wu, Youcong; Yuan, Youhua; Bai, Li; Niu, Kun

    2011-12-01

To investigate the relationship between the virulence factors of Saccharomyces albicans (S. albicans) and their random amplified polymorphic DNA (RAPD) bands, and to establish a regression model by multiple regression analysis. Extracellular phospholipase, secreted proteinase, the ability to generate germ tubes and adherence to oral mucosal cells were measured in vitro for 92 strains of S. albicans; RAPD-polymerase chain reaction (RAPD-PCR) was used to obtain their bands. A multiple regression of the virulence factors of S. albicans on the RAPD-PCR bands was established. The extracellular phospholipase activity was associated with 4 RAPD bands: 350, 450, 650 and 1 300 bp (P < 0.05); secreted proteinase activity of S. albicans was associated with 2 bands: 350 and 1 200 bp (P < 0.05); the ability to produce germ tubes was associated with 2 bands: 400 and 550 bp (P < 0.05). Some RAPD bands indirectly reflect the virulence factors of S. albicans; these bands may carry important information about the regulation of S. albicans virulence factors.

  11. An Analysis of the Number of Medical Malpractice Claims and Their Amounts

    PubMed Central

    Bonetti, Marco; Cirillo, Pasquale; Musile Tanzi, Paola; Trinchero, Elisabetta

    2016-01-01

    Starting from an extensive database, pooling 9 years of data from the top three insurance brokers in Italy, and containing 38125 reported claims due to alleged cases of medical malpractice, we use an inhomogeneous Poisson process to model the number of medical malpractice claims in Italy. The intensity of the process is allowed to vary over time, and it depends on a set of covariates, like the size of the hospital, the medical department and the complexity of the medical operations performed. We choose the combination medical department by hospital as the unit of analysis. Together with the number of claims, we also model the associated amounts paid by insurance companies, using a two-stage regression model. In particular, we use logistic regression for the probability that a claim is closed with a zero payment, whereas, conditionally on the fact that an amount is strictly positive, we make use of lognormal regression to model it as a function of several covariates. The model produces estimates and forecasts that are relevant to both insurance companies and hospitals, for quality assurance, service improvement and cost reduction. PMID:27077661
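A two-stage model of this kind might look as follows: a logistic regression for the probability that a claim closes with zero payment, and a lognormal regression on the strictly positive amounts; covariates and coefficients are synthetic, not those estimated from the Italian database.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 2000
complexity = rng.uniform(0, 1, n)                    # e.g., complexity of the operation
zero_prob = 1 / (1 + np.exp(-(0.5 - 2.0 * complexity)))
is_zero = rng.binomial(1, zero_prob)
log_amount = 9.0 + 1.5 * complexity + 0.8 * rng.standard_normal(n)

X = sm.add_constant(complexity)

# Stage 1: probability that a claim closes with zero payment
stage1 = sm.Logit(is_zero, X).fit(disp=0)

# Stage 2: lognormal regression on the strictly positive amounts
pos = is_zero == 0
stage2 = sm.OLS(log_amount[pos], X[pos]).fit()

# Expected payment for a claim of given complexity combines both stages:
# E[amount] = P(positive) * exp(mu + sigma^2 / 2)
x_new = np.array([[1.0, 0.7]])
p_pos = 1 - stage1.predict(x_new)[0]
mu = stage2.predict(x_new)[0]
sigma2 = stage2.mse_resid
print(f"expected payment: {p_pos * np.exp(mu + sigma2 / 2):,.0f}")
```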

  12. Landslide Hazard Mapping in Rwanda Using Logistic Regression

    NASA Astrophysics Data System (ADS)

    Piller, A.; Anderson, E.; Ballard, H.

    2015-12-01

    Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (i.e. slope, soil type, land cover, etc.) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.

  13. Non-destructive analysis of sensory traits of dry-cured loins by MRI-computer vision techniques and data mining.

    PubMed

    Caballero, Daniel; Antequera, Teresa; Caro, Andrés; Ávila, María Del Mar; G Rodríguez, Pablo; Perez-Palacios, Trinidad

    2017-07-01

Magnetic resonance imaging (MRI) combined with computer vision techniques has been proposed as an alternative or complementary technique for determining the quality parameters of food in a non-destructive way. The aim of this work was to analyze the sensory attributes of dry-cured loins using this technique. For that, different MRI acquisition sequences (spin echo, gradient echo and turbo 3D), algorithms for MRI analysis (GLCM, NGLDM, GLRLM and GLCM-NGLDM-GLRLM) and predictive data mining techniques (multiple linear regression and isotonic regression) were tested. The correlation coefficient (R) and mean absolute error (MAE) were used to validate the prediction results. The combination of spin echo, GLCM and isotonic regression produced the most accurate results. In addition, the MRI data from dry-cured loins seem to be more suitable than the data from fresh loins. The application of predictive data mining techniques to computational texture features from the MRI data of loins enables the determination of the sensory traits of dry-cured loins in a non-destructive way. © 2016 Society of Chemical Industry.
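The isotonic-regression step can be sketched as fitting a monotone mapping from a single texture feature to a sensory score, validated with the same R and MAE statistics the paper uses; feature and score values are synthetic.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(11)
texture = np.sort(rng.uniform(0, 1, 60))             # one GLCM-style texture feature
sensory = 2 + 5 * np.sqrt(texture) + 0.3 * rng.standard_normal(60)  # monotone but non-linear

iso = IsotonicRegression(out_of_bounds="clip")
iso.fit(texture, sensory)

pred = iso.predict(texture)
r = np.corrcoef(pred, sensory)[0, 1]
mae = np.mean(np.abs(pred - sensory))
print(f"R = {r:.3f}, MAE = {mae:.3f}")               # the validation statistics used in the paper
```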

  14. Linear regression metamodeling as a tool to summarize and present simulation model results.

    PubMed

    Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M

    2013-10-01

    Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
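A minimal sketch of the metamodeling recipe: simulate PSA draws of the inputs, run them through a stand-in model, standardize the inputs, and read the regression intercept as the approximate base case and the coefficients as importance measures. The two-parameter "model" is invented for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
n = 10_000                                            # PSA cohorts
cure_rate = rng.beta(20, 80, n)                       # uncertain model inputs
cost_tx = rng.gamma(100, 50, n)

# Stand-in "simulation model" output: net benefit of treatment
net_benefit = 50_000 * cure_rate - cost_tx + 200 * rng.standard_normal(n)

# Standardize inputs so coefficients are comparable across parameters
Z = np.column_stack([(v - v.mean()) / v.std() for v in (cure_rate, cost_tx)])
meta = sm.OLS(net_benefit, sm.add_constant(Z)).fit()

print("intercept (approx. base-case net benefit):", round(meta.params[0]))
print("standardized coefficients (parameter importance):", meta.params[1:].round(1))
```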

  15. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
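Step (3) of this workflow might be sketched as follows, with three of the named predictors and synthetic basins; the fitted probabilities are what would be joined back to basin polygons in the GIS at step (4).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(13)
n_basins = 399
steep_pct = rng.uniform(0, 100, n_basins)            # % of basin with slope > 30%
burn_pct = rng.uniform(0, 100, n_basins)             # % burned at medium/high severity
storm_mm_hr = rng.uniform(1, 40, n_basins)           # average storm intensity

logit = -6 + 0.03 * steep_pct + 0.03 * burn_pct + 0.08 * storm_mm_hr
occurred = rng.binomial(1, 1 / (1 + np.exp(-logit))) # synthetic did/did-not-flow labels

X = np.column_stack([steep_pct, burn_pct, storm_mm_hr])
model = LogisticRegression(max_iter=1000).fit(X, occurred)

# Per-basin probabilities, ready to be joined back to basin polygons in a GIS
prob = model.predict_proba(X)[:, 1]
print("first five basin probabilities:", prob[:5].round(2))
```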

  16. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

    PubMed

    Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

    2016-12-20

Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  17. Multivariable confounding adjustment in distributed data networks without sharing of patient-level data.

    PubMed

    Toh, Sengwee; Reichman, Marsha E; Houstoun, Monika; Ding, Xiao; Fireman, Bruce H; Gravel, Eric; Levenson, Mark; Li, Lingling; Moyneur, Erick; Shoaibi, Azadeh; Zornberg, Gwen; Hennessy, Sean

    2013-11-01

    It is increasingly necessary to analyze data from multiple sources when conducting public health safety surveillance or comparative effectiveness research. However, security, privacy, proprietary, and legal concerns often reduce data holders' willingness to share highly granular information. We describe and compare two approaches that do not require sharing of patient-level information to adjust for confounding in multi-site studies. We estimated the risks of angioedema associated with angiotensin-converting enzyme inhibitors (ACEIs), angiotensin receptor blockers (ARBs), and aliskiren in comparison with beta-blockers within Mini-Sentinel, which has created a distributed data system of 18 health plans. To obtain the adjusted hazard ratios (HRs) and 95% confidence intervals (CIs), we performed (i) a propensity score-stratified case-centered logistic regression analysis, a method identical to a stratified Cox regression analysis but needing only aggregated risk set data, and (ii) an inverse variance-weighted meta-analysis, which requires only the site-specific HR and variance. We also performed simulations to further compare the two methods. Compared with beta-blockers, the adjusted HR was 3.04 (95% CI: 2.81, 3.27) for ACEIs, 1.16 (1.00, 1.34) for ARBs, and 2.85 (1.34, 6.04) for aliskiren in the case-centered analysis. The corresponding HRs were 2.98 (2.76, 3.21), 1.15 (1.00, 1.33), and 2.86 (1.35, 6.04) in the meta-analysis. Simulations suggested that the two methods may produce different results under certain analytic scenarios. The case-centered analysis and the meta-analysis produced similar results without the need to share patient-level data across sites in our empirical study, but may provide different results in other study settings. Copyright © 2013 John Wiley & Sons, Ltd.
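The inverse-variance-weighted meta-analysis needs only each site's log hazard ratio and variance, which is why no patient-level data must leave a site; the five site values below are invented for illustration.

```python
import numpy as np

# Site-specific log hazard ratios and their variances (illustrative values;
# in Mini-Sentinel each of the 18 sites would contribute one pair)
log_hr = np.array([1.10, 1.05, 1.15, 1.20, 0.98])
var = np.array([0.02, 0.05, 0.03, 0.04, 0.06])

w = 1 / var                                          # inverse-variance weights
pooled = np.sum(w * log_hr) / np.sum(w)
se = np.sqrt(1 / np.sum(w))

hr = np.exp(pooled)
ci = np.exp([pooled - 1.96 * se, pooled + 1.96 * se])
print(f"pooled HR = {hr:.2f} (95% CI {ci[0]:.2f}, {ci[1]:.2f})")
```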

  18. Environmental Regulations and Changes in Petroleum Refining Operations (Short-Term Energy Outlook Supplement June 1998)

    EIA Publications

    1998-01-01

    Changes in domestic refining operations are identified and related to the summer Reid vapor pressure (RVP) restrictions and oxygenate blending requirements. This analysis uses published Energy Information Administration survey data and linear regression equations from the Short-Term Integrated Forecasting System (STIFS). The STIFS model is used for producing forecasts appearing in the Short-Term Energy Outlook.

  19. Bark yields of 11-year-old loblolly pine as influenced by competition control and fertilization

    Treesearch

    Allan E. Tiarks; James D. Haywood

    1992-01-01

    Bolts cut from 11-year-old loblolly pines (Pinus taeda L.) were measured to determine the effects of applications of fertilizer and competition control treatments on the amount of pine bark produced. Bark thickness at breast height was not significantly affected by any of the treatments. Regression analysis showed that the dry weight of bark per unit...

  20. Estimating equations estimates of trends

    USGS Publications Warehouse

    Link, W.A.; Sauer, J.R.

    1994-01-01

    The North American Breeding Bird Survey monitors changes in bird populations through time using annual counts at fixed survey sites. The usual method of estimating trends has been to use the logarithm of the counts in a regression analysis. It is contended that this procedure is reasonably satisfactory for more abundant species, but produces biased estimates for less abundant species. An alternative estimation procedure based on estimating equations is presented.

  1. Two-dimensional advective transport in ground-water flow parameter estimation

    USGS Publications Warehouse

    Anderman, E.R.; Hill, M.C.; Poeter, E.P.

    1996-01-01

    Nonlinear regression is useful in ground-water flow parameter estimation, but problems of parameter insensitivity and correlation often exist given commonly available hydraulic-head and head-dependent flow (for example, stream and lake gain or loss) observations. To address this problem, advective-transport observations are added to the ground-water flow, parameter-estimation model MODFLOWP using particle-tracking methods. The resulting model is used to investigate the importance of advective-transport observations relative to head-dependent flow observations when either or both are used in conjunction with hydraulic-head observations in a simulation of the sewage-discharge plume at Otis Air Force Base, Cape Cod, Massachusetts, USA. The analysis procedure for evaluating the probable effect of new observations on the regression results consists of two steps: (1) parameter sensitivities and correlations calculated at initial parameter values are used to assess the model parameterization and expected relative contributions of different types of observations to the regression; and (2) optimal parameter values are estimated by nonlinear regression and evaluated. In the Cape Cod parameter-estimation model, advective-transport observations did not significantly increase the overall parameter sensitivity; however: (1) inclusion of advective-transport observations decreased parameter correlation enough for more unique parameter values to be estimated by the regression; (2) realistic uncertainties in advective-transport observations had a small effect on parameter estimates relative to the precision with which the parameters were estimated; and (3) the regression results and sensitivity analysis provided insight into the dynamics of the ground-water flow system, especially the importance of accurate boundary conditions. In this work, advective-transport observations improved the calibration of the model and the estimation of ground-water flow parameters, and use of regression and related techniques produced significant insight into the physical system.

  2. Velocity structure in long period variable star atmospheres

    NASA Technical Reports Server (NTRS)

    Pilachowski, C.; Wallerstein, G.; Willson, L. A.

    1980-01-01

A regression analysis of the dependence of absorption line velocities on wavelength, line strength, excitation potential, and ionization potential is presented. The method determines the region of formation of the absorption lines for a given data and wavelength region. It is concluded that the scatter frequently found in velocity measurements of absorption lines in long period variables is probably the result of a shock of moderate amplitude located in or near the reversing layer, and that the frequently observed correlations of velocity with excitation and ionization are a result of the velocity gradients produced by this shock in the atmosphere. A simple interpretation of the signs of the coefficients of the regression analysis is presented in terms of line formation preshock, postshock, or across the shock, together with criteria for evaluating the validity of the fit. The amplitude of the reversing-layer shock is estimated from an analysis of a series of plates for four long period variable stars, along with the most probable stellar velocity for these stars.

  3. Stereophotogrammetric Mass Distribution Parameter Determination of the Lower Body Segments for Use in Gait Analysis

    NASA Astrophysics Data System (ADS)

    Sheffer, Daniel B.; Schaer, Alex R.; Baumann, Juerg U.

    1989-04-01

    Inclusion of mass distribution information in biomechanical analysis of motion is a requirement for the accurate calculation of external moments and forces acting on the segmental joints during locomotion. Regression equations produced from a variety of photogrammetric, anthropometric, and cadaveric studies have been developed and espoused in the literature. Because of limitations in the accuracy of inertial properties predicted by regression equations developed on one population and then applied to a different study population, a measurement technique that accurately defines the shape of each individual subject is desirable. Such individual data acquisition is especially needed when analyzing the gait of subjects whose extremity geometry differs greatly from that considered "normal", or who may possess gross asymmetries in shape between their own contralateral limbs. This study presents the photogrammetric acquisition and data analysis methodology used to assess the inertial tensors of two groups of subjects, one with spastic diplegic cerebral palsy and the other considered normal.

  4. Short wavelength Raman spectroscopy applied to the discrimination and characterization of three cultivars of extra virgin olive oils in different maturation stages.

    PubMed

    Gouvinhas, Irene; Machado, Nelson; Carvalho, Teresa; de Almeida, José M M M; Barros, Ana I R N A

    2015-01-01

    Extra virgin olive oils produced from three cultivars at different maturation stages were characterized using Raman spectroscopy. Chemometric methods (principal component analysis, discriminant analysis, principal component regression, and partial least squares regression) applied to the Raman spectral data were used to evaluate and quantify the statistical differences between cultivars and their ripening process. The models for predicting the peroxide value and free acidity of olive oils showed good calibration and prediction values and presented high coefficients of determination (>0.933). Both the R², and the correlation equations between the measured chemical parameters and the values predicted by each approach, are presented; these comprise both PCR and PLS, used to assess SNV-normalized Raman data as well as first and second derivatives of the spectra. This study demonstrates that a combination of Raman spectroscopy with multivariate analysis methods can be useful for rapidly predicting the chemical characteristics of olive oil during the maturation process. Copyright © 2014 Elsevier B.V. All rights reserved.
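
    As a concrete illustration of the chemometric pipeline described above, the sketch below fits a PLS regression from SNV-normalized spectra to a chemical parameter and reports cross-validated R². All data are synthetic stand-ins for the Raman spectra and reference free-acidity values.

```python
# A minimal sketch, assuming synthetic spectra in place of the Raman data.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 500))            # 60 spectra, 500 wavenumber channels

# SNV normalization: center and scale each spectrum individually
X_snv = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

beta = np.zeros(500)
beta[100:110] = 1.0                       # a band that tracks the analyte
y = X_snv @ beta + rng.normal(scale=0.5, size=60)   # e.g., free acidity

pls = PLSRegression(n_components=5)
r2 = cross_val_score(pls, X_snv, y, cv=5, scoring="r2")
print("mean cross-validated R2:", r2.mean().round(3))
```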

  5. An analysis of the magnitude and frequency of floods on Oahu, Hawaii

    USGS Publications Warehouse

    Nakahara, R.H.

    1980-01-01

    An analysis of available peak-flow data for the island of Oahu, Hawaii, was made by using multiple regression techniques which related flood-frequency data to basin and climatic characteristics for 74 gaging stations on Oahu. In the analysis, several different groupings of stations were investigated, including divisions by geographic location and size of drainage area. The grouping consisting of two leeward divisions and one windward division produced the best results. Drainage basins ranged in area from 0.03 to 45.7 square miles. Equations relating flood magnitudes of selected frequencies to basin characteristics were developed for the three divisions of Oahu. These equations can be used to estimate the magnitude and frequency of floods for any site, gaged or ungaged, for any desired recurrence interval from 2 to 100 years. Data on basin characteristics, flood magnitudes for various recurrence intervals from individual station-frequency curves, and computed flood magnitudes by use of the regression equation are tabulated to provide the needed data. (USGS)

  6. XAP, a program for deconvolution and analysis of complex X-ray spectra

    USGS Publications Warehouse

    Quick, James E.; Haleby, Abdul Malik

    1989-01-01

    The X-ray analysis program (XAP) is a spectral-deconvolution program written in BASIC and specifically designed to analyze complex spectra produced by energy-dispersive X-ray analytical systems (EDS). XAP compensates for spectrometer drift, utilizes digital filtering to remove background from spectra, and solves for element abundances by least-squares, multiple-regression analysis. Rather than base analyses on only a few channels, broad spectral regions of a sample are reconstructed from standard reference spectra. The effects of this approach are (1) elimination of tedious spectrometer adjustments, (2) removal of background independent of sample composition, and (3) automatic correction for peak overlaps. Although the program was written specifically to operate a KEVEX 7000 X-ray fluorescence analytical system, it could be adapted (with minor modifications) to analyze spectra produced by scanning electron microscopes and electron microprobes, and X-ray diffractometer patterns obtained from whole-rock powders.
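
    The core numerical step described here, reconstructing a broad spectral region as a linear combination of standard reference spectra by least squares, can be sketched in a few lines. This is not XAP's BASIC code; the reference spectra and abundances below are synthetic.

```python
# A minimal sketch of least-squares spectral deconvolution, with synthetic
# reference spectra standing in for measured standards.
import numpy as np

rng = np.random.default_rng(2)
n_channels = 1024

# Columns: background-stripped reference spectra of pure standards
references = np.abs(rng.normal(size=(n_channels, 3)))      # e.g., Fe, Ca, Si
true_abundance = np.array([0.6, 0.3, 0.1])
sample = references @ true_abundance + rng.normal(scale=0.01, size=n_channels)

# Regress the whole sample spectrum on the references; fitting broad regions
# at once is what makes the correction for peak overlaps automatic
abundance, *_ = np.linalg.lstsq(references, sample, rcond=None)
print("estimated abundances:", abundance.round(3))
```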

  7. Chicken barn climate and hazardous volatile compounds control using simple linear regression and PID

    NASA Astrophysics Data System (ADS)

    Abdullah, A. H.; Bakar, M. A. A.; Shukor, S. A. A.; Saad, F. S. A.; Kamis, M. S.; Mustafa, M. H.; Khalid, N. S.

    2016-07-01

    The hazardous volatile compounds from chicken manure in a chicken barn are a potential health threat to farm animals and workers. The ammonia (NH3) and hydrogen sulphide (H2S) produced in a chicken barn are influenced by climate changes. An electronic nose (e-nose) was used to sample the barn's air, temperature, and humidity data. Simple linear regression was used to identify the correlations between temperature and humidity, humidity and ammonia, and ammonia and hydrogen sulphide. MATLAB Simulink software was used to analyze the sampled data with a PID controller. Results show that a PID controller tuned with the Ziegler-Nichols technique improves the system's ability to control the climate in the chicken barn.
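
    The control side of the abstract can be illustrated with a bare-bones discrete PID loop; the gains and the first-order "barn" model below are invented for the sketch and are not the Ziegler-Nichols values from the study.

```python
# A minimal PID sketch against a toy barn-temperature model; all constants
# are illustrative assumptions, not values from the paper.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = PID(kp=2.0, ki=0.1, kd=0.5, dt=1.0)
temp = 32.0                                   # initial barn temperature, deg C
for _ in range(60):
    u = pid.update(setpoint=26.0, measured=temp)
    temp += -0.1 * (temp - 30.0) + 0.05 * u   # toy plant: drift plus actuation
print("temperature after 60 steps:", round(temp, 2))
```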

  8. Susceptibility assessment of earthquake-triggered landslides in El Salvador using logistic regression

    NASA Astrophysics Data System (ADS)

    García-Rodríguez, M. J.; Malpica, J. A.; Benito, B.; Díaz, M.

    2008-03-01

    This work has evaluated the probability of earthquake-triggered landslide occurrence in the whole of El Salvador, with a Geographic Information System (GIS) and a logistic regression model. Slope gradient, elevation, aspect, mean annual precipitation, lithology, land use, and terrain roughness are the predictor variables used to determine the dependent variable of occurrence or non-occurrence of landslides within an individual grid cell. The results illustrate the importance of terrain roughness and soil type as key factors within the model — using only these two variables the analysis returned a significance level of 89.4%. The results obtained from the model within the GIS were then used to produce a map of relative landslide susceptibility.
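
    The grid-cell formulation above maps directly onto a standard logistic regression fit; the sketch below uses synthetic predictor layers in place of the El Salvador GIS data and emits a per-cell susceptibility probability.

```python
# A minimal sketch, assuming synthetic grid-cell predictors rather than the
# study's GIS layers.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n_cells = 5000
# Columns: slope gradient, elevation, terrain roughness, precipitation
X = rng.normal(size=(n_cells, 4))
logit = 1.5 * X[:, 0] + 2.0 * X[:, 2] - 1.0             # roughness, slope dominate
y = rng.random(n_cells) < 1.0 / (1.0 + np.exp(-logit))  # landslide / no landslide

model = LogisticRegression().fit(X, y)
susceptibility = model.predict_proba(X)[:, 1]           # values behind the map
print("fitted coefficients:", model.coef_.round(2))
```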

  9. Interlaboratory comparability, bias, and precision for four laboratories measuring constituents in precipitation, November 1982-August 1983

    USGS Publications Warehouse

    Brooks, M.H.; Schroder, L.J.; Malo, B.A.

    1985-01-01

    Four laboratories were evaluated in their analysis of identical natural and simulated precipitation water samples. Interlaboratory comparability was evaluated using analysis of variance coupled with Duncan's multiple range test, and linear-regression models describing the relations between individual laboratory analytical results for natural precipitation samples. Results of the statistical analyses indicate that certain pairs of laboratories produce different results when analyzing identical samples. Analyte bias for each laboratory was examined using analysis of variance coupled with Duncan's multiple range test on data produced by the laboratories from the analysis of identical simulated precipitation samples. Bias for a given analyte produced by a single laboratory is indicated when the laboratory mean for that analyte is significantly different from the mean of the most-probable analyte concentrations in the simulated precipitation samples. Ion-chromatographic methods for the determination of chloride, nitrate, and sulfate were compared with the colorimetric methods also in use during the study period. Comparisons were made using analysis of variance coupled with Duncan's multiple range test for means produced by the two methods. Analyte precision for each laboratory was estimated by calculating a pooled variance for each analyte. Estimated analyte precisions were compared using F-tests, and differences in analyte precisions for laboratory pairs are reported. (USGS)

  10. An application of robust ridge regression model in the presence of outliers to real data problem

    NASA Astrophysics Data System (ADS)

    Shariff, N. S. Md.; Ferdaos, N. A.

    2017-09-01

    Multicollinearity and outliers often lead to inconsistent and unreliable parameter estimates in regression analysis. The well-known procedure that is robust to the multicollinearity problem is the ridge regression method; this method, however, is believed to be affected by the presence of outliers. The combination of GM-estimation and the ridge parameter, which is robust to both problems, is of interest in this study. As such, both techniques are employed to investigate the relationship between stock market prices and macroeconomic variables in Malaysia, a data set suspected to involve both multicollinearity and outlier problems. Four macroeconomic factors were selected for this study: the Consumer Price Index (CPI), Gross Domestic Product (GDP), Base Lending Rate (BLR), and Money Supply (M1). The results demonstrate that the proposed procedure is able to produce reliable results in the presence of multicollinearity and outliers in real data.

  11. Characterization of the spatial variability of soil available zinc at various sampling densities using grouped soil type information.

    PubMed

    Song, Xiao-Dong; Zhang, Gan-Lin; Liu, Feng; Li, De-Cheng; Zhao, Yu-Guo

    2016-11-01

    Anthropogenic activities and natural processes introduce high uncertainty into the spatial variation modeling of soil available zinc (AZn) in plain river network regions. Four datasets with different sampling densities were drawn from the Qiaocheng district of Bozhou City, China. Differences in AZn concentrations among soil types were analyzed by principal component analysis (PCA). Since stationarity was not indicated, and the effective ranges of the four datasets were larger than the sampling extent (about 400 m), two investigation tools, the F3 test and the stationarity index (SI), were employed to test for local non-stationarity. The geographically weighted regression (GWR) technique was used to describe the spatial heterogeneity of AZn concentrations under the non-stationarity assumption. GWR based on grouped soil type information (GWRG for short) was proposed to benefit the local modeling of soil AZn within each soil-landscape unit. For reference, a multiple linear regression (MLR) model, a global regression technique, was also employed, incorporating the same predictors as the GWR models. Validation results based on 100 realizations demonstrated that GWRG outperformed MLR and produced similar or better accuracy than the GWR approach. Moreover, GWRG can generate better soil maps than GWR for limited soil data. Two-sample t tests of the produced soil maps also confirmed significantly different means. Variogram analysis of the model residuals exhibited weak spatial correlation, arguing against the use of hybrid kriging techniques. As a heuristic statistical method, GWRG was beneficial in this study and is potentially applicable to other soil properties.

  12. In vitro evaluation of Augmentin by broth microdilution and disk diffusion susceptibility testing: regression analysis, tentative interpretive criteria, and quality control limits.

    PubMed Central

    Fuchs, P C; Barry, A L; Thornsberry, C; Gavan, T L; Jones, R N

    1983-01-01

    Augmentin (Beecham Laboratories, Bristol, Tenn.), a combination drug consisting of two parts amoxicillin to one part clavulanic acid, a potent beta-lactamase inhibitor, was evaluated in vitro in comparison with ampicillin or amoxicillin or both for its inhibitory and bactericidal activities against selected clinical isolates. Regression analysis was performed and tentative disk diffusion susceptibility breakpoints were determined. A multicenter performance study of the disk diffusion test was conducted with three quality control organisms to determine tentative quality control limits. All methicillin-susceptible staphylococci and Haemophilus influenzae isolates were susceptible to Augmentin, although the minimal inhibitory concentrations for beta-lactamase-producing strains of both groups were, on the average, fourfold higher than those for enzyme-negative strains. Among the Enterobacteriaceae, Augmentin exhibited significantly greater activity than did ampicillin against Klebsiella pneumoniae, Citrobacter diversus, Proteus vulgaris, and about one-third of the Escherichia coli strains tested. Bactericidal activity usually occurred at the minimal inhibitory concentration. There was a slight inoculum concentration effect on the Augmentin minimal inhibitory concentrations. On the basis of regression and error rate-bounded analyses, the suggested interpretive disk diffusion susceptibility breakpoints for Augmentin are: susceptible, greater than or equal to 18 mm; resistant, less than or equal to 13 mm (gram-negative bacilli); and susceptible, greater than or equal to 20 mm (staphylococci and H. influenzae). The use of a beta-lactamase-producing organism, such as E. coli Beecham 1532, is recommended for quality assurance of Augmentin susceptibility testing. PMID:6625554

  13. Prediction of performance on the RCMP physical ability requirement evaluation.

    PubMed

    Stanish, H I; Wood, T M; Campagna, P

    1999-08-01

    The Royal Canadian Mounted Police use the Physical Ability Requirement Evaluation (PARE) for screening applicants. The purposes of this investigation were to identify those field tests of physical fitness that were associated with PARE performance and to determine which most accurately classified successful and unsuccessful PARE performers. The participants were 27 female and 21 male volunteers. Testing included measures of aerobic power, anaerobic power, agility, muscular strength, muscular endurance, and body composition. Multiple regression analysis revealed a three-variable model for males (70-lb bench press, standing long jump, and agility) explaining 79% of the variability in PARE time, whereas a one-variable model (agility) explained 43% of the variability for females. Analysis of the classification accuracy of the males' data was precluded because 91% of the males passed the PARE. Classification accuracy of the females' data, using logistic regression, produced a two-variable model (agility, 1.5-mile endurance run) with 93% overall classification accuracy.

  14. Comparison of partial least squares and random forests for evaluating relationship between phenolics and bioactivities of Neptunia oleracea.

    PubMed

    Lee, Soo Yee; Mediani, Ahmed; Maulidiani, Maulidiani; Khatib, Alfi; Ismail, Intan Safinar; Zawawi, Norhasnida; Abas, Faridah

    2018-01-01

    Neptunia oleracea is a plant consumed as a vegetable and which has been used as a folk remedy for several diseases. Herein, two regression models (partial least squares, PLS; and random forest, RF) in a metabolomics approach were compared and applied to the evaluation of the relationship between phenolics and bioactivities of N. oleracea. In addition, the effects of different extraction conditions on the phenolic constituents were assessed by pattern recognition analysis. Comparison of the PLS and RF showed that RF exhibited poorer generalization and hence poorer predictive performance. Both the regression coefficient of PLS and the variable importance of RF revealed that quercetin and kaempferol derivatives, caffeic acid and vitexin-2-O-rhamnoside were significant towards the tested bioactivities. Furthermore, principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) results showed that sonication and absolute ethanol are the preferable extraction method and ethanol ratio, respectively, to produce N. oleracea extracts with high phenolic levels and therefore high DPPH scavenging and α-glucosidase inhibitory activities. Both PLS and RF are useful regression models in metabolomics studies. This work provides insight into the performance of different multivariate data analysis tools and the effects of different extraction conditions on the extraction of desired phenolics from plants. © 2017 Society of Chemical Industry.
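
    A PLS-versus-RF comparison of the kind reported above typically follows a standard cross-validation pattern, sketched below with a synthetic matrix standing in for the phenolics/bioactivity data.

```python
# A minimal sketch of a PLS vs. random-forest comparison by cross-validation,
# with synthetic data standing in for the study's matrix.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X = rng.normal(size=(80, 200))               # 80 extracts, 200 variables
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=80)  # e.g., DPPH activity

for name, model in [("PLS", PLSRegression(n_components=5)),
                    ("RF ", RandomForestRegressor(n_estimators=200, random_state=0))]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(name, "mean cross-validated R2:", r2.mean().round(3))
```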

  15. Predictive equations for the estimation of body size in seals and sea lions (Carnivora: Pinnipedia)

    PubMed Central

    Churchill, Morgan; Clementz, Mark T; Kohno, Naoki

    2014-01-01

    Body size plays an important role in pinniped ecology and life history. However, body size data are often absent for historical, archaeological, and fossil specimens. To estimate the body size of pinnipeds (seals, sea lions, and walruses) for today and the past, we used 14 commonly preserved cranial measurements to develop sets of single-variable and multivariate predictive equations for pinniped body mass and total length. Principal components analysis (PCA) was used to test whether separate family-specific regressions were more appropriate than single predictive equations for Pinnipedia. The influence of phylogeny was tested with phylogenetic independent contrasts (PIC). The accuracy of these regressions was then assessed using a combination of the coefficient of determination, percent prediction error, and standard error of estimation. Three different methods of multivariate analysis were examined: bidirectional stepwise model selection using the Akaike information criterion; all-subsets model selection using the Bayesian information criterion (BIC); and partial least squares regression. The PCA showed clear discrimination between Otariidae (fur seals and sea lions) and Phocidae (earless seals) for the 14 measurements, indicating the need for family-specific regression equations. The PIC analysis found that phylogeny had a minor influence on the relationship between morphological variables and body size. The regressions for total length were more accurate than those for body mass, and equations specific to Otariidae were more accurate than those for Phocidae. Of the three multivariate methods, the all-subsets approach required the fewest variables to estimate body size accurately. We then used the single-variable predictive equations and the all-subsets approach to estimate the body size of two recently extinct pinniped taxa, the Caribbean monk seal (Monachus tropicalis) and the Japanese sea lion (Zalophus japonicus). Body size estimates using single-variable regressions generally under- or over-estimated body size; however, the all-subsets regressions produced body size estimates that were close to the historically recorded body lengths for these two species. This indicates that the all-subsets regression equations developed in this study can estimate body size accurately. PMID:24916814

  16. Expression profiling reveals distinct sets of genes altered during induction and regression of cardiac hypertrophy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Friddle, Carl J; Koga, Teiichiro; Rubin, Edward M.

    2000-03-15

    While cardiac hypertrophy has been the subject of intensive investigation, regression of hypertrophy has been significantly less studied, precluding large-scale analysis of the relationship between these processes. In the present study, using pharmacological models of hypertrophy in mice, expression profiling was performed with fragments of more than 3,000 genes to characterize and contrast expression changes during induction and regression of hypertrophy. Administration of angiotensin II and isoproterenol by osmotic minipump produced increases in heart weight (15% and 40%, respectively) that returned to pre-induction size following drug withdrawal. From multiple expression analyses of left ventricular RNA isolated at daily time-points during cardiac hypertrophy and regression, we identified sets of genes whose expression was altered at specific stages of this process. While confirming the participation of 25 genes or pathways previously known to be altered by hypertrophy, a larger set of 30 genes was identified whose expression had not previously been associated with cardiac hypertrophy or regression. Of the 55 genes that showed reproducible changes during the time course of induction and regression, 32 genes were altered only during induction and 8 were altered only during regression. This study identified both known and novel genes whose expression is affected at different stages of cardiac hypertrophy and regression and demonstrates that cardiac remodeling during regression utilizes a set of genes that are distinct from those used during induction of hypertrophy.

  17. Ridge: a computer program for calculating ridge regression estimates

    Treesearch

    Donald E. Hilt; Donald W. Seegrist

    1977-01-01

    Least-squares coefficients for multiple-regression models may be unstable when the independent variables are highly correlated. Ridge regression is a biased estimation procedure that produces stable estimates of the coefficients. Ridge regression is discussed, and a computer program for calculating the ridge coefficients is presented.
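
    The estimator such a program computes is the closed-form ridge solution beta(k) = (X'X + kI)^(-1) X'y over a grid of biasing constants k. The sketch below shows the coefficients stabilizing as k grows, using synthetic, highly correlated predictors; it is an illustration of the estimator, not the Fortran-era program itself.

```python
# A minimal sketch of the ridge trace, assuming synthetic collinear data.
import numpy as np

rng = np.random.default_rng(5)
n, p = 50, 3
base = rng.normal(size=n)
# Three nearly identical predictors -> unstable least-squares coefficients
X = np.column_stack([base + rng.normal(scale=0.05, size=n) for _ in range(p)])
X = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize, as is customary
y = X @ np.ones(p) + rng.normal(size=n)

for k in [0.0, 0.01, 0.1, 1.0]:
    beta = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
    print(f"k = {k:<4}  coefficients: {beta.round(2)}")
```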

  18. Near-infrared spectral image analysis of pork marbling based on Gabor filter and wide line detector techniques.

    PubMed

    Huang, Hui; Liu, Li; Ngadi, Michael O; Gariépy, Claude; Prasher, Shiv O

    2014-01-01

    Marbling is an important quality attribute of pork. Detection of pork marbling usually involves subjective scoring, which raises efficiency costs for the processor. In this study, the ability to predict pork marbling using near-infrared (NIR) hyperspectral imaging (900-1700 nm) and appropriate image processing techniques was studied. Near-infrared images were collected from pork after marbling evaluation according to the current standard chart from the National Pork Producers Council. Image analysis techniques (Gabor filter, wide line detector, and spectral averaging) were applied to extract texture, line, and spectral features, respectively, from the NIR images of pork. Samples were grouped into calibration and validation sets. Wavelength selection was performed on the calibration set by a stepwise regression procedure. Prediction models of pork marbling scores were built using multiple linear regression based on derivatives of the mean spectra and line features at key wavelengths. The results showed that the derivatives of both texture and spectral features produced good results, with validation correlation coefficients of 0.90 and 0.86, respectively, using wavelengths of 961, 1186, and 1220 nm. The results revealed the great potential of the Gabor filter for analyzing NIR images of pork for effective and efficient objective evaluation of pork marbling.

  19. The contextual effects of social capital on health: a cross-national instrumental variable analysis.

    PubMed

    Kim, Daniel; Baum, Christopher F; Ganz, Michael L; Subramanian, S V; Kawachi, Ichiro

    2011-12-01

    Past research on the associations between area-level/contextual social capital and health has produced conflicting evidence. However, interpreting this rapidly growing literature is difficult because estimates using conventional regression are prone to major sources of bias including residual confounding and reverse causation. Instrumental variable (IV) analysis can reduce such bias. Using data on up to 167,344 adults in 64 nations in the European and World Values Surveys and applying IV and ordinary least squares (OLS) regression, we estimated the contextual effects of country-level social trust on individual self-rated health. We further explored whether these associations varied by gender and individual levels of trust. Using OLS regression, we found higher average country-level trust to be associated with better self-rated health in both women and men. Instrumental variable analysis yielded qualitatively similar results, although the estimates were more than double in size in both sexes when country population density and corruption were used as instruments. The estimated health effects of raising the percentage of a country's population that trusts others by 10 percentage points were at least as large as the estimated health effects of an individual developing trust in others. These findings were robust to alternative model specifications and instruments. Conventional regression and to a lesser extent IV analysis suggested that these associations are more salient in women and in women reporting social trust. In a large cross-national study, our findings, including those using instrumental variables, support the presence of beneficial effects of higher country-level trust on self-rated health. Previous findings for contextual social capital using traditional regression may have underestimated the true associations. Given the close linkages between self-rated health and all-cause mortality, the public health gains from raising social capital within and across countries may be large. Copyright © 2011 Elsevier Ltd. All rights reserved.
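
    The IV logic above can be sketched as explicit two-stage least squares with a single instrument. The variables below are synthetic stand-ins for trust, health, and an instrument, and the confounder is built in so the OLS bias is visible; this is not the study's estimation code.

```python
# A minimal 2SLS sketch with synthetic data; point estimates only (the naive
# second-stage standard errors would need correction in practice).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 10000
z = rng.normal(size=n)                      # instrument, e.g., population density
u = rng.normal(size=n)                      # unobserved confounder
trust = 0.8 * z + u + rng.normal(size=n)
health = 0.5 * trust - u + rng.normal(size=n)   # true causal effect: 0.5

ols = sm.OLS(health, sm.add_constant(trust)).fit()          # biased by u

stage1 = sm.OLS(trust, sm.add_constant(z)).fit()            # exposure on instrument
stage2 = sm.OLS(health, sm.add_constant(stage1.fittedvalues)).fit()

print("OLS: ", ols.params[1].round(3), "  2SLS:", stage2.params[1].round(3))
```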

  20. Accounting for standard errors of vision-specific latent trait in regression models.

    PubMed

    Wong, Wan Ling; Li, Xiang; Li, Jialiang; Wong, Tien Yin; Cheng, Ching-Yu; Lamoureux, Ecosse L

    2014-07-11

    To demonstrate the effectiveness of a Hierarchical Bayesian (HB) approach in a modeling framework for association effects that accounts for the SEs of vision-specific latent traits assessed using Rasch analysis. A systematic literature review was conducted in four major ophthalmic journals to evaluate Rasch analysis performed on vision-specific instruments. The HB approach was used to synthesize the Rasch model and the multiple linear regression model for the assessment of association effects related to vision-specific latent traits. This novel HB one-stage "joint-analysis" approach allows all model parameters to be estimated simultaneously; its effectiveness was compared in our simulation study with the frequently used two-stage "separate-analysis" approach (Rasch analysis followed by traditional statistical analyses without adjustment for the SE of the latent trait). Sixty-six reviewed articles performed evaluation and validation of vision-specific instruments using Rasch analysis, and 86.4% (n = 57) performed further statistical analyses on the Rasch-scaled data using traditional statistical methods; none took into consideration the SEs of the estimated Rasch-scaled scores. On real data, the two models differed in effect size estimates and in the identification of "independent risk factors." Simulation results showed that the proposed HB one-stage "joint-analysis" approach produces greater accuracy (an average 5-fold decrease in bias) with comparable power and precision in the estimation of associations when compared with the frequently used two-stage "separate-analysis" procedure, despite accounting for greater uncertainty due to the latent trait. Association analyses of patient-reported data scored using Rasch analysis techniques do not typically take into account the SE of the latent trait. The HB one-stage "joint-analysis" is a better approach, producing accurate effect size estimates and information about the independent association of exposure variables with vision-specific latent traits. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  1. A Model Comparison for Count Data with a Positively Skewed Distribution with an Application to the Number of University Mathematics Courses Completed

    ERIC Educational Resources Information Center

    Liou, Pey-Yan

    2009-01-01

    The current study examines three regression models: OLS (ordinary least square) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…
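
    All three models in the comparison can be fit with standard tooling; the sketch below uses synthetic overdispersed counts, not the course-completion data from the study.

```python
# A minimal sketch fitting OLS, Poisson, and negative binomial models to one
# synthetic, positively skewed count outcome.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 500
x = rng.normal(size=n)
X = sm.add_constant(x)
mu = np.exp(0.5 + 0.3 * x)
counts = rng.negative_binomial(n=2, p=2.0 / (2.0 + mu))   # overdispersed, mean mu

fits = {"OLS": sm.OLS(counts, X).fit(),
        "Poisson": sm.GLM(counts, X, family=sm.families.Poisson()).fit(),
        "NegBin": sm.GLM(counts, X, family=sm.families.NegativeBinomial()).fit()}
for name, fit in fits.items():
    print(f"{name:8s} slope: {fit.params[1]:.3f}")
```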

  2. Screening and clustering of sparse regressions with finite non-Gaussian mixtures.

    PubMed

    Zhang, Jian

    2017-06-01

    This article proposes a method to address the problem that can arise when covariates in a regression setting are not Gaussian, which may give rise to approximately mixture-distributed errors, or when a true mixture of regressions produced the data. The method begins with non-Gaussian mixture-based marginal variable screening, followed by fitting a full but relatively smaller mixture regression model to the selected data with the help of a new penalization scheme. Under certain regularity conditions, the new screening procedure is shown to possess a sure screening property even when the population is heterogeneous. We further prove that there exists an elbow point in the associated scree plot which results in a consistent estimator of the set of active covariates in the model. By simulations, we demonstrate that the new procedure can substantially improve the performance of the existing procedures in the context of variable screening and data clustering. By applying the proposed procedure to motif data analysis in molecular biology, we demonstrate that the new method holds promise in practice. © 2016, The International Biometric Society.

  3. Impact of multicollinearity on small sample hydrologic regression models

    NASA Astrophysics Data System (ADS)

    Kroll, Charles N.; Song, Peter

    2013-06-01

    Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how best to address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed, since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
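
    Of the four techniques compared, VIF screening is the simplest to show: compute each explanatory variable's variance inflation factor and drop offenders before the OLS fit. The sketch uses synthetic basin characteristics, one pair nearly collinear; the threshold of 10 is a common rule of thumb, not a value from the paper.

```python
# A minimal VIF-screening sketch with synthetic, partly collinear predictors.
import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(8)
n = 40                                            # small sample, as in the study
area = rng.normal(size=n)
precip = 0.9 * area + 0.1 * rng.normal(size=n)    # nearly collinear with area
slope = rng.normal(size=n)
X = np.column_stack([area, precip, slope])

for i, name in enumerate(["area", "precip", "slope"]):
    vif = variance_inflation_factor(X, i)
    flag = "  <- candidate to drop (VIF > 10)" if vif > 10 else ""
    print(f"{name:7s} VIF = {vif:8.1f}{flag}")
```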

  4. Test anxiety and academic performance in chiropractic students.

    PubMed

    Zhang, Niu; Henderson, Charles N R

    2014-01-01

    Objective: We assessed the level of students' test anxiety, and the relationship between test anxiety and academic performance. Methods: We recruited 166 third-quarter students. The Test Anxiety Inventory (TAI) was administered to all participants. Total scores from written examinations and objective structured clinical examinations (OSCEs) were used as response variables. Results: Multiple regression analysis shows that there was a modest, but statistically significant negative correlation between TAI scores and written exam scores, but not OSCE scores. Worry and emotionality were the best predictive models for written exam scores. Mean total anxiety and emotionality scores for females were significantly higher than those for males, but not worry scores. Conclusion: Moderate-to-high test anxiety was observed in 85% of the chiropractic students examined. However, total test anxiety, as measured by the TAI score, was a very weak predictive model for written exam performance. Multiple regression analysis demonstrated that replacing total anxiety (TAI) with worry and emotionality (TAI subscales) produces a much more effective predictive model of written exam performance. Sex, age, highest current academic degree, and ethnicity contributed little additional predictive power in either regression model. Moreover, TAI scores were not found to be statistically significant predictors of physical exam skill performance, as measured by OSCEs.

  5. Classification of Effective Soil Depth by Using Multinomial Logistic Regression Analysis

    NASA Astrophysics Data System (ADS)

    Chang, C. H.; Chan, H. C.; Chen, B. A.

    2016-12-01

    Classification of effective soil depth is a key task in determining slopeland utilization limitations in Taiwan. The "Slopeland Conservation and Utilization Act" categorizes slopeland into agricultural and husbandry land, land suitable for forestry, and land for enhanced conservation according to factors that include average slope, effective soil depth, soil erosion, and parent rock. Site investigation of effective soil depth, however, requires extensive and costly field work. This research aimed to classify effective soil depth using multinomial logistic regression on environmental factors. The Wen-Shui Watershed, located in central Taiwan, was selected as the study area. The multinomial logistic regression analysis was performed with the assistance of a Geographic Information System (GIS). Effective soil depth was categorized into four levels: deeper, deep, shallow, and shallower. The environmental factors of slope, aspect, digital elevation model (DEM), curvature, and normalized difference vegetation index (NDVI) were selected for classifying soil depth. An error matrix was then used to assess model accuracy. The results showed an overall accuracy of 75%. Finally, a map of effective soil depth was produced to help planners and decision makers determine slopeland utilization limitations in the study area.
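
    The classification step translates directly into a multinomial logistic fit plus an error matrix; the terrain covariates and depth classes below are synthetic stand-ins for the watershed data.

```python
# A minimal sketch of four-class soil-depth classification with multinomial
# logistic regression, using synthetic terrain covariates.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(9)
n = 2000
X = rng.normal(size=(n, 5))          # slope, aspect, elevation, curvature, NDVI
scores = X @ rng.normal(size=(5, 4)) # one latent score per depth class
y = scores.argmax(axis=1)            # 0..3: shallower, shallow, deep, deeper

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)   # multinomial for 4 classes

print("overall accuracy:", round(accuracy_score(y_te, clf.predict(X_te)), 3))
print(confusion_matrix(y_te, clf.predict(X_te)))           # the error matrix
```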

  6. Development of Super-Ensemble techniques for ocean analyses: the Mediterranean Sea case

    NASA Astrophysics Data System (ADS)

    Pistoia, Jenny; Pinardi, Nadia; Oddo, Paolo; Collins, Matthew; Korres, Gerasimos; Drillet, Yann

    2017-04-01

    Short-term ocean analyses of sea surface temperature (SST) in the Mediterranean Sea can be improved by a statistical post-processing technique called super-ensemble. This technique consists of a multi-linear regression algorithm applied to a Multi-Physics Multi-Model Super-Ensemble (MMSE) dataset: a collection of different operational forecasting analyses together with ad-hoc simulations produced by modifying selected numerical model parameterizations. A new linear regression algorithm based on Empirical Orthogonal Function filtering techniques is capable of preventing overfitting problems, although the best performance is achieved when correlation is added to the super-ensemble structure through a simple spatial filter applied after the linear regression. Our outcomes show that super-ensemble performance depends on the selection of an unbiased operator and on the length of the learning period, but the quality of the generating MMSE dataset has the largest impact on the MMSE analysis Root Mean Square Error (RMSE), evaluated with respect to observed satellite SST. The lowest RMSE estimates result from the following choices: a 15-day training period, an overconfident MMSE dataset (a subset with the higher-quality ensemble members), and the least squares algorithm filtered a posteriori.
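
    At its core the super-ensemble is a regression of observations on ensemble members over a training window, after which the learned weights combine new forecasts. The sketch below uses a toy SST series and invented member biases; the EOF filtering and spatial post-filter from the paper are omitted.

```python
# A minimal super-ensemble sketch: learn regression weights over a training
# window, then combine the members' next forecasts. All data are synthetic.
import numpy as np

rng = np.random.default_rng(10)
n_train, n_members = 15, 6                     # 15-day learning period
truth = 20 + np.cumsum(rng.normal(scale=0.2, size=n_train))        # observed SST
biases = rng.normal(scale=1.0, size=n_members)                     # member biases
members = truth[:, None] + biases + rng.normal(scale=0.3, size=(n_train, n_members))

# Multi-linear regression of observations on members; the intercept absorbs
# the common bias
A = np.column_stack([np.ones(n_train), members])
weights, *_ = np.linalg.lstsq(A, truth, rcond=None)

next_truth = truth[-1] + rng.normal(scale=0.2)
next_members = next_truth + biases + rng.normal(scale=0.3, size=n_members)
sse = weights[0] + next_members @ weights[1:]
print(f"super-ensemble: {sse:.2f}  truth: {next_truth:.2f}")
```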

  7. Relationships among body weight, joint moments generated during functional activities, and hip bone mass in older adults

    PubMed Central

    Wang, Man-Ying; Flanagan, Sean P.; Song, Joo-Eun; Greendale, Gail A.; Salem, George J.

    2012-01-01

    Objective To investigate the relationships among hip joint moments produced during functional activities and hip bone mass in sedentary older adults. Methods Eight male and eight female older adults (70–85 yr) performed functional activities including walking, chair sit–stand–sit, and stair stepping at a self-selected pace while instrumented for biomechanical analysis. Bone mass at the proximal femur, femoral neck, and greater trochanter was measured by dual-energy X-ray absorptiometry. Three-dimensional hip moments were obtained using a six-camera motion analysis system, force platforms, and inverse dynamics techniques. Pearson's correlation coefficients were employed to assess the relationships among hip bone mass, height, weight, age, and joint moments. Stepwise regression analyses were performed to determine the factors that significantly predicted bone mass using all significant variables identified in the correlation analysis. Findings Hip bone mass was not significantly correlated with moments during activities in men. Conversely, in women, bone mass at all sites was significantly correlated with weight, moments generated with stepping, and moments generated with walking (p < 0.05 to p < 0.001). Regression analysis results further indicated that the overall moments during stepping independently predicted up to 93% of the variability in bone mass at the femoral neck and proximal femur, whereas weight independently predicted up to 92% of the variability in bone mass at the greater trochanter. Interpretation Submaximal loading events produced during functional activities were highly correlated with hip bone mass in sedentary older women, but not men. The findings may ultimately be used to modify exercise prescription for the preservation of bone mass. PMID:16631283

  8. Characterizing Individual Differences in Functional Connectivity Using Dual-Regression and Seed-Based Approaches

    PubMed Central

    Smith, David V.; Utevsky, Amanda V.; Bland, Amy R.; Clement, Nathan; Clithero, John A.; Harsch, Anne E. W.; Carter, R. McKell; Huettel, Scott A.

    2014-01-01

    A central challenge for neuroscience lies in relating inter-individual variability to the functional properties of specific brain regions. Yet, considerable variability exists in the connectivity patterns between different brain areas, potentially producing reliable group differences. Using sex differences as a motivating example, we examined two separate resting-state datasets comprising a total of 188 human participants. Both datasets were decomposed into resting-state networks (RSNs) using a probabilistic spatial independent components analysis (ICA). We estimated voxelwise functional connectivity with these networks using a dual-regression analysis, which characterizes the participant-level spatiotemporal dynamics of each network while controlling for (via multiple regression) the influence of other networks and sources of variability. We found that males and females exhibit distinct patterns of connectivity with multiple RSNs, including both visual and auditory networks and the right frontal-parietal network. These results replicated across both datasets and were not explained by differences in head motion, data quality, brain volume, cortisol levels, or testosterone levels. Importantly, we also demonstrate that dual-regression functional connectivity is better at detecting inter-individual variability than traditional seed-based functional connectivity approaches. Our findings characterize robust—yet frequently ignored—neural differences between males and females, pointing to the necessity of controlling for sex in neuroscience studies of individual differences. Moreover, our results highlight the importance of employing network-based models to study variability in functional connectivity. PMID:24662574
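
    The two regression stages named above have a compact linear-algebra form: regress the group maps against each subject's data to obtain network time courses, then regress those time courses against the same data to obtain subject-specific maps. The sketch uses toy dimensions in place of real rsfMRI data; production pipelines add demeaning, variance normalization, and statistics.

```python
# A minimal dual-regression sketch with toy dimensions and synthetic data.
import numpy as np

rng = np.random.default_rng(11)
n_time, n_vox, n_nets = 100, 500, 4
group_maps = rng.normal(size=(n_vox, n_nets))    # spatial maps from group ICA
subject = rng.normal(size=(n_time, n_vox))       # one subject's rsfMRI data

# Stage 1: all group maps in one multiple regression -> network time courses,
# so each network controls for the others
timecourses, *_ = np.linalg.lstsq(group_maps, subject.T, rcond=None)   # (nets, time)

# Stage 2: time courses in one multiple regression -> subject spatial maps
subject_maps, *_ = np.linalg.lstsq(timecourses.T, subject, rcond=None) # (nets, vox)
print("subject-specific spatial maps:", subject_maps.shape)
```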

  9. Building factorial regression models to explain and predict nitrate concentrations in groundwater under agricultural land

    NASA Astrophysics Data System (ADS)

    Stigter, T. Y.; Ribeiro, L.; Dill, A. M. M. Carvalho

    2008-07-01

    Factorial regression models, based on correspondence analysis, are built to explain the high nitrate concentrations in groundwater beneath an agricultural area in the south of Portugal, exceeding 300 mg/l, as a function of chemical variables, electrical conductivity (EC), land use, and hydrogeological setting. Two important advantages of the proposed methodology are that qualitative parameters can be involved in the regression analysis and that multicollinearity is avoided. Regression is performed on eigenvectors extracted from the data similarity matrix, the first of which clearly reveals the impact of agricultural practices and hydrogeological setting on the groundwater chemistry of the study area. Significant correlation exists between the response variable NO3- and the explanatory variables Ca2+, Cl-, SO42-, depth to water, aquifer media, and land use. Substituting Cl- with EC results in the most accurate regression model for nitrate when the four largest outliers are disregarded (model A). When built solely on land use and hydrogeological setting, the regression model (model B) is less accurate but more interesting from a practical viewpoint, as it is based on easily obtainable data and can be used to predict nitrate concentrations in groundwater in other areas with similar conditions. This is particularly useful for conservative contaminants, where risk and vulnerability assessment methods based on assumed rather than established correlations generally produce erroneous results. The models can also be used to predict the future evolution of nitrate concentrations under the influence of changes in land use or fertilization practices, such as those occurring in compliance with policies like the Nitrates Directive. Model B predicts a 40% decrease in nitrate concentrations in groundwater of the study area when horticulture is replaced by other land uses with much lower fertilization and irrigation rates.

  10. Palus Somni - Anomalies in the correlation of Al/Si X-ray fluorescence intensity ratios and broad-spectrum visible albedos. [lunar surface mineralogy

    NASA Technical Reports Server (NTRS)

    Clark, P. E.; Andre, C. G.; Adler, I.; Weidner, J.; Podwysocki, M.

    1976-01-01

    The positive correlation between Al/Si X-ray fluorescence intensity ratios determined during the Apollo 15 lunar mission and a broad-spectrum visible albedo of the moon is quantitatively established. Linear regression analysis performed on 246 one-degree geographic cells of X-ray fluorescence intensity and visible albedo data produced a statistically significant correlation coefficient of 0.78. Three distinct distributions of data were identified: (1) within one standard deviation of the regression line, (2) more than one standard deviation below the line, and (3) more than one standard deviation above the line. The latter two distributions were found to occupy distinct geographic areas in the Palus Somni region.

  11. Methods for estimating the magnitude and frequency of floods for urban and small, rural streams in Georgia, South Carolina, and North Carolina, 2011

    USGS Publications Warehouse

    Feaster, Toby D.; Gotvald, Anthony J.; Weaver, J. Curtis

    2014-01-01

    Reliable estimates of the magnitude and frequency of floods are essential for the design of transportation and water-conveyance structures, flood-insurance studies, and flood-plain management. Such estimates are particularly important in densely populated urban areas. In order to increase the number of streamflow-gaging stations (streamgages) available for analysis, expand the geographical coverage that would allow for application of regional regression equations across State boundaries, and build on a previous flood-frequency investigation of rural U.S. Geological Survey streamgages in the Southeast United States, a multistate approach was used to update methods for determining the magnitude and frequency of floods in urban and small, rural streams that are not substantially affected by regulation or tidal fluctuations in Georgia, South Carolina, and North Carolina. The at-site flood-frequency analysis of annual peak-flow data for urban and small, rural streams (through September 30, 2011) included 116 urban streamgages and 32 small, rural streamgages, defined in this report as basins draining less than 1 square mile. The regional regression analysis included annual peak-flow data from an additional 338 rural streamgages previously included in U.S. Geological Survey flood-frequency reports and 2 additional rural streamgages in North Carolina that were not included in the previous Southeast rural flood-frequency investigation, for a total of 488 streamgages included in the urban and small, rural regression analysis. The at-site flood-frequency analyses for the urban and small, rural streamgages included the expected moments algorithm, which is a modification of the Bulletin 17B log-Pearson type III method for fitting the statistical distribution to the logarithms of the annual peak flows. Where applicable, the flood-frequency analysis also included low-outlier and historic information. Additionally, the application of a generalized Grubbs-Beck test allowed for the detection of multiple potentially influential low outliers. Streamgage basin characteristics were determined using geographical information system techniques. Initial ordinary least squares regression simulations reduced the number of basin characteristics on the basis of such factors as statistical significance, coefficient of determination, Mallows' Cp statistic, and ease of measurement of the explanatory variable. Application of generalized least squares regression techniques produced final predictive (regression) equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability flows for urban and small, rural ungaged basins for three hydrologic regions (HR1, Piedmont–Ridge and Valley; HR3, Sand Hills; and HR4, Coastal Plain), which previously had been defined from exploratory regression analysis in the Southeast rural flood-frequency investigation. Because of the limited availability of urban streamgages in the Coastal Plain of Georgia, South Carolina, and North Carolina, additional urban streamgages in Florida and New Jersey were used in the regression analysis for this region. Including the urban streamgages in New Jersey allowed for the expansion of the applicability of the predictive equations in the Coastal Plain from 3.5 to 53.5 square miles.
The average standard error of prediction for the predictive equations, which is a measure of the average accuracy of the regression equations when predicting flood estimates for ungaged sites, ranges from 25.0 percent for the 10-percent annual exceedance probability regression equation for the Piedmont–Ridge and Valley region to 73.3 percent for the 0.2-percent annual exceedance probability regression equation for the Sand Hills region.

  12. Ozone and sulfur dioxide effects on three tall fescue cultivars

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Flagler, R.B.; Youngner, V.B.

    Although many reports have been published concerning differential susceptibility of various crops and/or cultivars to air pollutants, most have used foliar injury instead of the marketable yield as the factor that determined susceptibility for the crop. In an examination of screening in terms of marketable yield, three cultivars of tall fescue (Festuca arundinacea Schreb.), 'Alta,' 'Fawn,' and 'Kentucky 31,' were exposed to 0-0.40 ppm O3 or 0-0.50 ppm SO2 6 h/d, once a week, for 7 and 9 weeks, respectively. Experimental design was a randomized complete block with three replications. Statistical analysis was by standard analysis of variance and regression techniques. Three variables were analyzed: top dry weight (yield), tiller number, and weight per tiller. Ozone had a significant effect on all three variables. Significant linear decreases in yield and weight per tiller occurred with increasing O3 concentrations. Linear regressions of these variables on O3 concentration produced significantly different regression coefficients. The coefficient for Kentucky 31 was significantly greater than Alta or Fawn, which did not differ from each other. This indicated that Kentucky 31 was more susceptible to O3 than either of the other cultivars. Percent reductions in dry weight for the three cultivars at highest O3 level were 35, 44, and 53%, respectively, for Fawn, Alta, and Kentucky 31. For weight per tiller, Kentucky 31 had a higher percent reduction than the other cultivars (59 vs. 46 and 44%). Tiller number was generally increased by O3, but this variable was not useful for determining differential susceptibility to the pollutant. Sulfur dioxide treatments produced no significant effects on any of the variables analyzed.

  13. Landslide susceptibility mapping for a part of North Anatolian Fault Zone (Northeast Turkey) using logistic regression model

    NASA Astrophysics Data System (ADS)

    Demir, Gökhan; Aytekin, Mustafa; Banu Ikizler, Sabriye; Angın, Zekai

    2013-04-01

    The North Anatolian Fault is known as one of the most active and destructive fault zones, having produced many earthquakes of high magnitude. Along this fault zone, the morphology and lithological features are prone to landsliding. Many earthquake-induced landslides have been recorded by several studies along this fault zone, and these landslides have caused both injuries and loss of life. A detailed landslide susceptibility assessment for this area is therefore indispensable. In this context, this study undertook a landslide susceptibility assessment for a 1445 km2 area in the Kelkit River valley, part of the North Anatolian Fault zone (Eastern Black Sea region of Turkey); the results are summarized here. A geographical information system (GIS) and a bivariate statistical model were used. Initially, landslide inventory maps were prepared using landslide data determined by field surveys and landslide data taken from the General Directorate of Mineral Research and Exploration. The landslide conditioning factors considered were lithology, slope gradient, slope aspect, topographical elevation, distance to streams, distance to roads, distance to faults, drainage density, and fault density. The ArcGIS package was used to manipulate and analyze all the collected data, and the logistic regression method was applied to create a landslide susceptibility map. The map was divided into five susceptibility classes: very low, low, moderate, high, and very high. The result of the analysis was verified using the inventoried landslide locations and compared with the produced probability model; for this purpose, the Area Under the Curve (AUC) approach was applied, and the resulting AUC value indicated that the landslide susceptibility map was satisfactory. Keywords: North Anatolian Fault Zone, Landslide susceptibility map, Geographical Information Systems, Logistic Regression Analysis.

  14. Meta-analysis of haplotype-association studies: comparison of methods and empirical evaluation of the literature

    PubMed Central

    2011-01-01

    Background: Meta-analysis is a popular methodology in several fields of medical research, including genetic association studies. However, the methods used for meta-analysis of association studies that report haplotypes have not been studied in detail. In this work, methods for performing meta-analysis of haplotype association studies are summarized, compared, and presented in a unified framework, along with an empirical evaluation of the literature. Results: We present multivariate methods that use summary-based data as well as methods that use binary and count data in a generalized linear mixed model framework (logistic regression, multinomial regression, and Poisson regression). The methods presented here avoid the inflation of the type I error rate that could result from the traditional approach of comparing a haplotype against the remaining ones, and they can be fitted using standard software. Moreover, formal global tests are presented for assessing the statistical significance of the overall association. Although the methods presented here assume that the haplotypes are directly observed, they can easily be extended to allow for such uncertainty by weighting the haplotypes by their probability. Conclusions: An empirical evaluation of the published literature and a comparison against meta-analyses that use single nucleotide polymorphisms suggest that studies reporting meta-analyses of haplotypes include approximately half as many studies and produce significant results twice as often. We show that this excess of statistically significant results stems from the sub-optimal method of analysis used and that, in approximately half of the cases, the statistical significance is refuted if the data are properly re-analyzed. Illustrative examples of code are given in Stata, and it is anticipated that the methods developed in this work will be widely applied in the meta-analysis of haplotype association studies. PMID:21247440

  15. Advanced glycation end products and antioxidant status in type 2 diabetic patients with and without peripheral artery disease.

    PubMed

    Lapolla, Annunziata; Piarulli, Francesco; Sartore, Giovanni; Ceriello, Antonio; Ragazzi, Eugenio; Reitano, Rachele; Baccarin, Lorenzo; Laverda, Barbara; Fedele, Domenico

    2007-03-01

    Advanced glycation end products (AGEs), pentosidine and malondialdehyde (MDA), are elevated in type 2 diabetic subjects with coronary and carotid angiopathy. We investigated the relationship of AGEs, MDA, total reactive antioxidant potentials (TRAPs), and vitamin E in type 2 diabetic patients with and without peripheral artery disease (PAD). AGEs, pentosidine, MDA, TRAP, vitamin E, and ankle-brachial index (ABI) were measured in 99 consecutive type 2 diabetic subjects and 20 control subjects. AGEs, pentosidine, and MDA were higher and vitamin E and TRAP were lower in patients with PAD (ABI <0.9) than in patients without PAD (ABI >0.9) (P < 0.001). After multiple regression analysis, a correlation between AGEs and pentosidine, as independent variables, and ABI, as the dependent variable, was found in both patients with and without PAD (r = 0.9198, P < 0.001 and r = 0.5764, P < 0.001, respectively) but not in control subjects. When individual regression coefficients were evaluated, only that due to pentosidine was confirmed as significant. For patients with PAD, considering TRAP, vitamin E, and MDA as independent variables and ABI as the dependent variable produced an overall significant regression (r = 0.6913, P < 0.001). The regression coefficients for TRAP and vitamin E were not significant, indicating that the model is best explained by a single linear regression between MDA and ABI. These findings were also confirmed by principal component analysis. Results show that pentosidine and MDA are strongly associated with PAD in type 2 diabetic patients.

  16. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.

    2015-05-01

    The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing public attention to this technique. The key to interpreting Mars or any other LIBS data is calibration, which relates laboratory standards to unknowns examined in other settings and enables predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), the least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py), and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples, each analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests on the models' predicted residual sum of squares (PRESS) were employed to evaluate the statistical significance of differences among the nine models. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because the emission lines of these elements in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that dimensionality-reduction techniques used as a preprocessing step may improve the performance of the linear models. The nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved more generalizable, with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O; lasso for Al2O3; elastic net for MnO; and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance were identified, most of the models produced comparable results at the p ≤ 0.05 level, and all techniques except kNN were statistically indistinguishable. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
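
    A cross-validated comparison of this kind is easy to set up. The sketch below is illustrative only, standing in random arrays for the 6144-channel spectra and a fabricated target for one oxide; it compares lasso, elastic net, linear SVR, and PLS-1 by cross-validated RMSE:

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.linear_model import ElasticNet, Lasso
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import LinearSVR

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 6144))                  # 100 spectra x 6144 channels
        y = X[:, :10].sum(axis=1) + rng.normal(scale=0.1, size=100)  # fake "SiO2 wt%"

        models = {
            "lasso":   make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
            "enet":    make_pipeline(StandardScaler(), ElasticNet(alpha=0.1)),
            "svr_lin": make_pipeline(StandardScaler(), LinearSVR(C=1.0, max_iter=10000)),
            "pls1":    PLSRegression(n_components=8),
        }
        for name, model in models.items():
            rmse = -cross_val_score(model, X, y, cv=5,
                                    scoring="neg_root_mean_squared_error").mean()
            print(f"{name}: cross-validated RMSE = {rmse:.3f}")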

  17. Quantifying female bodily attractiveness by a statistical analysis of body measurements.

    PubMed

    Gründl, Martin; Eisenmann-Klein, Marita; Prantl, Lukas

    2009-03-01

    To investigate what makes a female figure attractive, an extensive experiment was conducted using high-quality photographic stimulus material and several systematically varied figure parameters. The objective was to predict female bodily attractiveness from figure measurements. To generate the stimulus material, a frontal-view photograph of a woman with normal body proportions was taken. Using morphing software, 243 variations of this photograph were produced by systematically manipulating the following features: weight, hip width, waist width, bust size, and leg length. More than 34,000 people participated in the web-based experiment and judged the attractiveness of the figures. All of the altered figures were measured (e.g., bust width, underbust width, waist width, hip width, and so on), and ratios were calculated from these measurements (e.g., the waist-to-hip ratio). A multiple regression analysis was designed to predict the attractiveness rank of a figure from its measurements. The results show that the attractiveness of a woman's figure can be predicted from her body measurements: the regression model explains 80 percent of the variance. Important predictors are the bust-to-underbust ratio, bust-to-waist ratio, waist-to-hip ratio, and an androgyny index (an indicator of a typically female body). The study shows that the attractiveness of a female figure is the result of complex interactions among numerous factors. It affirms the importance of viewing the appearance of a bodily feature in the context of other bodily features when performing preoperative analysis. Based on the standardized beta weights of the regression model, the relative importance of the figure parameters in the context of preoperative analysis is discussed.

  18. Using Gamma and Quantile Regressions to Explore the Association between Job Strain and Adiposity in the ELSA-Brasil Study: Does Gender Matter?

    PubMed

    Fonseca, Maria de Jesus Mendes da; Juvanhol, Leidjaira Lopes; Rotenberg, Lúcia; Nobre, Aline Araújo; Griep, Rosane Härter; Alves, Márcia Guimarães de Mello; Cardoso, Letícia de Oliveira; Giatti, Luana; Nunes, Maria Angélica; Aquino, Estela M L; Chor, Dóra

    2017-11-17

    This paper explores the association between job strain and adiposity, using two statistical approaches and considering the role of gender. The research evaluated 11,960 active baseline participants (2008-2010) in the ELSA-Brasil study. Job strain was evaluated through a demand-control questionnaire, while body mass index (BMI) and waist circumference (WC) were analyzed as continuous variables. The associations were estimated using gamma regression models with an identity link function. Quantile regression models were also estimated from the final set of co-variables established by gamma regression. The associations found varied by analytical approach and gender. Among the women, no association was observed between job strain and adiposity in the fitted gamma models. In the quantile models, a pattern of increasing effects of high strain was observed at higher quantiles of the BMI and WC distributions. Among the men, high strain was associated with adiposity in the gamma regression models. However, when quantile regression was used, that association was found not to be homogeneous across the outcome distributions. In addition, in the quantile models an association was observed between active jobs and BMI. Our results point to an association between job strain and adiposity that follows a heterogeneous pattern. Modelling strategies can produce different results and should, accordingly, be used to complement one another.
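
    Both modelling strategies are available in standard statistical software. A minimal sketch with statsmodels, using fabricated data (the names high_strain, age, and bmi are placeholders, not the ELSA-Brasil variables):

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        n = 500
        df = pd.DataFrame({"high_strain": rng.integers(0, 2, n),
                           "age": rng.normal(52, 9, n)})
        df["bmi"] = 24 + 0.8 * df["high_strain"] + 0.05 * df["age"] + rng.gamma(2, 1, n)

        # gamma regression with an identity link (statsmodels >= 0.14 spelling)
        gamma_fit = smf.glm("bmi ~ high_strain + age", data=df,
                            family=sm.families.Gamma(sm.families.links.Identity())).fit()
        print(gamma_fit.params)

        # quantile regression across the BMI distribution exposes effects that a
        # single mean-oriented model would average away
        for q in (0.25, 0.50, 0.75, 0.90):
            qfit = smf.quantreg("bmi ~ high_strain + age", data=df).fit(q=q)
            print(q, qfit.params["high_strain"])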

  19. Digital immunohistochemistry wizard: image analysis-assisted stereology tool to produce reference data set for calibration and quality control.

    PubMed

    Plancoulaine, Benoît; Laurinaviciene, Aida; Meskauskas, Raimundas; Baltrusaityte, Indra; Besusparis, Justinas; Herlin, Paulette; Laurinavicius, Arvydas

    2014-01-01

    Digital image analysis (DIA) enables better reproducibility of immunohistochemistry (IHC) studies. Nevertheless, the accuracy of DIA methods needs to be ensured, demanding production of reference data sets. We have reported on methodology to calibrate DIA for Ki67 IHC in breast cancer tissue based on reference data obtained by stereology grid count. To produce the reference data more efficiently, we propose a digital IHC wizard generating initial cell marks to be verified by experts. Digital images of proliferation marker Ki67 IHC from 158 patients (one tissue microarray spot per patient) with an invasive ductal carcinoma of the breast were used. Manual data (mD) were obtained by marking Ki67-positive and negative tumour cells, using a stereological method for 2D object enumeration. DIA was used as an initial step in the stereology grid count to generate the digital data (dD) marks by the Aperio Genie and Nuclear algorithms. The dD were collected into XML files from the DIA markup images and overlaid on the original spots along with the stereology grid. The expert correction of the dD marks resulted in corrected data (cD). The percentages of Ki67-positive tumour cells per spot in the mD, dD, and cD sets were compared by single linear regression analysis. Efficiency of cD production was estimated based on manual editing effort. The percentage of Ki67-positive tumour cells was in very good agreement in the mD, dD, and cD sets: regression of cD on dD (R2=0.92) reflects the impact of the expert editing of the dD as well as the accuracy of the DIA used; regression of cD on mD (R2=0.94) represents the consistency of the DIA-assisted ground truth (cD) with the manual procedure. Nevertheless, the accuracy of detection of individual tumour cells was much lower: on average, 18 and 219 marks per spot were edited due to Genie and Nuclear algorithm errors, respectively. The DIA-assisted cD production in our experiment saved approximately two thirds of the manual marking effort. The digital IHC wizard enabled DIA-assisted stereology to produce reference data in a consistent and efficient way. It can provide a quality-control measure for appraising the accuracy of the DIA steps.

  1. Watershed Planning within a Quantitative Scenario Analysis Framework.

    PubMed

    Merriam, Eric R; Petty, J Todd; Strager, Michael P

    2016-07-24

    There is a critical need for tools and methodologies capable of managing aquatic systems within heavily impacted watersheds. Current efforts often fall short because of an inability to quantify and predict the complex cumulative effects of current and future land use scenarios at relevant spatial scales. The goal of this manuscript is to provide methods for conducting a targeted watershed assessment that enables resource managers to produce landscape-based cumulative effects models for use within a scenario analysis management framework. Sites are first selected for inclusion in the watershed assessment by identifying locations that fall along independent gradients and combinations of known stressors. Field and laboratory techniques are then used to obtain data on the physical, chemical, and biological effects of multiple land use activities. Multiple linear regression analysis is then used to produce landscape-based cumulative effects models for predicting aquatic conditions. Lastly, methods for incorporating cumulative effects models within a scenario analysis framework for guiding management and regulatory decisions (e.g., permitting and mitigation) within actively developing watersheds are discussed and demonstrated for two sub-watersheds within the mountaintop mining region of central Appalachia. The watershed assessment and management approach provided herein enables resource managers to facilitate economic and development activity while protecting aquatic resources and creating opportunities for net ecological benefits through targeted remediation.

  2. Computational tools for exact conditional logistic regression.

    PubMed

    Corcoran, C; Mehta, C; Patel, N; Senchaudhuri, P

    Logistic regression analyses are often challenged by the inability of unconditional likelihood-based approximations to yield consistent, valid estimates and p-values for model parameters. This can be due to sparseness or separability in the data. Conditional logistic regression, though useful in such situations, can also be computationally infeasible when the sample size or the number of explanatory covariates is large. We review recent developments that allow efficient approximate conditional inference, including Monte Carlo sampling and saddlepoint approximations. We demonstrate through real examples that these methods enable the analysis of significantly larger and more complex data sets. We find in this investigation that, for these moderately large data sets, Monte Carlo sampling seems the better alternative, as it provides unbiased estimates of the exact results and can be executed in less CPU time than the single saddlepoint approximation. Moreover, the double saddlepoint approximation, while computationally the easiest to obtain, offers little practical advantage: it produces unreliable results and cannot be computed when a maximum likelihood solution does not exist. Copyright 2001 John Wiley & Sons, Ltd.
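
    For the conditional model itself (though not the exact Monte Carlo machinery reviewed above), statsmodels provides an implementation that conditions the stratum effects out of the likelihood. A small sketch with simulated stratified data:

        import numpy as np
        from statsmodels.discrete.conditional_models import ConditionalLogit

        rng = np.random.default_rng(2)
        n, n_strata = 200, 50
        groups = np.repeat(np.arange(n_strata), n // n_strata)   # matched sets of 4
        x = rng.normal(size=(n, 2))
        y = rng.binomial(1, 1 / (1 + np.exp(-(x @ np.array([1.0, -0.5])))))

        # the stratum (matching) effects are eliminated by conditioning, the same
        # conditioning that exact methods take as their starting point
        fit = ConditionalLogit(y, x, groups=groups).fit()
        print(fit.summary())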

  3. The arcsine is asinine: the analysis of proportions in ecology.

    PubMed

    Warton, David I; Hui, Francis K C

    2011-01-01

    The arcsine square root transformation has long been standard procedure when analyzing proportional data in ecology, with applications in data sets containing binomial and non-binomial response variables. Here, we argue that the arcsine transform should not be used in either circumstance. For binomial data, logistic regression has greater interpretability and higher power than analyses of transformed data. However, it is important to check the data for additional unexplained variation, i.e., overdispersion, and to account for it via the inclusion of random effects in the model if found. For non-binomial data, the arcsine transform is undesirable on the grounds of interpretability, and because it can produce nonsensical predictions. The logit transformation is proposed as an alternative approach to address these issues. Examples are presented in both cases to illustrate these advantages, comparing various methods of analyzing proportions including untransformed, arcsine- and logit-transformed linear models and logistic regression (with or without random effects). Simulations demonstrate that logistic regression usually provides a gain in power over other methods.
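
    The paper's central comparison is easy to reproduce in miniature. A hedged simulation sketch, fitting the arcsine-square-root-transformed linear model and a binomial logistic regression to the same data:

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(3)
        n, n_trials = 200, 20
        x = rng.normal(size=n)
        p = 1 / (1 + np.exp(-0.5 * x))                 # true effect of x on proportion
        successes = rng.binomial(n_trials, p)
        X = sm.add_constant(x)

        # arcsine approach: transform the observed proportions, then fit by OLS
        y_arcsine = np.arcsin(np.sqrt(successes / n_trials))
        print("arcsine OLS p-value:", sm.OLS(y_arcsine, X).fit().pvalues[1])

        # logistic regression on the raw (successes, failures) counts
        endog = np.column_stack([successes, n_trials - successes])
        logit_fit = sm.GLM(endog, X, family=sm.families.Binomial()).fit()
        print("logistic p-value:", logit_fit.pvalues[1])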

  4. Novel nitric oxide producing probiotic wound healing patch: preparation and in vivo analysis in a New Zealand white rabbit model of ischaemic and infected wounds.

    PubMed

    Jones, Mitchell; Ganopolsky, Jorge G; Labbé, Alain; Gilardino, Mirko; Wahl, Christopher; Martoni, Christopher; Prakash, Satya

    2012-06-01

    The treatment of chronic wounds poses a significant challenge for clinicians and patients alike. Here we report the design and preclinical efficacy of a novel nitric oxide gas (gNO)-producing probiotic patch for wound healing. Specifically, a wound healing patch using lactic acid bacteria in an adhesive, gas-permeable membrane was designed and investigated for treating ischaemic and infected full-thickness dermal wounds in a New Zealand white rabbit model of ischaemic wound healing. Kaplan-Meier survival curves showed increased wound closure with gNO-producing patch-treated wounds over 21 days of therapy (log-rank P = 0.0225 and Wilcoxon P = 0.0113). Cox proportional hazard regression showed that gNO-producing patch-treated wounds were 2.52 times more likely to close compared with control patches (hazard P = 0.0375, score P = 0.032 and likelihood ratio P = 0.0355), and histological analysis showed improved wound healing in gNO-producing patch-treated animals. This study may provide an effective, safe and less costly alternative for treating chronic wounds. © 2012 The Authors. © 2012 Blackwell Publishing Ltd and Medicalhelplines.com Inc.
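
    For reference, a Cox proportional hazards fit of this form can be written with the lifelines package (assumed available); the tiny table below is fabricated and does not reproduce the rabbit data:

        import pandas as pd
        from lifelines import CoxPHFitter

        # day of observed closure (or last follow-up at day 21), an event indicator,
        # and a treatment flag for the gNO-producing patch
        df = pd.DataFrame({
            "day":    [10, 14, 21, 21, 9, 18, 21, 12, 16, 21],
            "closed": [1, 1, 0, 0, 1, 1, 0, 1, 1, 0],
            "gno":    [1, 1, 0, 0, 1, 0, 0, 1, 1, 0],
        })
        cph = CoxPHFitter().fit(df, duration_col="day", event_col="closed")
        cph.print_summary()    # exp(coef) for 'gno' is the hazard ratio for closure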

  5. Characterizing multivariate decoding models based on correlated EEG spectral features

    PubMed Central

    McFarland, Dennis J.

    2013-01-01

    Objective Multivariate decoding methods are popular techniques for the analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods Data from sensorimotor rhythm-based cursor control experiments were analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order, which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results The use of multivariate regression models resulted in much better prediction of target position than univariate regression models. However, with lower-order AR features, interpretation of the spectral patterns of the weights was difficult. This is likely due to the high degree of multicollinearity present with lower-order AR features. Conclusions Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance While multivariate decoding algorithms are very useful for prediction, their utility for interpretation may be limited when predictors are correlated. PMID:23466267
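
    The degree of multicollinearity the authors describe can be quantified directly, for example with variance inflation factors. A short sketch using deliberately correlated stand-in features:

        import numpy as np
        from statsmodels.stats.outliers_influence import variance_inflation_factor

        rng = np.random.default_rng(4)
        base = rng.normal(size=(300, 1))
        # five features that are all noisy copies of one signal, mimicking the
        # highly correlated low-order AR spectral features
        X = np.hstack([base + 0.1 * rng.normal(size=(300, 1)) for _ in range(5)])
        vifs = [variance_inflation_factor(X, i) for i in range(X.shape[1])]
        print(vifs)    # values well above 10 flag predictors whose weights are unstable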

  6. Thermoreceptive innervation of human glabrous and hairy skin: a contact heat evoked potential analysis.

    PubMed

    Granovsky, Yelena; Matre, Dagfinn; Sokolik, Alexander; Lorenz, Jürgen; Casey, Kenneth L

    2005-06-01

    The human palm has a lower heat detection threshold and a higher heat pain threshold than hairy skin. Neurophysiological studies of monkeys suggest that glabrous skin has fewer low-threshold heat nociceptors (AMH type 2) than hairy skin. Accordingly, we used a temperature-controlled contact heat evoked potential (CHEP) stimulator to selectively excite heat receptors with C fibers or Aδ-innervated AMH type 2 receptors in humans. On the dorsal hand, 51°C stimulation produced painful pinprick sensations and 41°C stimuli evoked warmth. On the glabrous thenar, 41°C stimulation produced mild warmth and 51°C evoked strong but painless heat sensations. We used CHEP responses to estimate the conduction velocities (CV) of the peripheral fibers mediating these sensations. On hairy skin, 41°C stimuli evoked an ultra-late potential (mean (SD) N wave latency: 455 (118) ms) mediated by C fibers (CV by regression analysis: 1.28 m/s, N=15), whereas 51°C stimuli evoked a late potential (N latency: 267 (33) ms) mediated by Aδ afferents (CV by within-subject analysis: 12.9 m/s, N=6). In contrast, thenar responses to 41 and 51°C were mediated by C fibers (average N wave latencies 485 (100) and 433 (73) ms, respectively; CVs 0.95-1.35 m/s by regression analysis, N=15; average CV = 1.7 (0.41) m/s calculated from distal glabrous and proximal hairy skin stimulation, N=6). The exploratory range of the human and monkey palm is enhanced by the abundance of low-threshold, C-innervated heat receptors and the paucity of low-threshold AMH type 2 heat nociceptors.

  7. Principal component regression analysis with SPSS.

    PubMed

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces the indices used for multicollinearity diagnosis, the basic principle of principal component regression, and the method for determining the 'best' equation. An example is used to describe how to perform principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance of multicollinearity, and with SPSS it yields a simplified, faster, and accurate statistical analysis.
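
    The same procedure translates directly outside SPSS. A minimal principal component regression sketch (Python is used here purely for illustration):

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(5)
        X = rng.normal(size=(100, 8))
        X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=100)   # a deliberately collinear pair
        y = X[:, 0] - X[:, 2] + rng.normal(scale=0.5, size=100)

        # regressing on a reduced set of principal components sidesteps the
        # multicollinearity that destabilizes ordinary least squares coefficients
        pcr = make_pipeline(StandardScaler(), PCA(n_components=4), LinearRegression())
        pcr.fit(X, y)
        print(pcr.score(X, y))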

  8. Neural Network and Regression Approximations in High Speed Civil Transport Aircraft Design Optimization

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Guptill, James D.; Hopkins, Dale A.; Lavelle, Thomas M.

    1998-01-01

    Nonlinear mathematical-programming-based design optimization can be an elegant method. However, the calculations required to generate the merit function, constraints, and their gradients, which are frequently needed, can make the process computationally intensive. The computational burden can be greatly reduced by using approximating analyzers derived from an original analyzer by means of neural networks and linear regression methods. The experience gained from using both of these approximation methods in the design optimization of a high speed civil transport aircraft is the subject of this paper. The Langley Research Center's Flight Optimization System was selected for the aircraft analysis. This software was exercised to generate a set of training data with which a neural network and a regression method were trained, thereby producing the two approximating analyzers. The derived analyzers were coupled to the Lewis Research Center's CometBoards test bed to provide the optimization capability. With the combined software, both approximation methods were examined for use in aircraft design optimization, and both performed satisfactorily. The CPU time for solution of the problem, which had been measured in hours, was reduced to minutes with the neural network approximation and to seconds with the regression method. Instability encountered in the aircraft analysis software at certain design points was also eliminated. On the other hand, there were costs and difficulties associated with training the approximating analyzers: the CPU time required to generate the input-output pairs and to train the approximating analyzers was seven times that required for solution of the problem.

  9. Characterizing individual differences in functional connectivity using dual-regression and seed-based approaches.

    PubMed

    Smith, David V; Utevsky, Amanda V; Bland, Amy R; Clement, Nathan; Clithero, John A; Harsch, Anne E W; McKell Carter, R; Huettel, Scott A

    2014-07-15

    A central challenge for neuroscience lies in relating inter-individual variability to the functional properties of specific brain regions. Yet considerable variability exists in the connectivity patterns between different brain areas, potentially producing reliable group differences. Using sex differences as a motivating example, we examined two separate resting-state datasets comprising a total of 188 human participants. Both datasets were decomposed into resting-state networks (RSNs) using a probabilistic spatial independent component analysis (ICA). We estimated voxel-wise functional connectivity with these networks using a dual-regression analysis, which characterizes the participant-level spatiotemporal dynamics of each network while controlling for (via multiple regression) the influence of other networks and sources of variability. We found that males and females exhibit distinct patterns of connectivity with multiple RSNs, including both visual and auditory networks and the right frontal-parietal network. These results replicated across both datasets and were not explained by differences in head motion, data quality, brain volume, cortisol levels, or testosterone levels. Importantly, we also demonstrate that dual-regression functional connectivity is better at detecting inter-individual variability than traditional seed-based functional connectivity approaches. Our findings characterize robust, yet frequently ignored, neural differences between males and females, pointing to the necessity of controlling for sex in neuroscience studies of individual differences. Moreover, our results highlight the importance of employing network-based models to study variability in functional connectivity. Copyright © 2014 Elsevier Inc. All rights reserved.

  10. Assessment of parametric uncertainty for groundwater reactive transport modeling.

    USGS Publications Warehouse

    Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun

    2014-01-01

    The validity of using Gaussian assumptions for model residuals in the uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and the Bayesian methods with a Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions from Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, the predictive performance of the formal generalized likelihood function is superior to that of the least squares regression and Bayesian methods with a Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(ZS)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and the Morris- and DREAM(ZS)-based global sensitivity analyses yield almost identical rankings of parameter importance. The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.

  11. Using multiple logistic regression and GIS technology to predict landslide hazard in northeast Kansas, USA

    USGS Publications Warehouse

    Ohlmacher, G.C.; Davis, J.C.

    2003-01-01

    Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and slope aspect were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression: soil types were highly correlated with the geologic units, and no significant relationship existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.
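
    In code, the cell-based workflow reduces to a logistic fit followed by a predicted probability for every raster cell. A hedged sketch with synthetic cell attributes (class codes and coefficients are invented):

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(6)
        n_cells = 5000
        slope = rng.uniform(0, 30, n_cells)              # slope per grid cell, percent
        geology = rng.integers(0, 3, n_cells)            # 0=shale, 1=limestone, 2=alluvium
        logit = -4 + 0.15 * slope + np.where(geology == 0, 1.0, 0.0)
        landslide = rng.binomial(1, 1 / (1 + np.exp(-logit)))

        # dummy-code geology against alluvium as the reference class
        X = np.column_stack([slope,
                             (geology == 0).astype(float),
                             (geology == 1).astype(float)])
        model = LogisticRegression().fit(X, landslide)
        hazard = model.predict_proba(X)[:, 1]   # per-cell probability, mappable in a GIS
        print(model.coef_, hazard[:5])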

  12. Regression Analysis by Example. 5th Edition

    ERIC Educational Resources Information Center

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  13. Unleashing the power of inhibitors of oncogenic kinases through BH3 mimetics.

    PubMed

    Cragg, Mark S; Harris, Claire; Strasser, Andreas; Scott, Clare L

    2009-05-01

    Therapeutic targeting of tumours on the basis of molecular analysis is a new paradigm for cancer treatment but has yet to fulfil expectations. For many solid tumours, targeted therapeutics, such as inhibitors of oncogenic kinase pathways, elicit predominantly disease-stabilizing, cytostatic responses, rather than tumour regression. Combining oncogenic kinase inhibitors with direct activators of the apoptosis machinery, such as the BH3 mimetic ABT-737, may unlock potent anti-tumour potential to produce durable clinical responses with less collateral damage.

  14. Development of an analytical method for crystalline content determination in amorphous solid dispersions produced by hot-melt extrusion using transmission Raman spectroscopy: A feasibility study.

    PubMed

    Netchacovitch, L; Dumont, E; Cailletaud, J; Thiry, J; De Bleye, C; Sacré, P-Y; Boiret, M; Evrard, B; Hubert, Ph; Ziemons, E

    2017-09-15

    The development of a quantitative method for determining the crystalline percentage in an amorphous solid dispersion is of great interest in the pharmaceutical field. Indeed, transformation of a crystalline Active Pharmaceutical Ingredient into its amorphous state is increasingly used, as it enhances the solubility and bioavailability of Biopharmaceutical Classification System class II drugs. One way to produce amorphous solid dispersions is the Hot-Melt Extrusion (HME) process. This study reports the development and comparison of the analytical performance of two techniques, based on backscattering and transmission Raman spectroscopy, for determining the remaining crystalline content in amorphous solid dispersions produced by HME. Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression were performed on preprocessed data and led to the same conclusions: for the backscattering Raman results, the use of the DuoScan™ mode improved the PCA and PLS results, owing to a larger analyzed sampling volume. For the transmission Raman results, the determination of low crystalline percentages was possible, and the best regression model was obtained using this technique. Indeed, the latter acquires spectra through the whole sample volume, in contrast with the surface analyses performed using the backscattering mode. This study consequently highlights the importance of the analyzed sampling volume. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Long-term response of total ozone content at different latitudes of the Northern and Southern Hemispheres caused by solar activity during 1958-2006 (results of regression analysis)

    NASA Astrophysics Data System (ADS)

    Krivolutsky, Alexei A.; Nazarova, Margarita; Knyazeva, Galina

    Solar activity influences the atmospheric photochemical system via its changeable electromagnetic flux, which has an eleven-year period, and also through energetic particles during solar proton events (SPEs). Energetic particles penetrate mostly into the polar regions and induce additional production of NOx and HOx chemical compounds, which can destroy ozone in photochemical catalytic cycles. Solar irradiance variations cause in-phase variability of ozone, in accordance with photochemical theory. However, the real ozone response caused by these two factors, which have different physical natures, is not so clear on long-term time scales. To understand the situation, a multiple linear regression statistical method was used. Three data series covering the period 1958-2006 were used for the analysis: yearly averaged total ozone at different latitudes (World Ozone Data Centre, Canada, WMO); yearly averaged proton fluxes with E ≥ 10 MeV (IMP, GOES, and METEOR satellites); and yearly averaged sunspot numbers (Solar Data). Before the analysis, data sets of ozone deviations from the mean values for the whole period (1958-2006) at each latitudinal belt were prepared. The results of the two-factor multiple regression analysis revealed rather complicated time-dependent behavior of the ozone response, with clear negative peaks in the years of strong SPEs. The magnitudes of such peaks on an annual mean basis are no greater than 10 DU. An unusual effect, a positive response of ozone to solar proton activity near both poles, was discovered by the statistical analysis. The possible photochemical nature of the found effect is discussed. This work was supported by the Russian Foundation for Basic Research (grant 09-05-009949) and by contract 1-6-08 under the Russian Sub-Program "Research and Investigation of Antarctica".
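
    The two-factor analysis is an ordinary multiple linear regression on the annual series. A minimal sketch with synthetic stand-ins for the three data sets:

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(7)
        years = np.arange(1958, 2007)
        sunspots = 80 + 70 * np.sin(2 * np.pi * (years - 1958) / 11)   # 11-year cycle proxy
        protons = rng.lognormal(0, 1, years.size)                      # SPE flux proxy
        ozone_dev = 0.05 * sunspots - 1.5 * protons + rng.normal(0, 3, years.size)

        # regress ozone deviations on both solar factors simultaneously
        X = sm.add_constant(np.column_stack([sunspots, protons]))
        fit = sm.OLS(ozone_dev, X).fit()
        print(fit.params)    # opposite signs reflect the two competing solar influences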

  16. Using decision tree analysis to identify risk factors for relapse to smoking

    PubMed Central

    Piper, Megan E.; Loh, Wei-Yin; Smith, Stevens S.; Japuntich, Sandra J.; Baker, Timothy B.

    2010-01-01

    This research used classification tree analysis and logistic regression models to identify risk factors related to short- and long-term abstinence. Baseline and cessation outcome data from two smoking cessation trials, conducted from 2001 to 2002 in two Midwestern urban areas, were analyzed. There were 928 participants (53.1% women, 81.8% white) with complete data. Both analyses suggest that relapse risk is produced by interactions of risk factors and that early and late cessation outcomes reflect different vulnerability factors. The results illustrate the dynamic nature of relapse risk and suggest the importance of efficient modeling of interactions in relapse prediction. PMID:20397871
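
    A classification tree of this kind can be grown and inspected in a few lines; the predictors below are invented placeholders for the baseline measures, built so that relapse risk depends on an interaction:

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier, export_text

        rng = np.random.default_rng(8)
        n = 928
        dependence = rng.uniform(0, 10, n)     # hypothetical nicotine-dependence score
        neg_affect = rng.uniform(0, 10, n)     # hypothetical negative-affect score
        p_relapse = 1 / (1 + np.exp(-(-2 + 0.4 * dependence * (neg_affect > 5))))
        relapse = rng.binomial(1, p_relapse)

        tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=50)
        tree.fit(np.column_stack([dependence, neg_affect]), relapse)
        # the printed splits expose the interaction, which a main-effects-only
        # logistic regression would miss
        print(export_text(tree, feature_names=["dependence", "neg_affect"]))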

  17. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.

  18. Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States

    NASA Astrophysics Data System (ADS)

    Yang, J.; Astitha, M.; Schwartz, C. S.

    2017-12-01

    Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over the northeast United States. Ten control variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for the GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performance of the post-processing technique on a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random errors of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of the GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real time.
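
    At a single grid point, Bayesian linear regression with a Gaussian prior has a closed-form posterior over the correction coefficients. A minimal sketch under an assumed prior precision and noise variance (all numbers fabricated):

        import numpy as np

        rng = np.random.default_rng(9)
        raw = rng.normal(10, 3, 92)                    # raw wind forecasts, 92 storms
        obs = 1.2 * raw - 1.0 + rng.normal(0, 1, 92)   # matching observations

        X = np.column_stack([np.ones_like(raw), raw])
        alpha, sigma2 = 1.0, 1.0                       # prior precision, noise variance
        S_inv = alpha * np.eye(2) + X.T @ X / sigma2           # posterior precision
        coef = np.linalg.solve(S_inv, X.T @ obs / sigma2)      # posterior mean
        print(coef)

        corrected = coef[0] + coef[1] * 12.5   # bias-corrected forecast for a raw value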

  19. Techniques used to identify tornado producing thunderstorms using geosynchronous satellite data

    NASA Technical Reports Server (NTRS)

    Schrab, Kevin J.; Anderson, Charles E.; Monahan, John F.

    1992-01-01

    Satellite imagery of the outbreak region in the time prior to and during tornado occurrence was examined in detail to obtain descriptive characteristics of the anvil plume. These characteristics include outflow strength (UMAX), departure of the anvil centerline from the storm-relative ambient wind (MDA), storm-relative ambient wind (SRAW), and maximum surface vorticity (SFCVOR). It is shown that, by using satellite-derived parameters that characterize the flow field in the anvil region, the occurrence and intensity of the tornadoes that the parent thunderstorm produces can be identified. Analysis of the censored regression models revealed that the five explanatory variables (UMAX, MDA, SRAW, UMAX-2, and SFCVOR) were all significant predictors in the identification of the tornadic intensity of a particular thunderstorm.

  20. Adolescent religiosity and attitudes to HIV and AIDS in Ghana.

    PubMed

    Amoako-Agyeman, Kofi Nyame

    2012-11-01

    This study investigated the relationships between adolescent religiosity and attitudes to HIV/AIDS using two major techniques of analysis, factor analysis and regression analysis, towards informing preventive school education strategies. Using cross-sectional data on 448 adolescents in junior high school, the study employed a self-administered questionnaire survey and sought to identify underlying factors that affect pupils' responses, delineate the pattern of relationships between variables, and select models that best explain and predict relationships among variables. A seven-factor solution described the 'attitude' construct, including abstinence and protection, and a six-factor solution described 'religiosity'. The results showed relatively high levels of religiosity and a preference for private religiosity as opposed to organisational religiosity. The regression analysis produced significant relationships between factors of attitudes to HIV/AIDS and factors of religiosity. Adolescents with very high private religiosity are more likely to abstain from sex but less likely to use condoms once they initiate sex: protection is inversely related to religiosity. The findings suggest that religious-based adolescent interventions should focus on intrinsic religiosity. Additionally, increasing HIV prevention information and incorporating culturally relevant and socially acceptable values might lend support to improved adolescent school-based HIV/AIDS prevention programmes.

  1. Brief and precarious lives: infant mortality in contrasting sites from medieval and post-medieval England (AD 850-1859).

    PubMed

    Lewis, Mary E; Gowland, Rebecca

    2007-09-01

    This study compares the infant mortality profiles of 128 infants from two urban and two rural cemetery sites in medieval England. The aim of this paper is to assess the impact of urbanization and industrialization in terms of endogenous or exogenous causes of death. To undertake this analysis, two different methods of estimating gestational age from long bone lengths were used: a traditional regression method and a Bayesian method. The regression method tended to produce more marked peaks at 38 weeks, while the Bayesian method produced a broader range of ages and was more comparable with the expected "natural" mortality profiles. At all the sites, neonatal mortality (28-40 weeks) outweighed post-neonatal mortality (41-48 weeks), with rural Raunds Furnells in Northamptonshire showing the highest number of neonatal deaths and post-medieval Spitalfields, London, showing a greater proportion of deaths due to exogenous or environmental factors. Of the four sites under study, Wharram Percy in Yorkshire showed the most convincing "natural" infant mortality profile, suggesting the inclusion of all births at the site (i.e., stillbirths and unbaptised infants). © 2007 Wiley-Liss, Inc.

  2. A non-linear data mining parameter selection algorithm for continuous variables

    PubMed Central

    Razavi, Marianne; Brady, Sean

    2017-01-01

    In this article, we propose a new data mining algorithm by which one can both capture the non-linearity in data and find the best subset model. To produce an enhanced subset of the original variables, a preferred selection method should have the potential of adding a supplementary level of regression analysis that captures complex relationships in the data via mathematical transformation of the predictors and exploration of synergistic effects of combined variables. The method presented here has the potential to produce an optimal subset of variables, rendering the overall process of model selection more efficient. The algorithm introduces interpretable parameters by transforming the original inputs and provides a faithful fit to the data. The core objective of this paper is to introduce a new estimation technique for the classical least squares regression framework. This new automatic variable transformation and model selection method can offer an optimal and stable model that minimizes the mean square error and variability, while combining all possible subset selection methodology with the inclusion of variable transformations and interactions. Moreover, this method controls multicollinearity, leading to an optimal set of explanatory variables. PMID:29131829

  3. Empirical and semi-analytical models for predicting peak outflows caused by embankment dam failures

    NASA Astrophysics Data System (ADS)

    Wang, Bo; Chen, Yunliang; Wu, Chao; Peng, Yong; Song, Jiajun; Liu, Wenjun; Liu, Xin

    2018-07-01

    Prediction of the peak discharge of floods has attracted great attention from researchers and engineers. In the present study, nine typical nonlinear mathematical models are established based on a database of 40 historical dam failures. The first eight models, developed through a series of regression analyses, are purely empirical, while the last one is a semi-analytical approach derived from an analytical solution for dam-break floods in a trapezoidal channel. Water depth above the breach invert (Hw), volume of water stored above the breach invert (Vw), embankment length (El), and average embankment width (Ew) are used as independent variables to develop empirical formulas for estimating the peak outflow from breached embankment dams. The multiple regression analysis indicates that a function using the former two variables (i.e., Hw and Vw) produces considerably more accurate results than one using the latter two (i.e., El and Ew). The semi-analytical approach works best in terms of both prediction accuracy and uncertainty, and the established empirical models produce reasonable results, except for the model using only El. Moreover, the present models have been compared with other models available in the literature for estimating peak discharge.
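
    Formulas of this family are typically fitted as power laws by linear regression in log space. A hedged sketch of the Hw-Vw model, replacing the 40-dam database with synthetic values:

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(10)
        Hw = rng.uniform(5, 60, 40)                        # water depth, m
        Vw = rng.uniform(1e5, 1e9, 40)                     # stored volume, m^3
        Qp = 0.2 * Hw**1.2 * Vw**0.4 * rng.lognormal(0, 0.3, 40)   # peak outflow, m^3/s

        # log-log OLS recovers the exponents of an assumed form Qp = a * Hw^b * Vw^c
        X = sm.add_constant(np.column_stack([np.log(Hw), np.log(Vw)]))
        fit = sm.OLS(np.log(Qp), X).fit()
        a, b, c = np.exp(fit.params[0]), fit.params[1], fit.params[2]
        print(f"Qp ~ {a:.3f} * Hw^{b:.2f} * Vw^{c:.2f}")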

  4. A consensus least squares support vector regression (LS-SVR) for analysis of near-infrared spectra of plant samples.

    PubMed

    Li, Yankun; Shao, Xueguang; Cai, Wensheng

    2007-04-15

    Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on this principle, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter the spectral background and noise; then, the consensus LS-SVR technique was used to build the calibration model. With optimization of the parameters involved in the modeling, a satisfactory model was achieved for predicting the content of reducing sugar in plant samples. The prediction results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.
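
    The consensus step itself is simple: train sub-models on resampled calibration sets and average their predictions. The sketch below substitutes kernel ridge regression for LS-SVR (the two are closely related, and scikit-learn has no LS-SVR); the spectra are synthetic:

        import numpy as np
        from sklearn.kernel_ridge import KernelRidge

        rng = np.random.default_rng(11)
        X = rng.normal(size=(120, 700))       # stand-in for DWT-preprocessed NIR spectra
        y = X[:, :5].sum(axis=1) + rng.normal(0, 0.1, 120)   # fake reducing-sugar content

        preds = []
        for _ in range(25):
            idx = rng.integers(0, len(X), len(X))            # bootstrap resample
            model = KernelRidge(kernel="rbf", alpha=0.1, gamma=1e-3).fit(X[idx], y[idx])
            preds.append(model.predict(X))
        consensus = np.mean(preds, axis=0)    # averaging damps the variance of any
                                              # single calibration model
        print(consensus[:5])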

  5. Regression models for analyzing costs and their determinants in health care: an introductory review.

    PubMed

    Gregori, Dario; Petrinco, Michele; Bo, Simona; Desideri, Alessandro; Merletti, Franco; Pagano, Eva

    2011-06-01

    This article aims to describe the various approaches to multivariable modelling of healthcare cost data and to synthesize the respective criticisms proposed in the literature. We present regression methods suitable for the analysis of healthcare costs and then apply them to an experimental setting in cardiovascular treatment (the COSTAMI study) and an observational setting in diabetes hospital care. We show how the methods can produce different results depending on the degree of matching between the underlying assumptions of each method and the specific characteristics of the healthcare problem. The matching of healthcare cost models to the analytic objectives and the characteristics of the data available to a study requires caution: the study results and their interpretation can be heavily dependent on the choice of model, with a real risk of spurious results and conclusions.

  6. Regional Regression Equations to Estimate Flow-Duration Statistics at Ungaged Stream Sites in Connecticut

    USGS Publications Warehouse

    Ahearn, Elizabeth A.

    2010-01-01

    Multiple linear regression equations for determining flow-duration statistics were developed to estimate selected flow exceedances ranging from 25 to 99 percent for six 'bioperiods' in Connecticut: Salmonid Spawning (November), Overwinter (December-February), Habitat Forming (March-April), Clupeid Spawning (May), Resident Spawning (June), and Rearing and Growth (July-October). Regression equations were also developed to estimate the 25- and 99-percent flow exceedances without reference to a bioperiod. In total, 32 equations were developed. The predictive equations were based on regression analyses relating flow statistics from streamgages to GIS-determined basin and climatic characteristics for the drainage areas of those streamgages. Thirty-nine streamgages (and an additional 6 short-term streamgages and 28 partial-record sites for the non-bioperiod 99-percent exceedance) in Connecticut and adjacent areas of neighboring States were used in the regression analysis. Weighted least squares regression analysis was used to determine the predictive equations; weights were assigned based on record length. The basin characteristics (drainage area, percentage of area with coarse-grained stratified deposits, percentage of area with wetlands, mean monthly precipitation (November), mean seasonal precipitation (December-February), and mean basin elevation) are used as explanatory variables in the equations. Standard errors of estimate of the 32 equations ranged from 10.7 to 156 percent, with medians of 19.2 and 55.4 percent for predicting the 25- and 99-percent exceedances, respectively. Regression equations to estimate high and median flows (25- to 75-percent exceedances) are better predictors (smaller variability of the residual values around the regression line) than the equations to estimate low flows (greater than 75-percent exceedance). The Habitat Forming (March-April) bioperiod had the smallest standard errors of estimate, ranging from 10.7 to 20.9 percent. In contrast, the Rearing and Growth (July-October) bioperiod had the largest standard errors, ranging from 30.9 to 156 percent. The adjusted coefficients of determination of the equations ranged from 77.5 to 99.4 percent, with medians of 98.5 and 90.6 percent for the 25- and 99-percent exceedances, respectively. Descriptive information on the streamgages used in the regression, the measured basin and climatic characteristics, and the estimated flow-duration statistics are provided in this report. The flow-duration statistics and the 32 regression equations for estimating flow-duration statistics in Connecticut are available through the U.S. Geological Survey web application StreamStats (http://water.usgs.gov/osw/streamstats/index.html). The regression equations developed in this report can be used to produce unbiased estimates of selected flow exceedances statewide.
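
    Weighted least squares with record length as the weight is a one-line call in most statistical packages. A minimal sketch with invented basin characteristics:

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(12)
        n = 39
        drainage = rng.lognormal(3, 1, n)                 # drainage area, mi^2
        wetlands = rng.uniform(0, 30, n)                  # percent wetlands
        record_years = rng.integers(10, 80, n)            # streamgage record length
        log_q25 = 1.5 * np.log(drainage) - 0.02 * wetlands + rng.normal(0, 0.2, n)

        # longer-record streamgages get more weight in the fit
        X = sm.add_constant(np.column_stack([np.log(drainage), wetlands]))
        fit = sm.WLS(log_q25, X, weights=record_years).fit()
        print(fit.params)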

  7. Candida virulence and ethanol-derived acetaldehyde production in oral cancer and non-cancer subjects.

    PubMed

    Alnuaimi, A D; Ramdzan, A N; Wiesenfeld, D; O'Brien-Simpson, N M; Kolev, S D; Reynolds, E C; McCullough, M J

    2016-11-01

    To compare the biofilm-forming ability, hydrolytic enzyme production, and ethanol-derived acetaldehyde production of oral Candida isolated from patients with oral cancer and matched patients without oral cancer. Fungal biofilms were grown in RPMI-1640 medium, and biofilm mass and biofilm activity were assessed using crystal violet staining and XTT salt reduction assays, respectively. Phospholipase, proteinase, and esterase production were measured using an agar plate method, while fungal acetaldehyde production was assessed via gas chromatography. Candida isolated from patients with oral cancer demonstrated significantly higher biofilm mass (P = 0.031), biofilm metabolic activity (P < 0.001), phospholipase (P = 0.002), and proteinase (P = 0.0159) activity than isolates from patients without oral cancer. High ethanol-derived acetaldehyde-producing Candida were more prevalent in patients with oral cancer (P = 0.01). In univariate regression analysis, high biofilm mass (P = 0.03), biofilm metabolic activity (P < 0.001), phospholipase (P = 0.003), and acetaldehyde production ability (P = 0.01) were significant risk factors for oral cancer; in the multivariate regression analysis, high biofilm activity (P = 0.01) and phospholipase (P = 0.01) were significant positive influencing factors for oral cancer. These data suggest a significant positive association between the ability of Candida isolates to form biofilms, produce hydrolytic enzymes, and metabolize alcohol to acetaldehyde and their ability to promote oral cancer development. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. The use of segmented regression in analysing interrupted time series studies: an example in pre-hospital ambulance care.

    PubMed

    Taljaard, Monica; McKenzie, Joanne E; Ramsay, Craig R; Grimshaw, Jeremy M

    2014-06-19

    An interrupted time series design is a powerful quasi-experimental approach for evaluating effects of interventions introduced at a specific point in time. To utilize the strength of this design, a modification to standard regression analysis, such as segmented regression, is required. In segmented regression analysis, the change in intercept and/or slope from pre- to post-intervention is estimated and used to test causal hypotheses about the intervention. We illustrate segmented regression using data from a previously published study that evaluated the effectiveness of a collaborative intervention to improve quality in pre-hospital ambulance care for acute myocardial infarction (AMI) and stroke. In the original analysis, a standard regression model was used with time as a continuous variable. We contrast the results from this standard regression analysis with those from segmented regression analysis. We discuss the limitations of the former and advantages of the latter, as well as the challenges of using segmented regression in analysing complex quality improvement interventions. Based on the estimated change in intercept and slope from pre- to post-intervention using segmented regression, we found insufficient evidence of a statistically significant effect on quality of care for stroke, although potential clinically important effects for AMI cannot be ruled out. Segmented regression analysis is the recommended approach for analysing data from an interrupted time series study. Several modifications to the basic segmented regression analysis approach are available to deal with challenges arising in the evaluation of complex quality improvement interventions.
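
    A minimal segmented-regression sketch for a monthly quality series: an indicator for the post-intervention period captures the level change, and a time-since-intervention term captures the slope change (all numbers fabricated):

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(13)
        n_pre, n_post = 24, 24
        t = np.arange(n_pre + n_post)
        post = (t >= n_pre).astype(int)
        y = 60 + 0.2 * t + 5 * post + 0.3 * post * (t - n_pre) + rng.normal(0, 2, t.size)

        df = pd.DataFrame({"t": t, "post": post,
                           "t_since": post * (t - n_pre), "y": y})
        fit = smf.ols("y ~ t + post + t_since", data=df).fit()
        print(fit.params)   # 'post' estimates the level change, 't_since' the slope change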

  9. Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

    ERIC Educational Resources Information Center

    Kim, Rae Seon

    2011-01-01

    When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

  10. Simulated peak inflows for glacier dammed Russell Fiord, near Yakutat, Alaska

    USGS Publications Warehouse

    Neal, Edward G.

    2004-01-01

    In June 2002, Hubbard Glacier advanced across the entrance to the 35-mile-long Russell Fiord, creating a glacier-dammed lake. After closure of the ice and moraine dam, runoff from mountain streams and glacial melt caused the level of 'Russell Lake' to rise until the lake eventually breached the dam on August 14, 2002. Daily mean inflows to the lake during the period of closure were estimated on the basis of lake stage data and the hypsometry of Russell Lake. Inflows were regressed against the daily mean streamflows of nearby Ophir Creek and the Situk River to generate an equation for simulating Russell Lake inflow. The regression equation was used to produce 11 years of synthetic daily inflows to Russell Lake for the 1992-2002 water years. A flood-frequency analysis was applied to the peak daily mean inflows for these 11 years of record, yielding a 100-year peak daily mean inflow of 235,000 cubic feet per second. Regional regression equations were also applied to the Russell Lake basin, yielding a 100-year inflow of 157,000 cubic feet per second.

  11. Psychosocial and demographic variables associated with consumer intention to purchase sustainably produced foods as defined by the Midwest Food Alliance.

    PubMed

    Robinson, Ramona; Smith, Chery

    2002-01-01

    To examine psychosocial and demographic variables associated with consumer intention to purchase sustainably produced foods using an expanded Theory of Planned Behavior. Consumers were approached at the store entrance and asked to complete a self-administered survey. Three metropolitan Minnesota grocery stores. Participants (n = 550) were adults who shopped at the store: the majority were white, female, and highly educated and earned ≥$50,000/year. Participation rates averaged 62%. The major domain investigated was consumer support for sustainably produced foods. Demographics, beliefs, attitudes, subjective norm, self-identity, and perceived behavioral control were evaluated as predictors of intention to purchase them. Descriptive statistics, independent t tests, one-way analysis of variance, Pearson product moment correlation coefficients, and stepwise multiple regression analyses were used (P < .05). Consumers were supportive of sustainably produced foods but not highly confident in their ability to purchase them. Independent predictors of intention to purchase them included attitudes, beliefs, perceived behavioral control, subjective norm, past buying behavior, and marital status. Beliefs, attitudes, and confidence level may influence intention to purchase sustainably produced foods. Nutrition educators could increase consumers' awareness of sustainably produced foods by understanding their beliefs, attitudes, and confidence levels.

  12. Uncertainty Analysis on Heat Transfer Correlations for RP-1 Fuel in Copper Tubing

    NASA Technical Reports Server (NTRS)

    Driscoll, E. A.; Landrum, D. B.

    2004-01-01

    NASA is studying kerosene (RP-1) for application in Next Generation Launch Technology (NGLT). Accurate heat transfer correlations in narrow passages at high temperatures and pressures are needed. Hydrocarbon fuels, such as RP-1, produce carbon deposition (coke) along the inside of tube walls when heated to high temperatures. A series of tests to measure the heat transfer using RP-1 fuel and examine the coking were performed in NASA Glenn Research Center's Heated Tube Facility. The facility models regenerative cooling by flowing room temperature RP-1 through resistively heated copper tubing. A regression analysis is performed on the data to determine the heat transfer correlation for Nusselt number as a function of Reynolds and Prandtl numbers. Each measurement and calculation is analyzed to identify sources of uncertainty, including RP-1 property variations. Monte Carlo simulation is used to determine how each uncertainty source propagates through the regression to yield an overall uncertainty in the predicted heat transfer coefficient. The implications of these uncertainties on engine design and ways to minimize existing uncertainties are discussed.
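
    A common form for such a correlation is Nu = C·Re^a·Pr^b, which becomes linear after taking logarithms. The sketch below, on invented data with an assumed 3% measurement uncertainty, shows the log-linear regression and the Monte Carlo refitting loop used to propagate that uncertainty into the coefficients; it is an illustration, not the NASA analysis.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    Re = rng.uniform(1e4, 1e5, 40)              # Reynolds numbers
    Pr = rng.uniform(5.0, 15.0, 40)             # Prandtl numbers
    Nu = 0.023 * Re**0.8 * Pr**0.4 * rng.lognormal(0, 0.05, 40)  # synthetic data

    def fit_correlation(Re, Pr, Nu):
        # Linear regression in log space: ln Nu = ln C + a*ln Re + b*ln Pr
        X = np.column_stack([np.ones_like(Re), np.log(Re), np.log(Pr)])
        coef, *_ = np.linalg.lstsq(X, np.log(Nu), rcond=None)
        return coef                             # [ln C, a, b]

    # Monte Carlo: perturb measurements within an assumed 3% uncertainty, refit,
    # and read coefficient uncertainty off the spread of the refits.
    fits = np.array([fit_correlation(Re * rng.normal(1, 0.03, Re.size),
                                     Pr * rng.normal(1, 0.03, Pr.size),
                                     Nu * rng.normal(1, 0.03, Nu.size))
                     for _ in range(1000)])
    print(fits.mean(axis=0))                    # coefficient estimates
    print(fits.std(axis=0))                     # their Monte Carlo spread
    ```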

  13. Quantification of the effects of quality investment on the Cost of Poor Quality: A quasi-experimental study

    NASA Astrophysics Data System (ADS)

    Tamimi, Abdallah Ibrahim

    Quality management is a fundamental challenge facing businesses. This research attempted to quantify the effect of quality investment on the Cost of Poor Quality (COPQ) in an aerospace company, utilizing 3 years of quality data at United Launch Alliance, a Boeing-Lockheed Martin joint venture. Statistical analysis tools, such as multiple regression, were used to quantify the relationship between quality investments and COPQ. Strong correlations were evidenced by the high coefficient of determination (R2) and very small p-values in the multiple regression analysis. The models in the study were used to produce an Excel macro that, based on preset constraints, optimized the level of quality spending to minimize COPQ. The study confirmed that as quality investments were increased, the COPQ decreased steadily until a point of diminishing returns was reached. The findings may be used to develop an approach to reduce the COPQ and enhance product performance. Achieving superior quality in rocket launching enhances the accuracy, reliability, and mission success of delivering satellites to their precise orbits in pursuit of knowledge, peace, and freedom while assuring safety for the end user.

  14. Characterizing multivariate decoding models based on correlated EEG spectral features.

    PubMed

    McFarland, Dennis J

    2013-07-01

    Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments were analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order, which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features, interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction, their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
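
    The interpretability caveat is easy to reproduce: when two predictors are nearly collinear, the multivariate least-squares weights become erratic even though the model predicts well, while univariate fits stay stable. A toy demonstration with synthetic data (not EEG features) follows.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    n = 200
    latent = rng.normal(size=n)                 # shared underlying signal
    x1 = latent + rng.normal(0, 0.05, n)        # two nearly collinear predictors
    x2 = latent + rng.normal(0, 0.05, n)
    y = latent + rng.normal(0, 0.5, n)

    # Multivariate fit: individual weights are erratic (only their sum is
    # well determined), although predictions remain accurate.
    X = np.column_stack([np.ones(n), x1, x2])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    print("multivariate weights:", w[1:])

    # Univariate fits give stable, nearly identical weights for both predictors.
    for x in (x1, x2):
        Xu = np.column_stack([np.ones(n), x])
        wu, *_ = np.linalg.lstsq(Xu, y, rcond=None)
        print("univariate weight:", wu[1])
    ```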

  15. [Chemical and sensory characterization of tea (Thea sinensis) consumed in Chile].

    PubMed

    Wittig de Penna, Emma; José Zúñiga, María; Fuenzalida, Regina; López-Planes, Reinaldo

    2005-03-01

    By means of descriptive analysis, four varieties of tea (Thea sinensis) were assessed: Argentinean OP (orange pekoe) tea (black), Brazilian OP tea (black), Ceylon OP tea (black) and Darjeeling OP tea (green). The appearance of the dry tea leaves was qualitatively characterized by comparison with a dry-leaf standard, evaluating the attributes colour, form, regularity of the leaves, fibre and stem cutting; the differences obtained were related to differences produced by the fermentation process. Flavour and aroma descriptors of the tea liquor were generated by a trained panel. Colour and astringency were evaluated against qualified standards using unstructured linear scales. To relate the sensory analysis to the chemical composition of the different varieties, the following determinations were made: moisture, dry material, aqueous extract, tannin and caffeine. Multifactor regression analysis yielded equations relating colour to dry material, aqueous extract and tannins, and astringency to moisture, dry material and aqueous extract. Statistical analysis through ANOVA (three variation sources: samples, judges and replications) showed four significantly different sample groups for astringency and three for colour; no significant differences between judges or replications were found. The dependence of both colour and astringency on the chemical results was then quantified by multifactor regression analysis to establish the corresponding equations.

  16. Linking brain-wide multivoxel activation patterns to behaviour: Examples from language and math.

    PubMed

    Raizada, Rajeev D S; Tsao, Feng-Ming; Liu, Huei-Mei; Holloway, Ian D; Ansari, Daniel; Kuhl, Patricia K

    2010-05-15

    A key goal of cognitive neuroscience is to find simple and direct connections between brain and behaviour. However, fMRI analysis typically involves choices between many possible options, with each choice potentially biasing any brain-behaviour correlations that emerge. Standard methods of fMRI analysis assess each voxel individually, but then face the problem of selection bias when combining those voxels into a region-of-interest, or ROI. Multivariate pattern-based fMRI analysis methods use classifiers to analyse multiple voxels together, but can also introduce selection bias via data-reduction steps such as feature selection of voxels, pre-selecting activated regions, or principal components analysis. We show here that strong brain-behaviour links can be revealed without any voxel selection or data reduction, using just plain linear regression as a classifier applied to the whole brain at once, i.e. treating each entire brain volume as a single multi-voxel pattern. The brain-behaviour correlations emerged despite the fact that the classifier was not provided with any information at all about subjects' behaviour, but instead was given only the neural data and its condition-labels. Surprisingly, more powerful classifiers such as a linear SVM and regularised logistic regression produce very similar results. We discuss some possible reasons why the very simple brain-wide linear regression model is able to find correlations with behaviour that are as strong as those obtained on the one hand from a specific ROI and on the other hand from more complex classifiers. In a manner which is unencumbered by arbitrary choices, our approach offers a method for investigating connections between brain and behaviour which is simple, rigorous and direct. Copyright (c) 2010 Elsevier Inc. All rights reserved.
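
    A rough sketch of the "whole brain as one pattern" idea follows, under the assumption that each volume is flattened to a voxel vector and condition labels are coded ±1; the shapes and data are synthetic placeholders, and a real analysis would cross-validate rather than report training accuracy.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    n_vols, n_voxels = 80, 5000
    pattern = rng.normal(size=n_voxels)         # hidden condition-related pattern
    labels = rng.choice([-1.0, 1.0], n_vols)    # condition label per volume
    X = np.outer(labels, pattern) + rng.normal(0, 5, (n_vols, n_voxels))

    # Plain least squares of labels on every voxel at once; with far more
    # voxels than volumes, lstsq returns the minimum-norm solution.
    w, *_ = np.linalg.lstsq(X, labels, rcond=None)
    accuracy = (np.sign(X @ w) == labels).mean()
    print(accuracy)   # training accuracy; real use needs cross-validation
    ```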

  17. A comparison between standard methods and structural nested modelling when bias from a healthy worker survivor effect is suspected: an iron-ore mining cohort study.

    PubMed

    Björ, Ove; Damber, Lena; Jonsson, Håkan; Nilsson, Tohr

    2015-07-01

    Iron-ore miners are exposed to extremely dusty and physically arduous work environments. The demanding activities of mining select healthier workers with longer work histories (ie, the Healthy Worker Survivor Effect (HWSE)), and could have a reversing effect on the exposure-response association. The objective of this study was to evaluate an iron-ore mining cohort to determine whether the effect of respirable dust was confounded by the presence of an HWSE. When an HWSE exists, standard modelling methods, such as Cox regression analysis, produce biased results. We compared results from g-estimation of accelerated failure-time modelling adjusted for HWSE with corresponding unadjusted Cox regression modelling results. For all-cause mortality when adjusting for the HWSE, cumulative exposure from respirable dust was associated with a 6% decrease of life expectancy if exposed ≥15 years, compared with never being exposed. Respirable dust continued to be associated with mortality after censoring outcomes known to be associated with dust when adjusting for the HWSE. In contrast, results based on Cox regression analysis did not support that an association was present. The adjustment for the HWSE made a difference when estimating the risk of mortality from respirable dust. The results of this study, therefore, support the recommendation that standard methods of analysis should be complemented with structural modelling analysis techniques, such as g-estimation of accelerated failure-time modelling, to adjust for the HWSE. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  18. Linking brain-wide multivoxel activation patterns to behaviour: Examples from language and math

    PubMed Central

    Raizada, Rajeev D.S.; Tsao, Feng-Ming; Liu, Huei-Mei; Holloway, Ian D.; Ansari, Daniel; Kuhl, Patricia K.

    2010-01-01

    A key goal of cognitive neuroscience is to find simple and direct connections between brain and behaviour. However, fMRI analysis typically involves choices between many possible options, with each choice potentially biasing any brain–behaviour correlations that emerge. Standard methods of fMRI analysis assess each voxel individually, but then face the problem of selection bias when combining those voxels into a region-of-interest, or ROI. Multivariate pattern-based fMRI analysis methods use classifiers to analyse multiple voxels together, but can also introduce selection bias via data-reduction steps such as feature selection of voxels, pre-selecting activated regions, or principal components analysis. We show here that strong brain–behaviour links can be revealed without any voxel selection or data reduction, using just plain linear regression as a classifier applied to the whole brain at once, i.e. treating each entire brain volume as a single multi-voxel pattern. The brain–behaviour correlations emerged despite the fact that the classifier was not provided with any information at all about subjects' behaviour, but instead was given only the neural data and its condition-labels. Surprisingly, more powerful classifiers such as a linear SVM and regularised logistic regression produce very similar results. We discuss some possible reasons why the very simple brain-wide linear regression model is able to find correlations with behaviour that are as strong as those obtained on the one hand from a specific ROI and on the other hand from more complex classifiers. In a manner which is unencumbered by arbitrary choices, our approach offers a method for investigating connections between brain and behaviour which is simple, rigorous and direct. PMID:20132896

  19. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    ERIC Educational Resources Information Center

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
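
    The general idea behind dominance analysis can be sketched directly: fit a logistic regression for every subset of predictors, score each with an R² analogue (McFadden's pseudo-R² here, one of several candidate measures), and average each predictor's increment across subsets. The data below are synthetic.

    ```python
    from itertools import combinations
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n = 500
    X = rng.normal(size=(n, 3))
    p = 1 / (1 + np.exp(-(1.0 * X[:, 0] + 0.5 * X[:, 1])))  # predictor 2 is noise
    y = rng.binomial(1, p)

    null_llf = sm.Logit(y, np.ones((n, 1))).fit(disp=0).llf

    def r2(cols):
        """McFadden pseudo-R2 for a subset of predictor columns."""
        if not cols:
            return 0.0
        fit = sm.Logit(y, sm.add_constant(X[:, list(cols)])).fit(disp=0)
        return 1 - fit.llf / null_llf

    # General dominance: each predictor's mean R2 increment over all subsets
    for j in range(3):
        others = [k for k in range(3) if k != j]
        incs = [r2(s + (j,)) - r2(s)
                for r in range(len(others) + 1)
                for s in combinations(others, r)]
        print(f"predictor {j}: mean increment {np.mean(incs):.3f}")
    ```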

  20. Assessing the risk of bovine fasciolosis using linear regression analysis for the state of Rio Grande do Sul, Brazil.

    PubMed

    Silva, Ana Elisa Pereira; Freitas, Corina da Costa; Dutra, Luciano Vieira; Molento, Marcelo Beltrão

    2016-02-15

    Fasciola hepatica is the causative agent of fasciolosis, a disease that triggers a chronic inflammatory process in the liver and affects mainly ruminants, although other animals, including humans, are also affected. In Brazil, F. hepatica occurs in the largest numbers in Rio Grande do Sul, the southernmost state. The objective of this study was to estimate areas at risk by applying a linear regression method, at the municipality level, to an eight-year (2002-2010) time series of the climatic and environmental variables that best relate to the disease. The positivity index of the disease, the rate of infected animals per slaughtered animal, was divided into three risk classes: low, medium and high. The classification accuracy on the confusion matrix for the low, medium and high rates produced by the estimated model ranged between 39 and 88%, depending on the year. The regression analysis showed the importance of the time-based data for the construction of the model, in particular the two variables from the year preceding the event (positivity index and maximum temperature). The generated data are important for epidemiological and parasite-control studies, mainly because F. hepatica infection can last from months to years. Copyright © 2015 Elsevier B.V. All rights reserved.

  1. Foodborne general outbreaks of Shiga toxin-producing Escherichia coli O157 in England and Wales 1992-2002: where are the risks?

    PubMed

    Gillespie, I A; O'Brien, S J; Adak, G K; Cheasty, T; Willshaw, G

    2005-10-01

    Between 1 January 1992 and 31 December 2002, Shiga toxin-producing Escherichia coli O157 (STEC O157) accounted for 44 of the 1645 foodborne general outbreaks of infectious intestinal disease reported to the Health Protection Agency Communicable Disease Surveillance Centre. These outbreaks, although rare, were characterized by severe infection, with 169 hospital admissions and five deaths reported. STEC O157 outbreaks were compared with outbreaks caused by other pathogens to identify factors associated with this pathogen. Single-risk-variable analysis and logistic regression were employed. Two distinct aetiologies were identified. Foodborne outbreaks of STEC O157 infection in England and Wales were independently associated with farms, which related to milk and milk products, and with red meats/meat products, which highlighted butchers' shops as a cause for concern. The introduction of, and adherence to, effective control measures based on the principles of hazard analysis provide the best means of minimizing the risk of foodborne infection with this pathogen.

  2. Statistical summary of selected physical, chemical, and toxicity characteristics and estimates of annual constituent loads in urban stormwater, Maricopa County, Arizona

    USGS Publications Warehouse

    Fossum, Kenneth D.; O'Day, Christie M.; Wilson, Barbara J.; Monical, Jim E.

    2001-01-01

    Stormwater and streamflow in Maricopa County were monitored to (1) describe the physical, chemical, and toxicity characteristics of stormwater from areas having different land uses, (2) describe the physical, chemical, and toxicity characteristics of streamflow from areas that receive urban stormwater, and (3) estimate constituent loads in stormwater. Urban stormwater and streamflow had similar ranges in most constituent concentrations. The mean concentration of dissolved solids in urban stormwater was lower than in streamflow from the Salt River and Indian Bend Wash. Urban stormwater, however, had a greater chemical oxygen demand and higher concentrations of most nutrients. Mean seasonal loads and mean annual loads of 11 constituents and volumes of runoff were estimated for municipalities in the metropolitan Phoenix area, Arizona, by adjusting regional regression equations of loads. This adjustment procedure uses the original regional regression equation and additional explanatory variables that were not included in the original equation. The adjusted equations had standard errors that ranged from 161 to 196 percent. The large standard errors of the prediction result from the large variability of the constituent concentration data used in the regression analysis. Adjustment procedures produced unsatisfactory results for nine of the regressions: suspended solids, dissolved solids, total phosphorus, dissolved phosphorus, total recoverable cadmium, total recoverable copper, total recoverable lead, total recoverable zinc, and storm runoff. These equations had no consistent direction of bias and no other additional explanatory variables correlated with the observed loads. A stepwise-multiple regression or a three-variable regression (total storm rainfall, drainage area, and impervious area) and local data were used to develop local regression equations for these nine constituents. These equations had standard errors from 15 to 183 percent.

  3. Comparative effectiveness research in cancer with observational data.

    PubMed

    Giordano, Sharon H

    2015-01-01

    Observational studies are increasingly being used for comparative effectiveness research. These studies can have the greatest impact when randomized trials are not feasible or when randomized studies have not included the population or outcomes of interest. However, careful attention must be paid to study design to minimize the likelihood of selection biases. Analytic techniques, such as multivariable regression modeling, propensity score analysis, and instrumental variable analysis, can also be used to help address confounding. Oncology has many existing large and clinically rich observational databases that can be used for comparative effectiveness research. With careful study design, observational studies can produce valid results to assess the benefits and harms of a treatment or intervention in representative real-world populations.

  4. Regression of altitude-produced cardiac hypertrophy.

    NASA Technical Reports Server (NTRS)

    Sizemore, D. A.; Mcintyre, T. W.; Van Liere, E. J.; Wilson, M. F.

    1973-01-01

    The rate of regression of cardiac hypertrophy with time has been determined in adult male albino rats. The hypertrophy was induced by intermittent exposure to simulated high altitude. The percentage hypertrophy was much greater (46%) in the right ventricle than in the left (16%). The regression could be adequately fitted to a single exponential function with a half-time of 6.73 ± 0.71 days (90% CI). There was no significant difference in the rates of regression for the two ventricles.
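
    Fitting a single exponential and converting the rate constant to a half-time is a short computation; the sketch below uses invented data points, not the original measurements.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    days = np.array([0, 3, 7, 14, 21, 28], dtype=float)
    hypertrophy = np.array([46, 34, 22, 11, 5, 3], dtype=float)  # % above control

    def single_exp(t, h0, k):
        # h(t) = h0 * exp(-k t); half-time = ln(2) / k
        return h0 * np.exp(-k * t)

    (h0, k), _ = curve_fit(single_exp, days, hypertrophy, p0=(40.0, 0.1))
    print(f"half-time = {np.log(2) / k:.2f} days")
    ```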

  5. Advanced statistical methods for improved data analysis of NASA astrophysics missions

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.

    1992-01-01

    The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.

  6. An evaluation of supervised classifiers for indirectly detecting salt-affected areas at irrigation scheme level

    NASA Astrophysics Data System (ADS)

    Muller, Sybrand Jacobus; van Niekerk, Adriaan

    2016-07-01

    Soil salinity often leads to reduced crop yield and quality and can render soils barren. Irrigated areas are particularly at risk due to intensive cultivation and secondary salinization caused by waterlogging. Regular monitoring of salt accumulation in irrigation schemes is needed to keep its negative effects under control. The dynamic spatial and temporal characteristics of remote sensing can provide a cost-effective solution for monitoring salt accumulation at irrigation scheme level. This study evaluated a range of pan-fused SPOT-5 derived features (spectral bands, vegetation indices, image textures and image transformations) for classifying salt-affected areas in two distinctly different irrigation schemes in South Africa, namely Vaalharts and Breede River. The relationship between the input features and electrical conductivity measurements was investigated using regression modelling (stepwise linear regression, partial least squares regression, curve-fit regression modelling) and supervised classification (maximum likelihood, nearest neighbour, decision tree analysis, support vector machine and random forests). Classification and regression trees and random forest were used to select the most important features for differentiating salt-affected and unaffected areas. The results showed that the regression analyses produced weak models (R² < 0.4). Better results were achieved using the supervised classifiers, but the algorithms tended to over-estimate salt-affected areas. A key finding was that none of the feature sets or classification algorithms stood out as being superior for monitoring salt accumulation at irrigation scheme level. This was attributed to the large variations in the spectral responses of different crop types at different growing stages, coupled with their individual tolerances to saline conditions.

  7. Carbon monoxide mixing ratio inference from gas filter radiometer data

    NASA Technical Reports Server (NTRS)

    Wallio, H. A.; Reichle, H. G., Jr.; Casas, J. C.; Saylor, M. S.; Gormsen, B. B.

    1983-01-01

    A new algorithm has been developed which permits, for the first time, real time data reduction of nadir measurements taken with a gas filter correlation radiometer to determine tropospheric carbon monoxide concentrations. The algorithm significantly reduces the complexity of the equations to be solved while providing accuracy comparable to line-by-line calculations. The method is based on a regression analysis technique using a truncated power series representation of the primary instrument output signals to infer directly a weighted average of trace gas concentration. The results produced by a microcomputer-based implementation of this technique are compared with those produced by the more rigorous line-by-line methods. This algorithm has been used in the reduction of Measurement of Air Pollution from Satellites (MAPS) Shuttle and aircraft data.

  8. Patient casemix classification for medicare psychiatric prospective payment.

    PubMed

    Drozd, Edward M; Cromwell, Jerry; Gage, Barbara; Maier, Jan; Greenwald, Leslie M; Goldman, Howard H

    2006-04-01

    For a proposed Medicare prospective payment system for inpatient psychiatric facility treatment, the authors developed a casemix classification to capture differences in patients' real daily resource use. Primary data on patient characteristics and daily time spent in various activities were collected in a survey of 696 patients from 40 inpatient psychiatric facilities. Survey data were combined with Medicare claims data to estimate intensity-adjusted daily cost. Classification and Regression Trees (CART) analysis of average daily routine and ancillary costs yielded several hierarchical classification groupings. Regression analysis was used to control for facility and day-of-stay effects in order to compare hierarchical models with models based on the recently proposed payment system of the Centers for Medicare & Medicaid Services. CART analysis identified a small set of patient characteristics strongly associated with higher daily costs, including age, psychiatric diagnosis, deficits in daily living activities, and detox or ECT use. A parsimonious, 16-group, fully interactive model that used five major DSM-IV categories and stratified by age, illness severity, deficits in daily living activities, dangerousness, and use of ECT explained 40% (out of a possible 76%) of daily cost variation not attributable to idiosyncratic daily changes within patients. A noninteractive model based on diagnosis-related groups, age, and medical comorbidity had explanatory power of only 32%. A regression model with 16 casemix groups restricted to using "appropriate" payment variables (i.e., those with clinical face validity and low administrative burden that are easily validated and provide proper care incentives) produced more efficient and equitable payments than did a noninteractive system based on diagnosis-related groups.

  9. Dual regression physiological modeling of resting-state EPI power spectra: Effects of healthy aging.

    PubMed

    Viessmann, Olivia; Möller, Harald E; Jezzard, Peter

    2018-02-02

    Aging and disease-related changes in the arteriovasculature have been linked to elevated levels of cardiac cycle-induced pulsatility in the cerebral microcirculation. Functional magnetic resonance imaging (fMRI), acquired fast enough to unalias the cardiac frequency contributions, can be used to study these physiological signals in the brain. Here, we propose an iterative dual regression analysis in the frequency domain to model single voxel power spectra of echo planar imaging (EPI) data using external recordings of the cardiac and respiratory cycles as input. We further show that a data-driven variant, without external physiological traces, produces comparable results. We use this framework to map and quantify cardiac and respiratory contributions in healthy aging. We found a significant increase in the spatial extent of cardiac modulated white matter voxels with age, whereas the overall strength of cardiac-related EPI power did not show an age effect. Copyright © 2018. Published by Elsevier Inc.

  10. On the influence of high-pass filtering on ICA-based artifact reduction in EEG-ERP.

    PubMed

    Winkler, Irene; Debener, Stefan; Müller, Klaus-Robert; Tangermann, Michael

    2015-01-01

    Standard artifact removal methods for electroencephalographic (EEG) signals are either based on Independent Component Analysis (ICA) or they regress out ocular activity measured at electrooculogram (EOG) channels. Successful ICA-based artifact reduction relies on suitable pre-processing. Here we systematically evaluate the effects of high-pass filtering at different frequencies. Offline analyses were based on event-related potential data from 21 participants performing a standard auditory oddball task and an automatic artifactual component classifier method (MARA). As a pre-processing step for ICA, high-pass filtering between 1-2 Hz consistently produced good results in terms of signal-to-noise ratio (SNR), single-trial classification accuracy and the percentage of 'near-dipolar' ICA components. Relative to no artifact reduction, ICA-based artifact removal significantly improved SNR and classification accuracy. This was not the case for a regression-based approach to remove EOG artifacts.
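
    As a pre-processing sketch, the high-pass step can be implemented with a zero-phase Butterworth filter before the ICA decomposition; the 1 Hz cutoff follows the 1-2 Hz range reported above, and the signal here is synthetic.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 250.0                                  # sampling rate, Hz
    t = np.arange(0, 10, 1 / fs)
    drift = 5 * np.sin(2 * np.pi * 0.1 * t)     # slow drift the filter removes
    eeg = drift + np.random.default_rng(9).normal(0, 1, t.size)

    # 4th-order Butterworth high-pass at 1 Hz, applied forwards and backwards
    # (filtfilt) so the filtering is zero-phase.
    b, a = butter(4, 1.0 / (fs / 2), btype="highpass")
    filtered = filtfilt(b, a, eeg)
    # `filtered` would then go into the ICA decomposition (e.g. FastICA).
    ```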

  11. Analysis of reciprocal creatinine plots by two-phase linear regression.

    PubMed

    Rowe, P A; Richardson, R E; Burton, P R; Morgan, A G; Burden, R P

    1989-01-01

    The progression of renal diseases is often monitored by the serial measurement of plasma creatinine. The slope of the linear relation that is frequently found between the reciprocal of creatinine concentration and time delineates the rate of change in renal function. Minor changes in slope, perhaps indicating response to therapeutic intervention, can be difficult to identify and yet be of clinical importance. We describe the application of two-phase linear regression to identify and characterise changes in slope using a microcomputer. The method fits two intersecting lines to the data by computing a least-squares estimate of the position of the slope change and its 95% confidence limits. This avoids the potential bias of fixing the change at a preconceived time corresponding with an alteration in treatment. The program then evaluates the statistical and clinical significance of the slope change and produces a graphical output to aid interpretation.
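
    The breakpoint search at the heart of two-phase linear regression can be sketched as a scan over candidate change points, fitting two intersecting lines (continuity enforced through a hinge term) and keeping the least-squares best; the 95% confidence limits for the breakpoint are omitted here. The data are synthetic.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    t = np.linspace(0, 36, 72)                  # months of follow-up
    y = 1.0 - 0.02 * t + 0.015 * np.maximum(t - 18, 0) + rng.normal(0, 0.01, 72)

    def two_phase_fit(t, y):
        best = None
        for tb in t[2:-2]:                      # candidate change points
            X = np.column_stack([np.ones_like(t), t, np.maximum(t - tb, 0)])
            beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
            sse = float(res[0]) if res.size else np.sum((y - X @ beta) ** 2)
            if best is None or sse < best[0]:
                best = (sse, tb, beta)
        return best

    sse, tb, beta = two_phase_fit(t, y)
    print(f"estimated slope change at t = {tb:.1f}; slopes {beta[1]:.3f} "
          f"and {beta[1] + beta[2]:.3f}")
    ```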

  12. Evaluation of mercury pollution in cultivated and wild plants from two small communities of the Tapajós gold mining reserve, Pará State, Brazil.

    PubMed

    Egler, Silvia G; Rodrigues-Filho, Saulo; Villas-Bôas, Roberto C; Beinhoff, Christian

    2006-09-01

    This study examines total Hg contamination in soil and sediments, and the correlation between total Hg concentrations in soil and vegetables, in two small-scale gold mining areas, São Chico and Creporizinho, in the State of Pará, Brazilian Amazon. Total Hg values in soil samples from both study areas are higher than regional background values (ca. 0.15 mg/kg). At São Chico, mean values in soil samples are higher than at Creporizinho, but without significant differences at the α < 0.05 level. São Chico's aboveground produce samples show significantly higher total Hg levels than samples from Creporizinho. Creporizinho's soil-root produce regression model was significant, with a negative slope; its soil-aboveground and root wild-plant regression models were also significant, with positive slopes. Although aboveground:root ratios were >1 in all of São Chico's produce samples, the soil-plant part regressions were not significant, and Hg uptake probably occurs through the stomata via atmospheric mercury deposition. Wild-plant aboveground:root ratios were <1 at both study areas, and the soil-plant part regressions were significant in samples from Creporizinho, suggesting that these plants function as excluders. The average total Hg content in edible parts of produce was close to the FAO/WHO/JECFA PTWI values in the São Chico area, and much lower in Creporizinho. However, the low gastrointestinal absorption of inorganic Hg reduces its adverse health effects.

  13. Comparison of two methods in deriving a short version of oral health-related quality of life measure.

    PubMed

    Saub, R; Locker, D; Allison, P

    2008-09-01

    To compare two methods of developing short forms of the Malaysian Oral Health Impact Profile (OHIP-M) measure. Cross-sectional data obtained using the long form of the OHIP-M were used to produce two types of OHIP-M short forms, derived using two different methods, namely regression and item-frequency methods. The short version derived using the regression method is known as Reg-SOHIP(M) and that derived using the frequency method as Freq-SOHIP(M). Both short forms contained 14 items. The two forms were then compared in terms of their content, scores, reliability, validity and ability to distinguish between groups. Out of 14 items, only four were common to both. The form derived from the frequency method contained more high-prevalence items and higher scores than the form derived from the regression method. Both methods produced a reliable and valid measure; however, the frequency method produced a measure that was slightly better at distinguishing between groups. Regardless of the method used, both forms performed equally well when tested for their cross-sectional psychometric properties.

  14. Nosocomial colonization due to imipenem-resistant Pseudomonas aeruginosa epidemiologically linked to breast milk feeding in a neonatal intensive care unit.

    PubMed

    Mammina, Caterina; Di Carlo, Paola; Cipolla, Domenico; Casuccio, Alessandra; Tantillo, Matilde; Plano, Maria Rosa Anna; Mazzola, Angela; Corsello, Giovanni

    2008-12-01

    We describe a one-year investigation of colonization by imipenem-resistant, metallo-beta-lactamase (MBL)-producing Pseudomonas aeruginosa in a neonatal intensive care unit (NICU) of the University Hospital of Palermo, Italy. A prospective epidemiological investigation was conducted between January 2003 and January 2004. Rectal swabs were collected twice a week from all neonates throughout their NICU stay. MBL production by imipenem-resistant strains of P aeruginosa was detected by phenotypic and molecular methods. Pulsed-field gel electrophoresis (PFGE) was carried out on all isolates of P aeruginosa. The association between risk factors and colonization by imipenem-resistant or imipenem-susceptible P aeruginosa isolates and other multidrug-resistant Gram-negative (MDRGN) organisms was analyzed for variables present at admission and during the NICU stay. Data analysis was carried out with the Cox proportional hazards regression model. Twenty-two of 210 neonates were colonized with imipenem-resistant, MBL-producing P aeruginosa isolates and 14 by imipenem-susceptible P aeruginosa isolates. A single pulsotype, named A, was shared by all imipenem-resistant isolates. Colonization by P aeruginosa of pulsotype A was positively correlated with breast milk feeding and administration of ampicillin-sulbactam, and inversely correlated with exclusive formula feeding. In the Cox proportional hazards regression model, a birthweight of more than 2500 g and breast milk feeding were independently associated with an increased risk of colonization by MBL-producing P aeruginosa. The results strongly support an association between colonization by a well-defined imipenem-resistant, MBL-producing P aeruginosa strain and breast milk feeding. Such a study may highlight the need to implement strategies to prevent expressed breast milk from becoming a vehicle of health care-associated infections.

  15. Regression: The Apple Does Not Fall Far From the Tree.

    PubMed

    Vetter, Thomas R; Schober, Patrick

    2018-05-15

    Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
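
    To make the regression families named in the tutorial concrete, the sketch below fits a simple linear, a logistic, and a Poisson regression to synthetic outcomes of the appropriate type using statsmodels; it is an illustration, not the tutorial's own code.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(10)
    x = rng.normal(size=300)
    X = sm.add_constant(x)

    y_lin = 2 + 3 * x + rng.normal(0, 1, 300)       # continuous outcome
    y_bin = rng.binomial(1, 1 / (1 + np.exp(-x)))   # binary outcome
    y_cnt = rng.poisson(np.exp(0.5 + 0.3 * x))      # count outcome

    print(sm.OLS(y_lin, X).fit().params)            # simple linear regression
    print(sm.Logit(y_bin, X).fit(disp=0).params)    # logistic regression
    print(sm.Poisson(y_cnt, X).fit(disp=0).params)  # Poisson regression
    ```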

  16. Applied Multiple Linear Regression: A General Research Strategy

    ERIC Educational Resources Information Center

    Smith, Brandon B.

    1969-01-01

    Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)

  17. Study of the role of tumor necrosis factor-α (-308 G/A) and interleukin-10 (-1082 G/A) polymorphisms as potential risk factors to acute kidney injury in patients with severe sepsis using high-resolution melting curve analysis.

    PubMed

    Hashad, Doaa I; Elsayed, Eman T; Helmy, Tamer A; Elawady, Samier M

    2017-11-01

    Septic acute kidney injury (AKI) is a prevalent complication in intensive care units and carries an increased incidence of complications. The aim of the present study was to assess the use of high-resolution melting curve (HRM) analysis in investigating whether the genetic polymorphisms -308 G/A of tumor necrosis factor-α (TNF-α) and -1082 G/A of interleukin-10 (IL-10) may predispose patients diagnosed with severe sepsis to the development of AKI. One hundred and fifty patients with severe sepsis participated in the study, of whom sixty-six developed AKI. Both polymorphisms were studied using HRM analysis. The low-producer genotypes of both the TNF-α and IL-10 polymorphisms were associated with AKI, and in logistic regression analysis the low-producer genotypes remained independent risk factors for AKI. A statistically significant difference was detected between the two groups, with the low-producer genotypes of both the TNF-α (-308 G/A) and IL-10 (-1082 G/A) polymorphisms being more prevalent in patients who developed AKI. Principal conclusions: the low-producer genotypes of both polymorphisms could be considered risk factors for the development of AKI in critically ill patients with severe sepsis; management of this category of patients should therefore be adapted to protect them from deterioration to AKI. HRM genotyping proved to be a highly efficient, simple and cost-effective technique that is well suited to the routine study of large-scale samples.

  18. Producer attitudes and practices related to antimicrobial use in beef cattle in Tennessee.

    PubMed

    Green, Alice L; Carpenter, L Rand; Edmisson, Darryl E; Lane, Clyde D; Welborn, Matt G; Hopkins, Fred M; Bemis, David A; Dunn, John R

    2010-12-01

    To evaluate knowledge, attitudes, and management practices involving antimicrobial use among Tennessee beef producers. Mail survey. A population-based, stratified random sample of 3,000 beef producers across the state. Questionnaires were mailed to beef producers. Questions focused on producer practices related to education, biosecurity, veterinary use, and the purchase and use of antimicrobials. Operation types were categorized as either cow-calf only or multiple operation type (MOT). Associations between various factors and antimicrobial use were evaluated by use of multivariable logistic regression, with the outcome variable being any antimicrobial use (injectable or by mouth) in the past year. Of 3,000 questionnaires mailed, 1,042 (34.7%) were returned. A significantly higher proportion of producers with MOTs reported giving antimicrobials by mouth or by injection than did producers with cow-calf only operations. In addition, higher proportions of producers with MOTs than producers with cow-calf only operations reported treating with macrolides, florfenicol, ceftiofur, and aminoglycosides. In the multivariable analysis, a herd size >50 cattle, participation in Beef Quality Assurance or master beef producer certification programs, quarantining of newly purchased animals, use of written instructions for treating disease, and observation of withdrawal times were associated with a higher likelihood of antimicrobial use. Results suggested that producers who engaged in more progressive farming practices were also more likely to use antimicrobials. Incorporating training on judicious antimicrobial use into educational programs would likely increase awareness of best management practices regarding antimicrobial use.

  19. Old and New Ideas for Data Screening and Assumption Testing for Exploratory and Confirmatory Factor Analysis

    PubMed Central

    Flora, David B.; LaBrish, Cathy; Chalmers, R. Philip

    2011-01-01

    We provide a basic review of the data screening and assumption testing issues relevant to exploratory and confirmatory factor analysis along with practical advice for conducting analyses that are sensitive to these concerns. Historically, factor analysis was developed for explaining the relationships among many continuous test scores, which led to the expression of the common factor model as a multivariate linear regression model with observed, continuous variables serving as dependent variables, and unobserved factors as the independent, explanatory variables. Thus, we begin our paper with a review of the assumptions for the common factor model and data screening issues as they pertain to the factor analysis of continuous observed variables. In particular, we describe how principles from regression diagnostics also apply to factor analysis. Next, because modern applications of factor analysis frequently involve the analysis of the individual items from a single test or questionnaire, an important focus of this paper is the factor analysis of items. Although the traditional linear factor model is well-suited to the analysis of continuously distributed variables, commonly used item types, including Likert-type items, almost always produce dichotomous or ordered categorical variables. We describe how relationships among such items are often not well described by product-moment correlations, which has clear ramifications for the traditional linear factor analysis. An alternative, non-linear factor analysis using polychoric correlations has become more readily available to applied researchers and thus more popular. Consequently, we also review the assumptions and data-screening issues involved in this method. Throughout the paper, we demonstrate these procedures using an historic data set of nine cognitive ability variables. PMID:22403561

  20. Quantitative analysis of titanium-induced artifacts and correlated factors during micro-CT scanning.

    PubMed

    Li, Jun Yuan; Pow, Edmond Ho Nang; Zheng, Li Wu; Ma, Li; Kwong, Dora Lai Wan; Cheung, Lim Kwong

    2014-04-01

    To investigate the impact of the cover screw, resin embedment, and implant angulation on artifacts in microcomputed tomography (micro-CT) scanning of implants. A total of twelve implants were randomly divided into 4 groups: (i) implant only; (ii) implant with cover screw; (iii) implant with resin embedment; and (iv) implant with cover screw and resin embedment. Implants angulated at 0°, 45°, and 90° were scanned by micro-CT. Images were assessed, and the ratio of artifact volume to total volume (AV/TV) was calculated. A stepwise multiple regression analysis was used to determine the significance of the different factors. One-way ANOVA was performed to identify which combination of factors could minimize the artifact. In the regression analysis, implant angulation was identified as the best predictor of artifact among the factors (P < 0.001). Resin embedment also had a significant effect on artifact volume (P = 0.028), while the cover screw did not (P > 0.05). Non-embedded implants with the axis parallel to the X-ray source of the micro-CT produced minimal artifact. Implant angulation and resin embedment affected the artifact volume of micro-CT scanning for implants, while the cover screw did not. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  1. A Skew-t space-varying regression model for the spectral analysis of resting state brain activity.

    PubMed

    Ismail, Salimah; Sun, Wenqi; Nathoo, Farouk S; Babul, Arif; Moiseev, Alexader; Beg, Mirza Faisal; Virji-Babul, Naznin

    2013-08-01

    It is known that in many neurological disorders such as Down syndrome, main brain rhythms shift their frequencies slightly, and characterizing the spatial distribution of these shifts is of interest. This article reports on the development of a Skew-t mixed model for the spatial analysis of resting state brain activity in healthy controls and individuals with Down syndrome. Time series of oscillatory brain activity are recorded using magnetoencephalography, and spectral summaries are examined at multiple sensor locations across the scalp. We focus on the mean frequency of the power spectral density, and use space-varying regression to examine associations with age, gender and Down syndrome across several scalp regions. Spatial smoothing priors are incorporated based on a multivariate Markov random field, and the markedly non-Gaussian nature of the spectral response variable is accommodated by the use of a Skew-t distribution. A range of models representing different assumptions on the association structure and response distribution are examined, and we conduct model selection using the deviance information criterion. Our analysis suggests region-specific differences between healthy controls and individuals with Down syndrome, particularly in the left and right temporal regions, and produces smoothed maps indicating the scalp topography of the estimated differences.

  2. Computer-delivered interventions for reducing alcohol consumption: meta-analysis and meta-regression using behaviour change techniques and theory.

    PubMed

    Black, Nicola; Mullan, Barbara; Sharpe, Louise

    2016-09-01

    The current aim was to examine the effectiveness of behaviour change techniques (BCTs), theory and other characteristics in increasing the effectiveness of computer-delivered interventions (CDIs) to reduce alcohol consumption. Included were randomised studies with a primary aim of reducing alcohol consumption, which compared self-directed CDIs to assessment-only control groups. CDIs were coded for the use of 42 BCTs from an alcohol-specific taxonomy, the use of theory according to a theory coding scheme and general characteristics such as length of the CDI. Effectiveness of CDIs was assessed using random-effects meta-analysis and the association between the moderators and effect size was assessed using univariate and multivariate meta-regression. Ninety-three CDIs were included in at least one analysis and produced small, significant effects on five outcomes (d+ = 0.07-0.15). Larger effects occurred with some personal contact, provision of normative information or feedback on performance, prompting commitment or goal review, the social norms approach and in samples with more women. Smaller effects occurred when information on the consequences of alcohol consumption was provided. These findings can be used to inform both intervention- and theory-development. Intervention developers should focus on, including specific, effective techniques, rather than many techniques or more-elaborate approaches.
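
    The pooling step behind such a random-effects meta-analysis is compact enough to sketch. The DerSimonian-Laird estimator below is a standard choice, though the review does not state which estimator was used; the effect sizes and variances here are invented.

    ```python
    import numpy as np

    d = np.array([0.05, 0.12, 0.20, 0.08, 0.15])    # per-study effect sizes
    v = np.array([0.01, 0.02, 0.015, 0.012, 0.02])  # per-study sampling variances

    w = 1 / v                                       # fixed-effect weights
    fe_mean = np.sum(w * d) / np.sum(w)
    q = np.sum(w * (d - fe_mean) ** 2)              # Cochran's Q
    df = len(d) - 1
    tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

    w_re = 1 / (v + tau2)                           # random-effects weights
    pooled = np.sum(w_re * d) / np.sum(w_re)
    se = np.sqrt(1 / np.sum(w_re))
    print(f"pooled d+ = {pooled:.3f} (SE {se:.3f}, tau^2 = {tau2:.3f})")
    ```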

  3. Multivariate research in areas of phosphorus cast-iron brake shoes manufacturing using the statistical analysis and the multiple regression equations

    NASA Astrophysics Data System (ADS)

    Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.

    2017-05-01

    The braking system is one of the most important and complex subsystems of railway vehicles, especially where safety is concerned. Installing efficient, safe brakes on modern railway vehicles is therefore essential. Considerable attention is now devoted to solving problems connected with high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor influencing the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the investigated subjects are cast-iron brake shoes, which are still widely used on freight wagons. Cast-iron brake shoes - with lamellar graphite and a high phosphorus content (0.8-1.1%) - therefore need special investigation. In order to establish the optimal condition for the cast-iron brake shoes, we proposed a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in cast-iron brake shoe manufacturing because many variables interact with each other simultaneously, and multivariate visualization comes to the fore when researchers have difficulty comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. To capture the multiple correlation between the hardness of the brake shoes and the chemical composition elements, several types of regression models were proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, one in which the maximum and minimum values are easily highlighted, we plotted the regression equations to explain the interaction of the variables and locate the optimal level of each variable for maximal response. The regression coefficients, dispersion and correlation coefficients were calculated with the software Matlab.
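
    The kind of regression surface described, hardness as a polynomial function of composition variables, can be sketched as follows; the composition ranges and coefficients are placeholders, not the paper's measurements (the paper's own calculations were done in Matlab).

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    p = rng.uniform(0.8, 1.1, 30)                   # phosphorus content, %
    c = rng.uniform(3.0, 3.6, 30)                   # carbon content, %
    hb = 220 + 40 * p - 15 * c + 10 * p * c + rng.normal(0, 3, 30)  # hardness

    # Second-order response-surface terms: 1, P, C, P*C, P^2, C^2
    X = np.column_stack([np.ones_like(p), p, c, p * c, p**2, c**2])
    beta, *_ = np.linalg.lstsq(X, hb, rcond=None)
    print(beta)                                     # fitted surface coefficients
    ```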

  4. A comparative evaluation of end-emic and non-endemic region of visceral leishmaniasis (Kala-azar) in India with ground survey and space technology.

    PubMed

    Kesari, Shreekant; Bhunia, Gouri Sankar; Kumar, Vijay; Jeyaram, Algarswamy; Ranjan, Alok; Das, Pradeep

    2011-08-01

    In visceral leishmaniasis, phlebotomine vectors are targets for control measures. Understanding the ecosystem of the vectors is a prerequisite for creating these control measures. This study endeavours to delineate the suitable locations of Phlebotomus argentipes in relation to environmental characteristics in endemic and non-endemic districts in India. A cross-sectional survey was conducted in 25 villages in each district. Environmental data were obtained through remote sensing images and vector density was measured using a CDC light trap. Simple linear regression analysis was used to measure the association between climatic parameters and vector density. Using factor analysis, the relationship between land cover classes and P. argentipes density among the villages in both districts was investigated. The results of the regression analysis indicated that indoor temperature and relative humidity are the best predictors of P. argentipes distribution. Factor analysis confirmed breeding preferences of P. argentipes by landscape element. Minimum Normalised Difference Vegetation Index, marshy land and orchard/settlement produced high loadings in the endemic region, whereas water bodies and dense forest were preferred in non-endemic sites. Soil properties in the two districts were also studied and indicated that soil pH and moisture content are higher in endemic sites than in non-endemic sites. The present study should be utilised to make critical decisions for vector surveillance and the control of Kala-azar vectors.

  5. Does Neostigmine Administration Produce a Clinically Important Increase in Postoperative Nausea and Vomiting?

    PubMed Central

    Cheng, Ching-Rong; Sessler, Daniel I.; Apfel, Christian C.

    2005-01-01

    Neostigmine is used to antagonize neuromuscular blocker-induced residual paralysis. Despite a previous meta-analysis, the effect of neostigmine on postoperative nausea and vomiting (PONV) remains unresolved. We reevaluated the effect of neostigmine on PONV while considering the different anticholinergics as potentially confounding factors. We performed a systematic literature search using Medline, Embase, the Cochrane library, reference listings, and hand searching with no language restriction through December 2004 and identified 10 clinical, randomized, controlled trials evaluating neostigmine's effect on PONV. Data on nausea or vomiting from 933 patients were extracted for the early (0-6 h), delayed (6-24 h), and overall postoperative periods (0-24 h) and analyzed with RevMan 4.2 (Cochrane Collaboration, Oxford, UK) and multiple logistic regression analysis. The combination of neostigmine with either atropine or glycopyrrolate did not significantly increase the incidence of overall (0-24 h) vomiting (relative risk (RR) 0.91 [0.70-1.18], P=0.48) or nausea (RR 1.24 [95% CI: 0.98-1.59], P=0.08). Multiple logistic regression analysis indicated that there was not a significant increase in the risk of vomiting with large compared with small doses of neostigmine. In contrast to a previous analysis, we conclude that there is insufficient evidence that neostigmine increases the risk of PONV. PMID:16243993

  6. External Tank Liquid Hydrogen (LH2) Prepress Regression Analysis Independent Review Technical Consultation Report

    NASA Technical Reports Server (NTRS)

    Parsons, Vickie s.

    2009-01-01

    The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of the regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review, by statistical experts, of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.

  7. Vitamin D Beliefs and Associations with Sunburns, Sun Exposure, and Sun Protection

    PubMed Central

    Kim, Bang Hyun; Glanz, Karen; Nehl, Eric J.

    2012-01-01

    The main objective of this study was to examine certain beliefs about vitamin D and associations with sun exposure, sun protection behaviors, and sunburns. A total of 3,922 lifeguards, pool managers, and parents completed a survey in 2006 about beliefs regarding vitamin D and sun-related behaviors. Multivariate ordinal regression analyses and linear regression analysis were used to examine associations between beliefs and other variables. Results revealed that non-Caucasian lifeguards and pool managers were less likely to agree that they needed to go out in the sun to get enough vitamin D. Lifeguards and parents who were non-Caucasian were less likely to report that sunlight helped the body to produce vitamin D. A stronger belief about the need to go out in the sun to get enough vitamin D predicted more sun exposure for lifeguards. For parents, a stronger belief that they can get enough vitamin D from foods predicted greater sun protection and a stronger belief that sunlight helps the body produce vitamin D predicted lower sun exposure. This study provides information regarding vitamin D beliefs and their association with certain sun related behaviors across different demographic groups that can inform education efforts about vitamin D and sun protection. PMID:22851950

  8. Case matching and the reduction of selection bias in quasi-experiments: The relative importance of pretest measures of outcome, of unreliable measurement, and of mode of data analysis.

    PubMed

    Cook, Thomas D; Steiner, Peter M

    2010-03-01

    In this article, we note the many ontological, epistemological, and methodological similarities between how Campbell and Rubin conceptualize causation. We then explore 3 differences in their written emphases about individual case matching in observational studies. We contend that (a) Campbell places greater emphasis than Rubin on the special role of pretest measures of outcome among matching variables; (b) Campbell is more explicitly concerned with unreliability in the covariates; and (c) for analyzing the outcome, only Rubin emphasizes the advantages of using propensity score over regression methods. To explore how well these 3 factors reduce bias, we reanalyze and review within-study comparisons that contrast experimental and statistically adjusted nonexperimental causal estimates from studies with the same target population and treatment content. In this context, the choice of covariates counts most for reducing selection bias, and the pretest usually plays a special role relative to all the other covariates considered singly. Unreliability in the covariates also influences bias reduction but by less. Furthermore, propensity score and regression methods produce comparable degrees of bias reduction, though these within-study comparisons may not have met the theoretically specified conditions most likely to produce differences due to analytic method.
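
    As a rough illustration of the two adjustment modes contrasted above, the following Python sketch estimates a treatment effect on synthetic observational data, once by regression adjustment and once by nearest-neighbor matching on an estimated propensity score. All variable names and data are hypothetical; this is not the authors' analysis.

    ```python
    # Hypothetical sketch: synthetic observational data with a true effect of 1.0.
    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(0)
    n = 2000
    pretest = rng.normal(size=n)                 # pretest measure of the outcome
    z = rng.normal(size=n)                       # another covariate
    p_treat = 1 / (1 + np.exp(-(0.8 * pretest + 0.4 * z)))
    treated = rng.binomial(1, p_treat)
    outcome = 1.0 * treated + 2.0 * pretest + 0.5 * z + rng.normal(size=n)
    X = np.column_stack([pretest, z])

    # (a) Regression adjustment: regress the outcome on treatment plus covariates.
    reg = LinearRegression().fit(np.column_stack([treated, X]), outcome)
    print("regression-adjusted effect:", round(reg.coef_[0], 3))

    # (b) Propensity-score matching: 1-nearest-neighbor on the estimated score.
    ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]
    t_idx = np.where(treated == 1)[0]
    c_idx = np.where(treated == 0)[0]
    nn = NearestNeighbors(n_neighbors=1).fit(ps[c_idx].reshape(-1, 1))
    _, match = nn.kneighbors(ps[t_idx].reshape(-1, 1))
    ps_effect = np.mean(outcome[t_idx] - outcome[c_idx[match.ravel()]])
    print("propensity-matched effect:", round(ps_effect, 3))
    ```

    On data like these, both approaches recover an effect near 1.0, consistent with the article's finding that the two analytic methods produce comparable bias reduction once the covariates are chosen well.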

  9. Antibacterial and antifungal activities of pyroligneous acid from wood of Eucalyptus urograndis and Mimosa tenuiflora.

    PubMed

    de Souza Araújo, E; Pimenta, A S; Feijó, F M C; Castro, R V O; Fasciotti, M; Monteiro, T V C; de Lima, K M G

    2018-01-01

    This work aimed to evaluate the antibacterial and antifungal activities of two types of pyroligneous acid (PA) obtained from slow pyrolysis of wood of Mimosa tenuiflora and of a hybrid of Eucalyptus urophylla × Eucalyptus grandis. Wood wedges were carbonized at a heating rate of 1.25°C min⁻¹ up to 450°C. Pyrolysis smoke was trapped and condensed to yield liquid products. Crude pyrolysis liquids were bidistilled under 5 mmHg vacuum, yielding purified PA. Multi-antibiotic-resistant strains of Escherichia coli, Pseudomonas aeruginosa (ATCC 27853) and Staphylococcus aureus (ATCC 25923) had their sensitivity to PA evaluated using the agar diffusion test. Two yeasts, Candida albicans (ATCC 10231) and Cryptococcus neoformans, were evaluated as well. GC-MS analysis of both PAs was carried out to obtain their chemical composition. Regression analysis was performed and models were adjusted, with the diameter of inhibition halos and PA concentration (100, 50 and 20%) as parameters. Identity of regression models and equality of parameters in polynomial orthogonal equations were verified. Inhibition halos were observed in the range of 15-25 mm in diameter. All micro-organisms were inhibited by both types of PA, even at the lowest concentration of 20%. The results demonstrated the feasibility of using PAs produced from wood species planted on a large scale in Brazil, and their real potential as a basis for producing natural antibacterial and antifungal agents for veterinary and zootechnical applications. © 2017 The Society for Applied Microbiology.

  10. Neighborhood contextual factors, maternal smoking, and birth outcomes: multilevel analysis of the South Carolina PRAMS survey, 2000-2003.

    PubMed

    Nkansah-Amankra, Stephen

    2010-08-01

    Previous studies investigating relationships among neighborhood contexts, maternal smoking behaviors, and birth outcomes (low birth weight [LBW] or preterm births) have produced mixed results. We evaluated independent effects of neighborhood contexts on maternal smoking behaviors and risks of LBW or preterm birth outcomes among mothers participating in the South Carolina Pregnancy Risk Assessment and Monitoring System (PRAMS) survey, 2000-2003. The PRAMS data were geocoded to 2000 U.S. Census data to create a multilevel data structure. We used a multilevel regression analysis (SAS PROC GLIMMIX) to estimate odds ratios (OR) and corresponding 95% confidence intervals (CI). In multivariable logistic regression models, high poverty, predominantly African American neighborhoods, upper quartiles of low education, and the second quartile of neighborhood household crowding were significantly associated with LBW. However, only mothers resident in predominantly African American Census tract areas were at a statistically significantly increased risk of delivering preterm (OR 2.2, 95% CI 1.29-3.78). In addition, residence in medium-poverty neighborhoods remained modestly associated with maternal smoking after adjustment for maternal-level covariates. The results also indicated that maternal smoking has more consistent effects on LBW than on preterm births, particularly for mothers living in deprived neighborhoods. Interventions seeking to improve maternal and child health by reducing smoking during pregnancy need to engage specific community factors that encourage maternal quitting behaviors and reduce smoking relapse rates. Inclusion of maternal-level covariates in neighborhood models without careful consideration of the causal pathway might produce misleading interpretations of the results.

  11. Pretest probability assessment derived from attribute matching

    PubMed Central

    Kline, Jeffrey A; Johnson, Charles L; Pollack, Charles V; Diercks, Deborah B; Hollander, Judd E; Newgard, Craig D; Garvey, J Lee

    2005-01-01

    Background Pretest probability (PTP) assessment plays a central role in diagnosis. This report describes a novel attribute-matching method for generating a PTP for acute coronary syndrome (ACS) and compares it with a validated logistic regression equation (LRE). Methods Eight clinical variables (attributes) were chosen by classification and regression tree analysis of a prospectively collected reference database of 14,796 emergency department (ED) patients evaluated for possible ACS. For attribute matching, a computer program identifies patients within the database who have the exact profile defined by clinician input of the eight attributes. The novel method was compared with the LRE for the ability to produce a PTP estimate <2% in a validation set of 8,120 patients evaluated for possible ACS who did not have ST-segment elevation on ECG. 1,061 patients were excluded prior to validation analysis because of ST-segment elevation (713), missing data (77) or being lost to follow-up (271). Results In the validation set, attribute matching produced 267 unique PTP estimates [median PTP value 6%, 1st–3rd quartile 1–10%], compared with the LRE, which produced 96 unique PTP estimates [median 24%, 1st–3rd quartile 10–30%]. The areas under the receiver operating characteristic curves were 0.74 (95% CI 0.65 to 0.82) for attribute matching and 0.68 (95% CI 0.62 to 0.77) for the LRE. The attribute matching system categorized 1,670 (24%, 95% CI = 23–25%) patients as having a PTP < 2.0%; 28 developed ACS (1.7%, 95% CI = 1.1–2.4%). The LRE categorized 244 (4%, 95% CI = 3–4%) with PTP < 2.0%; four developed ACS (1.6%, 95% CI = 0.4–4.1%). Conclusion Attribute matching estimated a very low PTP for ACS in a significantly larger proportion of ED patients compared with a validated LRE. PMID:16095534
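
    The matching step itself is simple to sketch: filter the reference database to rows that share the clinician-entered profile and take the observed outcome rate among the matches as the PTP. The following Python sketch uses hypothetical column names, not the study's actual attributes.

    ```python
    # Hypothetical sketch of attribute matching; columns and data are invented.
    import pandas as pd

    def attribute_match_ptp(db: pd.DataFrame, profile: dict, outcome_col: str = "acs"):
        """Return the pretest probability among patients exactly matching `profile`."""
        mask = pd.Series(True, index=db.index)
        for attribute, value in profile.items():
            mask &= db[attribute] == value
        matches = db[mask]
        if matches.empty:
            return None  # no exact match in the reference database
        return matches[outcome_col].mean()

    # Toy reference database:
    db = pd.DataFrame({
        "age_over_50": [1, 1, 0, 0, 1],
        "chest_pain":  [1, 1, 1, 0, 1],
        "acs":         [1, 0, 0, 0, 1],
    })
    print(attribute_match_ptp(db, {"age_over_50": 1, "chest_pain": 1}))  # 2/3
    ```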

  12. Epidemiology and characteristics of urinary tract infections in children and adolescents.

    PubMed

    Hanna-Wakim, Rima H; Ghanem, Soha T; El Helou, Mona W; Khafaja, Sarah A; Shaker, Rouba A; Hassan, Sara A; Saad, Randa K; Hedari, Carine P; Khinkarly, Rima W; Hajar, Farah M; Bakhash, Marwan; El Karah, Dima; Akel, Imad S; Rajab, Mariam A; Khoury, Mireille; Dbaibo, Ghassan S

    2015-01-01

    Urinary tract infections (UTIs) are among the most common infections in the pediatric population. Over the last two decades, antibiotic resistance has increased significantly as extended spectrum beta lactamase (ESBL) producing organisms have emerged. The aim of this study is to provide a comprehensive view of the epidemiologic characteristics of UTIs in hospitalized children, examine the risk factors for UTIs caused by ESBL-producing organisms, and determine the resistance patterns in the isolated organisms over the last 10 years. A retrospective chart review was conducted at two Lebanese medical centers. Subjects were identified from the following ICD-9 discharge codes: "Urinary tract infection," "UTI," "Cystitis," and/or "Pyelonephritis." Children less than 18 years of age admitted for UTI between January 1st, 2001 and December 31st, 2011 were included. Cases whose urine culture result did not meet our definition for UTI were excluded. Chi-square, Fisher's exact test, and multivariate logistic regression were used to determine risk factors for ESBL. Linear regression analysis was used to determine resistance patterns. The study included 675 cases with a median age of 16 months and a female predominance of 77.7% (525 cases). Of the 584 cases caused by Escherichia coli or Klebsiella spp., 91 cases (15.5%) were found to involve ESBL-producing organisms. Vesico-ureteral reflux and previous antibiotic use were found to be independent risk factors for ESBL-producing E. coli and Klebsiella spp. (p < 0.05). A significant linear increase in resistance to all generations of cephalosporins (r² = 0.442) and fluoroquinolones (r² = 0.698) was found. The recognition of risk factors for infection with ESBL-producing organisms and the observation of increasing overall resistance to antibiotics warrant further studies that might lead to new recommendations to guide management of UTIs and antibiotic use in children and adolescents.
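
    As a minimal sketch of the trend analysis described above (assuming yearly resistance proportions; the numbers below are invented, not the study's data), a simple linear fit yields a slope and r² analogous to those reported:

    ```python
    # Hypothetical yearly resistance fractions; illustrates the linear trend fit.
    from scipy.stats import linregress

    years      = [2001, 2003, 2005, 2007, 2009, 2011]
    resistance = [0.08, 0.11, 0.13, 0.17, 0.20, 0.24]   # fraction of resistant isolates

    fit = linregress(years, resistance)
    print(f"slope = {fit.slope:.4f} per year, r^2 = {fit.rvalue**2:.3f}, p = {fit.pvalue:.3g}")
    ```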

  13. Bacteraemia due to non-ESBL-producing Escherichia coli O25b:H4 sequence type 131: insights into risk factors, clinical features and outcomes.

    PubMed

    Morales-Barroso, Isabel; López-Cerero, Lorena; Molina, José; Bellido, Mar; Navarro, María Dolores; Serrano, Lara; González-Galán, Verónica; Praena, Julia; Pascual, Alvaro; Rodríguez-Baño, Jesús

    2017-04-01

    The epidemiology and outcomes of bloodstream infections (BSIs) caused by Escherichia coli ST131 isolates not producing extended-spectrum β-lactamases (ESBLs) are not well defined despite being more prevalent than ESBL-producers. In this study, risk factors and the impact on outcome of BSIs caused by non-ESBL-producing ST131 E. coli versus non-ST131 E. coli were investigated. A case-control study was performed in two tertiary centres to identify risk factors for ST131. Molecular methods were used to investigate all E. coli isolates from blood cultures for those belonging to O25b:H4-ST131 clonal group. fimH alleles were characterised in ST131 isolates. Multivariate analysis was performed by logistic regression or Cox regression as appropriate. A total of 33 ST131 E. coli cases and 56 controls were studied. ST131 isolates showed higher rates of resistance to ampicillin and ciprofloxacin; fimH alleles were H30 in 14 isolates (42.4%) and H22 in 12 isolates (36.4%). Only recent surgery (OR = 7.03, 95% CI 1.71-28.84; P = 0.007) and unknown source of bacteraemia (OR = 5.37, 95% CI 0.93-30.81; P = 0.05) were associated with ST131. ST131 isolates showed no association with 30-day mortality, therapeutic failure, presentation with severe sepsis/shock or length of stay. Bacteraemia due to non-ESBL-producing O25b:H4-ST131 E. coli showed few differences in terms of risk factors as well as similar outcome to non-ST131 E. coli. These data support the notion that ST131 strains are not less clinically virulent despite showing increased antimicrobial resistance, but also that they are not more virulent than other clonal groups causing BSI. Copyright © 2017 Elsevier B.V. and International Society of Chemotherapy. All rights reserved.

  14. Relations among soil radon, environmental parameters, volcanic and seismic events at Mt. Etna (Italy)

    NASA Astrophysics Data System (ADS)

    Giammanco, S.; Ferrera, E.; Cannata, A.; Montalto, P.; Neri, M.

    2013-12-01

    From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol probe located on the upper NE flank of Mt. Etna volcano (Italy), close both to the Piano Provenzana fault and to the NE-Rift. Seismic, volcanological and radon data were analysed together with data on environmental parameters, such as air and soil temperature, barometric pressure, and snow and rain fall. In order to find possible correlations among the above parameters, and hence to reveal possible anomalous trends in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through the wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R2 = 0.31). When using 100-day time windows, the R2 values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, similar to multivariate linear regression analysis, the summer period was characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the acquired time series. This approach allowed us to study the relations among different signals in either the time or the frequency domain. It confirmed the results of the previous methods, but also allowed us to recognize correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitation, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Using the above analysis, two periods were recognized when radon variations were significantly correlated with marked soil temperature changes and also with local seismic or volcanic activity. This allowed us to produce two different physical models of soil gas transport that explain the observed anomalies. Our work suggests that in order to make an accurate analysis of the relations among different signals it is necessary to use different techniques that give complementary analytical information. In particular, the wavelet analysis proved to be the most effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
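
    A minimal sketch of the 100-day moving-window regression step, with synthetic stand-ins for the environmental series, might look like the following; the point is only to show how windowed R² values can be tracked over time, as the authors did.

    ```python
    # Hypothetical sketch: rolling-window regression of radon on environmental
    # parameters; all series are synthetic.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    n_days = 500
    env = rng.normal(size=(n_days, 3))            # air T, soil T, pressure (synthetic)
    radon = env @ np.array([0.5, 1.0, -0.3]) + rng.normal(scale=2.0, size=n_days)

    window = 100
    r2 = []
    for start in range(n_days - window):
        X, y = env[start:start + window], radon[start:start + window]
        r2.append(LinearRegression().fit(X, y).score(X, y))
    print("R^2 range across 100-day windows:", round(min(r2), 3), "to", round(max(r2), 3))
    ```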

  15. Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)

    DTIC Science & Technology

    1987-10-01

    [No abstract available; extracted table-of-contents fragments indicate sections on repairing a FADAC printed circuit board, data analysis techniques (multiple linear regression), an example of regression analysis, regression results for all tasks, task grouping for analysis, and removal/replacement of the H60A3 power pack.]

  16. Design and analysis of forward and reverse models for predicting defect accumulation, defect energetics, and irradiation conditions

    DOE PAGES

    Stewart, James A.; Kohnert, Aaron A.; Capolungo, Laurent; ...

    2018-03-06

    The complexity of radiation effects in a material’s microstructure makes developing predictive models a difficult task. In principle, a complete list of all possible reactions between defect species being considered can be used to elucidate damage evolution mechanisms and its associated impact on microstructure evolution. However, a central limitation is that many models use a limited and incomplete catalog of defect energetics and associated reactions. Even for a given model, estimating its input parameters remains a challenge, especially for complex material systems. Here, we present a computational analysis to identify the extent to which defect accumulation, energetics, and irradiation conditions can be determined via forward and reverse regression models constructed and trained from large data sets produced by cluster dynamics simulations. A global sensitivity analysis, via Sobol’ indices, concisely characterizes parameter sensitivity and demonstrates how this can be connected to variability in defect evolution. Based on this analysis, and depending on the definition of what constitutes the input and output spaces, forward and reverse regression models are constructed and allow for the direct calculation of defect accumulation, defect energetics, and irradiation conditions. This computational analysis, exercised on a simplified cluster dynamics model, demonstrates the ability to design predictive surrogate and reduced-order models, and provides guidelines for improving model predictions within the context of forward and reverse engineering of mathematical models for radiation effects in a material’s microstructure.
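
    A compressed sketch of the forward/reverse idea follows, with a made-up toy function standing in for the cluster dynamics simulator and random forests standing in for the paper's regression models (both are assumptions for illustration, not the authors' implementation):

    ```python
    # Toy stand-in for forward/reverse surrogate models.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    inputs = rng.uniform(0.0, 1.0, size=(5000, 3))   # e.g. migration energy, dose rate, T

    def simulator(p):
        # Two toy "observables" standing in for defect accumulation outputs.
        return np.column_stack([p[:, 0] * p[:, 2] + 0.1 * p[:, 1],
                                np.exp(-p[:, 0]) * p[:, 1]])

    outputs = simulator(inputs) + rng.normal(scale=0.01, size=(5000, 2))
    Xtr, Xte, Ytr, Yte = train_test_split(inputs, outputs, random_state=0)

    forward = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xtr, Ytr)
    reverse = RandomForestRegressor(n_estimators=200, random_state=0).fit(Ytr, Xtr)
    print("forward R^2:", round(forward.score(Xte, Yte), 3))
    # The reverse map may be only partially identifiable: here three parameters
    # are inferred from two observables, so its R^2 is typically lower.
    print("reverse R^2:", round(reverse.score(Yte, Xte), 3))
    ```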

  18. Creep-Rupture Data Analysis - Engineering Application of Regression Techniques. Ph.D. Thesis - North Carolina State Univ.

    NASA Technical Reports Server (NTRS)

    Rummler, D. R.

    1976-01-01

    Results are presented from investigations applying regression techniques to develop methodology for creep-rupture data analysis. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for space shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.

  19. A Wireless Electronic Nose System Using a Fe2O3 Gas Sensing Array and Least Squares Support Vector Regression

    PubMed Central

    Song, Kai; Wang, Qi; Liu, Qi; Zhang, Hongquan; Cheng, Yingguo

    2011-01-01

    This paper describes the design and implementation of a wireless electronic nose (WEN) system which can detect the combustible gases methane and hydrogen (CH4/H2) online and estimate their concentrations, either singly or in mixtures. The system is composed of two wireless sensor nodes—a slave node and a master node. The former comprises a Fe2O3 gas sensing array for combustible gas detection, a digital signal processor (DSP) system for real-time sampling and processing of the sensor array data, and a wireless transceiver unit (WTU) by which the detection results can be transmitted to the master node connected to a computer. A type of Fe2O3 gas sensor insensitive to humidity is developed for resistance to environmental influences. A threshold-based least squares support vector regression (LS-SVR) estimator is implemented on the DSP for classification and concentration measurements. Experimental results confirm that LS-SVR produces higher accuracy compared with artificial neural networks (ANNs) and a faster convergence rate than standard support vector regression (SVR). The designed WEN system effectively achieves gas mixture analysis in a real-time process. PMID:22346587
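
    One reason LS-SVR suits a DSP is that training reduces to solving a single linear system rather than a quadratic program. A minimal numpy sketch follows; the kernel width, regularization value, and toy data are illustrative assumptions, not the paper's settings.

    ```python
    # Minimal LS-SVR sketch: training is one linear solve.
    import numpy as np

    def rbf_kernel(A, B, sigma=1.0):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    def lssvr_fit(X, y, gamma=10.0, sigma=1.0):
        # KKT system: [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
        n = len(y)
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = 1.0
        A[1:, 0] = 1.0
        A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
        sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
        return sol[0], sol[1:]                     # bias b, dual weights alpha

    def lssvr_predict(X_train, alpha, b, X_new, sigma=1.0):
        return rbf_kernel(X_new, X_train, sigma) @ alpha + b

    rng = np.random.default_rng(0)
    X = np.linspace(0, 4, 40).reshape(-1, 1)            # e.g. a sensor response feature
    y = np.sin(X).ravel() + 0.05 * rng.normal(size=40)  # e.g. gas concentration
    b, alpha = lssvr_fit(X, y)
    print(lssvr_predict(X, alpha, b, np.array([[1.5]])))
    ```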

  20. Risk of Resource Failure and Toolkit Variation in Small-Scale Farmers and Herders

    PubMed Central

    Collard, Mark; Ruttle, April; Buchanan, Briggs; O’Brien, Michael J.

    2012-01-01

    Recent work suggests that global variation in toolkit structure among hunter-gatherers is driven by risk of resource failure such that as risk of resource failure increases, toolkits become more diverse and complex. Here we report a study in which we investigated whether the toolkits of small-scale farmers and herders are influenced by risk of resource failure in the same way. In the study, we applied simple linear and multiple regression analysis to data from 45 small-scale food-producing groups to test the risk hypothesis. Our results were not consistent with the hypothesis; none of the risk variables we examined had a significant impact on toolkit diversity or on toolkit complexity. It appears, therefore, that the drivers of toolkit structure differ between hunter-gatherers and small-scale food-producers. PMID:22844421

  1. Oil taxation and risks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rodriguez-Padilla, V.

    1992-01-01

    The relationship between the taxation system and the division of risks between the host country governments and the international companies is discussed. The analysis underscores the effect of taxation on the geological and political risks. These two cases are evaluated in two West-African oil-producing countries. It emerges from this that too heavy and regressive taxes greatly increase the risks supported by the two partners. The progressive character of the taxation is a necessary but not a sufficient condition for the reduction of public and private risks. A taxation burden well-balanced among small and large deposits is the best way to reduce the risk due to taxation. The oil-producing countries of this region had made great advances in developing neutral taxation systems but in most cases they must progress further. 15 refs., 3 figs., 1 tab.

  2. The prognostic significance of prostate specific antigen in metastatic hormone-resistant prostate cancer.

    PubMed Central

    Fosså, S. D.; Waehre, H.; Paus, E.

    1992-01-01

    Twenty-seven of 152 patients (18%) with progressing hormone-resistant prostate cancer had normal serum levels of prostate specific antigen (PSA less than or equal to 10 micrograms l-1) when referred for secondary treatment. PSA was significantly correlated with the extent of skeletal metastases (R: 0.35) and the levels of hemoglobin (R: -0.19) and serum alkaline phosphatase (R: 0.30). In a multivariate Cox regression analysis the survival of the 152 patients was not correlated with the PSA level but with the patients' performance status, the level of hemoglobin, and the time between primary hormone treatment and relapse. The failure of serum PSA to predict survival may be explained by a heterogeneous composition of hormone-resistant prostate cancer as regards differentiated and/or PSA-producing vs undifferentiated and/or PSA non-producing cells. PMID:1379059

  3. Weather Impact on Airport Arrival Meter Fix Throughput

    NASA Technical Reports Server (NTRS)

    Wang, Yao

    2017-01-01

    Time-based flow management provides arrival aircraft schedules based on arrival airport conditions, airport capacity, required spacing, and weather conditions. In order to meet the scheduled times at which arrival aircraft can cross an airport arrival meter fix prior to entering the airport terminal airspace, air traffic controllers regulate air traffic. Severe weather may create an airport arrival bottleneck if one or more of the airport arrival meter fixes are partially or completely blocked by the weather and the arrival demand has not been reduced accordingly. Under these conditions, aircraft are frequently put in holding patterns until they can be rerouted. A model that predicts weather-impacted meter fix throughput may help air traffic controllers direct arrival flows into the airport more efficiently, minimizing arrival meter fix congestion. This paper presents an analysis of air traffic flows across arrival meter fixes at Newark Liberty International Airport (EWR). Several scenarios of weather-impacted EWR arrival fix flows are described. Furthermore, multiple linear regression and regression tree ensemble learning approaches for translating multiple-sector Weather Impacted Traffic Indexes (WITI) to EWR arrival meter fix throughputs are examined. These weather translation models are developed and validated using EWR arrival flight and weather data for the period of April-September 2014. This study also compares the performance of the regression tree ensemble with traditional multiple linear regression models for estimating the weather-impacted throughputs at each of the EWR arrival meter fixes. For all meter fixes investigated, the results from the regression tree ensemble weather translation models show a stronger correlation between model outputs and observed meter fix throughputs than that produced by the multiple linear regression method.
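
    A schematic of the model comparison, with synthetic stand-ins for the WITI features and the throughput target, could look like the following; it only illustrates why a tree ensemble can track a saturating weather response that a linear model misses, not the paper's actual data or features.

    ```python
    # Hypothetical sketch: linear regression vs. tree ensemble for a nonlinear
    # weather-to-throughput translation. All data are synthetic.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    witi = rng.uniform(0, 1, size=(600, 8))              # 8 sector WITI features
    # Nonlinear ground truth: throughput saturates as weather impact grows.
    throughput = 40 * np.exp(-2 * witi[:, :3].mean(1)) + rng.normal(0, 2, 600)

    for name, model in [("linear", LinearRegression()),
                        ("forest", RandomForestRegressor(n_estimators=300))]:
        score = cross_val_score(model, witi, throughput, cv=5, scoring="r2").mean()
        print(name, round(score, 3))
    ```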

  4. Regression estimators for generic health-related quality of life and quality-adjusted life years.

    PubMed

    Basu, Anirban; Manca, Andrea

    2012-01-01

    To develop regression models for outcomes with truncated supports, such as health-related quality of life (HRQoL) data, and to account for features typical of such data, such as skewed distributions, spikes at 1 or 0, and heteroskedasticity. Regression estimators based on features of the Beta distribution are proposed. First, both a single-equation and a 2-part model are presented, along with estimation algorithms based on maximum-likelihood, quasi-likelihood, and Bayesian Markov-chain Monte Carlo methods. A novel Bayesian quasi-likelihood estimator is proposed. Second, a simulation exercise is presented to assess the performance of the proposed estimators against ordinary least squares (OLS) regression for a variety of HRQoL distributions that are encountered in practice. Finally, the performance of the proposed estimators is assessed by using them to quantify the treatment effect on QALYs in the EVALUATE hysterectomy trial. Overall model fit is studied using several goodness-of-fit tests, such as Pearson's correlation test, link and reset tests, and a modified Hosmer-Lemeshow test. The simulation results indicate that the proposed methods are more robust in estimating covariate effects than OLS, especially when the effects are large or the HRQoL distribution has a large spike at 1. Quasi-likelihood techniques are more robust than maximum likelihood estimators. When applied to the EVALUATE trial, all but the maximum likelihood estimators produce unbiased estimates of the treatment effect. One- and 2-part Beta regression models provide flexible approaches to regressing outcomes with truncated supports, such as HRQoL, on covariates, after accounting for many idiosyncratic features of the outcome distribution. This work provides applied researchers with a practical set of tools to model outcomes in cost-effectiveness analysis.
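
    As a minimal sketch of the single-equation case (logit mean link, constant precision, outcomes strictly inside the unit interval; the 2-part extension for the spike at 1 is omitted), a Beta regression can be fit by maximum likelihood in a few lines. Data and coefficient values below are invented.

    ```python
    # One-part Beta regression via maximum likelihood; a simplified sketch only.
    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit, gammaln

    def beta_negloglik(params, X, y):
        beta, log_phi = params[:-1], params[-1]
        mu, phi = expit(X @ beta), np.exp(log_phi)
        a, b = mu * phi, (1 - mu) * phi
        ll = (gammaln(phi) - gammaln(a) - gammaln(b)
              + (a - 1) * np.log(y) + (b - 1) * np.log(1 - y))
        return -ll.sum()

    rng = np.random.default_rng(4)
    n = 500
    X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one covariate
    mu_true = expit(X @ np.array([0.5, -0.8]))
    y = rng.beta(mu_true * 20, (1 - mu_true) * 20)           # true precision phi = 20

    fit = minimize(beta_negloglik, x0=np.zeros(X.shape[1] + 1), args=(X, y))
    print("coefficients:", fit.x[:-1], "phi:", np.exp(fit.x[-1]))
    ```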

  5. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  6. A Highly Efficient Design Strategy for Regression with Outcome Pooling

    PubMed Central

    Mitchell, Emily M.; Lyles, Robert H.; Manatunga, Amita K.; Perkins, Neil J.; Schisterman, Enrique F.

    2014-01-01

    The potential for research involving biospecimens can be hindered by the prohibitive cost of performing laboratory assays on individual samples. To mitigate this cost, strategies such as randomly selecting a portion of specimens for analysis or randomly pooling specimens prior to performing laboratory assays may be employed. These techniques, while effective in reducing cost, are often accompanied by a considerable loss of statistical efficiency. We propose a novel pooling strategy based on the k-means clustering algorithm to reduce laboratory costs while maintaining a high level of statistical efficiency when predictor variables are measured on all subjects, but the outcome of interest is assessed in pools. We perform simulations motivated by the BioCycle study to compare this k-means pooling strategy with current pooling and selection techniques under simple and multiple linear regression models. While all of the methods considered produce unbiased estimates and confidence intervals with appropriate coverage, pooling under k-means clustering provides the most precise estimates, closely approximating results from the full data and losing minimal precision as the total number of pools decreases. The benefits of k-means clustering evident in the simulation study are then applied to an analysis of the BioCycle dataset. In conclusion, when the number of lab tests is limited by budget, pooling specimens based on k-means clustering prior to performing lab assays can be an effective way to save money with minimal information loss in a regression setting. PMID:25220822
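
    The scheme described above is easy to sketch: cluster subjects on their fully observed predictors, pool outcomes within clusters, and run a size-weighted regression at the pool level. The following Python sketch uses synthetic data and an illustrative pool count, not the BioCycle setup.

    ```python
    # Hypothetical sketch of k-means pooling before a pooled-outcome regression.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(5)
    n, n_pools = 400, 40
    X = rng.normal(size=(n, 2))                     # predictors known for everyone
    y = X @ np.array([1.5, -0.7]) + rng.normal(scale=0.5, size=n)

    labels = KMeans(n_clusters=n_pools, n_init=10).fit_predict(X)
    X_pool = np.vstack([X[labels == k].mean(0) for k in range(n_pools)])
    y_pool = np.array([y[labels == k].mean() for k in range(n_pools)])  # pooled assay
    sizes = np.bincount(labels, minlength=n_pools)

    fit = LinearRegression().fit(X_pool, y_pool, sample_weight=sizes)
    print("pooled estimates:", fit.coef_)           # close to [1.5, -0.7]
    ```

    Because k-means pools are homogeneous in the predictors, the pool-level means preserve most of the predictor variation, which is why precision loss is small relative to random pooling.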

  7. System dynamic modeling: an alternative method for budgeting.

    PubMed

    Srijariya, Witsanuchai; Riewpaiboon, Arthorn; Chaikledkaew, Usa

    2008-03-01

    To construct, validate, and simulate a system dynamic financial model and compare it against the conventional method. The study was a cross-sectional analysis of secondary data retrieved from the National Health Security Office (NHSO) in the fiscal year 2004. The sample consisted of all emergency patients who received emergency services outside their registered hospital catchment areas. The dependent variable used was the amount of reimbursed money. Two types of model were constructed, namely, a system dynamic model using the STELLA software and a multiple linear regression model. The outputs of both methods were compared. The study covered 284,716 patients from various levels of providers. The system dynamic model had the capability of producing various types of outputs, for example, financial and graphical analyses. For the regression analysis, statistically significant predictors were service type (outpatient or inpatient), operating procedures, length of stay, illness type (accident or not), hospital characteristics, age, and hospital location (adjusted R(2) = 0.74). The total budget arrived at using the system dynamic model and the regression model was US$12,159,614.38 and US$7,301,217.18, respectively, whereas the actual NHSO reimbursement cost was US$12,840,805.69. The study illustrated that the system dynamic model is a useful financial management tool, although it is not easy to construct. The model is not only more accurate in prediction but also more capable of analyzing large and complex real-world situations than the conventional method.

  9. Isovolumic relaxation period as an index of left ventricular relaxation under different afterload conditions--comparison with the time constant of left ventricular pressure decay in the dog.

    PubMed

    Ochi, H; Ikuma, I; Toda, H; Shimada, T; Morioka, S; Moriyama, K

    1989-12-01

    In order to determine whether the isovolumic relaxation period (IRP) reflects left ventricular relaxation under different afterload conditions, 17 anesthetized, open-chest dogs were studied, and the left ventricular pressure decay time constant (T) was calculated. In 12 dogs, angiotensin II and nitroprusside were administered, with the heart rate held constant at 90 beats/min. Multiple linear regression analysis showed that the aortic dicrotic notch pressure (AoDNP) and T were major determinants of IRP, while left ventricular end-diastolic pressure was a minor determinant. Multiple linear regression correlating T with IRP and AoDNP did not further improve the correlation coefficient compared with that between T and IRP alone. We concluded that correction of the IRP by AoDNP is not necessary to predict T. The effects of ascending aortic constriction or angiotensin II on IRP were examined in five dogs after pretreatment with propranolol. Aortic constriction caused a significant decrease in IRP and T, while angiotensin II produced a significant increase in IRP and T. IRP was affected by the change of afterload. However, the IRP and T values were always altered in the same direction. These results demonstrate that IRP can be substituted for T and reflects left ventricular relaxation even under different afterload conditions. We conclude that IRP is a simple parameter that can easily be used to evaluate left ventricular relaxation in clinical situations.

  10. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    PubMed

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
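
    A minimal worked example in the article's spirit (invented blood-pressure data), showing the fitted coefficient, its confidence interval, and a basic residual check:

    ```python
    # Fitting and reading a simple linear regression; data are invented.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    age = rng.uniform(20, 80, 200)
    sbp = 100 + 0.6 * age + rng.normal(0, 10, 200)   # systolic blood pressure

    X = sm.add_constant(age)                          # intercept + predictor
    model = sm.OLS(sbp, X).fit()
    print(model.params)                               # intercept, slope (~0.6 mmHg/year)
    print(model.conf_int())                           # 95% CIs for both coefficients
    print("R^2:", model.rsquared)
    # Pitfall check: residuals should show no pattern against the predictor.
    print("corr(residuals, age):", np.corrcoef(model.resid, age)[0, 1])
    ```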

  11. An improved multiple linear regression and data analysis computer program package

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  12. The comparison between several robust ridge regression estimators in the presence of multicollinearity and multiple outliers

    NASA Astrophysics Data System (ADS)

    Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said

    2014-09-01

    In the presence of multicollinearity and multiple outliers, statistical inference for a linear regression model using ordinary least squares (OLS) estimators would be severely affected and produce misleading results. To overcome this, many approaches have been investigated. These include robust methods, which were reported to be less sensitive to the presence of outliers. In addition, the ridge regression technique has been employed to tackle the multicollinearity problem. In order to mitigate both problems, a combination of ridge regression and robust methods was discussed in this study. The superiority of this approach was examined when multicollinearity and multiple outliers occurred simultaneously in multiple linear regression. This study aimed to look at the performance of several well-known robust estimators (M, MM, and RIDGE) and robust ridge regression estimators, namely the Weighted Ridge M-estimator (WRM), Weighted Ridge MM (WRMM), and Ridge MM (RMM), in such a situation. Results of the study showed that in the presence of simultaneous multicollinearity and multiple outliers (in both the x- and y-directions), RMM and RIDGE are more or less similar in terms of superiority over the other estimators, regardless of the number of observations, level of collinearity, and percentage of outliers used. However, when outliers occurred in only a single direction (the y-direction), the WRMM estimator was the most superior among the robust ridge regression estimators, producing the least variance. In conclusion, robust ridge regression is the best alternative to robust and conventional least squares estimators when dealing with the simultaneous presence of multicollinearity and outliers.
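
    A simplified sketch of the weighted-ridge-M idea follows, combining Huber weights with a ridge penalty via iteratively reweighted least squares. This illustrates the general technique under assumed tuning constants, not the exact estimators compared in the study.

    ```python
    # Huber-weighted ridge regression via IRLS; a simplified illustration.
    import numpy as np

    def huber_ridge(X, y, lam=1.0, c=1.345, n_iter=30):
        n, p = X.shape
        beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)  # ridge start
        for _ in range(n_iter):
            r = y - X @ beta
            s = np.median(np.abs(r)) / 0.6745 + 1e-12          # robust scale (MAD)
            u = np.abs(r / s)
            w = np.where(u <= c, 1.0, c / u)                   # Huber weights
            W = np.diag(w)
            beta = np.linalg.solve(X.T @ W @ X + lam * np.eye(p), X.T @ W @ y)
        return beta

    rng = np.random.default_rng(7)
    x1 = rng.normal(size=100)
    x2 = x1 + rng.normal(scale=0.05, size=100)      # nearly collinear predictor
    X = np.column_stack([x1, x2])
    y = 2 * x1 + 2 * x2 + rng.normal(scale=0.5, size=100)
    y[:5] += 15                                     # gross outliers in the y-direction
    print(huber_ridge(X, y, lam=1.0))
    ```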

  13. Exploring relationships between Dairy Herd Improvement monitors of performance and the Transition Cow Index in Wisconsin dairy herds.

    PubMed

    Schultz, K K; Bennett, T B; Nordlund, K V; Döpfer, D; Cook, N B

    2016-09-01

    Transition cow management has been tracked via the Transition Cow Index (TCI; AgSource Cooperative Services, Verona, WI) since 2006. Transition Cow Index was developed to measure the difference between actual and predicted milk yield at first test day to evaluate the relative success of the transition period program. This project aimed to assess TCI in relation to all commonly used Dairy Herd Improvement (DHI) metrics available through AgSource Cooperative Services. Regression analysis was used to isolate variables that were relevant to TCI, and then principal components analysis and network analysis were used to determine the relative strength and relatedness among variables. Finally, cluster analysis was used to segregate herds based on similarity of relevant variables. The DHI data were obtained from 2,131 Wisconsin dairy herds with a test-day mean of ≥30 cows, which were tested ≥10 times throughout the 2014 calendar year. The original list of 940 DHI variables was reduced through expert-driven selection and regression analysis to 23 variables. The K-means cluster analysis produced 5 distinct clusters. Descriptive statistics were calculated for the 23 variables per cluster grouping. Using principal components analysis, cluster analysis, and network analysis, 4 parameters were isolated as most relevant to TCI: energy-corrected milk; 3 measures of intramammary infection (dry cow cure rate, linear somatic cell count score in primiparous cows, and new infection rate); peak ratio; and days in milk at peak milk production. These variables, together with cow and newborn calf survival measures, form a group of metrics that can be used to assist in the evaluation of overall transition period performance. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  14. Teaching Students Not to Dismiss the Outermost Observations in Regressions

    ERIC Educational Resources Information Center

    Kasprowicz, Tomasz; Musumeci, Jim

    2015-01-01

    One econometric rule of thumb is that greater dispersion in observations of the independent variable improves estimates of regression coefficients and therefore produces better results, i.e., lower standard errors of the estimates. Nevertheless, students often seem to mistrust precisely the observations that contribute the most to this greater…

  15. REGRESSION MODELS THAT RELATE STREAMS TO WATERSHEDS: COPING WITH NUMEROUS, COLLINEAR PREDICTORS

    EPA Science Inventory

    GIS efforts can produce a very large number of watershed variables (climate, land use/land cover and topography, all defined for multiple areas of influence) that could serve as candidate predictors in a regression model of reach-scale stream features. Invariably, many of these ...

  16. [A SAS macro program for batch processing of univariate Cox regression analysis for large databases].

    PubMed

    Yang, Rendong; Xiong, Jie; Peng, Yangqin; Peng, Xiaoning; Zeng, Xiaomin

    2015-02-01

    To implement batch processing of univariate Cox regression analysis for large databases with a SAS macro program. We wrote a SAS macro program in SAS 9.2 that can filter and integrate results and export P values to Excel. The program was used to screen for survival-correlated RNA molecules in ovarian cancer. The SAS macro program accomplished the batch processing of univariate Cox regression analyses and the selection and export of the results. The SAS macro program has potential applications in reducing the workload of statistical analysis and providing a basis for batch processing of univariate Cox regression analysis.
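
    The same batch idea can be sketched outside SAS; for instance, in Python with the lifelines package (column and gene names below are hypothetical), looping univariate Cox fits over many molecules and collecting p-values for export:

    ```python
    # Batch univariate Cox fits; a sketch assuming a survival data frame with
    # one column per gene plus 'time' and 'event' columns.
    import pandas as pd
    from lifelines import CoxPHFitter

    def batch_univariate_cox(df: pd.DataFrame, genes, time_col="time", event_col="event"):
        rows = []
        for gene in genes:
            cph = CoxPHFitter()
            cph.fit(df[[gene, time_col, event_col]],
                    duration_col=time_col, event_col=event_col)
            rows.append({"gene": gene,
                         "hazard_ratio": float(cph.hazard_ratios_[gene]),
                         "p_value": float(cph.summary.loc[gene, "p"])})
        return pd.DataFrame(rows).sort_values("p_value")

    # results = batch_univariate_cox(expression_df, rna_columns)
    # results.to_excel("univariate_cox.xlsx", index=False)
    ```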

  17. Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2009-01-01

    In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…

  18. Selective principal component regression analysis of fluorescence hyperspectral image to assess aflatoxin contamination in corn

    USDA-ARS?s Scientific Manuscript database

    Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...
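
    A schematic of the SPCR idea follows, with a simple random search standing in for the genetic algorithm (an assumption made only to keep the sketch short): select a band subset, run PCA plus regression on those bands, and keep the subset with the best cross-validated score. All data are synthetic.

    ```python
    # Band-subset selection for principal component regression; GA replaced by
    # random search for brevity.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(8)
    n, n_bands = 120, 60
    spectra = rng.normal(size=(n, n_bands))                  # fluorescence bands
    aflatoxin = spectra[:, 10] - 0.5 * spectra[:, 35] + rng.normal(0, 0.3, n)

    best_score, best_bands = -np.inf, None
    for _ in range(200):                                     # stand-in for GA search
        bands = rng.choice(n_bands, size=8, replace=False)
        pcr = make_pipeline(PCA(n_components=4), LinearRegression())
        score = cross_val_score(pcr, spectra[:, bands], aflatoxin, cv=5).mean()
        if score > best_score:
            best_score, best_bands = score, np.sort(bands)
    print("selected bands:", best_bands, "CV R^2:", round(best_score, 3))
    ```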

  19. Magnitude and frequency of floods in Arkansas

    USGS Publications Warehouse

    Hodge, Scott A.; Tasker, Gary D.

    1995-01-01

    Methods are presented for estimating the magnitude and frequency of peak discharges of streams in Arkansas. Regression analyses were developed in which a stream's physical and flood characteristics were related. Four sets of regional regression equations were derived to predict peak discharges with selected recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years on streams draining less than 7,770 square kilometers. The regression analyses indicate that size of drainage area, main channel slope, mean basin elevation, and the basin shape factor were the most significant basin characteristics that affect magnitude and frequency of floods. The region of influence method is included in this report. This method is still being improved and is to be considered only as a second alternative to the standard method of producing regional regression equations. This method estimates unique regression equations for each recurrence interval for each ungaged site. The regression analyses indicate that size of drainage area, main channel slope, mean annual precipitation, mean basin elevation, and the basin shape factor were the most significant basin and climatic characteristics that affect magnitude and frequency of floods for this method. Certain recommendations on the use of this method are provided. A method is described for estimating the magnitude and frequency of peak discharges of streams for urban areas in Arkansas. The method is from a nationwide U.S. Geological Survey flood frequency report which uses urban basin characteristics combined with rural discharges to estimate urban discharges. Annual peak discharges from 204 gaging stations, with drainage areas less than 7,770 square kilometers and at least 10 years of unregulated record, were used in the analysis. These data provide the basis for this analysis and are published in the Appendix of this report as supplemental data. Large rivers such as the Red, Arkansas, White, Black, St. Francis, Mississippi, and Ouachita Rivers have floodflow characteristics that differ from those of smaller tributary streams and were treated individually. Regional regression equations are not applicable to these large rivers. The magnitude and frequency of floods along these rivers are based on specific station data. This section is provided in the Appendix and has not been updated since the last Arkansas flood frequency report (1987b), but is included at the request of the cooperator.

  20. Quality of life of patients who undergo breast reconstruction after mastectomy: effects of personality characteristics.

    PubMed

    Bellino, Silvio; Fenocchio, Marina; Zizza, Monica; Rocca, Giuseppe; Bogetti, Paolo; Bogetto, Filippo

    2011-01-01

    Reconstruction after mastectomy has become an integral part of breast cancer treatment. The effects of psychological factors on quality of life after reconstruction have been poorly investigated. The authors examined clinical and personality characteristics related to quality of life in patients receiving reconstructive surgery. All patients received immediate reconstruction and were evaluated in the week before tissue expander implantation (T0) with a semistructured interview for demographic and clinical characteristics, the Temperament and Character Inventory, the Inventory of Interpersonal Problems, the Short Form Health Survey, the Severity Item of the Clinical Global Impression, the Hamilton Depression Rating Scale, and the Hamilton Anxiety Rating Scale. Assessment with the Short Form was repeated 3 months after expander placement (T1). Statistics were calculated with univariate regression and analysis of variance. Significant variables were included in a multiple regression analysis to identify factors related to the change T1-T0 of the mean of the Short Form-transformed scores. Results were significant when p was less than or equal to 0.05. Fifty-seven women were enrolled. Results of multiple regression analysis showed that the Temperament and Character Inventory personality dimension harm avoidance and the Inventory of Interpersonal Problems domain vindictive/self-centered were significantly and independently related to the change in Short Form mean score. Personality dimensions and patterns of interpersonal functioning produce significant effects on patients' quality of life during breast reconstruction. Patients with high harm avoidance are apprehensive and doubtful. Restoration of body image could help them to reduce social anxiety and insecurity. Vindictive/self-centered patients are resentful and aggressive. Breast reconstruction could symbolize the conclusion of a reparative process and fulfill the desire of revenge on cancer.

  1. Melamine detection by mid- and near-infrared (MIR/NIR) spectroscopy: a quick and sensitive method for dairy products analysis including liquid milk, infant formula, and milk powder.

    PubMed

    Balabin, Roman M; Smirnov, Sergey V

    2011-07-15

    Melamine (2,4,6-triamino-1,3,5-triazine) is a nitrogen-rich chemical implicated in the pet and human food recalls and in the global food safety scares involving milk products. Due to the serious health concerns associated with melamine consumption and the extensive scope of affected products, rapid and sensitive methods to detect melamine's presence are essential. We propose the use of spectroscopy data, produced in particular by near-infrared (near-IR/NIR) and mid-infrared (mid-IR/MIR) spectroscopies, for melamine detection in complex dairy matrixes. None of the IR-based methods for melamine detection reported to date has unambiguously shown wide applicability to different dairy products or a limit of detection (LOD) below 1 ppm on an independent sample set. It was found that infrared spectroscopy is an effective tool to detect melamine in dairy products, such as infant formula, milk powder, or liquid milk. A LOD below 1 ppm (0.76±0.11 ppm) can be reached if a correct spectrum preprocessing (pretreatment) technique and a correct multivariate data analysis (MDA) algorithm, such as partial least squares regression (PLS), polynomial PLS (Poly-PLS), an artificial neural network (ANN), support vector regression (SVR), or a least squares support vector machine (LS-SVM), are used for spectrum analysis. The relationship between the MIR/NIR spectrum of milk products and melamine content is nonlinear. Thus, nonlinear regression methods are needed to correctly predict the triazine-derivative content of milk products. It can be concluded that mid- and near-infrared spectroscopy can be regarded as a quick, sensitive, robust, and low-cost method for liquid milk, infant formula, and milk powder analysis. Copyright © 2011 Elsevier B.V. All rights reserved.
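
    As a minimal calibration sketch with one of the named algorithms (PLS, here via scikit-learn) on synthetic spectra; real use would add the preprocessing step the abstract emphasizes, and the data below are invented.

    ```python
    # PLS calibration of melamine content from synthetic spectra.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(9)
    n, n_wavelengths = 200, 300
    spectra = rng.normal(size=(n, n_wavelengths))            # NIR/MIR absorbances
    melamine_ppm = 5 * spectra[:, 50] + 3 * spectra[:, 120] + rng.normal(0, 0.2, n)

    X_tr, X_te, y_tr, y_te = train_test_split(spectra, melamine_ppm, random_state=0)
    pls = PLSRegression(n_components=10).fit(X_tr, y_tr)
    print("test R^2:", round(pls.score(X_te, y_te), 3))
    ```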

  2. [RS estimation of inventory parameters and carbon storage of moso bamboo forest based on synergistic use of object-based image analysis and decision tree].

    PubMed

    Du, Hua Qiang; Sun, Xiao Yan; Han, Ning; Mao, Fang Jie

    2017-10-01

    By synergistically using object-based image analysis (OBIA) and classification and regression tree (CART) methods, the distribution, the inventory indexes (diameter at breast height, tree height, and crown closure), and the aboveground carbon storage (AGC) of moso bamboo forest in Shanchuan Town, Anji County, Zhejiang Province were investigated. The results showed that the moso bamboo forest could be accurately delineated by integrating the multi-scale image segmentation of the OBIA technique with CART, which connected the image objects at various scales, with a good producer's accuracy of 89.1%. The estimation of the indexes by regression tree models constructed from features extracted from the image objects reached moderate or better accuracy, with the crown closure model achieving the best estimation accuracy of 67.9%. The estimation accuracy for diameter at breast height and tree height was relatively low, which is consistent with the conclusion that estimating these variables from optical remote sensing cannot achieve satisfactory results. Estimation of AGC reached relatively high accuracy, with accuracy above 80% in high-value regions.

  3. Complementary nonparametric analysis of covariance for logistic regression in a randomized clinical trial setting.

    PubMed

    Tangen, C M; Koch, G G

    1999-03-01

    In the randomized clinical trial setting, controlling for covariates is expected to produce variance reduction for the treatment parameter estimate and to adjust for random imbalances of covariates between the treatment groups. However, for the logistic regression model, variance reduction is not obviously obtained. This can lead to concerns about the assumptions of the logistic model. We introduce a complementary nonparametric method for covariate adjustment. It provides results that are usually compatible with expectations for analysis of covariance. The only assumptions required are based on randomization and sampling arguments. The resulting treatment parameter is a (unconditional) population average log-odds ratio that has been adjusted for random imbalance of covariates. Data from a randomized clinical trial are used to compare results from the traditional maximum likelihood logistic method with those from the nonparametric logistic method. We examine treatment parameter estimates, corresponding standard errors, and significance levels in models with and without covariate adjustment. In addition, we discuss differences between unconditional population average treatment parameters and conditional subpopulation average treatment parameters. Additional features of the nonparametric method, including stratified (multicenter) and multivariate (multivisit) analyses, are illustrated. Extensions of this methodology to the proportional odds model are also made.

  4. Different behavioral effect dose–response profiles in mice exposed to two-carbon chlorinated hydrocarbons: Influence of structural and physical properties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Umezu, Toyoshi, E-mail: umechan2@nies.go.jp; Shibata, Yasuyuki, E-mail: yshibata@nies.go.jp

    2014-09-01

    The present study aimed to clarify whether dose–response profiles of acute behavioral effects of 1,2-dichloroethane (DCE), 1,1,1-trichloroethane (TCE), trichloroethylene (TRIC), and tetrachloroethylene (PERC) differ. A test battery involving 6 behavioral endpoints was applied to evaluate the effects of DCE, TCE, TRIC, and PERC in male ICR strain mice under the same experimental conditions. The behavioral effect dose–response profiles of these compounds differed. Regression analysis was used to evaluate the relationship between the dose–response profiles and structural and physical properties of the compounds. Dose–response profile differences correlated significantly with differences in specific structural and physical properties. These results suggest that differences in specific structural and physical properties of DCE, TCE, TRIC, and PERC are responsible for differences in behavioral effects that lead to a variety of dose–response profiles. - Highlights: • We examine effects of 4 chlorinated hydrocarbons on 6 behavioral endpoints in mice. • The behavioral effect dose–response profiles for the 4 compounds are different. • We utilize regression analysis to clarify probable causes of the different profiles. • The compound's physicochemical properties probably produce the different profiles.

  5. Indirect spectrophotometric determination of arbutin, whitening agent through oxidation by periodate and complexation with ferric chloride

    NASA Astrophysics Data System (ADS)

    Barsoom, B. N.; Abdelsamad, A. M. E.; Adib, N. M.

    2006-07-01

    A simple and accurate spectrophotometric method for the determination of arbutin (glycosylated hydroquinone) is described. It is based on the oxidation of arbutin by periodate in the presence of iodate. Excess periodate causes liberation of iodine at pH 8.0. The unreacted periodate is determined by measuring the liberated iodine spectrophotometrically in the wavelength range (300-500 nm). A calibration curve was constructed for more accurate results, and the correlation coefficient of the linear regression analysis was -0.9778. The precision of this method was better than 6.17% R.S.D. (n = 3). Regression analysis of the Beer-Lambert plot shows good correlation in the concentration range 25-125 μg/mL. The identification limit was determined to be 25 μg/mL. A detailed study of the reaction conditions was carried out, including the effects of changing pH, time, temperature and volume of periodate. The validity of the proposed method was tested by analyzing pure and authentic samples containing arbutin; the average percent recovery was 100.86%. An alternative method is also proposed, which involves a complexation reaction between arbutin and ferric chloride solution. The produced complex, which is yellowish-green in color, was determined spectrophotometrically.
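
    For readers unfamiliar with the calibration step, a minimal sketch follows: fit a straight line to absorbance versus concentration and invert it to estimate an unknown. The absorbances are invented; the negative slope mirrors the negative correlation coefficient reported above.

    ```python
    # Hedged calibration-curve sketch; all measurements are invented.
    import numpy as np
    from scipy import stats

    conc = np.array([25, 50, 75, 100, 125])            # μg/mL standards
    absorbance = np.array([0.82, 0.69, 0.55, 0.43, 0.30])

    fit = stats.linregress(conc, absorbance)
    print(f"slope={fit.slope:.5f}, intercept={fit.intercept:.3f}, r={fit.rvalue:.4f}")

    unknown_abs = 0.61                                 # absorbance of an unknown
    print("estimated conc:", (unknown_abs - fit.intercept) / fit.slope)
    ```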

  6. Earth before life.

    PubMed

    Marzban, Caren; Viswanathan, Raju; Yurtsever, Ulvi

    2014-01-09

    A recent study argued, based on data on functional genome size of major phyla, that there is evidence that life may have originated significantly prior to the formation of the Earth. Here a more refined regression analysis is performed in which 1) measurement error is systematically taken into account, and 2) interval estimates (e.g., confidence or prediction intervals) are produced. It is shown that models for which the interval estimate for the time origin of the genome includes the age of the Earth are consistent with the observed data. The appearance of life after the formation of the Earth is consistent with the data set under examination.
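
    As a minimal illustration of the interval estimates the study emphasizes, the sketch below fits an ordinary least squares line and extracts both confidence and prediction intervals for new points; the data are simulated and unrelated to the genome-size measurements.

    ```python
    # Hedged sketch: OLS with confidence and prediction intervals (statsmodels).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x = np.linspace(0, 10, 30)
    y = 2.0 + 0.8 * x + rng.normal(0, 1.0, x.size)

    res = sm.OLS(y, sm.add_constant(x)).fit()

    X_new = np.column_stack([np.ones(2), [11.0, 12.0]])  # extrapolation points
    frame = res.get_prediction(X_new).summary_frame(alpha=0.05)
    print(frame[["mean", "mean_ci_lower", "mean_ci_upper",
                 "obs_ci_lower", "obs_ci_upper"]])
    ```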

  7. Method for evaluating moisture tensions of soils using spectral data

    NASA Technical Reports Server (NTRS)

    Peterson, John B. (Inventor)

    1982-01-01

    A method is disclosed which permits evaluation of soil moisture utilizing remote sensing. Spectral measurements at a plurality of different wavelengths are taken with respect to sample soils, and the bidirectional reflectance factor (BRF) measurements produced are submitted to regression analysis for development of prediction equations relating reflectance to moisture tension. Soil of unknown reflectance and unknown moisture tension is thereafter analyzed for bidirectional reflectance, and the resulting data are utilized to determine the soil moisture tension of the soil as well as to predict the bidirectional reflectance of the soil at other moisture tensions.

  8. The need for pediatric-specific triage criteria: results from the Florida Trauma Triage Study.

    PubMed

    Phillips, S; Rond, P C; Kelly, S M; Swartz, P D

    1996-12-01

    The objective of the Florida Trauma Triage Study was to assess the performance of state-adopted field triage criteria. The study addressed three specific age groups: pediatric (age < 15 years), adult (age 15-54 years), and geriatric (age 55+ years). Since 1990, Florida has used a uniform set of eight triage criteria, known as the trauma scorecard, for triaging adult trauma patients to state-approved trauma centers. However, only five of the criteria are recommended for use with pediatric patients. This article presents the findings regarding the performance of the scorecard when applied to a pediatric population. We used state trauma registry data linked to state hospital discharge data in a retrospective analysis of trauma patients transported by prehospital providers to any acute care hospital within nine selected Florida counties between July 1, 1991, and December 31, 1991. We used cross-table and logistic regression analysis to determine the ability of triage criteria to correctly identify patients who were retrospectively defined as major trauma. We applied the field criteria to physiologic and anatomy/mechanism of injury data contained in the trauma registry to "score" the patient as major or minor trauma. To make our retrospective determination of major or minor trauma we used the protocols developed by an expert medical panel as described by E. J. MacKenzie et al. (1990). We calculated sensitivity, specificity, and the corresponding over- and undertriage rates by comparing patient classifications (major or minor trauma) produced by the triage criteria and the retrospective algorithm. We used logistic regression to identify which triage criteria were statistically significant in predicting major trauma. Pediatric cases accounted for 9.2% of the total study population, 6.0% of all hospitalized cases, and 6.8% of all trauma deaths. Of the 1505 pediatric cases available for analysis, the triage criteria classified 269 cases as expected major trauma and 1236 cases as expected minor trauma. The retrospective algorithm classified 78 cases as expected major trauma and 1427 cases as expected minor trauma. The resulting specificity is 84.8% (15.2% overtriage), and the sensitivity is 66.7% (33.3% undertriage). Logistic regression indicated that, of the eight state-adopted field triage criteria, only the Glasgow coma score, ejection from vehicle, and penetrating injuries have a statistically significant impact on predicting major trauma in pediatric patients. Although the state-adopted trauma scorecard, applied to a pediatric population, produced acceptable overtriage, it did not produce acceptable undertriage. However, our undertriage rate is comparable to the results of other published studies on pediatric trauma. As a result of the Florida Trauma Triage Study, a new pediatric triage instrument was developed. It is currently being field-tested.
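
    The reported rates can be reproduced from the counts above with simple arithmetic; the true/false positive splits below are inferred from the published sensitivity and specificity, not taken from the study's tables.

    ```python
    # Reconstructing the implied 2x2 triage table (inferred, not from the paper).
    retro_major, retro_minor = 78, 1427     # retrospective algorithm
    flagged_major = 269                     # triage criteria

    true_pos = round(0.667 * retro_major)   # 52 correctly flagged
    false_neg = retro_major - true_pos      # 26 undertriaged
    false_pos = flagged_major - true_pos    # 217 overtriaged
    true_neg = retro_minor - false_pos      # 1210

    sensitivity = true_pos / retro_major
    specificity = true_neg / retro_minor
    print(f"sensitivity={sensitivity:.1%}, specificity={specificity:.1%}")
    # -> sensitivity=66.7%, specificity=84.8%, matching the reported rates
    ```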

  9. Development of a User Interface for a Regression Analysis Software Tool

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface's overall design approach.

  10. Regression Analysis and the Sociological Imagination

    ERIC Educational Resources Information Center

    De Maio, Fernando

    2014-01-01

    Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

  11. Improving salt marsh digital elevation model accuracy with full-waveform lidar and nonparametric predictive modeling

    NASA Astrophysics Data System (ADS)

    Rogers, Jeffrey N.; Parrish, Christopher E.; Ward, Larry G.; Burdick, David M.

    2018-03-01

    Salt marsh vegetation tends to increase vertical uncertainty in light detection and ranging (lidar) derived elevation data, often causing the data to become ineffective for analysis of topographic features governing tidal inundation or vegetation zonation. Previous attempts at improving lidar data collected in salt marsh environments range from simply computing and subtracting the global elevation bias to more complex methods such as computing vegetation-specific, constant correction factors. The vegetation specific corrections can be used along with an existing habitat map to apply separate corrections to different areas within a study site. It is hypothesized here that correcting salt marsh lidar data by applying location-specific, point-by-point corrections, which are computed from lidar waveform-derived features, tidal-datum based elevation, distance from shoreline and other lidar digital elevation model based variables, using nonparametric regression will produce better results. The methods were developed and tested using full-waveform lidar and ground truth for three marshes in Cape Cod, Massachusetts, U.S.A. Five different model algorithms for nonparametric regression were evaluated, with TreeNet's stochastic gradient boosting algorithm consistently producing better regression and classification results. Additionally, models were constructed to predict the vegetative zone (high marsh and low marsh). The predictive modeling methods used in this study estimated ground elevation with a mean bias of 0.00 m and a standard deviation of 0.07 m (0.07 m root mean square error). These methods appear very promising for correction of salt marsh lidar data and, importantly, do not require an existing habitat map, biomass measurements, or image based remote sensing data such as multi/hyperspectral imagery.
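
    The following is a minimal sketch of the correction strategy described above: a stochastic gradient boosting regressor (analogous in spirit to TreeNet's algorithm, here via scikit-learn) learns point-by-point elevation corrections from waveform- and DEM-derived features. Feature names and data are invented.

    ```python
    # Hedged sketch: stochastic gradient boosting for lidar elevation correction.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(2)
    n = 400
    X = rng.random((n, 3))  # e.g. waveform width, amplitude, distance from shore
    elevation_error = 0.15 * X[:, 0] + 0.05 * np.sin(6 * X[:, 2]) \
        + rng.normal(0, 0.02, n)

    gbr = GradientBoostingRegressor(subsample=0.5,  # <1.0 makes it "stochastic"
                                    n_estimators=300, learning_rate=0.05,
                                    max_depth=3, random_state=0)
    gbr.fit(X, elevation_error)
    correction = gbr.predict(X)                     # per-point corrections
    print("residual SD:", np.std(elevation_error - correction).round(3))
    ```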

  12. Relationships between harvest time and wine composition in Vitis vinifera L. cv. Cabernet Sauvignon 2. Wine sensory properties and consumer preference.

    PubMed

    Bindon, Keren; Holt, Helen; Williamson, Patricia O; Varela, Cristian; Herderich, Markus; Francis, I Leigh

    2014-07-01

    A series of five Vitis vinifera L. cv Cabernet Sauvignon wines were produced from sequentially-harvested grape parcels, with alcohol concentrations between 12% v/v and 15.5% v/v. A multidisciplinary approach, combining sensory analysis, consumer testing and detailed chemical analysis was used to better define the relationship between grape maturity, wine composition and sensory quality. The sensory attribute ratings for dark fruit, hotness and viscosity increased in wines produced from riper grapes, while the ratings for the attributes red fruit and fresh green decreased. Consumer testing of the wines revealed that the lowest-alcohol wines (12% v/v) were the least preferred and wines with ethanol concentration between 13% v/v and 15.5% v/v were equally liked by consumers. Partial least squares regression identified that many sensory attributes were strongly associated with the compositional data, providing evidence of wine chemical components which are important to wine sensory properties and consumer preferences, and which change as the grapes used for winemaking ripen. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. Multivariate Regression Analysis and Slaughter Livestock,

    DTIC Science & Technology

    (AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  14. Retention payoff-based cost per day open regression equations: Application in a user-friendly decision support tool for investment analysis of automated estrus detection technologies.

    PubMed

    Dolecheck, K A; Heersche, G; Bewley, J M

    2016-12-01

    Assessing the economic implications of investing in automated estrus detection (AED) technologies can be overwhelming for dairy producers. The objectives of this study were to develop new regression equations for estimating the cost per day open (DO) and to apply the results to create a user-friendly, partial budget, decision support tool for investment analysis of AED technologies. In the resulting decision support tool, the end user can adjust herd-specific inputs regarding general management, current reproductive management strategies, and the proposed AED system. Outputs include expected DO, reproductive cull rate, net present value, and payback period for the proposed AED system. Utility of the decision support tool was demonstrated with an example dairy herd created using data from DairyMetrics (Dairy Records Management Systems, Raleigh, NC), Food and Agricultural Policy Research Institute (Columbia, MO), and published literature. Resulting herd size, rolling herd average milk production, milk price, and feed cost were 323 cows, 10,758 kg, $0.41/kg, and $0.20/kg of dry matter, respectively. Automated estrus detection technologies with 2 levels of initial system cost (low: $5,000 vs. high: $10,000), tag price (low: $50 vs. high: $100), and estrus detection rate (low: 60% vs. high: 80%) were compared over a 7-yr investment period. Four scenarios were considered in a demonstration of the investment analysis tool: (1) a herd using 100% visual observation for estrus detection before adopting 100% AED, (2) a herd using 100% visual observation before adopting 75% AED and 25% visual observation, (3) a herd using 100% timed artificial insemination (TAI) before adopting 100% AED, and (4) a herd using 100% TAI before adopting 75% AED and 25% TAI. Net present value in scenarios 1 and 2 was always positive, indicating a positive investment situation. Net present value in scenarios 3 and 4 was always positive in combinations using a $50 tag price and, in scenario 4, also for the combination of $5,000 system cost, $100 tag price, and 80% estrus detection rate. Overall, the payback period ranged from 1.6 yr to greater than 10 yr. Investment analysis demonstration results were highly dependent on assumptions, especially AED system initial investment and labor costs. Dairy producers can use herd-specific inputs with the cost per day open regression equations and the decision support tool to estimate individual herd results. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  15. RRegrs: an R package for computer-aided model selection with multiple regression models.

    PubMed

    Tsiliki, Georgia; Munteanu, Cristian R; Seoane, Jose A; Fernandez-Lozano, Carlos; Sarimveis, Haralambos; Willighagen, Egon L

    2015-01-01

    Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models, thereby raising model reproducibility and comparison issues. Cheminformatics and bioinformatics make extensive use of predictive modelling and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespective of their statistical knowledge, would be valuable if it tested several simple and complex regression models and validation schemes, produced unified reports, and offered the option to be integrated into more extensive studies. Additionally, such methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated, fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending the caret package. The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance, as well as its adaptability in terms of parameter optimization, could make RRegrs a popular framework to assist the initial exploration of predictive models and, with that, the design of more comprehensive in silico screening applications. Graphical abstract: RRegrs is a computer-aided model selection framework for R multiple regression models; it is a fully validated procedure with application to QSAR modelling.

  16. Minimizing bias in biomass allometry: Model selection and log transformation of data

    Treesearch

    Joseph Mascaro; Flint Hughes; Amanda Uowolo; Stefan A. Schnitzer

    2011-01-01

    Nonlinear regression is increasingly used to develop allometric equations for forest biomass estimation (i.e., as opposed to the traditional approach of log-transformation followed by linear regression). Most statistical software packages, however, assume additive errors by default, violating a key assumption of allometric theory and possibly producing spurious models....
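
    A minimal sketch of the two fitting strategies being contrasted, for a power-law allometry y = a * x^b: ordinary least squares on log-transformed data (which assumes multiplicative error) versus nonlinear least squares (which assumes additive error). The data are simulated.

    ```python
    # Hedged sketch: log-linear vs nonlinear fits of y = a * x**b.
    import numpy as np
    from scipy import optimize, stats

    rng = np.random.default_rng(3)
    diameter = rng.uniform(5, 50, 80)
    biomass = 0.1 * diameter**2.4 * rng.lognormal(0, 0.2, 80)  # multiplicative noise

    fit = stats.linregress(np.log(diameter), np.log(biomass))  # traditional
    a_log, b_log = np.exp(fit.intercept), fit.slope

    (a_nl, b_nl), _ = optimize.curve_fit(lambda x, a, b: a * x**b,
                                         diameter, biomass, p0=[0.1, 2.0])
    print(f"log-OLS:   a={a_log:.3f}, b={b_log:.3f}")
    print(f"nonlinear: a={a_nl:.3f}, b={b_nl:.3f}")
    ```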

  17. Pick Your Poisson: A Tutorial on Analyzing Counts of Student Victimization Data

    ERIC Educational Resources Information Center

    Huang, Francis L.; Cornell, Dewey G.

    2012-01-01

    School violence research is often concerned with infrequently occurring events such as counts of the number of bullying incidents or fights a student may experience. Analyzing count data using ordinary least squares regression may produce improbable predicted values, and as a result of regression assumption violations, result in higher Type I…
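
    A minimal Poisson regression sketch of the kind the tutorial recommends in place of OLS for counts; the predictor and counts are invented.

    ```python
    # Hedged sketch: Poisson regression for count outcomes (statsmodels GLM).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    exposure_hours = rng.uniform(0, 10, 300)
    incidents = rng.poisson(np.exp(-1.0 + 0.25 * exposure_hours))

    X = sm.add_constant(exposure_hours)
    res = sm.GLM(incidents, X, family=sm.families.Poisson()).fit()
    print(res.params)   # coefficients on the log-rate scale
    # Unlike OLS, fitted counts exp(Xb) can never be negative.
    ```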

  18. Assistive Technologies for Second-Year Statistics Students Who Are Blind

    ERIC Educational Resources Information Center

    Erhardt, Robert J.; Shuman, Michael P.

    2015-01-01

    At Wake Forest University, a student who is blind enrolled in a second course in statistics. The course covered simple and multiple regression, model diagnostics, model selection, data visualization, and elementary logistic regression. These topics required that the student both interpret and produce three sets of materials: mathematical writing,…

  19. Regression Analysis: Legal Applications in Institutional Research

    ERIC Educational Resources Information Center

    Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.

    2008-01-01

    This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…

  20. RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,

    DTIC Science & Technology

    This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variables, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)

  1. Exploring the modeling of spatiotemporal variations in ambient air pollution within the land use regression framework: Estimation of PM10 concentrations on a daily basis.

    PubMed

    Alam, Md Saniul; McNabola, Aonghus

    2015-05-01

    Estimation of daily average exposure to PM10 (particulate matter with an aerodynamic diameter < 10 μm) using the available fixed-site monitoring stations (FSMs) in a city poses a great challenge. This is because typically FSMs are limited in number when considering the spatial representativeness of their measurements and also because statistical models of citywide exposure have yet to be explored in this context. This paper deals with the latter aspect of this challenge and extends the widely used land use regression (LUR) approach to deal with temporal changes in air pollution and the influence of transboundary air pollution on short-term variations in PM10. Using the concept of multiple linear regression (MLR) modeling, the average daily concentrations of PM10 in two European cities, Vienna and Dublin, were modeled. Models were initially developed using the standard MLR approach in Vienna using the most recently available data. Efforts were subsequently made to (i) assess the stability of model predictions over time and (ii) explore the applicability of nonparametric regression (NPR) and artificial neural networks (ANNs) to deal with the nonlinearity of input variables. The predictive performance of the MLR models of both cities was demonstrated to be stable over time and to produce similar results. However, NPR and ANN further improved predictive performance in both cities. ANN produced the best results, predicting daily PM10 exposure with R2 = 66% for Vienna and 51% for Dublin. In addition, two new predictor variables were also assessed for the Dublin model. The variables representing transboundary air pollution and peak traffic count were found to account for 6.5% and 12.7% of the variation in average daily PM10 concentration. The variable representing transboundary air pollution, derived from air mass history (from back-trajectory analysis) and population density, demonstrated a positive impact on model performance. The implications of this research suggest that it is possible to produce a model of ambient air quality on a citywide scale using readily available data. Most European cities typically have a limited FSM network with average daily concentrations of air pollutants as well as available meteorological, traffic, and land-use data. This research highlights that using these data in combination with advanced statistical techniques such as NPR or ANNs will produce reasonably accurate predictions of ambient air quality across a city, including temporal variations. Therefore, this approach reduces the need for additional measurement data to supplement existing historical records and enables a lower-cost method of air pollution model development for practitioners and policy makers.
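
    As a minimal illustration of the model comparison above, the sketch below fits a multiple linear regression and a small neural network to the same synthetic predictors; the variable names are stand-ins, not the study's land-use covariates.

    ```python
    # Hedged sketch: MLR vs ANN on a mildly nonlinear synthetic response.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(5)
    n = 500
    X = rng.random((n, 3))                  # e.g. traffic, density, wind speed
    pm10 = 20 + 15 * X[:, 0]**2 + 5 * X[:, 1] + rng.normal(0, 2, n)

    X_tr, X_te, y_tr, y_te = train_test_split(X, pm10, random_state=0)
    mlr = LinearRegression().fit(X_tr, y_tr)
    ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000,
                       random_state=0).fit(X_tr, y_tr)
    print("MLR R^2:", round(mlr.score(X_te, y_te), 3))
    print("ANN R^2:", round(ann.score(X_te, y_te), 3))  # handles the nonlinearity
    ```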

  2. A study to ascertain the viability of ultrasonic nondestructive testing to determine the mechanical characteristics of wood/agricultural hardboards with soybean based adhesives

    NASA Astrophysics Data System (ADS)

    Colen, Charles Raymond, Jr.

    There have been numerous studies with ultrasonic nondestructive testing and wood fiber composites. The problem of the study was to ascertain whether ultrasonic nondestructive testing can be used in place of destructive testing to obtain the modulus of elasticity (MOE) of the wood/agricultural material with comparable results. The uniqueness of this research is that it addressed the type of content (cornstalks and switchgrass) being used with the wood fibers and the type of adhesives (soybean-based) associated with the production of these composite materials. Two research questions were addressed in the study. The major objective was to determine if one can predict the destructive test MOE value based on the nondestructive test MOE value. The population of the study was wood/agricultural fiberboards made from wood fibers, cornstalks, and switchgrass bonded together with soybean-based, urea-formaldehyde, and phenol-formaldehyde adhesives. Correlational analysis was used to determine if there was a relationship between the two tests. Regression analysis was performed to determine a prediction equation for the destructive test MOE value. Data were collected on both procedures using ultrasonic nondestructive testing and 3-point destructive testing. The results produced a simple linear regression model which was adequate for predicting destructive MOE values when the nondestructive MOE value is known. Nearly all of the error in the model equation was attributable to variability in the destructive test MOE values for the composites. The nondestructive MOE values used to produce the linear regression model explained 83% of the variability in the destructive test MOE values. The study also showed that, for the particular destructive test values obtained with the equipment used, the model associated with the study is as good as it could be, given the variability in the results from the destructive tests. In this study, an ultrasonic signal was used to determine the MOE values in nondestructive tests. Future research studies could use the same or other hardboards to examine how the resins affect the ultrasonic signal.

  3. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    PubMed

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value
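
    For reference, the Cochran-Armitage trend statistic can be computed from first principles as below; the yearly counts are invented, and the scores are the ordered period indices.

    ```python
    # Hedged sketch: Cochran-Armitage trend test from scratch.
    import numpy as np
    from scipy import stats

    cases = np.array([120, 135, 150, 170])        # events per period (invented)
    totals = np.array([100000, 101000, 102000, 103000])
    scores = np.array([0, 1, 2, 3])               # ordered period scores

    p_bar = cases.sum() / totals.sum()
    t = np.sum(scores * (cases - totals * p_bar))
    var_t = p_bar * (1 - p_bar) * (
        np.sum(totals * scores**2) - np.sum(totals * scores)**2 / totals.sum())
    z = t / np.sqrt(var_t)
    print(f"z={z:.3f}, two-sided P={2 * stats.norm.sf(abs(z)):.4g}")
    ```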

  4. A primer for biomedical scientists on how to execute model II linear regression analysis.

    PubMed

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
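
    The OLP coefficients discussed above are simple to compute: the slope is the ratio of the standard deviations, signed by the correlation coefficient. A minimal sketch, with simulated data in which both variables carry error:

    ```python
    # Hedged sketch: ordinary least products (geometric mean) regression.
    import numpy as np

    def olp_regression(x, y):
        x, y = np.asarray(x, float), np.asarray(y, float)
        r = np.corrcoef(x, y)[0, 1]
        slope = np.sign(r) * y.std(ddof=1) / x.std(ddof=1)
        intercept = y.mean() - slope * x.mean()
        return slope, intercept

    rng = np.random.default_rng(6)
    truth = rng.uniform(0, 100, 50)
    x = truth + rng.normal(0, 5, 50)      # x measured with error (Model II)
    y = 1.1 * truth + rng.normal(0, 5, 50)
    print(olp_regression(x, y))
    # Bootstrapping pairs and refitting yields 95% CIs, as smatr does.
    ```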

  5. Water quality parameter measurement using spectral signatures

    NASA Technical Reports Server (NTRS)

    White, P. E.

    1973-01-01

    Regression analysis is applied to the problem of measuring water quality parameters from remote sensing spectral signature data. The equations necessary to perform regression analysis are presented and methods of testing the strength and reliability of a regression are described. An efficient algorithm for selecting an optimal subset of the independent variables available for a regression is also presented.

  6. cp-R, an interface to the R programming language for clinical laboratory method comparisons.

    PubMed

    Holmes, Daniel T

    2015-02-01

    Clinical scientists frequently need to compare two different bioanalytical methods as part of assay validation/monitoring. As a matter of necessity, regression methods for quantitative comparison in clinical chemistry, hematology and other clinical laboratory disciplines must allow for error in both the x and y variables. Traditionally the methods popularized by 1) Deming and 2) Passing and Bablok have been recommended. While commercial tools exist, no simple open source tool is available. The purpose of this work was to develop an entirely open-source, GUI-driven program for bioanalytical method comparisons capable of performing these regression methods and able to produce highly customized graphical output. The GUI is written in Python and PyQt4, with R scripts performing the regression and graphical functions. The program can be run from source code or as a pre-compiled binary executable. The software performs three forms of regression and offers weighting where applicable. Confidence bands of the regression are calculated using bootstrapping for the Deming and Passing-Bablok methods. Users can customize regression plots according to the tools available in R and can produce output in any of jpg, png, tiff or bmp format at any desired resolution, or in the ps and pdf vector formats. Bland-Altman plots and some regression diagnostic plots are also generated. Correctness of regression parameter estimates was confirmed against existing R packages. The program allows for rapid and highly customizable graphical output capable of conforming to the publication requirements of any clinical chemistry journal. Quick method comparisons can also be performed and the results cut and pasted into spreadsheet or word processing applications. We present a simple and intuitive open source tool for quantitative method comparison in a clinical laboratory environment. Copyright © 2014 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
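
    As a minimal sketch of one of the methods the program implements, Deming regression has a closed form; delta is the assumed ratio of the y- to x-measurement error variances (1.0 when the two methods are equally noisy). Data below are simulated.

    ```python
    # Hedged sketch: Deming regression for method comparison.
    import numpy as np

    def deming(x, y, delta=1.0):
        x, y = np.asarray(x, float), np.asarray(y, float)
        sxx, syy = np.var(x, ddof=1), np.var(y, ddof=1)
        sxy = np.cov(x, y, ddof=1)[0, 1]
        slope = (syy - delta * sxx +
                 np.sqrt((syy - delta * sxx)**2 + 4 * delta * sxy**2)) / (2 * sxy)
        return slope, y.mean() - slope * x.mean()

    rng = np.random.default_rng(7)
    truth = rng.uniform(1, 10, 40)
    method_x = truth + rng.normal(0, 0.3, 40)          # both methods have error
    method_y = 0.95 * truth + 0.2 + rng.normal(0, 0.3, 40)
    print(deming(method_x, method_y))
    # Bootstrapping the pairs gives the confidence bands the program reports.
    ```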

  7. Evaluating the perennial stream using logistic regression in central Taiwan

    NASA Astrophysics Data System (ADS)

    Ruljigaljig, T.; Cheng, Y. S.; Lin, H. I.; Lee, C. H.; Yu, T. T.

    2014-12-01

    This study produces a perennial stream head potential map based on a logistic regression method within a Geographic Information System (GIS). Perennial stream initiation locations, which indicate where groundwater comes into contact with the surface, were identified in the study area from field surveys. The perennial stream potential map in central Taiwan was constructed using the relationship between perennial streams and their causative factors, such as catchment area, slope gradient, aspect, elevation, groundwater recharge and precipitation. Field surveys of 272 streams were carried out in the study area. The area under the curve for the logistic regression model was calculated as 0.87. The results illustrate the importance of catchment area and groundwater recharge as key factors within the model. The results obtained from the model within the GIS were then used to produce a map of perennial streams and to estimate the locations of perennial stream heads.

  8. Confounding adjustment in comparative effectiveness research conducted within distributed research networks.

    PubMed

    Toh, Sengwee; Gagne, Joshua J; Rassen, Jeremy A; Fireman, Bruce H; Kulldorff, Martin; Brown, Jeffrey S

    2013-08-01

    A distributed research network (DRN) of electronic health care databases, in which data reside behind the firewall of each data partner, can support a wide range of comparative effectiveness research (CER) activities. An essential component of a fully functional DRN is the capability to perform robust statistical analyses to produce valid, actionable evidence without compromising patient privacy, data security, or proprietary interests. We describe the strengths and limitations of different confounding adjustment approaches that can be considered in observational CER studies conducted within DRNs, and the theoretical and practical issues to consider when selecting among them in various study settings. Several methods can be used to adjust for multiple confounders simultaneously, either as individual covariates or as confounder summary scores (eg, propensity scores and disease risk scores), including: (1) centralized analysis of patient-level data, (2) case-centered logistic regression of risk set data, (3) stratified or matched analysis of aggregated data, (4) distributed regression analysis, and (5) meta-analysis of site-specific effect estimates. These methods require different granularities of information be shared across sites and afford investigators different levels of analytic flexibility. DRNs are growing in use and sharing of highly detailed patient-level information is not always feasible in DRNs. Methods that incorporate confounder summary scores allow investigators to adjust for a large number of confounding factors without the need to transfer potentially identifiable information in DRNs. They have the potential to let investigators perform many analyses traditionally conducted through a centralized dataset with detailed patient-level information.

  9. Multiplication factor versus regression analysis in stature estimation from hand and foot dimensions.

    PubMed

    Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha

    2012-05-01

    Estimation of stature is an important parameter in identification of human remains in forensic examinations. The present study aimed to compare the reliability and accuracy of stature estimation and to demonstrate the variability between estimated stature and actual stature using the multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements (hand length, hand breadth, foot length and foot breadth), taken on the left side in each subject, were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. The derived multiplication factors and regression formulae were applied to the hand and foot measurements in the study sample. The stature estimated from the multiplication factors and from regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from the regression analysis method is less than that of the multiplication factor method, thus confirming that regression analysis is better than multiplication factor analysis for stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  10. The beta2- and beta3-adrenoceptor-mediated relaxation induced by fenoterol in guinea pig taenia caecum.

    PubMed

    Akimoto, Yurie; Horinouchi, Takahiro; Tanaka, Yoshio; Koike, Katsuo

    2002-10-01

    Fenoterol, a beta2-adrenoceptor selective agonist, belongs to the arylethanolamine class. To determine the receptor subtypes responsible for beta-adrenoceptor-mediated relaxation of guinea pig taenia caecum, we investigated the effect of fenoterol. Fenoterol caused concentration-dependent relaxation of the guinea pig taenia caecum. Propranolol, bupranolol and butoxamine produced shifts of the concentration-response curve for fenoterol. Schild regression analyses carried out for propranolol, butoxamine and bupranolol against fenoterol gave pA2 values of 8.41, 6.33 and 8.44, respectively. However, in the presence of 3 x 10(-4) M atenolol, 10(-4) M butoxamine and 10(-6) M phentolamine to block the beta1-, beta2- and alpha-adrenoceptor effects, respectively, Schild regression analysis carried out for bupranolol against fenoterol gave a pA2 value of 5.80. These results suggest that the relaxant response to fenoterol in the guinea pig taenia caecum is mediated by both the beta2- and the beta3-adrenoceptors.
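
    For readers unfamiliar with Schild analysis, the pA2 values above come from regressing log(DR - 1) on log[antagonist], where DR is the agonist dose ratio; a competitive antagonist gives unit slope, and the pA2 is minus the x-intercept. A worked sketch with invented dose ratios:

    ```python
    # Hedged sketch: Schild regression and pA2 from invented dose ratios.
    import numpy as np
    from scipy import stats

    antagonist_conc = np.array([1e-8, 1e-7, 1e-6])   # molar
    dose_ratio = np.array([1.3, 4.2, 32.6])          # EC50 shift at each conc

    fit = stats.linregress(np.log10(antagonist_conc), np.log10(dose_ratio - 1))
    pA2 = fit.intercept / fit.slope                  # minus the x-intercept
    print(f"slope={fit.slope:.2f} (about 1 if competitive), pA2={pA2:.2f}")
    ```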

  11. A generalized least squares regression approach for computing effect sizes in single-case research: application examples.

    PubMed

    Maggin, Daniel M; Swaminathan, Hariharan; Rogers, Helen J; O'Keeffe, Breda V; Sugai, George; Horner, Robert H

    2011-06-01

    A new method for deriving effect sizes from single-case designs is proposed. The strategy is applicable to small-sample time-series data with autoregressive errors. The method uses Generalized Least Squares (GLS) to model the autocorrelation of the data and estimate regression parameters to produce an effect size that represents the magnitude of treatment effect from baseline to treatment phases in standard deviation units. In this paper, the method is applied to two published examples using common single case designs (i.e., withdrawal and multiple-baseline). The results from these studies are described, and the method is compared to ten desirable criteria for single-case effect sizes. Based on the results of this application, we conclude with observations about the use of GLS as a support to visual analysis, provide recommendations for future research, and describe implications for practice. Copyright © 2011 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
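
    A minimal sketch of this kind of analysis for a two-phase single-case series: model AR(1) errors with GLS and estimate the baseline-to-treatment shift from a phase dummy. The data and effect are invented; dividing the estimated shift by the residual SD then gives a d-type effect size.

    ```python
    # Hedged sketch: GLS with AR(1) errors for a baseline/treatment series.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    phase = np.r_[np.zeros(10), np.ones(10)]     # 0 = baseline, 1 = treatment
    e = np.zeros(phase.size)
    for t in range(1, e.size):                   # AR(1) noise, rho = 0.4
        e[t] = 0.4 * e[t - 1] + rng.normal(0, 1)
    y = 5 + 3 * phase + e

    res = sm.GLSAR(y, sm.add_constant(phase), rho=1).iterative_fit(maxiter=10)
    print(res.params)    # intercept and treatment-phase shift
    ```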

  12. Weight loss efficacy of a novel mobile Diabetes Prevention Program delivery platform with human coaching

    PubMed Central

    Michaelides, Andreas; Raby, Christine; Wood, Meghan; Farr, Kit

    2016-01-01

    Objective To evaluate the weight loss efficacy of a novel mobile platform delivering the Diabetes Prevention Program. Research Design and Methods 43 overweight or obese adult participants with a diagnosis of prediabetes signed up to receive a 24-week virtual Diabetes Prevention Program with human coaching, through a mobile platform. Weight loss and engagement were the main outcomes, evaluated by repeated measures analysis of variance, backward regression, and mediation regression. Results Weight loss at 16 and 24 weeks was significant, with 56% of starters and 64% of completers losing over 5% body weight. Mean weight loss at 24 weeks was 6.58% in starters and 7.5% in completers. Participants were highly engaged, with 84% of the sample completing 9 lessons or more. In-app actions related to self-monitoring significantly predicted weight loss. Conclusions Our findings support the effectiveness of a uniquely mobile prediabetes intervention, producing weight loss comparable to studies with high engagement, with potential for scalable population health management. PMID:27651911

  13. How to predict the sugariness and hardness of melons: A near-infrared hyperspectral imaging method.

    PubMed

    Sun, Meijun; Zhang, Dong; Liu, Li; Wang, Zheng

    2017-03-01

    Hyperspectral imaging (HSI) in the near-infrared (NIR) region (900-1700 nm) was used for non-intrusive quality measurements (of sweetness and texture) in melons. First, HSI data from melon samples were acquired to extract the spectral signatures. The corresponding sample sweetness and hardness values were recorded using traditional intrusive methods. Partial least squares regression (PLSR), principal component analysis (PCA), support vector machine (SVM), and artificial neural network (ANN) models were created to predict melon sweetness and hardness values from the hyperspectral data. Experimental results for the three types of melons show that PLSR produces the most accurate results. To reduce the high dimensionality of the hyperspectral data, the weighted regression coefficients of the resulting PLSR models were used to identify the most important wavelengths. On the basis of these wavelengths, each image pixel was used to visualize the sweetness and hardness in all the portions of each sample. Copyright © 2016 Elsevier Ltd. All rights reserved.
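
    A minimal sketch of the PLSR step described above: regress sweetness on the spectra and rank wavelengths by the magnitude of the regression coefficients. The spectra are simulated stand-ins for 900-1700 nm hyperspectral pixels.

    ```python
    # Hedged sketch: PLS regression and coefficient-based wavelength ranking.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(9)
    n_samples, n_bands = 60, 200
    wavelengths = np.linspace(900, 1700, n_bands)
    X = rng.normal(0, 1, (n_samples, n_bands))
    sweetness = 2 * X[:, 50] - 1.5 * X[:, 120] + rng.normal(0, 0.5, n_samples)

    pls = PLSRegression(n_components=5).fit(X, sweetness)
    importance = np.abs(pls.coef_).ravel()
    top = np.argsort(importance)[::-1][:5]
    print("most informative wavelengths (nm):", np.sort(wavelengths[top]).round(1))
    ```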

  14. Methods for estimating drought streamflow probabilities for Virginia streams

    USGS Publications Warehouse

    Austin, Samuel H.

    2014-01-01

    Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million daily streamflow values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded 46,704 equations with statistically significant fit statistics and parameter ranges, published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.

  15. Predicting major element mineral/melt equilibria - A statistical approach

    NASA Technical Reports Server (NTRS)

    Hostetler, C. J.; Drake, M. J.

    1980-01-01

    Empirical equations have been developed for calculating the mole fractions of NaO0.5, MgO, AlO1.5, SiO2, KO0.5, CaO, TiO2, and FeO in a solid phase of initially unknown identity given only the composition of the coexisting silicate melt. The approach involves a linear multivariate regression analysis in which solid composition is expressed as a Taylor series expansion of the liquid compositions. An internally consistent precision of approximately 0.94 is obtained, that is, the nature of the liquidus phase in the input data set can be correctly predicted for approximately 94% of the entries. The composition of the liquidus phase may be calculated to better than 5 mol % absolute. An important feature of this 'generalized solid' model is its reversibility; that is, the dependent and independent variables in the linear multivariate regression may be inverted to permit prediction of the composition of a silicate liquid produced by equilibrium partial melting of a polymineralic source assemblage.

  16. Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.

    PubMed

    Ritz, Christian; Van der Vliet, Leana

    2009-09-01

    The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions, variance homogeneity and normality, that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although transformation of the response variable alone is commonly used with linear regression analysis, it is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the deprecation of less-desirable and less-flexible analytical techniques, such as linear interpolation.
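
    As a minimal illustration of the first remedy above, the Box-Cox transformation estimates a power parameter lambda that best normalizes a positive, skewed response before model fitting; the data below are simulated.

    ```python
    # Hedged sketch: Box-Cox transformation of a skewed positive response.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(10)
    response = rng.lognormal(mean=1.0, sigma=0.6, size=100)  # skewed, positive

    transformed, lam = stats.boxcox(response)
    print(f"estimated lambda = {lam:.3f}")
    print("skewness before:", round(float(stats.skew(response)), 3),
          "after:", round(float(stats.skew(transformed)), 3))
    ```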

  17. DTI measures identify mild and moderate TBI cases among patients with complex health problems: A receiver operating characteristic analysis of U.S. veterans.

    PubMed

    Main, Keith L; Soman, Salil; Pestilli, Franco; Furst, Ansgar; Noda, Art; Hernandez, Beatriz; Kong, Jennifer; Cheng, Jauhtai; Fairchild, Jennifer K; Taylor, Joy; Yesavage, Jerome; Wesson Ashford, J; Kraemer, Helena; Adamson, Maheen M

    2017-01-01

    Standard MRI methods are often inadequate for identifying mild traumatic brain injury (TBI). Advances in diffusion tensor imaging now provide potential biomarkers of TBI among white matter fascicles (tracts). However, it is still unclear which tracts are most pertinent to TBI diagnosis. This study ranked fiber tracts on their ability to discriminate patients with and without TBI. We acquired diffusion tensor imaging data from military veterans admitted to a polytrauma clinic (Overall n  = 109; Age: M  = 47.2, SD  = 11.3; Male: 88%; TBI: 67%). TBI diagnosis was based on self-report and neurological examination. Fiber tractography analysis produced 20 fiber tracts per patient. Each tract yielded four clinically relevant measures (fractional anisotropy, mean diffusivity, radial diffusivity, and axial diffusivity). We applied receiver operating characteristic (ROC) analyses to identify the most diagnostic tract for each measure. The analyses produced an optimal cutpoint for each tract. We then used kappa coefficients to rate the agreement of each cutpoint with the neurologist's diagnosis. The tract with the highest kappa was most diagnostic. As a check on the ROC results, we performed a stepwise logistic regression on each measure using all 20 tracts as predictors. We also bootstrapped the ROC analyses to compute the 95% confidence intervals for sensitivity, specificity, and the highest kappa coefficients. The ROC analyses identified two fiber tracts as most diagnostic of TBI: the left cingulum (LCG) and the left inferior fronto-occipital fasciculus (LIF). Like ROC, logistic regression identified LCG as most predictive for the FA measure but identified the right anterior thalamic tract (RAT) for the MD, RD, and AD measures. These findings are potentially relevant to the development of TBI biomarkers. Our methods also demonstrate how ROC analysis may be used to identify clinically relevant variables in the TBI population.
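
    A minimal sketch of the ranking procedure for a single tract measure: find the ROC-optimal cutpoint (Youden's J) and score its agreement with the clinical diagnosis using Cohen's kappa. The FA values and group separation are invented.

    ```python
    # Hedged sketch: ROC cutpoint selection plus kappa against diagnosis.
    import numpy as np
    from sklearn.metrics import roc_curve, cohen_kappa_score

    rng = np.random.default_rng(11)
    diagnosis = rng.integers(0, 2, 109)              # 0 = no TBI, 1 = TBI
    fa = np.where(diagnosis == 1,                    # lower FA with TBI (assumed)
                  rng.normal(0.40, 0.05, 109), rng.normal(0.48, 0.05, 109))

    fpr, tpr, thresholds = roc_curve(diagnosis, -fa)  # negate: low FA = positive
    cut = -thresholds[np.argmax(tpr - fpr)]           # Youden-optimal FA cutpoint

    predicted = (fa <= cut).astype(int)
    print(f"cutpoint FA={cut:.3f}, "
          f"kappa={cohen_kappa_score(diagnosis, predicted):.3f}")
    ```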

  18. Spatial analysis of land use and shallow groundwater vulnerability in the watershed adjacent to Assateague Island National Seashore, Maryland and Virginia, USA

    USGS Publications Warehouse

    LaMotte, A.E.; Greene, E.A.

    2007-01-01

    Spatial relations between land use and groundwater quality in the watershed adjacent to Assateague Island National Seashore, Maryland and Virginia, USA were analyzed by the use of two spatial models. One model used a logit analysis and the other was based on geostatistics. The models were developed and compared on the basis of existing concentrations of nitrate as nitrogen in samples from 529 domestic wells. The models were applied to produce spatial probability maps that show areas in the watershed where concentrations of nitrate in groundwater are likely to exceed a predetermined management threshold value. Maps of the watershed generated by logistic regression and by probability kriging analysis, showing where the probability of nitrate concentrations exceeding 3 mg/L was greater than 0.50, compared favorably. Logistic regression was less dependent on the spatial distribution of sampled wells and identified an additional high-probability area within the watershed that was missed by probability kriging. The spatial probability maps could be used to determine the natural or anthropogenic factors that best explain the occurrence and distribution of elevated concentrations of nitrate (or other constituents) in shallow groundwater. This information can be used by local land-use planners, ecologists, and managers to protect water supplies and identify land-use planning solutions and monitoring programs in vulnerable areas. © 2006 Springer-Verlag.

  19. Determination of urine ionic composition with potentiometric multisensor system.

    PubMed

    Yaroshenko, Irina; Kirsanov, Dmitry; Kartsova, Lyudmila; Sidorova, Alla; Borisova, Irina; Legin, Andrey

    2015-01-01

    The ionic composition of urine is a good indicator of a patient's general condition and allows for diagnostics of certain medical problems such as, e.g., urolithiasis. Due to environmental factors and malnutrition, the number of registered urinary tract cases continuously increases. Most of the methods currently used for urine analysis are expensive, quite laborious and require skilled personnel. The present work deals with a feasibility study of a potentiometric multisensor system of 18 ion-selective and cross-sensitive sensors as an analytical tool for determination of urine ionic composition. In total, 136 samples from patients of the Urolithiasis Laboratory and from healthy people were analyzed by the multisensor system as well as by capillary electrophoresis as a reference method. Various chemometric approaches were implemented to relate the data from electrochemical measurements to the reference data. Logistic regression (LR) was applied for classification of samples into healthy and unhealthy, producing reasonable misclassification rates. Projection on Latent Structures (PLS) regression was applied for quantitative analysis of ionic composition from potentiometric data. Mean relative errors of simultaneous prediction of sodium, potassium, ammonium, calcium, magnesium, chloride, sulfate, phosphate, urate and creatinine from the multisensor system response were in the range 3-13% for independent test sets. This shows good promise for development of a fast and inexpensive alternative method for urine analysis. Copyright © 2014 Elsevier B.V. All rights reserved.

  20. Dietary Magnesium Intake and Metabolic Syndrome in the Adult Population: Dose-Response Meta-Analysis and Meta-Regression

    PubMed Central

    Ju, Sang-Yhun; Choi, Whan-Seok; Ock, Sun-Myeong; Kim, Chul-Min; Kim, Do-Hoon

    2014-01-01

    Increasing evidence has suggested an association between dietary magnesium intake and metabolic syndrome. However, previous research examining dietary magnesium intake and metabolic syndrome has produced mixed results. Our objective was to determine the relationship between dietary magnesium intake and metabolic syndrome in the adult population using a dose-response meta-analysis. We searched the PubMed, Embase and Cochrane Library databases from August 1965 to May 2014. Observational studies reporting risk ratios with 95% confidence intervals (CIs) for metabolic syndrome in ≥3 categories of dietary magnesium intake levels were selected. The data extraction was performed independently by two authors, and the quality of the studies was evaluated using the Risk of Bias Assessment Tool for Nonrandomized Studies (RoBANS). Based on eight cross-sectional studies and two prospective cohort studies, the pooled relative risk of metabolic syndrome per 150 mg/day increment in magnesium intake was 0.88 (95% CI, 0.84–0.93; I2 = 36.3%). The meta-regression model showed a generally linear, inverse relationship between magnesium intake (mg/day) and metabolic syndrome. This dose-response meta-analysis indicates that dietary magnesium intake is significantly and inversely associated with the risk of metabolic syndrome. However, randomized clinical trials will be necessary to address the issue of causality and to determine whether magnesium supplementation is effective for the prevention of metabolic syndrome. PMID:25533010

  1. Extracorporeal circuit for Panton-Valentine leukocidin-producing Staphylococcus aureus necrotizing pneumonia.

    PubMed

    Lavoue, S; Le Gac, G; Gacouin, A; Revest, M; Sohier, L; Mouline, J; Jouneau, S; Flecher, E; Tattevin, P; Tadié, J-M

    2016-09-01

    To describe two cases of Panton-Valentine leukocidin-producing Staphylococcus aureus (PVL-SA) necrotizing pneumonia treated with ECMO, and complete pulmonary evaluation at six months. Retrospective analysis of two patients presenting with severe PVL-SA pneumonia who both underwent complete respiratory function testing and chest CT scan six months after hospital discharge. Indications for ECMO were refractory hypoxia and left ventricular dysfunction associated with right ventricular dilatation. Patients were weaned off ECMO after 52 and 5 days. No ECMO-related hemorrhagic complication was observed. Pulmonary function tests performed at six months were normal and the CT scan showed complete regression of pulmonary injuries. PVL-SA pneumonia is characterized by extensive parenchymal injuries, including necrotic and hemorrhagic complications. ECMO may be used as a salvage treatment without any associated hemorrhagic complication, provided anticoagulant therapy is carefully monitored, and may lead to complete pulmonary recovery at six months. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  2. Risk factors identified associated with tuberculosis in cattle at 11 livestock experiment stations of Punjab Pakistan.

    PubMed

    Javed, M Tariq; Irfan, M; Ali, Imtiaz; Farooqi, Farooq A; Wasiq, M; Cagiola, Monica

    2011-02-01

    The study was carried out in cattle kept at 11 livestock experiment stations of Punjab by using the single comparative cervical intradermal tuberculin (SCCIT) test. Sahiwal was the main breed kept at these farms. Sixty-three percent of animals were between four and 10 years of age. Seventy-six percent of animals weighed between 300 and 400 kg, and 66% produced 5-10 L of milk per day. Animals other than cattle were present at about 64% of these farms. A positive SCCIT test was recorded in 7.6% of animals at the 11 farms. However, the prevalence of tuberculosis varied from 2.0% to 19.3% across these farms. Bivariate frequency analysis showed that the chances of a positive SCCIT test were higher in older animals, in cattle with a higher number of calvings and in those that produced up to 1,800 L of milk. However, the chances of a positive SCCIT test decreased with further increases in milk production. Results of bivariate and/or multivariate logistic regression analysis, after controlling for the farm, showed a significant association of age of cattle, number of calvings, total milk produced, per-day milk, lactation length, presence of sheep at the farm and total number of animals at the farm with a positive SCCIT test. It can be concluded from the study that herd prevalence of tuberculosis was 100%, while animal prevalence was about 8% at these farms. The stronger risk factors identified by logistic analysis were the age of cattle, number of calvings, total milk produced and lactation length, while the presence of sheep at the farm had a protective effect. Copyright © 2010 Elsevier B.V. All rights reserved.

  3. Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

    ERIC Educational Resources Information Center

    Williams, Ryan T.

    2012-01-01

    Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…

  4. A Quality Assessment Tool for Non-Specialist Users of Regression Analysis

    ERIC Educational Resources Information Center

    Argyrous, George

    2015-01-01

    This paper illustrates the use of a quality assessment tool for regression analysis. It is designed for non-specialist "consumers" of evidence, such as policy makers. The tool provides a series of questions such consumers of evidence can ask to interrogate regression analysis, and is illustrated with reference to a recent study published…

  5. A retrospective analysis to identify the factors affecting infection in patients undergoing chemotherapy.

    PubMed

    Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung

    2015-12-01

    This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloids and underlying diabetes mellitus. The logistic regression model achieved a sensitivity of 66.7% and a specificity of 88.9%; the decision tree analysis achieved a sensitivity of 55.0% and a specificity of 89.0%. Overall classification accuracy was 88.0% for the logistic regression and 87.2% for the decision tree analysis. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
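
    The sensitivity/specificity comparison reported above can be reproduced on any binary outcome. Below is a minimal scikit-learn sketch of the same kind of model comparison; the synthetic data, feature count, and tree depth are hypothetical stand-ins, not the study's variables.

    ```python
    # Sketch: compare logistic regression and a decision tree on a binary outcome.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import confusion_matrix, accuracy_score

    # Synthetic, imbalanced data standing in for the 732 patient records
    X, y = make_classification(n_samples=732, n_features=10, weights=[0.7], random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for name, model in [("logistic", LogisticRegression(max_iter=1000)),
                        ("tree", DecisionTreeClassifier(max_depth=4, random_state=0))]:
        y_pred = model.fit(X_train, y_train).predict(X_test)
        tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
        sensitivity = tp / (tp + fn)   # true-positive rate
        specificity = tn / (tn + fp)   # true-negative rate
        print(name, round(sensitivity, 3), round(specificity, 3),
              round(accuracy_score(y_test, y_pred), 3))
    ```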

  6. Visual grading characteristics and ordinal regression analysis during optimisation of CT head examinations.

    PubMed

    Zarb, Francis; McEntee, Mark F; Rainford, Louise

    2015-06-01

    To evaluate visual grading characteristics (VGC) and ordinal regression analysis during head CT optimisation as a potential alternative to visual grading assessment (VGA), traditionally employed to score anatomical visualisation. Patient images (n = 66) were obtained using current and optimised imaging protocols from two CT suites: a 16-slice scanner at the national Maltese centre for trauma and a 64-slice scanner in a private centre. Local resident radiologists (n = 6) performed VGA followed by VGC and ordinal regression analysis. VGC alone indicated that optimised protocols had similar image quality as current protocols. Ordinal logistic regression analysis provided an in-depth evaluation, criterion by criterion allowing the selective implementation of the protocols. The local radiology review panel supported the implementation of optimised protocols for brain CT examinations (including trauma) in one centre, achieving radiation dose reductions ranging from 24 % to 36 %. In the second centre a 29 % reduction in radiation dose was achieved for follow-up cases. The combined use of VGC and ordinal logistic regression analysis led to clinical decisions being taken on the implementation of the optimised protocols. This improved method of image quality analysis provided the evidence to support imaging protocol optimisation, resulting in significant radiation dose savings. • There is need for scientifically based image quality evaluation during CT optimisation. • VGC and ordinal regression analysis in combination led to better informed clinical decisions. • VGC and ordinal regression analysis led to dose reductions without compromising diagnostic efficacy.

  7. Statistical analysis of subjective preferences for video enhancement

    NASA Astrophysics Data System (ADS)

    Woods, Russell L.; Satgunam, PremNandhini; Bronstad, P. Matthew; Peli, Eli

    2010-02-01

    Measuring preferences for moving video quality is harder than for static images due to the fleeting and variable nature of moving video. Subjective preferences for image quality can be tested by observers indicating their preference for one image over another. Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999). Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for the items that are compared (e.g. enhancement levels). However, Thurstone scaling does not determine the statistical significance of the differences between items on that perceptual scale. Recent papers have provided inferential statistical methods that produce an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we demonstrate that binary logistic regression can analyze preferences for enhanced video.
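
    For readers unfamiliar with the technique, the sketch below shows how pairwise preference data can be analyzed with binary logistic regression to recover an interval-like preference scale (a Bradley-Terry-style analogue of Thurstone scaling). The comparison data and item count are hypothetical.

    ```python
    # Sketch: preference scale from pairwise choices via binary logistic regression.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    n_items = 4                      # e.g., four enhancement levels
    # Each tuple: (item_a, item_b, 1 if a preferred over b else 0); made-up data
    comparisons = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 1), (1, 3, 1), (3, 0, 0)]

    X = np.zeros((len(comparisons), n_items))
    y = np.zeros(len(comparisons))
    for i, (a, b, pref_a) in enumerate(comparisons):
        X[i, a], X[i, b] = 1.0, -1.0  # design row: +1 for item a, -1 for item b
        y[i] = pref_a

    # No intercept: only differences between item scores are identified
    model = LogisticRegression(fit_intercept=False, C=10.0).fit(X, y)
    print("scale values (arbitrary origin):", model.coef_.ravel())
    ```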

  8. Statistical comparison of methods for estimating sediment thickness from Horizontal-to-Vertical Spectral Ratio (HVSR) seismic methods: An example from Tylerville, Connecticut, USA

    USGS Publications Warehouse

    Johnson, Carole D.; Lane, John W.

    2016-01-01

    Determining sediment thickness and delineating bedrock topography are important for assessing groundwater availability and characterizing contamination sites. In recent years, the horizontal-to-vertical spectral ratio (HVSR) seismic method has emerged as a non-invasive, cost-effective approach for estimating the thickness of unconsolidated sediments above bedrock. Using a three-component seismometer, this method uses the ratio of the average horizontal- and vertical-component amplitude spectra to produce a spectral ratio curve with a peak at the fundamental resonance frequency. The HVSR method produces clear and repeatable resonance frequency peaks when there is a sharp contrast (>2:1) in acoustic impedance at the sediment/bedrock boundary. Given the resonant frequency, sediment thickness can be determined either by (1) using an estimate of average local sediment shear-wave velocity or by (2) application of a power-law regression equation developed from resonance frequency observations at sites with a range of known depths to bedrock. Two frequently asked questions about the HVSR method are: (1) how accurate are the sediment thickness estimates, and (2) how much do sediment thickness/bedrock depth estimates change when different published regression equations are used? This paper compares and contrasts different approaches for generating HVSR depth estimates, through analysis of HVSR data acquired in the vicinity of Tylerville, Connecticut, USA.
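
    The two depth-estimation routes compared here reduce to two simple formulas, sketched below. The shear-wave velocity and the power-law coefficients are placeholders; the (a, b) values shown are of the order of one published regression (Ibs-von Seht & Wohlenberg, 1999), and site-specific fits should be substituted.

    ```python
    # Two routes from an HVSR resonance frequency f0 to sediment thickness h.
    def thickness_from_velocity(f0_hz, vs_m_per_s):
        """Quarter-wavelength relation: h = Vs / (4 * f0)."""
        return vs_m_per_s / (4.0 * f0_hz)

    def thickness_from_power_law(f0_hz, a=96.0, b=-1.388):
        """Power-law regression h = a * f0**b, fit at sites with known depth."""
        return a * f0_hz ** b

    f0 = 2.5  # Hz, hypothetical measured resonance peak
    print(thickness_from_velocity(f0, vs_m_per_s=250.0))  # 25.0 m
    print(thickness_from_power_law(f0))                   # ~27 m with these (a, b)
    ```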

  9. Family Medicine or Primary Care Residency Selection: Effects of Family Medicine Interest Groups, MD/MPH Dual Degrees, and Rural Medical Education.

    PubMed

    Wei McIntosh, Elizabeth; Morley, Christopher P

    2016-05-01

    If medical schools are to produce primary care physicians (family medicine, pediatrics, or general internal medicine), they must provide educational experiences that enable medical students to maintain existing or form new interests in such careers. This study examined three mechanisms for doing so, at one medical school: participation as an officer in a family medicine interest group (FMIG), completion of a dual medical/public health (MD/MPH) degree program, and participation in a rural medical education (RMED) clinical track. Specialty Match data for students who graduated from the study institution between 2006 and 2015 were included as dependent variables in bivariate analysis (χ²) and logistic regression models, examining FMIG, MD/MPH, and RMED participation as independent predictors of specialty choice (family medicine yes/no, or any primary care (PC) yes/no), controlling for student demographic data. In bivariate χ² analyses, FMIG officership did not significantly predict matching with family medicine or any PC; RMED and MD/MPH education were significant predictors of both family medicine and PC. Binary logistic regression analyses replicated the bivariate findings, controlling for student demographics. Dual MD/MPH and rural medical education had stronger effects in producing primary care physicians than participation in an FMIG as an officer, at one institution. Further study at multiple institutions is warranted.

  10. REGRESSION ANALYSIS OF SEA-SURFACE-TEMPERATURE PATTERNS FOR THE NORTH PACIFIC OCEAN.

    DTIC Science & Technology

    SEA WATER, *SURFACE TEMPERATURE, *OCEANOGRAPHIC DATA, PACIFIC OCEAN, REGRESSION ANALYSIS, STATISTICAL ANALYSIS, UNDERWATER EQUIPMENT, DETECTION, UNDERWATER COMMUNICATIONS, DISTRIBUTION, THERMAL PROPERTIES, COMPUTERS.

  11. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim: This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background: Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design: Discussion paper. Data sources: English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion: Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research: Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion: Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  12. The process and utility of classification and regression tree methodology in nursing research.

    PubMed

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-06-01

    This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.

  13. Sea surface temperature: Observations from geostationary satellites

    NASA Astrophysics Data System (ADS)

    Bates, John J.; Smith, William L.

    1985-11-01

    A procedure is developed for estimating sea surface temperatures (SST) from multispectral image data acquired from the VISSR atmospheric sounder (VAS) on the geostationary GOES satellites. Theoretical regression equations for two and three infrared window channels are empirically tuned by using clear field of view satellite radiances matched with reports of SST from NOAA fixed environmental buoys from 1982. The empirical regression equations are then used to produce daily regional analyses of SST. The daily analyses are used to study the response of SSTs to the passage of Hurricane Alicia (1983) and Hurricane Debbie (1982) and are also used as a first guess surface temperature in the retrieval of atmospheric temperature and moisture profiles over the oceanic regions. Monthly mean SSTs for the western North Atlantic and the eastern equatorial Pacific during March and July 1982 were produced for use in the NASA/JPL SST intercomparison workshop series. Workshop results showed VAS SSTs have a scatter of 0.8°-1.0°C and a slight warm bias with respect to the other measurements of SST. Subsequently, a second set of VAS/buoy matches collected during 1983 and 1984 was used to produce a set of bias-corrected regression relations for VAS.
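
    A common two-channel ("split-window") regression form of the kind tuned against buoy matchups here is sketched below; the coefficients are illustrative placeholders, not the VAS values.

    ```python
    # Generic split-window SST regression; a0, a1, a2 are made-up placeholders.
    def sst_split_window(t11_c, t12_c, a0=1.0, a1=1.0, a2=2.5):
        """SST = a0 + a1*T11 + a2*(T11 - T12); the channel difference corrects
        for water-vapor absorption in the 11/12 micron windows."""
        return a0 + a1 * t11_c + a2 * (t11_c - t12_c)

    print(sst_split_window(t11_c=22.0, t12_c=21.2))  # brightness temps in deg C -> 25.0
    ```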

  14. Advantages of the net benefit regression framework for economic evaluations of interventions in the workplace: a case study of the cost-effectiveness of a collaborative mental health care program for people receiving short-term disability benefits for psychiatric disorders.

    PubMed

    Hoch, Jeffrey S; Dewa, Carolyn S

    2014-04-01

    Economic evaluations commonly accompany trials of new treatments or interventions; however, regression methods and their corresponding advantages for the analysis of cost-effectiveness data are not well known. To illustrate regression-based economic evaluation, we present a case study investigating the cost-effectiveness of a collaborative mental health care program for people receiving short-term disability benefits for psychiatric disorders. We implement net benefit regression to illustrate its strengths and limitations. Net benefit regression offers a simple option for cost-effectiveness analyses of person-level data. By placing economic evaluation in a regression framework, regression-based techniques can facilitate the analysis and provide simple solutions to commonly encountered challenges. Economic evaluations of person-level data (eg, from a clinical trial) should use net benefit regression to facilitate analysis and enhance results.
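
    Net benefit regression itself is compact enough to sketch: each person's cost and effect are collapsed into a net benefit at a chosen willingness-to-pay, and the coefficient on the treatment indicator estimates the incremental net benefit. The data below are simulated, not from the case study.

    ```python
    # Sketch of net benefit regression on simulated person-level data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 200
    treat = rng.integers(0, 2, n)                      # 1 = program, 0 = usual care
    effect = 0.10 * treat + rng.normal(0.5, 0.2, n)    # e.g., QALYs
    cost = 1000 * treat + rng.normal(5000, 800, n)     # dollars

    wtp = 50_000                   # willingness to pay per unit of effect (lambda)
    nb = wtp * effect - cost       # person-level net benefit

    model = sm.OLS(nb, sm.add_constant(treat)).fit()
    print(model.params)            # slope ~ incremental net benefit of treatment
    ```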

  15. Development of A Tsunami Magnitude Scale Based on DART Buoy Data

    NASA Astrophysics Data System (ADS)

    Leiva, J.; Polet, J.

    2016-12-01

    The quantification of tsunami energy has evolved through time, with a number of magnitude and intensity scales employed in the past century. Most of these scales rely on coastal measurements, which may be affected by complexities due to near-shore bathymetric effects and coastal geometries. Moreover, these datasets are generated by tsunami inundation, and thus cannot serve as a means of assessing potential tsunami impact prior to coastal arrival. With the introduction of a network of ocean buoys provided through the Deep-ocean Assessment and Reporting of Tsunamis (DART) project, a dataset has become available that can be exploited to further our current understanding of tsunamis and the earthquakes that excite them. The DART network consists of 39 stations that have produced estimates of sea-surface height as a function of time since 2003, and are able to detect deep-ocean tsunami waves. Data collected at these buoys over the past decade reveal that at least nine major tsunami events, such as the 2011 Tohoku and 2013 Solomon Islands events, produced substantial wave amplitudes across a large distance range that can be used in a tsunami magnitude scale based on DART data. We present preliminary results from the development of a tsunami magnitude scale that follows the methods used in the development of the local magnitude scale by Charles Richter. Analogous to the use of seismic ground motion amplitudes in the calculation of local magnitude, maximum ocean height displacements due to the passage of tsunami waves are related to distance from the source in a least-squares exponential regression analysis. The regression produces attenuation curves based on the DART data, a site correction term, attenuation parameters, and an amplification factor. Initially, single-event regressions are used to constrain the attenuation parameters. Additional iterations use the parameters of these event-based fits as a starting point to obtain a stable solution, and include the calculation of station corrections, in order to obtain a final amplification factor for each event, which is used to calculate its tsunami magnitude.
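
    The Richter-style procedure described here amounts to a log-amplitude versus log-distance regression followed by a distance correction. A toy sketch with synthetic amplitudes and distances (the real analysis adds site and station corrections):

    ```python
    # Sketch: fit an attenuation curve and read off a per-event magnitude.
    import numpy as np

    # One event: max DART wave amplitudes (m) at several source distances (km)
    dist_km = np.array([800.0, 1500.0, 3000.0, 6000.0, 9000.0])
    amp_m = np.array([0.60, 0.35, 0.20, 0.11, 0.08])

    # Least-squares fit: log10(A) = a + b * log10(distance)
    b, a = np.polyfit(np.log10(dist_km), np.log10(amp_m), 1)

    def tsunami_magnitude(amp, dist, ref_dist_km=1000.0):
        """Amplitude corrected to a reference distance via the fitted attenuation."""
        return np.log10(amp) - b * (np.log10(dist) - np.log10(ref_dist_km))

    print(tsunami_magnitude(amp_m, dist_km))  # roughly constant across stations
    ```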

  16. CADDIS Volume 4. Data Analysis: Basic Analyses

    EPA Pesticide Factsheets

    Use of statistical tests to determine if an observation is outside the normal range of expected values. Details of CART, regression analysis, use of quantile regression analysis, CART in causal analysis, simplifying or pruning resulting trees.

  17. Assessing the Generalizability of Estimates of Causal Effects from Regression Discontinuity Designs

    ERIC Educational Resources Information Center

    Bloom, Howard S.; Porter, Kristin E.

    2012-01-01

    In recent years, the regression discontinuity design (RDD) has gained widespread recognition as a quasi-experimental method that when used correctly, can produce internally valid estimates of causal effects of a treatment, a program or an intervention (hereafter referred to as treatment effects). In an RDD study, subjects or groups of subjects…

  18. Data mining: Potential applications in research on nutrition and health.

    PubMed

    Batterham, Marijka; Neale, Elizabeth; Martin, Allison; Tapsell, Linda

    2017-02-01

    Data mining enables further insights from nutrition-related research, but caution is required. The aim of this analysis was to demonstrate and compare the utility of data mining methods in classifying a categorical outcome derived from a nutrition-related intervention. Baseline data (23 variables, 8 categorical) on participants (n = 295) in an intervention trial were used to classify participants in terms of meeting the criteria of achieving 10 000 steps per day. Results from classification and regression trees (CARTs), random forests, adaptive boosting, logistic regression, support vector machines and neural networks were compared using area under the curve (AUC) and error assessments. The CART produced the best model when considering the AUC (0.703), overall error (18%) and within class error (28%). Logistic regression also performed reasonably well compared to the other models (AUC 0.675, overall error 23%, within class error 36%). All the methods gave different rankings of variables' importance. CART found that body fat, quality of life using the SF-12 Physical Component Summary (PCS) and the cholesterol:HDL ratio were the most important predictors of meeting the 10 000 steps criteria, while logistic regression showed the SF-12 PCS, glucose levels and level of education to be the most significant predictors (P ≤ 0.01). Differing outcomes suggest caution is required with a single data mining method, particularly in a dataset with nonlinear relationships and outliers and when exploring relationships that were not the primary outcomes of the research. © 2017 Dietitians Association of Australia.

  19. Characterizing nonconstant instrumental variance in emerging miniaturized analytical techniques.

    PubMed

    Noblitt, Scott D; Berg, Kathleen E; Cate, David M; Henry, Charles S

    2016-04-07

    Measurement variance is a crucial aspect of quantitative chemical analysis. Variance directly affects important analytical figures of merit, including detection limit, quantitation limit, and confidence intervals. Most reported analyses for emerging analytical techniques implicitly assume constant variance (homoskedasticity) by using unweighted regression calibrations. Despite the assumption of constant variance, it is known that most instruments exhibit heteroskedasticity, where variance changes with signal intensity. Ignoring nonconstant variance results in suboptimal calibrations, invalid uncertainty estimates, and incorrect detection limits. Three techniques where homoskedasticity is often assumed were covered in this work to evaluate whether heteroskedasticity had a significant quantitative impact: naked-eye, distance-based detection using paper-based analytical devices (PADs); cathodic stripping voltammetry (CSV) with disposable carbon-ink electrode devices; and microchip electrophoresis (MCE) with conductivity detection. Despite these techniques representing a wide range of chemistries and precision, heteroskedastic behavior was confirmed for each. The general variance forms were analyzed, and recommendations for accounting for nonconstant variance discussed. Monte Carlo simulations of instrument responses were performed to quantify the benefits of weighted regression, and the sensitivity to uncertainty in the variance function was tested. Results show that heteroskedasticity should be considered during development of new techniques; even moderate uncertainty (30%) in the variance function still results in weighted regression outperforming unweighted regressions. We recommend utilizing the power model of variance because it is easy to apply, requires little additional experimentation, and produces higher-precision results and more reliable uncertainty estimates than assuming homoskedasticity. Copyright © 2016 Elsevier B.V. All rights reserved.
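
    A minimal sketch of the recommended power-model weighting, assuming replicated calibration standards: estimate the variance-power from replicate spreads, then weight a least-squares calibration by the reciprocal variance. The calibration data are synthetic.

    ```python
    # Sketch: weighted calibration with a power model of variance, sd ~ k*signal**p.
    import numpy as np
    import statsmodels.api as sm

    conc = np.repeat([1.0, 5.0, 10.0, 50.0, 100.0], 5)   # replicated standards
    rng = np.random.default_rng(1)
    signal = 2.0 * conc * (1 + rng.normal(0, 0.05, conc.size))  # heteroskedastic

    # Estimate the variance-power p from replicate spreads: log(sd) vs log(mean)
    levels = np.unique(conc)
    means = np.array([signal[conc == c].mean() for c in levels])
    sds = np.array([signal[conc == c].std(ddof=1) for c in levels])
    p, _ = np.polyfit(np.log(means), np.log(sds), 1)

    # Weights proportional to 1/variance, with variance ~ mean**(2p)
    level_weight = {c: 1.0 / m ** (2 * p) for c, m in zip(levels, means)}
    w = np.array([level_weight[c] for c in conc])

    fit = sm.WLS(signal, sm.add_constant(conc), weights=w).fit()
    print("power p:", round(p, 2), "slope:", round(fit.params[1], 3))
    ```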

  20. Population heterogeneity in the salience of multiple risk factors for adolescent delinquency.

    PubMed

    Lanza, Stephanie T; Cooper, Brittany R; Bray, Bethany C

    2014-03-01

    To present mixture regression analysis as an alternative to more standard regression analysis for predicting adolescent delinquency. We demonstrate how mixture regression analysis allows for the identification of population subgroups defined by the salience of multiple risk factors. We identified population subgroups (i.e., latent classes) of individuals based on their coefficients in a regression model predicting adolescent delinquency from eight previously established risk indices drawn from the community, school, family, peer, and individual levels. The study included N = 37,763 10th-grade adolescents who participated in the Communities That Care Youth Survey. Standard, zero-inflated, and mixture Poisson and negative binomial regression models were considered. Standard and mixture negative binomial regression models were selected as optimal. The five-class regression model was interpreted based on the class-specific regression coefficients, indicating that risk factors had varying salience across classes of adolescents. Standard regression showed that all risk factors were significantly associated with delinquency. Mixture regression provided more nuanced information, suggesting a unique set of risk factors that were salient for different subgroups of adolescents. Implications for the design of subgroup-specific interventions are discussed. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
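
    Mixture regression lets the regression coefficients themselves differ across latent subgroups. The toy EM sketch below fits a two-class mixture of Poisson regressions to simulated counts; a real analysis of the kind described would consider more classes, negative binomial likelihoods, and model-selection criteria.

    ```python
    # Toy EM for a two-class mixture of Poisson regressions (simulated data).
    import numpy as np
    from scipy.stats import poisson
    from sklearn.linear_model import PoissonRegressor

    rng = np.random.default_rng(2)
    n = 2000
    x = rng.normal(size=(n, 1))                      # one risk index
    group = rng.integers(0, 2, n)                    # latent class (unobserved)
    rate = np.exp(np.where(group == 0, 0.2 + 1.0 * x[:, 0], 1.0 - 0.5 * x[:, 0]))
    y = rng.poisson(rate)

    resp = rng.uniform(0.3, 0.7, n)                  # initial responsibilities
    models = [PoissonRegressor(alpha=0.0), PoissonRegressor(alpha=0.0)]
    for _ in range(50):
        # M-step: weighted Poisson fits and class proportion
        for k, w in enumerate([resp, 1 - resp]):
            models[k].fit(x, y, sample_weight=w)
        pi = resp.mean()
        # E-step: update responsibilities from class-specific likelihoods
        lik0 = pi * poisson.pmf(y, models[0].predict(x))
        lik1 = (1 - pi) * poisson.pmf(y, models[1].predict(x))
        resp = lik0 / (lik0 + lik1)

    for k, m in enumerate(models):
        print(f"class {k}: intercept={m.intercept_:.2f}, slope={m.coef_[0]:.2f}")
    ```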

  1. Empirical predictive models of daily relativistic electron flux at geostationary orbit: Multiple regression analysis

    DOE PAGES

    Simms, Laura E.; Engebretson, Mark J.; Pilipenko, Viacheslav; ...

    2016-04-07

    The daily maximum relativistic electron flux at geostationary orbit can be predicted well with a set of daily averaged predictor variables including previous day's flux, seed electron flux, solar wind velocity and number density, AE index, IMF Bz, Dst, and ULF and VLF wave power. As predictor variables are intercorrelated, we used multiple regression analyses to determine which are the most predictive of flux when other variables are controlled. Empirical models produced from regressions of flux on measured predictors from 1 day previous were reasonably effective at predicting novel observations. Adding previous flux to the parameter set improves the prediction of the peak of the increases but delays its anticipation of an event. Previous day's solar wind number density and velocity, AE index, and ULF wave activity are the most significant explanatory variables; however, the AE index, measuring substorm processes, shows a negative correlation with flux when other parameters are controlled. This may be due to the triggering of electromagnetic ion cyclotron waves by substorms that cause electron precipitation. VLF waves show lower, but significant, influence. The combined effect of ULF and VLF waves shows a synergistic interaction, where each increases the influence of the other on flux enhancement. Correlations between observations and predictions for this 1-day lag model ranged from 0.71 to 0.89 (average: 0.78). Furthermore, a path analysis of correlations between predictors suggests that solar wind and IMF parameters affect flux through intermediate processes such as ring current (Dst), AE, and wave activity.

  2. Empirical predictive models of daily relativistic electron flux at geostationary orbit: Multiple regression analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Simms, Laura E.; Engebretson, Mark J.; Pilipenko, Viacheslav

    The daily maximum relativistic electron flux at geostationary orbit can be predicted well with a set of daily averaged predictor variables including previous day's flux, seed electron flux, solar wind velocity and number density, AE index, IMF Bz, Dst, and ULF and VLF wave power. As predictor variables are intercorrelated, we used multiple regression analyses to determine which are the most predictive of flux when other variables are controlled. Empirical models produced from regressions of flux on measured predictors from 1 day previous were reasonably effective at predicting novel observations. Adding previous flux to the parameter set improves the prediction of the peak of the increases but delays its anticipation of an event. Previous day's solar wind number density and velocity, AE index, and ULF wave activity are the most significant explanatory variables; however, the AE index, measuring substorm processes, shows a negative correlation with flux when other parameters are controlled. This may be due to the triggering of electromagnetic ion cyclotron waves by substorms that cause electron precipitation. VLF waves show lower, but significant, influence. The combined effect of ULF and VLF waves shows a synergistic interaction, where each increases the influence of the other on flux enhancement. Correlations between observations and predictions for this 1-day lag model ranged from 0.71 to 0.89 (average: 0.78). Furthermore, a path analysis of correlations between predictors suggests that solar wind and IMF parameters affect flux through intermediate processes such as ring current (Dst), AE, and wave activity.

  3. A Note on the Relationship between the Number of Indicators and Their Reliability in Detecting Regression Coefficients in Latent Regression Analysis

    ERIC Educational Resources Information Center

    Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.

    2004-01-01

    We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…

  4. Comparison of 3 Methods for Identifying Dietary Patterns Associated With Risk of Disease

    PubMed Central

    DiBello, Julia R.; Kraft, Peter; McGarvey, Stephen T.; Goldberg, Robert; Campos, Hannia

    2008-01-01

    Reduced rank regression and partial least-squares regression (PLS) are proposed alternatives to principal component analysis (PCA). Using all 3 methods, the authors derived dietary patterns in Costa Rican data collected on 3,574 cases and controls in 1994–2004 and related the resulting patterns to risk of first incident myocardial infarction. Four dietary patterns associated with myocardial infarction were identified. Factor 1, characterized by high intakes of lean chicken, vegetables, fruit, and polyunsaturated oil, was generated by all 3 dietary pattern methods and was associated with a significantly decreased adjusted risk of myocardial infarction (28%–46%, depending on the method used). PCA and PLS also each yielded a pattern associated with a significantly decreased risk of myocardial infarction (31% and 23%, respectively); this pattern was characterized by moderate intake of alcohol and polyunsaturated oil and low intake of high-fat dairy products. The fourth factor derived from PCA was significantly associated with a 38% increased risk of myocardial infarction and was characterized by high intakes of coffee and palm oil. Contrary to previous studies, the authors found PCA and PLS to produce more patterns associated with cardiovascular disease than reduced rank regression. The most effective method for deriving dietary patterns related to disease may vary depending on the study goals. PMID:18945692

  5. Evaluation of the confusion matrix method in the validation of an automated system for measuring feeding behaviour of cattle.

    PubMed

    Ruuska, Salla; Hämäläinen, Wilhelmiina; Kajava, Sari; Mughal, Mikaela; Matilainen, Pekka; Mononen, Jaakko

    2018-03-01

    The aim of the present study was to empirically evaluate the confusion matrix method in device validation. We compared the confusion matrix method to linear regression and error indices in the validation of a device measuring the feeding behaviour of dairy cattle. In addition, we studied how to extract additional information on classification errors with confusion probabilities. The data consisted of 12-h behaviour measurements from five dairy cows; feeding and other behaviour were detected simultaneously with a device and from video recordings. The resulting 216 000 pairs of classifications were used to construct confusion matrices and calculate performance measures. In addition, hourly durations of each behaviour were calculated and the accuracy of measurements was evaluated with linear regression and error indices. All three validation methods agreed when the behaviour was detected very accurately or inaccurately. Otherwise, in the intermediate cases, the confusion matrix method and error indices produced relatively concordant results, but the linear regression method often disagreed with them. Our study supports the use of confusion matrix analysis in validation since it is robust to any data distribution and type of relationship, it makes a stringent evaluation of validity, and it offers extra information on the type and sources of errors. Copyright © 2018 Elsevier B.V. All rights reserved.
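
    The confusion matrix computations at the core of this comparison are straightforward; a small sketch follows, with short hypothetical label arrays standing in for the 216 000 paired classifications.

    ```python
    # Sketch: device validation metrics from a confusion matrix.
    import numpy as np
    from sklearn.metrics import confusion_matrix, cohen_kappa_score

    truth = np.array([1, 1, 0, 1, 0, 0, 1, 0, 1, 1])   # 1 = feeding (video-coded)
    device = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 1])  # 1 = feeding (device)

    tn, fp, fn, tp = confusion_matrix(truth, device).ravel()
    print("sensitivity:", tp / (tp + fn))
    print("specificity:", tn / (tn + fp))
    print("accuracy:", (tp + tn) / truth.size)
    print("kappa:", cohen_kappa_score(truth, device))  # chance-corrected agreement
    ```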

  6. Effect of motivational interviewing on rates of early childhood caries: a randomized trial.

    PubMed

    Harrison, Rosamund; Benton, Tonya; Everson-Stewart, Siobhan; Weinstein, Phil

    2007-01-01

    The purposes of this randomized controlled trial were to: (1) test motivational interviewing (MI) to prevent early childhood caries; and (2) use Poisson regression for data analysis. A total of 240 South Asian children 6 to 18 months old were enrolled and randomly assigned to either the MI or control condition. Children had a dental exam, and their mothers completed pretested instruments at baseline and 1 and 2 years postintervention. Other covariates that might explain outcomes over and above treatment differences were modeled using Poisson regression. Hazard ratios were produced. Analyses included all participants whenever possible. Poisson regression supported a protective effect of MI (hazard ratio [HR] = 0.54; 95% CI = 0.35-0.84); that is, the MI group had about a 46% lower rate of dmfs at 2 years than did control children. Similar treatment effect estimates were obtained from models that included, as alternative outcomes, ds, dms, and dmfs, including "white spot lesions." Exploratory analyses revealed that rates of dmfs were higher in children whose mothers had: (1) prechewed their food; (2) been raised in a rural environment; and (3) a higher family income (P < .05). A motivational interviewing-style intervention shows promise to promote preventive behaviors in mothers of young children at high risk for caries.

  7. Modelling space of spread Dengue Hemorrhagic Fever (DHF) in Central Java use spatial durbin model

    NASA Astrophysics Data System (ADS)

    Ispriyanti, Dwi; Prahutama, Alan; Taryono, Arkadina PN

    2018-05-01

    Dengue Hemorrhagic Fever (DHF) is one of the major public health problems in Indonesia. From year to year, DHF causes outbreaks (Extraordinary Events) in most parts of Indonesia, especially Central Java. Central Java consists of 35 districts or cities, each located close to its neighbours. Spatial regression estimates the influence of independent variables on a dependent variable while accounting for effects among neighbouring regions. Spatial regression models include the spatial autoregressive model (SAR), the spatial error model (SEM) and the spatial autoregressive moving average model (SARMA). The spatial Durbin model (SDM) is a development of the SAR in which both the dependent and independent variables have spatial influence. In this research, the dependent variable is the number of DHF cases. The independent variables are population density, number of hospitals, population, number of health centers, and mean years of schooling. In the multiple regression model, the variables that significantly affect the spread of DHF are population and mean years of schooling. Using queen contiguity and rook contiguity weights, the best model produced is the SDM with queen contiguity, which has the smallest AIC value (494.12). The factors that generally affect the spread of DHF in Central Java Province are population size and mean years of schooling.
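
    For reference, the nested spatial specifications named in this abstract can be written compactly, with W the row-standardized contiguity (queen or rook) weights matrix:

    ```latex
    % Nested spatial regression specifications (W: spatial weights matrix)
    \begin{aligned}
    \text{SAR:} \quad & y = \rho W y + X\beta + \varepsilon \\
    \text{SEM:} \quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon \\
    \text{SDM:} \quad & y = \rho W y + X\beta + W X \theta + \varepsilon
    \end{aligned}
    ```

    Setting θ = 0 in the SDM recovers the SAR; the extra WXθ term, the spatial lag of the covariates, is what gives the Durbin model its additional flexibility.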

  8. Using a binary logistic regression method and GIS for evaluating and mapping the groundwater spring potential in the Sultan Mountains (Aksehir, Turkey)

    NASA Astrophysics Data System (ADS)

    Ozdemir, Adnan

    2011-07-01

    The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of 440 springs were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs, using the relative operating characteristic (ROC). The area under the ROC curve was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of the very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach had not previously been used to delineate groundwater potential zones; in this study, it was used to locate potential zones for groundwater springs in the Sultan Mountains. The evolved model was found to be in strong agreement with the available groundwater spring test data. Hence, this method can be used routinely in groundwater exploration under favourable conditions.
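
    The mapping workflow reduces to fitting a logistic model on raster-derived predictors, scoring every cell, and checking discrimination with the ROC area. A minimal sketch follows, with a simulated predictor table standing in for the 17 GIS layers.

    ```python
    # Sketch: per-cell spring potential from logistic regression, checked by AUC.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(3)
    n_cells = 5000
    X = rng.normal(size=(n_cells, 4))   # e.g., slope, wetness index, elevation, ...
    true_logit = 1.2 * X[:, 1] - 0.8 * X[:, 2]
    spring = rng.uniform(size=n_cells) < 1 / (1 + np.exp(-true_logit))

    model = LogisticRegression(max_iter=1000).fit(X, spring)
    potential = model.predict_proba(X)[:, 1]      # per-cell spring potential
    print("AUC:", round(roc_auc_score(spring, potential), 2))

    # Bin the continuous potential into map classes, as in the study
    classes = np.digitize(potential, [0.25, 0.5, 0.75])  # 0 = very low ... 3 = high
    ```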

  9. An econometric analysis of regional differences in household waste collection: the case of plastic packaging waste in Sweden.

    PubMed

    Hage, Olle; Söderholm, Patrik

    2008-01-01

    The Swedish producer responsibility ordinance mandates producers to collect and recycle packaging materials. This paper investigates the main determinants of collection rates of household plastic packaging waste in Swedish municipalities. This is done by the use of a regression analysis based on cross-sectional data for 252 Swedish municipalities. The results suggest that local policies, geographic/demographic variables, socio-economic factors and environmental preferences all help explain inter-municipality collection rates. For instance, the collection rate appears to be positively affected by increases in the unemployment rate, the share of private houses, and the presence of immigrants (unless newly arrived) in the municipality. The impacts of distance to recycling industry, urbanization rate and population density on collection outcomes turn out, though, to be both statistically and economically insignificant. A reasonable explanation for this is that the monetary compensation from the material companies to the collection entrepreneurs varies depending on region and is typically higher in high-cost regions. This implies that the plastic packaging collection in Sweden may be cost ineffective. Finally, the analysis also shows that municipalities that employ weight-based waste management fees generally experience higher collection rates than those municipalities in which flat and/or volume-based fees are used.

  10. Textural properties of gelling system of low-methoxy pectins produced by demethoxylating reaction of pectin methyl esterase.

    PubMed

    Kim, Y; Yoo, Y-H; Kim, K-O; Park, J-B; Yoo, S-H

    2008-06-01

    After deesterification of commercial pectins with a pectin methyl esterase (PME), their gelling properties were characterized using instrumental texture analysis. The final degree of esterification (DE) of the high- and low-methoxy pectins reached approximately 6% after the PME treatment, while deesterification of low-methoxy amidated pectin stopped at 18% DE. Furthermore, DE of high-methoxy pectin was tailored to be 40%, which is equivalent to the DE of commercial low-methoxy pectin. As a result, significant changes in molecular weight (Mw) distribution were observed in the PME-treated pectins. The texture profile analysis showed that PME modification drastically increased hardness, gumminess, and chewiness, while decreasing cohesiveness and adhesiveness of the pectin gels (P < 0.05). The pectin gel with relatively high peak molecular weight (Mp, 3.5 × 10⁵) and low DE (6), which was produced from high-methoxy pectin, exhibited the greatest hardness, gumminess, chewiness, and resilience. The hardness of low-methoxy amidated pectin increased over 300% after PME deesterification, suggesting that the effects of amide substitution could be reinforced when DE is even lower. The partial least square regression analysis indicated that the Mw and DE of the pectin molecule are the most crucial factors for hardness, chewiness, gumminess, and resilience of gel matrix.

  11. An econometric analysis of regional differences in household waste collection: The case of plastic packaging waste in Sweden

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hage, Olle; Soederholm, Patrik

    2008-07-01

    The Swedish producer responsibility ordinance mandates producers to collect and recycle packaging materials. This paper investigates the main determinants of collection rates of household plastic packaging waste in Swedish municipalities. This is done by the use of a regression analysis based on cross-sectional data for 252 Swedish municipalities. The results suggest that local policies, geographic/demographic variables, socio-economic factors and environmental preferences all help explain inter-municipality collection rates. For instance, the collection rate appears to be positively affected by increases in the unemployment rate, the share of private houses, and the presence of immigrants (unless newly arrived) in the municipality. The impacts of distance to recycling industry, urbanization rate and population density on collection outcomes turn out, though, to be both statistically and economically insignificant. A reasonable explanation for this is that the monetary compensation from the material companies to the collection entrepreneurs varies depending on region and is typically higher in high-cost regions. This implies that the plastic packaging collection in Sweden may be cost ineffective. Finally, the analysis also shows that municipalities that employ weight-based waste management fees generally experience higher collection rates than those municipalities in which flat and/or volume-based fees are used.

  12. C U L8ter: YouTube distracted driving PSAs use of behavior change theory.

    PubMed

    Steadman, Mindy; Chao, Melanie S; Strong, Jessica T; Maxwell, Martha; West, Joshua H

    2014-01-01

    To examine the inclusion of health behavior theory in distracted driving PSAs on YouTube.com. Two hundred fifty PSAs were assessed using constructs from 4 prominent health behavior theories. A total theory score was calculated for each video. Multiple regression analysis was used to identify factors associated with higher theory scores. PSAs were generally lacking in theoretical content. Video length, use of rates/statistics, driving scenario depiction, and presence of a celebrity were positively associated with theory inclusion. Collaboration between health experts and PSA creators could be fostered to produce more theory-based distracted driving videos on YouTube.com.

  13. A new method of linkage analysis using LOD scores for quantitative traits supports linkage of monoamine oxidase activity to D17S250 in the Collaborative Study on the Genetics of Alcoholism pedigrees.

    PubMed

    Curtis, David; Knight, Jo; Sham, Pak C

    2005-09-01

    Although LOD score methods have been applied to diseases with complex modes of inheritance, linkage analysis of quantitative traits has tended to rely on non-parametric methods based on regression or variance components analysis. Here, we describe a new method for LOD score analysis of quantitative traits which does not require specification of a mode of inheritance. The technique is derived from the MFLINK method for dichotomous traits. A range of plausible transmission models is constructed, constrained to yield the correct population mean and variance for the trait but differing with respect to the contribution to the variance due to the locus under consideration. Maximized LOD scores under homogeneity and admixture are calculated, as is a model-free LOD score which compares the maximized likelihoods under admixture assuming linkage and no linkage. These LOD scores have known asymptotic distributions and hence can be used to provide a statistical test for linkage. The method has been implemented in a program called QMFLINK. It was applied to data sets simulated using a variety of transmission models and to a measure of monoamine oxidase activity in 105 pedigrees from the Collaborative Study on the Genetics of Alcoholism. With the simulated data, the results showed that the new method could detect linkage well if the true allele frequency for the trait was close to that specified. However, it performed poorly on models in which the true allele frequency was much rarer. For the Collaborative Study on the Genetics of Alcoholism data set only a modest overlap was observed between the results obtained from the new method and those obtained when the same data were analysed previously using regression and variance components analysis. Of interest is that D17S250 produced a maximized LOD score under homogeneity and admixture of 2.6 but did not indicate linkage using the previous methods. However, this region did produce evidence for linkage in a separate data set, suggesting that QMFLINK may have been able to detect a true linkage which was not picked up by the other methods. The application of model-free LOD score analysis to quantitative traits is novel and deserves further evaluation of its merits and disadvantages relative to other methods.

  14. Vocal mechanics in Darwin's finches: correlation of beak gape and song frequency.

    PubMed

    Podos, Jeffrey; Southall, Joel A; Rossi-Santos, Marcos R

    2004-02-01

    Recent studies of vocal mechanics in songbirds have identified a functional role for the beak in sound production. The vocal tract (trachea and beak) filters harmonic overtones from sounds produced by the syrinx, and birds can fine-tune vocal tract resonance properties through changes in beak gape. In this study, we examine patterns of beak gape during song production in seven species of Darwin's finches of the Galápagos Islands. Our principal goals were to characterize the relationship between beak gape and vocal frequency during song production and to explore the possible influence therein of diversity in beak morphology and body size. Birds were audio and video recorded (at 30 frames s⁻¹) as they sang in the field, and 164 song sequences were analyzed. We found that song frequency regressed significantly and positively on beak gape for 38 of 56 individuals and for all seven species examined. This finding provides broad support for a resonance model of vocal tract function in Darwin's finches. Comparison among species revealed significant variation in regression y-intercept values. Body size correlated negatively with y-intercept values, although not at a statistically significant level. We failed to detect variation in regression slopes among finch species, although the regression slopes of Darwin's finch and two North American sparrow species were found to differ. Analysis within one species (Geospiza fortis) revealed significant inter-individual variation in regression parameters; these parameters did not correlate with song frequency features or plumage scores. Our results suggest that patterns of beak use during song production were conserved during the Darwin's finch adaptive radiation, despite the evolution of substantial variation in beak morphology and body size.

  15. Predictive spectroscopy and chemical imaging based on novel optical systems

    NASA Astrophysics Data System (ADS)

    Nelson, Matthew Paul

    1998-10-01

    This thesis describes two futuristic optical systems designed to surpass contemporary spectroscopic methods for predictive spectroscopy and chemical imaging. These systems are advantageous to current techniques in a number of ways including lower cost, enhanced portability, shorter analysis time, and improved S/N. First, a novel optical approach to predicting chemical and physical properties based on principal component analysis (PCA) is proposed and evaluated. A regression vector produced by PCA is designed into the structure of a set of paired optical filters. Light passing through the paired filters produces an analog detector signal directly proportional to the chemical/physical property for which the regression vector was designed. Second, a novel optical system is described which takes a single-shot approach to chemical imaging with high spectroscopic resolution using a dimension-reduction fiber-optic array. Images are focused onto a two-dimensional matrix of optical fibers which are drawn into a linear distal array with specific ordering. The distal end is imaged with a spectrograph equipped with an ICCD camera for spectral analysis. Software is used to extract the spatial/spectral information contained in the ICCD images and deconvolute them into wavelength-specific reconstructed images or position-specific spectra which span a multi-wavelength space. This thesis includes a description of the fabrication of two dimension-reduction arrays as well as an evaluation of the system for spatial and spectral resolution, throughput, image brightness, resolving power, depth of focus, and channel cross-talk. PCA is performed on the images by treating rows of the ICCD images as spectra and plotting the scores of each PC as a function of reconstruction position. In addition, iterative target transformation factor analysis (ITTFA) is performed on the spectroscopic images to generate "true" chemical maps of samples. Univariate zero-order images, univariate first-order spectroscopic images, bivariate first-order spectroscopic images, and multivariate first-order spectroscopic images of the temporal development of laser-induced plumes are presented and interpreted. Reconstructed chemical images generated using bivariate and trivariate wavelength techniques, bimodal and trimodal PCA methods, and bimodal and trimodal ITTFA approaches are also included.

  16. Moderation analysis using a two-level regression model.

    PubMed

    Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

    2014-10-01

    Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
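
    The moderated multiple regression (MMR) baseline that the two-level model extends is easy to state: add a product term and test its coefficient. A short statsmodels sketch with simulated data (the variable names are illustrative):

    ```python
    # Sketch: moderated multiple regression with an interaction (product) term.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    n = 500
    df = pd.DataFrame({"x": rng.normal(size=n), "m": rng.normal(size=n)})
    df["y"] = 0.5 * df.x + 0.3 * df.m + 0.4 * df.x * df.m + rng.normal(size=n)

    # 'x * m' expands to x + m + x:m; the x:m coefficient is the moderation effect
    fit = smf.ols("y ~ x * m", data=df).fit()
    print(fit.params["x:m"], fit.pvalues["x:m"])
    ```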

  17. Comparing near-infrared conventional diffuse reflectance spectroscopy and hyperspectral imaging for determination of the bulk properties of solid samples by multivariate regression: determination of Mooney viscosity and plasticity indices of natural rubber.

    PubMed

    Juliano da Silva, Carlos; Pasquini, Celio

    2015-01-21

    Conventional reflectance spectroscopy (NIRS) and hyperspectral imaging (HI) in the near-infrared region (1000-2500 nm) are evaluated and compared, using, as the case study, the determination of relevant properties related to the quality of natural rubber. Mooney viscosity (MV) and plasticity indices (PI) (PI0 - original plasticity, PI30 - plasticity after accelerated aging, and PRI - the plasticity retention index after accelerated aging) of rubber were determined using multivariate regression models. Two hundred and eighty-six samples of rubber were measured using conventional and hyperspectral near-infrared imaging reflectance instruments in the range of 1000-2500 nm. The sample set was split into regression (n = 191) and external validation (n = 95) sub-sets. Three instruments were employed for data acquisition: a line scanning hyperspectral camera and two conventional FT-NIR spectrometers. Sample heterogeneity was evaluated using hyperspectral images obtained with a resolution of 150 × 150 μm and principal component analysis. The probed sample area (5 cm²; 24,000 pixels) to achieve representativeness was found to be equivalent to the average of 6 spectra for a 1 cm diameter probing circular window of one FT-NIR instrument. The other spectrophotometer can probe the whole sample in only one measurement. The results show that the rubber properties can be determined with very similar accuracy and precision by Partial Least Squares (PLS) regression models regardless of whether HI-NIR or conventional FT-NIR produce the spectral datasets. The best Root Mean Square Errors of Prediction (RMSEPs) of external validation for MV, PI0, PI30, and PRI were 4.3, 1.8, 3.4, and 5.3%, respectively. Though the quantitative results provided by the three instruments can be considered equivalent, the hyperspectral imaging instrument presents a number of advantages, being about 6 times faster than conventional bulk spectrometers, producing robust spectral data by ensuring sample representativeness, and minimizing the effect of the presence of contaminants.
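
    A minimal PLS calibration of the kind described is sketched below, with simulated spectra standing in for the NIR data; real work would add preprocessing (e.g., SNV or derivatives) and choose the number of components by cross-validation.

    ```python
    # Sketch: PLS regression of a property on spectra, with held-out RMSEP.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(5)
    n_samples, n_wavelengths = 286, 700
    X = rng.normal(size=(n_samples, n_wavelengths))             # stand-in spectra
    y = X[:, :10].sum(axis=1) + rng.normal(0, 0.5, n_samples)   # stand-in property

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=95, random_state=0)
    pls = PLSRegression(n_components=8).fit(X_tr, y_tr)
    rmsep = np.sqrt(np.mean((pls.predict(X_te).ravel() - y_te) ** 2))
    print("RMSEP:", round(rmsep, 2))
    ```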

  18. Multiple Correlation versus Multiple Regression.

    ERIC Educational Resources Information Center

    Huberty, Carl J.

    2003-01-01

    Describes differences between multiple correlation analysis (MCA) and multiple regression analysis (MRA), showing how these approaches involve different research questions and study designs, different inferential approaches, different analysis strategies, and different reported information. (SLD)

  19. Functional Relationships and Regression Analysis.

    ERIC Educational Resources Information Center

    Preece, Peter F. W.

    1978-01-01

    Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…

  20. General Nature of Multicollinearity in Multiple Regression Analysis.

    ERIC Educational Resources Information Center

    Liu, Richard

    1981-01-01

    Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)

  1. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  2. Fourier transform infrared reflectance spectra of latent fingerprints: a biometric gauge for the age of an individual.

    PubMed

    Hemmila, April; McGill, Jim; Ritter, David

    2008-03-01

    To determine whether changes in fingerprint infrared spectra that are linear with age can be found, a partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest-concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.

  3. Negative correlation between altitudes and oxygen isotope ratios of seeds: exploring its applicability to assess vertical seed dispersal.

    PubMed

    Naoe, Shoji; Tayasu, Ichiro; Masaki, Takashi; Koike, Shinsuke

    2016-10-01

    Vertical seed dispersal, which plays a key role in plant escape and/or expansion under climate change, was recently evaluated for the first time using negative correlation between altitudes and oxygen isotope ratio of seeds. Although this method is innovative, its applicability to other plants is unknown. To explore the applicability of the method, we regressed altitudes on δ¹⁸O of seeds of five woody species constituting three families in temperate forests in central Japan. Because climatic factors, including temperature and precipitation that influence δ¹⁸O of plant materials, demonstrate intensive seasonal fluctuation in the temperate zone, we also evaluated the effect of fruiting season of each species on δ¹⁸O of seeds using generalized linear mixed models (GLMM). Negative correlation between altitudes and δ¹⁸O of seeds was found in four of five species tested. The slope of regression lines tended to be lower in late-fruiting species. The GLMM analysis revealed that altitudes and date of fruiting peak negatively affected δ¹⁸O of seeds. These results indicate that the estimation of vertical seed dispersal using δ¹⁸O of seeds can be applicable for various species, not just confined to specific taxa, by identifying the altitudes of plants that produced seeds. The results also suggest that the regression line between altitudes and δ¹⁸O of seeds is rather species specific and that vertical seed dispersal in late-fruiting species is estimated at a low resolution due to their small regression slopes. A future study on the identification of environmental factors and plant traits that cause a difference in δ¹⁸O of seeds, combined with an improvement of analysis, will lead to effective evaluation of vertical seed dispersal in various species and thereby promote our understanding about the mechanism and ecological functions of vertical seed dispersal.

  4. Inter-model comparison of the landscape determinants of vector-borne disease: implications for epidemiological and entomological risk modeling.

    PubMed

    Lorenz, Alyson; Dhingra, Radhika; Chang, Howard H; Bisanzio, Donal; Liu, Yang; Remais, Justin V

    2014-01-01

    Extrapolating landscape regression models for use in assessing vector-borne disease risk and other applications requires thoughtful evaluation of fundamental model choice issues. To examine implications of such choices, an analysis was conducted to explore the extent to which disparate landscape models agree in their epidemiological and entomological risk predictions when extrapolated to new regions. Agreement between six literature-drawn landscape models was examined by comparing predicted county-level distributions of either Lyme disease or Ixodes scapularis vector using Spearman ranked correlation. AUC analyses and multinomial logistic regression were used to assess the ability of these extrapolated landscape models to predict observed national data. Three models based on measures of vegetation, habitat patch characteristics, and herbaceous landcover emerged as effective predictors of observed disease and vector distribution. An ensemble model containing these three models improved precision and predictive ability over individual models. A priori assessment of qualitative model characteristics effectively identified models that subsequently emerged as better predictors in quantitative analysis. Both a methodology for quantitative model comparison and a checklist for qualitative assessment of candidate models for extrapolation are provided; both tools aim to improve collaboration between those producing models and those interested in applying them to new areas and research questions.

  5. Analysis of 16S rRNA gene lactic acid bacteria (LAB) isolate from Markisa fruit (Passiflora sp.) as a producer of protease enzyme and probiotics

    NASA Astrophysics Data System (ADS)

    Hidayat, Habibi

    2017-03-01

    16S rRNA gene analysis of a lactic acid bacteria (LAB) isolate from Markisa Kuning fruit (Passiflora edulis var. flavicarpa) as a producer of protease enzyme and probiotics has been done. The aim of the study was to determine the protease enzyme activity and to amplify the 16S rRNA gene using PCR. The M4 LAB isolate, which was resistant to acid at pH 2.0, was screened for protease activity, giving a result of 6.5 for a clear zone of 13 mm against a colony diameter of 2 mm. The enzyme activity, measured with a UV-Vis spectrophotometer, yielded the regression equation Y = 0.02983 + 0.001312X; the protein level of the M4 isolate was 0.6594 mg/mL, the enzyme activity obtained was 0.8626 unit/mL, and the specific enzyme activity produced was 1.308 unit/mg. Then, 16S rRNA gene amplification and DNA sequencing were performed. The results showed that the bacterial species in the M4 LAB isolate is Weisella cibiria strain II-I-59, a bacterium that could be utilized in the digestive tract.

  6. A Generalized Least Squares Regression Approach for Computing Effect Sizes in Single-Case Research: Application Examples

    ERIC Educational Resources Information Center

    Maggin, Daniel M.; Swaminathan, Hariharan; Rogers, Helen J.; O'Keeffe, Breda V.; Sugai, George; Horner, Robert H.

    2011-01-01

    A new method for deriving effect sizes from single-case designs is proposed. The strategy is applicable to small-sample time-series data with autoregressive errors. The method uses Generalized Least Squares (GLS) to model the autocorrelation of the data and estimate regression parameters to produce an effect size that represents the magnitude of…
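
    The following sketch illustrates the general idea on a simulated baseline/intervention series with AR(1) errors, using statsmodels' GLSAR; it is a stand-in for, not a reproduction of, the authors' effect-size procedure.

    ```python
    # GLS with AR(1) errors for a two-phase single-case series; data are simulated.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 40
    phase = (np.arange(n) >= 20).astype(float)   # 0 = baseline, 1 = intervention
    e = np.zeros(n)
    for t in range(1, n):                        # AR(1) errors with rho = 0.5
        e[t] = 0.5 * e[t - 1] + rng.normal()
    y = 10 + 3 * phase + e                       # true level shift of 3

    X = sm.add_constant(phase)
    fit = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=5)
    print(fit.params)        # intercept and phase effect, adjusted for autocorrelation
    print(fit.model.rho)     # estimated AR(1) coefficient
    ```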

  7. Mechanisms of Optical Regression Following Corneal Laser Refractive Surgery: Epithelial and Stromal Responses

    PubMed Central

    MOSHIRFAR, Majid; DESAUTELS, Jordan D.; WALKER, Brian D.; MURRI, Michael S.; BIRDSONG, Orry C.; HOOPES, Phillip C. Sr

    2018-01-01

    Laser vision correction is a safe and effective method of reducing spectacle dependence. Photorefractive Keratectomy (PRK), Laser In Situ Keratomileusis (LASIK), and Small-Incision Lenticule Extraction (SMILE) can accurately correct myopia, hyperopia, and astigmatism. Although these procedures are nearing optimization in terms of their ability to produce a desired refractive target, the long-term cellular responses of the cornea to these procedures can cause patients to regress from their ideal postoperative refraction. In many cases, refractive regression requires follow-up enhancement surgeries, presenting additional risks to patients. Although some risk factors underlying refractive regression have been identified, the exact mechanisms have not been elucidated. It is clear that cellular proliferation events are important mediators of optical regression. This review focuses specifically on cellular changes to the corneal epithelium and stroma, which may influence postoperative visual regression following LASIK, PRK, and SMILE procedures. PMID:29644238

  8. Developing a predictive tropospheric ozone model for Tabriz

    NASA Astrophysics Data System (ADS)

    Khatibi, Rahman; Naghipour, Leila; Ghorbani, Mohammad A.; Smith, Michael S.; Karimi, Vahid; Farhoudi, Reza; Delafrouz, Hadi; Arvanaghi, Hadi

    2013-04-01

    Predictive ozone models are becoming indispensable tools by providing a capability for pollution alerts to serve people who are vulnerable to the risks. We have developed a tropospheric ozone prediction capability for Tabriz, Iran, by using the following five modeling strategies: three regression-type methods: Multiple Linear Regression (MLR), Artificial Neural Networks (ANNs), and Gene Expression Programming (GEP); and two auto-regression-type models: Nonlinear Local Prediction (NLP) to implement chaos theory and Auto-Regressive Integrated Moving Average (ARIMA) models. The regression-type modeling strategies explain the data in terms of temperature, solar radiation, dew point temperature, and wind speed, whereas the auto-regression-type models regress present ozone values on their past values. The ozone time series are available at various time intervals, including hourly intervals, from August 2010 to March 2011. The results for the MLR, ANN and GEP models are not overly good, but those produced by NLP and ARIMA are promising for establishing a forecasting capability.
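
    As an illustration of the auto-regression-type strategy, here is a minimal ARIMA sketch in Python with statsmodels; the ozone series is simulated and the (1, 0, 1) order is an arbitrary choice, not the specification used in the study.

    ```python
    # ARIMA forecasting sketch on a simulated stationary "ozone" series.
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(3)
    ozone = np.empty(500)
    ozone[0] = 50.0
    for t in range(1, 500):                      # AR(1)-like series around 50
        ozone[t] = 50 + 0.8 * (ozone[t - 1] - 50) + rng.normal()

    fit = ARIMA(ozone, order=(1, 0, 1)).fit()
    print(fit.forecast(steps=24))                # next 24 hourly predictions
    ```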

  9. Applying Regression Analysis to Problems in Institutional Research.

    ERIC Educational Resources Information Center

    Bohannon, Tom R.

    1988-01-01

    Regression analysis is one of the most frequently used statistical techniques in institutional research. Principles of least squares, model building, residual analysis, influence statistics, and multi-collinearity are described and illustrated. (Author/MSE)

  10. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies

    PubMed Central

    Vatcheva, Kristina P.; Lee, MinJae; McCormick, Joseph B.; Rahbar, Mohammad H.

    2016-01-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013 illustrated the need for greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real-life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in regression analysis and encourage researchers to consider diagnostics for multicollinearity as one of the steps in regression analysis. PMID:27274911

  11. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.

    PubMed

    Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H

    2016-04-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013 illustrated the need for greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real-life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in regression analysis and encourage researchers to consider diagnostics for multicollinearity as one of the steps in regression analysis.
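
    One standard diagnostic the authors' advice points to is the variance inflation factor (VIF). The sketch below computes VIFs on simulated data in which one predictor is nearly a copy of another; the VIF > 10 rule of thumb is a common convention, not a claim from the paper.

    ```python
    # VIF diagnostic for multicollinearity; x2 is deliberately near-collinear with x1.
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(4)
    x1 = rng.normal(size=100)
    x2 = x1 + rng.normal(scale=0.05, size=100)   # nearly a copy of x1
    x3 = rng.normal(size=100)

    X = sm.add_constant(np.column_stack([x1, x2, x3]))
    for i, name in enumerate(["x1", "x2", "x3"], start=1):
        print(name, variance_inflation_factor(X, i))   # x1 and x2 far exceed 10
    ```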

  12. Stepwise versus Hierarchical Regression: Pros and Cons

    ERIC Educational Resources Information Center

    Lewis, Mitzi

    2007-01-01

    Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…

  13. Interpreting Bivariate Regression Coefficients: Going beyond the Average

    ERIC Educational Resources Information Center

    Halcoussis, Dennis; Phillips, G. Michael

    2010-01-01

    Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…
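
    A small worked example of the note's premise, assuming nothing about its actual derivations: intercept-only regressions recover several familiar averages.

    ```python
    # Averages from intercept-only regressions.
    import numpy as np
    import statsmodels.api as sm

    y = np.array([2.0, 4.0, 8.0])
    w = np.array([1.0, 1.0, 2.0])
    ones = np.ones_like(y)

    print(sm.OLS(y, ones).fit().params[0])                   # 4.667 = arithmetic mean
    print(sm.WLS(y, ones, weights=w).fit().params[0])        # 5.5   = weighted average
    print(np.exp(sm.OLS(np.log(y), ones).fit().params[0]))   # 4.0   = geometric mean
    print(1 / sm.OLS(1 / y, ones).fit().params[0])           # 3.429 = harmonic mean
    ```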

  14. Regression Commonality Analysis: A Technique for Quantitative Theory Building

    ERIC Educational Resources Information Center

    Nimon, Kim; Reio, Thomas G., Jr.

    2011-01-01

    When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

  15. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…

  16. Estimation of 1RM for knee extension based on the maximal isometric muscle strength and body composition.

    PubMed

    Kanada, Yoshikiyo; Sakurai, Hiroaki; Sugiura, Yoshito; Arai, Tomoaki; Koyama, Soichiro; Tanabe, Shigeo

    2017-11-01

    [Purpose] To create a regression formula in order to estimate 1RM for knee extensors, based on the maximal isometric muscle strength measured using a hand-held dynamometer and data regarding the body composition. [Subjects and Methods] Measurement was performed in 21 healthy males in their twenties to thirties. Single regression analysis was performed, with measurement values representing 1RM and the maximal isometric muscle strength as dependent and independent variables, respectively. Furthermore, multiple regression analysis was performed, with data regarding the body composition incorporated as another independent variable, in addition to the maximal isometric muscle strength. [Results] Through single regression analysis with the maximal isometric muscle strength as an independent variable, the following regression formula was created: 1RM (kg)=0.714 + 0.783 × maximal isometric muscle strength (kgf). On multiple regression analysis, only the total muscle mass was extracted. [Conclusion] A highly accurate regression formula to estimate 1RM was created based on both the maximal isometric muscle strength and body composition. Using a hand-held dynamometer and body composition analyzer, it was possible to measure these items in a short time, and obtain clinically useful results.
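
    The single-predictor formula reported above can be transcribed directly as a function; the multiple-regression variant with total muscle mass cannot be reproduced here because its coefficients are not given in the abstract.

    ```python
    # The published single-regression estimate of 1RM for knee extension.
    def estimate_1rm(max_isometric_strength_kgf: float) -> float:
        """Estimate 1RM (kg) from hand-held dynamometer strength (kgf)."""
        return 0.714 + 0.783 * max_isometric_strength_kgf

    print(estimate_1rm(30.0))   # e.g., 30 kgf -> about 24.2 kg
    ```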

  17. Regression Model Optimization for the Analysis of Experimental Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.

    2009-01-01

    A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
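
    For reference, the search metric named above can be computed without refitting the model n times: each PRESS (leave-one-out) residual equals the ordinary residual divided by one minus the corresponding hat-matrix leverage. The sketch below demonstrates this identity on simulated data; it is not the Ames search algorithm itself.

    ```python
    # Standard deviation of PRESS residuals via the hat matrix; data are simulated.
    import numpy as np

    rng = np.random.default_rng(5)
    X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
    y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.3, size=30)

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    H = X @ np.linalg.inv(X.T @ X) @ X.T           # hat matrix
    press_resid = resid / (1.0 - np.diag(H))       # leave-one-out residuals
    print("sd of PRESS residuals:", press_resid.std(ddof=1))
    ```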

  18. Correlation between standard plate count and somatic cell count milk quality results for Wisconsin dairy producers.

    PubMed

    Borneman, Darand L; Ingham, Steve

    2014-05-01

    The objective of this study was to determine if a correlation exists between standard plate count (SPC) and somatic cell count (SCC) monthly reported results for Wisconsin dairy producers. Such a correlation may indicate that Wisconsin producers effectively controlling sanitation and milk temperature (reflected in low SPC) also have implemented good herd health management practices (reflected in low SCC). The SPC and SCC results for all grade A and B dairy producers who submitted results to the Wisconsin Department of Agriculture, Trade, and Consumer Protection, in each month of 2012 were analyzed. Grade A producer SPC results were less dispersed than grade B producer SPC results. Regression analysis showed a highly significant correlation between SPC and SCC, but the R² value was very small (0.02-0.03), suggesting that many other factors, besides SCC, influence SPC. Average SCC (across 12 mo) for grade A and B producers decreased with an increase in the number of monthly SPC results (out of 12) that were ≤ 25,000 cfu/mL. A chi-squared test of independence showed that the proportion of monthly SCC results >250,000 cells/mL varied significantly depending on whether the corresponding SPC result was ≤ 25,000 or >25,000 cfu/mL. This significant difference occurred in all months of 2012 for grade A and B producers. The results suggest that a generally consistent level of skill exists across dairy production practices affecting SPC and SCC. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  19. [Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

    PubMed

    Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

    2011-11-01

    To explore the application of negative binomial regression and modified Poisson regression in analyzing the influential factors for injury frequency and the risk factors leading to an increase in injury frequency. 2917 primary and secondary school students were selected from Hefei by a cluster random sampling method and surveyed by questionnaire. The data on count-based injury events were used to fit modified Poisson regression and negative binomial regression models. The risk factors associated with an increase in unintentional injury frequency among juvenile students were explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Lagrange multiplier test showed that the Poisson model exhibited over-dispersion (P < 0.0001); the over-dispersed data were therefore better fitted by the modified Poisson regression and negative binomial regression models. Both showed that male gender, younger age, a father working outside of the hometown, a guardian educated above junior high school level, and smoking might result in higher injury frequencies. For clustered frequency data on injury events, both modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and could give a more accurate interpretation of the relevant factors affecting the frequency of injury.
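
    A minimal sketch of the comparison described, in Python with statsmodels: fit a Poisson model, check it for over-dispersion, then fit a negative binomial model. The counts are simulated, not the Hefei survey data.

    ```python
    # Poisson over-dispersion check followed by a negative binomial fit.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 500
    x = rng.normal(size=n)
    mu = np.exp(0.2 + 0.5 * x)
    y = rng.negative_binomial(2, 2 / (2 + mu))     # over-dispersed counts with mean mu

    X = sm.add_constant(x)
    poisson = sm.GLM(y, X, family=sm.families.Poisson()).fit()
    print("Pearson chi2 / df:", poisson.pearson_chi2 / poisson.df_resid)  # >> 1 flags over-dispersion

    negbin = sm.NegativeBinomial(y, X).fit(disp=False)
    print(negbin.params)                           # coefficients plus dispersion alpha
    ```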

  20. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules

    PubMed Central

    Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Twelve suspicious sonographic features of those nodules were used to assess thyroid nodules. The significant features for diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to the diagnosis of thyroid nodules at a level of p < 0.05 were included in a logistic regression model. The significant features in the logistic regression model for diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic features, echogenicity, and elastography score. According to the results of the logistic regression analysis, a formula that predicts whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930, and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79%, respectively. PMID:29228030

  1. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules.

    PubMed

    Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Twelve suspicious sonographic features of those nodules were used to assess thyroid nodules. The significant features for diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to the diagnosis of thyroid nodules at a level of p < 0.05 were included in a logistic regression model. The significant features in the logistic regression model for diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic features, echogenicity, and elastography score. According to the results of the logistic regression analysis, a formula that predicts whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930, and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79%, respectively.

  2. Comparing least-squares and quantile regression approaches to analyzing median hospital charges.

    PubMed

    Olsen, Cody S; Clark, Amy E; Thomas, Andrea M; Cook, Lawrence J

    2012-07-01

    Emergency department (ED) and hospital charges obtained from administrative data sets are useful descriptors of injury severity and the burden to EDs and the health care system. However, charges are typically positively skewed due to costly procedures, long hospital stays, and complicated or prolonged treatment for few patients. The median is not affected by extreme observations and is useful in describing and comparing distributions of hospital charges. A least-squares analysis employing a log transformation is one approach for estimating median hospital charges, corresponding confidence intervals (CIs), and differences between groups; however, this method requires certain distributional properties. An alternate method is quantile regression, which allows estimation and inference related to the median without making distributional assumptions. The objective was to compare the log-transformation least-squares method to the quantile regression approach for estimating median hospital charges, differences in median charges between groups, and associated CIs. The authors performed simulations using repeated sampling of observed statewide ED and hospital charges and charges randomly generated from a hypothetical lognormal distribution. The median and 95% CI and the multiplicative difference between the median charges of two groups were estimated using both least-squares and quantile regression methods. Performance of the two methods was evaluated. In contrast to least squares, quantile regression produced estimates that were unbiased and had smaller mean square errors in simulations of observed ED and hospital charges. Both methods performed well in simulations of hypothetical charges that met least-squares method assumptions. When the data did not follow the assumed distribution, least-squares estimates were often biased, and the associated CIs had lower than expected coverage as sample size increased. Quantile regression analyses of hospital charges provide unbiased estimates even when lognormal and equal variance assumptions are violated. These methods may be particularly useful in describing and analyzing hospital charges from administrative data sets. © 2012 by the Society for Academic Emergency Medicine.
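
    A minimal sketch of the quantile regression approach follows, on simulated lognormal "charges" with a binary group indicator; because quantiles are equivariant to monotone transformations, exponentiated coefficients from a median regression of log charges give multiplicative differences in median charges. The variable names are illustrative, not the study's.

    ```python
    # Median (0.5-quantile) regression of log charges on a group indicator.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(7)
    n = 400
    group = rng.integers(0, 2, size=n)
    charges = np.exp(7 + 0.4 * group + rng.normal(size=n))   # right-skewed charges
    df = pd.DataFrame({"log_charges": np.log(charges), "group": group})

    fit = smf.quantreg("log_charges ~ group", df).fit(q=0.5)
    print(np.exp(fit.params))   # multiplicative difference in median charges
    ```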

  3. Regression Techniques for Determining the Effective Impervious Area in Southern California Watersheds

    NASA Astrophysics Data System (ADS)

    Sultana, R.; Mroczek, M.; Dallman, S.; Sengupta, A.; Stein, E. D.

    2016-12-01

    The portion of the Total Impervious Area (TIA) that is hydraulically connected to the storm drainage network is called the Effective Impervious Area (EIA). The remaining fraction of impervious area, called the non-effective impervious area, drains onto pervious surfaces and does not contribute to runoff for smaller events. Using the TIA instead of the EIA in models and calculations can lead to overestimates of runoff volumes and peak discharges and to oversizing of the drainage system, since it is assumed that all impervious areas produce urban runoff that is directly connected to storm drains. This makes EIA a better predictor of actual runoff from urban catchments for hydraulic design of storm drain systems and modeling non-point source pollution. Compared to TIA, the EIA is considerably more difficult to determine, since it cannot be found by using remote sensing techniques, readily available EIA datasets, or aerial imagery interpretation alone. For this study, EIA percentages were calculated by two successive regression methods for five watersheds (with areas of 8.38-158 mi²) located in Southern California, using rainfall-runoff event data for the years 2004-2007. Runoff generated from the smaller storm events is considered to emanate only from the effective impervious areas. Therefore, larger events that were considered to have runoff from both impervious and pervious surfaces were successively removed in the regression methods using a criterion of (1) 1 mm and (2) max(2σ, 1 mm) above the regression line. MSE is calculated from actual runoff and runoff predicted by the regression. Analysis of standard deviations showed that the max(2σ, 1 mm) criterion better fit the regression line and is the preferred method for predicting the EIA percentage. The estimated EIAs were approximately 78% to 43% of the TIA, which shows that the use of EIA instead of TIA can have a significant impact on the cost of building urban hydraulic systems and stormwater capture devices.

  4. Flood-frequency prediction methods for unregulated streams of Tennessee, 2000

    USGS Publications Warehouse

    Law, George S.; Tasker, Gary D.

    2003-01-01

    Up-to-date flood-frequency prediction methods for unregulated, ungaged rivers and streams of Tennessee have been developed. Prediction methods include the regional-regression method and the newer region-of-influence method. The prediction methods were developed using stream-gage records from unregulated streams draining basins having from 1 percent to about 30 percent total impervious area. These methods, however, should not be used in heavily developed or storm-sewered basins with impervious areas greater than 10 percent. The methods can be used to estimate 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence-interval floods of most unregulated rural streams in Tennessee. A computer application was developed that automates the calculation of flood frequency for unregulated, ungaged rivers and streams of Tennessee. Regional-regression equations were derived by using both single-variable and multivariable regional-regression analysis. Contributing drainage area is the explanatory variable used in the single-variable equations. Contributing drainage area, main-channel slope, and a climate factor are the explanatory variables used in the multivariable equations. Deleted-residual standard error for the single-variable equations ranged from 32 to 65 percent. Deleted-residual standard error for the multivariable equations ranged from 31 to 63 percent. These equations are included in the computer application to allow easy comparison of results produced by the different methods. The region-of-influence method calculates multivariable regression equations for each ungaged site and recurrence interval using basin characteristics from 60 similar sites selected from the study area. Explanatory variables that may be used in regression equations computed by the region-of-influence method include contributing drainage area, main-channel slope, a climate factor, and a physiographic-region factor. Deleted-residual standard error for the region-of-influence method tended to be only slightly smaller than those for the regional-regression method and ranged from 27 to 62 percent.

  5. The Application of Censored Regression Models in Low Streamflow Analyses

    NASA Astrophysics Data System (ADS)

    Kroll, C.; Luz, J.

    2003-12-01

    Estimation of low streamflow statistics at gauged and ungauged river sites is often a daunting task. This process is further confounded by the presence of intermittent streamflows, where streamflow is sometimes reported as zero, within a region. Streamflows recorded as zero may be zero, or may be less than the measurement detection limit. Such data are often referred to as censored data. Numerous methods have been developed to characterize intermittent streamflow series. Logit regression has been proposed to develop regional models of the probability that annual lowflow series (such as 7-day lowflows) are zero. In addition, Tobit regression, a method of regression that allows for censored dependent variables, has been proposed for lowflow regional regression models in regions where the lowflow statistic of interest is estimated as zero at some sites. While these methods have been proposed, their use in practice has been limited. Here a delete-one jackknife simulation is presented to examine the performance of Logit and Tobit models of 7-day annual minimum flows in 6 USGS water resource regions in the United States. For the Logit model, an assessment is made of whether sites are correctly classified as having at least 10% of 7-day annual lowflows equal to zero. In such a situation, the 7-day, 10-year lowflow (Q710), a commonly employed low streamflow statistic, would be reported as zero. For the Tobit model, a comparison is made between results from the Tobit model and from performing either ordinary least squares (OLS) or principal component regression (PCR) after the zero sites are dropped from the analysis. Initial results for the Logit model indicate that it has a high probability of correctly classifying sites into groups with zero and non-zero Q710s. Initial results also indicate that the Tobit model produces better results than PCR and OLS when more than 5% of the sites in the region have Q710 values calculated as zero.
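
    Because Tobit regression is less commonly packaged than OLS, a minimal maximum-likelihood sketch for a dependent variable left-censored at zero is given below; it is an illustrative implementation on simulated data, not the authors' regional model.

    ```python
    # Tobit (left-censored at zero) regression by maximum likelihood.
    import numpy as np
    from scipy import optimize, stats

    rng = np.random.default_rng(8)
    n = 300
    x = rng.normal(size=n)
    y_star = 0.5 + 1.0 * x + rng.normal(size=n)    # latent lowflow statistic
    y = np.maximum(y_star, 0.0)                    # observed, censored at zero

    X = np.column_stack([np.ones(n), x])

    def negloglik(theta):
        b, log_s = theta[:2], theta[2]
        s = np.exp(log_s)
        mu = X @ b
        cens = y <= 0
        ll = np.where(cens,
                      stats.norm.logcdf(-mu / s),               # P(latent value <= 0)
                      stats.norm.logpdf((y - mu) / s) - log_s)  # density of observed y
        return -ll.sum()

    res = optimize.minimize(negloglik, x0=np.zeros(3), method="BFGS")
    print(res.x[:2], np.exp(res.x[2]))   # coefficient estimates and sigma
    ```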

  6. Dairy calf management-A comparison of practices and producer attitudes among conventional and organic herds.

    PubMed

    Pempek, J A; Schuenemann, G M; Holder, E; Habing, G G

    2017-10-01

    Dairy calves are at high risk for morbidity and mortality early in life. Understanding producer attitudes is important for implementation of best management practices to improve calf health. The objectives of this study were to evaluate usage frequency and producer attitudes on key calf management practices between conventional and organic dairy operations. A cross-sectional survey was mailed to conventional and organic dairy producers in Ohio and Michigan that included questions on cow-calf separation, colostrum management, and vaccination use. The overall survey response rate was 49% (727/1,488); 449 and 172 conventional and organic producer respondents, respectively, were included in the final analysis. Binary, cumulative, and multinomial logistic regression models were used to test differences within and between herd types for management practices and producer attitudes. The majority of conventional (64%, 279/439) producers reported separating the calf from the dam 30 min to 6 h after birth. More organic (34%, 56/166) than conventional (18%, 80/439) producers reported separation 6 to 12 h after birth, and organic producers were more likely to agree that time before separation is beneficial. Few conventional (10%, 44/448) and organic (3%, 5/171) producers reported measuring colostrum quality. Most conventional producers (68%, 304/448) hand-fed the first feeding of colostrum, whereas the largest share of organic producers (38%, 69/171) allowed calves to nurse colostrum. Last, 44% (188/430) of conventional producers reported vaccinating their calves for respiratory disease, compared with 14% (22/162) of organic producers; organic producers were more likely to perceive vaccines as ineffective and harmful to calf health. Thus, the usage frequency and perceived risks and benefits of calf management practices vary considerably between conventional and organic dairy producers. These findings provide helpful information to understand decision making at the herd level regarding key calf management and health practices, regardless of production system. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  7. Titrating versus targeting home care services to frail elderly clients: an application of agency theory and cost-benefit analysis to home care policy.

    PubMed

    Weissert, William; Chernew, Michael; Hirth, Richard

    2003-02-01

    The article summarizes the shortcomings of current home care targeting policy, provides a conceptual framework for understanding the sources of its problems, and proposes an alternative resource allocation method. Methods required for different aspects of the study included synthesis of the published literature, regression analysis of risk predictors, and comparison of actual resource allocations with simulated budgets. Problems of imperfect agency ranging from unclear goals and inappropriate incentives to lack of information about the marginal effectiveness of home care could be mitigated with an improved budgeting method that combines client selection and resource allocation. No program can produce its best outcome performance when its goals are unclear and its technology is unstandardized. Titration of care would reallocate resources to maximize marginal benefit for marginal cost.

  8. Compressive strength of human openwedges: a selection method

    NASA Astrophysics Data System (ADS)

    Follet, H.; Gotteland, M.; Bardonnet, R.; Sfarghiu, A. M.; Peyrot, J.; Rumelhart, C.

    2004-02-01

    A series of 44 samples of bone wedges of human origin, intended for allograft openwedge osteotomy and obtained without particular precautions during hip arthroplasty, were re-examined. After chemical treatment for viral inactivation, lyophilisation, and radio-sterilisation (intended to produce optimal health safety), the compressive strength, independent of age, sex, and the height of the sample (or angle of cut), proved to be too widely dispersed [10-158 MPa] in the first study. We propose a method for selecting samples which takes into account their geometry (width, length, thicknesses, cortical surface area). Statistical methods (Principal Components Analysis (PCA), Hierarchical Cluster Analysis, multilinear regression) allowed final selection of 29 samples having a mean compressive strength σmax = 103 ± 26 MPa and with variation [61-158 MPa]. These results are equivalent to or greater than those of materials currently used in openwedge osteotomy.

  9. Numerical analysis and experimental studies on solenoid common rail diesel injector with worn control valve

    NASA Astrophysics Data System (ADS)

    Krivtsov, S. N.; Yakimov, I. V.; Ozornin, S. P.

    2018-03-01

    A mathematical model of a solenoid common rail fuel injector was developed. Its difference from existing models is the simulation of control valve wear. A common rail injector of the 0445110376 Series (Cummins ISf 2.8 diesel engine) produced by Bosch was used as the research object. Injector parameters (fuel delivery and back leakage) were determined by calculation and experimental methods. The average R² of the GT-Suite model is 0.93, which means that it predicts the injection rate shape very accurately (for both nominal and marginal technical conditions of an injector). Numerical analysis and experimental studies showed that control valve wear increases back leakage and fuel delivery (especially at 160 MPa). Regression models relating fuel delivery and back leakage to fuel pressure and energizing time were developed (for nominal and marginal technical conditions).

  10. An observational study on bloodstream extended-spectrum beta-lactamase infection in critical care unit: incidence, risk factors and its impact on outcome.

    PubMed

    Nasa, Prashant; Juneja, Deven; Singh, Omender; Dang, Rohit; Singh, Akhilesh

    2012-03-01

    The incidence of nosocomial infections caused by extended-spectrum beta-lactamase (ESBL) producing microbes has increased rapidly in the last few years. However, the clinical significance of infections caused by ESBL-producing bacteria in ICU patients remains unclear. We did a prospective study to look for the incidence, risk factors and outcome of these infections in ICU patients. Consecutive isolates of Escherichia coli and Klebsiella pneumoniae in blood cultures were included for the analysis. Patients were divided into two groups based on the production of ESBL. The primary outcome measure was ICU mortality. Logistic regression analysis was done to identify risk factors for ESBL production. Among the 95 isolates tested, 73 (76.8%) produced ESBL. Transfer from other hospitals or wards (OR 3.65; 95% CI: 1.3-10.1 and RR 1.35; 95% CI: 1.05-1.73) and previous history of antibiotic usage (OR 3.54; 95% CI: 1.04-11.97 and RR 1.5; 95% CI: 0.89-2.5) were risk factors for ESBL production. There was no significant difference in ICU mortality (p=0.588) or in the need for organ support between the two groups. There is a high incidence of ESBL-producing organisms causing bloodstream infections in critically ill patients. Transfer from other hospitals and previous antibiotic usage are important risk factors for ESBL production. However, ESBL production may not be associated with a poorer outcome if appropriate early antibiotic therapy is instituted. Copyright © 2011 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.

  11. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

    PubMed

    Bennett, Bradley C; Husby, Chad E

    2008-03-28

    Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet the requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R² from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
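
    The binomial model for a single family can be illustrated in a few lines; the counts below are invented, and the null proportion is the flora-wide medicinal fraction, following the approach the authors describe.

    ```python
    # Binomial test of one family's medicinal-plant count against the flora-wide rate.
    from scipy.stats import binomtest

    flora_total, medicinal_total = 4000, 600      # hypothetical whole flora
    family_species, family_medicinal = 120, 35    # hypothetical single family

    p0 = medicinal_total / flora_total            # null: family matches flora-wide rate
    result = binomtest(family_medicinal, family_species, p0)
    print(result.pvalue)                          # small p -> over-represented family
    ```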

  12. The Precision Efficacy Analysis for Regression Sample Size Method.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.; Barcikowski, Robert S.

    The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…

  13. Effect of Contact Damage on the Strength of Ceramic Materials.

    DTIC Science & Technology

    1982-10-01

    variables that are important to erosion, and a multivariate linear regression analysis is used to fit the data to the dimensional analysis. The … of Equations 7 and 8 by a multivariable regression analysis (room temperature data) … [remainder of record garbled in extraction; only fragments of a regression table and reference list survive]

  14. Future Performance Trend Indicators: A Current Value Approach to Human Resources Accounting. Report III. Multivariate Predictions of Organizational Performance Across Time.

    ERIC Educational Resources Information Center

    Pecorella, Patricia A.; Bowers, David G.

    Multiple regression in a double cross-validated design was used to predict two performance measures (total variable expense and absence rate) by multi-month period in five industrial firms. The regressions do cross-validate, and produce multiple coefficients which display both concurrent and predictive effects, peaking 18 months to two years…

  15. Urban stormwater quality, event-mean concentrations, and estimates of stormwater pollutant loads, Dallas-Fort Worth area, Texas, 1992-93

    USGS Publications Warehouse

    Baldys, Stanley; Raines, T.H.; Mansfield, B.L.; Sandlin, J.T.

    1998-01-01

    Local regression equations were developed to estimate loads produced by individual storms. Mean annual loads were estimated by applying the storm-load equations for all runoff-producing storms in an average climatic year and summing individual storm loads to determine the annual load.

  16. Analysis of the Magnitude and Frequency of Peak Discharges for the Navajo Nation in Arizona, Utah, Colorado, and New Mexico

    USGS Publications Warehouse

    Waltemeyer, Scott D.

    2006-01-01

    Estimates of the magnitude and frequency of peak discharges are necessary for reliable flood-hazard mapping in the Navajo Nation in Arizona, Utah, Colorado, and New Mexico. The Bureau of Indian Affairs, U.S. Army Corps of Engineers, and Navajo Nation requested that the U.S. Geological Survey update estimates of peak discharge magnitude for gaging stations in the region and update regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites using data collected through 1999 at 146 gaging stations, an additional 13 years of peak-discharge data since a 1997 investigation, which used gaging-station data through 1986. The equations for estimation of peak discharges at ungaged sites were developed for flood regions 8, 11, high elevation, and 6, delineated on the basis of the hydrologic codes from the 1997 investigation. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to the frequency analysis of 82 of the 146 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated peak discharges having a recurrence interval of less than 1.4 years from the probability-density function. Within each region, logarithms of the peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, were then applied to the same data used in the ordinary least-squares regression analyses. For region 8, the average standard error of prediction for a peak discharge having a recurrence interval of 100 years was 53 percent. The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 45 to 83 percent for the 100-year flood. The estimated standard error of prediction for a hybrid method for region 11 was large in the 1997 investigation. No distinction of floods produced from a high-elevation region was made in the 1997 investigation. Overall, the equations based on generalized least-squares regression techniques are considered to be more reliable than those in the 1997 report because of the increased length of record and improved GIS methods. Flood-frequency relations can be transferred to an ungaged site on the same stream by direct application of the regional regression equation or, for an ungaged site on a stream that has a gaging station upstream or downstream, by using the drainage-area ratio and the drainage-area exponent from the regional regression equation of the respective region.

  17. Common pitfalls in statistical analysis: Linear regression analysis

    PubMed Central

    Aggarwal, Rakesh; Ranganathan, Priya

    2017-01-01

    In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022

  18. Modifications of haematology analyzers to improve cell counting and leukocyte differentiating in cerebrospinal fluid controls of the Joint German Society for Clinical Chemistry and Laboratory Medicine.

    PubMed

    Kleine, Tilmann O; Nebe, C Thomas; Löwer, Christa; Lehmitz, Reinhard; Kruse, Rolf; Geilenkeuser, Wolf-Jochen; Dorn-Beineke, Alexandra

    2009-08-01

    Flow cytometry (FCM) is used with haematology analyzers (HAs) to count cells and differentiate leukocytes in cerebrospinal fluid (CSF). To evaluate the FCM techniques of HAs, 10 external DGKL trials with CSF controls were carried out from 2004 to 2008. Eight single-platform HAs with and without CSF equipment were evaluated with living blood leukocytes and erythrocytes in CSF-like DGKL controls: Coulter (LH750, 755), Abbott CD3200, CD3500, CD3700, CD4000, Sapphire, ADVIA 120(R) CSF assay, and Sysmex XE-2100(R). Results were compared with visual counting of unstained native cells in a Fuchs-Rosenthal chamber, and with absolute values of leukocyte differentiation assayed by dual-platform analysis with immune-FCM (FACSCalibur, CD45, CD14) and the chamber counts. Reference values X were compared with HA values Y by statistical evaluation with Passing/Bablok (P/B) linear regression analysis to reveal conformity of both methods. The HAs studied produced no valid results with DGKL CSF controls, because P/B regression revealed no conformity with the reference values, due to: blank problems with impedance analysis; leukocyte loss with preanalytical erythrocyte lysis procedures, especially of monocytes; and inaccurate results with ADVIA cell sphering and cell differentiation with algorithms and enzyme activities (e.g., peroxidase). HA techniques have to be improved, e.g., using no erythrocyte lysis and CSF-adequate techniques, to examine CSF samples precisely and accurately. Copyright 2009 International Society for Advancement of Cytometry.

  19. Quantitative analysis of triacylglycerol regioisomers in fats and oils using reversed-phase high-performance liquid chromatography and atmospheric pressure chemical ionization mass spectrometry.

    PubMed

    Fauconnot, Laëtitia; Hau, Jörg; Aeschlimann, Jean-Marc; Fay, Laurent-Bernard; Dionisi, Fabiola

    2004-01-01

    Positional distribution of fatty acyl chains of triacylglycerols (TGs) in vegetable oils and fats (palm oil, cocoa butter) and animal fats (beef, pork and chicken fats) was examined by reversed-phase high-performance liquid chromatography (RP-HPLC) coupled to atmospheric pressure chemical ionization using a quadrupole mass spectrometer. Quantification of regioisomers was achieved for TGs containing two different fatty acyl chains (palmitic (P), stearic (S), oleic (O), and/or linoleic (L)). For seven pairs of 'AAB/ABA'-type TGs, namely PPS/PSP, PPO/POP, SSO/SOS, POO/OPO, SOO/OSO, PPL/PLP and LLS/LSL, calibration curves were established on the basis of the difference in relative abundances of the fragment ions produced by preferred losses of the fatty acid from the 1/3-position compared to the 2-position. In practice the positional isomers AAB and ABA yield mass spectra showing a significant difference in relative abundance ratios of the ions AA(+) to AB(+). Statistical analysis of the validation data obtained from analysis of TG standards and spiked oils showed that, under repeatability conditions, least-squares regression can be used to establish calibration curves for all pairs. The regression models show linear behavior that allow the determination of the proportion of each regioisomer in an AAB/ABA pair, within a working range from 10 to 1000 µg/mL and a 95% confidence interval of ±3% for three replicates. Copyright 2003 John Wiley & Sons, Ltd.

  20. Indicators of suboptimal performance embedded in the Wechsler Memory Scale-Fourth Edition (WMS-IV).

    PubMed

    Bouman, Zita; Hendriks, Marc P H; Schmand, Ben A; Kessels, Roy P C; Aldenkamp, Albert P

    2016-01-01

    Recognition and visual working memory tasks from the Wechsler Memory Scale-Fourth Edition (WMS-IV) have previously been documented as useful indicators for suboptimal performance. The present study examined the clinical utility of the Dutch version of the WMS-IV (WMS-IV-NL) for the identification of suboptimal performance using an analogue study design. The patient group consisted of 59 mixed-etiology patients; the experimental malingerers were 50 healthy individuals who were asked to simulate cognitive impairment as a result of a traumatic brain injury; the last group consisted of 50 healthy controls who were instructed to put forth full effort. Experimental malingerers performed significantly lower on all WMS-IV-NL tasks than did the patients and healthy controls. A binary logistic regression analysis was performed on the experimental malingerers and the patients. The first model contained the visual working memory subtests (Spatial Addition and Symbol Span) and the recognition tasks of the following subtests: Logical Memory, Verbal Paired Associates, Designs, Visual Reproduction. The results showed an overall classification rate of 78.4%, and only Spatial Addition explained a significant amount of variation (p < .001). Subsequent logistic regression analysis and receiver operating characteristic (ROC) analysis supported the discriminatory power of the subtest Spatial Addition. A scaled score cutoff of <4 produced 93% specificity and 52% sensitivity for detection of suboptimal performance. The WMS-IV-NL Spatial Addition subtest may provide clinically useful information for the detection of suboptimal performance.

  1. A monoclonal cytolytic T-lymphocyte response observed in a melanoma patient vaccinated with a tumor-specific antigenic peptide encoded by gene MAGE-3

    PubMed Central

    Coulie, Pierre G.; Karanikas, Vaios; Colau, Didier; Lurquin, Christophe; Landry, Claire; Marchand, Marie; Dorval, Thierry; Brichard, Vincent; Boon, Thierry

    2001-01-01

    Vaccination of melanoma patients with tumor-specific antigens recognized by cytolytic T lymphocytes (CTL) produces significant tumor regressions in a minority of patients. These regressions appear to occur in the absence of massive CTL responses. To detect low-level responses, we resorted to antigenic stimulation of blood lymphocyte cultures in limiting dilution conditions, followed by tetramer analysis, cloning of the tetramer-positive cells, and T-cell receptor (TCR) sequence analysis of the CTL clones that showed strict specificity for the tumor antigen. A monoclonal CTL response against a MAGE-3 antigen was observed in a melanoma patient, who showed partial rejection of a large metastasis after treatment with a vaccine containing only the tumor-specific antigenic peptide. Tetramer analysis after in vitro restimulation indicated that about 1/40,000 postimmunization CD8+ blood lymphocytes were directed against the antigen. The same TCR was present in all of the positive microcultures. TCR evaluation carried out directly on blood lymphocytes by PCR amplification led to a similar frequency estimate after immunization, whereas the TCR was not found among 2.5 × 10⁶ CD8+ lymphocytes collected before immunization. Our results prove unambiguously that vaccines containing only a tumor-specific antigenic peptide can elicit a CTL response. Even though they provide no information about the effector mechanisms responsible for the observed reduction in tumor mass in this patient, they would suggest that low-level CTL responses can initiate tumor rejection. PMID:11517302

  2. Potential for Bias When Estimating Critical Windows for Air Pollution in Children's Health.

    PubMed

    Wilson, Ander; Chiu, Yueh-Hsiu Mathilda; Hsu, Hsiao-Hsien Leon; Wright, Robert O; Wright, Rosalind J; Coull, Brent A

    2017-12-01

    Evidence supports an association between maternal exposure to air pollution during pregnancy and children's health outcomes. Recent interest has focused on identifying critical windows of vulnerability. An analysis based on a distributed lag model (DLM) can yield estimates of a critical window that are different from those from an analysis that regresses the outcome on each of the 3 trimester-average exposures (TAEs). Using a simulation study, we assessed bias in estimates of critical windows obtained using 3 regression approaches: 1) 3 separate models to estimate the association with each of the 3 TAEs; 2) a single model to jointly estimate the association between the outcome and all 3 TAEs; and 3) a DLM. We used weekly fine-particulate-matter exposure data for 238 births in a birth cohort in and around Boston, Massachusetts, and a simulated outcome and time-varying exposure effect. Estimates using separate models for each TAE were biased and identified incorrect windows. This bias arose from seasonal trends in particulate matter that induced correlation between TAEs. Including all TAEs in a single model reduced bias. DLM produced unbiased estimates and added flexibility to identify windows. Analysis of body mass index z score and fat mass in the same cohort highlighted inconsistent estimates from the 3 methods. © The Author(s) 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
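
    To illustrate why a DLM can outperform separate TAE regressions, the sketch below fits an Almon-style polynomial distributed lag model to simulated weekly exposures; the cubic lag basis and the simulated mid-pregnancy window are illustrative choices, not the authors' specification.

    ```python
    # Almon-style distributed lag model: lag weights constrained to a cubic polynomial.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(9)
    n, L = 238, 37                                           # births x gestational weeks
    exposure = rng.normal(size=(n, L))
    true_w = np.exp(-0.5 * ((np.arange(L) - 20) / 4) ** 2)   # window centered on week 20
    y = exposure @ true_w + rng.normal(size=n)

    basis = np.vander(np.arange(L) / L, 4, increasing=True)  # L x 4 polynomial lag basis
    Z = exposure @ basis                                     # reduced design matrix
    fit = sm.OLS(y, sm.add_constant(Z)).fit()
    lag_curve = basis @ fit.params[1:]                       # estimated weekly effects
    print(np.round(lag_curve, 2))                            # peaks near week 20
    ```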

  3. Dynamic and Regression Modeling of Ocean Variability in the Tide-Gauge Record at Seasonal and Longer Periods

    NASA Technical Reports Server (NTRS)

    Hill, Emma M.; Ponte, Rui M.; Davis, James L.

    2007-01-01

    Comparison of monthly mean tide-gauge time series to corresponding model time series based on a static inverted barometer (IB) for pressure-driven fluctuations and an ocean general circulation model (OM) reveals that the combined model successfully reproduces seasonal and interannual changes in relative sea level at many stations. Removal of the OM and IB from the tide-gauge record produces residual time series with a mean global variance reduction of 53%. The OM is mis-scaled for certain regions, and 68% of the residual time series contain significant seasonal variability after removal of the OM and IB from the tide-gauge data. Including OM admittance parameters and seasonal coefficients in a regression model for each station, with IB also removed, produces residual time series with mean global variance reduction of 71%. Examination of the regional improvement in variance caused by scaling the OM, including seasonal terms, or both, indicates weakness in the model at predicting sea-level variation for constricted ocean regions. The model is particularly effective at reproducing sea-level variation for stations in North America, Europe, and Japan. The RMS residual for many stations in these areas is 25-35 mm. The production of "cleaner" tide-gauge time series, with oceanographic variability removed, is important for future analysis of nonsecular and regionally differing sea-level variations. Understanding the ocean model's strengths and weaknesses will allow for future improvements of the model.
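
    A minimal sketch, on synthetic data, of the station-by-station regression described above: an OM admittance term plus annual seasonal coefficients, fitted by least squares. The series length, noise levels, and true coefficients are assumptions, not the study's data:

    ```python
    # Residual sea level = admittance * OM + annual harmonics, fitted by OLS.
    import numpy as np

    rng = np.random.default_rng(1)
    t = np.arange(240) / 12.0                       # 20 years of monthly means (years)
    om = rng.normal(0, 20, t.size)                  # ocean-model series (mm), synthetic
    # Synthetic tide-gauge series: mis-scaled OM plus a seasonal cycle plus noise.
    tg = 1.3*om + 15*np.cos(2*np.pi*t) + 8*np.sin(2*np.pi*t) + rng.normal(0, 5, t.size)

    # Design matrix: intercept, OM admittance, annual cosine and sine terms.
    X = np.column_stack([np.ones_like(t), om, np.cos(2*np.pi*t), np.sin(2*np.pi*t)])
    beta, *_ = np.linalg.lstsq(X, tg, rcond=None)
    resid = tg - X @ beta

    print("admittance ~ %.2f, seasonal amplitude ~ %.1f mm"
          % (beta[1], np.hypot(beta[2], beta[3])))
    print("variance reduction: %.0f%%" % (100*(1 - resid.var()/tg.var())))
    ```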

  4. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life studies play an important role in health care, especially in chronic diseases, in clinical judgment, and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normally distributed the results can be misleading. The aim of this study was to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare it to linear regression. A cross-sectional study was conducted on 119 breast cancer patients admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis, and dyspnea was significant only for the first quartile. Emotional functioning and duration of disease also statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about the predictors of breast cancer patient quality of life.
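
    A minimal sketch of the comparison described above, on synthetic data with hypothetical predictor names; statsmodels' QuantReg stands in for the SAS procedure actually used:

    ```python
    # Compare OLS with quantile regression at the quartiles on a skewed outcome.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 119
    dyspnea = rng.integers(0, 2, n)                  # hypothetical binary predictor
    findiff = rng.integers(0, 2, n)                  # "financial difficulties"
    # Skewed, heteroscedastic QOL score: predictor effects differ across quantiles.
    qol = 65 - 8*dyspnea*rng.uniform(0, 2, n) - 3*findiff + rng.gumbel(0, 8, n)

    X = sm.add_constant(np.column_stack([dyspnea, findiff]))
    print("OLS:", np.round(sm.OLS(qol, X).fit().params, 2))
    for q in (0.25, 0.50, 0.75):                     # first quartile, median, third quartile
        res = sm.QuantReg(qol, X).fit(q=q)
        print("q=%.2f:" % q, np.round(res.params, 2))
    ```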

  5. Random regression models on Legendre polynomials to estimate genetic parameters for weights from birth to adult age in Canchim cattle.

    PubMed

    Baldi, F; Albuquerque, L G; Alencar, M M

    2010-08-01

    The objective of this work was to estimate covariance functions for direct and maternal genetic effects, animal and maternal permanent environmental effects, and subsequently, to derive relevant genetic parameters for growth traits in Canchim cattle. Data comprised 49,011 weight records on 2435 females from birth to adult age. The model of analysis included fixed effects of contemporary groups (year and month of birth and at weighing) and age of dam as quadratic covariable. Mean trends were taken into account by a cubic regression on orthogonal polynomials of animal age. Residual variances were allowed to vary and were modelled by a step function with 1, 4 or 11 classes based on animal's age. The model fitting four classes of residual variances was the best. A total of 12 random regression models from second to seventh order were used to model direct and maternal genetic effects, animal and maternal permanent environmental effects. The model with direct and maternal genetic effects, animal and maternal permanent environmental effects fitted by quadratic, cubic, quintic and linear Legendre polynomials, respectively, was the most adequate to describe the covariance structure of the data. Estimates of direct and maternal heritability obtained by multi-trait (seven traits) and random regression models were very similar. Selection for higher weight at any age, especially after weaning, will produce an increase in mature cow weight. The possibility to modify the growth curve in Canchim cattle to obtain animals with rapid growth at early ages and moderate to low mature cow weight is limited.
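
    As a sketch of the machinery involved, the covariates of a random regression on Legendre polynomials can be built by mapping ages to [-1, 1] and evaluating the polynomials there. The ages below are illustrative, and the variance-component estimation itself would require REML software and is not shown:

    ```python
    # Build Legendre-polynomial covariates for a random regression model.
    import numpy as np
    from numpy.polynomial import legendre

    age = np.array([1, 240, 450, 730, 1460, 2920])        # days, birth to adult (example)
    a = 2*(age - age.min())/(age.max() - age.min()) - 1   # map ages to [-1, 1]

    def legendre_design(x, order):
        # Columns P0..P_order evaluated at x; one row per weight record.
        return np.column_stack([legendre.legval(x, np.eye(order + 1)[k])
                                for k in range(order + 1)])

    Phi = legendre_design(a, 3)      # cubic basis, as for one of the effects above
    print(Phi.round(3))
    ```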

  6. Linear and evolutionary polynomial regression models to forecast coastal dynamics: Comparison and reliability assessment

    NASA Astrophysics Data System (ADS)

    Bruno, Delia Evelina; Barca, Emanuele; Goncalves, Rodrigo Mikosz; de Araujo Queiroz, Heithor Alexandre; Berardi, Luigi; Passarella, Giuseppe

    2018-01-01

    In this paper, the Evolutionary Polynomial Regression data-modelling strategy has been applied to study small-scale, short-term coastal morphodynamics, given its capability to treat a wide database of known information non-linearly. Simple linear and multilinear regression models were also applied, in order to weigh computational load against reliability of estimation across the three models. In fact, even though it is easy to imagine that the more complex the model, the better the prediction, sometimes a "slight" worsening of estimations can be accepted in exchange for the time saved in data organization and computational load. The models' outcomes were validated through a detailed statistical error analysis, which revealed a slightly better estimation by the polynomial model with respect to the multilinear model, as expected. On the other hand, even though the data organization was identical for the two models, the multilinear one required a simpler simulation setting and a faster run time. Finally, the most reliable evolutionary polynomial regression model was used to examine how estimation uncertainty grows as the extrapolation time is extended. The overlap between the confidence band of the mean of the known coast position and the prediction band of the estimated position can be a good index of the loss of reliability when the extrapolation time increases too much. The proposed models and tests have been applied to a coastal sector located near Torre Colimena in the Apulia region, south Italy.

  7. The microcomputer scientific software series 2: general linear model--regression.

    Treesearch

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  8. Statistical performance of image cytometry for DNA, lipids, cytokeratin, & CD45 in a model system for circulating tumor cell detection.

    PubMed

    Futia, Gregory L; Schlaepfer, Isabel R; Qamar, Lubna; Behbakht, Kian; Gibson, Emily A

    2017-07-01

    Detection of circulating tumor cells (CTCs) in a blood sample is limited by the sensitivity and specificity of the biomarker panel used to identify CTCs over other blood cells. In this work, we present Bayesian theory that shows how test sensitivity and specificity set the rarity of cell that a test can detect. We perform our calculation of sensitivity and specificity on our image cytometry biomarker panel by testing on pure disease-positive (D+) populations (MCF7 cells) and pure disease-negative (D-) populations (leukocytes). In this system, we performed multi-channel confocal fluorescence microscopy to image biomarkers of DNA, lipids, CD45, and Cytokeratin. Using custom software, we segmented our confocal images into regions of interest consisting of individual cells and computed the image metrics of total signal, second spatial moment, spatial frequency second moment, and the product of the spatial-spatial frequency moments. We present our analysis of these 16 features. The best performing of the 16 features produced an average separation of three standard deviations between D+ and D- and an average detectable rarity of ∼1 in 200. We performed multivariable regression and feature selection to combine multiple features for increased performance and showed an average separation of seven standard deviations between the D+ and D- populations, making our average detectable rarity ∼1 in 480. Histograms and receiver operating characteristic (ROC) curves for these features and regressions are presented. We conclude that simple regression analysis holds promise to further improve the separation of rare cells in cytometry applications. © 2017 International Society for Advancement of Cytometry.
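
    A hedged sketch of the Bayesian point above: sensitivity and specificity bound how rare a cell can be before most test positives are false. Calling a rarity "detectable" while PPV ≥ 0.5 is an illustrative convention assumed here; the abstract's exact 1-in-200 and 1-in-480 figures depend on thresholding details not reproduced:

    ```python
    # How sensitivity and specificity limit detectable rarity via Bayes' theorem.
    def ppv(sens, spec, prevalence):
        # P(D+ | test positive).
        fpr = 1 - spec
        return sens*prevalence / (sens*prevalence + fpr*(1 - prevalence))

    def detectable_rarity(sens, spec):
        # PPV = 0.5  <=>  sens*p = (1-spec)*(1-p)  <=>  p = (1-spec)/(sens + 1-spec)
        fpr = 1 - spec
        return fpr / (sens + fpr)

    for sens, spec in [(0.90, 0.99), (0.90, 0.999)]:   # assumed panel performances
        p = detectable_rarity(sens, spec)
        print("sens=%.2f spec=%.3f -> rarity ~ 1 in %.0f" % (sens, spec, 1/p))
    # A tenfold improvement in the false-positive rate extends the detectable
    # rarity roughly tenfold, which is the crux of the rare-cell problem.
    ```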

  9. USAF (United States Air Force) Stability and Control DATCOM (Data Compendium)

    DTIC Science & Technology

    1978-04-01

    In general, a regression analysis involves the study of a group of variables to determine their effect on a given parameter. Because of the empirical nature of this...

  10. A Method of Calculating Functional Independence Measure at Discharge from Functional Independence Measure Effectiveness Predicted by Multiple Regression Analysis Has a High Degree of Predictive Accuracy.

    PubMed

    Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru

    2017-09-01

    Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis yielded a higher degree of predictive accuracy than predicting mFIM at discharge directly. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
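
    The published formula, wrapped as a small helper; the predicted mFIM effectiveness would come from the multiple regression fitted in the study:

    ```python
    # "mFIM at discharge = effectiveness * (91 - mFIM at admission) + mFIM at admission"
    def mfim_discharge(mfim_admission, predicted_effectiveness, max_mfim=91):
        return predicted_effectiveness * (max_mfim - mfim_admission) + mfim_admission

    # Example: a patient admitted with mFIM 40 and a predicted effectiveness of 0.6
    # is predicted to reach 0.6 * (91 - 40) + 40 = 70.6 points at discharge.
    print(mfim_discharge(40, 0.6))
    ```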

  11. Use of Multiple Regression and Use-Availability Analyses in Determining Habitat Selection by Gray Squirrels (Sciurus Carolinensis)

    Treesearch

    John W. Edwards; Susan C. Loeb; David C. Guynn

    1994-01-01

    Multiple regression and use-availability analyses are two methods for examining habitat selection. Use-availability analysis is commonly used to evaluate macrohabitat selection whereas multiple regression analysis can be used to determine microhabitat selection. We compared these techniques using behavioral observations (n = 5534) and telemetry locations (n = 2089) of...

  12. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    PubMed

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
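
    A minimal simulation (not the authors' code) of why ignoring censoring distorts a standard linear regression of a survival trait on a QTL genotype; all effect sizes and distributions below are illustrative assumptions:

    ```python
    # Naive linear regression of a censored age-at-onset trait on genotype.
    import numpy as np

    rng = np.random.default_rng(3)
    n = 2000
    genotype = rng.integers(0, 3, n)                              # 0/1/2 copies of a QTL allele
    age_onset = rng.weibull(1.5, n) * 10 * np.exp(-0.2*genotype)  # assumed true QTL effect
    censor_age = rng.uniform(2, 12, n)                            # study ends at a random age
    observed = np.minimum(age_onset, censor_age)
    event = age_onset <= censor_age

    def slope(x, y):
        X = np.column_stack([np.ones_like(y), x])
        return np.linalg.lstsq(X, y, rcond=None)[0][1]

    print("events only (drops censored):", round(slope(genotype[event], observed[event]), 3))
    print("all records, censored as-is: ", round(slope(genotype, observed), 3))
    # Both naive analyses distort the genotype effect; methods that model the
    # censoring (grouped regression, proportional hazards) avoid this.
    ```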

  13. A structured sparse regression method for estimating isoform expression level from multi-sample RNA-seq data.

    PubMed

    Zhang, L; Liu, X J

    2016-06-03

    With the rapid development of next-generation high-throughput sequencing technology, RNA-seq has become a standard and important technique for transcriptome analysis. For multi-sample RNA-seq data, the existing expression estimation methods usually deal with each RNA-seq sample separately, and ignore the fact that the read distributions are consistent across multiple samples. In the current study, we propose a structured sparse regression method, SSRSeq, to estimate isoform expression using multi-sample RNA-seq data. SSRSeq uses a non-parametric model to capture the general tendency of non-uniform read distribution for all genes across multiple samples. Additionally, our method adds a structured sparse regularization, which not only incorporates the sparse specificity between a gene and its corresponding isoform expression levels, but also reduces the effects of noisy reads, especially for lowly expressed genes and isoforms. Four real datasets were used to evaluate our method on isoform expression estimation. Compared with other popular methods, SSRSeq reduced the variance between multiple samples, and produced more accurate isoform expression estimations, and thus more meaningful biological interpretations.
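
    SSRSeq itself is not reproduced here; the following is a generic, hedged illustration of the underlying idea: segment read coverage regressed on an isoform-compatibility design with a sparsity penalty, so unsupported isoforms are driven to zero. scikit-learn's plain Lasso stands in for the structured penalty:

    ```python
    # Sparse non-negative regression as a toy model of isoform quantification.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(4)
    # Rows: genomic segments; columns: candidate isoforms (1 = isoform covers segment).
    A = rng.integers(0, 2, (30, 6)).astype(float)
    true_expr = np.array([5.0, 0.0, 0.0, 2.0, 0.0, 0.0])   # only 2 isoforms expressed
    reads = A @ true_expr + rng.normal(0, 0.3, 30)         # noisy segment coverage

    model = Lasso(alpha=0.1, positive=True)                # expression must be >= 0
    model.fit(A, reads)
    print(np.round(model.coef_, 2))   # sparse estimate: most isoforms near zero
    ```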

  14. Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units.

    PubMed

    Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios Gruson, Martha; Baechler, Sébastien

    2015-09-01

    According to estimates, around 230 people in Switzerland die as a result of radon exposure. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). The automated classification groups lithological units well in terms of their IRC characteristics. The IRC differences in metamorphic rocks such as gneiss, in particular, are well revealed by this method. The maps produced by random forests soundly represent the regional differences of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variation in IRC data with random forests. Additionally, variable-importance measures from random forests show that building characteristics are less important predictors for IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of the radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables like soil gas radon measurements as well as more detailed geological information. Copyright © 2015 Elsevier Ltd. All rights reserved.
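
    A minimal sketch, on synthetic data with assumed feature names, of the random-forest step: predicting log indoor radon concentration from geological and building features, then reading off explained variance and variable importances:

    ```python
    # Random-forest regression of (synthetic) log indoor radon concentration.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(5)
    n = 5000
    X = np.column_stack([
        rng.integers(0, 5, n),        # lithology cluster label (assumed, cf. k-medoids)
        rng.uniform(200, 2000, n),    # altitude in metres (assumed feature)
        rng.integers(0, 2, n),        # building has a basement? (assumed feature)
    ])
    log_irc = 0.4*X[:, 0] + 0.0005*X[:, 1] + 0.3*X[:, 2] + rng.normal(0, 0.8, n)

    rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
    rf.fit(X, log_irc)
    print("OOB R^2:", round(rf.oob_score_, 2))            # explained variance, cf. 33%
    print("feature importances:", np.round(rf.feature_importances_, 2))
    ```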

  15. Progestin implants can rescue demi-embryo pregnancies in goats: a case study.

    PubMed

    Beckett, D M; Oppenheim, S M; Moyer, A L; BonDurant, R H; Rowe, J D; Anderson, G B

    1999-06-01

    Survival after transfer of demi-embryos (i.e., half-embryos produced by embryo splitting) to recipients usually is lower than survival after transfer of intact embryos. Reduced survival after demi-embryo transfer could be due to loss of viability after splitting, failure of a viable demi-embryo to prevent corpus luteum (CL) regression in the recipient female, or a combination of factors. From a retrospective analysis of pregnancy and embryo survival rates after demi-embryo transfer in sheep and goats, we report the rescue of caprine demi-embryo pregnancies in which CL regression occurred at the end of diestrus despite the presence of a viable conceptus in the uterus with progestin implants. Day 5 or 6 morulae and blastocysts were flushed from superovulated ewes and does and split into demi-embryos of approximately equal halves. Demi-embryos were either transferred fresh to synchronized recipients of the homologous species or frozen in liquid nitrogen. Approximately half of the recipient does and ewes were treated with norgestomet implants on Day 10 of the embryo transfer cycle and again 2 wk later. Serum collected on Day 25 from recipients with implants was assayed for progesterone to determine if a CL of pregnancy had been maintained. Pregnancy was diagnosed by ultrasonography on Day 35 of gestation. Corpus luteum regression occurred despite the presence of a viable conceptus in the uterus in 6 of 55 progestin-treated caprine demi-embryo recipients and in 0 of 66 ovine demi-embryo recipients. Five of the caprine pregnancies were maintained to term with norgestomet implants and produced 5 live kids. The sixth fetus, which was carried by a progestin implant-treated 8-mo-old doeling, died at approximately 50 d of gestation. These results suggest that, at least in goats, some demi-embryos may provide inadequate signaling for maternal recognition of pregnancy, and such pregnancies can be rescued with progestin treatment to the doe.

  16. High-flow oxygen therapy: pressure analysis in a pediatric airway model.

    PubMed

    Urbano, Javier; del Castillo, Jimena; López-Herce, Jesús; Gallardo, José A; Solana, María J; Carrillo, Ángel

    2012-05-01

    The mechanism of high-flow oxygen therapy and the pressures reached in the airway have not been defined. We hypothesized that the flow would generate a low continuous positive pressure, and that elevated flow rates in this model could produce moderate pressures. The objective of this study was to analyze the pressure generated by a high-flow oxygen therapy system in an experimental model of the pediatric airway. An experimental in vitro study was performed. A high-flow oxygen therapy system was connected to 3 types of interface (nasal cannulae, nasal mask, and oronasal mask) and applied to 2 types of pediatric manikin (infant and neonatal). The pressures generated in the circuit, in the airway, and in the pharynx were measured at different flow rates (5, 10, 15, and 20 L/min). The experiment was conducted with and without a leak (mouth sealed and unsealed). Linear regression analyses were performed for each set of measurements. The pressures generated with the different interfaces were very similar. The maximum pressure recorded was 4 cm H(2)O with a flow of 20 L/min via nasal cannulae or nasal mask. When the mouth of the manikin was held open, the pressures reached in the airway and pharynx were undetectable. Linear regression analyses showed a similar linear relationship between flow and pressures measured in the pharynx (pressure = -0.375 + 0.138 × flow) and in the airway (pressure = -0.375 + 0.158 × flow) with the closed mouth condition. According to our hypothesis, high-flow oxygen therapy systems produced a low-level CPAP in an experimental pediatric model, even with the use of very high flow rates. Linear regression analyses showed similar linear relationships between flow and pressures measured in the pharynx and in the airway. This finding suggests that, at least in part, the effects may be due to other mechanisms.
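
    The two fitted lines reported above, wrapped as helpers to predict pharyngeal and airway pressure from flow with the mouth closed:

    ```python
    # Fitted lines from the study: pressure (cm H2O) as a function of flow (L/min).
    def pharynx_pressure(flow):
        return -0.375 + 0.138 * flow

    def airway_pressure(flow):
        return -0.375 + 0.158 * flow

    for flow in (5, 10, 15, 20):
        print("%2d L/min: pharynx %.2f, airway %.2f cm H2O"
              % (flow, pharynx_pressure(flow), airway_pressure(flow)))
    # At 20 L/min the airway prediction is about 2.8 cm H2O, consistent with the
    # low-level CPAP (maximum recorded 4 cm H2O) reported in the study.
    ```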

  17. Performance characteristics of LOX-H2, tangential-entry, swirl-coaxial, rocket injectors

    NASA Technical Reports Server (NTRS)

    Howell, Doug; Petersen, Eric; Clark, Jim

    1993-01-01

    Development of a high performing swirl-coaxial injector requires an understanding of fundamental performance characteristics. This paper addresses the findings of cold-flow atomization characterization studies, which provided information on the influence of fluid properties and element operating conditions on the droplet sprays produced. These findings are applied to actual rocket conditions. The performance characteristics of swirl-coaxial injection elements under multi-element hot-fire conditions were obtained by analysis of combustion performance data from three separate test series. The injection elements are described and test results are analyzed using multi-variable linear regression. A direct comparison of test results indicated that reduced fuel injection velocity improved injection element performance through improved propellant mixing.

  18. Effect of toss and weather on County Cricket Championship outcomes.

    PubMed

    Forrest, David; Dorsey, Ron

    2008-01-01

    The principal competition in English professional cricket has become more competitive with the introduction of hierarchical divisions linked by promotion and relegation. Using regression analysis, we examine the effect on league points when teams suffer different degrees of weather disruption over the season and different amounts of luck in winning the toss for choice of first innings. The results are used to illustrate the sensitivity of championship, promotion, and relegation outcomes to such matters of chance and revised league tables are produced after applying adjustments to account for the influence of weather and toss. Policy recommendations are presented on how the influence of weather and toss might be lessened in future seasons.

  19. Longitudinal flying qualities criteria for single-pilot instrument flight operations

    NASA Technical Reports Server (NTRS)

    Stengel, R. F.; Bar-Gill, A.

    1983-01-01

    Modern estimation and control theory, flight testing, and statistical analysis were used to deduce flying qualities criteria for General Aviation Single Pilot Instrument Flight Rule (SPIFR) operations. The principal concern is that unsatisfactory aircraft dynamic response combined with high navigation/communication workload can produce problems of safety and efficiency. To alleviate these problems, the relative importance of these factors must be determined. This objective was achieved by flying SPIFR tasks with different aircraft dynamic configurations and assessing the effects of such variations under these conditions. The experimental results yielded quantitative indicators of pilot performance and workload, and for each of them, multivariate regression was applied to evaluate several candidate flying qualities criteria.

  20. [Application of SAS macro to evaluated multiplicative and additive interaction in logistic and Cox regression in clinical practices].

    PubMed

    Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q

    2016-05-01

    Conditional logistic regression analysis and unconditional logistic regression analysis are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises multiplicative interaction and additive interaction. The former is only of statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, attributable proportion due to interaction, and synergy index alongside the logistic and Cox regression interaction terms, and Wald, delta, and profile-likelihood confidence intervals were used to evaluate additive interaction, for reference in big-data analysis in clinical epidemiology and in analyses of genetic multiplicative and additive interactions.

  1. Prediction by regression and intrarange data scatter in surface-process studies

    USGS Publications Warehouse

    Toy, T.J.; Osterkamp, W.R.; Renard, K.G.

    1993-01-01

    Modeling is a major component of contemporary earth science, and regression analysis occupies a central position in the parameterization, calibration, and validation of geomorphic and hydrologic models. Although this methodology can be used in many ways, we are primarily concerned with the prediction of values for one variable from another variable. Examination of the literature reveals considerable inconsistency in the presentation of the results of regression analysis and the occurrence of patterns in the scatter of data points about the regression line. Both circumstances confound utilization and evaluation of the models. Statisticians are well aware of various problems associated with the use of regression analysis and offer improved practices; often, however, their guidelines are not followed. After a review of the aforementioned circumstances and until standard criteria for model evaluation become established, we recommend, as a minimum, inclusion of scatter diagrams, the standard error of the estimate, and sample size in reporting the results of regression analyses for most surface-process studies. © 1993 Springer-Verlag.

  2. Quantile regression for the statistical analysis of immunological data with many non-detects.

    PubMed

    Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth

    2012-07-07

    Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an application to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.

  3. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600 nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation of diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with the skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
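
    A minimal sketch of the regression step described above: the absorbance spectrum regressed on chromophore extinction spectra, with the regression coefficients tracking the chromophore contributions. The extinction values below are made up for illustration; real values would come from published tables:

    ```python
    # Least-squares unmixing of an absorbance spectrum into chromophore contributions.
    import numpy as np

    wavelengths = np.array([500, 520, 540, 560, 580, 600])        # nm, as in the study
    eps_melanin = np.array([8.0, 7.0, 6.1, 5.3, 4.6, 4.0])        # illustrative values
    eps_hbo2    = np.array([5.0, 6.5, 8.2, 6.0, 7.8, 1.0])        # illustrative values
    eps_hb      = np.array([5.5, 6.0, 7.0, 7.5, 6.2, 1.5])        # illustrative values

    true_conc = np.array([0.6, 0.3, 0.1])          # melanin, HbO2, Hb (arbitrary units)
    E = np.column_stack([eps_melanin, eps_hbo2, eps_hb])
    absorbance = E @ true_conc + np.random.default_rng(7).normal(0, 0.02, 6)

    coef, *_ = np.linalg.lstsq(E, absorbance, rcond=None)   # multiple regression
    print(np.round(coef, 2))   # approximately recovers the assumed concentrations
    ```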

  4. Socio-economic status, racial composition and the affordability of fresh fruits and vegetables in neighborhoods of a large rural region in Texas

    PubMed Central

    2011-01-01

    Background: Little is known about how affordability of healthy food varies with community characteristics in rural settings. We examined how the cost of fresh fruit and vegetables varies with the economic and demographic characteristics in six rural counties of Texas. Methods: Ground-truthed data from the Brazos Valley Food Environment Project were used to identify all food stores in the rural region and the availability and lowest price of fresh whole fruit and vegetables in the food stores. Socioeconomic characteristics were extracted from the 2000 U.S. Census Summary Files 3 at the level of the census block group. We used an imputation strategy to calculate two types of price indices for both fresh fruit and fresh vegetables: a high variety and a basic index; and evaluated the relationship between neighborhood economic and demographic characteristics and affordability of fresh produce, using linear regression models. Results: The mean cost of meeting the USDA recommendation of fruit consumption from a high variety basket of fruit types in our sample of stores was just over $27.50 per week. Relying on the three most common fruits lowered the weekly expense to under $17.25 per week, a reduction of 37.6%. The effect of moving from a high variety to a low variety basket was much less when considering vegetable consumption: a 4.3% decline from $29.23 to $27.97 per week. Univariate regression analysis revealed that the cost of fresh produce is not associated with the racial/ethnic composition of the local community. However, multivariate regression showed that holding median income constant, stores in neighborhoods with higher percentages of Black residents paid more for fresh fruits and vegetables. The proportion of Hispanic residents was not associated with cost in either the univariate or multivariate analysis. Conclusion: This study extends prior work by examining the affordability of fresh fruit and vegetables from food stores in a large rural area; and how access to an affordable supply of fresh fruit and vegetables differs by neighborhood inequalities. The approach and findings of this study are relevant and have important research and policy implications for understanding access and availability of affordable, healthy foods. PMID:21244688

  5. Socio-economic status, racial composition and the affordability of fresh fruits and vegetables in neighborhoods of a large rural region in Texas.

    PubMed

    Dunn, Richard A; Sharkey, Joseph R; Lotade-Manje, Justus; Bouhlal, Yasser; Nayga, Rodolfo M

    2011-01-18

    Little is known about how affordability of healthy food varies with community characteristics in rural settings. We examined how the cost of fresh fruit and vegetables varies with the economic and demographic characteristics in six rural counties of Texas. Ground-truthed data from the Brazos Valley Food Environment Project were used to identify all food stores in the rural region and the availability and lowest price of fresh whole fruit and vegetables in the food stores. Socioeconomic characteristics were extracted from the 2000 U.S. Census Summary Files 3 at the level of the census block group. We used an imputation strategy to calculate two types of price indices for both fresh fruit and fresh vegetables: a high variety and a basic index; and evaluated the relationship between neighborhood economic and demographic characteristics and affordability of fresh produce, using linear regression models. The mean cost of meeting the USDA recommendation of fruit consumption from a high variety basket of fruit types in our sample of stores was just over $27.50 per week. Relying on the three most common fruits lowered the weekly expense to under $17.25 per week, a reduction of 37.6%. The effect of moving from a high variety to a low variety basket was much less when considering vegetable consumption: a 4.3% decline from $29.23 to $27.97 per week. Univariate regression analysis revealed that the cost of fresh produce is not associated with the racial/ethnic composition of the local community. However, multivariate regression showed that holding median income constant, stores in neighborhoods with higher percentages of Black residents paid more for fresh fruits and vegetables. The proportion of Hispanic residents was not associated with cost in either the univariate or multivariate analysis. This study extends prior work by examining the affordability of fresh fruit and vegetables from food stores in a large rural area; and how access to an affordable supply of fresh fruit and vegetables differs by neighborhood inequalities. The approach and findings of this study are relevant and have important research and policy implications for understanding access and availability of affordable, healthy foods.

  6. Reducing Bias and Increasing Precision by Adding Either a Pretest Measure of the Study Outcome or a Nonequivalent Comparison Group to the Basic Regression Discontinuity Design: An Example from Education

    ERIC Educational Resources Information Center

    Tang, Yang; Cook, Thomas D.; Kisbu-Sakarya, Yasemin

    2015-01-01

    Regression discontinuity design (RD) has been widely used to produce reliable causal estimates. Researchers have validated the accuracy of RD design using within-study comparisons (Cook, Shadish & Wong, 2008; Cook & Steiner, 2010; Shadish et al., 2011). A within-study comparison examines the validity of a quasi-experiment by comparing its…

  7. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    EPA Pesticide Factsheets

    Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.

  8. Correlates of motivation to change in pathological gamblers completing cognitive-behavioral group therapy.

    PubMed

    Gómez-Peña, Mónica; Penelo, Eva; Granero, Roser; Fernández-Aranda, Fernando; Alvarez-Moya, Eva; Santamaría, Juan José; Moragas, Laura; Neus Aymamí, Maria; Gunnard, Katarina; Menchón, José M; Jimenez-Murcia, Susana

    2012-07-01

    The present study analyzes the association between the motivation to change and the cognitive-behavioral group intervention, in terms of dropouts and relapses, in a sample of male pathological gamblers. The specific objectives were as follows: (a) to estimate the predictive value of baseline University of Rhode Island Change Assessment scale (URICA) scores (i.e., at the start of the study) as regards the risk of relapse and dropout during treatment and (b) to assess the incremental predictive ability of URICA scores, as regards the mean change produced in the clinical status of patients between the start and finish of treatment. The relationship between the URICA and the response to treatment was analyzed by means of a pre-post design applied to a sample of 191 patients who were consecutively receiving cognitive-behavioral group therapy. The statistical analysis included logistic regression models and hierarchical multiple linear regression models. The discriminative ability of the models including the four URICA scores regarding the likelihood of relapse and dropout was acceptable (area under the receiver operating characteristic curve: .73 and .71, respectively). No significant predictive ability was found as regards the differences between baseline and posttreatment scores (changes in R(2) below 5% in the multiple regression models). The availability of useful measures of motivation to change would enable treatment outcomes to be optimized through the application of specific therapeutic interventions. © 2012 Wiley Periodicals, Inc.
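
    A minimal sketch, on synthetic data, of the kind of model reported above: logistic regression of relapse on four baseline URICA subscale scores, summarized by the area under the ROC curve. The coefficients and the resulting AUC are illustrative, not the study's:

    ```python
    # Logistic regression of relapse on baseline scores, scored by AUROC.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(6)
    n = 191
    urica = rng.normal(0, 1, (n, 4))                   # four URICA subscale scores
    logit = -0.8 + urica @ np.array([0.5, -0.3, 0.2, -0.4])   # assumed true effects
    relapse = rng.uniform(size=n) < 1/(1 + np.exp(-logit))

    clf = LogisticRegression().fit(urica, relapse)
    auc = roc_auc_score(relapse, clf.predict_proba(urica)[:, 1])
    print("in-sample AUROC: %.2f" % auc)   # discriminative ability of the model
    ```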

  9. Advances in simultaneous atmospheric profile and cloud parameter regression based retrieval from high-spectral resolution radiance measurements

    NASA Astrophysics Data System (ADS)

    Weisz, Elisabeth; Smith, William L.; Smith, Nadia

    2013-06-01

    The dual-regression (DR) method retrieves information about the Earth surface and vertical atmospheric conditions from measurements made by any high-spectral resolution infrared sounder in space. The retrieved information includes temperature and atmospheric gases (such as water vapor, ozone, and carbon species) as well as surface and cloud top parameters. The algorithm was designed to produce a high-quality product with low latency and has been demonstrated to yield accurate results in real-time environments. The speed of the retrieval is achieved through linear regression, while accuracy is achieved through a series of classification schemes and decision-making steps. These steps are necessary to account for the nonlinearity of hyperspectral retrievals. In this work, we detail the key steps that have been developed in the DR method to advance accuracy in the retrieval of nonlinear parameters, specifically cloud top pressure. The steps and their impact on retrieval results are discussed in-depth and illustrated through relevant case studies. In addition to discussing and demonstrating advances made in addressing nonlinearity in a linear geophysical retrieval method, advances toward multi-instrument geophysical analysis by applying the DR to three different operational sounders in polar orbit are also noted. For any area on the globe, the DR method achieves consistent accuracy and precision, making it potentially very valuable to both the meteorological and environmental user communities.

  10. Seasonal mean pressure reconstruction for the North Atlantic (1750 1850) based on early marine data

    NASA Astrophysics Data System (ADS)

    Gallego, D.; Garcia-Herrera, R.; Ribera, P.; Jones, P. D.

    2005-12-01

    Measurements of wind strength and direction abstracted from European ships' logbooks during the recently finished CLIWOC project have been used to produce the first gridded Sea Level Pressure (SLP) reconstruction for the 1750-1850 period over the North Atlantic based solely on marine data. The reconstruction is based on a spatial regression analysis calibrated by using data taken from the ICOADS database. An objective methodology has been developed to select the optimal calibration period and spatial domain of the reconstruction by testing several thousands of possible models. The finally selected area, limited by the performance of the regression equations and by the availability of data, covers the region between 28° N and 52° N close to the European coast and between 28° N and 44° N in the open Ocean. The results provide a direct measure of the strength and extension of the Azores High during the 101 years of the study period. The comparison with the recent land-based SLP reconstruction by Luterbacher et al. (2002) indicates the presence of a common signal. The interannual variability of the CLIWOC reconstructions is rather high due to the current scarcity of abstracted wind data in the areas with best response in the regression. Guidelines are proposed to optimize the efficiency of future abstraction work.

  11. Seasonal mean pressure reconstruction for the North Atlantic (1750 1850) based on early marine data

    NASA Astrophysics Data System (ADS)

    Gallego, D.; Garcia-Herrera, R.; Ribera, P.; Jones, P. D.

    2005-08-01

    Measures of wind strength and direction abstracted from European ships' logbooks during the recently finished CLIWOC project have been used to produce the first gridded Sea Level Pressure (SLP) reconstruction for the 1750-1850 period over the North Atlantic based solely on marine data. The reconstruction is based on a spatial regression analysis calibrated by using data taken from the ICOADS database. An objective methodology has been developed to select the optimal calibration period and spatial domain of the reconstruction by testing several thousands of possible models. The finally selected area, limited by the performance of the regression equations and by the availability of data, covers the region between 28°N and 52°N close to the European coast and between 28°N and 44°N in the open Ocean. The results provide a direct measure of the strength and extension of the Azores High during the 101 years of the study period. The comparison with the recent land-based SLP reconstruction by Luterbacher et al. (2002) indicates the presence of a common signal. The interannual variability of the CLIWOC reconstructions is rather high due to the current scarcity of abstracted wind data in the areas with best response in the regression. Guidelines are proposed to optimize the efficiency of future abstraction work.

  12. Analysis of the intra- and extracellular formation of platinum nanoparticles by Fusarium oxysporum f. sp. lycopersici using response surface methodology

    NASA Astrophysics Data System (ADS)

    Riddin, T. L.; Gericke, M.; Whiteley, C. G.

    2006-07-01

    Fusarium oxysporum fungal strain was screened and found to be successful for the intra- and extracellular production of platinum nanoparticles. Nanoparticle formation was visually observed, over time, by the colour of the extracellular solution and/or the fungal biomass turning from yellow to dark brown, and their concentration was determined from the amount of residual hexachloroplatinic acid measured from a standard curve at 456 nm. The extracellular nanoparticles were characterized by transmission electron microscopy. Nanoparticles of varying size (10-100 nm) and shape (hexagons, pentagons, circles, squares, rectangles) were produced at both the extracellular and intracellular levels by the Fusarium oxysporum. The particles precipitate out of solution and bioaccumulate by nucleation either intracellularly, on the cell wall/membrane, or extracellularly in the surrounding medium. The importance of pH, temperature and hexachloroplatinic acid (H2PtCl6) concentration in nanoparticle formation was examined through the use of a statistical response surface methodology. Only the extracellular production of nanoparticles proved to be statistically significant, with a concentration yield of 4.85 mg l-1 estimated by a first-order regression model. With a second-order polynomial regression, the predicted yield of nanoparticles increased to 5.66 mg l-1 and, after backward-step regression, a final model with a yield of 6.59 mg l-1 was obtained.
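
    A minimal sketch, on synthetic data, of the response-surface step: a second-order polynomial in pH, temperature, and H2PtCl6 concentration fitted by least squares, from which a predicted optimum yield is read off. The factor ranges and the shape of the response are assumptions:

    ```python
    # Second-order (quadratic) response-surface fit over three factors.
    import numpy as np
    from itertools import combinations_with_replacement

    rng = np.random.default_rng(8)
    n = 60
    pH, temp, conc = rng.uniform(3, 9, n), rng.uniform(20, 50, n), rng.uniform(0.5, 3, n)
    X = np.column_stack([pH, temp, conc])
    # Synthetic yield (mg/l) with an interior optimum plus noise.
    yield_ = 6 - 0.3*(pH - 6)**2 - 0.01*(temp - 35)**2 - 0.8*(conc - 1.5)**2 \
             + rng.normal(0, 0.2, n)

    def quad_design(X):
        cols = [np.ones(len(X))] + [X[:, j] for j in range(X.shape[1])]
        cols += [X[:, i]*X[:, j]
                 for i, j in combinations_with_replacement(range(X.shape[1]), 2)]
        return np.column_stack(cols)

    beta, *_ = np.linalg.lstsq(quad_design(X), yield_, rcond=None)
    pred = quad_design(X) @ beta
    print("max predicted yield: %.2f mg/l" % pred.max())
    ```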

  13. Titanium Ions Release from an Innovative Titanium-Magnesium Composite: an in Vitro Study.

    PubMed

    Stanec, Zlatko; Halambek, Jasna; Maldini, Krešimir; Balog, Martin; Križik, Peter; Schauperl, Zdravko; Ćatić, Amir

    2016-03-01

    The innovative titanium-magnesium composite (Ti-Mg) was produced by the powder metallurgy (P/M) method and is characterized in terms of corrosion behavior. Two groups of experimental material, with 1 mass% (Ti-1Mg) and 2 mass% (Ti-2Mg) of magnesium in the titanium matrix, were tested and compared to commercially pure titanium (CP Ti). Immersion testing and chemical analysis of four solutions (artificial saliva; artificial saliva at pH 4; artificial saliva with fluoride; and Hank's balanced salt solution) were performed after 42 days of immersion, using inductively coupled plasma mass spectrometry (ICP-MS) to detect the amount of released titanium ions (Ti). SEM and EDS analysis were used for surface characterization. The difference between the results from different test solutions was assessed by ANOVA and the Newman-Keuls test at p<0.05. The influence of predictor variables was assessed by multiple regression analysis. The results of the present study revealed a low corrosion rate of titanium in the experimental Ti-Mg group: up to 46 and 23 times lower dissolution of Ti from Ti-1Mg and Ti-2Mg, respectively, was observed compared to the control group. Among the tested solutions, artificial saliva with fluorides exhibited the highest corrosion effect on all specimens tested. SEM micrographs showed a preserved dual-phase surface structure and EDS analysis suggested a favorable surface bioactivity. In conclusion, Ti-Mg produced by P/M is suggested as a material with better corrosion properties than CP Ti.

  14. Developing lignin-based bio-nanofibers by centrifugal spinning technique.

    PubMed

    Stojanovska, Elena; Kurtulus, Mustafa; Abdelgawad, Abdelrahman; Candan, Zeki; Kilic, Ali

    2018-07-01

    Lignin-based nanofibers were produced via centrifugal spinning from lignin-thermoplastic polyurethane polymer blends. The most suitable process parameters were chosen by optimization of the rotational speed, nozzle diameter and spinneret-to-collector distance using different blend ratios of the two polymers at different total polymer concentrations. The basic characteristics of the polymer solutions were captured by their viscosity and surface tension. The morphology of the fibers produced was characterized by SEM, and their thermal properties by DSC and TG analysis. Multiple regression was used to determine which parameters have the higher impact on fiber diameter. It was possible to obtain thermally stable lignin/polyurethane nanofibers with diameters below 500 nm. In terms of spinnability, a 1:1 lignin/TPU ratio was shown to be more feasible. On the other hand, the most suitable processing parameters were found to be an angular velocity of 8500 rpm for nozzles of 0.5 mm diameter and a working distance of 30 cm. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. Predictive analysis and data mining among the employment of fresh graduate students in HEI

    NASA Astrophysics Data System (ADS)

    Rahman, Nor Azziaty Abdul; Tan, Kian Lam; Lim, Chen Kim

    2017-10-01

    Management of higher education faces the problem of producing graduates who can fully meet the needs of industry, while industry faces the problem of finding skilled graduates who suit its needs, partly owing to the lack of effective methods for assessing problem-solving skills. The purpose of this paper is to propose a suitable classification model for predicting and assessing attributes in the student dataset against the selection criteria that industry demands of graduates in the academic field. The following machine learning algorithms were used in this research: K-Nearest Neighbor, Naïve Bayes, Decision Tree, Neural Network, Logistic Regression and Support Vector Machine. The proposed model will help the university management to make better long-term plans for producing graduates who are skilled, knowledgeable and fulfill industry needs.

  16. Clinical evaluation of a novel population-based regression analysis for detecting glaucomatous visual field progression.

    PubMed

    Kovalska, M P; Bürki, E; Schoetzau, A; Orguel, S F; Orguel, S; Grieshaber, M C

    2011-04-01

    The distinction of real progression from test variability in visual field (VF) series may be based on clinical judgment, on trend analysis based on follow-up of test parameters over time, or on identification of a significant change related to the mean of baseline exams (event analysis). The aim of this study was to compare a new population-based method (Octopus field analysis, OFA) with classic regression analyses and clinical judgment for detecting glaucomatous VF changes. 240 VF series of 240 patients with at least 9 consecutive examinations available were included into this study. They were independently classified by two experienced investigators. The results of such a classification served as a reference for comparison for the following statistical tests: (a) t-test global, (b) r-test global, (c) regression analysis of 10 VF clusters and (d) point-wise linear regression analysis. 32.5 % of the VF series were classified as progressive by the investigators. The sensitivity and specificity were 89.7 % and 92.0 % for r-test, and 73.1 % and 93.8 % for the t-test, respectively. In the point-wise linear regression analysis, the specificity was comparable (89.5 % versus 92 %), but the sensitivity was clearly lower than in the r-test (22.4 % versus 89.7 %) at a significance level of p = 0.01. A regression analysis for the 10 VF clusters showed a markedly higher sensitivity for the r-test (37.7 %) than the t-test (14.1 %) at a similar specificity (88.3 % versus 93.8 %) for a significant trend (p = 0.005). In regard to the cluster distribution, the paracentral clusters and the superior nasal hemifield progressed most frequently. The population-based regression analysis seems to be superior to the trend analysis in detecting VF progression in glaucoma, and may eliminate the drawbacks of the event analysis. Further, it may assist the clinician in the evaluation of VF series and may allow better visualization of the correlation between function and structure owing to VF clusters. © Georg Thieme Verlag KG Stuttgart · New York.

  17. Using "Excel" for White's Test--An Important Technique for Evaluating the Equality of Variance Assumption and Model Specification in a Regression Analysis

    ERIC Educational Resources Information Center

    Berenson, Mark L.

    2013-01-01

    There is consensus in the statistical literature that severe departures from its assumptions invalidate the use of regression modeling for purposes of inference. The assumptions of regression modeling are usually evaluated subjectively through visual, graphic displays in a residual analysis but such an approach, taken alone, may be insufficient…

  18. Rex fortran 4 system for combinatorial screening or conventional analysis of multivariate regressions

    Treesearch

    L.R. Grosenbaugh

    1967-01-01

    Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...

  19. Police witness identification images: a geometric morphometric analysis.

    PubMed

    Hayes, Susan; Tullberg, Cameron

    2012-11-01

    Research into witness identification images typically occurs within the laboratory and involves subjective likeness and recognizability judgments. This study analyzed whether actual witness identification images systematically alter the facial shapes of the suspects described. The shape analysis tool, geometric morphometrics, was applied to 46 homologous facial landmarks displayed on 50 witness identification images and their corresponding arrest photographs, using principal component analysis and multivariate regressions. The results indicate that compared with arrest photographs, witness identification images systematically depict suspects with lowered and medially located eyebrows (p < 0.000001). This was found to occur independently of the Police Artist, and did not occur with composites produced under laboratory conditions. There are several possible explanations for this finding, including any, or all, of the following: the suspect was frowning at the time of the incident, the witness had negative feelings toward the suspect, this is an effect of unfamiliar face processing, or the suspect displayed fear at the time of their arrest photograph. © 2012 American Academy of Forensic Sciences.

  20. Exsanguinated blood volume estimation using fractal analysis of digital images.

    PubMed

    Sant, Sonia P; Fairgrieve, Scott I

    2012-05-01

    The estimation of bloodstain volume using fractal analysis of digital images of passive blood stains is presented. Binary digital photos of bloodstains of known volumes (ranging from 1 to 7 mL), dispersed in a defined area, were subjected to image analysis using FracLac V. 2.0 for ImageJ. The box-counting method was used to generate a fractal dimension for each trial. A positive correlation between the generated fractal number and the volume of blood was found (R(2) = 0.99). Regression equations were produced to estimate the volume of blood in blind trials. An error rate ranging from 78% for 1 mL to 7% for 6 mL demonstrated that as the volume increases so does the accuracy of the volume estimation. This preliminary study demonstrated that bloodstain patterns can be deconstructed into mathematical parameters, removing the subjective element inherent in other methods of volume estimation. © 2012 American Academy of Forensic Sciences.
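
    A minimal sketch of the box-counting estimate of fractal dimension on a binary stain image, the quantity that was regressed against blood volume above. FracLac was used in the study; this NumPy version only illustrates the principle, on a random stand-in image:

    ```python
    # Box-counting fractal dimension of a binary (thresholded) image.
    import numpy as np

    def box_count_dimension(img, sizes=(2, 4, 8, 16, 32)):
        """img: 2-D boolean array, True where the stain is."""
        counts = []
        for s in sizes:
            h, w = (img.shape[0]//s)*s, (img.shape[1]//s)*s
            blocks = img[:h, :w].reshape(h//s, s, w//s, s)
            counts.append(blocks.any(axis=(1, 3)).sum())   # boxes touching the stain
        # Slope of log(count) versus log(1/size) estimates the fractal dimension.
        slope, _ = np.polyfit(np.log(1/np.asarray(sizes, float)), np.log(counts), 1)
        return slope

    rng = np.random.default_rng(9)
    demo = rng.random((256, 256)) < 0.02        # stand-in for a thresholded stain image
    print("fractal dimension ~ %.2f" % box_count_dimension(demo))
    ```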

  1. Statistical-learning strategies generate only modestly performing predictive models for urinary symptoms following external beam radiotherapy of the prostate: A comparison of conventional and machine-learning methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yahya, Noorazrul, E-mail: noorazrul.yahya@research.uwa.edu.au; Ebert, Martin A.; Bulsara, Max

    Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication-intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2 and longitudinal) with event rate between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized in endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance using sample size to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1 with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC>0.6 while all haematuria endpoints and longitudinal incontinence models produced AUROC<0.6. Conclusions: Logistic regression and MARS were most likely to be the best-performing strategy for the prediction of urinary symptoms with elastic-net and random forest producing competitive results. The predictive power of the models was modest and endpoint-dependent. New features, including spatial dose maps, may be necessary to achieve better models.

  2. Predicting recreational water quality advisories: A comparison of statistical methods

    USGS Publications Warehouse

    Brooks, Wesley R.; Corsi, Steven R.; Fienen, Michael N.; Carvin, Rebecca B.

    2016-01-01

    Epidemiological studies indicate that fecal indicator bacteria (FIB) in beach water are associated with illnesses among people having contact with the water. In order to mitigate public health impacts, many beaches are posted with an advisory when the concentration of FIB exceeds a beach action value. The most commonly used method of measuring FIB concentration takes 18–24 h before returning a result. In order to avoid the 24 h lag, it has become common to "nowcast" the FIB concentration using statistical regressions on environmental surrogate variables. Most commonly, nowcast models are estimated using ordinary least squares regression, but other regression methods from the statistical and machine learning literature are sometimes used. This study compares 14 regression methods across 7 Wisconsin beaches to identify which consistently produces the most accurate predictions. A random forest model is identified as the most accurate, followed by multiple regression fit using the adaptive LASSO.

  3. Application of dielectric spectroscopy for monitoring high cell density in monoclonal antibody producing CHO cell cultivations.

    PubMed

    Párta, László; Zalai, Dénes; Borbély, Sándor; Putics, Akos

    2014-02-01

    The application of dielectric spectroscopy has frequently been investigated as an on-line cell culture monitoring tool; however, it still requires supportive data and experience in order to become a robust technique. In this study, dielectric spectroscopy was used to predict viable cell density (VCD) at industrially relevant high levels in concentrated fed-batch culture of Chinese hamster ovary cells producing a monoclonal antibody for pharmaceutical purposes. For on-line dielectric spectroscopy measurements, capacitance was scanned within a wide range of frequency values (100-19,490 kHz) in six parallel cell cultivation batches. Prior to detailed mathematical analysis of the collected data, principal component analysis (PCA) was applied to compare the dielectric behavior of the cultivations; PCA revealed measurement disturbances. Using the measured spectroscopic data, partial least squares (PLS) regression, Cole-Cole modeling, and linear modeling were applied and compared in order to predict VCD. The Cole-Cole and PLS models provided reliable prediction over the entire cultivation, including both the early and decline phases of cell growth, while the linear model failed to estimate VCD in the later, declining cultivation phase. With regard to sensitivity to measurement error, marked differences were found among the PLS, Cole-Cole, and linear models. VCD prediction accuracy could be improved in the runs with measurement disturbances by first-derivative pre-treatment in PLS and by parameter optimization of the Cole-Cole modeling.
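
    A sketch of the PLS branch of this comparison, including the first-derivative pre-treatment mentioned above; the "spectra" are random placeholders for the capacitance scans, and the component count is arbitrary:

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_predict

    # Placeholder capacitance spectra: rows are sampling times, columns the
    # scanned frequencies (the study covered 100-19,490 kHz); y is the VCD.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 50))                      # stand-in spectra
    y = X[:, :5].sum(axis=1) + rng.normal(scale=0.1, size=120)

    # First-derivative pre-treatment along the frequency axis.
    X_d1 = np.diff(X, axis=1)

    pls = PLSRegression(n_components=5)
    y_hat = cross_val_predict(pls, X_d1, y, cv=10).ravel()
    ss_res = ((y - y_hat) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    print("cross-validated R^2:", 1 - ss_res / ss_tot)
    ```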

  4. Haererehalobacter sp. JS1, a bioemulsifier producing halophilic bacterium isolated from Indian solar salt works.

    PubMed

    Birdilla Selva Donio, Mariathason; Chelladurai Karthikeyan, Subbiahanadar; Michaelbabu, Mariavincent; Uma, Ganapathi; Raja Jeya Sekar, Ramaiyan; Citarasu, Thavasimuthu

    2018-05-18

    Bioemulsifier (BE)-producing Haererehalobacter sp. JS1 was isolated and identified from solar salt works in India. The BE was extracted, purified, and characterized by Gas Chromatography-Mass Spectrometry (GC-MS) analysis. Emulsification activity was assayed against different oils, and dye-degradation potential against different dyes. BE production was optimized using different carbon sources (C), nitrogen sources (N), pH, and NaCl concentrations. BE screening methods revealed that Haererehalobacter sp. JS1 was strongly positive for BE production. Identification by 16S rRNA sequencing and analysis showed that Haererehalobacter sp. JS1 is closely related to Salinicoccus halophilus and Haererehalobacter sp. Structural characterization confirmed that the partially purified bioemulsifier is of the siloxane type. Emulsification activity (E24) revealed that the bioemulsifier significantly (p ≤ 0.001) emulsified commercial oils including coconut, gingelly, olive, and palmolein oils. Haererehalobacter sp. JS1 also significantly (p ≤ 0.001) degraded dyes such as orange MR, direct violet, cotton red, reactive yellow, nitro green, and azo dye. RSM regression coefficients and contour plot analysis clearly indicated that the combination of pH and NaCl increased BE production. The siloxane-type BE obtained from Haererehalobacter sp. JS1 was able to emulsify different oils and commercial dyes. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.

  6. Applications of statistics to medical science, III. Correlation and regression.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.

  7. Optimizing methods for linking cinematic features to fMRI data.

    PubMed

    Kauttonen, Janne; Hlushchuk, Yevhen; Tikka, Pia

    2015-04-15

    One of the challenges of naturalistic neurosciences using movie-viewing experiments is how to interpret observed brain activations in relation to the multiplicity of time-locked stimulus features. As previous studies have shown less inter-subject synchronization across viewers of random video footage than across viewers of story-driven films, new methods need to be developed for the analysis of less story-driven contents. To optimize the linkage between our fMRI data, collected during viewing of a deliberately non-narrative silent film, 'At Land' by Maya Deren (1944), and its annotated content, we combined the method of elastic-net regularization with model-driven linear regression and the well-established data-driven independent component analysis (ICA) and inter-subject correlation (ISC) methods. In the linear regression analysis, both IC and region-of-interest (ROI) time-series were fitted with time-series of a total of 36 binary-valued and one real-valued tactile annotation of film features. Elastic-net regularization and cross-validation were applied in the ordinary least-squares linear regression in order to avoid over-fitting due to the multicollinearity of the regressors; the results were compared against both partial least-squares (PLS) regression and un-regularized full-model regression. A non-parametric permutation testing scheme was applied to evaluate the statistical significance of the regression. We found statistically significant correlation between the annotation model and 9 of 40 ICs. The regression analysis was also repeated for a large set of cubic ROIs covering the grey matter. Both IC- and ROI-based regression analyses revealed activations in parietal and occipital regions, with additional smaller clusters in the frontal lobe. Furthermore, we found elastic-net-based regression more sensitive than PLS and un-regularized regression, since it detected a larger number of significant ICs and ROIs. Along with the ISC ranking methods, our regression analysis proved to be a feasible method for ordering the ICs based on their functional relevance to the annotated cinematic features. The novelty of our method lies in applying a data-driven approach to all content features simultaneously, in contrast to hypothesis-driven manual pre-selection and observation of individual regressors, which is biased by choice. We found the combination of regularized regression and ICA especially useful when analyzing fMRI data obtained using a non-narrative movie stimulus with a large set of complex and correlated features. Copyright © 2015. Published by Elsevier Inc.
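
    A compact illustration of the elastic-net step described above, with synthetic stand-ins for the annotation regressors and an IC/ROI time-series; the deliberately collinear columns mimic the multicollinearity the regularization is meant to handle:

    ```python
    import numpy as np
    from sklearn.linear_model import ElasticNetCV

    rng = np.random.default_rng(1)
    Z = rng.normal(size=(400, 37))                    # 37 annotation regressors
    Z[:, 1] = Z[:, 0] + 0.05 * rng.normal(size=400)   # induce multicollinearity
    bold = 0.8 * Z[:, 0] - 0.5 * Z[:, 2] + rng.normal(size=400)

    # Elastic net mixes L1 and L2 penalties; cross-validation picks the
    # penalty strength, guarding against over-fitting.
    enet = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5).fit(Z, bold)
    print("selected alpha:", enet.alpha_)
    print("regressors with non-zero weights:", np.flatnonzero(enet.coef_))
    ```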

  8. An analysis of penetration and ricochet phenomena in oblique hypervelocity impact

    NASA Technical Reports Server (NTRS)

    Schonberg, William P.; Taylor, Roy A.; Horn, Jennifer R.

    1988-01-01

    An experimental investigation of phenomena associated with the oblique hypervelocity impact of spherical projectiles on multisheet aluminum structures is described. A model that can be employed in the design of meteoroid and space debris protection systems for space structures is developed. The model consists of equations that relate crater and perforation damage of a multisheet structure to parameters such as projectile size, impact velocity, and trajectory obliquity. The equations are obtained through a regression analysis of oblique hypervelocity impact test data. This data shows that the response of a multisheet structure to oblique impact is significantly different from its response to normal hypervelocity impact. It was found that obliquely incident projectiles produce ricochet debris that can severely damage panels or instrumentation located on the exterior of a space structure. Obliquity effects of high-speed impact must, therefore, be considered in the design of any structure exposed to the meteoroid and space debris environment.

  9. [Parenting styles and their relationship with hyperactivity].

    PubMed

    Raya Trenas, Antonio Félix; Herreruzo Cabrera, Javier; Pino Osuna, María José

    2008-11-01

    The present study aims to determine the relationship between the factors that make up parenting styles, according to the PCRI (Parent-Child Relationship Inventory), and hyperactivity reported by parents through the BASC (Behaviour Assessment System for Children). We selected a sample of 32 children between 3 and 14 years old (23 male and 9 female) with risk scores in hyperactivity and another similar group with low scores in hyperactivity. After administering both instruments to the parents, we carried out a binomial logistic regression analysis, which produced a model correctly classifying 84.4% of the sample, made up of the PCRI factors: fathers' involvement, communication and role orientation, mothers' parental support, and both parents' limit-setting and autonomy. Moreover, analysis of variance revealed significant differences in the support perceived by the fathers and mothers of the two groups. Lastly, the utility of the results for proposing family intervention strategies based on an authoritative style is discussed.

  10. Bayesian Group Bridge for Bi-level Variable Selection.

    PubMed

    Mallick, Himel; Yi, Nengjun

    2017-06-01

    A Bayesian bi-level variable selection method (BAGB: Bayesian Analysis of Group Bridge) is developed for regularized regression and classification. This new development is motivated by grouped data, where generic variables can be divided into multiple groups, with variables in the same group being mechanistically related or statistically correlated. As an alternative to frequentist group variable selection methods, BAGB incorporates structural information among predictors through a group-wise shrinkage prior. Posterior computation proceeds via an efficient MCMC algorithm. In addition to the usual ease-of-interpretation of hierarchical linear models, the Bayesian formulation produces valid standard errors, a feature that is notably absent in the frequentist framework. Empirical evidence of the attractiveness of the method is illustrated by extensive Monte Carlo simulations and real data analysis. Finally, several extensions of this new approach are presented, providing a unified framework for bi-level variable selection in general models with flexible penalties.

  11. Comprehensive Chemical Fingerprinting of High-Quality Cocoa at Early Stages of Processing: Effectiveness of Combined Untargeted and Targeted Approaches for Classification and Discrimination.

    PubMed

    Magagna, Federico; Guglielmetti, Alessandro; Liberto, Erica; Reichenbach, Stephen E; Allegrucci, Elena; Gobino, Guido; Bicchi, Carlo; Cordero, Chiara

    2017-08-02

    This study investigates chemical information of volatile fractions of high-quality cocoa (Theobroma cacao L. Malvaceae) from different origins (Mexico, Ecuador, Venezuela, Colombia, Java, Trinidad, and São Tomé) produced for fine chocolate. This study explores the evolution of the entire pattern of volatiles in relation to cocoa processing (raw, roasted, steamed, and ground beans). Advanced chemical fingerprinting (e.g., combined untargeted and targeted fingerprinting) with comprehensive two-dimensional gas chromatography coupled with mass spectrometry allows advanced pattern recognition for classification, discrimination, and sensory-quality characterization. The entire data set is analyzed for 595 reliable two-dimensional peak regions, including 130 known analytes and 13 potent odorants. Multivariate analysis with unsupervised exploration (principal component analysis) and simple supervised discrimination methods (Fisher ratios and linear regression trees) reveal informative patterns of similarities and differences and identify characteristic compounds related to sample origin and manufacturing step.

  12. Analysis of albumin Raman scattering in visible and near-infrared ranges

    NASA Astrophysics Data System (ADS)

    Lykina, Anastasia A.; Artemyev, Dmitry N.

    2018-04-01

    In this work, the shape and intensity of albumin Raman signals in the visible and near-IR ranges were analyzed. With visible-range lasers, the experimental setup primarily excites fluorescence of the albumin solution, the main contribution to which comes from sodium chloride, a component of the tested sample. Lasers in the near-infrared range, by contrast, excited the albumin Raman signal most effectively. The highest ratio of Raman scattering to autofluorescence intensity in the detected signal was obtained using a laser with a wavelength of 1064 nm. To determine the albumin concentration from the spectra, a regression approach using the projection to latent structures (PLS) method was applied. The lowest prediction error for albumin concentration, 2-3 g/l, was obtained using the near-infrared lasers.

  13. Analysis of relativistic nucleus-nucleus interactions in emulsion chambers

    NASA Technical Reports Server (NTRS)

    Mcguire, Stephen C.

    1987-01-01

    The development of a computer-assisted method is reported for the determination of the angular distribution data for secondary particles produced in relativistic nucleus-nucleus collisions in emulsions. The method is applied to emulsion detectors that were placed in a constant, uniform magnetic field and exposed to beams of 60 and 200 GeV/nucleon O-16 ions at the Super Proton Synchrotron (SPS) of the European Center for Nuclear Research (CERN). Linear regression analysis is used to determine the azimuthal and polar emission angles from measured track coordinate data. The software, written in BASIC, is designed to be machine independent, and adaptable to an automated system for acquiring the track coordinates. The fitting algorithm is deterministic, and takes into account the experimental uncertainty in the measured points. Further, a procedure for using the track data to estimate the linear momenta of the charged particles observed in the detectors is included.

  14. Trace element analysis of rough diamond by LA-ICP-MS: a case of source discrimination?

    PubMed

    Dalpé, Claude; Hudon, Pierre; Ballantyne, David J; Williams, Darrell; Marcotte, Denis

    2010-11-01

    Current profiling of rough diamond source is performed using different physical and/or morphological techniques that require strong knowledge and experience in the field. More recently, chemical impurities have been used to discriminate diamond source and with the advance of laser ablation-inductively coupled plasma-mass spectrometry (LA-ICP-MS) empirical profiling of rough diamonds is possible to some extent. In this study, we present a LA-ICP-MS methodology that we developed for analyzing ultra-trace element impurities in rough diamond for origin determination ("profiling"). Diamonds from two sources were analyzed by LA-ICP-MS and were statistically classified by accepted methods. For the two diamond populations analyzed in this study, binomial logistic regression produced a better overall correct classification than linear discriminant analysis. The results suggest that an anticipated matrix match reference material would improve the robustness of our methodology for forensic applications. © 2010 American Academy of Forensic Sciences.
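
    A sketch of the classification comparison reported here, binomial logistic regression versus linear discriminant analysis, on synthetic stand-ins for the trace-element profiles of the two diamond populations:

    ```python
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Hypothetical trace-element concentrations for diamonds from two sources.
    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(0.0, 1.0, size=(60, 8)),
                   rng.normal(0.7, 1.3, size=(60, 8))])
    y = np.repeat([0, 1], 60)

    for name, clf in [("binomial logistic regression", LogisticRegression(max_iter=1000)),
                      ("linear discriminant analysis", LinearDiscriminantAnalysis())]:
        acc = cross_val_score(clf, X, y, cv=10)
        print(f"{name}: overall correct classification = {acc.mean():.2%}")
    ```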

  15. Flood-Frequency Estimates for Streams on Kaua`i, O`ahu, Moloka`i, Maui, and Hawai`i, State of Hawai`i

    USGS Publications Warehouse

    Oki, Delwyn S.; Rosa, Sarah N.; Yeung, Chiu W.

    2010-01-01

    This study provides an updated analysis of the magnitude and frequency of peak stream discharges in Hawai`i. Annual peak-discharge data collected by the U.S. Geological Survey during and before water year 2008 (ending September 30, 2008) at stream-gaging stations were analyzed. The existing generalized-skew value for the State of Hawai`i was retained, although three methods were used to evaluate whether an update was needed. Regional regression equations were developed for peak discharges with 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals for unregulated streams (those for which peak discharges are not affected to a large extent by upstream reservoirs, dams, diversions, or other structures) in areas with less than 20 percent combined medium- and high-intensity development on Kaua`i, O`ahu, Moloka`i, Maui, and Hawai`i. The generalized-least-squares (GLS) regression equations relate peak stream discharge to quantified basin characteristics (for example, drainage-basin area and mean annual rainfall) that were determined using geographic information system (GIS) methods. Each of the islands of Kaua`i, O`ahu, Moloka`i, Maui, and Hawai`i was divided into two regions, generally corresponding to a wet region and a dry region. Unique peak-discharge regression equations were developed for each region. The regression equations developed for this study have standard errors of prediction ranging from 16 to 620 percent. Standard errors of prediction are greatest for regression equations developed for leeward Moloka`i and southern Hawai`i. In general, estimated 100-year peak discharges from this study are lower than those from previous studies, which may reflect the longer periods of record used in this study. Each regression equation is valid within the range of values of the explanatory variables used to develop the equation. The regression equations were developed using peak-discharge data from streams that are mainly unregulated, and they should not be used to estimate peak discharges in regulated streams. Use of a regression equation beyond its limits will produce peak-discharge estimates with unknown error and should therefore be avoided. Improved estimates of the magnitude and frequency of peak discharges in Hawai`i will require continued operation of existing stream-gaging stations and operation of additional gaging stations for areas such as Moloka`i and Hawai`i, where limited stream-gaging data are available.

  16. Epidemiological analysis of a cluster within the outbreak of Shiga toxin-producing Escherichia coli serotype O104:H4 in Northern Germany, 2011.

    PubMed

    Scharlach, Martina; Diercke, Michaela; Dreesman, Johannes; Jahn, Nicola; Krieck, Manuela; Beyrer, Konrad; Claußen, Katja; Pulz, Matthias; Floride, Regina

    2013-06-01

    In May 2011 one of the worldwide largest outbreaks of haemolytic uraemic syndrome (HUS) and bloody diarrhoea caused by Shiga toxin-producing Escherichia coli (STEC) serotype O104:H4 occurred in Germany. One of the most affected federal states was Lower Saxony. We present the investigation of a cluster of STEC and HUS cases within this outbreak by means of a retrospective cohort study. After a 70th birthday celebration that took place on 7 May 2011, seven confirmed cases and four probable cases were identified among the 72 attendees; two of them developed HUS. The median incubation period was 10 days. Only 35 persons (48.6%) definitely answered the question of whether they had eaten the sprouts used for garnishing the salad. Univariable analysis revealed different food items, depending on the case definition, with an Odds Ratio (OR) > 1 indicating an association with STEC infection, but multivariable logistic regression showed no increased risk of STEC infection for any food item under any case definition. Sprouts had to be assumed as the source of the infection based on the results of tracing back the delivery routes from the catering company to the sprouts producer, who was finally identified as the source of the entire German outbreak. In this large outbreak several case-control studies failed to identify the source of infection. Copyright © 2012 Elsevier GmbH. All rights reserved.

  17. Macrophage Migration Inhibitory Factor Induces Inflammation and Predicts Spinal Progression in Ankylosing Spondylitis.

    PubMed

    Ranganathan, Vidya; Ciccia, Francesco; Zeng, Fanxing; Sari, Ismail; Guggino, Guiliana; Muralitharan, Janogini; Gracey, Eric; Haroon, Nigil

    2017-09-01

    To investigate the role of macrophage migration inhibitory factor (MIF) in the pathogenesis of ankylosing spondylitis (AS). Patients who met the modified New York criteria for AS were recruited for the study. Healthy volunteers, rheumatoid arthritis patients, and osteoarthritis patients were included as controls. Based on the annual rate of increase in modified Stoke AS Spine Score (mSASSS), AS patients were classified as progressors or nonprogressors. MIF levels in serum and synovial fluid were quantitated by enzyme-linked immunosorbent assay. Predictors of AS progression were evaluated using logistic regression analysis. Immunohistochemical analysis of ileal tissue was performed to identify MIF-producing cells. Flow cytometry was used to identify MIF-producing subsets, expression patterns of the MIF receptor (CD74), and MIF-induced tumor necrosis factor (TNF) production in the peripheral blood. MIF-induced mineralization of osteoblast cells (SaOS-2) was analyzed by alizarin red S staining, and Western blotting was used to quantify active β-catenin levels. Baseline serum MIF levels were significantly elevated in AS patients compared to healthy controls and were found to independently predict AS progression. MIF levels were higher in the synovial fluid of AS patients, and MIF-producing macrophages and Paneth cells were enriched in their gut. MIF induced TNF production in monocytes, activated β-catenin in osteoblasts, and promoted the mineralization of osteoblasts. Our findings indicate an unexplored pathogenic role of MIF in AS and a link between inflammation and new bone formation. © 2017, American College of Rheumatology.

  18. Predicting Falls and When to Intervene in Older People: A Multilevel Logistical Regression Model and Cost Analysis

    PubMed Central

    Smith, Matthew I.; de Lusignan, Simon; Mullett, David; Correa, Ana; Tickner, Jermaine; Jones, Simon

    2016-01-01

    Introduction Falls are the leading cause of injury in older people. Reducing falls could reduce financial pressures on health services. We carried out this research to develop a falls risk model, using routine primary care and hospital data to identify those at risk of falls, and applied a cost analysis to enable commissioners of health services to identify those in whom savings can be made through referral to a falls prevention service. Methods Multilevel logistic regression was performed on routinely collected general practice and hospital data from 74,751 people aged over 65, to produce a risk model for falls. Validation measures were carried out. A cost analysis was performed to identify the level of risk at which it would be cost-effective to refer patients to a falls prevention service. 95% confidence intervals were calculated using a Monte Carlo Model (MCM), allowing us to adjust for uncertainty in the estimates of these variables. Results A risk model for falls was produced with an area under the receiver operating characteristic (ROC) curve of 0.87. The risk cut-off with the highest combination of sensitivity and specificity was at p = 0.07 (sensitivity of 81% and specificity of 78%). The risk cut-off at which savings outweigh costs was p = 0.27 and the risk cut-off with the maximum savings was p = 0.53, which would result in referral of 1.8% and 0.45% of the over-65 population, respectively. Above a risk cut-off of p = 0.27, costs do not exceed savings. Conclusions This model is the best-performing falls predictive tool developed to date; it has been developed on a large UK city population; can be readily run from routine data; and can be implemented in a way that optimises the use of health service resources. Commissioners of health services should use this model to flag and refer patients at risk to their falls service and save resources. PMID:27448280

  19. An algebraic method for constructing stable and consistent autoregressive filters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Harlim, John, E-mail: jharlim@psu.edu; Department of Meteorology, the Pennsylvania State University, University Park, PA 16802; Hong, Hoon, E-mail: hong@ncsu.edu

    2015-02-15

    In this paper, we introduce an algebraic method to construct stable and consistent univariate autoregressive (AR) models of low order for filtering and predicting nonlinear turbulent signals with memory depth. By stable, we refer to the classical stability condition for the AR model. By consistent, we refer to the classical consistency constraints of Adams–Bashforth methods of order two. One attractive feature of this algebraic method is that the model parameters can be obtained without directly knowing any training data set, as opposed to many standard, regression-based parameterization methods; it takes only long-time average statistics as inputs. The proposed method provides a discretization time step interval which guarantees the existence of a stable and consistent AR model and simultaneously produces the parameters for the AR models. In our numerical examples with two chaotic time series with different characteristics of decaying time scales, we find that the proposed AR models produce significantly more accurate short-term predictive skill and comparable filtering skill relative to the linear regression-based AR models. These encouraging results are robust across wide ranges of discretization times, observation times, and observation noise variances. Finally, we also find that the proposed model produces an improved short-time prediction relative to the linear regression-based AR models in forecasting a data set that characterizes the variability of the Madden–Julian Oscillation, a dominant tropical atmospheric wave pattern.
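
    For contrast with the algebraic construction, a sketch of the regression-based baseline the paper compares against: AR(2) coefficients estimated from training data by ordinary least squares (the algebraic, statistics-only derivation itself is not reproduced here):

    ```python
    import numpy as np

    # Simulate a stable AR(2) training signal.
    rng = np.random.default_rng(9)
    n = 2000
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = 0.5 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

    # Regression-based parameterization: least-squares fit on lagged values.
    lags = np.column_stack([y[1:-1], y[:-2]])   # y_{t-1}, y_{t-2}
    coef, *_ = np.linalg.lstsq(lags, y[2:], rcond=None)
    print("estimated AR coefficients:", coef)    # close to (0.5, -0.3)
    ```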

  20. Integrative Bayesian variable selection with gene-based informative priors for genome-wide association studies.

    PubMed

    Zhang, Xiaoshuai; Xue, Fuzhong; Liu, Hong; Zhu, Dianwen; Peng, Bin; Wiemels, Joseph L; Yang, Xiaowei

    2014-12-10

    Genome-wide Association Studies (GWAS) are typically designed to identify phenotype-associated single nucleotide polymorphisms (SNPs) individually using univariate analysis methods. Though providing valuable insights into genetic risks of common diseases, the genetic variants identified by GWAS generally account for only a small proportion of the total heritability for complex diseases. To solve this "missing heritability" problem, we implemented a strategy called integrative Bayesian Variable Selection (iBVS), which is based on a hierarchical model that incorporates an informative prior by considering the gene interrelationship as a network. It was applied here to both simulated and real data sets. Simulation studies indicated that the iBVS method was advantageous in its performance with highest AUC in both variable selection and outcome prediction, when compared to Stepwise and LASSO based strategies. In an analysis of a leprosy case-control study, iBVS selected 94 SNPs as predictors, while LASSO selected 100 SNPs. The Stepwise regression yielded a more parsimonious model with only 3 SNPs. The prediction results demonstrated that the iBVS method had comparable performance with that of LASSO, but better than Stepwise strategies. The proposed iBVS strategy is a novel and valid method for Genome-wide Association Studies, with the additional advantage in that it produces more interpretable posterior probabilities for each variable unlike LASSO and other penalized regression methods.

  1. Advancing haemostasis automation--successful implementation of robotic centrifugation and sample processing in a tertiary service hospital.

    PubMed

    Sédille-Mostafaie, Nazanin; Engler, Hanna; Lutz, Susanne; Korte, Wolfgang

    2013-06-01

    Laboratories today face increasing pressure to automate operations due to increasing workloads and the need to reduce expenditure. Few studies to date have focussed on the laboratory automation of preanalytical coagulation specimen processing. In the present study, we examined whether a clinical chemistry automation protocol meets the preanalytical requirements for the analyses of coagulation. During the implementation of laboratory automation, we began to operate a pre- and postanalytical automation system. The preanalytical unit processes blood specimens for chemistry, immunology and coagulation by automated specimen processing. As the production of platelet-poor plasma is highly dependent on optimal centrifugation, we examined specimen handling under different centrifugation conditions in order to produce optimal platelet deficient plasma specimens. To this end, manually processed models centrifuged at 1500 g for 5 and 20 min were compared to an automated centrifugation model at 3000 g for 7 min. For analytical assays that are performed frequently enough to be targets for full automation, Passing-Bablok regression analysis showed close agreement between different centrifugation methods, with a correlation coefficient between 0.98 and 0.99 and a bias between -5% and +6%. For seldom performed assays that do not mandate full automation, the Passing-Bablok regression analysis showed acceptable to poor agreement between different centrifugation methods. A full automation solution is suitable and can be recommended for frequent haemostasis testing.
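
    A simplified median-of-slopes sketch in the spirit of the Passing-Bablok comparison used above; it omits the full procedure's offset correction for negative slopes (reducing it to a Theil-Sen fit), and the paired measurements are hypothetical:

    ```python
    import numpy as np

    def median_slope_fit(x, y):
        """Median of all pairwise slopes plus a median intercept.

        A simplified stand-in for Passing-Bablok regression, adequate
        only for illustration.
        """
        x, y = np.asarray(x, float), np.asarray(y, float)
        i, j = np.triu_indices(len(x), k=1)
        dx = x[j] - x[i]
        keep = dx != 0                       # drop undefined slopes
        slopes = (y[j] - y[i])[keep] / dx[keep]
        b = np.median(slopes)
        a = np.median(y - b * x)
        return a, b

    # Hypothetical paired results: manual vs. automated centrifugation.
    rng = np.random.default_rng(3)
    manual = rng.uniform(1.5, 6.0, size=40)    # e.g. fibrinogen, g/L
    automated = 1.02 * manual + 0.05 + rng.normal(scale=0.08, size=40)
    a, b = median_slope_fit(manual, automated)
    print(f"intercept = {a:.3f}, slope = {b:.3f}")   # near 0 and 1 => agreement
    ```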

  2. INNOVATIVE INSTRUMENTATION AND ANALYSIS OF THE TEMPERATURE MEASUREMENT FOR HIGH TEMPERATURE GASIFICATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seong W. Lee

    During this reporting period, the literature survey, including the gasifier temperature measurement literature, the background study of ultrasonic cleaning applications, and the spray coating process, was completed. The gasifier simulator (cold model) testing has been successfully conducted. Four factors (blower voltage, ultrasonic application, injection time intervals, particle weight) were considered as significant factors that affect the temperature measurement. Analysis of Variance (ANOVA) was applied to analyze the test data. The analysis shows that all four factors are significant to the temperature measurements in the gasifier simulator (cold model). The regression analysis for the case with the normalized room temperature shows that a linear model fits the temperature data with 82% accuracy (18% error). The regression analysis for the case without the normalized room temperature shows 72.5% accuracy (27.5% error). The nonlinear regression analysis indicates a better fit than the linear regression: the nonlinear model's accuracy is 88.7% (11.3% error) for the normalized room temperature case, better than that of the linear regression analysis. The hot model thermocouple sleeve design and fabrication are completed. The gasifier simulator (hot model) design and fabrication are completed. System tests of the gasifier simulator (hot model) have been conducted and some modifications have been made. Based on the system tests and results analysis, the gasifier simulator (hot model) has met the proposed design requirements and is ready for system testing. The ultrasonic cleaning method is under evaluation and will be further studied for the gasifier simulator (hot model) application. The progress of this project has been on schedule.

  3. The Economic Value of Mangroves: A Meta-Analysis

    Treesearch

    Marwa Salem; D. Evan Mercer

    2012-01-01

    This paper presents a synthesis of the mangrove ecosystem valuation literature through a meta-regression analysis. The main contribution of this study is that it is the first meta-analysis focusing solely on mangrove forests, whereas previous studies have included different types of wetlands. The number of studies included in the regression analysis is 44 for a total...

  4. ICU Acquisition Rate, Risk Factors, and Clinical Significance of Digestive Tract Colonization With Extended-Spectrum Beta-Lactamase-Producing Enterobacteriaceae: A Systematic Review and Meta-Analysis.

    PubMed

    Detsis, Marios; Karanika, Styliani; Mylonakis, Eleftherios

    2017-04-01

    To evaluate the acquisition rate, identify risk factors, and estimate the risk of subsequent infection associated with colonization of the digestive tract by extended-spectrum beta-lactamase-producing Enterobacteriaceae during ICU hospitalization. PubMed, EMBASE, and reference lists of all eligible articles. Included studies provided data on ICU-acquired colonization with extended-spectrum beta-lactamase-producing Enterobacteriaceae in previously noncolonized and noninfected patients and used the double disk synergy test for extended-spectrum beta-lactamase-producing Enterobacteriaceae phenotypic confirmation. Studies reporting extended-spectrum beta-lactamase-producing Enterobacteriaceae outbreaks or data on pediatric populations were excluded. Two authors independently assessed study eligibility and performed data extraction. Thirteen studies (with 15,045 ICU patients) were evaluated using a random-effect model and a meta-regression analysis. The acquisition rate of digestive tract colonization during ICU stay was 7% (95% CI, 5-10); it varies from 3% (95% CI, 2-4) and 4% (95% CI, 2-6) in the Americas and Europe to 21% (95% CI, 9-35) in the Western Pacific region. Previous hospitalization (risk ratio, 1.57 [95% CI, 1.07-2.31]) or antibiotic use (risk ratio, 1.65 [95% CI, 1.15-2.37]) and exposure to beta-lactams/beta-lactamase inhibitors (risk ratio, 1.78 [95% CI, 1.24-2.56]) and carbapenems (risk ratio, 2.13 [95% CI, 1.49-3.06]) during the ICU stay were independent risk factors for ICU-acquired colonization. Importantly, colonized patients were more likely to develop an extended-spectrum beta-lactamase-producing Enterobacteriaceae infection (risk ratio, 49.62 [95% CI, 20.42-120.58]). The sensitivity and specificity of prior colonization to predict subsequent extended-spectrum beta-lactamase-producing Enterobacteriaceae infection were 95.1% (95% CI, 54.7-99.7) and 89.2% (95% CI, 77.2-95.3), respectively. The ICU acquisition rate of extended-spectrum beta-lactamase-producing Enterobacteriaceae ranged from 5% to 10%. Previous use of beta-lactams/beta-lactamase inhibitors or carbapenems and recent hospitalization were independent risk factors for extended-spectrum beta-lactamase-producing Enterobacteriaceae colonization, and colonization was associated with a significantly higher frequency of subsequent extended-spectrum beta-lactamase-producing Enterobacteriaceae infection and increased mortality.
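
    Pooling in a review like this is typically done with a random-effects model; below is a sketch of DerSimonian-Laird pooling of log risk ratios, with invented study-level values rather than the paper's data:

    ```python
    import numpy as np

    def dersimonian_laird(log_rr, se):
        """Random-effects pooling of log risk ratios (DerSimonian-Laird)."""
        w = 1.0 / se ** 2                           # fixed-effect weights
        theta_fe = np.sum(w * log_rr) / np.sum(w)
        q = np.sum(w * (log_rr - theta_fe) ** 2)    # Cochran's Q
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (q - (len(log_rr) - 1)) / c)  # between-study variance
        w_re = 1.0 / (se ** 2 + tau2)
        theta = np.sum(w_re * log_rr) / np.sum(w_re)
        se_theta = np.sqrt(1.0 / np.sum(w_re))
        return (np.exp(theta), np.exp(theta - 1.96 * se_theta),
                np.exp(theta + 1.96 * se_theta))

    # Invented study-level risk ratios for, say, prior antibiotic use.
    rr = np.array([1.4, 1.9, 1.5, 2.2, 1.3])
    se = np.array([0.25, 0.30, 0.20, 0.35, 0.28])   # SEs of log(RR)
    pooled, lo, hi = dersimonian_laird(np.log(rr), se)
    print(f"pooled RR (95% CI): {pooled:.2f} ({lo:.2f}-{hi:.2f})")
    ```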

  5. Estimation of diffusion coefficients from voltammetric signals by support vector and gaussian process regression

    PubMed Central

    2014-01-01

    Background Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
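
    A minimal sketch of the two regression engines compared here, scikit-learn's SVR and a Gaussian process with an RBF-plus-noise kernel; the voltammogram features and diffusion-coefficient targets are synthetic placeholders:

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel
    from sklearn.svm import SVR

    # Placeholder training set: downsampled voltammograms (rows) simulated
    # for known, scaled diffusion coefficients (targets).
    rng = np.random.default_rng(4)
    X_train = rng.normal(size=(200, 25))
    D_train = X_train @ rng.normal(size=25) + 0.05 * rng.normal(size=200)
    X_new = rng.normal(size=(5, 25))            # "experimental" signals

    svr = SVR(kernel="rbf", C=10.0).fit(X_train, D_train)
    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel()).fit(X_train, D_train)

    print("SVR estimates:", svr.predict(X_new))
    mean, std = gpr.predict(X_new, return_std=True)
    print("GPR estimates:", mean, "+/-", std)    # GPR also yields uncertainty
    ```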

  6. Introduction to methodology of dose-response meta-analysis for binary outcome: With application on software.

    PubMed

    Zhang, Chao; Jia, Pengli; Yu, Liu; Xu, Chang

    2018-05-01

    Dose-response meta-analysis (DRMA) is widely applied to investigate the dose-specific relationship between independent and dependent variables. Such methods have been in use for over 30 years and are increasingly employed in healthcare and clinical decision-making. In this article, we give an overview of the methodology used in DRMA, summarizing the commonly used regression models and pooling methods, and use an example to illustrate how to conduct a DRMA with these methods. Five regression models (linear, piecewise, natural polynomial, fractional polynomial, and restricted cubic spline regression) are illustrated for fitting the dose-response relationship, and two pooling approaches, the one-stage approach and the two-stage approach, are illustrated for pooling the dose-response relationship across studies. The example showed similar results among these models. Several dose-response meta-analysis methods can be used for investigating the relationship between exposure level and the risk of an outcome; however, the methodology of DRMA still needs improvement. © 2018 Chinese Cochrane Center, West China Hospital of Sichuan University and John Wiley & Sons Australia, Ltd.
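
    A sketch of the two-stage approach on invented study data: stage one fits a study-specific linear slope of log risk ratio per unit dose (weighted least squares through the origin at the reference dose), and stage two pools the slopes by inverse variance. The within-study correlation that the Greenland-Longnecker method corrects for is ignored here:

    ```python
    import numpy as np

    # Each study: (dose levels, log risk ratios, SEs), reference dose first.
    studies = [
        (np.array([0.0, 5.0, 10.0]),
         np.array([0.0, 0.18, 0.33]),
         np.array([1e-6, 0.09, 0.11])),
        (np.array([0.0, 4.0, 8.0, 12.0]),
         np.array([0.0, 0.10, 0.26, 0.41]),
         np.array([1e-6, 0.08, 0.10, 0.12])),
    ]

    slopes, variances = [], []
    for dose, logrr, se in studies:
        d, y, w = dose[1:], logrr[1:], 1.0 / se[1:] ** 2   # skip reference level
        b = np.sum(w * d * y) / np.sum(w * d * d)          # WLS through origin
        slopes.append(b)
        variances.append(1.0 / np.sum(w * d * d))

    # Stage two: inverse-variance weighted average of the study slopes.
    b_pooled = np.average(slopes, weights=1.0 / np.asarray(variances))
    print("pooled log-RR per unit dose:", b_pooled)
    ```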

  7. Simulation of CO2 Solubility in Polystyrene-b-Polybutadiene-b-Polystyrene (SEBS) by artificial neural network (ANN) method

    NASA Astrophysics Data System (ADS)

    Sharudin, R. W.; AbdulBari Ali, S.; Zulkarnain, M.; Shukri, M. A.

    2018-05-01

    This study reports on the integration of Artificial Neural Networks (ANNs) with experimental data to predict the solubility of the carbon dioxide (CO2) blowing agent in SEBS, aiming for the highest possible regression coefficient (R2). Foaming of a thermoplastic elastomer with CO2 is strongly affected by the CO2 solubility. The ability of the ANN to predict interpolated CO2 solubility data was investigated by comparing training results across different network training methods. The final ANN predictions of CO2 solubility corroborated the experimental results. Training with Gradient Descent with Momentum and Adaptive LR (traingdx) required a longer training time and more accurate input to produce good output, with a final regression value of 0.88, whereas the Levenberg-Marquardt (trainlm) technique produced better output in a shorter training time, with a final regression value of 0.91.
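
    The study's comparison is between MATLAB training algorithms; below is a rough Python analogue using scikit-learn's MLPRegressor, with SGD plus momentum and an adaptive learning rate standing in for traingdx, and L-BFGS as the nearest available quasi-Newton stand-in for trainlm (scikit-learn has no Levenberg-Marquardt solver):

    ```python
    import numpy as np
    from sklearn.metrics import r2_score
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    # Placeholder inputs (e.g. temperature, pressure) against CO2 solubility.
    rng = np.random.default_rng(5)
    X = rng.uniform(size=(300, 2))
    y = 2.0 * X[:, 0] - 1.5 * X[:, 1] ** 2 + 0.05 * rng.normal(size=300)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    nets = {
        "SGD + momentum, adaptive LR (cf. traingdx)": MLPRegressor(
            hidden_layer_sizes=(10,), solver="sgd", learning_rate="adaptive",
            momentum=0.9, max_iter=5000, random_state=0),
        "L-BFGS quasi-Newton (cf. trainlm)": MLPRegressor(
            hidden_layer_sizes=(10,), solver="lbfgs", max_iter=5000,
            random_state=0),
    }
    for name, net in nets.items():
        net.fit(X_tr, y_tr)
        print(f"{name}: R^2 = {r2_score(y_te, net.predict(X_te)):.3f}")
    ```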

  8. Predictors of postoperative outcomes of cubital tunnel syndrome treatments using multiple logistic regression analysis.

    PubMed

    Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki

    2017-05-01

    This retrospective study was designed to investigate prognostic factors for postoperative outcomes of cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgery were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria, an outcome measure specific to CubTS. Bivariate analysis was performed to select candidate prognostic factors for the multiple logistic regression analysis, which was then conducted to identify associations between postoperative severity and the selected prognostic factors. Both the bivariate and the multiple logistic regression analyses revealed only preoperative severity as an independent risk factor for poor prognosis, while the other factors showed no significant association. Although conflicting results exist regarding the prognosis of CubTS, this study supports evidence from previous studies and concludes that early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.

  9. Drug treatment rates with beta-blockers and ACE-inhibitors/angiotensin receptor blockers and recurrences in takotsubo cardiomyopathy: A meta-regression analysis.

    PubMed

    Brunetti, Natale Daniele; Santoro, Francesco; De Gennaro, Luisa; Correale, Michele; Gaglione, Antonio; Di Biase, Matteo

    2016-07-01

    In a recent paper, Singh et al. analyzed the effect of drug treatment on recurrence of takotsubo cardiomyopathy (TTC) in a comprehensive meta-analysis. The study found that recurrence rates were independent of the rate of beta-blocker (BB) prescription but inversely correlated with ACEi/ARB prescription; the authors therefore concluded that ACEi/ARB, rather than BB, may reduce the risk of recurrence. We aimed to re-analyze the data reported in the study, now weighted for population size, in a meta-regression analysis. After multiple meta-regression analysis, we found a significant regression between rates of ACEi prescription and rates of TTC recurrence; the regression was not statistically significant for BBs. On the basis of our re-analysis, we confirm that rates of recurrence of TTC are lower in populations of patients with higher rates of treatment with ACEi/ARB. This does not necessarily imply that ACEi prevents recurrence of TTC; it may merely mean, for example, that recurrence rates are lower in cohorts that are more compliant with therapy, or that are prescribed ACEi more often because they are more carefully followed. Randomized prospective studies are surely warranted. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  10. A general framework for the use of logistic regression models in meta-analysis.

    PubMed

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy. © The Author(s) 2014.

  11. Examination of influential observations in penalized spline regression

    NASA Astrophysics Data System (ADS)

    Türkan, Semra

    2013-10-01

    In parametric and nonparametric regression models alike, the results of regression analysis can be affected by anomalous observations in the data set, so the detection of these observations is one of the major steps in regression analysis. Such observations are detected by well-known influence measures, of which Pena's statistic is one. In this study, Pena's approach is formulated for penalized spline regression in terms of ordinary residuals and leverages. Real and artificial data are used to illustrate the effectiveness of Pena's statistic relative to Cook's distance in detecting influential observations. The results of the study clearly reveal that the proposed measure is superior to Cook's distance in detecting these observations in large data sets.
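
    Pena's statistic is not, to our knowledge, available in common Python libraries; for the baseline side of the comparison, here is a sketch of Cook's distance flagging a planted influential observation via statsmodels:

    ```python
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import OLSInfluence

    # Ordinary regression with one contaminated observation.
    rng = np.random.default_rng(6)
    x = rng.uniform(0, 10, size=100)
    y = 1.0 + 2.0 * x + rng.normal(size=100)
    y[10] += 15.0                               # plant an influential point

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    cooks_d, _ = OLSInfluence(fit).cooks_distance
    print("most influential observation:", int(np.argmax(cooks_d)))
    ```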

  12. Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verdoolaege, G., E-mail: geert.verdoolaege@ugent.be; Laboratory for Plasma Physics, Royal Military Academy, B-1000 Brussels; Shabbir, A.

    Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.

  13. A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation.

    PubMed

    Välikangas, Tommi; Suomi, Tomi; Elo, Laura L

    2017-05-31

    Label-free mass spectrometry (MS) has developed into an important tool applied in various fields of biological and life sciences. Several software packages exist to process the raw MS data into quantified protein abundances, including open source and commercial solutions. Each package includes a set of unique algorithms for different tasks of the MS data processing workflow. While many of these algorithms have been compared separately, a thorough and systematic evaluation of their overall performance is missing. Moreover, systematic information is lacking about the number of missing values produced by the different proteomics software packages and the capabilities of different data imputation methods to account for them. In this study, we evaluated the performance of five popular quantitative label-free proteomics software workflows using four different spike-in data sets. Our extensive testing included the number of proteins quantified and the number of missing values produced by each workflow, the accuracy of detecting differential expression and logarithmic fold change, and the effect of different imputation and filtering methods on the differential expression results. We found that the Progenesis software performed consistently well in the differential expression analysis and produced few missing values. The missing values produced by the other software decreased their performance, but this difference could be mitigated using proper data filtering or imputation methods. Among the imputation methods, we found that local least squares (lls) regression imputation consistently increased the performance of the software in the differential expression analysis, and a combination of both data filtering and local least squares imputation increased performance the most in the tested data sets. © The Author 2017. Published by Oxford University Press.

  14. Geographically weighted regression and multicollinearity: dispelling the myth

    NASA Astrophysics Data System (ADS)

    Fotheringham, A. Stewart; Oshan, Taylor M.

    2016-10-01

    Geographically weighted regression (GWR) extends the familiar regression framework by estimating a set of parameters for any number of locations within a study area, rather than producing a single parameter estimate for each relationship specified in the model. Recent literature has suggested that GWR is highly susceptible to the effects of multicollinearity between explanatory variables and has proposed a series of local measures of multicollinearity as an indicator of potential problems. In this paper, we employ a controlled simulation to demonstrate that GWR is in fact very robust to the effects of multicollinearity. Consequently, the contention that GWR is highly susceptible to multicollinearity issues needs rethinking.
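
    A bare-bones GWR sketch on simulated data with a spatially varying coefficient: a separate weighted least-squares fit at each location using Gaussian kernel weights, with no bandwidth selection or local collinearity diagnostics:

    ```python
    import numpy as np

    def gwr_coefficients(coords, X, y, bandwidth):
        """Fit a weighted least-squares model at every location."""
        X1 = np.column_stack([np.ones(len(y)), X])      # add intercept
        betas = np.empty((len(y), X1.shape[1]))
        for i, c in enumerate(coords):
            d2 = np.sum((coords - c) ** 2, axis=1)
            w = np.exp(-d2 / (2.0 * bandwidth ** 2))    # Gaussian kernel
            XtW = X1.T * w
            betas[i] = np.linalg.solve(XtW @ X1, XtW @ y)
        return betas

    # Synthetic surface: the slope on x1 varies with the east-west coordinate.
    rng = np.random.default_rng(7)
    coords = rng.uniform(0, 1, size=(300, 2))
    x1 = rng.normal(size=300)
    y = (1.0 + 2.0 * coords[:, 0]) * x1 + rng.normal(scale=0.2, size=300)

    betas = gwr_coefficients(coords, x1[:, None], y, bandwidth=0.15)
    print("local slopes range:", betas[:, 1].min(), "to", betas[:, 1].max())
    ```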

  15. Forecasting daily meteorological time series using ARIMA and regression models

    NASA Astrophysics Data System (ADS)

    Murat, Małgorzata; Malinowska, Iwona; Gos, Magdalena; Krzyszczak, Jaromir

    2018-04-01

    The daily air temperature and precipitation time series recorded between January 1, 1980 and December 31, 2010 at four European sites (Jokioinen, Dikopshof, Lleida and Lublin) from different climatic zones were modeled and forecasted. For forecasting we used Box-Jenkins and Holt-Winters seasonal autoregressive integrated moving-average methods, the autoregressive integrated moving-average with external regressors in the form of Fourier terms, and time-series regression including trend and seasonality components, implemented with R software. It was demonstrated that the obtained models are able to capture the dynamics of the time series data and to produce sensible forecasts.
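
    The paper works in R; an analogous sketch with Python's statsmodels fits an ARIMA model with Fourier-term external regressors to a synthetic daily series with an annual cycle:

    ```python
    import numpy as np
    import statsmodels.api as sm

    # Synthetic daily temperature-like series with annual seasonality.
    rng = np.random.default_rng(8)
    n = 3 * 365
    t = np.arange(n)
    y = 10 + 8 * np.sin(2 * np.pi * t / 365.25) + rng.normal(scale=2.0, size=n)

    def fourier_terms(t, period=365.25, k=2):
        """Pairs of sin/cos regressors capturing the annual cycle."""
        return np.column_stack(
            [f(2 * np.pi * h * t / period) for h in range(1, k + 1)
             for f in (np.sin, np.cos)])

    X = fourier_terms(t)
    model = sm.tsa.SARIMAX(y, exog=X, order=(1, 0, 1)).fit(disp=False)
    X_future = fourier_terms(np.arange(n, n + 30))
    print(model.forecast(steps=30, exog=X_future))   # 30-day-ahead forecast
    ```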

  16. Social desirability bias in dietary self-report may compromise the validity of dietary intake measures.

    PubMed

    Hebert, J R; Clemow, L; Pbert, L; Ockene, I S; Ockene, J K

    1995-04-01

    Self-report of dietary intake could be biased by social desirability or social approval thus affecting risk estimates in epidemiological studies. These constructs produce response set biases, which are evident when testing in domains characterized by easily recognizable correct or desirable responses. Given the social and psychological value ascribed to diet, assessment methodologies used most commonly in epidemiological studies are particularly vulnerable to these biases. Social desirability and social approval biases were tested by comparing nutrient scores derived from multiple 24-hour diet recalls (24HR) on seven randomly assigned days with those from two 7-day diet recalls (7DDR) (similar in some respects to commonly used food frequency questionnaires), one administered at the beginning of the test period (pre) and one at the end (post). Statistical analysis included correlation and multiple linear regression. Cross-sectionally, no relationships between social approval score and the nutritional variables existed. Social desirability score was negatively correlated with most nutritional variables. In linear regression analysis, social desirability score produced a large downward bias in nutrient estimation in the 7DDR relative to the 24HR. For total energy, this bias equalled about 50 kcal/point on the social desirability scale or about 450 kcal over its interquartile range. The bias was approximately twice as large for women as for men and only about half as large in the post measures. Individuals having the highest 24HR-derived fat and total energy intake scores had the largest downward bias due to social desirability. We observed a large downward bias in reporting food intake related to social desirability score. These results are consistent with the theoretical constructs on which the hypothesis is based. The effect of social desirability bias is discussed in terms of its influence on epidemiological estimates of effect. Suggestions are made for future work aimed at improving dietary assessment methodologies and adjusting risk estimates for this bias.

  17. Relationship between leaf functional traits and productivity in Aquilaria crassna (Thymelaeaceae) plantations: a tool to aid in the early selection of high-yielding trees.

    PubMed

    López-Sampson, Arlene; Cernusak, Lucas A; Page, Tony

    2017-05-01

    Physiological traits are frequently used as indicators of tree productivity. Aquilaria species growing in a research planting were studied to investigate relationships between leaf-productivity traits and tree growth. Twenty-eight trees were selected to measure the isotopic composition of carbon (δ13C) and nitrogen (δ15N) and monitor six leaf attributes. Trees were sampled randomly within each of four diametric classes (at 150 mm above ground level), ensuring the variability in growth of the whole population was represented. A model-averaging technique based on Akaike's information criterion was applied to identify whether leaf traits could assist in diameter prediction. Regression analysis was performed to test for relationships between carbon isotope values, diameter, and leaf traits. Approximately one new leaf per week was produced by a shoot. The rate of leaf expansion was estimated as 1.45 mm day-1. The range of δ13C values in leaves of Aquilaria species was from -25.5‰ to -31‰, with an average of -28.4‰ (±1.5‰ SD). A moderate negative correlation (R2 = 0.357) between diameter and δ13C in leaf dry matter indicated that individuals with high intercellular CO2 concentrations (low δ13C) and associated low water-use efficiency sustained rapid growth. Analysis of the 95% confidence set of best-ranked regression models indicated that the predictors that could best explain growth in Aquilaria species were δ13C, δ15N, petiole length, number of new leaves produced per week, and specific leaf area. The model constructed with these variables explained 55% (R2 = 0.55) of the variability in stem diameter. This demonstrates that leaf traits can assist in the early selection of high-productivity trees in Aquilaria species. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  18. Uncertainty and sensitivity analysis for two-phase flow in the vicinity of the repository in the 1996 performance assessment for the Waste Isolation Pilot Plant: Disturbed conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    HELTON,JON CRAIG; BEAN,J.E.; ECONOMY,K.

    2000-05-22

    Uncertainty and sensitivity analysis results obtained in the 1996 performance assessment (PA) for the Waste Isolation Pilot Plant (WIPP) are presented for two-phase flow in the vicinity of the repository under disturbed conditions resulting from drilling intrusions. Techniques based on Latin hypercube sampling, examination of scatterplots, stepwise regression analysis, partial correlation analysis and rank transformations are used to investigate brine inflow, gas generation, repository pressure, brine saturation, and brine and gas outflow. Of the variables under study, repository pressure and brine flow from the repository to the Culebra Dolomite are potentially the most important in PA for the WIPP. Subsequent to a drilling intrusion, repository pressure was dominated by borehole permeability and generally below the level (i.e., 8 MPa) that could potentially produce spallings and direct brine releases. Brine flow from the repository to the Culebra Dolomite tended to be small or nonexistent, with its occurrence and size also dominated by borehole permeability.
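    The sampling-plus-rank-regression workflow named above can be sketched in a few lines. The following toy example (not the WIPP code; the input names, ranges, and response function are assumptions) draws a Latin hypercube sample and computes standardized rank regression coefficients as sensitivity measures:

```python
# Illustrative LHS + rank regression sensitivity analysis on a toy response.
import numpy as np
from scipy.stats import qmc, rankdata

rng = np.random.default_rng(1)
sampler = qmc.LatinHypercube(d=3, seed=0)
u = sampler.random(n=200)                               # LHS on the unit cube
x = qmc.scale(u, [1e-14, 0.1, 5.0], [1e-11, 0.9, 15.0])
# assumed columns: borehole permeability (m^2), brine saturation, pressure-like input
y = 2.0 * np.log(x[:, 0]) + 0.3 * x[:, 1] + rng.normal(0, 0.2, 200)

# Standardized rank regression coefficients (SRRCs) as sensitivity measures
R = np.column_stack([rankdata(c) for c in x.T])
ry = rankdata(y)
R = (R - R.mean(axis=0)) / R.std(axis=0)
ry = (ry - ry.mean()) / ry.std()
srrc, *_ = np.linalg.lstsq(R, ry, rcond=None)
print("SRRCs:", np.round(srrc, 2))       # first input should dominate here
```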

  19. Advanced Online Survival Analysis Tool for Predictive Modelling in Clinical Data Science.

    PubMed

    Montes-Torres, Julio; Subirats, José Luis; Ribelles, Nuria; Urda, Daniel; Franco, Leonardo; Alba, Emilio; Jerez, José Manuel

    2016-01-01

    One of the prevailing applications of machine learning is the use of predictive modelling in clinical survival analysis. In this work, we present our view of the current situation of computer tools for survival analysis, stressing the need of transferring the latest results in the field of machine learning to biomedical researchers. We propose a web based software for survival analysis called OSA (Online Survival Analysis), which has been developed as an open access and user friendly option to obtain discrete time, predictive survival models at individual level using machine learning techniques, and to perform standard survival analysis. OSA employs an Artificial Neural Network (ANN) based method to produce the predictive survival models. Additionally, the software can easily generate survival and hazard curves with multiple options to personalise the plots, obtain contingency tables from the uploaded data to perform different tests, and fit a Cox regression model from a number of predictor variables. In the Materials and Methods section, we depict the general architecture of the application and introduce the mathematical background of each of the implemented methods. The study concludes with examples of use showing the results obtained with public datasets.

  20. Advanced Online Survival Analysis Tool for Predictive Modelling in Clinical Data Science

    PubMed Central

    Montes-Torres, Julio; Subirats, José Luis; Ribelles, Nuria; Urda, Daniel; Franco, Leonardo; Alba, Emilio; Jerez, José Manuel

    2016-01-01

    One of the prevailing applications of machine learning is the use of predictive modelling in clinical survival analysis. In this work, we present our view of the current situation of computer tools for survival analysis, stressing the need of transferring the latest results in the field of machine learning to biomedical researchers. We propose a web based software for survival analysis called OSA (Online Survival Analysis), which has been developed as an open access and user friendly option to obtain discrete time, predictive survival models at individual level using machine learning techniques, and to perform standard survival analysis. OSA employs an Artificial Neural Network (ANN) based method to produce the predictive survival models. Additionally, the software can easily generate survival and hazard curves with multiple options to personalise the plots, obtain contingency tables from the uploaded data to perform different tests, and fit a Cox regression model from a number of predictor variables. In the Materials and Methods section, we depict the general architecture of the application and introduce the mathematical background of each of the implemented methods. The study concludes with examples of use showing the results obtained with public datasets. PMID:27532883

  1. A comparison between ten advanced and soft computing models for groundwater qanat potential assessment in Iran using R and GIS

    NASA Astrophysics Data System (ADS)

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Abbaspour, Karim

    2018-02-01

    Considering the unstable condition of water resources in Iran and many other countries in arid and semi-arid regions, groundwater studies are very important. Therefore, the aim of this study is to model groundwater potential using qanat locations as indicators and ten advanced and soft computing models, applied to the Beheshtabad Watershed, Iran. A qanat is a man-made underground construction which gathers groundwater from higher altitudes and transmits it to lowland areas, where it can be used for different purposes. First, the locations of the qanats were identified through extensive field surveys. These qanats were divided into training (70%) and validation (30%) datasets. Then, 14 influence factors depicting the region's physical, morphological, lithological, and hydrological features were identified to model groundwater potential. Linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), flexible discriminant analysis (FDA), penalized discriminant analysis (PDA), boosted regression tree (BRT), random forest (RF), artificial neural network (ANN), K-nearest neighbor (KNN), multivariate adaptive regression splines (MARS), and support vector machine (SVM) models were applied in R scripts to produce groundwater potential maps. For evaluation of the performance accuracies of the developed models, the ROC curve and kappa index were implemented. According to the results, RF had the best performance, followed by the SVM and BRT models. Our results showed that qanat locations could be used as a good indicator for groundwater potential. Furthermore, altitude, slope, plan curvature, and profile curvature were found to be the most important influence factors. On the other hand, lithology, land use, and slope aspect were the least significant factors. The methodology in the current study could be used by land use and terrestrial planners and water resource managers to reduce the costs of groundwater resource discovery.
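    The evaluation loop for any one of these models is compact. A hedged sketch of the random forest case (Python/scikit-learn rather than the study's R scripts; the 14 factors and labels are simulated):

```python
# Illustrative RF fit with the same 70/30 split and ROC/kappa evaluation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=14, n_informative=6,
                           random_state=0)           # qanat presence/absence
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,  # 70/30 split
                                          random_state=0)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
print("AUC  :", round(roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]), 3))
print("kappa:", round(cohen_kappa_score(y_te, rf.predict(X_te)), 3))
```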

  2. A structure-activity analysis of the variation in oxime efficacy against nerve agents

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maxwell, Donald M.; Koplovitz, Irwin; Worek, Franz

    2008-09-01

    A structure-activity analysis was used to evaluate the variation in oxime efficacy of 2-PAM, obidoxime, HI-6 and ICD585 against nerve agents. In vivo oxime protection and in vitro oxime reactivation were used as indicators of oxime efficacy against VX, sarin, VR and cyclosarin. Analysis of in vivo oxime protection was conducted with oxime protective ratios (PR) from guinea pigs receiving oxime and atropine therapy after sc administration of nerve agent. Analysis of in vitro reactivation was conducted with second-order rate constants (kr2) for oxime reactivation of agent-inhibited acetylcholinesterase (AChE) from guinea pig erythrocytes. In vivo oxime PR and in vitro kr2 decreased as the volume of the alkylmethylphosphonate moiety of nerve agents increased from VX to cyclosarin. This effect was greater with 2-PAM and obidoxime (> 14-fold decrease in PR) than with HI-6 and ICD585 (< 3.7-fold decrease in PR). The decrease in oxime PR and kr2 as the volume of the agent moiety conjugated to AChE increased was consistent with a steric hindrance mechanism. Linear regression of log(PR-1) against log(kr2 · [oxime dose]) produced two offset parallel regression lines that delineated a significant difference between the coupling of oxime reactivation and oxime protection for HI-6 and ICD585 compared to 2-PAM and obidoxime. HI-6 and ICD585 appeared to be 6.8-fold more effective than 2-PAM and obidoxime at coupling oxime reactivation to oxime protection, which suggested that the isonicotinamide group that is common to both of these oximes, but absent from 2-PAM and obidoxime, is important for oxime efficacy.

  3. [Calculating Pearson residual in logistic regressions: a comparison between SPSS and SAS].

    PubMed

    Xu, Hao; Zhang, Tao; Li, Xiao-song; Liu, Yuan-yuan

    2015-01-01

    To compare the results of Pearson residual calculations in logistic regression models using SPSS and SAS. We reviewed Pearson residual calculation methods, and used two sets of data to test logistic models constructed by SPSS and SAS. One model contained a small number of covariates relative to the number of observations. The other contained a number of covariates similar to the number of observations. The two software packages produced similar Pearson residual estimates when the models contained a number of covariates similar to the number of observations, but the results differed when the number of observations was much greater than the number of covariates. The two software packages produce different results for Pearson residuals, especially when the models contain a small number of covariates. Further studies are warranted.
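    As a point of reference for the discrepancy discussed here, the textbook Pearson residual for a binary logistic model is r_i = (y_i - p_i) / sqrt(p_i (1 - p_i)). A minimal Python sketch (simulated data, illustrative names) computes it both by hand and via a fitted statsmodels GLM:

```python
# Pearson residuals from a fitted logistic model, manual vs. statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
X = sm.add_constant(rng.normal(size=(n, 3)))      # intercept + 3 covariates
p_true = 1 / (1 + np.exp(-X @ np.array([-0.5, 1.0, -0.8, 0.3])))
y = rng.binomial(1, p_true)

fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
p_hat = fit.fittedvalues                          # predicted probabilities

# Pearson residual for observation i: (y_i - p_i) / sqrt(p_i * (1 - p_i))
pearson_manual = (y - p_hat) / np.sqrt(p_hat * (1 - p_hat))
assert np.allclose(pearson_manual, fit.resid_pearson)
print(fit.resid_pearson[:5])
```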

  4. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when, in a stepwise regression, a descriptor is included in or excluded from the regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes, at different steps of the stepwise regression, a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes, which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of a greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined, showing how it resolves the ambiguities associated with both "nightmares" of the first and second kind of MRA.
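    The "nightmare of the first kind" is easy to reproduce: when descriptors are intercorrelated, adding one to the regression shifts the coefficients of those already present. A minimal illustration on synthetic data (not the nonane boiling-point descriptors):

```python
# Demonstrating coefficient instability when a correlated descriptor is added.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(15)
n = 50
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.4 * rng.normal(size=n)     # descriptor correlated with x1
y = 1.0 * x1 + 0.5 * x2 + rng.normal(0, 0.3, n)

m1 = sm.OLS(y, sm.add_constant(x1.reshape(-1, 1))).fit()
m2 = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print("coef of x1 alone        :", round(m1.params[1], 3))   # ~1.45
print("coef of x1 with x2 added:", round(m2.params[1], 3))   # ~1.0
```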

  5. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper presents the results of research devoted to the application of statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given. A binary logistic regression procedure is discussed. Basic results of the research are presented: a classification table of predicted versus observed values is shown and the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, and the general regression equation is written from them. Based on the results of the logistic regression, ROC analysis was performed; the sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow the diagnosis of children's health with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have high practical importance in the professional activity of the author.
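    The fit-then-ROC workflow described here is standard. A hedged Python sketch (simulated data standing in for the neonatal measurements) showing the classification-table quantities, sensitivity, specificity, and AUC:

```python
# Binary logistic regression followed by ROC-style evaluation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, confusion_matrix

X, y = make_classification(n_samples=300, n_features=5, random_state=1)
clf = LogisticRegression().fit(X, y)
prob = clf.predict_proba(X)[:, 1]

pred = (prob >= 0.5).astype(int)            # classification table at 0.5 cutoff
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"AUC={roc_auc_score(y, prob):.3f}  "
      f"sens={sensitivity:.3f}  spec={specificity:.3f}")
```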

  6. A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.

    PubMed

    Ferrari, Alberto; Comelli, Mario

    2016-12-01

    In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and the sample size is small. A number of more advanced methods are available, but they are often technically challenging, and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; in addition, we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
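    Simulations of this kind draw overdispersed counts from a beta-binomial under the null and check whether a given analysis holds its nominal Type I error. A minimal sketch (parameters are illustrative assumptions; a t-test on per-subject proportions stands in for the linear-model analysis):

```python
# Empirical Type I error of a per-subject-proportion analysis under a
# beta-binomial (clustered, overdispersed) null.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_subj, n_trials = 20, 30
n_sims, rejections = 2000, 0
for _ in range(n_sims):
    group = np.repeat([0, 1], n_subj // 2)
    # Null: both groups share mean p = 0.5 with cluster-level overdispersion
    # (a = b = 2 gives intra-cluster correlation 1/(a+b+1) = 0.2).
    p_i = rng.beta(2, 2, size=n_subj)
    successes = rng.binomial(n_trials, p_i)
    props = successes / n_trials
    t, pval = stats.ttest_ind(props[group == 0], props[group == 1])
    rejections += pval < 0.05
print("empirical Type I error:", rejections / n_sims)   # should be near 0.05
```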

  7. Regression analysis of informative current status data with the additive hazards model.

    PubMed

    Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo

    2015-04-01

    This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models if the censoring is noninformative, and also there exists a large literature on parametric analysis of informative current status data in the context of tumorigenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented and in the method, the copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.

  8. Comparison of cranial sex determination by discriminant analysis and logistic regression.

    PubMed

    Amores-Ampuero, Anabel; Alemán, Inmaculada

    2016-04-05

    Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).
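    For readers who want the mechanics, the comparison reduces to cross-validating two classifiers on the same measurement matrix. A hedged scikit-learn sketch with simulated stand-ins for the 19 neurocranial dimensions:

```python
# Cross-validated comparison of LDA vs. logistic regression for sex estimation.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=130, n_features=19, n_informative=6,
                           random_state=0)   # stand-in for 19 dimensions
for name, model in [("LDA", LinearDiscriminantAnalysis()),
                    ("logistic", LogisticRegression(max_iter=1000))]:
    acc = cross_val_score(model, X, y, cv=10).mean()
    print(f"{name}: cross-validated accuracy = {acc:.3f}")
```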

  9. Building Regression Models: The Importance of Graphics.

    ERIC Educational Resources Information Center

    Dunn, Richard

    1989-01-01

    Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)

  10. Testing Different Model Building Procedures Using Multiple Regression.

    ERIC Educational Resources Information Center

    Thayer, Jerome D.

    The stepwise regression method of selecting predictors for computer assisted multiple regression analysis was compared with forward, backward, and best subsets regression, using 16 data sets. The results indicated the stepwise method was preferred because of its practical nature, when the models chosen by different selection methods were similar…

  11. The alarming problems of confounding equivalence using logistic regression models in the perspective of causal diagrams.

    PubMed

    Yu, Yuanyuan; Li, Hongkai; Sun, Xiaoru; Su, Ping; Wang, Tingting; Liu, Yi; Yuan, Zhongshang; Liu, Yanxun; Xue, Fuzhong

    2017-12-28

    Confounders can produce spurious associations between exposure and outcome in observational studies. For the majority of epidemiologists, adjusting for confounders using a logistic regression model is their habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and to search for alternative methods. Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which the logistic regression model and the inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The "do-calculus" was used to calculate the true causal effect of exposure on outcome, and then the bias and standard error were used to evaluate the performances of different strategies. Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets satisfying G-admissibility, adjusting for the set including all the confounders reduced the equivalent bias to the one containing the parent nodes of the outcome, while the bias after adjusting for the parent nodes of exposure was not equivalent to them. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM. Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility, and the optimal strategy was to adjust for the parent nodes of outcome, which obtained the highest precision. All adjustment strategies through logistic regression were biased for causal effect estimation, while IPW-based-MSM could always obtain unbiased estimation when the adjusted set satisfied G-admissibility. Thus, IPW-based-MSM is recommended for adjusting for confounder sets.
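    The IPW step that distinguishes the MSM approach is short in code: fit a propensity model, weight each subject by the inverse probability of the exposure actually received, and contrast outcomes in the weighted pseudo-population. A hedged sketch (simulated data; names and coefficients are illustrative, not the paper's scenarios):

```python
# Inverse-probability weighting for a simple confounded exposure-outcome pair.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 5000
L = rng.normal(size=(n, 2))                          # measured confounders
A = rng.binomial(1, 1 / (1 + np.exp(-(L @ np.array([0.8, -0.5])))))
Y = rng.binomial(1, 1 / (1 + np.exp(-(0.7 * A + L @ np.array([1.0, 0.6])))))

ps = LogisticRegression().fit(L, A).predict_proba(L)[:, 1]
w = A / ps + (1 - A) / (1 - ps)                      # inverse-probability weights

# Risk difference in the weighted pseudo-population (the MSM estimand here)
rd = np.average(Y[A == 1], weights=w[A == 1]) - \
     np.average(Y[A == 0], weights=w[A == 0])
print("IPW-adjusted risk difference:", round(rd, 3))
```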

  12. A Two-Step Method to Select Major Surge-Producing Extratropical Cyclones from a 10,000-Year Stochastic Catalog

    NASA Astrophysics Data System (ADS)

    Keshtpoor, M.; Carnacina, I.; Yablonsky, R. M.

    2016-12-01

    Extratropical cyclones (ETCs) are the primary driver of storm surge events along the UK and northwest mainland Europe coastlines. In an effort to evaluate the storm surge risk in coastal communities in this region, a stochastic catalog is developed by perturbing the historical storm seeds of European ETCs to account for 10,000 years of possible ETCs. Numerical simulation of the storm surge generated by the full 10,000-year stochastic catalog, however, is computationally expensive and may take several months to complete with available computational resources. A new statistical regression model is developed to select the major surge-generating events from the stochastic ETC catalog. This regression model is based on the maximum storm surge, obtained via numerical simulations using a calibrated version of the Delft3D-FM hydrodynamic model with a relatively coarse mesh, of 1750 historical ETC events that occurred over the past 38 years in Europe. These numerically-simulated surge values were regressed against the local sea level pressure and the U and V components of the wind field at the location of 196 tide gauge stations near the UK and northwest mainland Europe coastal areas. The regression model suggests that storm surge values in the area of interest are highly correlated with the U- and V-components of wind speed, as well as the sea level pressure. Based on these correlations, the regression model was then used to select surge-generating storms from the 10,000-year stochastic catalog. Results suggest that roughly 105,000 events out of 480,000 stochastic storms are surge-generating events and need to be considered for numerical simulation using a hydrodynamic model. The selected stochastic storms were then simulated in Delft3D-FM, and the final refinement of the storm population was performed based on return period analysis of the 1750 historical event simulations at each of the 196 tide gauges in preparation for Delft3D-FM fine mesh simulations.
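    A screening regression of this form is simply a multiple linear fit of maximum surge on the local predictors. A toy sketch (synthetic data; the coefficients and units are placeholders, not the study's fitted values):

```python
# Toy screening regression: max surge on sea level pressure and wind components.
import numpy as np

rng = np.random.default_rng(14)
n = 1750                                    # cf. the 1750 historical events
slp = rng.normal(990, 15, n)                # sea level pressure (hPa)
u, v = rng.normal(0, 10, n), rng.normal(0, 10, n)    # wind components (m/s)
surge = 0.002 * (1013 - slp) + 0.03 * u + 0.05 * v + rng.normal(0, 0.1, n)

A = np.column_stack([np.ones(n), slp, u, v])
coef, *_ = np.linalg.lstsq(A, surge, rcond=None)
print("intercept, b_slp, b_u, b_v =", np.round(coef, 4))
```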

  13. Hyperspectral imaging using a color camera and its application for pathogen detection

    NASA Astrophysics Data System (ADS)

    Yoon, Seung-Chul; Shin, Tae-Sung; Heitschmidt, Gerald W.; Lawrence, Kurt C.; Park, Bosoon; Gamble, Gary

    2015-02-01

    This paper reports the results of a feasibility study for the development of a hyperspectral image recovery (reconstruction) technique using an RGB color camera and regression analysis in order to detect and classify colonies of foodborne pathogens. The target bacterial pathogens were the six representative non-O157 Shiga-toxin producing Escherichia coli (STEC) serogroups (O26, O45, O103, O111, O121, and O145) grown in Petri dishes of Rainbow agar. The purpose of the feasibility study was to evaluate whether a DSLR camera (Nikon D700) could be used to predict hyperspectral images in the wavelength range from 400 to 1,000 nm and even to predict the types of pathogens using a hyperspectral STEC classification algorithm that was previously developed. Unlike many other studies using color charts with known and noise-free spectra for training reconstruction models, this work used hyperspectral and color images, separately measured by a hyperspectral imaging spectrometer and the DSLR color camera. The color images were calibrated (i.e. normalized) to relative reflectance, subsampled and spatially registered to match with counterpart pixels in hyperspectral images that were also calibrated to relative reflectance. Polynomial multivariate least-squares regression (PMLR) was previously developed with simulated color images. In this study, partial least squares regression (PLSR) was also evaluated as a spectral recovery technique to minimize multicollinearity and overfitting. The two spectral recovery models (PMLR and PLSR) and their parameters were evaluated by cross-validation. The QR decomposition was used to find a numerically more stable solution of the regression equation. The preliminary results showed that PLSR was more effective especially with higher order polynomial regressions than PMLR. The best classification accuracy measured with an independent test set was about 90%. The results suggest the potential of cost-effective color imaging using hyperspectral image classification algorithms for rapidly differentiating pathogens in agar plates.
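    Spectral recovery by PLSR amounts to a multi-output regression from three camera bands to many narrow bands. A hedged scikit-learn sketch on synthetic spectra (the band count, mixing weights, and data are assumptions):

```python
# PLSR mapping 3-band (RGB-like) responses back to a full spectrum per pixel.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n_pix, n_bands = 1000, 61
spectra = np.cumsum(rng.normal(size=(n_pix, n_bands)), axis=1)   # smooth-ish
rgb = spectra @ rng.random((n_bands, 3)) / n_bands               # 3-band proxy

X_tr, X_te, Y_tr, Y_te = train_test_split(rgb, spectra, random_state=0)
pls = PLSRegression(n_components=3).fit(X_tr, Y_tr)
Y_hat = pls.predict(X_te)
rmse = np.sqrt(((Y_hat - Y_te) ** 2).mean())
print("per-band reconstruction RMSE:", round(rmse, 3))
```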

  14. Linear models for calculating digestible energy for sheep diets.

    PubMed

    Fonnesbeck, P V; Christiansen, M L; Harris, L E

    1981-05-01

    Equations for estimating the digestible energy (DE) content of sheep diets were generated from the chemical contents and a factorial description of diets fed to lambs in digestion trials. The diet factors were two forages (alfalfa and grass hay), harvested at three stages of maturity (late vegetative, early bloom and full bloom), fed in two ingredient combinations (all hay or a 50:50 hay and corn grain mixture) and prepared by two forage texture processes (coarsely chopped or finely chopped and pelleted). The 2 x 3 x 2 x 2 factorial arrangement produced 24 diet treatments. These were replicated twice, for a total of 48 lamb digestion trials. In model 1 regression equations, DE was calculated directly from chemical composition of the diet. In model 2, regression equations predicted the percentage of digested nutrient from the chemical contents of the diet and then DE of the diet was calculated as the sum of the gross energy of the digested organic components. Expanded forms of model 1 and model 2 were also developed that included diet factors as qualitative indicator variables to adjust the regression constant and regression coefficients for the diet description. The expanded forms of the equations accounted for significantly more variation in DE than did the simple models and more accurately estimated DE of the diet. Information provided by the diet description proved as useful as chemical analyses for the prediction of digestibility of nutrients. The statistics indicate that, with model 1, neutral detergent fiber and plant cell wall analyses provided as much information for the estimation of DE as did model 2 with the combined information from crude protein, available carbohydrate, total lipid, cellulose and hemicellulose. Regression equations are presented for estimating DE with the most currently analyzed organic components, including linear and curvilinear variables and diet factors that significantly reduce the standard error of the estimate. To estimate DE of a diet, the user selects the equation that uses the chemical analysis information and diet description most effectively.

  15. Age estimation by dentin translucency measurement using digital method: An institutional study

    PubMed Central

    Gupta, Shalini; Chandra, Akhilesh; Agnihotri, Archana; Gupta, Om Prakash; Maurya, Niharika

    2017-01-01

    Aims: The aims of the present study were to measure translucency on sectioned teeth using available computer hardware and software, to correlate dimensions of root dentin translucency with age, and to assess whether translucency is reliable for age estimation. Materials and Methods: A pilot study was done on 62 freshly extracted single-rooted permanent teeth from 62 different individuals (35 males and 27 females) and their 250 μm thick sections were prepared by micromotor, carborundum disks, and Arkansas stone. Each tooth section was scanned and the images were opened in the Adobe Photoshop software. Measurement of root dentin translucency (TD length) was done on the scanned image by placing two guides (A and B) along the x-axis of the ABFO No. 2 scale. Unpaired t-test, regression analysis, and Pearson correlation coefficient were used as statistical tools. Results: A linear relationship was observed between TD length and age in the regression analysis. The Pearson correlation analysis showed that there was a positive correlation (r = 0.52, P = 0.0001) between TD length and age. However, no significant (P > 0.05) difference was observed in the TD length between male (8.44 ± 2.92 mm) and female (7.80 ± 2.79 mm) samples. Conclusion: Translucency of the root dentin increases with age and it can be used as a reliable parameter for age estimation. The method used here to digitally select and measure translucent root dentin is more refined, better correlated with age, and produces superior age estimates. PMID:28584476

  16. Quality optimization of H.264/AVC video transmission over noisy environments using a sparse regression framework

    NASA Astrophysics Data System (ADS)

    Pandremmenou, K.; Tziortziotis, N.; Paluri, S.; Zhang, W.; Blekas, K.; Kondi, L. P.; Kumar, S.

    2015-03-01

    We propose the use of the Least Absolute Shrinkage and Selection Operator (LASSO) regression method in order to predict the Cumulative Mean Squared Error (CMSE), incurred by the loss of individual slices in video transmission. We extract a number of quality-relevant features from the H.264/AVC video sequences, which are given as input to the LASSO. This method has the benefit of not only keeping a subset of the features that have the strongest effects towards video quality, but also producing accurate CMSE predictions. Particularly, we study the LASSO regression through two different architectures; the Global LASSO (G.LASSO) and Local LASSO (L.LASSO). In G.LASSO, a single regression model is trained for all slice types together, while in L.LASSO, motivated by the fact that the values for some features are closely dependent on the considered slice type, each slice type has its own regression model, in an effort to improve LASSO's prediction capability. Based on the predicted CMSE values, we group the video slices into four priority classes. Additionally, we consider a video transmission scenario over a noisy channel, where Unequal Error Protection (UEP) is applied to all prioritized slices. The provided results demonstrate the efficiency of LASSO in estimating CMSE with high accuracy, using only a few features.
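    The feature-selection behavior credited to LASSO comes from its L1 penalty zeroing out weak coefficients. A minimal sketch (simulated features and a synthetic target standing in for CMSE; not the paper's feature set):

```python
# LASSO with cross-validated penalty; only truly relevant features survive.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(5)
n, p = 400, 20                            # 20 candidate slice-level features
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:4] = [2.0, -1.5, 1.0, 0.5]          # only a few features matter
y = X @ beta + rng.normal(scale=0.5, size=n)   # stand-in for CMSE

lasso = LassoCV(cv=5).fit(X, y)
kept = np.flatnonzero(lasso.coef_)
print("features retained by the L1 penalty:", kept)
```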

  17. Estimating flood magnitude and frequency at gaged and ungaged sites on streams in Alaska and conterminous basins in Canada, based on data through water year 2012

    USGS Publications Warehouse

    Curran, Janet H.; Barth, Nancy A.; Veilleux, Andrea G.; Ourso, Robert T.

    2016-03-16

    Estimates of the magnitude and frequency of floods are needed across Alaska for engineering design of transportation and water-conveyance structures, flood-insurance studies, flood-plain management, and other water-resource purposes. This report updates methods for estimating flood magnitude and frequency in Alaska and conterminous basins in Canada. Annual peak-flow data through water year 2012 were compiled from 387 streamgages on unregulated streams with at least 10 years of record. Flood-frequency estimates were computed for each streamgage using the Expected Moments Algorithm to fit a Pearson Type III distribution to the logarithms of annual peak flows. A multiple Grubbs-Beck test was used to identify potentially influential low floods in the time series of peak flows for censoring in the flood frequency analysis. For two new regional skew areas, flood-frequency estimates using station skew were computed for stations with at least 25 years of record for use in a Bayesian least-squares regression analysis to determine a regional skew value. The consideration of basin characteristics as explanatory variables for regional skew resulted in improvements in precision too small to warrant the additional model complexity, and a constant model was adopted. Regional Skew Area 1 in eastern-central Alaska had a regional skew of 0.54 and an average variance of prediction of 0.45, corresponding to an effective record length of 22 years. Regional Skew Area 2, encompassing coastal areas bordering the Gulf of Alaska, had a regional skew of 0.18 and an average variance of prediction of 0.12, corresponding to an effective record length of 59 years. Station flood-frequency estimates for study sites in regional skew areas were then recomputed using a weighted skew incorporating the station skew and regional skew. In a new regional skew exclusion area outside the regional skew areas, the density of long-record streamgages was too sparse for regional analysis and station skew was used for all estimates. Final station flood frequency estimates for all study streamgages are presented for the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities. Regional multiple-regression analysis was used to produce equations for estimating flood frequency statistics from explanatory basin characteristics. Basin characteristics, including physical and climatic variables, were updated for all study streamgages using a geographical information system and geospatial source data. Screening for similar-sized nested basins eliminated hydrologically redundant sites, and screening for eligibility for analysis of explanatory variables eliminated regulated peaks, outburst peaks, and sites with indeterminate basin characteristics. An ordinary least-squares regression used flood-frequency statistics and basin characteristics for 341 streamgages (284 in Alaska and 57 in Canada) to determine the most suitable combination of basin characteristics for a flood-frequency regression model and to explore regional grouping of streamgages for explaining variability in flood-frequency statistics across the study area. The most suitable model for explaining flood frequency used drainage area and mean annual precipitation as explanatory variables for the entire study area as a region. Final regression equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability discharge in Alaska and conterminous basins in Canada were developed using a generalized least-squares regression.
The average standard error of prediction for the regression equations for the various annual exceedance probabilities ranged from 69 to 82 percent, and the pseudo-coefficient of determination (pseudo-R2) ranged from 85 to 91 percent.The regional regression equations from this study were incorporated into the U.S. Geological Survey StreamStats program for a limited area of the State—the Cook Inlet Basin. StreamStats is a national web-based geographic information system application that facilitates retrieval of streamflow statistics and associated information. StreamStats retrieves published data for gaged sites and, for user-selected ungaged sites, delineates drainage areas from topographic and hydrographic data, computes basin characteristics, and computes flood frequency estimates using the regional regression equations.
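    As background for the station-level fits described above: a Pearson Type III distribution is fitted to the log10 annual peaks, and the flood quantile for annual exceedance probability p is 10 raised to the fitted distribution's (1-p) quantile. A hedged sketch (method of moments only; the report's Expected Moments Algorithm, low-outlier censoring, and regional-skew weighting are omitted, and the peaks are synthetic):

```python
# Simplified log-Pearson Type III station flood-frequency fit.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
peaks = rng.lognormal(mean=6.0, sigma=0.5, size=40)   # synthetic annual peaks
logq = np.log10(peaks)

skew = stats.skew(logq, bias=False)                   # station skew
dist = stats.pearson3(skew, loc=logq.mean(), scale=logq.std(ddof=1))

for aep in [0.5, 0.1, 0.01]:                          # annual exceedance prob.
    q = 10 ** dist.ppf(1 - aep)
    print(f"{aep:>5.0%} AEP flood: {q:,.0f}")
```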

  18. [Regression on order statistics and its application in estimating nondetects for food exposure assessment].

    PubMed

    Yu, Xiaojin; Liu, Pei; Min, Jie; Chen, Qiguang

    2009-01-01

    To explore the application of regression on order statistics (ROS) in estimating nondetects for food exposure assessment. Regression on order statistics was applied to a cadmium residual data set from global food contaminant monitoring; the mean residual was estimated using SAS programming and compared with the results from substitution methods. The results show that the ROS method clearly performs better than substitution methods, being robust and convenient for subsequent analysis. Regression on order statistics is worth adopting, but more effort should be devoted to the details of its application.
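    In outline, ROS assigns plotting positions to all ranked observations, regresses the log of the detected concentrations on the corresponding normal scores, and imputes the censored values from the fitted line. A simplified sketch under stated assumptions (single detection limit, lognormal data; production implementations such as Helsel's robust ROS handle multiple limits):

```python
# Simplified regression-on-order-statistics estimate of a censored mean.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
conc = rng.lognormal(mean=-1.0, sigma=0.8, size=60)
dl = 0.25                                    # single detection limit (assumed)
detected = conc[conc >= dl]
n_nd = (conc < dl).sum()                     # number of nondetects
n = conc.size

# Plotting positions for all ranks; detects occupy the upper ranks
pp = (np.arange(1, n + 1) - 0.375) / (n + 0.25)   # Blom positions
z = stats.norm.ppf(pp)

# Regress log(detected concentration) on normal scores of the upper ranks
slope, intercept, *_ = stats.linregress(z[n_nd:], np.log(np.sort(detected)))

# Impute nondetects from the fitted line at the lower ranks
imputed = np.exp(intercept + slope * z[:n_nd])
mean_ros = np.concatenate([imputed, detected]).mean()
print("ROS estimate of the mean:", round(mean_ros, 4))
```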

  19. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis

    PubMed Central

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655

  20. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    PubMed

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
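    The three-way comparison reported here can be reproduced in a few lines once predicted probabilities are in hand. A hedged sketch (simulated covariates standing in for parity, spacing, income, and so on; statsmodels for logit/probit, scikit-learn for LDA):

```python
# AUC comparison of logistic, probit, and linear discriminant models.
import numpy as np
import statsmodels.api as sm
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(8)
n = 887
X = rng.normal(size=(n, 4))                       # placeholder covariates
p = 1 / (1 + np.exp(-(X @ np.array([0.8, -0.6, 0.4, 0.0]) - 1.0)))
y = rng.binomial(1, p)                            # 1 = unwanted pregnancy
Xc = sm.add_constant(X)

logit = sm.Logit(y, Xc).fit(disp=0)
probit = sm.Probit(y, Xc).fit(disp=0)
lda = LinearDiscriminantAnalysis().fit(X, y)

print("logit AUC :", round(roc_auc_score(y, logit.predict(Xc)), 3))
print("probit AUC:", round(roc_auc_score(y, probit.predict(Xc)), 3))
print("LDA AUC   :", round(roc_auc_score(y, lda.predict_proba(X)[:, 1]), 3))
```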

  1. Characterization of vegetation by microwave and optical remote sensing

    NASA Technical Reports Server (NTRS)

    Daughtry, C. S. T. (Principal Investigator); Ranson, K. J.; Biehl, L. L.

    1986-01-01

    Two series of carefully controlled experiments were conducted. First, plots of important crops (corn, soybeans, and sorghum), prairie grasses (big bluestem, switchgrass, tall fescue, orchardgrass, bromegrass), and forage legumes (alfalfa, red clover, and crown vetch) were manipulated to produce wide ranges of phytomass, leaf area index, and canopy architecture. Second, coniferous forest canopies were simulated using small balsam fir trees grown in large pots of soil and arranged systematically on a large (5 m) platform. Rotating the platform produced many new canopies for frequency and spatial averaging of the backscatter signal. In both series of experiments, backscatter at 5.0 GHz (C-band) was measured as a function of view angle and polarization. Biophysical measurements included leaf area index, fresh and dry phytomass, water content of canopy elements, canopy height, and soil roughness and moisture content. For a subset of the above plots, additional measurements were acquired to exercise microwave backscatter models. These measurements included size and shape of leaves, stems, and fruit and the probability density function of leaf and stem angles. The relationships of the backscattering coefficients and the biophysical properties of the canopies were evaluated using statistical correlations, analysis of variance, and regression analysis. Results from the corn density and balsam fir experiments are discussed and analyses of data from the other experiments are summarized.

  2. Effect of playing tactics on achieving score-box possessions in a random series of team possessions from Norwegian professional soccer matches.

    PubMed

    Tenga, Albin; Holme, Ingar; Ronglan, Lars Tore; Bahr, Roald

    2010-02-01

    Methods of analysis that include an assessment of opponent interactions are thought to provide a more valid means of assessing team match performance. The purpose of this study was to examine the effect of playing tactics on achieving score-box possession by assessing opponent interactions in Norwegian elite soccer matches. We analysed a random series of 1703 team possessions from 163 of 182 (90%) matches played in the professional men's league during the 2004 season. Multidimensional qualitative data obtained from ten ordered categorical variables were used. Offensive tactics were more effective in producing score-box possessions when playing against an imbalanced defence (28.5%) than against a balanced defence (6.5%) (P < 0.001). Multiple logistic regression found that, for the main variable "team possession type", counterattacks were more effective than elaborate attacks when playing against an imbalanced defence (odds ratio: 2.69; 95% confidence interval: 1.64 to 4.43) but not against a balanced defence (odds ratio: 1.14; 95% confidence interval: 0.47 to 2.76). Assessment of opponent interactions is critical to evaluate the effectiveness of offensive playing tactics on producing score-box possessions, and improves the validity of team match-performance analysis in soccer.

  3. A comparison of two adaptive multivariate analysis methods (PLSR and ANN) for winter wheat yield forecasting using Landsat-8 OLI images

    NASA Astrophysics Data System (ADS)

    Chen, Pengfei; Jing, Qi

    2017-02-01

    This study proposed and tested the assumption that a non-linear method is more reasonable than a linear method when canopy reflectance is used to establish a yield prediction model. For this purpose, partial least squares regression (PLSR) and artificial neural networks (ANN), representing linear and non-linear analysis methods respectively, were applied and compared for wheat yield prediction. Multi-period Landsat-8 OLI images were collected at two different wheat growth stages, and a field campaign was conducted to obtain grain yields at selected sampling sites in 2014. The field data were divided into a calibration database and a testing database. Using the calibration data, a cross-validation concept was introduced for the PLSR and ANN model construction to prevent over-fitting. All models were tested using the test data. The ANN yield-prediction model produced R2, RMSE and RMSE% values of 0.61, 979 kg ha-1, and 10.38%, respectively, in the testing phase, performing better than the PLSR yield-prediction model, which produced R2, RMSE, and RMSE% values of 0.39, 1211 kg ha-1, and 12.84%, respectively. The non-linear method was therefore suggested as the better method for yield prediction.
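    A hedged sketch of the comparison (scikit-learn; simulated reflectance stands in for the OLI bands, and the small network is an assumption, not the paper's architecture):

```python
# Cross-validated PLSR vs. small ANN for a synthetic yield-prediction task.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(9)
n, bands = 120, 6                                   # 6 OLI reflective bands
X = rng.random((n, bands))
y = 3000 + 4000 * np.tanh(X @ rng.random(bands)) + rng.normal(0, 300, n)

for name, model in [("PLSR", PLSRegression(n_components=4)),
                    ("ANN", MLPRegressor(hidden_layer_sizes=(8,),
                                         max_iter=5000, random_state=0))]:
    pred = cross_val_predict(model, X, y, cv=5).ravel()
    rmse = np.sqrt(((pred - y) ** 2).mean())
    print(f"{name}: cross-validated RMSE = {rmse:.0f} kg/ha")
```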

  4. Regression Analysis of Physician Distribution to Identify Areas of Need: Some Preliminary Findings.

    ERIC Educational Resources Information Center

    Morgan, Bruce B.; And Others

    A regression analysis was conducted of factors that help to explain the variance in physician distribution and which identify those factors that influence the maldistribution of physicians. Models were developed for different geographic areas to determine the most appropriate unit of analysis for the Western Missouri Area Health Education Center…

  5. Criteria for the use of regression analysis for remote sensing of sediment and pollutants

    NASA Technical Reports Server (NTRS)

    Whitlock, C. H.; Kuo, C. Y.; Lecroy, S. R. (Principal Investigator)

    1982-01-01

    Data analysis procedures for quantification of water quality parameters that are already identified and are known to exist within the water body are considered. The linear multiple-regression technique was examined as a procedure for defining and calibrating data analysis algorithms for such instruments as spectrometers and multispectral scanners.

  6. The Analysis of the Regression-Discontinuity Design in R

    ERIC Educational Resources Information Center

    Thoemmes, Felix; Liao, Wang; Jin, Ze

    2017-01-01

    This article describes the analysis of regression-discontinuity designs (RDDs) using the R packages rdd, rdrobust, and rddtools. We discuss similarities and differences between these packages and provide directions on how to use them effectively. We use real data from the Carolina Abecedarian Project to show how an analysis of an RDD can be…

  7. Influencing factors of alexithymia in Chinese medical students: a cross-sectional study.

    PubMed

    Zhu, Yaxin; Luo, Ting; Liu, Jie; Qu, Bo

    2017-04-04

    A much higher prevalence of alexithymia has been reported in medical students compared with the general population, and alexithymia is a risk factor that increases vulnerability to mental disorders. Our aim was to evaluate the level of alexithymia in Chinese medical students and to explore its influencing factors. A cross-sectional study of 1,950 medical students at Shenyang Medical College was conducted in May 2014 to evaluate alexithymia using the Chinese version of the 20-item Toronto Alexithymia Scale (TAS-20). The reliability of the questionnaire was assessed by Cronbach's α coefficient and mean inter-item correlations. Confirmatory factor analysis (CFA) was used to evaluate construct validity. The relationships between alexithymia and influencing factors were examined using Student's t-test, analysis of variance, and multiple linear regression analysis. Statistical analysis was performed using SPSS 21.0. Of the 1,950 medical students, 1,886 (96.7%) completed questionnaires. Overall, Cronbach's α coefficient of the TAS-20 questionnaire was 0.868. The results of CFA showed that the original three-factor structure produced an acceptable fit to the data. By univariate analysis, gender, grade (academic year of study), smoking behavior, alcohol use, physical activity, history of living with parents during childhood, and childhood trauma were influencing factors of TAS-20 scores (p < 0.05). Multiple linear regression analysis showed that gender, physical activity, grade, living with parents, and childhood trauma also had a statistically significant association with total TAS-20 score (p < 0.05). Gender, physical activity, grade, history of living with parents during childhood, and childhood trauma were all factors determining the level of alexithymia. To prevent alexithymia, it will be advisable to promote adequate physical activity and pay greater attention to male medical students and those who are in the final year of training.
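    Cronbach's alpha, used above to assess reliability, is alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score). A minimal sketch (simulated responses; `items` is a hypothetical respondents-by-items matrix standing in for TAS-20 answers):

```python
# Cronbach's alpha from a respondents x items score matrix.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

rng = np.random.default_rng(10)
latent = rng.normal(size=(500, 1))
items = latent + rng.normal(scale=1.0, size=(500, 20))  # 20 correlated items
print(round(cronbach_alpha(items), 3))
```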

  8. Linear regression analysis of emissions factors when firing fossil fuels and biofuels in a commercial water-tube boiler

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sharon Falcone Miller; Bruce G. Miller

    2007-12-15

    This paper compares the emissions factors for a suite of liquid biofuels (three animal fats, waste restaurant grease, pressed soybean oil, and a biodiesel produced from soybean oil) and four fossil fuels (i.e., natural gas, No. 2 fuel oil, No. 6 fuel oil, and pulverized coal) in Penn State's commercial water-tube boiler to assess their viability as fuels for green heat applications. The data were broken into two subsets, i.e., fossil fuels and biofuels. The regression model for the liquid biofuels (as a subset) did not perform well for all of the gases. In addition, the coefficients in the models showed the EPA method underestimating CO and NOx emissions. No relation could be studied for SO2 for the liquid biofuels as they contain no sulfur; however, the model showed a good relationship between the two methods for SO2 in the fossil fuels. AP-42 emissions factors for the fossil fuels were also compared to the mass balance emissions factors and EPA CFR Title 40 emissions factors. Overall, the AP-42 emissions factors for the fossil fuels did not compare well with the mass balance emissions factors or the EPA CFR Title 40 emissions factors. Regression analysis of the AP-42, EPA, and mass balance emissions factors for the fossil fuels showed a significant relationship only for CO2 and SO2. However, the regression models underestimate the SO2 emissions by 33%. These tests illustrate the importance of performing material balances around boilers to obtain the most accurate emissions levels, especially when dealing with biofuels. The EPA emissions factors were very good at predicting the mass balance emissions factors for the fossil fuels and, to a lesser degree, the biofuels. While the AP-42 emissions factors and EPA CFR Title 40 emissions factors are easier to perform, especially in large, full-scale systems, this study illustrated the shortcomings of estimation techniques. 23 refs., 3 figs., 8 tabs.

  9. Mean Expected Error in Prediction of Total Body Water: A True Accuracy Comparison between Bioimpedance Spectroscopy and Single Frequency Regression Equations

    PubMed Central

    Abtahi, Shirin; Abtahi, Farhad; Ellegård, Lars; Johannsson, Gudmundur; Bosaeus, Ingvar

    2015-01-01

    For several decades electrical bioimpedance (EBI) has been used to assess body fluid distribution and body composition. Despite the development of several different approaches for assessing total body water (TBW), it remains uncertain whether bioimpedance spectroscopic (BIS) approaches are more accurate than single frequency regression equations. The main objective of this study was to answer this question by calculating the expected accuracy of a single measurement for different EBI methods. The results of this study showed that all methods produced similarly high correlation and concordance coefficients, indicating good accuracy as a method. Even the limits of agreement produced from the Bland-Altman analysis indicated that the performance of the single-frequency Sun prediction equations at the population level was close to the performance of both BIS methods; however, when comparing the Mean Absolute Percentage Error value between the single frequency prediction equations and the BIS methods, a significant difference was obtained, indicating slightly better accuracy for the BIS methods. Despite the higher accuracy of BIS methods over 50 kHz prediction equations at both population and individual level, the magnitude of the improvement was small. Such a slight improvement in the accuracy of BIS methods is suggested to be insufficient to warrant their clinical use where the most accurate predictions of TBW are required, for example, when assessing over-fluidic status on dialysis. To reach expected errors below 4-5%, novel and individualized approaches must be developed to improve the accuracy of bioimpedance-based methods for the advent of innovative personalized health monitoring applications. PMID:26137489
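    The two agreement statistics used above are quick to compute: Bland-Altman limits of agreement are bias ± 1.96 SD of the paired differences, and MAPE is the mean of |difference|/reference. A sketch on simulated TBW values (numbers are placeholders, not the study's data):

```python
# Bland-Altman limits of agreement and MAPE for paired TBW estimates.
import numpy as np

rng = np.random.default_rng(11)
tbw_ref = rng.normal(40, 6, size=80)                 # reference TBW (L)
tbw_est = tbw_ref + rng.normal(0.5, 1.5, size=80)    # method under test

diff = tbw_est - tbw_ref
bias = diff.mean()
loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))
mape = np.mean(np.abs(diff / tbw_ref)) * 100

print(f"bias={bias:.2f} L, 95% LoA=({loa[0]:.2f}, {loa[1]:.2f}) L, "
      f"MAPE={mape:.1f}%")
```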

  10. An evaluation of asthma interventions for preteen students.

    PubMed

    Clark, Noreen M; Shah, Smita; Dodge, Julia A; Thomas, Lara J; Andridge, Rebecca R; Little, Roderick J A

    2010-02-01

    Asthma is a serious problem for low-income preteens living in disadvantaged communities. Among the chronic diseases of childhood and adolescence, asthma has the highest prevalence and related health care use. School-based asthma interventions have proven successful for older and younger students, but results have not been demonstrated for those in middle school. This randomized controlled study screened students 10-13 years of age in 19 middle schools in low-income communities in Detroit, Michigan. Of the 6,872 students who were screened, 1,292 students were identified with asthma. Schools were matched and randomly assigned to Program 1 or 2 or control. Baseline, 12, and 24 months data were collected by telephone (parents), at school (students) and from school system records. Measures were the students' asthma symptoms, quality of life, academic performance, self-regulation, and asthma management practices. Data were analyzed using multiple imputation with sequential regression analysis. Mixed models and Poisson regressions were used to develop final models. Neither program produced significant change in asthma symptoms or quality of life. One produced improved school grades (p = .02). The other enhanced self-regulation (p = .01) at 24 months. Both slowed the decline in self-regulation in undiagnosed preteens at 12 months and increased self-regulation at 24 months (p = .04; p = .003). Programs had effects on academic performance and self-regulation capacities of students. More developmentally focused interventions may be needed for students at this transitional stage. Disruptive factors in the schools may have reduced both program impact and the potential for outcome assessment.

  11. An Evaluation of Asthma Interventions for Preteen Students

    PubMed Central

    Clark, Noreen M.; Shah, Smita; Dodge, Julia A.; Thomas, Lara J.; Andridge, Rebecca R.; Little, Roderick J.A.

    2013-01-01

    Background Asthma is a serious problem for low-income preteens living in disadvantaged communities. Asthma prevalence and health care use are the highest of the chronic diseases of childhood and adolescence. School-based asthma interventions have proven successful for older and younger students, but results have not been demonstrated for those in middle school. Methods This randomized controlled study involved 6872 students 10–13 years of age and assessed two programs, 1) self-management and 2) self-management plus peer involvement, provided in 19 middle schools in low-income communities. 1292 students were identified with asthma. Schools were matched and randomly assigned to program one or two or control. Baseline, 12, and 24 months data were collected by telephone (parents), at school (students) and from school system records. Measures were the students' asthma symptoms, quality of life, academic performance, self-regulation and asthma management practices. Data were analyzed using multiple imputation with sequential regression analysis. Mixed models and Poisson regressions were used to develop final models. Results Neither program produced change in asthma symptoms or quality of life. One produced improved school grades (p=0.02). The other enhanced self-regulation (p=0.01) at 24 months. Both slowed the decline in self-regulation in undiagnosed preteens at 12 months and increased self-regulation at 24 months (p=0.04; p=0.003). Conclusion Programs had effects on academic performance and self-regulation capacities of students. More developmentally focused interventions may be needed for students at this transitional stage. Disruptive factors in the schools may have reduced both program impact and the potential for outcome assessment. PMID:20236406

  12. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  13. Estimating regression coefficients from clustered samples: Sampling errors and optimum sample allocation

    NASA Technical Reports Server (NTRS)

    Kalton, G.

    1983-01-01

    A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design, which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratios of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, the formulae also determine the optimum allocation of the sample across the stages of the design for estimating a regression coefficient.
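
    The report's formulae are design-specific; as a hedged illustration of the general form such results take, the widely cited Scott-Holt first-order approximation for the variance inflation of an OLS slope under two-stage cluster sampling is:

```latex
% Assumed illustration (Scott & Holt style approximation), not the
% report's exact formula:
\[
  \operatorname{deff}\!\left(\hat{\beta}\right) \;\approx\;
  1 + \left(\bar{b} - 1\right)\rho_x \rho_e ,
\]
% where $\bar{b}$ is the average cluster sample size and $\rho_x$, $\rho_e$
% are the intraclass correlations of the regressor and of the residuals.
```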

  14. Regression Model for Light Weight and Crashworthiness Enhancement Design of Automotive Parts in Frontal CAR Crash

    NASA Astrophysics Data System (ADS)

    Bae, Gihyun; Huh, Hoon; Park, Sungho

    This paper deals with a regression model for the lightweight and crashworthiness enhancement design of automotive parts in a frontal car crash. The ULSAB-AVC model is employed for the crash analysis, and the effective parts are selected based on the amount of energy absorbed during the crash. Finite element analyses are carried out for the designated design cases in order to investigate the crashworthiness and weight according to the material and thickness of the main energy-absorbing parts. Based on the simulation results, a regression analysis is performed to construct a regression model for the lightweight and crashworthiness enhancement design of automotive parts. An example of weight reduction in the main energy-absorbing parts demonstrates the validity of the constructed regression model.

  15. Temperature-Dependent Survival of Hepatitis A Virus during Storage of Contaminated Onions

    PubMed Central

    Sun, Y.; Laird, D. T.

    2012-01-01

    Pre- or postharvest contamination of green onions by hepatitis A virus (HAV) has been linked to large numbers of food-borne illnesses. Understanding HAV survival in onions would assist in projecting the risk of the disease associated with their consumption. This study defined HAV inactivation rates in contaminated green onions contained in air-permeable, moisture-retaining high-density polyethylene packages that were stored at 3, 10, 14, 20, 21, 22, and 23°C. A protocol was established to recover HAV from whole green onions, with 31% as the average recovery by infectivity assay. Viruses in eluates were primarily analyzed by a 6-well plaque assay on FRhK-4 cells. Eight storage trials, including two trials at 3°C, were conducted, with 3 to 7 onion samples per sampling and 4 to 7 samplings per trial. Linear regression correlation (r2 = 0.80 to 0.98) was observed between HAV survival and storage time for each of the 8 trials, held at specific temperatures. Increases in the storage temperature resulted in greater HAV inactivation rates, e.g., a reduction of 0.033 log PFU/day at 3.4 ± 0.3°C versus 0.185 log PFU/day at 23.4 ± 0.7°C. Thus, decimal reduction time (D) values of 30, 14, 11, and 5 days, respectively, were obtained for HAV in onions stored at 3, 10, 14, and 23°C. Further regression analysis determined that each 1°C increase in storage temperature would increase inactivation of HAV by 0.007 log PFU/day in onions (r2 = 0.97). The data suggest that natural degradation of HAV in contaminated fresh produce is minimal and that a preventive strategy is critical to produce safety. The results are useful in predicting the risks associated with HAV contamination in fresh produce. PMID:22544253
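
    As a quick check of how the reported decimal reduction times follow from the fitted inactivation rates, a minimal sketch using the values quoted in the abstract above:

```python
# Under first-order (log-linear) inactivation, the decimal reduction time is
# the reciprocal of the slope: D = 1 / k, with k in log10 PFU per day.
rates = {3: 0.033, 23: 0.185}  # fitted inactivation rates from the abstract
for temp_c, k in rates.items():
    print(f"{temp_c} deg C: D = {1 / k:.0f} days")
# prints ~30 days at 3 deg C and ~5 days at 23 deg C, matching the reported D values
```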

  16. Logistic regression analysis of factors associated with avascular necrosis of the femoral head following femoral neck fractures in middle-aged and elderly patients.

    PubMed

    Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua

    2013-03-01

    Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single-factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2%). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were found to interact in the two-factor interaction analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single-factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In the detailed interaction analysis following multivariate logistic regression, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95% CI = 3.076-219.747). Middle-aged and elderly patients have a lower incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly higher incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction helps to reduce avascular necrosis.
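
    A minimal sketch of the kind of interaction model described, using statsmodels with hypothetical variable names and simulated data (not the study's dataset):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical recoding: avn (0/1 outcome), age in years, removed (implant
# removal, 0/1), reduction (quality-of-reduction score). Data are simulated.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(46, 86, 200),
    "removed": rng.integers(0, 2, 200),
    "reduction": rng.integers(1, 4, 200),
})
logit_p = -9 + 0.05 * df.age + 0.05 * df.age * df.removed - 0.5 * df.reduction
df["avn"] = (rng.random(200) < 1 / (1 + np.exp(-logit_p))).astype(int)

# "age * removed" expands to both main effects plus the age:removed interaction
fit = smf.logit("avn ~ age * removed + reduction", data=df).fit(disp=0)
print(np.exp(fit.params))  # exponentiated coefficients as odds ratios
```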

  17. Two Paradoxes in Linear Regression Analysis.

    PubMed

    Feng, Ge; Peng, Jing; Tu, Dongke; Zheng, Julia Z; Feng, Changyong

    2016-12-25

    Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection.

  18. The Increase of Energy Consumption and Carbon Dioxide (CO2) Emission in Indonesia

    NASA Astrophysics Data System (ADS)

    Sasana, Hadi; Putri, Annisa Eka

    2018-02-01

    In the last decade, the increase in energy consumption that has multiplied carbon dioxide emissions has become a world problem, especially in developing countries undergoing industrialization on the way to becoming developed ones, like Indonesia. The aim of this study was to analyze the effect of fossil energy consumption, population growth, and consumption of renewable energy on carbon dioxide emission. The method used was multiple linear regression analysis with the Ordinary Least Square approach, using time-series data for the period 1990-2014. The results showed that fossil energy consumption and population growth have a positive influence on carbon dioxide emission in Indonesia. Meanwhile, the consumption of renewable energy has a negative effect on the level of carbon dioxide emissions produced.

  19. Multiple Linear Regression Analysis of Factors Affecting Real Property Price Index From Case Study Research In Istanbul/Turkey

    NASA Astrophysics Data System (ADS)

    Denli, H. H.; Koc, Z.

    2015-12-01

    Standards-based estimation of real property values is difficult to apply consistently across time and location. Regression analysis constructs mathematical models which describe or explain relationships that may exist between variables. The problem of identifying price differences of properties to obtain a price index can be converted into a regression problem, and standard techniques of regression analysis can be used to estimate the index. Applied to real estate valuation, where properties are presented in the market with their current characteristics and quantifiers, the method helps identify the factors or variables that are effective in the formation of value. In this study, prices of housing for sale in Zeytinburnu, a district in Istanbul, are associated with their characteristics to find a price index, based on information received from a real estate web page. The variables used for the analysis are age, size in m2, number of floors in the building, floor on which the unit is located, and number of rooms. The price of the estate is the dependent variable, whereas the rest are independent variables. Prices from 60 real estates have been used for the analysis. Locations with equal estimated values have been plotted on the map, and equivalence curves have been drawn to delineate zones of equal value.
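
    A minimal hedonic-regression sketch of the analysis described, with hypothetical column names and simulated listings standing in for the 60 Zeytinburnu records:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the 60 listings; column names are hypothetical.
rng = np.random.default_rng(1)
n = 60
listings = pd.DataFrame({
    "age": rng.integers(0, 40, n),
    "size_m2": rng.integers(50, 200, n),
    "building_floors": rng.integers(2, 12, n),
    "unit_floor": rng.integers(0, 10, n),
    "rooms": rng.integers(1, 5, n),
})
listings["price"] = (2000 * listings.size_m2 - 5000 * listings.age
                     + 30000 * listings.rooms + rng.normal(0, 50000, n))

fit = smf.ols("price ~ age + size_m2 + building_floors + unit_floor + rooms",
              data=listings).fit()
print(fit.params)  # each coefficient is an attribute's contribution to price
```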

  20. Using a Modification of the Capture-Recapture Model To Estimate the Need for Substance Abuse Treatment.

    ERIC Educational Resources Information Center

    Maxwell, Jane Carlisle; Pullum, Thomas W.

    2001-01-01

    Applied the capture-recapture model, through Poisson regression, to a time series of data for admissions to treatment from 1987 to 1996 to estimate the number of heroin addicts in Texas who are "at-risk" for treatment. The entire data set produced estimates that were lower and more plausible than those produced by drawing samples,…

  1. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. © 2013, The International Biometric Society.

  2. [Local Regression Algorithm Based on Net Analyte Signal and Its Application in Near Infrared Spectral Analysis].

    PubMed

    Zhang, Hong-guang; Lu, Jian-gang

    2016-02-01

    To overcome the problems of significant differences among samples and of nonlinearity between the property and spectra of samples in spectral quantitative analysis, a local regression algorithm is proposed in this paper. In this algorithm, the net analyte signal (NAS) method was first used to obtain the net analyte signal of the calibration samples and unknown samples; then the Euclidean distance between the net analyte signal of each unknown sample and those of the calibration samples was calculated and used as a similarity index. According to the defined similarity index, a local calibration set was individually selected for each unknown sample. Finally, a local PLS regression model was built on each local calibration set for each unknown sample. The proposed method was applied to a set of near infrared spectra of meat samples. The results demonstrate that the prediction precision and model complexity of the proposed method are superior to those of the global PLS regression method and of a conventional local regression algorithm based on spectral Euclidean distance.
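
    A minimal sketch of the local-regression idea, assuming the net analyte signals have already been computed (the NAS transformation itself is not shown); names and parameter values are illustrative:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def local_pls_predict(nas_cal, y_cal, nas_new, k=30, n_components=5):
    """For one unknown sample: rank calibration samples by the Euclidean
    distance between net analyte signals (the similarity index), build a
    local calibration set from the k most similar samples, and fit a
    local PLS model just for this sample."""
    dist = np.linalg.norm(nas_cal - nas_new, axis=1)
    local = np.argsort(dist)[:k]
    pls = PLSRegression(n_components=n_components).fit(nas_cal[local], y_cal[local])
    return float(pls.predict(nas_new[None, :])[0, 0])

# usage with simulated stand-ins for NAS-transformed spectra
rng = np.random.default_rng(2)
nas_cal = rng.normal(size=(120, 50))
y_cal = nas_cal[:, :5].sum(axis=1) + 0.1 * rng.standard_normal(120)
print(local_pls_predict(nas_cal, y_cal, nas_cal[0]))
```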

  3. Multilayer Perceptron for Robust Nonlinear Interval Regression Analysis Using Genetic Algorithms

    PubMed Central

    2014-01-01

    On the basis of fuzzy regression, computational intelligence models such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses a multilayer perceptron to construct the robust nonlinear interval regression model using a genetic algorithm. Outliers beyond or beneath the data interval impose only a slight effect on the determination of the data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets. PMID:25110755

  4. Multilayer perceptron for robust nonlinear interval regression analysis using genetic algorithms.

    PubMed

    Hu, Yi-Chung

    2014-01-01

    On the basis of fuzzy regression, computational intelligence models such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses a multilayer perceptron to construct the robust nonlinear interval regression model using a genetic algorithm. Outliers beyond or beneath the data interval impose only a slight effect on the determination of the data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets.

  5. Contamination of Fresh Produce by Microbial Indicators on Farms and in Packing Facilities: Elucidation of Environmental Routes

    PubMed Central

    Bartz, Faith E.; Lickness, Jacquelyn Sunshine; Heredia, Norma; Fabiszewski de Aceituno, Anna; Newman, Kira L.; Hodge, Domonique Watson; Jaykus, Lee-Ann; García, Santos

    2017-01-01

    To improve food safety on farms, it is critical to quantify the impact of environmental microbial contamination sources on fresh produce. However, studies are hampered by difficulties achieving study designs with powered sample sizes to elucidate relationships between environmental and produce contamination. Our goal was to quantify, in the agricultural production environment, the relationship between microbial contamination on hands, soil, and water and contamination on fresh produce. In 11 farms and packing facilities in northern Mexico, we applied a matched study design: composite samples (n = 636, equivalent to 11,046 units) of produce rinses were matched to water, soil, and worker hand rinses during two growing seasons. Microbial indicators (coliforms, Escherichia coli, Enterococcus spp., and somatic coliphage) were quantified from composite samples. Statistical measures of association and correlations were calculated through Spearman's correlation, linear regression, and logistic regression models. The concentrations of all microbial indicators were positively correlated between produce and hands (ρ range, 0.41 to 0.75; P < 0.01). When E. coli was present on hands, the handled produce was nine times more likely to contain E. coli (P < 0.05). Similarly, when coliphage was present on hands, the handled produce was eight times more likely to contain coliphage (P < 0.05). There were relatively low concentrations of indicators in soil and water samples, and a few sporadic significant associations were observed between contamination of soil and water and contamination of produce. This methodology provides a foundation for future field studies, and results highlight the need for interventions surrounding farmworker hygiene and sanitation to reduce microbial contamination of farmworkers' hands. IMPORTANCE This study of the relationships between microbes on produce and in the farm environment can be used to support the design of targeted interventions to prevent or reduce microbial contamination of fresh produce with associated reductions in foodborne illness. PMID:28363965

  6. Contamination of Fresh Produce by Microbial Indicators on Farms and in Packing Facilities: Elucidation of Environmental Routes.

    PubMed

    Bartz, Faith E; Lickness, Jacquelyn Sunshine; Heredia, Norma; Fabiszewski de Aceituno, Anna; Newman, Kira L; Hodge, Domonique Watson; Jaykus, Lee-Ann; García, Santos; Leon, Juan S

    2017-06-01

    To improve food safety on farms, it is critical to quantify the impact of environmental microbial contamination sources on fresh produce. However, studies are hampered by difficulties achieving study designs with powered sample sizes to elucidate relationships between environmental and produce contamination. Our goal was to quantify, in the agricultural production environment, the relationship between microbial contamination on hands, soil, and water and contamination on fresh produce. In 11 farms and packing facilities in northern Mexico, we applied a matched study design: composite samples (n = 636, equivalent to 11,046 units) of produce rinses were matched to water, soil, and worker hand rinses during two growing seasons. Microbial indicators (coliforms, Escherichia coli, Enterococcus spp., and somatic coliphage) were quantified from composite samples. Statistical measures of association and correlations were calculated through Spearman's correlation, linear regression, and logistic regression models. The concentrations of all microbial indicators were positively correlated between produce and hands (ρ range, 0.41 to 0.75; P < 0.01). When E. coli was present on hands, the handled produce was nine times more likely to contain E. coli (P < 0.05). Similarly, when coliphage was present on hands, the handled produce was eight times more likely to contain coliphage (P < 0.05). There were relatively low concentrations of indicators in soil and water samples, and a few sporadic significant associations were observed between contamination of soil and water and contamination of produce. This methodology provides a foundation for future field studies, and results highlight the need for interventions surrounding farmworker hygiene and sanitation to reduce microbial contamination of farmworkers' hands. IMPORTANCE This study of the relationships between microbes on produce and in the farm environment can be used to support the design of targeted interventions to prevent or reduce microbial contamination of fresh produce with associated reductions in foodborne illness. Copyright © 2017 American Society for Microbiology.

  7. The use of cognitive ability measures as explanatory variables in regression analysis.

    PubMed

    Junker, Brian; Schofield, Lynne Steuerle; Taylor, Lowell J

    2012-12-01

    Cognitive ability measures are often taken as explanatory variables in regression analysis, e.g., as a factor affecting a market outcome such as an individual's wage, or a decision such as an individual's education acquisition. Cognitive ability is a latent construct; its true value is unobserved. Nonetheless, researchers often assume that a test score, constructed via standard psychometric practice from individuals' responses to test items, can be safely used in regression analysis. We examine problems that can arise, and suggest that an alternative approach, a "mixed effects structural equations" (MESE) model, may be more appropriate in many circumstances.

  8. One-year test-retest reliability of intrinsic connectivity network fMRI in older adults

    PubMed Central

    Guo, Cong C.; Kurth, Florian; Zhou, Juan; Mayer, Emeran A.; Eickhoff, Simon B.; Kramer, Joel H.; Seeley, William W.

    2014-01-01

    “Resting-state” or task-free fMRI can assess intrinsic connectivity network (ICN) integrity in health and disease, suggesting a potential for use of these methods as disease-monitoring biomarkers. Numerous analytical options are available, including model-driven ROI-based correlation analysis and model-free, independent component analysis (ICA). High test-retest reliability will be a necessary feature of a successful ICN biomarker, yet available reliability data remains limited. Here, we examined ICN fMRI test-retest reliability in 24 healthy older subjects scanned roughly one year apart. We focused on the salience network, a disease-relevant ICN not previously subjected to reliability analysis. Most ICN analytical methods proved reliable (intraclass coefficients > 0.4) and could be further improved by wavelet analysis. Seed-based ROI correlation analysis showed high map-wise reliability, whereas graph theoretical measures and temporal concatenation group ICA produced the most reliable individual unit-wise outcomes. Including global signal regression in ROI-based correlation analyses reduced reliability. Our study provides a direct comparison between the most commonly used ICN fMRI methods and potential guidelines for measuring intrinsic connectivity in aging control and patient populations over time. PMID:22446491

  9. Portable Electronic Tongue Based on Microsensors for the Analysis of Cava Wines.

    PubMed

    Giménez-Gómez, Pablo; Escudé-Pujol, Roger; Capdevila, Fina; Puig-Pujol, Anna; Jiménez-Jorquera, Cecilia; Gutiérrez-Capitán, Manuel

    2016-10-27

    Cava is a quality sparkling wine produced in Spain. As a product with a designation of origin, Cava wine has to meet certain quality requirements throughout its production process; therefore, the analysis of several parameters is of great interest. In this work, a portable electronic tongue for the analysis of Cava wine is described. The system is comprised of compact and low-power-consumption electronic equipment and an array of microsensors formed by six ion-selective field effect transistors sensitive to pH, Na+, K+, Ca2+, Cl−, and CO32−, one conductivity sensor, one redox potential sensor, and two amperometric gold microelectrodes. This system, combined with chemometric tools, has been applied to the analysis of 78 Cava wine samples. Results demonstrate that the electronic tongue is able to classify the samples according to the aging time, with a percentage of correct prediction between 80% and 96%, by using linear discriminant analysis, as well as to quantify the total acidity, pH, volumetric alcoholic degree, potassium, conductivity, glycerol, and methanol parameters, with mean relative errors between 2.3% and 6.0%, by using partial least squares regressions.

  10. Portable Electronic Tongue Based on Microsensors for the Analysis of Cava Wines

    PubMed Central

    Giménez-Gómez, Pablo; Escudé-Pujol, Roger; Capdevila, Fina; Puig-Pujol, Anna; Jiménez-Jorquera, Cecilia; Gutiérrez-Capitán, Manuel

    2016-01-01

    Cava is a quality sparkling wine produced in Spain. As a product with a designation of origin, Cava wine has to meet certain quality requirements throughout its production process; therefore, the analysis of several parameters is of great interest. In this work, a portable electronic tongue for the analysis of Cava wine is described. The system is comprised of compact and low-power-consumption electronic equipment and an array of microsensors formed by six ion-selective field effect transistors sensitive to pH, Na+, K+, Ca2+, Cl−, and CO32−, one conductivity sensor, one redox potential sensor, and two amperometric gold microelectrodes. This system, combined with chemometric tools, has been applied to the analysis of 78 Cava wine samples. Results demonstrate that the electronic tongue is able to classify the samples according to the aging time, with a percentage of correct prediction between 80% and 96%, by using linear discriminant analysis, as well as to quantify the total acidity, pH, volumetric alcoholic degree, potassium, conductivity, glycerol, and methanol parameters, with mean relative errors between 2.3% and 6.0%, by using partial least squares regressions. PMID:27801796

  11. Factor analysis and multiple regression between topography and precipitation on Jeju Island, Korea

    NASA Astrophysics Data System (ADS)

    Um, Myoung-Jin; Yun, Hyeseon; Jeong, Chang-Sam; Heo, Jun-Haeng

    2011-11-01

    In this study, new factors that influence precipitation were extracted from geographic variables using factor analysis, allowing for an accurate estimation of orographic precipitation. Correlation analysis was also used to examine the relationship between nine topographic variables from digital elevation models (DEMs) and the precipitation in Jeju Island. In addition, a spatial analysis was performed in order to verify the validity of the regression model. From the results of the correlation analysis, it was found that all of the topographic variables had a positive correlation with the precipitation. The relations between the variables also changed in accordance with a change in the precipitation duration. However, upon examining the correlation matrix, no significant relationship between the latitude and the aspect was found. According to the factor analysis, eight topographic variables (latitude being the exception) were found to have a direct influence on the precipitation. Three factors were then extracted from the eight topographic variables. By directly comparing the multiple regression model with the factors (model 1) to the multiple regression model with the topographic variables (model 3), it was found that model 1 did not violate the limits of statistical significance and multicollinearity. As such, model 1 was considered to be appropriate for estimating the precipitation when taking into account the topography. In the study of model 1, the multiple regression model using factor analysis was found to be the best method for estimating the orographic precipitation on Jeju Island.
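
    A compact sketch of the factor-analysis-then-regression pipeline (model 1), with simulated stand-ins for the eight DEM-derived variables:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Simulated stand-ins: 200 sites x 8 topographic variables, plus precipitation.
rng = np.random.default_rng(3)
topo = rng.standard_normal((200, 8))
precip = topo[:, :3].sum(axis=1) + 0.5 * rng.standard_normal(200)

Z = StandardScaler().fit_transform(topo)                       # standardize first
scores = FactorAnalysis(n_components=3, random_state=0).fit_transform(Z)
model = LinearRegression().fit(scores, precip)                 # regress on factors
print("R^2 using 3 extracted factors:", round(model.score(scores, precip), 3))
```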

  12. Methods for calculating confidence and credible intervals for the residual between-study variance in random effects meta-regression models

    PubMed Central

    2014-01-01

    Background Meta-regression is becoming increasingly used to model study level covariate effects. However this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
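
    A sketch of the Q-profile construction in the simpler meta-analysis (intercept-only) case; the paper's meta-regression extension replaces the weighted mean with a weighted regression fit and adjusts the degrees of freedom accordingly:

```python
import numpy as np
from scipy import optimize, stats

def generalized_q(tau2, y, v):
    """Cochran's Q evaluated with random-effects weights 1/(v_i + tau2)."""
    w = 1.0 / (v + tau2)
    mu = np.sum(w * y) / np.sum(w)
    return np.sum(w * (y - mu) ** 2)

def q_profile_ci(y, v, level=0.95, upper_search=1e3):
    """Q-profile CI for the between-study variance: invert Q(tau2) ~ chi2(k-1)."""
    k = len(y)
    q_hi = stats.chi2.ppf(1 - (1 - level) / 2, k - 1)
    q_lo = stats.chi2.ppf((1 - level) / 2, k - 1)
    f_low = lambda t: generalized_q(t, y, v) - q_hi   # Q decreases in tau2
    f_upp = lambda t: generalized_q(t, y, v) - q_lo
    low = 0.0 if f_low(0.0) < 0 else optimize.brentq(f_low, 0.0, upper_search)
    upp = 0.0 if f_upp(0.0) < 0 else optimize.brentq(f_upp, 0.0, upper_search)
    return low, upp

y = np.array([0.3, 0.1, 0.5, 0.4, 0.0, 0.6])         # toy effect estimates
v = np.array([0.04, 0.05, 0.03, 0.06, 0.04, 0.05])   # within-study variances
print(q_profile_ci(y, v))
```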

  13. Evaluation of logistic regression models and effect of covariates for case-control study in RNA-Seq analysis.

    PubMed

    Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L

    2017-02-06

    Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
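
    Firth's correction is not part of the standard statsmodels logistic fit; a minimal numpy sketch of the modified-score Newton iteration (a standard formulation, offered as an illustration rather than the paper's implementation):

```python
import numpy as np

def firth_logistic(X, y, max_iter=100, tol=1e-8):
    """Firth-penalized logistic regression: Newton scoring on the modified
    score U*(b) = X'(y - p + h (1/2 - p)), h = leverages of W^(1/2) X."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(max_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = prob * (1.0 - prob)
        xtwx_inv = np.linalg.inv(X.T @ (w[:, None] * X))
        xs = X * np.sqrt(w)[:, None]
        h = np.einsum("ij,jk,ik->i", xs, xtwx_inv, xs)   # hat-matrix diagonal
        step = xtwx_inv @ (X.T @ (y - prob + h * (0.5 - prob)))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# separable toy data, where ordinary ML estimates diverge but Firth's stay finite
X = np.column_stack([np.ones(6), [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]])
y = np.array([0, 0, 0, 1, 1, 1])
print(firth_logistic(X, y))
```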

  14. Methods for estimating the magnitude and frequency of peak streamflows for unregulated streams in Oklahoma

    USGS Publications Warehouse

    Lewis, Jason M.

    2010-01-01

    Peak-streamflow regression equations were determined for estimating flows with exceedance probabilities from 50 to 0.2 percent for the state of Oklahoma. These regression equations incorporate basin characteristics to estimate peak-streamflow magnitude and frequency throughout the state by use of a generalized least squares regression analysis. The most statistically significant independent variables required to estimate peak-streamflow magnitude and frequency for unregulated streams in Oklahoma are contributing drainage area, mean-annual precipitation, and main-channel slope. The regression equations are applicable for watershed basins with drainage areas less than 2,510 square miles that are not affected by regulation. The resulting regression equations had a standard model error ranging from 31 to 46 percent. Annual-maximum peak flows observed at 231 streamflow-gaging stations through water year 2008 were used for the regression analysis. Gage peak-streamflow estimates were used from previous work unless 2008 gaging-station data were available, in which case new peak-streamflow estimates were calculated. The U.S. Geological Survey StreamStats web application was used to obtain the independent variables required for the peak-streamflow regression equations. Limitations on the use of the regression equations and the reliability of regression estimates for natural unregulated streams are described. Log-Pearson Type III analysis information, basin and climate characteristics, and the peak-streamflow frequency estimates for the 231 gaging stations in and near Oklahoma are listed. Methodologies are presented to estimate peak streamflows at ungaged sites by using estimates from gaging stations on unregulated streams. For ungaged sites on urban streams and streams regulated by small floodwater retarding structures, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow magnitude and frequency.

  15. Neither fixed nor random: weighted least squares meta-regression.

    PubMed

    Stanley, T D; Doucouliagos, Hristos

    2017-03-01

    Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis, and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, is as good as FE-MRA in all cases, and is better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
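
    The unrestricted WLS-MRA amounts to weighted least squares with inverse-variance weights and no additive heterogeneity term; a minimal sketch with simulated study-level data (all names illustrative):

```python
import numpy as np
import statsmodels.api as sm

# Simulated meta-regression data: k studies with effect estimate y,
# standard error se, and one study-level covariate x.
rng = np.random.default_rng(4)
k = 25
se = rng.uniform(0.1, 0.4, k)
x = rng.standard_normal(k)
y = 0.3 + 0.2 * x + rng.normal(scale=se)

X = sm.add_constant(x)
wls = sm.WLS(y, X, weights=1.0 / se**2).fit()   # multiplicative-variance WLS
print(wls.params, wls.bse)
```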

  16. Titanium Ions Release from an Innovative Titanium-Magnesium Composite: an in Vitro Study

    PubMed Central

    Halambek, Jasna; Maldini, Krešimir; Balog, Martin; Križik, Peter; Schauperl, Zdravko; Ćatić, Amir

    2016-01-01

    Background The innovative titanium-magnesium composite (Ti-Mg) was produced by the powder metallurgy (P/M) method and characterized in terms of corrosion behavior. Material and methods Two groups of the experimental material, with 1 mass% (Ti-1Mg) and 2 mass% (Ti-2Mg) of magnesium in a titanium matrix, were tested and compared to commercially pure titanium (CP Ti). Immersion tests and chemical analyses of four solutions (artificial saliva; artificial saliva at pH 4; artificial saliva with fluoride; and Hank balanced salt solution) were performed after 42 days of immersion, using inductively coupled plasma mass spectrometry (ICP-MS) to detect the amount of released titanium (Ti) ions. SEM and EDS analyses were used for surface characterization. Results The differences between the results from the different test solutions were assessed by ANOVA and the Newman-Keuls test at p<0.05. The influence of predictor variables was found by multiple regression analysis. The results of the present study revealed a low corrosion rate of titanium in the experimental Ti-Mg groups: up to 46 and 23 times lower dissolution of Ti was observed from Ti-1Mg and Ti-2Mg, respectively, compared to the control group. Among the tested solutions, artificial saliva with fluorides exhibited the highest corrosion effect on all specimens tested. SEM micrographs showed a preserved dual-phase surface structure, and EDS analysis suggested a favorable surface bioactivity. Conclusion Ti-Mg produced by P/M is suggested as a material with better corrosion properties than CP Ti. PMID:27688425

  17. [How to fit and interpret multilevel models using SPSS].

    PubMed

    Pardo, Antonio; Ruiz, Miguel A; San Martín, Rafael

    2007-05-01

    Hierarchic or multilevel models are used to analyse data when cases belong to known groups and sample units are selected both from the individual level and from the group level. In this work, the multilevel models most commonly discussed in the statistical literature are described, explaining how to fit these models using the SPSS program (any version from the 11th onward) and how to interpret the outcomes of the analysis. Five particular models are described, fitted, and interpreted: (1) one-way analysis of variance with random effects, (2) regression analysis with means-as-outcomes, (3) one-way analysis of covariance with random effects, (4) regression analysis with random coefficients, and (5) regression analysis with means- and slopes-as-outcomes. All models are explained, trying to make them understandable to researchers in health and behaviour sciences.
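
    The paper works in SPSS; as an analogous sketch, a random-intercepts model of the same family can be fitted in Python with statsmodels MixedLM, using hypothetical variable names and simulated two-level data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated two-level data: individuals nested in 30 groups.
rng = np.random.default_rng(5)
groups = np.repeat(np.arange(30), 20)
u = rng.normal(0, 1, 30)[groups]          # random intercept per group
ses = rng.standard_normal(len(groups))
score = 50 + 3 * ses + u + rng.normal(0, 2, len(groups))
df = pd.DataFrame({"score": score, "ses": ses, "group": groups})

# random intercept for group; add re_formula="~ses" for random slopes
fit = smf.mixedlm("score ~ ses", df, groups=df["group"]).fit()
print(fit.summary())
```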

  18. Correlative and multivariate analysis of increased radon concentration in underground laboratory.

    PubMed

    Maletić, Dimitrije M; Udovičić, Vladimir I; Banjanac, Radomir M; Joković, Dejan R; Dragić, Aleksandar L; Veselinović, Nikola B; Filipović, Jelena

    2014-11-01

    The results of an analysis, using correlative and multivariate methods as developed for data analysis in high-energy physics and implemented in the Toolkit for Multivariate Analysis software package, of the relations between variations in increased radon concentration and climate variables in a shallow underground laboratory are presented. Multivariate regression analysis identified a number of multivariate methods which can give a good evaluation of increased radon concentrations based on climate variables. The use of multivariate regression methods will enable the investigation of the relation of specific climate variables to increased radon concentrations, with the regression methods providing a 'mapped' underlying functional behaviour of radon concentration over a wide spectrum of climate variables. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  19. Reliable prediction intervals with regression neural networks.

    PubMed

    Papadopoulos, Harris; Haralambous, Haris

    2011-10-01

    This paper proposes an extension to conventional regression neural networks (NNs) for replacing the point predictions they produce with prediction intervals that satisfy a required level of confidence. Our approach follows a novel machine learning framework, called Conformal Prediction (CP), for assigning reliable confidence measures to predictions without assuming anything more than that the data are independent and identically distributed (i.i.d.). We evaluate the proposed method on four benchmark datasets and on the problem of predicting Total Electron Content (TEC), which is an important parameter in trans-ionospheric links; for the latter we use a dataset of more than 60000 TEC measurements collected over a period of 11 years. Our experimental results show that the prediction intervals produced by our method are both well calibrated and tight enough to be useful in practice. Copyright © 2011 Elsevier Ltd. All rights reserved.
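
    A minimal split (inductive) conformal sketch around a regression NN, illustrating the interval construction the paper describes (simplified relative to the paper's normalized nonconformity scores; data are synthetic):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, (1000, 1))
y = np.sin(X).ravel() + 0.2 * rng.standard_normal(1000)

# hold out a calibration set; only i.i.d. data is assumed
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(50,), max_iter=3000,
                   random_state=0).fit(X_tr, y_tr)

alpha = 0.1                                   # 90% confidence level
scores = np.abs(y_cal - net.predict(X_cal))   # absolute-residual nonconformity
n = len(scores)
q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))

pred = net.predict([[0.5]])[0]
print(f"90% prediction interval at x=0.5: [{pred - q:.3f}, {pred + q:.3f}]")
```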

  20. Selecting risk factors: a comparison of discriminant analysis, logistic regression and Cox's regression model using data from the Tromsø Heart Study.

    PubMed

    Brenn, T; Arnesen, E

    1985-01-01

    For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from those selected by the logistic and Cox methods, which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis; otherwise, Cox's method should be used.

  1. Mortality rates in OECD countries converged during the period 1990-2010.

    PubMed

    Bremberg, Sven G

    2017-06-01

    Since the scientific revolution of the 18th century, human health has gradually improved, but there is no unifying theory that explains this improvement in health. Studies of macrodeterminants have produced conflicting results. Most studies have analysed health at a given point in time as the outcome; however, the rate of improvement in health might be a more appropriate outcome. Twenty-eight OECD member countries were selected for analysis in the period 1990-2010. The main outcomes studied, in six age groups, were the national rates of decrease in mortality in the period 1990-2010. The effects of seven potential determinants on the rates of decrease in mortality were analysed in linear multiple regression models using least squares, controlling for country-specific history constants, which represent the mortality rate in 1990. The multiple regression analyses started with models that only included mortality rates in 1990 as determinants. These models explained 87% of the intercountry variation among children aged 1-4 years and 51% among adults aged 55-74 years. When added to the regression equations, the seven determinants did not seem to significantly increase the explanatory power of the equations. The analyses indicated a decrease in mortality in all nations and in all age groups. The development of mortality rates in the different nations demonstrated significant catch-up effects. Therefore, an important objective of the national public health sector seems to be to reduce the delay between international research findings and the universal implementation of relevant innovations.

  2. Simulation of groundwater level variations using wavelet combined with neural network, linear regression and support vector machine

    NASA Astrophysics Data System (ADS)

    Ebrahimi, Hadi; Rajaee, Taher

    2017-01-01

    Simulation of groundwater level (GWL) fluctuations is an important task in management of groundwater resources. In this study, the effect of wavelet analysis on the training of the artificial neural network (ANN), multiple linear regression (MLR) and support vector regression (SVR) approaches was investigated, and the ANN, MLR and SVR along with the wavelet-ANN (WNN), wavelet-MLR (WLR) and wavelet-SVR (WSVR) models were compared in simulating one-month-ahead GWL. The only variable used to develop the models was the monthly GWL data recorded over a period of 11 years from two wells in the Qom plain, Iran. The results showed that decomposing the GWL time series into several sub-time series greatly improved the training of the models. For both wells 1 and 2, the Meyer and Db5 wavelets produced better results compared to the other wavelets, which indicates that wavelet types behave similarly in similar case studies. The optimal number of delays was 6 months, which seems to be due to natural phenomena. The best WNN model, using the Meyer mother wavelet with two decomposition levels, simulated one-month-ahead GWL with RMSE values equal to 0.069 m and 0.154 m for wells 1 and 2, respectively. The RMSE values for the WLR model were 0.058 m and 0.111 m, and for the WSVR model were 0.136 m and 0.060 m for wells 1 and 2, respectively.
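
    A sketch of the wavelet-regression (WLR-style) pipeline: decompose the GWL series into sub-series, then regress one-month-ahead values on lagged sub-series. PyWavelets is assumed available; the series here is synthetic:

```python
import numpy as np
import pywt
from sklearn.linear_model import LinearRegression

def subseries(signal, wavelet="db5", level=2):
    """Reconstruct one sub-series per wavelet component (approx. + details)."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    parts = []
    for i in range(len(coeffs)):
        masked = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
        parts.append(pywt.waverec(masked, wavelet)[: len(signal)])
    return np.array(parts)

rng = np.random.default_rng(7)
t = np.arange(132)                                        # 11 years, monthly
gwl = 10 + 0.5 * np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(132)

subs, lags = subseries(gwl), 6                            # 6 monthly delays
rows = [np.concatenate([s[i - lags:i] for s in subs]) for i in range(lags, len(gwl))]
X, y = np.array(rows), gwl[lags:]
model = LinearRegression().fit(X[:-12], y[:-12])          # hold out final year
rmse = np.sqrt(np.mean((model.predict(X[-12:]) - y[-12:]) ** 2))
print(f"hold-out RMSE: {rmse:.3f} m")
```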

  3. Endpoint in plasma etch process using new modified w-multivariate charts and windowed regression

    NASA Astrophysics Data System (ADS)

    Zakour, Sihem Ben; Taleb, Hassen

    2017-09-01

    Endpoint detection is essential for understanding and verifying that a plasma etching process has completed correctly, especially when the etched area is very small (0.1%). It is a crucial part of delivering repeatable results on every wafer. When the film being etched has been completely cleared, the endpoint is reached. To ensure the desired device performance of the produced integrated circuit, an optical emission spectroscopy (OES) sensor is employed. The large number of gathered wavelengths (profiles) is then analyzed and pre-processed using a newly proposed simple algorithm, named spectra peak selection (SPS), to select the important wavelengths; wavelet analysis (WA) is then employed to enhance detection performance by suppressing noise and redundant information. The selected and treated OES wavelengths are used in modified multivariate control charts (MEWMA and Hotelling) for three statistics (mean, SD, and CV) and in windowed polynomial regression for the mean. The use of the three aforementioned statistics is motivated by the need to control mean shifts, variance shifts, and their ratio (CV) when both the mean and the SD are unstable. The control charts demonstrate their performance in detecting the endpoint, especially the W-mean Hotelling chart, while the worst result is given by the CV statistic. As the best endpoint detection is given by the W-Hotelling mean statistic, this statistic is used to construct a windowed wavelet Hotelling polynomial regression. The latter can identify only the window containing the endpoint phenomenon.

  4. Comparing the efficiency of digital and conventional soil mapping to predict soil types in a semi-arid region in Iran

    NASA Astrophysics Data System (ADS)

    Zeraatpisheh, Mojtaba; Ayoubi, Shamsollah; Jafari, Azam; Finke, Peter

    2017-05-01

    The efficiency of different digital and conventional soil mapping approaches to produce categorical maps of soil types is determined by cost, sample size, accuracy and the selected taxonomic level. The efficiency of digital and conventional soil mapping approaches was examined in the semi-arid region of Borujen, central Iran. This research aimed to (i) compare two digital soil mapping approaches, multinomial logistic regression and random forest, with the conventional soil mapping approach at four soil taxonomic levels (order, suborder, great group and subgroup levels), (ii) validate the predicted soil maps by the same validation data set to determine the best method for producing the soil maps, and (iii) select the best soil taxonomic level by different approaches at three sample sizes (100, 80, and 60 point observations), in two scenarios with and without a geomorphology map as a spatial covariate. In most predicted maps, using both digital soil mapping approaches, the best results were obtained using the combination of terrain attributes and the geomorphology map, although differences between the scenarios with and without the geomorphology map were not significant. Employing the geomorphology map increased map purity and the Kappa index, and led to a decrease in the 'noisiness' of soil maps. Multinomial logistic regression had better performance at higher taxonomic levels (order and suborder levels); however, random forest showed better performance at lower taxonomic levels (great group and subgroup levels). Multinomial logistic regression was less sensitive than random forest to a decrease in the number of training observations. The conventional soil mapping method produced a map with a larger minimum polygon size because of the traditional cartographic criteria used to make the 1:100,000 geological map on which the conventional soil map was largely based. Likewise, the conventional soil map also had a larger average polygon size, resulting in a lower level of detail. Multinomial logistic regression at the order level (map purity of 0.80), random forest at the suborder (map purity of 0.72) and great group level (map purity of 0.60), and conventional soil mapping at the subgroup level (map purity of 0.48) produced the most accurate maps in the study area. The multinomial logistic regression method was identified as the most effective approach based on a combined index of map purity, map information content, and map production cost. The combined index also showed that smaller sample size led to a preference for the order level, while a larger sample size led to a preference for the great group level.
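
    A compact sketch of the digital-soil-mapping comparison: multinomial logistic regression versus random forest on covariates, with hold-out accuracy standing in for map purity (data simulated):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# stand-in for terrain attributes (+ geomorphology dummies) -> soil class
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

mlr = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)   # multinomial by default
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

print("MLR hold-out purity:", round(mlr.score(X_te, y_te), 2))
print("RF  hold-out purity:", round(rf.score(X_te, y_te), 2))
```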

  5. Variable Selection in Logistic Regression.

    DTIC Science & Technology

    1987-06-01

    VARIABLE SELECTION IN LOGISTIC REGRESSION. Z. D. Bai, P. R. Krishnaiah and L. C. Zhao, Center for Multivariate Analysis, University of Pittsburgh (contract F49620-85-C-0008).

  6. Two Paradoxes in Linear Regression Analysis

    PubMed Central

    FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong

    2016-01-01

    Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214

  7. Analysis of the Influence of Quantile Regression Model on Mainland Tourists' Service Satisfaction Performance

    PubMed Central

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916

  8. Composite marginal quantile regression analysis for longitudinal adolescent body mass index data.

    PubMed

    Yang, Chi-Chuan; Chen, Yi-Hau; Chang, Hsing-Yi

    2017-09-20

    Overweight or obesity in childhood and adolescence, which may be quantified through the body mass index (BMI), is strongly associated with adult obesity and other health problems. Motivated by the Child and Adolescent Behaviors in Long-term Evolution (CABLE) study, we are interested in individual, family, and school factors associated with marginal quantiles of longitudinal adolescent BMI values. We propose a new method for composite marginal quantile regression analysis for longitudinal outcome data, which performs marginal quantile regressions at multiple quantile levels simultaneously. The proposed method extends the quantile regression coefficient modeling method introduced by Frumento and Bottai (Biometrics 2016; 72:74-84) to longitudinal data, accounting suitably for the correlation structure in longitudinal observations. A goodness-of-fit test for the proposed modeling is also developed. Simulation results show that the proposed method can be much more efficient than the analysis without taking correlation into account and the analysis performing separate quantile regressions at different quantile levels. The application to the longitudinal adolescent BMI data from the CABLE study demonstrates the practical utility of our proposal. Copyright © 2017 John Wiley & Sons, Ltd.

  9. Cervical Vertebral Body's Volume as a New Parameter for Predicting the Skeletal Maturation Stages.

    PubMed

    Choi, Youn-Kyung; Kim, Jinmi; Yamaguchi, Tetsutaro; Maki, Koutaro; Ko, Ching-Chang; Kim, Yong-Il

    2016-01-01

    This study aimed to determine the correlation between the volumetric parameters derived from the images of the second, third, and fourth cervical vertebrae by using cone beam computed tomography with skeletal maturation stages and to propose a new formula for predicting skeletal maturation by using regression analysis. We obtained the estimation of skeletal maturation levels from hand-wrist radiographs and volume parameters derived from the second, third, and fourth cervical vertebrae bodies from 102 Japanese patients (54 women and 48 men, 5-18 years of age). We performed Pearson's correlation coefficient analysis and simple regression analysis. All volume parameters derived from the second, third, and fourth cervical vertebrae exhibited statistically significant correlations (P < 0.05). The simple regression model with the greatest R-square indicated the fourth-cervical-vertebra volume as an independent variable with a variance inflation factor less than ten. The explanation power was 81.76%. Volumetric parameters of cervical vertebrae using cone beam computed tomography are useful in regression models. The derived regression model has the potential for clinical application as it enables a simple and quantitative analysis to evaluate skeletal maturation level.

  10. Cervical Vertebral Body's Volume as a New Parameter for Predicting the Skeletal Maturation Stages

    PubMed Central

    Choi, Youn-Kyung; Kim, Jinmi; Maki, Koutaro; Ko, Ching-Chang

    2016-01-01

    This study aimed to determine the correlation between the volumetric parameters derived from the images of the second, third, and fourth cervical vertebrae by using cone beam computed tomography with skeletal maturation stages and to propose a new formula for predicting skeletal maturation by using regression analysis. We obtained the estimation of skeletal maturation levels from hand-wrist radiographs and volume parameters derived from the second, third, and fourth cervical vertebrae bodies from 102 Japanese patients (54 women and 48 men, 5–18 years of age). We performed Pearson's correlation coefficient analysis and simple regression analysis. All volume parameters derived from the second, third, and fourth cervical vertebrae exhibited statistically significant correlations (P < 0.05). The simple regression model with the greatest R-square indicated the fourth-cervical-vertebra volume as an independent variable with a variance inflation factor less than ten. The explanation power was 81.76%. Volumetric parameters of cervical vertebrae using cone beam computed tomography are useful in regression models. The derived regression model has the potential for clinical application as it enables a simple and quantitative analysis to evaluate skeletal maturation level. PMID:27340668
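
    A short sketch of the reported checks: fit the simple regression, read off R-squared, and confirm the variance inflation factors stay below ten (simulated volumes stand in for the CBCT measurements):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# simulated stand-ins: volumes of C2-C4 vertebral bodies and a maturation score
rng = np.random.default_rng(8)
vols = rng.normal(size=(102, 3))
stage = vols @ np.array([0.2, 0.3, 0.5]) + 0.3 * rng.standard_normal(102)

X = sm.add_constant(vols)
fit = sm.OLS(stage, X).fit()
print("R^2:", round(fit.rsquared, 3))
print("VIFs:", [round(variance_inflation_factor(X, i), 2)
                for i in range(1, X.shape[1])])
```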

  11. Analysis of the influence of quantile regression model on mainland tourists' service satisfaction performance.

    PubMed

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan bring annual revenues of 400 billion NTD to the Taiwanese economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is a focus of concern, and Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis from grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as the dependent variable in a quantile regression analysis, to examine the relationship between the independent variables and the dependent variable at different quantiles. Finally, this study compared the predictive accuracy of the least-squares (mean) regression model and each quantile regression model, as a reference for researchers. The results showed that, in addition to occupation and age, other variables could also affect the overall satisfaction performance of mainland tourists. The overall predictive accuracy of the quantile regression model at Q0.25 was higher than that of the other three models.
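
    The grey relational analysis step can be illustrated with a short numpy sketch; the survey scores are fabricated, and the distinguishing coefficient of 0.5 is the conventional default rather than a value reported by the study.

```python
# Sketch of grey relational analysis: each satisfaction item is scored
# against an ideal reference sequence, yielding a grey relational grade
# per item. Scores are fabricated; zeta = 0.5 is the usual default.
import numpy as np

rng = np.random.default_rng(2)
scores = rng.uniform(3.0, 5.0, size=(8, 30))   # 8 items x 30 hypothetical respondents

# Normalize each item to [0, 1] (larger-is-better convention).
norm = (scores - scores.min(axis=1, keepdims=True)) / \
       np.ptp(scores, axis=1, keepdims=True)
ref = norm.max(axis=0)                         # ideal reference sequence
delta = np.abs(ref - norm)                     # deviation sequences
zeta = 0.5                                     # distinguishing coefficient
gamma = (delta.min() + zeta * delta.max()) / (delta + zeta * delta.max())
grade = gamma.mean(axis=1)                     # grey relational grade per item
print(np.argsort(grade)[::-1])                 # items ranked by performance
```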

  12. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.; Bader, Jon B.

    2010-01-01

    Calibration data of a wind tunnel sting balance were processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration, the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, the difference and the sum of the gage outputs were chosen as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses, and an optimized regression model was obtained for each. Classical strain gage balance load transformations and the equations for the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation in which the terms of a balance's regression model can be derived directly from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics applied to the experimental data during the algorithm's term selection process.
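
    The general idea of a candidate math model search, exhaustively fitting subsets of candidate regression terms and keeping the best by a statistical quality metric, can be sketched as follows; the loads and outputs are fabricated, and the actual algorithm uses a richer set of metrics than the adjusted R-square used here.

```python
# Sketch of a candidate-model search: fit every subset of candidate
# regression terms and keep the subset with the best adjusted R^2.
# Loads and responses are fabricated; this is not NASA's algorithm.
import itertools
import numpy as np

rng = np.random.default_rng(3)
n = 200
N = rng.uniform(-1, 1, n)            # normal force (hypothetical, normalized)
M = rng.uniform(-1, 1, n)            # moment (hypothetical, normalized)
response = 2.0 * N + 0.5 * M + 0.1 * N * M + rng.normal(0, 0.01, n)

candidates = {"N": N, "M": M, "N^2": N**2, "M^2": M**2, "N*M": N * M}

best = (-np.inf, None)
for r in range(1, len(candidates) + 1):
    for combo in itertools.combinations(candidates, r):
        X = np.column_stack([np.ones(n)] + [candidates[t] for t in combo])
        beta, *_ = np.linalg.lstsq(X, response, rcond=None)
        rss = np.sum((response - X @ beta) ** 2)
        tss = np.sum((response - response.mean()) ** 2)
        adj_r2 = 1 - (rss / (n - X.shape[1])) / (tss / (n - 1))
        if adj_r2 > best[0]:
            best = (adj_r2, combo)
print(f"best terms: {best[1]}, adjusted R^2 = {best[0]:.5f}")
```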

  13. Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data

    PubMed Central

    Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.

    2014-01-01

    In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified, and unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438
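
    The non-equivalence of adjusted and unadjusted logistic regression estimates can be demonstrated with a small simulation; the parameter values below are illustrative, and the sketch is not the paper's simulation design.

```python
# Small simulation contrasting unadjusted and covariate-adjusted
# logistic regression treatment effects under randomization. The
# unadjusted log-odds estimate is attenuated toward zero relative to
# the adjusted one (non-collapsibility); all parameters are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 5000
treat = rng.integers(0, 2, n)                # randomized treatment arm
x = rng.standard_normal(n)                   # influential baseline covariate
logit = -0.5 + 1.0 * treat + 1.5 * x         # true treatment log-odds = 1.0
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

unadj = sm.Logit(y, sm.add_constant(treat)).fit(disp=0)
adj = sm.Logit(y, sm.add_constant(np.column_stack([treat, x]))).fit(disp=0)
print(f"unadjusted treatment log-odds: {unadj.params[1]:.3f}")
print(f"adjusted treatment log-odds:   {adj.params[1]:.3f}")
```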

  14. Linear regression analysis of survival data with missing censoring indicators.

    PubMed

    Wang, Qihua; Dinse, Gregg E

    2011-04-01

    Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove that all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
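
    Of the three techniques named, inverse probability weighting is the most compact to illustrate. The sketch below shows only the weighting machinery (modeling the chance that a censoring indicator is observed, then weighting complete cases); it omits the synthetic data transformation of the censored responses, and all data and variable names are fabricated.

```python
# Sketch of the inverse-probability-weighting idea: estimate
# P(indicator observed | covariate) with logistic regression, then run
# a weighted regression on the complete cases. The synthetic data step
# for censored responses is omitted; data are illustrative only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 1000
z = rng.standard_normal(n)                   # covariate (e.g., age)
t = 1.0 + 0.5 * z + rng.standard_normal(n)   # observed follow-up time
observed = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 0.8 * z))))  # indicator seen?

# Step 1: estimate P(indicator observed | z).
pi_fit = sm.Logit(observed, sm.add_constant(z)).fit(disp=0)
pi_hat = pi_fit.predict(sm.add_constant(z))

# Step 2: weighted least squares on the complete cases only.
cc = observed == 1
wls = sm.WLS(t[cc], sm.add_constant(z[cc]), weights=1.0 / pi_hat[cc]).fit()
print(wls.params)
```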

  15. Iterative Assessment of Statistically-Oriented and Standard Algorithms for Determining Muscle Onset with Intramuscular Electromyography.

    PubMed

    Tenan, Matthew S; Tweedell, Andrew J; Haynes, Courtney A

    2017-12-01

    The onset of muscle activity, as measured by electromyography (EMG), is a commonly applied metric in biomechanics. Intramuscular EMG is often used to examine deep musculature, yet no studies have examined the effectiveness of onset-detection algorithms for intramuscular EMG. The present study examines standard surface EMG onset algorithms (linear envelope, Teager-Kaiser energy operator, and sample entropy) and novel algorithms (time series mean-variance analysis, sequential/batch processing with parametric and nonparametric methods, and Bayesian changepoint analysis). Thirteen male and five female subjects had intramuscular EMG collected during isolated biceps brachii and vastus lateralis contractions, resulting in 103 trials. EMG onset was visually determined twice by three blinded reviewers. Since the reliability of visual onset determination was high (ICC(1,1) = 0.92), the mean of the six visual assessments was contrasted with the algorithmic approaches. Poorly performing algorithms were eliminated stepwise via (1) root mean square error analysis, (2) algorithm failure to identify onset/premature onset, (3) linear regression analysis, and (4) Bland-Altman plots. The top performing algorithms were all based on Bayesian changepoint analysis of rectified EMG and were statistically indistinguishable from visual analysis. Bayesian changepoint analysis has the potential to produce more reliable, accurate, and objective intramuscular EMG onset results than standard methodologies.
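
    A single-changepoint analysis in the Bayesian spirit described above can be sketched as follows: with a uniform prior over onset location and Gaussian segments, the posterior over the changepoint is proportional to a profile likelihood. The simulated trace and segment parameters are illustrative; the published method is considerably richer.

```python
# Sketch of a single-changepoint onset estimate on a simulated,
# rectified "EMG" trace: uniform prior over the onset sample, Gaussian
# segment model, posterior proportional to the profile likelihood.
import numpy as np

rng = np.random.default_rng(6)
baseline = rng.normal(0, 0.05, 400)               # quiescent muscle
active = rng.normal(0, 0.40, 600)                 # after onset: higher amplitude
emg = np.abs(np.concatenate([baseline, active]))  # rectified signal

n = emg.size
log_lik = np.full(n, -np.inf)
for tau in range(10, n - 10):                     # candidate onset samples
    rss = np.sum((emg[:tau] - emg[:tau].mean()) ** 2) + \
          np.sum((emg[tau:] - emg[tau:].mean()) ** 2)
    log_lik[tau] = -0.5 * n * np.log(rss / n)

post = np.exp(log_lik - log_lik.max())
post /= post.sum()
print("posterior mode of onset sample:", int(np.argmax(post)))
```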

  16. Biomotor structures in elite female handball players according to performance.

    PubMed

    Cavala, Marijana; Rogulj, Nenad; Srhoj, Vatromir; Srhoj, Ljerka; Katić, Ratko

    2008-03-01

    In order to identify biomotor structures in elite female handball players (n = 53), factor structures of morphological characteristics, basic motor abilities, and situation motor abilities were determined first, followed by determination of the differences and relations among the morphological, motor, and specific motor spaces according to handball performance. Factor analysis of 16 morphological measures produced three morphological factors: a factor of absolute voluminosity (mesoendomorphy), a factor of longitudinal skeleton dimensionality, and a factor of transverse hand dimensionality. Factor analysis of 15 motor variables yielded five basic motor dimensions: factors of agility, throwing explosive strength, running explosive strength (sprint), jumping explosive strength, and movement frequency rate. Factor analysis of five situation motor variables produced two dimensions: a factor of specific agility with explosiveness and a factor of specific precision with ball manipulation. Analysis of variance yielded the greatest differences relative to handball performance in the factor of specific agility and throwing strength, and in the factor of basic motoricity that integrates coordination (agility) with upper-extremity throwing explosiveness, lower-extremity sprint (30-m sprint), and jumping (standing triple jump). Among the morphological factors, the factor of voluminosity (mesoendomorphy), defined by muscle mass rather than adipose tissue, was found to contribute significantly to the players' performance. Results of regression analysis indicated that handball performance is predominantly determined by the general specific motor factor based on specific agility and explosiveness, and by the morphological factor based on body mass and volume, i.e. muscle mass. Concerning basic motor abilities, the factor of movement frequency rate, which is associated with the ability of ball manipulation, was observed to significantly predict the handball players' performance.
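
    The exploratory factor analysis step can be illustrated with scikit-learn; the morphological data below are simulated, and the three-factor structure is assumed from the abstract rather than re-derived.

```python
# Sketch of exploratory factor analysis on standardized measures using
# scikit-learn. The 53 x 16 morphological matrix is simulated; in the
# study, 16 measures yielded three interpretable factors.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
latent = rng.standard_normal((53, 3))               # 3 hypothetical factors
loadings = rng.standard_normal((3, 16))
measures = latent @ loadings + rng.normal(0, 0.5, (53, 16))

fa = FactorAnalysis(n_components=3, random_state=0)
fa.fit(StandardScaler().fit_transform(measures))
print(fa.components_.shape)   # (3, 16): factor loadings on the 16 measures
```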

  17. The use of regression analysis in determining reference intervals for low hematocrit and thrombocyte count in multiple electrode aggregometry and platelet function analyzer 100 testing of platelet function.

    PubMed

    Kuiper, Gerhardus J A J M; Houben, Rik; Wetzels, Rick J H; Verhezen, Paul W M; Oerle, Rene van; Ten Cate, Hugo; Henskens, Yvonne M C; Lancé, Marcus D

    2017-11-01

    Low platelet counts and hematocrit levels hinder whole blood point-of-care testing of platelet function. Thus far, no reference ranges for MEA (multiple electrode aggregometry) and PFA-100 (platelet function analyzer 100) devices exist for low ranges. Through dilution methods of volunteer whole blood, platelet function at low ranges of platelet count and hematocrit levels was assessed on MEA for four agonists and for PFA-100 in two cartridges. Using (multiple) regression analysis, 95% reference intervals were computed for these low ranges. Low platelet counts affected MEA in a positive correlation (all agonists showed r² ≥ 0.75) and PFA-100 in an inverse correlation (closure times were prolonged with lower platelet counts). Lowered hematocrit did not affect MEA testing, except for arachidonic acid activation (ASPI), which showed a weak positive correlation (r² = 0.14). Closure time on PFA-100 testing was inversely correlated with hematocrit for both cartridges. Regression analysis revealed different 95% reference intervals in comparison with originally established intervals for both MEA and PFA-100 in low platelet or hematocrit conditions. Multiple regression analysis of ASPI and both tests on the PFA-100 for combined low platelet and hematocrit conditions revealed that only PFA-100 testing should be adjusted for both thrombocytopenia and anemia. 95% reference intervals were calculated using multiple regression analysis. However, coefficients of determination of PFA-100 were poor, and some variance remained unexplained. Thus, in this pilot study using (multiple) regression analysis, we could establish reference intervals of platelet function in anemia and thrombocytopenia conditions on PFA-100 and in thrombocytopenia conditions on MEA.
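
    Regression-based reference intervals of the kind described can be sketched as a prediction band of plus or minus 1.96 residual standard deviations around the fitted line; the data are fabricated, and the study's multiple-regression adjustments for hematocrit are omitted.

```python
# Sketch of regression-based 95% reference intervals: regress the
# platelet-function readout on platelet count, then take the fitted
# value +/- 1.96 residual SDs. Data are fabricated for illustration.
import numpy as np

rng = np.random.default_rng(8)
platelets = rng.uniform(10, 150, 80)               # 10^9/L, hypothetical low range
readout = 0.5 * platelets + rng.normal(0, 6, 80)   # e.g., MEA aggregation units

coef = np.polyfit(platelets, readout, deg=1)
pred = np.polyval(coef, platelets)
sd = np.std(readout - pred, ddof=2)                # residual SD (2 fitted params)
grid = np.array([25.0, 50.0, 100.0])               # platelet counts of interest
centre = np.polyval(coef, grid)
print("lower:", centre - 1.96 * sd)
print("upper:", centre + 1.96 * sd)
```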

  18. Flood regionalization: A hybrid geographic and predictor-variable region-of-influence regression method

    USGS Publications Warehouse

    Eng, K.; Milly, P.C.D.; Tasker, Gary D.

    2007-01-01

    To facilitate estimation of streamflow characteristics at an ungauged site, hydrologists often define a region of influence containing gauged sites hydrologically similar to the estimation site. This region can be defined either in geographic space or in the space of the variables that are used to predict streamflow (predictor variables). These approaches are complementary, and a combination of the two may be superior to either. Here we propose a hybrid region-of-influence (HRoI) regression method that combines the two approaches. The new method was applied with streamflow records from 1,091 gauges in the southeastern United States to estimate the 50-year peak flow (Q50). The HRoI approach yielded lower root-mean-square estimation errors and produced fewer extreme errors than either the predictor-variable or geographic region-of-influence approaches. It is concluded, for Q50 in the study region, that similarity with respect to the basin characteristics considered (area, slope, and annual precipitation) is important, but incomplete, and that the consideration of geographic proximity of stations provides a useful surrogate for characteristics that are not included in the analysis. © 2007 ASCE.
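
    A hybrid region-of-influence estimate can be sketched by ranking gauged sites on a weighted combination of geographic distance and distance in standardized predictor space, then regressing within the selected region; all site data, the 50/50 weighting, and the region size of 40 sites are illustrative assumptions, not the paper's calibrated choices.

```python
# Sketch of a hybrid region-of-influence regression: combine normalized
# geographic and predictor-space distances, select the nearest sites,
# and fit a local regression for the target site. Data are fabricated.
import numpy as np

rng = np.random.default_rng(9)
n = 300
coords = rng.uniform(0, 500, (n, 2))                    # site locations, km
preds = rng.lognormal(size=(n, 3))                      # area, slope, precip
log_q50 = preds @ np.array([0.6, 0.2, 0.4]) + rng.normal(0, 0.3, n)

target_xy = np.array([250.0, 250.0])                    # ungauged site location
target_preds = np.array([1.0, 1.0, 1.0])                # its basin characteristics

z = (preds - preds.mean(0)) / preds.std(0)              # standardized predictors
zt = (target_preds - preds.mean(0)) / preds.std(0)
d_geo = np.linalg.norm(coords - target_xy, axis=1)
d_pred = np.linalg.norm(z - zt, axis=1)
alpha = 0.5                                             # hybrid weighting (assumed)
d_hybrid = alpha * d_geo / d_geo.max() + (1 - alpha) * d_pred / d_pred.max()

roi = np.argsort(d_hybrid)[:40]                         # 40 most similar sites
X = np.column_stack([np.ones(roi.size), preds[roi]])
beta, *_ = np.linalg.lstsq(X, log_q50[roi], rcond=None)
print("estimated log Q50 at target:", np.r_[1.0, target_preds] @ beta)
```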

  19. An Example-Based Brain MRI Simulation Framework.

    PubMed

    He, Qing; Roy, Snehashis; Jog, Amod; Pham, Dzung L

    2015-02-21

    The simulation of magnetic resonance (MR) images plays an important role in the validation of image analysis algorithms, such as image segmentation, owing to the lack of sufficient ground truth in real MR images. Previous work on MRI simulation has focused on explicitly modeling the MR image formation process. However, because of the overwhelming complexity of MR acquisition, these simulations must involve simplifications and approximations that can result in visually unrealistic simulated images. In this work, we describe an example-based simulation framework, which uses an "atlas" consisting of an MR image and anatomical models derived from its hard segmentation. The relationships between the MR image intensities and the anatomical models are learned using a patch-based regression that implicitly models the physics of MR image formation. Given the anatomical models of a new brain, a new MR image can be simulated using the learned regression. The approach has also been extended to simulate intensity inhomogeneity artifacts based on a statistical model of the training data. Results show that the example-based MRI simulation method is capable of simulating different image contrasts and is robust to different choices of atlas. The simulated images resemble real MR images more than simulations produced by a physics-based model.
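
    Patch-based regression for example-based simulation can be sketched with a generic regressor: learn a mapping from patches of an anatomical label image to MR intensities in a paired atlas, then apply it to a new anatomy. The tiny 2-D arrays and the random-forest regressor below are illustrative stand-ins for the paper's 3-D framework.

```python
# Sketch of patch-based regression for example-based image simulation:
# train on (label-patch, intensity) pairs from an atlas, then predict
# intensities for a new label image. All images here are tiny synthetic
# 2-D arrays; the published framework uses full 3-D volumes.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def patches(img, r=1):
    """Extract flattened (2r+1)^2 patches around every interior pixel."""
    h, w = img.shape
    out = [img[i - r:i + r + 1, j - r:j + r + 1].ravel()
           for i in range(r, h - r) for j in range(r, w - r)]
    return np.array(out)

rng = np.random.default_rng(10)
atlas_labels = rng.integers(0, 3, (40, 40)).astype(float)     # "anatomical model"
atlas_mri = 50.0 * atlas_labels + rng.normal(0, 2, (40, 40))  # paired MR image

X = patches(atlas_labels)
y = atlas_mri[1:-1, 1:-1].ravel()                  # center-pixel intensities
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

new_labels = rng.integers(0, 3, (40, 40)).astype(float)       # new anatomy
simulated = model.predict(patches(new_labels)).reshape(38, 38)
print(simulated.shape)
```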

  20. The increasing financial impact of chronic kidney disease in Australia.

    PubMed

    Tucker, Patrick S; Kingsley, Michael I; Morton, R Hugh; Scanlan, Aaron T; Dalbo, Vincent J

    2014-01-01

    The aim of this investigation was to determine and compare current and projected expenditure associated with chronic kidney disease (CKD), renal replacement therapy (RRT), and cardiovascular disease (CVD) in Australia. Data published by the Australia and New Zealand Dialysis and Transplant Registry, the Australian Institute of Health and Welfare, and the World Bank were used to compare CKD-, RRT-, and CVD-related expenditure and prevalence rates. Prevalence and expenditure predictions were made using a linear regression model, and direct statistical comparisons of the rates of annual increase utilised indicator variables in combined regressions. Statistical significance was set at P < 0.05, and dollar amounts were adjusted for inflation prior to analysis. Between 2012 and 2020, prevalence, per-patient expenditure, and total disease expenditure associated with CKD and RRT are estimated to increase significantly more rapidly than those associated with CVD: RRT prevalence is estimated to increase by 29%, compared to 7% for CVD; average annual RRT per-patient expenditure by 16%, compared to 8% for CVD; and total CKD- and RRT-related expenditure by 37%, compared to 14% for CVD. On a per-patient basis, CKD produces a considerably greater financial impact on Australia's healthcare system than CVD. Research focusing on novel preventative/therapeutic interventions is warranted.
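
    The indicator-variable comparison of annual rates of increase amounts to testing a year-by-group interaction in a pooled regression; a minimal sketch with fabricated expenditure series follows.

```python
# Sketch of a combined regression with an indicator variable: pool two
# expenditure series and test whether their annual rates of increase
# differ via a year-by-group interaction. Dollar figures are fabricated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
years = np.arange(2005, 2013, dtype=float)
rrt = 900 + 55 * (years - 2005) + rng.normal(0, 10, 8)    # hypothetical $M
cvd = 5000 + 80 * (years - 2005) + rng.normal(0, 40, 8)   # hypothetical $M

year = np.tile(years - 2005, 2)
group = np.r_[np.ones(8), np.zeros(8)]          # 1 = RRT, 0 = CVD
X = sm.add_constant(np.column_stack([year, group, year * group]))
fit = sm.OLS(np.r_[rrt, cvd], X).fit()
print(f"difference in annual increase: {fit.params[3]:.1f} "
      f"(p = {fit.pvalues[3]:.3g})")
```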
