Sample records for regression analysis allowed

  1. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    ERIC Educational Resources Information Center

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
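
The dominance procedure itself is easy to sketch: compute a fit statistic for every subset of predictors and average each predictor's additional contribution. Below is a minimal illustration on synthetic data, using McFadden's pseudo-R² as the analogue; this is one plausible choice among the candidates such an analysis could adopt, not necessarily the measure the article selects.

```python
# Sketch: dominance-style comparison of three synthetic predictors in a
# logistic regression, using McFadden's pseudo-R^2 as the fit analogue.
from itertools import combinations

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
logits = 1.0 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logits)))

def pseudo_r2(cols):
    """McFadden pseudo-R^2 of a logistic model on the given predictor columns."""
    design = sm.add_constant(X[:, list(cols)])
    return sm.Logit(y, design).fit(disp=0).prsquared

# General dominance: average each predictor's added pseudo-R^2 within each
# subset size of the remaining predictors, then across sizes.
for p in range(X.shape[1]):
    others = [q for q in range(X.shape[1]) if q != p]
    size_means = []
    for k in range(len(others) + 1):
        gains = [pseudo_r2(sub + (p,)) - (pseudo_r2(sub) if sub else 0.0)
                 for sub in combinations(others, k)]
        size_means.append(np.mean(gains))
    print(f"x{p}: general dominance weight = {np.mean(size_means):.4f}")
```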

  2. An improved multiple linear regression and data analysis computer program package

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program package (RAPIER, CREDUC, and CRSPLT), allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum-seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double-precision arithmetic.

  3. Regression analysis using dependent Polya trees.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J

    2013-11-30

    Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

  4. Application of factor analysis of infrared spectra for quantitative determination of beta-tricalcium phosphate in calcium hydroxylapatite.

    PubMed

    Arsenyev, P A; Trezvov, V V; Saratovskaya, N V

    1997-01-01

    This work presents a method that allows the phase composition of calcium hydroxylapatite to be determined from its infrared spectrum. The method applies factor analysis to the spectral data of a calibration set of samples to determine the minimal number of factors required to reproduce the spectra within experimental error. Multiple linear regression is then used to establish the correlation between the factor scores of the calibration standards and their properties, and the resulting regression equations can be used to predict the property values of unknown samples. The regression model was built for determination of the beta-tricalcium phosphate content in hydroxylapatite, and the quality of the model was estimated statistically. Applying factor analysis to the spectral data increases the accuracy of beta-tricalcium phosphate determination and extends the determination range toward lower concentrations, while reproducibility of the results is retained.

  5. A Comparison of Mean Phase Difference and Generalized Least Squares for Analyzing Single-Case Data

    ERIC Educational Resources Information Center

    Manolov, Rumen; Solanas, Antonio

    2013-01-01

    The present study focuses on single-case data analysis, specifically on two procedures for quantifying differences between baseline and treatment measurements. The first technique tested is based on generalized least squares regression analysis and is compared to a proposed non-regression technique, which allows similar information to be obtained. The…

  6. Quantile Regression in the Study of Developmental Sciences

    ERIC Educational Resources Information Center

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…

  7. Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage

    NASA Astrophysics Data System (ADS)

    Cepowski, Tomasz

    2017-06-01

    The paper presents mathematical relationships that allow the main engine power of new container ships to be estimated, based on data concerning vessels built in 2005-2015. The presented approximations allow the engine power to be estimated from the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimating the container ship engine power needed in preliminary parametric design of the ship. It follows that the use of multiple linear regression to predict the main engine power of a container ship yields more accurate results than simple linear regression.

  8. Population heterogeneity in the salience of multiple risk factors for adolescent delinquency.

    PubMed

    Lanza, Stephanie T; Cooper, Brittany R; Bray, Bethany C

    2014-03-01

    To present mixture regression analysis as an alternative to more standard regression analysis for predicting adolescent delinquency. We demonstrate how mixture regression analysis allows for the identification of population subgroups defined by the salience of multiple risk factors. We identified population subgroups (i.e., latent classes) of individuals based on their coefficients in a regression model predicting adolescent delinquency from eight previously established risk indices drawn from the community, school, family, peer, and individual levels. The study included N = 37,763 10th-grade adolescents who participated in the Communities That Care Youth Survey. Standard, zero-inflated, and mixture Poisson and negative binomial regression models were considered. Standard and mixture negative binomial regression models were selected as optimal. The five-class regression model was interpreted based on the class-specific regression coefficients, indicating that risk factors had varying salience across classes of adolescents. Standard regression showed that all risk factors were significantly associated with delinquency. Mixture regression provided more nuanced information, suggesting a unique set of risk factors that were salient for different subgroups of adolescents. Implications for the design of subgroup-specific interventions are discussed. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.

  9. Regression: The Apple Does Not Fall Far From the Tree.

    PubMed

    Vetter, Thomas R; Schober, Patrick

    2018-05-15

    Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
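
As a minimal companion to this tutorial, the sketch below fits the two most basic model types it discusses, simple and multiple linear regression, on synthetic data; the variable names (age, bmi, blood_pressure) are invented for illustration.

```python
# Sketch: simple versus multiple linear regression on synthetic data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"age": rng.uniform(20, 80, 200),
                   "bmi": rng.normal(27, 4, 200)})
df["blood_pressure"] = 90 + 0.5 * df["age"] + 0.8 * df["bmi"] + rng.normal(0, 8, 200)

simple = smf.ols("blood_pressure ~ age", data=df).fit()          # one predictor
multiple = smf.ols("blood_pressure ~ age + bmi", data=df).fit()  # several predictors
print(simple.params)
print(multiple.params)
```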

  10. Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.

    PubMed

    Ritz, Christian; Van der Vliet, Leana

    2009-09-01

    The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions (variance homogeneity and normality) that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although commonly used with linear regression analysis, transformation of the response variable alone is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the deprecation of less desirable and less flexible analytical techniques, such as linear interpolation.
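
The Box-Cox step can be sketched as follows: estimate the transformation on the response, then fit the nonlinear model on the transformed scale where the residual variance is closer to constant. The dose-response curve, doses, and parameter values below are synthetic stand-ins, not the protocols' specifics.

```python
# Sketch: Box-Cox transformation of a toxicity response before a nonlinear
# (log-logistic) regression fit. All data here are simulated.
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(2)
dose = np.repeat([0.1, 0.3, 1.0, 3.0, 10.0, 30.0], 8)
true_mean = 100 / (1 + (dose / 3.0) ** 1.5)
resp = true_mean + rng.normal(0, 0.05 * true_mean)   # variance shrinks with the mean

y_bc, lam = stats.boxcox(resp)                       # lambda by maximum likelihood
print(f"Box-Cox lambda = {lam:.2f}")

def loglogistic(d, top, ec50, slope):
    return top / (1 + (d / ec50) ** slope)

def model_bc(d, top, ec50, slope):
    # Fit on the transformed scale so residual variance is closer to constant.
    return stats.boxcox(loglogistic(d, top, ec50, slope), lmbda=lam)

params, _ = optimize.curve_fit(model_bc, dose, y_bc, p0=[100, 1.0, 1.0],
                               bounds=([1, 0.01, 0.1], [1000, 100, 10]))
print("top, EC50, slope:", params.round(2))
```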

  11. Quantile regression in the presence of monotone missingness with sensitivity analysis

    PubMed Central

    Liu, Minzhao; Daniels, Michael J.; Perri, Michael G.

    2016-01-01

    In this paper, we develop methods for longitudinal quantile regression when there is monotone missingness. In particular, we propose pattern mixture models with a constraint that provides a straightforward interpretation of the marginal quantile regression parameters. Our approach allows sensitivity analysis which is an essential component in inference for incomplete data. To facilitate computation of the likelihood, we propose a novel way to obtain analytic forms for the required integrals. We conduct simulations to examine the robustness of our approach to modeling assumptions and compare its performance to competing approaches. The model is applied to data from a recent clinical trial on weight management. PMID:26041008

  12. Visual grading characteristics and ordinal regression analysis during optimisation of CT head examinations.

    PubMed

    Zarb, Francis; McEntee, Mark F; Rainford, Louise

    2015-06-01

    To evaluate visual grading characteristics (VGC) and ordinal regression analysis during head CT optimisation as a potential alternative to visual grading assessment (VGA), traditionally employed to score anatomical visualisation. Patient images (n = 66) were obtained using current and optimised imaging protocols from two CT suites: a 16-slice scanner at the national Maltese centre for trauma and a 64-slice scanner in a private centre. Local resident radiologists (n = 6) performed VGA followed by VGC and ordinal regression analysis. VGC alone indicated that optimised protocols had image quality similar to that of current protocols. Ordinal logistic regression analysis provided an in-depth evaluation, criterion by criterion, allowing the selective implementation of the protocols. The local radiology review panel supported the implementation of optimised protocols for brain CT examinations (including trauma) in one centre, achieving radiation dose reductions ranging from 24 % to 36 %. In the second centre a 29 % reduction in radiation dose was achieved for follow-up cases. The combined use of VGC and ordinal logistic regression analysis led to clinical decisions being taken on the implementation of the optimised protocols. This improved method of image quality analysis provided the evidence to support imaging protocol optimisation, resulting in significant radiation dose savings. • There is a need for scientifically based image quality evaluation during CT optimisation. • VGC and ordinal regression analysis in combination led to better informed clinical decisions. • VGC and ordinal regression analysis led to dose reductions without compromising diagnostic efficacy.
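
For readers unfamiliar with the modeling step, the sketch below fits an ordinal logistic regression to simulated VGA-style image-quality scores (1-4) with a binary protocol covariate; the data and effect size are invented, and this is not the study's own analysis code.

```python
# Sketch: ordinal logistic regression on simulated image-quality scores.
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(3)
n = 300
protocol = rng.integers(0, 2, n)                  # 0 = current, 1 = optimised
latent = 0.7 * protocol + rng.logistic(size=n)    # hypothetical protocol effect
score = pd.Series(pd.Categorical(np.digitize(latent, [-1.0, 0.0, 1.0]) + 1,
                                 ordered=True))   # ordinal scores 1..4

model = OrderedModel(score, protocol.reshape(-1, 1), distr="logit")
res = model.fit(method="bfgs", disp=0)
print(res.params)   # protocol coefficient: log-odds of a higher score
```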

  13. Estimating time-varying exposure-outcome associations using case-control data: logistic and case-cohort analyses.

    PubMed

    Keogh, Ruth H; Mangtani, Punam; Rodrigues, Laura; Nguipdop Djomo, Patrick

    2016-01-05

    Traditional analyses of standard case-control studies using logistic regression do not allow estimation of time-varying associations between exposures and the outcome. We present two approaches which allow this. The motivation is a study of vaccine efficacy as a function of time since vaccination. Our first approach is to estimate time-varying exposure-outcome associations by fitting a series of logistic regressions within successive time periods, reusing controls across periods. Our second approach treats the case-control sample as a case-cohort study, with the controls forming the subcohort. In the case-cohort analysis, controls contribute information at all times they are at risk. Extensions allow left truncation, frequency matching and, using the case-cohort analysis, time-varying exposures. Simulations are used to investigate the methods. The simulation results show that both methods give correct estimates of time-varying effects of exposures using standard case-control data. Using the logistic approach there are efficiency gains by reusing controls over time and care should be taken over the definition of controls within time periods. However, using the case-cohort analysis there is no ambiguity over the definition of controls. The performance of the two analyses is very similar when controls are used most efficiently under the logistic approach. Using our methods, case-control studies can be used to estimate time-varying exposure-outcome associations where they may not previously have been considered. The case-cohort analysis has several advantages, including that it allows estimation of time-varying associations as a continuous function of time, while the logistic regression approach is restricted to assuming a step function form for the time-varying association.
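
The first approach described can be sketched directly: fit a separate logistic regression of case status on exposure within each successive time period, reusing all controls in every period. The data layout, effect sizes, and variable names below are simulated for illustration only.

```python
# Sketch: period-specific logistic regressions with controls reused across
# periods, on simulated case-control data with a waning exposure effect.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(12)
n = 6000
exposed = rng.binomial(1, 0.4, n)
period = rng.integers(0, 3, n)                 # time band since exposure
log_or = np.array([-1.5, -0.8, -0.2])          # protective effect that wanes
p = 1 / (1 + np.exp(-(-3 + log_or[period] * exposed)))
df = pd.DataFrame({"case": rng.binomial(1, p), "exposed": exposed,
                   "period": period})

controls = df[df["case"] == 0]                 # reused in every period
for t in range(3):
    cases_t = df[(df["case"] == 1) & (df["period"] == t)]
    fit = smf.logit("case ~ exposed", data=pd.concat([cases_t, controls])).fit(disp=0)
    print(f"period {t}: exposure log-OR = {fit.params['exposed']:.2f}")
```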

  14. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper presents the results of research devoted to the application of statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. The basic results of the research are presented: a classification table of predicted versus observed values is shown and the overall percentage of correct recognition is determined; the regression coefficients are calculated and the general regression equation is written from them. Based on the results of the logistic regression, ROC analysis was performed, the sensitivity and specificity of the model were calculated, and ROC curves were constructed. These mathematical techniques allow children's health to be diagnosed with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have high practical importance in the professional activity of the author.
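
The pipeline described (binary logistic regression, classification table, ROC analysis) can be reproduced in miniature; the three features below only stand in for the clinical measurements listed in the abstract.

```python
# Sketch: logistic regression, classification table, and ROC analysis.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

rng = np.random.default_rng(4)
n = 200
X = rng.normal(size=(n, 3))                    # stand-ins for clinical inputs
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ np.array([1.2, -0.8, 0.4])))))

clf = LogisticRegression().fit(X, y)
print(confusion_matrix(y, clf.predict(X)))             # classification table
print("overall correct:", (clf.predict(X) == y).mean())
prob = clf.predict_proba(X)[:, 1]
print("AUC:", round(roc_auc_score(y, prob), 3))
fpr, tpr, _ = roc_curve(y, prob)                       # points of the ROC curve
```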

  15. Methods for estimating the magnitude and frequency of floods for urban and small, rural streams in Georgia, South Carolina, and North Carolina, 2011

    USGS Publications Warehouse

    Feaster, Toby D.; Gotvald, Anthony J.; Weaver, J. Curtis

    2014-01-01

    Reliable estimates of the magnitude and frequency of floods are essential for the design of transportation and water-conveyance structures, flood-insurance studies, and flood-plain management. Such estimates are particularly important in densely populated urban areas. In order to increase the number of streamflow-gaging stations (streamgages) available for analysis, expand the geographical coverage that would allow for application of regional regression equations across State boundaries, and build on a previous flood-frequency investigation of rural U.S. Geological Survey streamgages in the Southeast United States, a multistate approach was used to update methods for determining the magnitude and frequency of floods in urban and small, rural streams that are not substantially affected by regulation or tidal fluctuations in Georgia, South Carolina, and North Carolina. The at-site flood-frequency analysis of annual peak-flow data for urban and small, rural streams (through September 30, 2011) included 116 urban streamgages and 32 small, rural streamgages, defined in this report as basins draining less than 1 square mile. The regional regression analysis included annual peak-flow data from an additional 338 rural streamgages previously included in U.S. Geological Survey flood-frequency reports and 2 additional rural streamgages in North Carolina that were not included in the previous Southeast rural flood-frequency investigation, for a total of 488 streamgages included in the urban and small, rural regression analysis. The at-site flood-frequency analyses for the urban and small, rural streamgages included the expected moments algorithm, which is a modification of the Bulletin 17B log-Pearson type III method for fitting the statistical distribution to the logarithms of the annual peak flows. Where applicable, the flood-frequency analysis also included low-outlier and historic information. Additionally, the application of a generalized Grubbs-Becks test allowed for the detection of multiple potentially influential low outliers. Streamgage basin characteristics were determined using geographical information system techniques. Initial ordinary least squares regression simulations reduced the number of basin characteristics on the basis of such factors as statistical significance, coefficient of determination, Mallows' Cp statistic, and ease of measurement of the explanatory variable. Application of generalized least squares regression techniques produced final predictive (regression) equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability flows for urban and small, rural ungaged basins for three hydrologic regions (HR1, Piedmont–Ridge and Valley; HR3, Sand Hills; and HR4, Coastal Plain), which previously had been defined from exploratory regression analysis in the Southeast rural flood-frequency investigation. Because of the limited availability of urban streamgages in the Coastal Plain of Georgia, South Carolina, and North Carolina, additional urban streamgages in Florida and New Jersey were used in the regression analysis for this region. Including the urban streamgages in New Jersey allowed for the expansion of the applicability of the predictive equations in the Coastal Plain from 3.5 to 53.5 square miles.
The average standard error of prediction for the predictive equations, which is a measure of the average accuracy of the regression equations when predicting flood estimates for ungaged sites, ranges from 25.0 percent for the 10-percent annual exceedance probability regression equation for the Piedmont–Ridge and Valley region to 73.3 percent for the 0.2-percent annual exceedance probability regression equation for the Sand Hills region.

  16. Influences on Academic Achievement Across High and Low Income Countries: A Re-Analysis of IEA Data.

    ERIC Educational Resources Information Center

    Heyneman, S.; Loxley, W.

    Previous international studies of science achievement put the data through a process of winnowing to decide which variables to keep in the final regressions. Variables were allowed to enter the final regressions if they met a minimum beta coefficient criterion of 0.05 averaged across rich and poor countries alike. The criterion was an average…

  17. A matrix-based method of moments for fitting the multivariate random effects model for meta-analysis and meta-regression

    PubMed Central

    Jackson, Dan; White, Ian R; Riley, Richard D

    2013-01-01

    Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
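
Since the abstract notes that the proposed estimator reduces to the DerSimonian-Laird method in one dimension, that univariate moment estimator is worth writing out; the study effects and within-study variances below are hypothetical.

```python
# Sketch: univariate DerSimonian-Laird method of moments for tau^2.
import numpy as np

y = np.array([0.30, 0.52, 0.11, 0.68, 0.40])   # study effect estimates
v = np.array([0.04, 0.09, 0.05, 0.16, 0.08])   # within-study variances

w = 1 / v
fixed = np.sum(w * y) / np.sum(w)
q = np.sum(w * (y - fixed) ** 2)                               # Cochran's Q
tau2 = max(0.0, (q - (len(y) - 1)) / (w.sum() - (w**2).sum() / w.sum()))
w_re = 1 / (v + tau2)                                          # random-effects weights
print(f"tau^2 = {tau2:.4f}, pooled effect = {np.sum(w_re * y) / np.sum(w_re):.4f}")
```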

  18. Advanced Statistics for Exotic Animal Practitioners.

    PubMed

    Hodsoll, John; Hellier, Jennifer M; Ryan, Elizabeth G

    2017-09-01

    Correlation and regression assess the association between 2 or more variables. This article reviews the core knowledge needed to understand these analyses, moving from visual analysis in scatter plots through correlation, simple and multiple linear regression, and logistic regression. Correlation estimates the strength and direction of a relationship between 2 variables. Regression can be considered more general and quantifies the numerical relationships between an outcome and 1 or multiple variables in terms of a best-fit line, allowing predictions to be made. Each technique is discussed with examples and the statistical assumptions underlying their correct application. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Do Our Means of Inquiry Match our Intentions?

    PubMed Central

    Petscher, Yaacov

    2016-01-01

    A key stage of the scientific method is the analysis of data, yet despite the variety of methods that are available to researchers they are most frequently distilled to a model that focuses on the average relation between variables. Although research questions are frequently conceived with broad inquiry in mind, most regression methods are limited in comprehensively evaluating how observed behaviors are related to each other. Quantile regression is a largely unknown yet well-suited analytic technique similar to traditional regression analysis, but allows for a more systematic approach to understanding complex associations among observed phenomena in the psychological sciences. Data from the National Education Longitudinal Study of 1988/2000 are used to illustrate how quantile regression overcomes the limitations of average associations in linear regression by showing that psychological well-being and sex each differentially relate to reading achievement depending on one’s level of reading achievement. PMID:27486410
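
The contrast drawn here is easy to reproduce in miniature: on heteroscedastic data, OLS reports a single average slope while quantile regression recovers slopes that differ across the outcome distribution. A sketch on synthetic data:

```python
# Sketch: OLS versus quantile regression when the predictor's effect
# differs across the outcome distribution.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 1000
x = rng.uniform(0, 4, n)
y = 1.0 * x + (1 + 0.5 * x) * rng.normal(size=n)   # spread grows with x
df = pd.DataFrame({"x": x, "y": y})

print("OLS slope:", round(smf.ols("y ~ x", df).fit().params["x"], 2))
for q in (0.1, 0.5, 0.9):
    slope = smf.quantreg("y ~ x", df).fit(q=q).params["x"]
    print(f"slope at quantile {q}: {slope:.2f}")
```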

  20. Methods for estimating peak-flow frequencies at ungaged sites in Montana based on data through water year 2011: Chapter F in Montana StreamStats

    USGS Publications Warehouse

    Sando, Roy; Sando, Steven K.; McCarthy, Peter M.; Dutton, DeAnn M.

    2016-04-05

    The U.S. Geological Survey (USGS), in cooperation with the Montana Department of Natural Resources and Conservation, completed a study to update methods for estimating peak-flow frequencies at ungaged sites in Montana based on peak-flow data at streamflow-gaging stations through water year 2011. The methods allow estimation of peak-flow frequencies (that is, peak-flow magnitudes, in cubic feet per second, associated with annual exceedance probabilities of 66.7, 50, 42.9, 20, 10, 4, 2, 1, 0.5, and 0.2 percent) at ungaged sites. The annual exceedance probabilities correspond to 1.5-, 2-, 2.33-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year recurrence intervals, respectively. Regional regression analysis is a primary focus of Chapter F of this Scientific Investigations Report, and regression equations for estimating peak-flow frequencies at ungaged sites in eight hydrologic regions in Montana are presented. The regression equations are based on analysis of peak-flow frequencies and basin characteristics at 537 streamflow-gaging stations in or near Montana and were developed using generalized least squares regression or weighted least squares regression. All of the data used in calculating basin characteristics that were included as explanatory variables in the regression equations were developed for and are available through the USGS StreamStats application (http://water.usgs.gov/osw/streamstats/) for Montana. StreamStats is a Web-based geographic information system application that was created by the USGS to provide users with access to an assortment of analytical tools that are useful for water-resource planning and management. The primary purpose of the Montana StreamStats application is to provide estimates of basin characteristics and streamflow characteristics for user-selected ungaged sites on Montana streams. The regional regression equations presented in this report chapter can be conveniently solved using the Montana StreamStats application. Selected results from this study were compared with results of previous studies. For most hydrologic regions, the regression equations reported for this study had lower mean standard errors of prediction (in percent) than the previously reported regression equations for Montana. The equations presented for this study are considered to be an improvement on the previously reported equations primarily because this study (1) included 13 more years of peak-flow data; (2) included 35 more streamflow-gaging stations than previous studies; (3) used a detailed geographic information system (GIS)-based definition of the regulation status of streamflow-gaging stations, which allowed better determination of the unregulated peak-flow records that are appropriate for use in the regional regression analysis; (4) included advancements in GIS and remote-sensing technologies, which allowed more convenient calculation of basin characteristics and investigation of many more candidate basin characteristics; and (5) included advancements in computational and analytical methods, which allowed more thorough and consistent data analysis. This report chapter also presents other methods for estimating peak-flow frequencies at ungaged sites. Two methods for estimating peak-flow frequencies at ungaged sites located on the same streams as streamflow-gaging stations are described.
Additionally, envelope curves relating maximum recorded annual peak flows to contributing drainage area for each of the eight hydrologic regions in Montana are presented and compared to a national envelope curve. In addition to providing general information on characteristics of large peak flows, the regional envelope curves can be used to assess the reasonableness of peak-flow frequency estimates determined using the regression equations.

  1. Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

    NASA Astrophysics Data System (ADS)

    Kristoufek, Ladislav

    2015-02-01

    We propose a framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances and is designed to estimate regression parameters at different scales and under potential nonstationarity and power-law correlations. The former feature allows effects between a pair of variables to be distinguished from different temporal perspectives; the latter features make the method a significant improvement over standard least squares estimation. Theoretical claims are supported by Monte Carlo simulations. The method is then applied to selected examples from physics, finance, environmental science, and epidemiology. For most of the studied cases, the relationship between the variables of interest varies strongly across scales.
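
The core estimator can be sketched compactly: the regression coefficient at scale s is the ratio of the detrended covariance of (x, y) to the detrended variance of x. The simplified implementation below uses non-overlapping windows for brevity; the published framework's windowing details may differ.

```python
# Sketch: scale-specific regression from detrended (co)variances.
import numpy as np

def detrended_cov(x, y, s):
    """Average detrended covariance of the integrated series at scale s."""
    X, Y = np.cumsum(x - x.mean()), np.cumsum(y - y.mean())
    t = np.arange(s)
    total, n_win = 0.0, len(X) // s
    for i in range(n_win):
        xs, ys = X[i * s:(i + 1) * s], Y[i * s:(i + 1) * s]
        rx = xs - np.polyval(np.polyfit(t, xs, 1), t)   # residuals around
        ry = ys - np.polyval(np.polyfit(t, ys, 1), t)   # a linear trend
        total += np.mean(rx * ry)
    return total / n_win

rng = np.random.default_rng(6)
x = rng.normal(size=4096)
y = 0.8 * x + rng.normal(size=4096)

for s in (16, 64, 256):
    beta = detrended_cov(x, y, s) / detrended_cov(x, x, s)
    print(f"scale {s}: beta = {beta:.3f}")   # should hover around 0.8
```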

  2. Application of logistic regression to case-control association studies involving two causative loci.

    PubMed

    North, Bernard V; Curtis, David; Sham, Pak C

    2005-01-01

    Models in which two susceptibility loci jointly influence the risk of developing disease can be explored using logistic regression analysis. Comparison of likelihoods of models incorporating different sets of disease model parameters allows inferences to be drawn regarding the nature of the joint effect of the loci. We have simulated case-control samples generated assuming different two-locus models and then analysed them using logistic regression. We show that this method is practicable and that, for the models we have used, it can be expected to allow useful inferences to be drawn from sample sizes consisting of hundreds of subjects. Interactions between loci can be explored, but interactive effects do not exactly correspond with classical definitions of epistasis. We have particularly examined the issue of the extent to which it is helpful to utilise information from a previously identified locus when investigating a second, unknown locus. We show that for some models conditional analysis can have substantially greater power while for others unconditional analysis can be more powerful. Hence we conclude that in general both conditional and unconditional analyses should be performed when searching for additional loci.

  3. Quantile Regression in the Study of Developmental Sciences

    PubMed Central

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596

  4. Relations among soil radon, environmental parameters, volcanic and seismic events at Mt. Etna (Italy)

    NASA Astrophysics Data System (ADS)

    Giammanco, S.; Ferrera, E.; Cannata, A.; Montalto, P.; Neri, M.

    2013-12-01

    From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol probe located on the upper NE flank of Mt. Etna volcano (Italy), close both to the Piano Provenzana fault and to the NE-Rift. Seismic, volcanological and radon data were analysed together with data on environmental parameters, such as air and soil temperature, barometric pressure, snow and rain fall. In order to find possible correlations among the above parameters, and hence to reveal possible anomalous trends in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R² = 0.31). When using 100-day time windows, the R² values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, similar to multivariate linear regression analysis, the summer period was characterised by the best correlation between radon data and environmental parameters. Lastly, wavelet coherence analysis allowed a multi-resolution coherence analysis of the acquired time series. This approach allowed the relations among different signals to be studied in either the time or the frequency domain. It confirmed the results of the previous methods, but also allowed correlations between radon and environmental parameters to be recognized at different observation scales (e.g., radon activity changed during strong precipitation, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Using the above analysis, two periods were recognized in which radon variations were significantly correlated with marked soil temperature changes and also with local seismic or volcanic activity. This allowed two different physical models of soil gas transport to be produced that explain the observed anomalies. Our work suggests that an accurate analysis of the relations among different signals requires different techniques that give complementary analytical information. In particular, wavelet analysis proved to be the most effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.

  5. Functional mixture regression.

    PubMed

    Yao, Fang; Fu, Yuejiao; Lee, Thomas C M

    2011-04-01

    In functional linear models (FLMs), the relationship between the scalar response and the functional predictor process is often assumed to be identical for all subjects. Motivated by both practical and methodological considerations, we relax this assumption and propose a new class of functional regression models that allow the regression structure to vary for different groups of subjects. By projecting the predictor process onto its eigenspace, the new functional regression model is simplified to a framework that is similar to classical mixture regression models. This leads to the proposed approach named as functional mixture regression (FMR). The estimation of FMR can be readily carried out using existing software implemented for functional principal component analysis and mixture regression. The practical necessity and performance of FMR are illustrated through applications to a longevity analysis of female medflies and a human growth study. Theoretical investigations concerning the consistent estimation and prediction properties of FMR along with simulation experiments illustrating its empirical properties are presented in the supplementary material available at Biostatistics online. Corresponding results demonstrate that the proposed approach could potentially achieve substantial gains over traditional FLMs.

  6. A secure distributed logistic regression protocol for the detection of rare adverse drug events

    PubMed Central

    El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat

    2013-01-01

    Background There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. PMID:22871397

  7. A secure distributed logistic regression protocol for the detection of rare adverse drug events.

    PubMed

    El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat

    2013-05-01

    There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models.

  8. Wavelet analysis for the study of the relations among soil radon anomalies, volcanic and seismic events: the case of Mt. Etna (Italy)

    NASA Astrophysics Data System (ADS)

    Ferrera, Elisabetta; Giammanco, Salvatore; Cannata, Andrea; Montalto, Placido

    2013-04-01

    From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol® probe located on the upper NE flank of Mt. Etna volcano, close both to the Piano Provenzana fault and to the NE-Rift. Seismic and volcanological data have been analyzed together with radon data. We also analyzed air and soil temperature, barometric pressure, snow and rain fall data. In order to find possible correlations among the above parameters, and hence to reveal possible anomalies in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R² = 0.31). When using 100-day time windows, the R² values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, similar to multivariate linear regression analysis, the summer period is characterised by the best correlation between radon data and environmental parameters. Lastly, wavelet coherence analysis allowed a multi-resolution coherence analysis of the acquired time series. This approach allows the relations among different signals to be studied in either the time or the frequency domain. It confirmed the results of the previous methods, but also allowed correlations between radon and environmental parameters to be recognized at different observation scales (e.g., radon activity changed during strong precipitation, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Our work suggests that an accurate analysis of the relations among distinct signals requires different techniques that give complementary analytical information. In particular, wavelet analysis proved to be very effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.

  9. Obscure phenomena in statistical analysis of quantitative structure-activity relationships. Part 1: Multicollinearity of physicochemical descriptors.

    PubMed

    Mager, P P; Rothe, H

    1990-10-01

    Multicollinearity of physicochemical descriptors leads to serious consequences in quantitative structure-activity relationship (QSAR) analysis, such as incorrect estimators and test statistics of regression coefficients of the ordinary least-squares (OLS) model usually applied to QSARs. Besides the diagnosis of simple collinearity, principal component regression analysis (PCRA) also allows the diagnosis of various types of multicollinearity. Only if the absolute values of the PCRA estimators are order statistics that decrease monotonically can the effects of multicollinearity be circumvented. Otherwise, obscure phenomena may be observed, such as good data recognition but low predictive power of a QSAR model.
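
A minimal sketch of principal component regression on deliberately collinear synthetic descriptors follows; it uses a scikit-learn pipeline as a stand-in, not the authors' own PCRA procedure.

```python
# Sketch: principal component regression under descriptor multicollinearity.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(7)
n = 80
z = rng.normal(size=n)
X = np.column_stack([z + 0.05 * rng.normal(size=n),   # two near-duplicate
                     z + 0.05 * rng.normal(size=n),   # descriptors
                     rng.normal(size=n)])
y = 2.0 * z + 0.5 * X[:, 2] + rng.normal(0, 0.5, n)

print("descriptor correlations:\n", np.corrcoef(X.T).round(2))
pcr = make_pipeline(PCA(n_components=2), LinearRegression()).fit(X, y)
print("PCR R^2:", round(pcr.score(X, y), 3))
```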

  10. The Partitioning of Variance Habits of Correlational and Experimental Psychologists: The Two Disciplines of Scientific Psychology Revisited

    ERIC Educational Resources Information Center

    Krus, David J.; Krus, Patricia H.

    1978-01-01

    The conceptual differences between coded regression analysis and traditional analysis of variance are discussed. Also, a modification of several SPSS routines is proposed which allows for direct interpretation of ANOVA and ANCOVA results in a form stressing the strength and significance of scrutinized relationships. (Author)
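
The equivalence at issue is easy to demonstrate: a one-way ANOVA is a regression on dummy-coded group membership. The sketch below uses statsmodels as a stand-in for the SPSS routines the article discusses; the data are simulated.

```python
# Sketch: one-way ANOVA run as a coded (dummy-variable) regression.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(8)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 30),
    "y": np.concatenate([rng.normal(m, 1, 30) for m in (0.0, 0.5, 1.0)]),
})

fit = smf.ols("y ~ C(group)", data=df).fit()   # dummy-coded regression
print(anova_lm(fit))                           # the familiar ANOVA table
print("R^2 (strength of the relationship):", round(fit.rsquared, 3))
```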

  11. Determining Predictor Importance in Hierarchical Linear Models Using Dominance Analysis

    ERIC Educational Resources Information Center

    Luo, Wen; Azen, Razia

    2013-01-01

    Dominance analysis (DA) is a method used to evaluate the relative importance of predictors that was originally proposed for linear regression models. This article proposes an extension of DA that allows researchers to determine the relative importance of predictors in hierarchical linear models (HLM). Commonly used measures of model adequacy in…

  12. Intermediate and advanced topics in multilevel logistic regression analysis

    PubMed Central

    Austin, Peter C; Merlo, Juan

    2017-01-01

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28543517
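
Two of the summary measures described have simple closed forms once a random-intercept model has been fitted; the sketch below assumes an intercept-variance value purely for illustration.

```python
# Sketch: latent-scale variance partition coefficient (VPC) and median
# odds ratio (MOR) from an assumed random-intercept variance.
import numpy as np
from scipy.stats import norm

sigma2_u = 0.35                                  # cluster-level intercept variance
vpc = sigma2_u / (sigma2_u + np.pi ** 2 / 3)     # share of latent-scale variance
mor = np.exp(np.sqrt(2 * sigma2_u) * norm.ppf(0.75))   # median odds ratio
print(f"VPC = {vpc:.3f}, MOR = {mor:.3f}")
```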

  13. Intermediate and advanced topics in multilevel logistic regression analysis.

    PubMed

    Austin, Peter C; Merlo, Juan

    2017-09-10

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

  14. Drought Patterns Forecasting using an Auto-Regressive Logistic Model

    NASA Astrophysics Data System (ADS)

    del Jesus, M.; Sheffield, J.; Méndez Incera, F. J.; Losada, I. J.; Espejo, A.

    2014-12-01

    Drought is characterized by a water deficit that may manifest across a large range of spatial and temporal scales. Drought may create important socio-economic consequences, often of catastrophic dimensions. A quantifiable definition of drought is elusive because, depending on its impacts, consequences and generation mechanism, different water deficit periods may be identified as a drought by virtue of some definitions but not by others. Droughts are linked to the water cycle and, although a climate change signal may not have emerged yet, they are also intimately linked to climate. In this work we develop an auto-regressive logistic model for drought prediction at different temporal scales that makes use of a spatially explicit framework. Our model allows the inclusion of covariates, continuous or categorical, to improve the performance of the auto-regressive component. Our approach makes use of dimensionality reduction (principal component analysis) and classification techniques (K-Means and maximum dissimilarity) to simplify the representation of complex climatic patterns, such as sea surface temperature (SST) and sea level pressure (SLP), while including information on their spatial structure, i.e. considering their spatial patterns. This procedure allows us to include in the analysis multivariate representations of complex climatic phenomena, such as the El Niño-Southern Oscillation. We also explore the impact of other climate-related variables such as sun spots. The model allows the uncertainty of the forecasts to be quantified and can be easily adapted to make predictions under future climatic scenarios. The framework presented here may be extended to other applications such as flash flood analysis or risk assessment of natural hazards.
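
The auto-regressive logistic idea can be sketched in one dimension: the drought indicator is regressed on its own lag plus a covariate standing in for a climate-pattern index. The data below are simulated; this is not the authors' spatial framework.

```python
# Sketch: auto-regressive logistic model with one climate covariate.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(13)
T = 1000
climate = np.sin(np.linspace(0, 20 * np.pi, T)) + 0.3 * rng.normal(size=T)
drought = np.zeros(T, dtype=int)
for t in range(1, T):   # persistence (lag term) plus climate forcing
    p = 1 / (1 + np.exp(-(-2 + 2.5 * drought[t - 1] + 1.0 * climate[t])))
    drought[t] = rng.binomial(1, p)

X = sm.add_constant(np.column_stack([drought[:-1], climate[1:]]))
res = sm.Logit(drought[1:], X).fit(disp=0)
print(res.params)   # roughly [-2, 2.5, 1.0]: intercept, lag, covariate
```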

  15. Factor analysis and multiple regression between topography and precipitation on Jeju Island, Korea

    NASA Astrophysics Data System (ADS)

    Um, Myoung-Jin; Yun, Hyeseon; Jeong, Chang-Sam; Heo, Jun-Haeng

    2011-11-01

    In this study, new factors that influence precipitation were extracted from geographic variables using factor analysis, allowing an accurate estimation of orographic precipitation. Correlation analysis was also used to examine the relationship between nine topographic variables from digital elevation models (DEMs) and the precipitation on Jeju Island. In addition, a spatial analysis was performed in order to verify the validity of the regression model. From the results of the correlation analysis, it was found that all of the topographic variables had a positive correlation with the precipitation. The relations between the variables also changed in accordance with a change in the precipitation duration. However, upon examining the correlation matrix, no significant relationship between the latitude and the aspect was found. According to the factor analysis, eight topographic variables (latitude being the exception) were found to have a direct influence on the precipitation, and three factors were extracted from these eight variables. By directly comparing the multiple regression model with the factors (model 1) to the multiple regression model with the topographic variables (model 3), it was found that model 1 did not violate the limits of statistical significance and multicollinearity. As such, model 1 was considered appropriate for estimating the precipitation when taking the topography into account. The multiple regression model using factor analysis was found to be the best method for estimating the orographic precipitation on Jeju Island.

  16. Logistic regression applied to natural hazards: rare event logistic regression with replications

    NASA Astrophysics Data System (ADS)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulation into rare event logistic regression. This technique, called rare event logistic regression with replications, combines the strengths of probabilistic and statistical methods and allows some of the limitations of previous developments to be overcome through robust variable selection. The technique was developed here for the analysis of landslide controlling factors, but the concept is widely applicable to statistical analyses of natural hazards.
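
The replication idea can be sketched as follows: refit the rare-event logistic regression on many random case-control subsamples and track how often each candidate factor comes out significant. The simulation below is illustrative, not the authors' implementation.

```python
# Sketch: rare-event logistic regression with replications, via repeated
# random subsampling of controls on simulated landslide-style data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 5000
X = rng.normal(size=(n, 4))                    # candidate controlling factors
p = 1 / (1 + np.exp(-(-4 + 1.5 * X[:, 0] + 0.8 * X[:, 1])))
y = rng.binomial(1, p)                         # rare event (~2% of cells)
cases, ctrls = np.where(y == 1)[0], np.where(y == 0)[0]

n_rep, hits = 200, np.zeros(X.shape[1])
for _ in range(n_rep):
    sub = np.concatenate([cases, rng.choice(ctrls, 5 * len(cases), replace=False)])
    res = sm.Logit(y[sub], sm.add_constant(X[sub])).fit(disp=0)
    hits += res.pvalues[1:] < 0.05             # skip the intercept
print("selection frequency per factor:", hits / n_rep)
```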

  17. Time-Gated Raman Spectroscopy for Quantitative Determination of Solid-State Forms of Fluorescent Pharmaceuticals.

    PubMed

    Lipiäinen, Tiina; Pessi, Jenni; Movahedi, Parisa; Koivistoinen, Juha; Kurki, Lauri; Tenhunen, Mari; Yliruusi, Jouko; Juppo, Anne M; Heikkonen, Jukka; Pahikkala, Tapio; Strachan, Clare J

    2018-04-03

    Raman spectroscopy is widely used for quantitative pharmaceutical analysis, but a common obstacle to its use is sample fluorescence masking the Raman signal. Time-gating provides an instrument-based method for rejecting fluorescence through temporal resolution of the spectral signal and allows Raman spectra of fluorescent materials to be obtained. An additional practical advantage is that analysis is possible in ambient lighting. This study assesses the efficacy of time-gated Raman spectroscopy for the quantitative measurement of fluorescent pharmaceuticals. Time-gated Raman spectroscopy with a 128 × (2) × 4 CMOS SPAD detector was applied for quantitative analysis of ternary mixtures of solid-state forms of the model drug, piroxicam (PRX). Partial least-squares (PLS) regression allowed quantification, with Raman-active time domain selection (based on visual inspection) improving performance. Model performance was further improved by using kernel-based regularized least-squares (RLS) regression with greedy feature selection in which the data use in both the Raman shift and time dimensions was statistically optimized. Overall, time-gated Raman spectroscopy, especially with optimized data analysis in both the spectral and time dimensions, shows potential for sensitive and relatively routine quantitative analysis of photoluminescent pharmaceuticals during drug development and manufacturing.
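
The quantification step, PLS regression from spectra to the fractions of three solid-state forms, can be sketched with scikit-learn on simulated ternary-mixture spectra; this is not the authors' time-gating pipeline, and the peak positions are invented.

```python
# Sketch: PLS regression from simulated Raman spectra to ternary composition.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(10)
shift = np.linspace(200, 1800, 400)                  # Raman shift axis (1/cm)
pure = np.stack([np.exp(-((shift - c) / 25) ** 2) for c in (450, 900, 1350)])
C = rng.dirichlet(np.ones(3), size=60)               # true ternary fractions
spectra = C @ pure + 0.01 * rng.normal(size=(60, len(shift)))

pred = cross_val_predict(PLSRegression(n_components=3), spectra, C, cv=5)
rmse = np.sqrt(((pred - C) ** 2).mean(axis=0))
print("cross-validated RMSE per form:", rmse.round(3))
```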

  18. Fast Quantitative Analysis Of Museum Objects Using Laser-Induced Breakdown Spectroscopy And Multiple Regression Algorithms

    NASA Astrophysics Data System (ADS)

    Lorenzetti, G.; Foresta, A.; Palleschi, V.; Legnaioli, S.

    2009-09-01

    The recent development of mobile instrumentation, specifically devoted to in situ analysis and study of museum objects, allows the acquisition of many LIBS spectra in a very short time. However, such a large amount of data calls for new analytical approaches that can guarantee prompt analysis of the results obtained. In this communication, we will present and discuss the advantages of statistical analytical methods, such as Partial Least Squares Multiple Regression algorithms, versus the classical calibration curve approach. PLS algorithms allow the composition of the objects under study to be obtained in real time; this feature of the method, compared with traditional off-line analysis of the data, is extremely useful for optimizing the measurement times and the number of points associated with the analysis. In fact, the real-time availability of compositional information makes it possible to concentrate attention on the most `interesting' parts of the object, without over-sampling zones that would not provide useful information for scholars or conservators. Some examples of the application of this method will be presented, including studies recently performed by the researchers of the Applied Laser Spectroscopy Laboratory on museum bronze objects.

  19. Meta-regression analysis of commensal and pathogenic Escherichia coli survival in soil and water.

    PubMed

    Franz, Eelco; Schijven, Jack; de Roda Husman, Ana Maria; Blaak, Hetty

    2014-06-17

    The extent to which pathogenic and commensal E. coli (respectively PEC and CEC) can survive, and which factors predominantly determine the rate of decline, are crucial issues from a public health point of view. The goal of this study was to provide a quantitative summary of the variability in E. coli survival in soil and water over a broad range of individual studies and to identify the most important sources of variability. To that end, a meta-regression analysis of available literature data was conducted. The considerable variation in reported decline rates indicated that the persistence of E. coli is not easily predictable. The meta-analysis demonstrated that for soil and water, the type of experiment (laboratory or field), the matrix subtype (type of water or soil), and temperature were the main factors included in the regression analysis. A higher average decline rate in soil was observed for PEC than for CEC. The regression models explained at best 57% of the variation in decline rate in soil and 41% of the variation in decline rate in water. This indicates that additional factors, not included in the current meta-regression analysis, are of importance but rarely reported. More complete reporting of experimental conditions may allow future inference on the global effects of these variables on the decline rate of E. coli.
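
    A meta-regression of this general shape can be sketched with inverse-variance weighted least squares; the decline rates, variances, and covariates below are invented for illustration.

      # Sketch of a meta-regression: study-level decline rates regressed on
      # study-level covariates, weighted by inverse within-study variance.
      import numpy as np
      import statsmodels.api as sm

      decline = np.array([0.10, 0.25, 0.18, 0.40, 0.05])   # e.g., log10 CFU/day
      var = np.array([0.01, 0.02, 0.01, 0.05, 0.01])       # within-study variances
      temp = np.array([5, 15, 10, 25, 4])                  # temperature covariate
      field = np.array([0, 1, 0, 1, 0])                    # lab=0, field=1

      X = sm.add_constant(np.column_stack([temp, field]))
      res = sm.WLS(decline, X, weights=1 / var).fit()
      print(res.params, res.rsquared)   # share of between-study variation explained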

  20. Length bias correction in gene ontology enrichment analysis using logistic regression.

    PubMed

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
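
    A minimal sketch of the adjustment: logistic regression of the differential-expression indicator on GO membership with log transcript length as a covariate. All data below are simulated, and the authors' actual model specification may differ in detail.

      # Length-bias adjustment via logistic regression with a length covariate.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(3)
      length = rng.lognormal(7, 1, 5000)                 # transcript lengths
      go = rng.binomial(1, 0.1, 5000)                    # GO-category membership
      p_de = 1 / (1 + np.exp(-(np.log(length) - 7)))     # longer => more often "DE"
      de = rng.binomial(1, p_de)

      X = sm.add_constant(np.column_stack([go, np.log(length)]))
      res = sm.Logit(de, X).fit(disp=0)
      print(res.params)   # GO effect on DE significance, conditional on length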

  1. A Regional Analysis of Non-Methane Hydrocarbons And Meteorology of The Rural Southeast United States

    DTIC Science & Technology

    1996-01-01

    [Fragmented text snippet.] The report fits a regression model for ozone in which the error term Zt is an ARIMA/ARMA time series, so that the model allows for autocorrelation in the residuals. The statistical output includes maximum likelihood estimates from the SAS ARIMA procedure, with the 1992 regression model applied to 1993 ozone data. The analysis was carried out at each of the sites, and the effect of synoptic meteorology on high ozone was examined using NOAA daily weather maps and climatic data.
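
    The fragment describes a regression model with ARMA errors fit to ozone data in SAS. A modern sketch of the same model class (regression with an exogenous covariate and ARMA(1,1) errors, on invented data):

      # Regression with ARMA errors via SARIMAX with exogenous regressors.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(4)
      n = 365
      temp = 20 + 10 * np.sin(2 * np.pi * np.arange(n) / 365) + rng.normal(size=n)
      eps = np.zeros(n)
      for t in range(1, n):
          eps[t] = 0.6 * eps[t - 1] + rng.normal()   # autocorrelated errors
      ozone = 30 + 1.2 * temp + eps

      model = sm.tsa.SARIMAX(ozone, exog=temp, order=(1, 0, 1))
      print(model.fit(disp=0).summary())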

  2. Simultaneous Force Regression and Movement Classification of Fingers via Surface EMG within a Unified Bayesian Framework.

    PubMed

    Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer

    2018-01-01

    This contribution presents a novel methodology for myoelectric-based control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. This method is tested using data from the publicly released NinaPro database which consists of sEMG recordings for 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance compared to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements, without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is performed via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.

  3. A general regression framework for a secondary outcome in case-control studies.

    PubMed

    Tchetgen Tchetgen, Eric J

    2014-01-01

    Modern case-control studies typically involve the collection of data on a large number of outcomes, often at considerable logistical and monetary expense. These data are of potentially great value to subsequent researchers, who, although not necessarily concerned with the disease that defined the case series in the original study, may want to use the available information for a regression analysis involving a secondary outcome. Because cases and controls are selected with unequal probability, regression analysis involving a secondary outcome generally must acknowledge the sampling design. In this paper, the author presents a new framework for the analysis of secondary outcomes in case-control studies. The approach is based on a careful re-parameterization of the conditional model for the secondary outcome given the case-control outcome and regression covariates, in terms of (a) the population regression of interest of the secondary outcome given covariates and (b) the population regression of the case-control outcome on covariates. The error distribution for the secondary outcome given covariates and case-control status is otherwise unrestricted. For a continuous outcome, the approach sometimes reduces to extending model (a) by including a residual of (b) as a covariate. However, the framework is general in the sense that models (a) and (b) can take any functional form, and the methodology allows for an identity, log or logit link function for model (a).
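
    For the continuous-outcome special case mentioned above, the residual-inclusion idea can be sketched as follows; this is a simplified illustration on simulated data, not the paper's full framework.

      # Sketch: extend model (a) for the secondary outcome by the residual
      # of the case-control regression (b). All data are invented.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(5)
      n = 1000
      x = rng.normal(size=n)
      d = rng.binomial(1, 1 / (1 + np.exp(-x)))        # case-control outcome
      y = 1 + 0.5 * x + 0.8 * d + rng.normal(size=n)   # secondary outcome

      Xb = sm.add_constant(x)
      resid_b = d - sm.Logit(d, Xb).fit(disp=0).predict(Xb)   # model (b) residual
      Xa = sm.add_constant(np.column_stack([x, resid_b]))     # model (a) + residual
      print(sm.OLS(y, Xa).fit().params)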

  4. Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data

    PubMed Central

    Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.

    2013-01-01

    Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689
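
    The offset equivalence invoked above is easy to demonstrate: Poisson regression of transition counts with a log(time-at-risk) offset estimates the same rate parameters as an exponential survival model. A sketch with invented sleep-transition data:

      # Poisson regression with a log(time) offset to model transition rates.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(6)
      time_at_risk = rng.uniform(1, 10, 200)            # minutes in a sleep state
      diseased = rng.binomial(1, 0.5, 200)
      counts = rng.poisson(0.3 * time_at_risk * np.exp(0.4 * diseased))

      X = sm.add_constant(diseased)
      res = sm.GLM(counts, X, family=sm.families.Poisson(),
                   offset=np.log(time_at_risk)).fit()
      print(np.exp(res.params[1]))   # transition-rate ratio, diseased vs not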

  5. Challenges Associated with Estimating Utility in Wet Age-Related Macular Degeneration: A Novel Regression Analysis to Capture the Bilateral Nature of the Disease.

    PubMed

    Hodgson, Robert; Reason, Timothy; Trueman, David; Wickstead, Rose; Kusel, Jeanette; Jasilek, Adam; Claxton, Lindsay; Taylor, Matthew; Pulikottil-Jacob, Ruth

    2017-10-01

    The estimation of utility values for the economic evaluation of therapies for wet age-related macular degeneration (AMD) is a particular challenge. Previous economic models in wet AMD have been criticized for failing to capture the bilateral nature of wet AMD by modelling visual acuity (VA) and utility values associated with the better-seeing eye only. Here we present a de novo regression analysis using generalized estimating equations (GEE) applied to a previous dataset of time trade-off (TTO)-derived utility values from a sample of the UK population that wore contact lenses to simulate visual deterioration in wet AMD. This analysis allows utility values to be estimated as a function of VA in both the better-seeing eye (BSE) and worse-seeing eye (WSE). VAs in both the BSE and WSE were found to be statistically significant (p < 0.05) when regressed separately. When included without an interaction term, only the coefficient for VA in the BSE was significant (p = 0.04), but when an interaction term between VA in the BSE and WSE was included, only the constant term (mean TTO utility value) was significant, potentially a result of the collinearity between the VA of the two eyes. The lack of both formal model fit statistics from the GEE approach and theoretical knowledge to support the superiority of one model over another makes it difficult to select the best model. Limitations of this analysis arise from the potential influence of collinearity between the VA of both eyes, and the use of contact lenses to reflect VA states to obtain the original dataset. Whilst further research is required to elicit more accurate utility values for wet AMD, this novel regression analysis provides a possible source of utility values to allow future economic models to capture the quality of life impact of changes in VA in both eyes. Funding: Novartis Pharmaceuticals UK Limited.
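
    A GEE of this general shape can be sketched as below; the variable names, data, and exchangeable working correlation are illustrative assumptions, not the study's exact specification.

      # GEE sketch: TTO utilities regressed on VA in both eyes, clustered by subject.
      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(7)
      df = pd.DataFrame({
          "subject": np.repeat(np.arange(50), 4),
          "va_bse": rng.uniform(0, 1, 200),     # logMAR, better-seeing eye
          "va_wse": rng.uniform(0, 1.5, 200),   # logMAR, worse-seeing eye
      })
      df["utility"] = 0.9 - 0.3 * df.va_bse - 0.1 * df.va_wse \
                      + rng.normal(0, 0.05, 200)

      res = smf.gee("utility ~ va_bse + va_wse", groups="subject", data=df,
                    cov_struct=sm.cov_struct.Exchangeable()).fit()
      print(res.params)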

  6. Clinical evaluation of a novel population-based regression analysis for detecting glaucomatous visual field progression.

    PubMed

    Kovalska, M P; Bürki, E; Schoetzau, A; Orguel, S F; Orguel, S; Grieshaber, M C

    2011-04-01

    The distinction of real progression from test variability in visual field (VF) series may be based on clinical judgment, on trend analysis based on follow-up of test parameters over time, or on identification of a significant change related to the mean of baseline exams (event analysis). The aim of this study was to compare a new population-based method (Octopus field analysis, OFA) with classic regression analyses and clinical judgment for detecting glaucomatous VF changes. 240 VF series of 240 patients with at least 9 consecutive examinations available were included in this study. They were independently classified by two experienced investigators. The results of this classification served as a reference for comparison for the following statistical tests: (a) t-test global, (b) r-test global, (c) regression analysis of 10 VF clusters and (d) point-wise linear regression analysis. 32.5 % of the VF series were classified as progressive by the investigators. The sensitivity and specificity were 89.7 % and 92.0 % for the r-test, and 73.1 % and 93.8 % for the t-test, respectively. In the point-wise linear regression analysis, the specificity was comparable (89.5 % versus 92 %), but the sensitivity was clearly lower than in the r-test (22.4 % versus 89.7 %) at a significance level of p = 0.01. A regression analysis for the 10 VF clusters showed a markedly higher sensitivity for the r-test (37.7 %) than the t-test (14.1 %) at a similar specificity (88.3 % versus 93.8 %) for a significant trend (p = 0.005). In regard to the cluster distribution, the paracentral clusters and the superior nasal hemifield progressed most frequently. The population-based regression analysis seems to be superior to the trend analysis in detecting VF progression in glaucoma, and may eliminate the drawbacks of the event analysis. Further, it may assist the clinician in the evaluation of VF series and may allow better visualization of the correlation between function and structure owing to VF clusters. © Georg Thieme Verlag KG Stuttgart · New York.

  7. Quantile regression applied to spectral distance decay

    USGS Publications Warehouse

    Rocchini, D.; Cade, B.S.

    2008-01-01

    Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. © 2008 IEEE.
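
    The contrast drawn above between the mean trend and an upper-quantile trend can be sketched as follows, using simulated similarity data with many zeroes:

      # OLS versus upper-quantile estimates of spectral distance decay.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(8)
      spec_dist = rng.uniform(0, 1, 300)
      sim = np.clip(0.8 - 0.6 * spec_dist - rng.exponential(0.3, 300), 0, None)
      df = pd.DataFrame({"sim": sim, "dist": spec_dist})

      print(smf.ols("sim ~ dist", df).fit().params["dist"])             # mean decay
      print(smf.quantreg("sim ~ dist", df).fit(q=0.9).params["dist"])   # upper decay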

  8. Spectral distance decay: Assessing species beta-diversity by quantile regression

    USGS Publications Warehouse

    Rocchini, D.; Nagendra, H.; Ghate, R.; Cade, B.S.

    2009-01-01

    Remotely sensed data represents key information for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance may allow us to quantitatively estimate how beta-diversity in species changes with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological datasets are characterized by a high number of zeroes that can add noise to the regression model. Quantile regression can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this paper, we used ordinary least squares (OLS) and quantile regression to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.05) considering both OLS and quantile regression. Nonetheless, the OLS regression estimate of mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when spectral distance approaches zero, was very low compared with the intercepts of upper quantiles, which detected high species similarity when habitats are more similar. In this paper we demonstrated the power of using quantile regressions applied to spectral distance decay in order to reveal species diversity patterns otherwise lost or underestimated by ordinary least squares regression. © 2009 American Society for Photogrammetry and Remote Sensing.

  9. Local linear regression for function learning: an analysis based on sample discrepancy.

    PubMed

    Cervellera, Cristiano; Macciò, Danilo

    2014-11-01

    Local linear regression models, a kind of nonparametric structure that locally performs a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and compute the empirical risk is considered. This makes it possible to treat equally the case in which the samples come from a random external source and the one in which the input space can be freely explored. Both consistency of the ERM procedure and approximating capabilities of the estimator are analyzed, proving conditions to ensure convergence. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided.
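
    A compact version of the estimator under study, kernel-weighted local linear regression, might look like this (the Gaussian kernel and bandwidth are illustrative choices):

      # Local linear regression: weighted least squares around a query point.
      import numpy as np

      def local_linear(x0, x, y, h=0.2):
          w = np.exp(-0.5 * ((x - x0) / h) ** 2)          # Gaussian kernel weights
          X = np.column_stack([np.ones_like(x), x - x0])
          beta = np.linalg.lstsq(X * w[:, None] ** 0.5,
                                 y * w ** 0.5, rcond=None)[0]
          return beta[0]                                   # local intercept = estimate

      rng = np.random.default_rng(9)
      x = rng.uniform(0, 1, 200)
      y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 200)
      print(local_linear(0.5, x, y))   # approx sin(pi) = 0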

  10. Background stratified Poisson regression analysis of cohort data.

    PubMed

    Richardson, David B; Langholz, Bryan

    2012-03-01

    Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
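
    The unconditional benchmark mentioned above (an indicator term for each background stratum plus the exposure of interest) can be sketched as follows on simulated cohort-style data; the authors' conditional approach avoids estimating the stratum coefficients but, as stated, yields identical exposure estimates.

      # Poisson regression with stratum indicators and a person-years offset.
      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(11)
      df = pd.DataFrame({
          "stratum": rng.integers(0, 20, 2000),    # age-sex-period strata
          "dose": rng.gamma(2, 0.1, 2000),         # radiation dose
          "pyr": rng.uniform(1, 30, 2000),         # person-years at risk
      })
      rate = 0.01 * np.exp(0.05 * df.stratum + 0.5 * df.dose)
      df["cases"] = rng.poisson(rate * df.pyr)

      res = smf.glm("cases ~ C(stratum) + dose", data=df,
                    family=sm.families.Poisson(),
                    offset=np.log(df.pyr)).fit()
      print(res.params["dose"])   # log-linear dose effect, adjusted for strata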

  11. Whole-genome regression and prediction methods applied to plant and animal breeding.

    PubMed

    de Los Campos, Gustavo; Hickey, John M; Pong-Wong, Ricardo; Daetwyler, Hans D; Calus, Mario P L

    2013-02-01

    Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade.
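
    One of the simplest parametric WGR models is ridge regression of phenotypes on all markers jointly, which under simple assumptions corresponds to RR-BLUP-style shrinkage; a hedged sketch on simulated genotypes:

      # Large-p, small-n whole-genome regression via ridge shrinkage.
      import numpy as np
      from sklearn.linear_model import Ridge
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(12)
      markers = rng.binomial(2, 0.3, size=(300, 5000)).astype(float)  # n << p
      effects = rng.normal(0, 0.05, 5000)
      pheno = markers @ effects + rng.normal(size=300)

      wgr = Ridge(alpha=1000.0)   # heavy shrinkage for p >> n
      print(cross_val_score(wgr, markers, pheno, cv=5, scoring="r2").mean())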

  12. Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding

    PubMed Central

    de los Campos, Gustavo; Hickey, John M.; Pong-Wong, Ricardo; Daetwyler, Hans D.; Calus, Mario P. L.

    2013-01-01

    Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade. PMID:22745228

  13. Regression modeling of ground-water flow

    USGS Publications Warehouse

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to give the student practice with nearly all of the methods presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  14. An example of complex modelling in dentistry using Markov chain Monte Carlo (MCMC) simulation.

    PubMed

    Helfenstein, Ulrich; Menghini, Giorgio; Steiner, Marcel; Murati, Francesca

    2002-09-01

    In the usual regression setting one regression line is computed for a whole data set. In a more complex situation, each person may be observed for example at several points in time and thus a regression line might be calculated for each person. Additional complexities, such as various forms of errors in covariables may make a straightforward statistical evaluation difficult or even impossible. During recent years methods have been developed allowing convenient analysis of problems where the data and the corresponding models show these and many other forms of complexity. The methodology makes use of a Bayesian approach and Markov chain Monte Carlo (MCMC) simulations. The methods allow the construction of increasingly elaborate models by building them up from local sub-models. The essential structure of the models can be represented visually by directed acyclic graphs (DAG). This attractive property allows communication and discussion of the essential structure and the substantial meaning of a complex model without needing algebra. After presentation of the statistical methods an example from dentistry is presented in order to demonstrate their application and use. The dataset of the example had a complex structure; each of a set of children was followed up over several years. The number of new fillings in permanent teeth had been recorded at several ages. The dependent variables were markedly different from the normal distribution and could not be transformed to normality. In addition, explanatory variables were assumed to be measured with different forms of error. Illustration of how the corresponding models can be estimated conveniently via MCMC simulation, in particular, 'Gibbs sampling', using the freely available software BUGS is presented. In addition, how the measurement error may influence the estimates of the corresponding coefficients is explored. It is demonstrated that the effect of the independent variable on the dependent variable may be markedly underestimated if the measurement error is not taken into account ('regression dilution bias'). Markov chain Monte Carlo methods may be of great value to dentists in allowing analysis of data sets which exhibit a wide range of different forms of complexity.
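
    The regression dilution bias mentioned above is easy to demonstrate numerically: adding measurement error to the covariate attenuates the estimated slope.

      # Quick numeric illustration of regression dilution bias.
      import numpy as np

      rng = np.random.default_rng(13)
      x = rng.normal(size=5000)
      y = 2.0 * x + rng.normal(size=5000)
      x_err = x + rng.normal(0, 1.0, 5000)          # error-prone covariate

      slope_true = np.polyfit(x, y, 1)[0]
      slope_attenuated = np.polyfit(x_err, y, 1)[0]  # ~ 2 * var(x)/(var(x)+1)
      print(slope_true, slope_attenuated)            # approx 2.0 vs approx 1.0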

  15. Non-stationary hydrologic frequency analysis using B-spline quantile regression

    NASA Astrophysics Data System (ADS)

    Nasri, B.; Bouezmarni, T.; St-Hilaire, A.; Ouarda, T. B. M. J.

    2017-11-01

    Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information on planning, design and management of hydraulic and water resources systems under the assumption of stationarity. However, with increasing evidence of climate change, the assumption of stationarity, which is a prerequisite for traditional frequency analysis, may no longer hold, and hence the results of conventional analysis become questionable. In this study, we consider a framework for frequency analysis of extremes based on B-spline quantile regression, which allows data to be modelled in the presence of non-stationarity and/or linear and non-linear dependence on covariates. A Markov Chain Monte Carlo (MCMC) algorithm was used to estimate quantiles and their posterior distributions. A coefficient of determination and a Bayesian information criterion (BIC) for quantile regression are used in order to select the best model, i.e. for each quantile, we choose the degree and number of knots of the adequate B-spline quantile regression model. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in the variable of interest and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for annual maximum and minimum discharge with high annual non-exceedance probabilities.
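
    A minimal sketch of the modelling idea (a high quantile expressed as a B-spline function of a covariate), using a made-up climate index; the study's Bayesian MCMC estimation and BIC-based knot selection are not reproduced here.

      # B-spline quantile regression for a high quantile of annual maxima.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(14)
      clim = rng.normal(size=80)                               # climate index
      qmax = 100 + 30 * np.tanh(clim) + rng.gumbel(0, 10, 80)  # annual maxima
      df = pd.DataFrame({"qmax": qmax, "clim": clim})

      # bs() from patsy builds the B-spline basis inside the formula
      res = smf.quantreg("qmax ~ bs(clim, df=4)", df).fit(q=0.95)
      print(res.params)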

  16. Application of linear regression analysis in accuracy assessment of rolling force calculations

    NASA Astrophysics Data System (ADS)

    Poliak, E. I.; Shim, M. K.; Kim, G. S.; Choo, W. Y.

    1998-10-01

    Efficient operation of the computational models employed in process control systems requires periodic assessment of the accuracy of their predictions. Linear regression is proposed as a tool that allows systematic and random prediction errors to be separated from those related to measurement. A quantitative characteristic of the model's predictive ability is introduced in addition to standard statistical tests for model adequacy. Rolling force calculations are considered as an example application. However, the outlined approach can be used to assess the performance of any computational model.
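
    One hedged reading of the proposal is to regress measured values on model predictions: an intercept different from 0 flags systematic error, a slope different from 1 flags proportional error, and residual scatter reflects random error. A sketch with invented rolling-force data:

      # Regressing measurements on predictions to assess model accuracy.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(15)
      predicted = rng.uniform(800, 1600, 120)          # model output, tonnes
      measured = 40 + 0.95 * predicted + rng.normal(0, 25, 120)

      res = sm.OLS(measured, sm.add_constant(predicted)).fit()
      print(res.params)   # near (0, 1) indicates an adequate force model
      print(res.bse)      # standard errors for the adequacy tests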

  17. Design, innovation, and rural creative places: Are the arts the cherry on top, or the secret sauce?

    PubMed

    Wojan, Timothy R; Nichols, Bonnie

    2018-01-01

    Creative class theory explains the positive relationship between the arts and commercial innovation as the mutual attraction of artists and other creative workers by an unobserved creative milieu. This study explores alternative theories for rural settings, by analyzing establishment-level survey data combined with data on the local arts scene. The study identifies the local contextual factors associated with a strong design orientation, and estimates the impact that a strong design orientation has on the local economy. Data on innovation and design come from a nationally representative sample of establishments in tradable industries. Latent class analysis allows identifying unobserved subpopulations comprised of establishments with different design and innovation orientations. Logistic regression allows estimating the association between an establishment's design orientation and local contextual factors. A quantile instrumental variable regression allows assessing the robustness of the logistic regression results with respect to endogeneity. An estimate of design orientation at the local level derived from the survey is used to examine variation in economic performance during the period of recovery from the Great Recession (2010-2014). Three distinct innovation (substantive, nominal, and non-innovators) and design orientations (design-integrated, "design last finish," and no systematic approach to design) are identified. Innovation- and design-intensive establishments were identified in both rural and urban areas. Rural design-integrated establishments tended to locate in counties with more highly educated workforces and containing at least one performing arts organization. A quantile instrumental variable regression confirmed that the logistic regression result is robust to endogeneity concerns. Finally, rural areas characterized by design-integrated establishments experienced faster growth in wages relative to rural areas characterized by establishments using no systematic approach to design.

  18. Design, innovation, and rural creative places: Are the arts the cherry on top, or the secret sauce?

    PubMed Central

    Wojan, Timothy R.; Nichols, Bonnie

    2018-01-01

    Objective Creative class theory explains the positive relationship between the arts and commercial innovation as the mutual attraction of artists and other creative workers by an unobserved creative milieu. This study explores alternative theories for rural settings, by analyzing establishment-level survey data combined with data on the local arts scene. The study identifies the local contextual factors associated with a strong design orientation, and estimates the impact that a strong design orientation has on the local economy. Method Data on innovation and design come from a nationally representative sample of establishments in tradable industries. Latent class analysis allows identifying unobserved subpopulations comprised of establishments with different design and innovation orientations. Logistic regression allows estimating the association between an establishment’s design orientation and local contextual factors. A quantile instrumental variable regression allows assessing the robustness of the logistic regression results with respect to endogeneity. An estimate of design orientation at the local level derived from the survey is used to examine variation in economic performance during the period of recovery from the Great Recession (2010–2014). Results Three distinct innovation (substantive, nominal, and non-innovators) and design orientations (design-integrated, “design last finish,” and no systematic approach to design) are identified. Innovation- and design-intensive establishments were identified in both rural and urban areas. Rural design-integrated establishments tended to locate in counties with more highly educated workforces and containing at least one performing arts organization. A quantile instrumental variable regression confirmed that the logistic regression result is robust to endogeneity concerns. Finally, rural areas characterized by design-integrated establishments experienced faster growth in wages relative to rural areas characterized by establishments using no systematic approach to design. PMID:29489884

  19. Neuropsychometric tests in cross sectional and longitudinal studies - a regression analysis of ADAS-cog, SKT and MMSE.

    PubMed

    Ihl, R; Grass-Kapanke, B; Jänner, M; Weyer, G

    1999-11-01

    In clinical and drug studies, different neuropsychometric tests are used. So far, no empirical data have been published to compare studies using different tests. The purpose of this study was to calculate a regression formula allowing a comparison of cross-sectional and longitudinal data from three neuropsychometric tests that are frequently used in drug studies (Alzheimer's Disease Assessment Scale, ADAS-cog; Syndrom Kurz Test, SKT; Mini Mental State Examination, MMSE). 177 patients with dementia according to ICD10 criteria were studied for the cross sectional and 61 for the longitudinal analysis. Correlations and linear regressions were calculated between tests. Significance was proven with ANOVA and t-tests using the SPSS statistical package. Significant Spearman correlations and slopes in the regression occurred in the cross sectional analysis (ADAS-cog-SKT r(s) = 0.77, slope = 0.45, SKT-ADAS-cog slope = 1.3, r2 = 0.59; ADAS-cog-MMSE r2 = 0.76, slope = -0.42, MMSE-ADAS-cog slope = -1.5, r2 = 0.64; MMSE-SKT r(s) = -0.79, slope = -0.87, SKT-MMSE slope = -0.71, r2 = 0.62; p<0.001 after Bonferroni correction; N = 177) and in the longitudinal analysis (SKT-ADAS-cog, r(s) = 0.48, slope = 0.69, ADAS-cog-SKT slope = 0.69, p<0.001, r2 = 0.32, MMSE-SKT, r(s) = 0.44, slope = -0.41, SKT-MMSE, slope = -0.55, p<0.001, r2 = 0.21). The results allow calculation of ADAS-scores when SKT scores are given, and vice versa. In longitudinal studies or in the course of the disease, scores assessed with the ADAS-cog and the SKT may now be statistically compared. In all comparisons, bottom and ceiling effects of the tests have to be taken into account.

  20. Determination of the chemical parameters and manufacturer of divins from their broadband transmission spectra

    NASA Astrophysics Data System (ADS)

    Khodasevich, M. A.; Sinitsyn, G. V.; Skorbanova, E. A.; Rogovaya, M. V.; Kambur, E. I.; Aseev, V. A.

    2016-06-01

    Analysis of multiparametric data on the transmission spectra of 24 divins (Moldovan cognacs) in the 190-2600 nm range allows identification of outliers and their removal from the sample before further analysis. Principal component analysis and a classification tree with a single-rank predictor constructed in the 2D space of principal components allow classification of divin manufacturers. It is shown that the accuracy of the syringaldehyde, ethyl acetate, vanillin, and gallic acid concentrations in divins calculated with regression on latent structures depends on the sample volume and is 3, 6, 16, and 20%, respectively, which is acceptable for the application.
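
    The classification pipeline described above (principal components, then a tree in the 2D PC space) can be sketched as follows; the spectra below are simulated placeholders.

      # PCA projection followed by a decision tree for manufacturer classification.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.tree import DecisionTreeClassifier

      rng = np.random.default_rng(16)
      spectra = rng.normal(size=(24, 1200))     # 24 samples, broadband spectra
      maker = rng.integers(0, 3, 24)            # three manufacturers
      spectra += maker[:, None] * 0.5           # injected class signal

      pc = PCA(n_components=2).fit_transform(spectra)
      tree = DecisionTreeClassifier(max_depth=2).fit(pc, maker)
      print(tree.score(pc, maker))              # training accuracy in PC space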

  1. Failure of Standard Training Sets in the Analysis of Fast-Scan Cyclic Voltammetry Data.

    PubMed

    Johnson, Justin A; Rodeberg, Nathan T; Wightman, R Mark

    2016-03-16

    The use of principal component regression, a multivariate calibration method, in the analysis of in vivo fast-scan cyclic voltammetry data allows for separation of overlapping signal contributions, permitting evaluation of the temporal dynamics of multiple neurotransmitters simultaneously. To accomplish this, the technique relies on information about current-concentration relationships across the scan-potential window gained from analysis of training sets. The ability of the constructed models to resolve analytes depends critically on the quality of these data. Recently, the use of standard training sets obtained under conditions other than those of the experimental data collection (e.g., with different electrodes, animals, or equipment) has been reported. This study evaluates the analyte resolution capabilities of models constructed using this approach from both a theoretical and experimental viewpoint. A detailed discussion of the theory of principal component regression is provided to inform this discussion. The findings demonstrate that the use of standard training sets leads to misassignment of the current-concentration relationships across the scan-potential window. This directly results in poor analyte resolution and, consequently, inaccurate quantitation, which may lead to erroneous conclusions being drawn from experimental data. Thus, it is strongly advocated that training sets be obtained under the experimental conditions to allow for accurate data analysis.
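
    Principal component regression itself is compactly expressed as PCA followed by linear regression on the scores; a sketch with synthetic voltammograms (the study's point being that real training scans must come from the same experimental conditions as the test data):

      # Minimal principal component regression (PCR) pipeline.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.linear_model import LinearRegression
      from sklearn.pipeline import make_pipeline

      rng = np.random.default_rng(17)
      currents = rng.normal(size=(40, 850))         # 40 training scans x 850 points
      conc = rng.uniform(0, 1, size=(40, 2))        # e.g., dopamine and pH shifts
      currents += conc @ rng.normal(size=(2, 850))  # analyte current contributions

      pcr = make_pipeline(PCA(n_components=4), LinearRegression())
      pcr.fit(currents, conc)
      print(pcr.predict(currents[:3]))              # predicted concentrations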

  2. Social network type and morale in old age.

    PubMed

    Litwin, H

    2001-08-01

    The aim of this research was to derive network types among an elderly population and to examine the relationship of network type to morale. Secondary analysis of data compiled by the Israeli Central Bureau of Statistics (n = 2,079) was employed, and network types were derived through K-means cluster analysis. Respondents' morale scores were regressed on network types, controlling for background and health variables. Five network types were derived. Respondents in diverse or friends networks reported the highest morale; those in exclusively family or restricted networks had the lowest. Multivariate regression analysis underscored that certain network types were second among the study variables in predicting respondents' morale, preceded only by disability level (Adjusted R(2) =.41). Classification of network types allows consideration of the interpersonal environments of older people in relation to outcomes of interest. The relative effects on morale of elective versus obligated social ties, evident in the current analysis, are a case in point.

  3. Multivariate analysis of fMRI time series: classification and regression of brain responses using machine learning.

    PubMed

    Formisano, Elia; De Martino, Federico; Valente, Giancarlo

    2008-09-01

    Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.
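
    A tiny illustration of the "brain reading" setup with a linear support vector machine on simulated multivoxel patterns (preprocessing and the relevance vector machine variant are outside this sketch):

      # Classifying simulated activity patterns into two cognitive states.
      import numpy as np
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(18)
      patterns = rng.normal(size=(120, 2000))    # trials x voxels
      labels = rng.integers(0, 2, 120)           # two cognitive states
      patterns[labels == 1, :50] += 0.3          # weak, spatially distributed signal

      print(cross_val_score(SVC(kernel="linear"), patterns, labels, cv=5).mean())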

  4. Something old, something new, something borrowed, something blue: a framework for the marriage of health econometrics and cost-effectiveness analysis.

    PubMed

    Hoch, Jeffrey S; Briggs, Andrew H; Willan, Andrew R

    2002-07-01

    Economic evaluation is often seen as a branch of health economics divorced from mainstream econometric techniques. Instead, it is perceived as relying on statistical methods for clinical trials. Furthermore, the statistic of interest in cost-effectiveness analysis, the incremental cost-effectiveness ratio, is not amenable to regression-based methods, hence the traditional reliance on comparing aggregate measures across the arms of a clinical trial. In this paper, we explore the potential for health economists undertaking cost-effectiveness analysis to exploit the plethora of established econometric techniques through the use of the net-benefit framework - a recently suggested reformulation of the cost-effectiveness problem that avoids the reliance on cost-effectiveness ratios and their associated statistical problems. This allows the formulation of the cost-effectiveness problem within a standard regression type framework. We provide an example with empirical data to illustrate how a regression type framework can enhance the net-benefit method. We go on to suggest that practical advantages of the net-benefit regression approach include being able to use established econometric techniques, adjust for imperfect randomisation, and identify important subgroups in order to estimate the marginal cost-effectiveness of an intervention. Copyright 2002 John Wiley & Sons, Ltd.
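
    A sketch of the net-benefit regression idea: at a willingness-to-pay lambda, each patient's net benefit nb = lambda*effect - cost is regressed on the treatment arm, and the arm coefficient estimates the incremental net benefit. The data below are invented.

      # Net-benefit regression on simulated trial data.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(19)
      arm = rng.binomial(1, 0.5, 400)                      # 0=control, 1=treated
      cost = 5000 + 1500 * arm + rng.gamma(2, 500, 400)
      qaly = 0.7 + 0.05 * arm + rng.normal(0, 0.1, 400)

      lam = 50000                                          # willingness to pay
      nb = lam * qaly - cost
      res = sm.OLS(nb, sm.add_constant(arm)).fit()
      print(res.params[1], res.conf_int()[1])              # INB and its CI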

  5. An analysis of equity in Brazilian health system financing.

    PubMed

    Ugá, Maria Alicia Domínguez; Santos, Isabela Soares

    2007-01-01

    Health care in Brazil is financed from many sources--taxes on income, real property, sales of goods and services, and financial transactions; private insurance purchased by households and firms; and out-of-pocket payments by households. Data on household budgets and tax revenues allow the burden of each source except firms' insurance purchases for their employees to be allocated across deciles of adjusted per capita household income, indicating the progressivity or regressivity of each kind of payment. Overall, financing is approximately neutral, with progressive public finance offsetting regressive payments. This last form of finance pushes some households into poverty.

  6. Comparative study of human blood Raman spectra and biochemical analysis of patients with cancer

    NASA Astrophysics Data System (ADS)

    Shamina, Lyudmila A.; Bratchenko, Ivan A.; Artemyev, Dmitry N.; Myakinin, Oleg O.; Moryatov, Alexander A.; Orlov, Andrey E.; Kozlov, Sergey V.; Zakharov, Valery P.

    2018-04-01

    In this study we measured the spectral features of blood by Raman spectroscopy and investigated the correlation between the obtained spectral data and the results of biochemical studies. Analysis of the spectra allows identification of informative spectral bands proportional to components whose content is associated with changes in body-fluid homeostasis under various pathological conditions. Regression analysis of the spectral data allows lung cancer to be discriminated from other tumors with an a posteriori probability of 88.3%. The potential of applying surface-enhanced Raman spectroscopy with the experimental setup used was assessed for further studies of body-fluid composition. The greatest signal amplification was achieved for the gold substrate with a surface roughness of 1 μm. In general, the developed approach to body-fluid analysis provides the basis of a useful and minimally invasive method of screening for pathologies.

  7. Analysis of the labor productivity of enterprises via quantile regression

    NASA Astrophysics Data System (ADS)

    Türkan, Semra

    2017-07-01

    In this study, we analyzed the factors that affect the performance of Turkey's Top 500 Industrial Enterprises using quantile regression. The labor productivity of the enterprises is taken as the dependent variable and their assets as the independent variable. The distribution of labor productivity is right-skewed. When the dependent distribution is skewed, linear regression cannot capture important aspects of the relationship between the dependent variable and its predictors, because it models only the conditional mean. Hence quantile regression, which allows any quantile of the dependent distribution to be modelled, including the median, is useful: it examines whether the relationships between the dependent and independent variables differ for low, medium, and high percentiles. The analysis shows that the effect of total assets is relatively constant over the entire distribution, except in the upper tail, where it has a moderately stronger effect.

  8. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    NASA Astrophysics Data System (ADS)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
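
    The two model families compared above can be sketched side by side on invented TBM data:

      # Linear model and CART regression tree predicting penetration rate.
      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.tree import DecisionTreeRegressor

      rng = np.random.default_rng(20)
      X = np.column_stack([rng.uniform(5, 25, 300),    # thrust per cutter
                           rng.uniform(1, 6, 300)])    # rock strength index
      pr = 2 + 0.15 * X[:, 0] - 0.4 * X[:, 1] + rng.normal(0, 0.5, 300)

      print(LinearRegression().fit(X, pr).score(X, pr))           # linear R^2
      print(DecisionTreeRegressor(max_depth=4).fit(X, pr).score(X, pr))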

  9. Granger causality--statistical analysis under a configural perspective.

    PubMed

    von Eye, Alexander; Wiedermann, Wolfgang; Mun, Eun-Young

    2014-03-01

    The concept of Granger causality can be used to examine putative causal relations between two series of scores. Based on regression models, it is asked whether one series can be considered the cause for the second series. In this article, we propose extending the pool of methods available for testing hypotheses that are compatible with Granger causation by adopting a configural perspective. This perspective allows researchers to assume that effects exist for specific categories only or for specific sectors of the data space, but not for other categories or sectors. Configural Frequency Analysis (CFA) is proposed as the method of analysis from a configural perspective. CFA base models are derived for the exploratory analysis of Granger causation. These models are specified so that they parallel the regression models used for variable-oriented analysis of hypotheses of Granger causation. An example from the development of aggression in adolescence is used. The example shows that only one pattern of change in aggressive impulses over time Granger-causes change in physical aggression against peers.
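
    The regression-based building block that the configural extension starts from can be sketched with a standard Granger test on synthetic series:

      # Does series x Granger-cause series y? F tests on lagged regressions.
      import numpy as np
      from statsmodels.tsa.stattools import grangercausalitytests

      rng = np.random.default_rng(21)
      n = 300
      x = rng.normal(size=n)
      y = np.zeros(n)
      for t in range(1, n):
          y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.normal()

      data = np.column_stack([y, x])          # test: does column 2 cause column 1?
      grangercausalitytests(data, maxlag=2)   # prints F tests per lag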

  10. Use of generalized ordered logistic regression for the analysis of multidrug resistance data.

    PubMed

    Agga, Getahun E; Scott, H Morgan

    2015-10-01

    Statistical analysis of antimicrobial resistance data largely focuses on each individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive, often with tabular or graphical presentations. Here we report the applicability of the generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli isolates, obtained from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibility to 15 antimicrobials and classified as resistant or susceptible to each. The 15 antimicrobial agents tested were grouped into eight antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant, ranging from 0 to 8. The proportional-odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment), not for the effects of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of the MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR after adjusting for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions of ordered logistic regression are violated. Published by Elsevier B.V.
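
    As a hedged sketch of the starting point, a proportional-odds model for the 0-8 MDR classes is shown below on invented data; the partially constrained generalized model of the study relaxes the assumption for selected covariates, a step not reproduced here.

      # Proportional-odds (ordered logit) model for MDR class counts.
      import numpy as np
      import pandas as pd
      from statsmodels.miscmodels.ordinal_model import OrderedModel

      rng = np.random.default_rng(22)
      df = pd.DataFrame({
          "ctc": rng.binomial(1, 0.5, 600),       # chlortetracycline supplement
          "copper": rng.binomial(1, 0.5, 600),
      })
      df["mdr"] = np.clip(rng.poisson(2 + df.ctc), 0, 8)   # resistant classes 0-8

      res = OrderedModel(df["mdr"], df[["ctc", "copper"]],
                         distr="logit").fit(method="bfgs", disp=0)
      print(res.params)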

  11. Bayesian generalized least squares regression with application to log Pearson type 3 regional skew estimation

    NASA Astrophysics Data System (ADS)

    Reis, D. S.; Stedinger, J. R.; Martins, E. S.

    2005-10-01

    This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.

  12. Above-ground biomass of mangrove species. I. Analysis of models

    NASA Astrophysics Data System (ADS)

    Soares, Mário Luiz Gomes; Schaeffer-Novelli, Yara

    2005-10-01

    This study analyzes the above-ground biomass of Rhizophora mangle and Laguncularia racemosa located in the mangroves of Bertioga (SP) and Guaratiba (RJ), Southeast Brazil. Its purpose is to determine the best regression model to estimate total above-ground biomass and compartment (leaves, reproductive parts, twigs, branches, trunk and prop roots) biomass indirectly. To do this, we used structural measurements such as height, diameter at breast height (DBH), and crown area. A combination of regression types with several compositions of independent variables generated 2,272 models that were then tested. Subsequent analysis of the models indicated that the biomass of reproductive parts, branches, and prop roots showed great variability, probably because of environmental factors and seasonality (in the case of reproductive parts). It also indicated the superiority of multiple regression for estimating above-ground biomass, as it allows researchers to consider several aspects that affect above-ground biomass, especially the influence of environmental factors. This was attested by the models that estimated the biomass of the crown compartments.
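
    A common member of the model family tested in such studies is a log-log multiple regression on structural measurements; a sketch with invented mangrove-style data:

      # Allometric (log-log) regression of biomass on DBH and height.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(23)
      dbh = rng.uniform(2, 30, 120)                    # cm
      height = 1.5 * dbh ** 0.6 * np.exp(rng.normal(0, 0.1, 120))
      biomass = 0.25 * dbh ** 2.2 * height ** 0.5 * np.exp(rng.normal(0, 0.2, 120))

      X = sm.add_constant(np.column_stack([np.log(dbh), np.log(height)]))
      res = sm.OLS(np.log(biomass), X).fit()
      print(res.params, res.rsquared)   # allometric exponents recovered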

  13. A comparison of regression methods for model selection in individual-based landscape genetic analysis

    Treesearch

    Andrew J. Shirk; Erin L. Landguth; Samuel A. Cushman

    2017-01-01

    Anthropogenic migration barriers fragment many populations and limit the ability of species to respond to climate-induced biome shifts. Conservation actions designed to conserve habitat connectivity and mitigate barriers are needed to unite fragmented populations into larger, more viable metapopulations, and to allow species to track their climate envelope over time....

  14. A Rapid Soils Analysis Kit

    DTIC Science & Technology

    2008-03-01

    [Fragmented report snippet; figure captions reference moisture content-dry density Proctor curves and moisture-density data scatter.] Built-in higher-order regression equations allow the user to visualize complete curves for Proctor density and as-built California Bearing Ratio. Requirements involving soil include optimum moisture content (OMC) and maximum dry density (MDD) as determined from a laboratory compaction, or Proctor, test.

  15. Toward Best Practices in Analyzing Datasets with Missing Data: Comparisons and Recommendations

    ERIC Educational Resources Information Center

    Johnson, David R.; Young, Rebekah

    2011-01-01

    Although several methods have been developed to allow for the analysis of data in the presence of missing values, no clear guide exists to help family researchers in choosing among the many options and procedures available. We delineate these options and examine the sensitivity of the findings in a regression model estimated in three random…

  16. Regression and Geostatistical Techniques: Considerations and Observations from Experiences in NE-FIA

    Treesearch

    Rachel Riemann; Andrew Lister

    2005-01-01

    Maps of forest variables improve our understanding of the forest resource by allowing us to view and analyze it spatially. The USDA Forest Service's Northeastern Forest Inventory and Analysis unit (NE-FIA) has used geostatistical techniques, particularly stochastic simulation, to produce maps and spatial data sets of FIA variables. That work underscores the...

  17. Spatiotemporal Bayesian analysis of Lyme disease in New York state, 1990-2000.

    PubMed

    Chen, Haiyan; Stratton, Howard H; Caraco, Thomas B; White, Dennis J

    2006-07-01

    Mapping ordinarily increases our understanding of nontrivial spatial and temporal heterogeneities in disease rates. However, the large number of parameters required by the corresponding statistical models often complicates detailed analysis. This study investigates the feasibility of a fully Bayesian hierarchical regression approach to the problem and identifies how it outperforms two more popular methods: crude rate estimates (CRE) and empirical Bayes standardization (EBS). In particular, we apply a fully Bayesian approach to the spatiotemporal analysis of Lyme disease incidence in New York state for the period 1990-2000. These results are compared with those obtained by CRE and EBS in Chen et al. (2005). We show that the fully Bayesian regression model not only gives more reliable estimates of disease rates than the other two approaches but also allows for tractable models that can accommodate more numerous sources of variation and unknown parameters.

  18. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    PubMed

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate high-quality predictions on an independent validation set. A high-throughput 96-well configuration for spectroscopy gives predictions as good as those from a ring-cup configuration, and thus spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows the use of single and multiple linear regression on the respective wavelengths to predict the biomass lipid content. This is not the case for carbohydrate and protein content, and thus the use of multivariate statistical modeling approaches remains necessary.
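
    The single/multiple linear regression on individual wavelengths can be sketched as follows; this is only an illustration on synthetic data, with placeholder wavelengths standing in for the lipid-sensitive NIR bands identified in the study.

    ```python
    # Sketch: multiple linear regression on a few NIR wavelength intensities to
    # predict lipid content. Data are synthetic; the three "wavelengths" are
    # hypothetical placeholders, not the bands reported in the paper.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_samples = 120
    # Hypothetical absorbances at three lipid-sensitive wavelengths.
    X = rng.uniform(0.1, 1.0, size=(n_samples, 3))
    lipid = 5 + 30 * X[:, 1] + 10 * X[:, 2] + rng.normal(0, 1.5, n_samples)

    X_train, X_test, y_train, y_test = train_test_split(X, lipid, random_state=0)
    model = LinearRegression().fit(X_train, y_train)
    print("R^2 on held-out samples:", model.score(X_test, y_test))
    ```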

  19. Systems and methods for knowledge discovery in spatial data

    DOEpatents

    Obradovic, Zoran; Fiez, Timothy E.; Vucetic, Slobodan; Lazarevic, Aleksandar; Pokrajac, Dragoljub; Hoskinson, Reed L.

    2005-03-08

    Systems and methods are provided for knowledge discovery in spatial data as well as to systems and methods for optimizing recipes used in spatial environments such as may be found in precision agriculture. A spatial data analysis and modeling module is provided which allows users to interactively and flexibly analyze and mine spatial data. The spatial data analysis and modeling module applies spatial data mining algorithms through a number of steps. The data loading and generation module obtains or generates spatial data and allows for basic partitioning. The inspection module provides basic statistical analysis. The preprocessing module smoothes and cleans the data and allows for basic manipulation of the data. The partitioning module provides for more advanced data partitioning. The prediction module applies regression and classification algorithms on the spatial data. The integration module enhances prediction methods by combining and integrating models. The recommendation module provides the user with site-specific recommendations as to how to optimize a recipe for a spatial environment such as a fertilizer recipe for an agricultural field.

  20. Perceived Gender Presentation Among Transgender and Gender Diverse Youth: Approaches to Analysis and Associations with Bullying Victimization and Emotional Distress.

    PubMed

    Gower, Amy L; Rider, G Nicole; Coleman, Eli; Brown, Camille; McMorris, Barbara J; Eisenberg, Marla E

    2018-06-19

    As measures of birth-assigned sex, gender identity, and perceived gender presentation are increasingly included in large-scale research studies, data analysis approaches incorporating such measures are needed. Large samples capable of demonstrating variation within the transgender and gender diverse (TGD) community can inform intervention efforts to improve health equity. A population-based sample of TGD youth was used to examine associations between perceived gender presentation, bullying victimization, and emotional distress using two data analysis approaches. Secondary data analysis of the Minnesota Student Survey included 2168 9th and 11th graders who identified as "transgender, genderqueer, genderfluid, or unsure about their gender identity." Youth reported their biological sex, how others perceived their gender presentation, experiences of four forms of bullying victimization, and four measures of emotional distress. Logistic regression and multifactor analysis of variance (ANOVA) were used to compare and contrast two analysis approaches. Logistic regressions indicated that TGD youth perceived as more gender incongruent had higher odds of bullying victimization and emotional distress relative to those perceived as very congruent with their biological sex. Multifactor ANOVAs demonstrated more variable patterns and allowed for comparisons of each perceived presentation group with all other groups, reflecting nuances that exist within TGD youth. Researchers should adopt data analysis strategies that allow for comparisons of all perceived gender presentation categories rather than assigning a reference group. Those working with TGD youth should be particularly attuned to youth perceived as gender incongruent as they may be more likely to experience bullying victimization and emotional distress.

  1. Vegetation Monitoring with Gaussian Processes and Latent Force Models

    NASA Astrophysics Data System (ADS)

    Camps-Valls, Gustau; Svendsen, Daniel; Martino, Luca; Campos, Manuel; Luengo, David

    2017-04-01

    Monitoring vegetation by biophysical parameter retrieval from Earth observation data is a challenging problem, where machine learning is currently a key player. Neural networks, kernel methods, and Gaussian Process (GP) regression have excelled in parameter retrieval tasks at both local and global scales. GP regression is based on solid Bayesian statistics, yields efficient and accurate parameter estimates, and provides advantages over competing machine learning approaches, such as confidence intervals. However, GP models are hampered by a lack of interpretability, which has prevented their widespread adoption by a larger community. In this presentation we will summarize some of our latest developments to address this issue. We will review the main characteristics of GPs and their advantages in standard vegetation monitoring applications. Then, three advanced GP models will be introduced. First, we will derive sensitivity maps for the GP predictive function, which allow us to obtain feature rankings from the model and to assess the influence of examples on the solution. Second, we will introduce a Joint GP (JGP) model that combines in situ measurements and simulated radiative transfer data in a single GP model. The JGP regression provides more sensible confidence intervals for the predictions, respects the physics of the underlying processes, and allows for transferability across time and space. Finally, a latent force model (LFM) for GP modeling that encodes ordinary differential equations to blend data-driven modeling and physical models of the system is presented. The LFM performs multi-output regression, adapts to the signal characteristics, is able to cope with missing data in the time series, and provides explicit latent functions that allow system analysis and evaluation. Empirical evidence of the performance of these models will be presented through illustrative examples.
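
    A minimal sketch of GP regression with predictive confidence intervals, the advantage highlighted above; the kernel choice and the synthetic 1-D data are assumptions for illustration, not the authors' retrieval setup.

    ```python
    # Sketch: GP regression producing a predictive mean and a 95% confidence
    # interval at new inputs, using scikit-learn on synthetic data.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(1)
    X = rng.uniform(0, 10, size=(40, 1))
    y = np.sin(X).ravel() + rng.normal(0, 0.1, 40)

    kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

    X_new = np.linspace(0, 10, 5).reshape(-1, 1)
    mean, std = gp.predict(X_new, return_std=True)
    for x, m, s in zip(X_new.ravel(), mean, std):
        print(f"x={x:4.1f}  mean={m:+.3f}  95% CI=({m - 1.96*s:+.3f}, {m + 1.96*s:+.3f})")
    ```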

  2. Using Neural Network and Logistic Regression Analysis to Predict Prospective Mathematics Teachers' Academic Success upon Entering Graduate Education

    ERIC Educational Resources Information Center

    Bahadir, Elif

    2016-01-01

    The ability to predict the success of students when they enter a graduate program is critical for educational institutions because it allows them to develop strategic programs that will help improve students' performances during their stay at an institution. In this study, we present the results of an experimental comparison study of Logistic…

  3. The Difference That One Year of Schooling Makes for Russian Schoolchildren. Based on PISA 2009: Reading

    ERIC Educational Resources Information Center

    Tiumeneva, Yu. A.; Kuzmina, Ju. V.

    2015-01-01

    Using PISA 2009 reading data, we investigated the effectiveness of one year of schooling in seven countries: Russia, the Czech Republic, Hungary, Slovakia, Germany, Canada, and Brazil. We used an instrumental variable, which allowed us to estimate the effect of one year of schooling through the fuzzy method of regression discontinuity. The analysis was…

  4. Estimating individual benefits of medical or behavioral treatments in severely ill patients.

    PubMed

    Diaz, Francisco J

    2017-01-01

    There is a need for statistical methods appropriate for the analysis of clinical trials from a personalized-medicine viewpoint as opposed to the common statistical practice that simply examines average treatment effects. This article proposes an approach to quantifying, reporting and analyzing individual benefits of medical or behavioral treatments to severely ill patients with chronic conditions, using data from clinical trials. The approach is a new development of a published framework for measuring the severity of a chronic disease and the benefits treatments provide to individuals, which utilizes regression models with random coefficients. Here, a patient is considered to be severely ill if the patient's basal severity is close to one. This allows the derivation of a very flexible family of probability distributions of individual benefits that depend on treatment duration and the covariates included in the regression model. Our approach may enrich the statistical analysis of clinical trials of severely ill patients because it allows investigating the probability distribution of individual benefits in the patient population and the variables that influence it, and we can also measure the benefits achieved in specific patients including new patients. We illustrate our approach using data from a clinical trial of the anti-depressant imipramine.

  5. BAYESIAN LARGE-SCALE MULTIPLE REGRESSION WITH SUMMARY STATISTICS FROM GENOME-WIDE ASSOCIATION STUDIES1

    PubMed Central

    Zhu, Xiang; Stephens, Matthew

    2017-01-01

    Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data. Specifically, we introduce a “Regression with Summary Statistics” (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results that are often easily available. The RSS likelihood requires estimates of correlations among covariates (SNPs), which can also be obtained from public databases. We perform Bayesian multiple regression analysis by combining the RSS likelihood with previously proposed prior distributions, sampling posteriors by Markov chain Monte Carlo. In a wide range of simulations RSS performs similarly to analyses using the individual-level data, both for estimating heritability and for detecting associations. We apply RSS to a GWAS of human height that contains 253,288 individuals typed at 1.06 million SNPs, for which analyses of individual-level data are practically impossible. Estimates of heritability (52%) are consistent with, but more precise than, previous results using subsets of these data. We also identify many previously unreported loci that show evidence for association with height in our analyses. Software is available at https://github.com/stephenslab/rss. PMID:29399241
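
    The core idea, that multiple-regression coefficients can be recovered from univariate summary statistics plus a covariate correlation matrix, can be shown on a toy example; this is not the RSS likelihood itself, just the underlying linear-algebra identity for standardized predictors.

    ```python
    # Toy sketch: with standardized predictors, the multiple-regression (joint)
    # coefficients equal R^{-1} r, where r holds the univariate (marginal)
    # effects and R is the predictor correlation matrix. In GWAS, R can be
    # estimated from a reference panel rather than the study data.
    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 5000, 4
    X = rng.normal(size=(n, p))
    X[:, 1] = 0.6 * X[:, 0] + 0.8 * X[:, 1]     # induce correlation
    X = (X - X.mean(0)) / X.std(0)
    beta_true = np.array([0.5, 0.0, 0.3, -0.2])
    y = X @ beta_true + rng.normal(size=n)

    r = X.T @ y / n                  # univariate summary statistics
    R = X.T @ X / n                  # predictor correlation matrix
    beta_joint = np.linalg.solve(R, r)
    print("true:     ", beta_true)
    print("recovered:", np.round(beta_joint, 2))
    ```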

  6. T56. AN EXPLORATORY ANALYSIS CONVERTING SCORES BETWEEN THE PANSS AND BNSS

    PubMed Central

    Kott, Alan; Daniel, David

    2018-01-01

    Background: The Brief Negative Symptom Scale (BNSS) is a relatively new instrument designed specifically to measure the negative symptoms of schizophrenia. Recently, more clinical trials have included the BNSS as a secondary or exploratory outcome, typically along with the PANSS. In the current analysis, we aimed to establish equations that allow conversion between the BNSS total score and the PANSS negative subscale and PANSS negative factor scores, as well as conversion equations between the expressive deficits and avolition/apathy factors of the two scales (Kirkpatrick, 2011; Strauss, 2012). Methods: Data from 518 schizophrenia clinical trial subjects with both PANSS and BNSS data available were used. Regression analyses predicting the BNSS total score with the PANSS negative subscale score, and the BNSS total score with the PANSS negative factor (NFS) score, were performed on data from all subjects. Regression analyses predicting the BNSS avolition/apathy factor (items 1, 2, 3, 5, 6, 7, and 8) with the PANSS avolition/apathy factor (items N2, N4 and G16), and the BNSS expressive deficits factor (items 4, 9, 10, 11, 12, and 13) with the expressive deficits factor (items N1, N3, N6, G5, G7, and G13) of the PANSS, were performed on a sample of 318 subjects with individual BNSS item scores available. In addition to estimating the equations, we also calculated Pearson's correlations between the scales. Results: The PANSS and BNSS avolition/apathy factors were highly correlated (r=0.70), as were the expressive deficit factors (r=0.83). The following equations predicting the BNSS total score were obtained from regression analyses performed on 2,560 data points: BNSS_total = -11.64 + 2.10*PANSS_negative_subscale; BNSS_total = -9.26 + 2.11*PANSS_NFS. The following equations predicting the BNSS factor scores from the PANSS factor scores were obtained from regression analyses performed on 1,634 data points: BNSS_avolition/apathy = -2.40 + 2.38 * PANSS_avolition/apathy; BNSS_expressive_deficit_factor = -4.21 + 1.27 * PANSS_expressive_deficit_factor. Discussion: The BNSS differs from the PANSS negative factor because it addresses all five currently recognized domains of negative symptoms, including anhedonia, and attempts to differentiate anticipatory from consummatory states. In our analysis we replicated the strong correlation between the BNSS total score and the PANSS negative subscale, and newly identified strong correlations between the BNSS total score and the NFS, as well as strong correlations between the avolition/apathy and expressive deficit factors of the BNSS and PANSS scales (Kirkpatrick, 2011). The provided equations offer a useful tool allowing researchers and clinicians to easily convert data between the instruments, for purposes such as pooling data from multiple trials that used one of the instruments or interpreting results within the context of previously conducted research; they also offer a framework for risk-based monitoring, to identify data deviating from the expected relationship and allow targeted exploration of the causes of such disagreement. The data used for analysis included not only subjects with predominantly negative symptoms but also acutely psychotic subjects and subjects in stable condition, which allows the results to be generalized across the majority of subjects with schizophrenia. This post-hoc analysis is exploratory.
We plan to further explore the potential utility of equations addressing the relationships among schizophrenia measures of symptom severity in an iterative manner with larger datasets.
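
    For convenience, the conversion equations reported above can be wrapped as simple helper functions; the coefficients below are copied verbatim from the abstract, and any clinical application would of course require validation on local data.

    ```python
    # The published PANSS-to-BNSS conversion equations as plain functions.
    # Coefficients are taken verbatim from the abstract above.
    def bnss_total_from_panss_negative(panss_negative_subscale: float) -> float:
        return -11.64 + 2.10 * panss_negative_subscale

    def bnss_total_from_panss_nfs(panss_nfs: float) -> float:
        return -9.26 + 2.11 * panss_nfs

    def bnss_avolition_apathy(panss_avolition_apathy: float) -> float:
        return -2.40 + 2.38 * panss_avolition_apathy

    def bnss_expressive_deficit(panss_expressive_deficit: float) -> float:
        return -4.21 + 1.27 * panss_expressive_deficit

    # Example: a PANSS negative subscale score of 20 maps to about 30.36.
    print(bnss_total_from_panss_negative(20.0))
    ```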

  7. Calibration and Data Analysis of the MC-130 Air Balance

    NASA Technical Reports Server (NTRS)

    Booth, Dennis; Ulbrich, N.

    2012-01-01

    Design, calibration, calibration analysis, and intended use of the MC-130 air balance are discussed. The MC-130 balance is an 8.0 inch diameter force balance that has two separate internal air flow systems and one external bellows system. The manual calibration of the balance consisted of a total of 1854 data points with both unpressurized and pressurized air flowing through the balance. A subset of 1160 data points was chosen for the calibration data analysis. The regression analysis of the subset was performed using two fundamentally different analysis approaches. First, the data analysis was performed using a recently developed extension of the Iterative Method. This approach fits gage outputs as a function of both applied balance loads and bellows pressures while still allowing the application of the iteration scheme that is used with the Iterative Method. Then, for comparison, the axial force was also analyzed using the Non-Iterative Method. This alternate approach directly fits loads as a function of measured gage outputs and bellows pressures and does not require a load iteration. The regression models used by both the extended Iterative and Non-Iterative Method were constructed such that they met a set of widely accepted statistical quality requirements. These requirements lead to reliable regression models and prevent overfitting of data because they ensure that no hidden near-linear dependencies between regression model terms exist and that only statistically significant terms are included. Finally, a comparison of the axial force residuals was performed. Overall, axial force estimates obtained from both methods show excellent agreement as the differences of the standard deviation of the axial force residuals are on the order of 0.001 % of the axial force capacity.

  8. Genome-wide regression and prediction with the BGLR statistical package.

    PubMed

    Pérez, Paulino; de los Campos, Gustavo

    2014-10-01

    Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis. Copyright © 2014 by the Genetics Society of America.

  9. Early Warning Signals of Financial Crises with Multi-Scale Quantile Regressions of Log-Periodic Power Law Singularities.

    PubMed

    Zhang, Qun; Zhang, Qunzhi; Sornette, Didier

    2016-01-01

    We augment the existing literature using the Log-Periodic Power Law Singular (LPPLS) structures in the log-price dynamics to diagnose financial bubbles by providing three main innovations. First, we introduce the quantile regression to the LPPLS detection problem. This allows us to disentangle (at least partially) the genuine LPPLS signal and the a priori unknown complicated residuals. Second, we propose to combine the many quantile regressions with a multi-scale analysis, which aggregates and consolidates the obtained ensembles of scenarios. Third, we define and implement the so-called DS LPPLS Confidence™ and Trust™ indicators that enrich considerably the diagnostic of bubbles. Using a detailed study of the "S&P 500 1987" bubble and presenting analyses of 16 historical bubbles, we show that the quantile regression of LPPLS signals contributes useful early warning signals. The comparison between the constructed signals and the price development in these 16 historical bubbles demonstrates their significant predictive ability around the real critical time when the burst/rally occurs.
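
    A minimal sketch of the quantile-regression building block with statsmodels, fitting several quantiles of a synthetic heavy-tailed series; the LPPLS fit itself is more involved and is not reproduced here.

    ```python
    # Sketch: quantile regression at several quantiles on a synthetic trend
    # with heavy-tailed (Student-t) residuals.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    t = np.linspace(0, 1, 300)
    logp = 1.0 + 0.8 * t + rng.standard_t(df=3, size=t.size) * 0.1
    df = pd.DataFrame({"t": t, "logp": logp})

    for q in (0.1, 0.5, 0.9):
        fit = smf.quantreg("logp ~ t", df).fit(q=q)
        print(f"q={q}: intercept={fit.params['Intercept']:.3f}, "
              f"slope={fit.params['t']:.3f}")
    ```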

  10. Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions

    PubMed Central

    Fernandes, Bruno J. T.; Roque, Alexandre

    2018-01-01

    Height and weight are measurements used to track nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a combination of these factors may prevent accurate measurement; in those cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed linear or non-linear equations whose coefficients are obtained using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian processes, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all cases, the predictions are insensitive to ethnicity, and to gender if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366
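
    A sketch comparing support vector regression with ordinary linear regression for height estimation; the data and the predictors (knee height, arm span) are synthetic stand-ins, not the study's measurements.

    ```python
    # Sketch: SVR vs. linear regression for estimating height from two
    # anthropometric predictors, on synthetic data.
    import numpy as np
    from sklearn.svm import SVR
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(4)
    n = 300
    knee_height = rng.normal(50, 4, n)      # cm, hypothetical
    arm_span = rng.normal(170, 10, n)       # cm, hypothetical
    height = 20 + 1.1 * knee_height + 0.55 * arm_span + rng.normal(0, 2.5, n)
    X = np.column_stack([knee_height, arm_span])

    svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
    lin = LinearRegression()
    print("SVR    R^2:", cross_val_score(svr, X, height, cv=5).mean())
    print("Linear R^2:", cross_val_score(lin, X, height, cv=5).mean())
    ```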

  11. An Analysis of the Number of Medical Malpractice Claims and Their Amounts

    PubMed Central

    Bonetti, Marco; Cirillo, Pasquale; Musile Tanzi, Paola; Trinchero, Elisabetta

    2016-01-01

    Starting from an extensive database, pooling 9 years of data from the top three insurance brokers in Italy, and containing 38125 reported claims due to alleged cases of medical malpractice, we use an inhomogeneous Poisson process to model the number of medical malpractice claims in Italy. The intensity of the process is allowed to vary over time, and it depends on a set of covariates, like the size of the hospital, the medical department and the complexity of the medical operations performed. We choose the combination medical department by hospital as the unit of analysis. Together with the number of claims, we also model the associated amounts paid by insurance companies, using a two-stage regression model. In particular, we use logistic regression for the probability that a claim is closed with a zero payment, whereas, conditionally on the fact that an amount is strictly positive, we make use of lognormal regression to model it as a function of several covariates. The model produces estimates and forecasts that are relevant to both insurance companies and hospitals, for quality assurance, service improvement and cost reduction. PMID:27077661
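
    The two-stage structure described above can be sketched as follows: logistic regression for the probability of a zero payment, then OLS on log amounts (a lognormal model) for strictly positive payments. Data and covariates are synthetic placeholders, not the insurance database.

    ```python
    # Sketch of a two-part (hurdle) model: logistic regression for zero payment,
    # lognormal regression for the positive amounts.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n = 2000
    hospital_size = rng.normal(0, 1, n)       # standardized placeholder covariate
    X = sm.add_constant(hospital_size)

    p_zero = 1 / (1 + np.exp(-(-0.5 + 0.8 * hospital_size)))
    is_zero = rng.random(n) < p_zero
    amount = np.where(is_zero, 0.0,
                      np.exp(9 + 0.4 * hospital_size + rng.normal(0, 0.7, n)))

    # Stage 1: probability that a claim closes with zero payment.
    stage1 = sm.Logit(is_zero.astype(float), X).fit(disp=0)
    # Stage 2: lognormal regression on the strictly positive amounts.
    pos = amount > 0
    stage2 = sm.OLS(np.log(amount[pos]), X[pos]).fit()
    print(stage1.params, stage2.params, sep="\n")
    ```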

  12. Landslide Hazard Mapping in Rwanda Using Logistic Regression

    NASA Astrophysics Data System (ADS)

    Piller, A.; Anderson, E.; Ballard, H.

    2015-12-01

    Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (i.e. slope, soil type, land cover, etc.) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.

  13. “Smooth” Semiparametric Regression Analysis for Arbitrarily Censored Time-to-Event Data

    PubMed Central

    Zhang, Min; Davidian, Marie

    2008-01-01

    Summary A general framework for regression analysis of time-to-event data subject to arbitrary patterns of censoring is proposed. The approach is relevant when the analyst is willing to assume that distributions governing model components that are ordinarily left unspecified in popular semiparametric regression models, such as the baseline hazard function in the proportional hazards model, have densities satisfying mild “smoothness” conditions. Densities are approximated by a truncated series expansion that, for fixed degree of truncation, results in a “parametric” representation, which makes likelihood-based inference coupled with adaptive choice of the degree of truncation, and hence flexibility of the model, computationally and conceptually straightforward with data subject to any pattern of censoring. The formulation allows popular models, such as the proportional hazards, proportional odds, and accelerated failure time models, to be placed in a common framework; provides a principled basis for choosing among them; and renders useful extensions of the models straightforward. The utility and performance of the methods are demonstrated via simulations and by application to data from time-to-event studies. PMID:17970813

  14. [In vitro testing of yeast resistance to antimycotic substances].

    PubMed

    Potel, J; Arndt, K

    1982-01-01

    Investigations were carried out in order to clarify the antibiotic susceptibility determination of yeasts. 291 yeast strains of different species were tested for sensitivity to 7 antimycotics: amphotericin B, flucytosine, nystatin, pimaricin, clotrimazole, econazole and miconazole. In addition to the evaluation of inhibition zone diameters and MIC values, the influence of pH was examined. 1. The dependence of inhibition zone diameters upon pH values varies with the antimycotic tested. For standardization purposes, pH 6.0 is proposed; moreover, further experimental parameters, such as nutrient composition, agar depth, cell density, and incubation time and temperature, have to be standardized. 2. The relation between inhibition zone size and logarithmic MIC does not fit a linear regression analysis when all species are considered together. Therefore, regression functions have to be calculated separately for the individual species. In the case of the antimycotics amphotericin B, nystatin and pimaricin, the low scattering of the MIC values does not allow regression analysis. 3. A quantitative susceptibility determination of yeasts -- particularly to the fungistatic substances with systemic applicability, flucytosine and miconazole -- is supported by the results of the MIC tests.

  15. Optimizing landslide susceptibility zonation: Effects of DEM spatial resolution and slope unit delineation on logistic regression models

    NASA Astrophysics Data System (ADS)

    Schlögel, R.; Marchesini, I.; Alvioli, M.; Reichenbach, P.; Rossi, M.; Malet, J.-P.

    2018-01-01

    We perform landslide susceptibility zonation with slope units using three digital elevation models (DEMs) of varying spatial resolution of the Ubaye Valley (South French Alps). In so doing, we applied a recently developed algorithm automating slope unit delineation, given a number of parameters, in order to optimize simultaneously the partitioning of the terrain and the performance of a logistic regression susceptibility model. The method allowed us to obtain optimal slope units for each available DEM spatial resolution. For each resolution, we studied the susceptibility model performance by analyzing in detail the relevance of the conditioning variables. The analysis is based on landslide morphology data, considering either the whole landslide or only the source area outline as inputs. The procedure allowed us to select the most useful information, in terms of DEM spatial resolution, thematic variables and landslide inventory, in order to obtain the most reliable slope unit-based landslide susceptibility assessment.

  16. Strain, Attribution, and Traffic Delinquency among Young Drivers: Measuring and Testing General Strain Theory in the Context of Driving

    ERIC Educational Resources Information Center

    Ellwanger, Steven J.

    2007-01-01

    This article enhances our knowledge of general strain theory (GST) by applying it to the context of traffic delinquency. It does so by first describing and confirming the development of a social-psychological measure allowing for a test of GST. Structural regression analysis is subsequently employed to test the theory within this context across a…

  17. A general framework for the regression analysis of pooled biomarker assessments.

    PubMed

    Liu, Yan; McMahan, Christopher; Gallagher, Colin

    2017-07-10

    As a cost-efficient data collection mechanism, the process of assaying pooled biospecimens is becoming increasingly common in epidemiological research; for example, pooling has been proposed for the purpose of evaluating the diagnostic efficacy of biological markers (biomarkers). To this end, several authors have proposed techniques that allow for the analysis of continuous pooled biomarker assessments. Regrettably, most of these techniques proceed under restrictive assumptions, are unable to account for the effects of measurement error, and fail to control for confounding variables. These limitations are understandably attributable to the complex structure that is inherent to measurements taken on pooled specimens. Consequently, in order to provide practitioners with the tools necessary to accurately and efficiently analyze pooled biomarker assessments, herein, a general Monte Carlo maximum likelihood-based procedure is presented. The proposed approach allows for the regression analysis of pooled data under practically all parametric models and can be used to directly account for the effects of measurement error. Through simulation, it is shown that the proposed approach can accurately and efficiently estimate all unknown parameters and is more computationally efficient than existing techniques. This new methodology is further illustrated using monocyte chemotactic protein-1 data collected by the Collaborative Perinatal Project in an effort to assess the relationship between this chemokine and the risk of miscarriage. Copyright © 2017 John Wiley & Sons, Ltd.

  18. Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis

    PubMed Central

    Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B.; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain

    2017-01-01

    Background: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Results: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Conclusions: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. PMID:28327993

  19. Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis.

    PubMed

    Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain; Jelinsky, Scott A

    2017-05-01

    The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. © The Author 2017. Published by Oxford University Press.

  20. An analysis of collegiate band directors' exposure to sound pressure levels

    NASA Astrophysics Data System (ADS)

    Roebuck, Nikole Moore

    Noise-induced hearing loss (NIHL) is a significant but unfortunately common occupational hazard. The purpose of the current study was to measure the magnitude of sound pressure levels generated within a collegiate band room and determine whether those sound pressure levels exceed the policy standards and recommendations of the Occupational Safety and Health Administration (OSHA) and the National Institute for Occupational Safety and Health (NIOSH). In addition, reverberation times were measured and analyzed in order to determine the appropriateness of acoustical conditions for the band rehearsal environment. Sound pressure measurements were taken from the rehearsals of seven collegiate marching bands. Single-sample t tests were conducted to compare the sound pressure levels of all bands to the noise exposure standards of OSHA and NIOSH. Multiple regression analyses were conducted to determine the effect of the band room's conditions on the sound pressure levels and reverberation times. Time-weighted averages (TWA), noise percentage doses, and peak levels were also collected. The mean Leq for all band directors was 90.5 dBA. The total accumulated noise percentage dose for all band directors was 77.6% of the maximum allowable daily noise dose under the OSHA standard, and the total calculated TWA for all band directors was 88.2% of the maximum allowable daily noise dose under the OSHA standard. The total accumulated noise percentage dose for all band directors was 152.1% of the maximum allowable daily noise dose under the NIOSH standards, and the total calculated TWA for all band directors was 93 dBA under the NIOSH standard. Multiple regression analysis revealed that the room volume, the level of acoustical treatment, and the mean room reverberation time predicted 80% of the variance in sound pressure levels in this study.

  1. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

    PubMed

    Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

    2016-12-20

    Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
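
    For reference, the DerSimonian and Laird procedure used as the frequentist comparator can be implemented in a few lines; this sketch shows only that baseline method, not the pseudo-data augmentation proposed in the paper.

    ```python
    # Sketch: DerSimonian-Laird random-effects meta-analysis (method of moments
    # estimate of the between-study variance tau^2).
    import numpy as np

    def dersimonian_laird(effects, variances):
        """Pooled effect and between-study variance via DerSimonian-Laird."""
        effects, variances = np.asarray(effects), np.asarray(variances)
        w = 1 / variances                            # fixed-effect weights
        theta_fe = np.sum(w * effects) / np.sum(w)
        q = np.sum(w * (effects - theta_fe) ** 2)    # Cochran's Q
        df = len(effects) - 1
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (q - df) / c)                # truncated at zero
        w_re = 1 / (variances + tau2)
        theta_re = np.sum(w_re * effects) / np.sum(w_re)
        return theta_re, tau2

    theta, tau2 = dersimonian_laird([0.2, 0.5, 0.1, 0.4],
                                    [0.04, 0.09, 0.05, 0.02])
    print(f"pooled effect={theta:.3f}, between-study variance={tau2:.3f}")
    ```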

  2. Extension of the Haseman-Elston regression model to longitudinal data.

    PubMed

    Won, Sungho; Elston, Robert C; Park, Taesung

    2006-01-01

    We propose an extension to longitudinal data of the Haseman and Elston regression method for linkage analysis. The proposed model is a mixed model having several random effects. As response variable, we investigate the sibship sample mean corrected cross-product (smHE) and the BLUP-mean corrected cross product (pmHE), comparing them with the original squared difference (oHE), the overall mean corrected cross-product (rHE), and the weighted average of the squared difference and the squared mean-corrected sum (wHE). The proposed model allows for the correlation structure of longitudinal data. Also, the model can test for gene x time interaction to discover genetic variation over time. The model was applied in an analysis of the Genetic Analysis Workshop 13 (GAW13) simulated dataset for a quantitative trait simulating systolic blood pressure. Independence models did not preserve the test sizes, while the mixed models with both family and sibpair random effects tended to preserve size well. Copyright 2006 S. Karger AG, Basel.

  3. Thermophysical Property Models for Lunar Regolith

    NASA Technical Reports Server (NTRS)

    Schreiner, Samuel S.; Dominguez, Jesus A.; Sibille, Laurent; Hoffman, Jeffrey A.

    2015-01-01

    We present a set of models for a wide range of lunar regolith material properties. Data from the literature are fit with regression models for the following regolith properties: composition, density, specific heat, thermal conductivity, electrical conductivity, optical absorption length, and latent heat of melting/fusion. These models contain both temperature and composition dependencies so that they can be tailored for a range of applications. They can enable more consistent, informed analysis and design of lunar regolith processing hardware, and they can also be used to further inform lunar geological simulations. In addition to regression models for each material property, the raw data are also presented to allow for further interpretation and fitting as necessary.
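
    A sketch of the kind of temperature-dependent regression fit such property models involve, using an A + B*T^3 form (a solid-plus-radiative-conduction model often applied to granular regolith) on synthetic data; the coefficients and data are illustrative, not the paper's.

    ```python
    # Sketch: nonlinear regression of thermal conductivity vs. temperature,
    # k(T) = A + B*T^3, on synthetic data.
    import numpy as np
    from scipy.optimize import curve_fit

    def conductivity(T, A, B):
        return A + B * T**3

    T = np.linspace(100, 400, 25)                        # K
    rng = np.random.default_rng(6)
    k_obs = 0.0015 + 2.0e-11 * T**3 + rng.normal(0, 5e-5, T.size)  # W/(m K)

    (A, B), _ = curve_fit(conductivity, T, k_obs, p0=(1e-3, 1e-11))
    print(f"k(T) ~ {A:.2e} + {B:.2e} * T^3")
    ```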

  4. Partial Least Squares Regression Models for the Analysis of Kinase Signaling.

    PubMed

    Bourgeois, Danielle L; Kreeger, Pamela K

    2017-01-01

    Partial least squares regression (PLSR) is a data-driven modeling approach that can be used to analyze multivariate relationships between kinase networks and cellular decisions or patient outcomes. In PLSR, a linear model relating an X matrix of independent (predictor) variables and a Y matrix of dependent (response) variables is generated by extracting the factors with the strongest covariation. While the identified relationship is correlative, PLSR models can be used to generate quantitative predictions for new conditions or perturbations to the network, allowing mechanisms to be identified. This chapter will provide a brief explanation of PLSR and an instructive example to demonstrate the use of PLSR to analyze kinase signaling.
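
    A minimal PLSR sketch with scikit-learn, relating a synthetic matrix of kinase signals to a cellular response; the component count and data are assumptions for illustration, not the chapter's worked example.

    ```python
    # Sketch: PLSR relating kinase measurements (X) to a cellular response (Y).
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(7)
    n_samples, n_kinases = 60, 12
    X = rng.normal(size=(n_samples, n_kinases))          # kinase phospho-signals
    Y = X[:, :2] @ np.array([[1.0], [-0.5]]) + rng.normal(0, 0.3, (n_samples, 1))

    pls = PLSRegression(n_components=2).fit(X, Y)
    print("R^2:", pls.score(X, Y))
    print("loadings of first X component:", np.round(pls.x_loadings_[:, 0], 2))
    ```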

  5. [Correlation between gaseous exchange rate, body temperature, and mitochondrial protein content in the liver of mice].

    PubMed

    Muradian, Kh K; Utko, N O; Mozzhukhina, T H; Pishel', I M; Litoshenko, O Ia; Bezrukov, V V; Fraĭfel'd, V E

    2002-01-01

    Correlation and regression relationships between gaseous exchange, thermoregulation, and mitochondrial protein content were analyzed in mice using two- and three-dimensional statistics. Pairwise linear analysis did not reveal any significant correlation between the parameters under exploration. However, relationships became evident with three-dimensional and non-linear plotting, for which the coefficients of multivariable correlation reached and even exceeded 0.7-0.8. Calculations based on partial differentiation of the multivariable regression equations allow the conclusion that, at certain values of VO2, VCO2 and body temperature, negative relations between the systems of gaseous exchange and thermoregulation become dominant.

  6. Knowledge, attitudes and practices survey on organ donation among a selected adult population of Pakistan

    PubMed Central

    Saleem, Taimur; Ishaque, Sidra; Habib, Nida; Hussain, Syedda Saadia; Jawed, Areeba; Khan, Aamir Ali; Ahmad, Muhammad Imran; Iftikhar, Mian Omer; Mughal, Hamza Pervez; Jehan, Imtiaz

    2009-01-01

    Background To determine the knowledge, attitudes and practices regarding organ donation in a selected adult population in Pakistan. Methods Convenience sampling was used to generate a sample of 440; 408 interviews were successfully completed and used for analysis. Data collection was carried out via a face to face interview based on a pre-tested questionnaire in selected public areas of Karachi, Pakistan. Data was analyzed using SPSS v.15 and associations were tested using the Pearson's Chi square test. Multiple logistic regression was used to find independent predictors of knowledge status and motivation of organ donation. Results Knowledge about organ donation was significantly associated with education (p = 0.000) and socioeconomic status (p = 0.038). 70/198 (35.3%) people expressed a high motivation to donate. Allowance of organ donation in religion was significantly associated with the motivation to donate (p = 0.000). Multiple logistic regression analysis revealed that higher level of education and higher socioeconomic status were significant (p < 0.05) independent predictors of knowledge status of organ donation. For motivation, multiple logistic regression revealed that higher socioeconomic status, adequate knowledge score and belief that organ donation is allowed in religion were significant (p < 0.05) independent predictors. Television emerged as the major source of information. Only 3.5% had themselves donated an organ; with only one person being an actual kidney donor. Conclusion Better knowledge may ultimately translate into the act of donation. Effective measures should be taken to educate people with relevant information with the involvement of media, doctors and religious scholars. PMID:19534793

  7. Hypnotism as a Function of Trance State Effects, Expectancy, and Suggestibility: An Italian Replication.

    PubMed

    Pekala, Ronald J; Baglio, Francesca; Cabinio, Monia; Lipari, Susanna; Baglio, Gisella; Mendozzi, Laura; Cecconi, Pietro; Pugnetti, Luigi; Sciaky, Riccardo

    2017-01-01

    Previous research using stepwise regression analyses found self-reported hypnotic depth (srHD) to be a function of suggestibility, trance state effects, and expectancy. This study sought to replicate and expand that research using a general state measure of hypnotic responsivity, the Phenomenology of Consciousness Inventory: Hypnotic Assessment Procedure (PCI-HAP). Ninety-five participants completed an Italian translation of the PCI-HAP, with srHD scores predicted from the PCI-HAP assessment items. The regression analysis replicated the previous research results. Additionally, stepwise regression analyses were able to predict the srHD score equally well using only the PCI dimension scores. These results not only replicated prior research but suggest how this methodology to assess hypnotic responsivity, when combined with more traditional neurophysiological and cognitive-behavioral methodologies, may allow for a more comprehensive understanding of that enigma called hypnosis.

  8. THE DISTRIBUTION OF COOK’S D STATISTIC

    PubMed Central

    Muller, Keith E.; Mok, Mario Chen

    2013-01-01

    Cook (1977) proposed a diagnostic to quantify the impact of deleting an observation on the estimated regression coefficients of a General Linear Univariate Model (GLUM). Simulations of models with Gaussian response and predictors demonstrate that his suggestion of comparing the diagnostic to the median of the F for overall regression captures an erratically varying proportion of the values. We describe the exact distribution of Cook’s statistic for a GLUM with Gaussian predictors and response. We also present computational forms, simple approximations, and asymptotic results. A simulation supports the accuracy of the results. The methods allow accurate evaluation of a single value or the maximum value from a regression analysis. The approximations work well for a single value, but less well for the maximum. In contrast, the cut-point suggested by Cook provides widely varying tail probabilities. As with all diagnostics, the data analyst must use scientific judgment in deciding how to treat highlighted observations. PMID:24363487
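
    A sketch of computing Cook's D with statsmodels and comparing it to the F-median cut-point discussed above; the data are synthetic, with one influential point deliberately planted.

    ```python
    # Sketch: Cook's D for each observation of an OLS fit, flagged against the
    # median of the F distribution for overall regression (Cook's suggestion).
    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import f as f_dist

    rng = np.random.default_rng(8)
    n, p = 50, 3
    X = sm.add_constant(rng.normal(size=(n, p - 1)))
    y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)
    y[0] += 8.0                                 # plant one influential point

    fit = sm.OLS(y, X).fit()
    cooks_d, _ = fit.get_influence().cooks_distance
    cutoff = f_dist.median(p, n - p)            # Cook's comparison value
    print("flagged observations:", np.where(cooks_d > cutoff)[0])
    ```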

  9. Supervised Learning for Dynamical System Learning.

    PubMed

    Hefny, Ahmed; Downey, Carlton; Gordon, Geoffrey J

    2015-01-01

    Recently there has been substantial interest in spectral methods for learning dynamical systems. These methods are popular since they often offer a good tradeoff between computational and statistical efficiency. Unfortunately, they can be difficult to use and extend in practice: e.g., they can make it difficult to incorporate prior information such as sparsity or structure. To address this problem, we present a new view of dynamical system learning: we show how to learn dynamical systems by solving a sequence of ordinary supervised learning problems, thereby allowing users to incorporate prior knowledge via standard techniques such as L1 regularization. Many existing spectral methods are special cases of this new framework, using linear regression as the supervised learner. We demonstrate the effectiveness of our framework by showing examples where nonlinear regression or lasso let us learn better state representations than plain linear regression does; the correctness of these instances follows directly from our general analysis.
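
    The reduction to supervised learning can be sketched as a regression of the next observation on a window of past observations; here ridge regression stands in for the supervised learner, on a synthetic AR(2) series.

    ```python
    # Sketch: learn a dynamical system by ordinary supervised regression of the
    # next observation on a window of past observations.
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(9)
    T, window = 500, 4
    y = np.zeros(T)
    for t in range(2, T):                       # a noisy AR(2) system
        y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal(0, 0.1)

    # Row t holds y[t], ..., y[t+window-1]; the target is y[t+window].
    X = np.column_stack([y[i:T - window + i] for i in range(window)])
    target = y[window:]
    model = Ridge(alpha=1.0).fit(X, target)
    print("one-step-ahead R^2:", model.score(X, target))
    ```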

  10. MIXOR: a computer program for mixed-effects ordinal regression analysis.

    PubMed

    Hedeker, D; Gibbons, R D

    1996-03-01

    MIXOR provides maximum marginal likelihood estimates for mixed-effects ordinal probit, logistic, and complementary log-log regression models. These models can be used for analysis of dichotomous and ordinal outcomes from either a clustered or longitudinal design. For clustered data, the mixed-effects model assumes that data within clusters are dependent. The degree of dependency is jointly estimated with the usual model parameters, thus adjusting for dependence resulting from clustering of the data. Similarly, for longitudinal data, the mixed-effects approach can allow for individual-varying intercepts and slopes across time, and can estimate the degree to which these time-related effects vary in the population of individuals. MIXOR uses marginal maximum likelihood estimation, utilizing a Fisher-scoring solution. For the scoring solution, the Cholesky factor of the random-effects variance-covariance matrix is estimated, along with the effects of model covariates. Examples illustrating usage and features of MIXOR are provided.
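
    MIXOR itself is a standalone program; as a rough modern analogue, a fixed-effects ordinal probit model can be fit with statsmodels (random effects for clustered or longitudinal data, MIXOR's distinguishing feature, would need a dedicated mixed-model package). A minimal sketch, assuming a recent statsmodels with OrderedModel:

    ```python
    # Sketch: ordinal probit regression on synthetic three-category data.
    import numpy as np
    from statsmodels.miscmodels.ordinal_model import OrderedModel

    rng = np.random.default_rng(10)
    n = 400
    x = rng.normal(size=n)
    latent = 0.8 * x + rng.normal(size=n)
    y = np.digitize(latent, [-0.5, 0.5])        # ordered categories 0, 1, 2

    model = OrderedModel(y, x.reshape(-1, 1), distr="probit")
    result = model.fit(method="bfgs", disp=0)
    print(result.params)                        # slope plus threshold parameters
    ```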

  11. Odontological approach to sexual dimorphism in southeastern France.

    PubMed

    Lladeres, Emilie; Saliba-Serre, Bérengère; Sastre, Julien; Foti, Bruno; Tardivo, Delphine; Adalian, Pascal

    2013-01-01

    The aim of this study was to establish a prediction formula allowing the determination of sex in the southeastern French population using dental measurements. The sample consisted of 105 individuals (57 males and 48 females, aged between 18 and 25 years). Dental measurements were calculated using Euclidean distances, in three-dimensional space, from point coordinates obtained by a Microscribe. A multiple logistic regression analysis was performed to establish the prediction formula. Among 12 selected dental distances, a stepwise logistic regression analysis highlighted the two most significant discriminating predictors of sex: one located on the mandible and the other on the maxilla. A cut-point was proposed for the prediction of sex. The prediction formula was then tested on a validation sample (20 males and 34 females, aged between 18 and 62 years and with a history of orthodontics or restorative care) to evaluate the accuracy of the method. © 2012 American Academy of Forensic Sciences.

  12. Risk stratification personalised model for prediction of life-threatening ventricular tachyarrhythmias in patients with chronic heart failure.

    PubMed

    Frolov, Alexander Vladimirovich; Vaikhanskaya, Tatjana Gennadjevna; Melnikova, Olga Petrovna; Vorobiev, Anatoly Pavlovich; Guel, Ludmila Michajlovna

    2017-01-01

    The development of prognostic factors for life-threatening ventricular tachyarrhythmias (VTA) and sudden cardiac death (SCD) remains a priority in cardiology, and the development of a method of personalised prognosis based on multifactorial analysis of the risk factors associated with life-threatening heart rhythm disturbances is considered a key research and clinical task. Our aim was to design a prognostic mathematical model to define personalised risk of life-threatening VTA in patients with chronic heart failure (CHF). The study included 240 patients with CHF (mean age 50.5 ± 12.1 years; left ventricular ejection fraction 32.8 ± 10.9%; follow-up period 36.8 ± 5.7 months). The participants received basic therapy for heart failure. Electrocardiogram (ECG) markers of myocardial electrical instability were assessed, including microvolt T-wave alternans, heart rate turbulence, heart rate deceleration, and QT dispersion. Additionally, echocardiography and Holter monitoring (HM) were performed. Cardiovascular events were considered as primary endpoints, including SCD, paroxysmal ventricular tachycardia/ventricular fibrillation (VT/VF) based on HM-ECG data, data obtained from implantable device interrogation (CRT-D, ICD), and appropriate shocks. During the follow-up period, 66 (27.5%) subjects with CHF showed adverse arrhythmic events, including nine SCD events and 57 VTAs. Data from a stepwise discriminant analysis of cumulative ECG markers of myocardial electrical instability were used to build a mathematical model for preliminary VTA risk stratification. Uni- and multivariate Cox and logistic regression analyses were performed to define an individualised risk stratification model for SCD/VTA. A binary logistic regression model demonstrated high prognostic significance of the discriminant function, with a classification sensitivity of 80.8% and specificity of 99.1% (F = 31.2; χ² = 143.2; p < 0.0001). The method of personalised risk stratification using Cox and logistic regression allows correct classification of more than 93.9% of CHF cases. A robust body of evidence concerning the prognostic significance of logistic regression for defining VTA risk allows inclusion of this method into the algorithm of subsequent control and selection of the optimal treatment modality for patients with CHF.
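
    A sketch of a Cox proportional-hazards fit in the spirit of the risk model above, using the lifelines package on synthetic follow-up data; the covariate names are placeholders for the study's ECG markers, not its actual variables.

    ```python
    # Sketch: Cox proportional-hazards regression on synthetic survival data.
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    rng = np.random.default_rng(11)
    n = 240
    df = pd.DataFrame({
        "lvef": rng.normal(33, 11, n),             # ejection fraction, %
        "twa_positive": rng.integers(0, 2, n),     # T-wave alternans flag
    })
    hazard = np.exp(-0.03 * (df["lvef"] - 33) + 0.7 * df["twa_positive"])
    df["time"] = rng.exponential(36 / hazard.to_numpy())   # months
    df["event"] = (rng.random(n) < 0.35).astype(int)       # observed VTA/SCD

    cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
    cph.print_summary()
    ```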

  13. Creep analysis of silicone for podiatry applications.

    PubMed

    Janeiro-Arocas, Julia; Tarrío-Saavedra, Javier; López-Beceiro, Jorge; Naya, Salvador; López-Canosa, Adrián; Heredia-García, Nicolás; Artiaga, Ramón

    2016-10-01

    This work shows an effective methodology to characterize the creep-recovery behavior of silicones before their application in podiatry. The aim is to characterize, model, and compare the creep-recovery properties of different types of silicone used in podiatry orthotics. The creep-recovery behavior of silicones used in podiatry orthotics is characterized by dynamic mechanical analysis (DMA). Silicones provided by Herbitas are compared by observing their viscoelastic properties using Functional Data Analysis (FDA) and nonlinear regression. The relationship between strain and time is modeled by fixed- and mixed-effects nonlinear regression to compare podiatry silicones easily and intuitively. Functional ANOVA and the Kohlrausch-Williams-Watts (KWW) model with fixed and mixed effects allow us to compare different silicones by observing the values of the fitting parameters and their physical meaning. The differences between silicones are related to variations in the breadth of the creep-recovery time distribution and in instantaneous deformation-permanent strain. Nevertheless, the mean creep-relaxation time is the same for all the studied silicones. Silicones used in palliative orthoses have higher instantaneous deformation-permanent strain and a narrower creep-recovery distribution. The proposed methodology based on DMA, FDA, and nonlinear regression is a useful tool to characterize and choose the proper silicone for each podiatry application according to its viscoelastic properties. Copyright © 2016 Elsevier Ltd. All rights reserved.
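
    Fitting a KWW (stretched-exponential) recovery curve by nonlinear least squares can be sketched as follows, on synthetic data rather than the Herbitas silicone measurements.

    ```python
    # Sketch: nonlinear least-squares fit of a Kohlrausch-Williams-Watts
    # (stretched exponential) strain-recovery curve.
    import numpy as np
    from scipy.optimize import curve_fit

    def kww(t, strain0, tau, beta):
        """Stretched-exponential strain recovery."""
        return strain0 * np.exp(-(t / tau) ** beta)

    t = np.linspace(0.1, 60, 80)                 # minutes
    rng = np.random.default_rng(12)
    strain = kww(t, 1.0, 12.0, 0.6) + rng.normal(0, 0.01, t.size)

    (strain0, tau, beta), _ = curve_fit(kww, t, strain, p0=(1.0, 10.0, 0.5))
    print(f"strain0={strain0:.2f}, tau={tau:.1f} min, beta={beta:.2f}")
    ```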

  14. Investigating student communities with network analysis of interactions in a physics learning center

    NASA Astrophysics Data System (ADS)

    Brewe, Eric; Kramer, Laird; Sawtelle, Vashti

    2012-06-01

    Developing a sense of community among students is one of the three pillars of an overall reform effort to increase participation in physics, and the sciences more broadly, at Florida International University. The emergence of a research and learning community, embedded within a course reform effort, has contributed to increased recruitment and retention of physics majors. We utilize social network analysis to quantify interactions in Florida International University’s Physics Learning Center (PLC) that support the development of academic and social integration. The tools of social network analysis allow us to visualize and quantify student interactions and characterize the roles of students within a social network. After providing a brief introduction to social network analysis, we use sequential multiple regression modeling to evaluate factors that contribute to participation in the learning community. Results of the sequential multiple regression indicate that the PLC learning community is an equitable environment, as we find that gender and ethnicity are not significant predictors of participation in the PLC. We find that providing students space for collaboration is a vital element in the formation of a supportive learning community.
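
    A sketch of sequential (hierarchical) multiple regression of the kind described, using statsmodels to test whether each added block of predictors significantly improves fit. The predictor blocks and data here are hypothetical illustrations, not the study's variables.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 150
gender = rng.integers(0, 2, n)            # block 1: demographics (hypothetical)
gpa = rng.normal(3.0, 0.5, n)             # block 2: academic measure
visits = rng.poisson(5, n)                # block 3: PLC interaction proxy
y = 0.8 * visits + 0.5 * gpa + rng.normal(0, 1, n)   # participation outcome

# Enter predictor blocks sequentially, as in hierarchical regression
X1 = sm.add_constant(np.column_stack([gender]))
X2 = sm.add_constant(np.column_stack([gender, gpa]))
X3 = sm.add_constant(np.column_stack([gender, gpa, visits]))
m1, m2, m3 = (sm.OLS(y, X).fit() for X in (X1, X2, X3))

# F-test for the increment in R-squared at each step
print(m2.compare_f_test(m1))   # does the academic block add anything?
print(m3.compare_f_test(m2))   # does interaction add over the earlier blocks?
```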

  15. Using mixed treatment comparisons and meta-regression to perform indirect comparisons to estimate the efficacy of biologic treatments in rheumatoid arthritis.

    PubMed

    Nixon, R M; Bansback, N; Brennan, A

    2007-03-15

    Mixed treatment comparison (MTC) is a generalization of meta-analysis. Instead of the same treatment for a disease being tested in a number of studies, a number of different interventions are considered. Meta-regression is also a generalization of meta-analysis, in which an attempt is made to explain the heterogeneity between the treatment effects in the studies by regressing on study-level covariables. Our focus is on settings where several different treatments are considered in a number of randomized controlled trials of a specific disease, where the same treatment can be applied in several arms within a study, and where differences in efficacy can be explained by differences in the study settings. We develop methods for simultaneously comparing several treatments and adjusting for study-level covariables by combining ideas from MTC and meta-regression. We use a case study from rheumatoid arthritis. We identified relevant trials of biologic versus standard therapy or placebo and extracted the doses, comparators and patient baseline characteristics. Efficacy is measured using the log odds ratio of achieving six-month ACR50 responder status. A random-effects meta-regression model is fitted which adjusts the log odds ratio for study-level prognostic factors. A different random-effect distribution on the log odds ratios is allowed for each treatment. The odds ratio is found as a function of the prognostic factors for each treatment. The apparent differences in the randomized trials between tumour necrosis factor alpha (TNF-alpha) antagonists are explained by differences in prognostic factors, and the analysis suggests that these drugs as a class are not different from each other. Copyright (c) 2006 John Wiley & Sons, Ltd.
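
    A crude sketch of random-effects meta-regression on study-level log odds ratios. The trial data and covariate are invented, and the between-study variance uses the simple method-of-moments (DerSimonian-Laird style) formula without covariate corrections, rather than the full MTC machinery of the paper.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical per-trial summaries: log odds ratios of ACR50 response,
# their within-trial variances, and one study-level covariate
# (e.g., mean baseline disease duration in years).
log_or = np.array([0.9, 1.2, 0.7, 1.5, 1.1, 0.8])
var = np.array([0.04, 0.09, 0.06, 0.12, 0.05, 0.07])
duration = np.array([3.0, 8.0, 5.0, 11.0, 7.0, 4.0])
X = sm.add_constant(duration)

# Fixed-effect (inverse-variance weighted) fit and heterogeneity statistic Q
fixed = sm.WLS(log_or, X, weights=1.0 / var).fit()
resid = log_or - fixed.fittedvalues
q = np.sum(resid ** 2 / var)
df = len(log_or) - X.shape[1]

# Crude method-of-moments between-study variance (no-covariate formula)
w = 1.0 / var
tau2 = max(0.0, (q - df) / (w.sum() - (w ** 2).sum() / w.sum()))

# Random-effects meta-regression: weight by total (within + between) variance
re = sm.WLS(log_or, X, weights=1.0 / (var + tau2)).fit()
print(re.params)   # intercept and covariate effect on the log OR scale
```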

  16. Multivariate meta-analysis using individual participant data

    PubMed Central

    Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.

    2016-01-01

    When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484

  17. Are We All in the Same Boat? The Role of Perceptual Distance in Organizational Health Interventions.

    PubMed

    Hasson, Henna; von Thiele Schwarz, Ulrica; Nielsen, Karina; Tafvelin, Susanne

    2016-10-01

    The study investigates how agreement between leaders' and their teams' perceptions influences intervention outcomes in a leadership-training intervention aimed at improving organizational learning. Agreement, i.e., perceptual distance, was calculated for the organizational learning dimensions at baseline. Changes in the dimensions from pre-intervention to post-intervention were evaluated using polynomial regression analysis with response surface analysis. The general pattern of the results indicated that organizational learning improved when leaders and their teams agreed on the level of organizational learning prior to the intervention. The improvement was greatest when the leader's and the team's perceptions at baseline were aligned and high rather than aligned and low. The least beneficial scenario was when the leader's perceptions were higher than the team's. These results give insight into the importance of comparing leaders' and their teams' perceptions in intervention research. Polynomial regression analysis with response surface methodology allows three-dimensional examination of the relationship between two predictor variables and an outcome. This contributes knowledge on how combinations of predictor variables may affect an outcome and allows study of potential non-linearity relating to the outcome. Future studies could use these methods in process evaluation of interventions. Copyright © 2016 John Wiley & Sons, Ltd.

  18. Computer assisted analysis of auroral images obtained from high altitude polar satellites

    NASA Technical Reports Server (NTRS)

    Samadani, Ramin; Flynn, Michael

    1993-01-01

    Automatic techniques that allow the extraction of physically significant parameters from auroral images were developed. This allows the processing of a much larger number of images than is currently possible with manual techniques. Our techniques were applied to diverse auroral image datasets. These results were made available to geophysicists at NASA and at universities in the form of a software system that performs the analysis. After some feedback from users, an upgraded system was transferred to NASA and to two universities. The feasibility of user-trained search and retrieval of large amounts of data using our automatically derived parameter indices was demonstrated. Techniques based on classification and regression trees (CART) were developed and applied to broaden the types of images to which the automated search and retrieval may be applied. Our techniques were tested with DE-1 auroral images.

  19. Nonlinear estimation of parameters in biphasic Arrhenius plots.

    PubMed

    Puterman, M L; Hrboticky, N; Innis, S M

    1988-05-01

    This paper presents a formal procedure for the statistical analysis of data on the thermotropic behavior of membrane-bound enzymes generated using the Arrhenius equation, and compares the analysis to several alternatives. The data are modeled by a bent hyperbola. Nonlinear regression is used to obtain estimates and standard errors of the intersection of the line segments, defined as the transition temperature, and of the slopes, defined as energies of activation of the enzyme reaction. The methodology allows formal tests of the adequacy of a biphasic model rather than either a single straight line or a curvilinear model. Examples using data on the thermotropic behavior of pig brain synaptosomal acetylcholinesterase are given. The data support the biphasic temperature dependence of this enzyme. The methodology represents a formal procedure for statistical validation of any biphasic data and allows for calculation of all line parameters with estimates of precision.
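
    A sketch of a bent-hyperbola fit by nonlinear least squares, estimating the transition point and its standard error. The Bacon-Watts parameterization used here is a standard choice standing in for the authors' exact formulation, and the Arrhenius-style data are synthetic.

```python
import numpy as np
from scipy.optimize import curve_fit

def bent_hyperbola(x, b0, b1, b2, x0, gamma):
    # Bacon-Watts "bent hyperbola": two line segments of slope b1 - b2 and
    # b1 + b2 meeting near x0; gamma controls the sharpness of the bend
    # (gamma -> 0 recovers two intersecting straight lines).
    return b0 + b1 * (x - x0) + b2 * np.sqrt((x - x0) ** 2 + gamma ** 2)

# Synthetic Arrhenius-style data: ln(rate) versus 1/T with a break
rng = np.random.default_rng(2)
invT = np.linspace(3.2e-3, 3.6e-3, 60)
y = bent_hyperbola(invT, -2.0, -4000.0, -2500.0, 3.4e-3, 1e-5)
y += rng.normal(0, 0.02, invT.size)

p0 = [-2.0, -3000.0, -2000.0, 3.4e-3, 1e-5]
popt, pcov = curve_fit(bent_hyperbola, invT, y, p0=p0)
x0, se_x0 = popt[3], np.sqrt(pcov[3, 3])
print(f"estimated transition at 1/T = {x0:.4e} (SE {se_x0:.1e})")
```

    The standard error on x0 is exactly the estimate of precision for the transition temperature that the abstract highlights; the slopes on each side give the two activation energies.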

  20. PIXE analysis of cystic fibrosis sweat samples with an external proton beam

    NASA Astrophysics Data System (ADS)

    Sommer, F.; Massonnet, B.

    1987-03-01

    PIXE analysis with an external proton beam is used to study, in four control and five cystic fibrosis children, the elemental composition of sweat samples collected from different parts of the body during whole-body hyperthermia. We observe no significant difference in sweat rates or in temperature variations between the two groups during the sweat test. The statistical study of the results obtained by PIXE analysis allows us to pick out, amongst the 8 elements studied, 6 elements (Na, Cl, Ca, Mn, Cu, Br) that differ significantly between the two groups of subjects. Using regression analysis, Na, Cl and Br concentrations could be used in a predictive equation of the state of health.

  1. Is Susceptibility to Prenatal Methylmercury Exposure from Fish Consumption Non-Homogeneous? Tree-Structured Analysis for the Seychelles Child Development Study

    PubMed Central

    Huang, Li-Shan; Myers, Gary J.; Davidson, Philip W.; Cox, Christopher; Xiao, Fenyuan; Thurston, Sally W.; Cernichiari, Elsa; Shamlaye, Conrad F.; Sloane-Reeves, Jean; Georger, Lesley; Clarkson, Thomas W.

    2007-01-01

    Studies of the association between prenatal methylmercury exposure from maternal fish consumption during pregnancy and neurodevelopmental test scores in the Seychelles Child Development Study have found no consistent pattern of associations through age nine years. The analyses for the most recent nine-year data examined the population effects of prenatal exposure, but did not address the possibility of non-homogeneous susceptibility. This paper presents a regression tree approach: covariate effects are treated nonlinearly and non-additively and non-homogeneous effects of prenatal methylmercury exposure are permitted among the covariate clusters identified by the regression tree. The approach allows us to address whether children in the lower or higher ends of the developmental spectrum differ in susceptibility to subtle exposure effects. Of twenty-one endpoints available at age nine years, we chose the Wechsler Full Scale IQ and its associated covariates to construct the regression tree. The prenatal mercury effect in each of the nine resulting clusters was assessed linearly and non-homogeneously. In addition we reanalyzed five other nine-year endpoints that in the linear analysis had a two-tailed p-value <0.2 for the effect of prenatal exposure. In this analysis, motor proficiency and activity level improved significantly with increasing MeHg for 53% of the children who had an average home environment. Motor proficiency significantly decreased with increasing prenatal MeHg exposure in 7% of the children whose home environment was below average. The regression tree results support previous analyses of outcomes in this cohort. However, this analysis raises the intriguing possibility that an effect may be non-homogeneous among children with different backgrounds and IQ levels. PMID:17942158

  2. Is susceptibility to prenatal methylmercury exposure from fish consumption non-homogeneous? Tree-structured analysis for the Seychelles Child Development Study.

    PubMed

    Huang, Li-Shan; Myers, Gary J; Davidson, Philip W; Cox, Christopher; Xiao, Fenyuan; Thurston, Sally W; Cernichiari, Elsa; Shamlaye, Conrad F; Sloane-Reeves, Jean; Georger, Lesley; Clarkson, Thomas W

    2007-11-01

    Studies of the association between prenatal methylmercury exposure from maternal fish consumption during pregnancy and neurodevelopmental test scores in the Seychelles Child Development Study have found no consistent pattern of associations through age 9 years. The analyses for the most recent 9-year data examined the population effects of prenatal exposure, but did not address the possibility of non-homogeneous susceptibility. This paper presents a regression tree approach: covariate effects are treated non-linearly and non-additively and non-homogeneous effects of prenatal methylmercury exposure are permitted among the covariate clusters identified by the regression tree. The approach allows us to address whether children in the lower or higher ends of the developmental spectrum differ in susceptibility to subtle exposure effects. Of 21 endpoints available at age 9 years, we chose the Wechsler Full Scale IQ and its associated covariates to construct the regression tree. The prenatal mercury effect in each of the nine resulting clusters was assessed linearly and non-homogeneously. In addition we reanalyzed five other 9-year endpoints that in the linear analysis had a two-tailed p-value <0.2 for the effect of prenatal exposure. In this analysis, motor proficiency and activity level improved significantly with increasing MeHg for 53% of the children who had an average home environment. Motor proficiency significantly decreased with increasing prenatal MeHg exposure in 7% of the children whose home environment was below average. The regression tree results support previous analyses of outcomes in this cohort. However, this analysis raises the intriguing possibility that an effect may be non-homogeneous among children with different backgrounds and IQ levels.
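
    A sketch of the regression-tree idea on simulated data: grow a tree on covariates to form clusters, then estimate a separate linear exposure effect within each leaf. All variables and effect sizes are invented for illustration, and scikit-learn's CART-style tree stands in for the authors' procedure.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(3)
n = 500
home_env = rng.normal(0, 1, n)        # covariate: home-environment score
maternal_iq = rng.normal(100, 15, n)  # covariate
mehg = rng.gamma(2.0, 3.0, n)         # prenatal MeHg exposure (hypothetical)

# Outcome: covariates drive IQ; the exposure slope differs by home environment
iq = (100 + 5 * home_env + 0.2 * (maternal_iq - 100)
      + np.where(home_env < -1, -0.4, 0.1) * mehg + rng.normal(0, 5, n))

# Grow the tree on covariates only, mirroring the two-stage approach:
# covariate clusters first, then a per-cluster linear exposure effect.
X = np.column_stack([home_env, maternal_iq])
tree = DecisionTreeRegressor(max_leaf_nodes=4, min_samples_leaf=50).fit(X, iq)
print(export_text(tree, feature_names=["home_env", "maternal_iq"]))

# Within each leaf, fit IQ ~ MeHg and inspect the (non-homogeneous) slope
leaves = tree.apply(X)
for leaf in np.unique(leaves):
    m = leaves == leaf
    slope = np.polyfit(mehg[m], iq[m], 1)[0]
    print(f"leaf {leaf}: n={m.sum()}, MeHg slope={slope:+.3f}")
```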

  3. Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.

    PubMed

    Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai

    2017-04-01

    This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits under certain pollution scenarios (environmental impact) was calculated by a mathematical model based on binary logistic regression; it allows the identification of those environmental parameters, from a multitude of possible parameters, with a significant impact on exhibits, and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest were used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was applied to 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for the Social Sciences (SPSS 20.0). The results of the binary logistic regression analysis demonstrated that, of the six parameters taken into consideration, four have a significant effect upon exhibits, in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decision-making process regarding preventive preservation, establishing the best measures for pollution reduction and preservation of exhibits within the exhibition space.
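
    A minimal sketch of a binary logistic regression of this kind in statsmodels, with simulated pollutant and microclimate records standing in for the museum's monitoring data. Exponentiated coefficients give the odds ratios used to rank parameters by severity of effect.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical monitoring records and a binary "impact" flag indicating
# whether a damage threshold for the exhibits was exceeded.
rng = np.random.default_rng(4)
n = 794
df = pd.DataFrame({
    "O3":   rng.gamma(3.0, 10.0, n),   # ug/m3, invented scales
    "PM25": rng.gamma(3.0, 5.0, n),
    "NO2":  rng.gamma(3.0, 8.0, n),
    "RH":   rng.normal(50, 10, n),     # relative humidity, %
})
lin = -6 + 0.08 * df.O3 + 0.06 * df.PM25 + 0.03 * df.NO2 + 0.02 * df.RH
df["impact"] = rng.binomial(1, 1 / (1 + np.exp(-lin)))

model = smf.logit("impact ~ O3 + PM25 + NO2 + RH", data=df).fit()
print(model.summary())
print(np.exp(model.params))   # odds ratios rank the parameters by severity
```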

  4. A Semiparametric Change-Point Regression Model for Longitudinal Observations.

    PubMed

    Xing, Haipeng; Ying, Zhiliang

    2012-12-01

    Many longitudinal studies involve relating an outcome process to a set of possibly time-varying covariates, giving rise to the usual regression models for longitudinal data. When the purpose of the study is to investigate the covariate effects when the experimental environment undergoes abrupt changes, or to locate the periods with different levels of covariate effects, a simple and easy-to-interpret approach is to introduce change-points in the regression coefficients. In this connection, we propose a semiparametric change-point regression model in which the error process (stochastic component) is nonparametric, the baseline mean function (functional part) is completely unspecified, the observation times are allowed to be subject-specific, and the number, locations and magnitudes of the change-points are unknown and need to be estimated. We further develop an estimation procedure which combines recent advances in semiparametric analysis based on counting-process arguments with multiple change-point inference, and discuss its large-sample properties, including consistency and asymptotic normality, under suitable regularity conditions. Simulation results show that the proposed methods work well under a variety of scenarios. An application to a real data set is also given.

  5. Early Warning Signals of Financial Crises with Multi-Scale Quantile Regressions of Log-Periodic Power Law Singularities

    PubMed Central

    Zhang, Qun; Zhang, Qunzhi; Sornette, Didier

    2016-01-01

    We augment the existing literature using the Log-Periodic Power Law Singular (LPPLS) structures in the log-price dynamics to diagnose financial bubbles by providing three main innovations. First, we introduce the quantile regression to the LPPLS detection problem. This allows us to disentangle (at least partially) the genuine LPPLS signal and the a priori unknown complicated residuals. Second, we propose to combine the many quantile regressions with a multi-scale analysis, which aggregates and consolidates the obtained ensembles of scenarios. Third, we define and implement the so-called DS LPPLS Confidence™ and Trust™ indicators that enrich considerably the diagnostic of bubbles. Using a detailed study of the “S&P 500 1987” bubble and presenting analyses of 16 historical bubbles, we show that the quantile regression of LPPLS signals contributes useful early warning signals. The comparison between the constructed signals and the price development in these 16 historical bubbles demonstrates their significant predictive ability around the real critical time when the burst/rally occurs. PMID:27806093
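
    A sketch of the quantile-regression building block on a linear toy model (the actual LPPLS calibration is nonlinear and beyond this sketch): fit the same specification at several quantiles and compare the coefficients, which is the raw material the multi-scale aggregation step then consolidates.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for a log-price series with heavy-tailed residuals;
# a linear trend replaces the LPPLS structure for illustration only.
rng = np.random.default_rng(5)
t = np.arange(500.0)
df = pd.DataFrame({"t": t, "logp": 0.002 * t + 0.05 * rng.standard_t(3, t.size)})

# Stability of the fitted coefficients across quantiles is what separates
# a genuine signal from features of the (unknown) residual distribution.
for q in (0.1, 0.25, 0.5, 0.75, 0.9):
    fit = smf.quantreg("logp ~ t", df).fit(q=q)
    print(f"q={q:.2f}: slope={fit.params['t']:.5f}")
```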

  6. Bayesian dose-response analysis for epidemiological studies with complex uncertainty in dose estimation.

    PubMed

    Kwon, Deukwoo; Hoffman, F Owen; Moroz, Brian E; Simon, Steven L

    2016-02-10

    Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose-response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose-response using our method is presented for an epidemiological study of thyroid disease following radiation exposure. Copyright © 2015 John Wiley & Sons, Ltd.

  7. A PDE approach for quantifying and visualizing tumor progression and regression

    NASA Astrophysics Data System (ADS)

    Sintay, Benjamin J.; Bourland, J. Daniel

    2009-02-01

    Quantification of changes in tumor shape and size allows physicians the ability to determine the effectiveness of various treatment options, adapt treatment, predict outcome, and map potential problem sites. Conventional methods are often based on metrics such as volume, diameter, or maximum cross sectional area. This work seeks to improve the visualization and analysis of tumor changes by simultaneously analyzing changes in the entire tumor volume. This method utilizes an elliptic partial differential equation (PDE) to provide a roadmap of boundary displacement that does not suffer from the discontinuities associated with other measures such as Euclidean distance. Streamline pathways defined by Laplace's equation (a commonly used PDE) are used to track tumor progression and regression at the tumor boundary. Laplace's equation is particularly useful because it provides a smooth, continuous solution that can be evaluated with sub-pixel precision on variable grid sizes. Several metrics are demonstrated including maximum, average, and total regression and progression. This method provides many advantages over conventional means of quantifying change in tumor shape because it is observer independent, stable for highly unusual geometries, and provides an analysis of the entire three-dimensional tumor volume.
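
    A minimal finite-difference (Jacobi) sketch of solving Laplace's equation between two boundaries, the building block for the streamline roadmap described above. The grid, masks and boundary values are toy stand-ins for segmented tumor contours.

```python
import numpy as np

# phi is fixed at 0 on an "initial" contour and 1 on a "final" contour;
# the harmonic solution in between defines smooth displacement streamlines.
n = 100
phi = np.zeros((n, n))
inner = (slice(40, 60), slice(40, 60))        # hypothetical initial tumor mask
border = np.zeros((n, n), bool)
border[0, :] = border[-1, :] = border[:, 0] = border[:, -1] = True
phi[border] = 1.0

for _ in range(5000):                         # Jacobi relaxation
    new = 0.25 * (np.roll(phi, 1, 0) + np.roll(phi, -1, 0)
                  + np.roll(phi, 1, 1) + np.roll(phi, -1, 1))
    new[inner] = 0.0                          # re-impose boundary conditions
    new[border] = 1.0
    phi = new

# Streamlines follow the normalized gradient of phi; integrating path
# length along them gives a continuous boundary-displacement measure
# with sub-pixel precision and no Euclidean-distance discontinuities.
gy, gx = np.gradient(phi)
norm = np.hypot(gx, gy) + 1e-12
print("max |grad phi| on grid:", norm.max())
```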

  8. Effect of socioeconomic deprivation on waiting time for cardiac surgery: retrospective cohort study

    PubMed Central

    Pell, Jill P; Pell, Alastair C H; Norrie, John; Ford, Ian; Cobbe, Stuart M

    2000-01-01

    Objective To determine whether the priority given to patients referred for cardiac surgery is associated with socioeconomic status. Design Retrospective study with multivariate logistic regression analysis of the association between deprivation and classification of urgency with allowance for age, sex, and type of operation. Multivariate linear regression analysis was used to determine association between deprivation and waiting time within each category of urgency, with allowance for age, sex, and type of operation. Setting NHS waiting lists in Scotland. Participants 26 642 patients waiting for cardiac surgery, 1 January 1986 to 31 December 1997. Main outcome measures Deprivation as measured by Carstairs deprivation category. Time spent on NHS waiting list. Results Patients who were most deprived tended to be younger and were more likely to be female. Patients in deprivation categories 6 and 7 (most deprived) waited about three weeks longer for surgery than those in category 1 (mean difference 24 days, 95% confidence interval 15 to 32). Deprived patients had an odds ratio of 0.5 (0.46 to 0.61) for having their operations classified as urgent compared with the least deprived, after allowance for age, sex, and type of operation. When urgent and routine cases were considered separately, there was no significant difference in waiting times between the most and least deprived categories. Conclusions Socioeconomically deprived patients are thought to be more likely to develop coronary heart disease but are less likely to be investigated and offered surgery once it has developed. Such patients may be further disadvantaged by having to wait longer for surgery because of being given lower priority. PMID:10617517

  9. Intra- and inter-population variability and evaluation of the physical development of a young generation.

    PubMed

    Yampolskaya, Yulia A

    2005-07-01

    This study is concerned with long-term anthropometric examinations of children and adolescents aged 3-17 years in Moscow (over 10,500 persons, longitudinal and cross-sectional). Population variability of physical development was analyzed by means of regional estimation tables, which were developed on the basis of a regression analysis (scale of the regression of body mass on body length within a range from M − 1σR to M + 2σR) and used for individual and group diagnostics taking into account age and sex. Such an approach allowed for the determination of the dynamics of the variability of Moscow schoolchildren from decade to decade (inter-population variability) and variations due to social differences (intra-population variability).

  10. Demonstration of a Fiber Optic Regression Probe in a High-Temperature Flow

    NASA Technical Reports Server (NTRS)

    Korman, Valentin; Polzin, Kurt

    2011-01-01

    The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are progressively slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high-temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by the fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for empirically anchoring any analysis geared towards lifetime qualification. Erosion rate data over an operating envelope could also be useful in modeling detailed physical processes. The sensor has been embedded in many regressing media to demonstrate its capabilities in a number of regressing environments. In the present work, sensors were installed in the eroding/regressing throat region of a converging-diverging flow, with the working gas heated to high temperatures by means of a high-pressure arc discharge at steady-state discharge power levels up to 500 kW. The amount of regression observed in each material sample was quantified using a laser profilometer, and the results were compared to the in-situ erosion measurements to demonstrate the efficacy of the measurement technique in very harsh, high-temperature environments.

  11. Testing in Microbiome-Profiling Studies with MiRKAT, the Microbiome Regression-Based Kernel Association Test

    PubMed Central

    Zhao, Ni; Chen, Jun; Carroll, Ian M.; Ringel-Kulka, Tamar; Epstein, Michael P.; Zhou, Hua; Zhou, Jin J.; Ringel, Yehuda; Li, Hongzhe; Wu, Michael C.

    2015-01-01

    High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals’ microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. “Optimal” MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for. PMID:25957468
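
    A rough sketch of a kernel association score test in the spirit of MiRKAT: residualize the outcome on covariates under the null, form Q = r'Kr from a kernel built by double-centering a distance matrix, and assess Q by permutation. The "microbiome" profile and Euclidean distance are placeholders (MiRKAT uses phylogenetically informed distances such as UniFrac and an analytical p-value).

```python
import numpy as np

rng = np.random.default_rng(14)
n = 100
covars = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + age
profile = rng.normal(size=(n, 30))                          # stand-in profile
y = covars @ [1.0, 0.5] + 0.4 * profile[:, 0] + rng.normal(size=n)

# Residuals from the null model (covariates only)
beta, *_ = np.linalg.lstsq(covars, y, rcond=None)
r = y - covars @ beta

# Kernel from pairwise squared Euclidean distances (Gower double centering)
D2 = ((profile[:, None, :] - profile[None, :, :]) ** 2).sum(-1)
J = np.eye(n) - np.ones((n, n)) / n
K = -0.5 * J @ D2 @ J

# Variance-component style score statistic with a permutation p-value
q_obs = r @ K @ r
perm = []
for _ in range(999):
    rp = rng.permutation(r)
    perm.append(rp @ K @ rp)
pval = (1 + sum(p >= q_obs for p in perm)) / 1000
print(f"Q = {q_obs:.1f}, permutation p = {pval:.3f}")
```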

  12. Hazard Regression Models of Early Mortality in Trauma Centers

    PubMed Central

    Clark, David E; Qian, Jing; Winchell, Robert J; Betensky, Rebecca A

    2013-01-01

    Background Factors affecting early hospital deaths after trauma may be different from factors affecting later hospital deaths, and the distribution of short and long prehospital times may vary among hospitals. Hazard regression (HR) models may therefore be more useful than logistic regression (LR) models for analysis of trauma mortality, especially when treatment effects at different time points are of interest. Study Design We obtained data for trauma center patients from the 2008–9 National Trauma Data Bank (NTDB). Cases were included if they had complete data for prehospital times, hospital times, survival outcome, age, vital signs, and severity scores. Cases were excluded if pulseless on admission, transferred in or out, or ISS<9. Using covariates proposed for the Trauma Quality Improvement Program and an indicator for each hospital, we compared LR models predicting survival at 8 hours after injury to HR models with survival censored at 8 hours. HR models were then modified to allow time-varying hospital effects. Results 85,327 patients in 161 hospitals met inclusion criteria. Crude hazards peaked initially, then steadily declined. When hazard ratios were assumed constant in HR models, they were similar to odds ratios in LR models associating increased mortality with increased age, firearm mechanism, increased severity, more deranged physiology, and estimated hospital-specific effects. However, when hospital effects were allowed to vary by time, HR models demonstrated that hospital outliers were not the same at different times after injury. Conclusions HR models with time-varying hazard ratios reveal inconsistencies in treatment effects, data quality, and/or timing of early death among trauma centers. HR models are generally more flexible than LR models, can be adapted for censored data, and potentially offer a better tool for analysis of factors affecting early death after injury. PMID:23036828
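
    A sketch of the hazard-regression side of this comparison using the lifelines package, with simulated trauma-style data censored at 8 hours. Covariates, effect sizes and column names are invented.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical data: time to death (hours) driven by age and injury severity
rng = np.random.default_rng(6)
n = 2000
age = rng.normal(45, 18, n).clip(16, 90)
iss = rng.integers(9, 60, n)                       # injury severity score
hazard = 0.002 * np.exp(0.015 * (age - 45) + 0.04 * (iss - 9))
time = rng.exponential(1.0 / hazard)

df = pd.DataFrame({
    "T": np.minimum(time, 8.0),                    # censor at 8 hours
    "E": (time <= 8.0).astype(int),                # death observed within 8h
    "age": age,
    "iss": iss,
})

cph = CoxPHFitter()
cph.fit(df, duration_col="T", event_col="E")
cph.print_summary()   # hazard ratios play the role of the LR odds ratios
```

    Unlike a logistic model of 8-hour survival, the Cox fit uses the censored times directly, and time-varying hospital effects can be accommodated by stratification or time-interaction terms.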

  13. A Unified and Comprehensible View of Parametric and Kernel Methods for Genomic Prediction with Application to Rice.

    PubMed

    Jacquin, Laval; Cao, Tuong-Vi; Ahmadi, Nourollah

    2016-01-01

    One objective of this study was to provide readers with a clear and unified understanding of parametric statistical and kernel methods, used for genomic prediction, and to compare some of these in the context of rice breeding for quantitative traits. A further objective was to provide a simple and user-friendly R package, named KRMM, which allows users to perform RKHS regression with several kernels. After introducing the concept of regularized empirical risk minimization, the connections between well-known parametric and kernel methods such as Ridge regression [i.e., genomic best linear unbiased predictor (GBLUP)] and reproducing kernel Hilbert space (RKHS) regression were reviewed. Ridge regression was then reformulated so as to show and emphasize the advantage of the kernel "trick" concept, exploited by kernel methods in the context of epistatic genetic architectures, over the parametric frameworks used by conventional methods. Several parametric and kernel methods, namely the least absolute shrinkage and selection operator (LASSO), GBLUP, support vector machine regression (SVR) and RKHS regression, were thereupon compared for their genomic predictive ability in the context of rice breeding using three real data sets. Among the compared methods, RKHS regression and SVR were often the most accurate methods for prediction, followed by GBLUP and LASSO. An R function which allows users to perform RR-BLUP of marker effects, GBLUP and RKHS regression, with a Gaussian, Laplacian, polynomial or ANOVA kernel, in a reasonable computation time has been developed. Moreover, a modified version of this function, which allows users to tune kernels for RKHS regression, has also been developed and parallelized for HPC Linux clusters. The corresponding KRMM package and all scripts have been made publicly available.
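
    A sketch of the parametric-versus-kernel contrast on simulated marker data: plain ridge regression (the linear-kernel/GBLUP analogue) against RBF-kernel ridge regression (an RKHS regression), compared by cross-validated R². This uses scikit-learn rather than the KRMM R package, and the marker coding, epistatic term and tuning values are toy assumptions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Toy genomic-prediction setup: n lines, p SNP markers coded 0/1/2,
# with an epistatic (multiplicative) component in the genetic signal.
rng = np.random.default_rng(7)
n, p = 300, 1000
X = rng.integers(0, 3, (n, p)).astype(float)
beta = np.zeros(p)
beta[:20] = rng.normal(0, 0.3, 20)
y = X @ beta + 0.5 * X[:, 0] * X[:, 1] + rng.normal(0, 1.0, n)

ridge = Ridge(alpha=100.0)                                 # ~ GBLUP (linear)
krr = KernelRidge(alpha=1.0, kernel="rbf", gamma=1.0 / p)  # RKHS regression

for name, model in [("ridge/GBLUP", ridge), ("RKHS (RBF)", krr)]:
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: CV R^2 = {score:.3f}")
```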

  14. Semiparametric regression during 2003–2007*

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  15. Bayesian nonparametric regression with varying residual density

    PubMed Central

    Pati, Debdeep; Dunson, David B.

    2013-01-01

    We consider the problem of robust Bayesian inference on the mean regression function allowing the residual density to change flexibly with predictors. The proposed class of models is based on a Gaussian process prior for the mean regression function and mixtures of Gaussians for the collection of residual densities indexed by predictors. Initially considering the homoscedastic case, we propose priors for the residual density based on probit stick-breaking (PSB) scale mixtures and symmetrized PSB (sPSB) location-scale mixtures. Both priors restrict the residual density to be symmetric about zero, with the sPSB prior more flexible in allowing multimodal densities. We provide sufficient conditions to ensure strong posterior consistency in estimating the regression function under the sPSB prior, generalizing existing theory focused on parametric residual distributions. The PSB and sPSB priors are generalized to allow residual densities to change nonparametrically with predictors through incorporating Gaussian processes in the stick-breaking components. This leads to a robust Bayesian regression procedure that automatically down-weights outliers and influential observations in a locally-adaptive manner. Posterior computation relies on an efficient data augmentation exact block Gibbs sampler. The methods are illustrated using simulated and real data applications. PMID:24465053

  16. Meta-analysis of haplotype-association studies: comparison of methods and empirical evaluation of the literature

    PubMed Central

    2011-01-01

    Background Meta-analysis is a popular methodology in several fields of medical research, including genetic association studies. However, the methods used for meta-analysis of association studies that report haplotypes have not been studied in detail. In this work, methods for performing meta-analysis of haplotype association studies are summarized, compared and presented in a unified framework, along with an empirical evaluation of the literature. Results We present multivariate methods that use summary-based data as well as methods that use binary and count data in a generalized linear mixed model framework (logistic regression, multinomial regression and Poisson regression). The methods presented here avoid the inflation of the type I error rate that could result from the traditional approach of comparing a haplotype against the remaining ones, while they can be fitted using standard software. Moreover, formal global tests are presented for assessing the statistical significance of the overall association. Although the methods presented here assume that the haplotypes are directly observed, they can easily be extended to allow for such uncertainty by weighting the haplotypes by their probability. Conclusions An empirical evaluation of the published literature, and a comparison against meta-analyses that use single nucleotide polymorphisms, suggest that meta-analyses of haplotypes include approximately half as many studies and produce significant results twice as often. We show that this excess of statistically significant results stems from the sub-optimal method of analysis used and that, in approximately half of the cases, the statistical significance is refuted if the data are properly re-analyzed. Illustrative examples of code are given in Stata, and it is anticipated that the methods developed in this work will be widely applied in the meta-analysis of haplotype association studies. PMID:21247440

  17. What is new about covered interest parity condition in the European Union? Evidence from fractal cross-correlation regressions

    NASA Astrophysics Data System (ADS)

    Ferreira, Paulo; Kristoufek, Ladislav

    2017-11-01

    We analyse the covered interest parity (CIP) using two novel regression frameworks based on cross-correlation analysis (detrended cross-correlation analysis and detrending moving-average cross-correlation analysis), which allow relationships to be studied at different scales and work well under non-stationarity and heavy tails. CIP is a measure of capital mobility commonly used to analyse financial integration, which remains an interesting object of study in the context of the European Union. Its importance stems from the fact that the adoption of a common currency brings certain benefits for countries, but also involves risks such as the loss of economic instruments with which to face possible asymmetric shocks. While studying the Eurozone members could explain some problems in the common currency, studying the non-Euro countries is important for analysing whether they are fit to realise the possible benefits. Our results point to verification of CIP mainly in the Central European countries; in the remaining countries, verification of the parity is only residual.
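
    A minimal implementation of the detrended cross-correlation coefficient underlying one of the frameworks mentioned. This computes rho_DCCA at a given scale; the full DCCA regression estimator additionally forms ratios of detrended covariances, which is beyond this sketch. The two series are invented.

```python
import numpy as np

def dcca_coefficient(x, y, scale, order=1):
    """Detrended cross-correlation coefficient rho_DCCA at one scale:
    integrate both series, split into non-overlapping windows of length
    `scale`, detrend each window with a degree-`order` polynomial, and
    correlate the residual fluctuations."""
    xi, yi = np.cumsum(x - x.mean()), np.cumsum(y - y.mean())
    n = len(xi) // scale * scale
    fxx = fyy = fxy = 0.0
    t = np.arange(scale)
    for s in range(0, n, scale):
        rx = xi[s:s+scale] - np.polyval(np.polyfit(t, xi[s:s+scale], order), t)
        ry = yi[s:s+scale] - np.polyval(np.polyfit(t, yi[s:s+scale], order), t)
        fxx += (rx ** 2).sum()
        fyy += (ry ** 2).sum()
        fxy += (rx * ry).sum()
    return fxy / np.sqrt(fxx * fyy)

# Hypothetical interest-differential and forward-premium series
rng = np.random.default_rng(8)
common = np.cumsum(rng.normal(0, 1, 2000))
x = np.diff(common) + rng.normal(0, 0.5, 1999)
y = np.diff(common) + rng.normal(0, 0.5, 1999)
for s in (8, 32, 128):
    print(f"scale {s}: rho_DCCA = {dcca_coefficient(x, y, s):.3f}")
```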

  18. Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.

    PubMed

    Houpt, Joseph W; Bittner, Jennifer L

    2018-07-01

    Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.
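
    A deliberately simplified, non-hierarchical sketch of the core step: estimating one observer's psychometric function by logistic regression of accuracy on stimulus intensity and reading off a threshold. The hierarchical Bayesian model in the paper additionally pools such estimates across observers and propagates their uncertainty; data here are simulated.

```python
import numpy as np
import statsmodels.api as sm

# Simulated trials: 6 intensity levels, 50 trials each, for one observer
rng = np.random.default_rng(9)
intensity = np.repeat(np.linspace(0.5, 3.0, 6), 50)
p_true = 1 / (1 + np.exp(-2.0 * (intensity - 1.5)))
correct = rng.binomial(1, p_true)

X = sm.add_constant(intensity)
fit = sm.Logit(correct, X).fit(disp=False)
b0, b1 = fit.params
threshold = -b0 / b1          # intensity at 50% accuracy
print(f"slope={b1:.2f}, threshold={threshold:.2f}")
# Efficiency would compare a human threshold against an ideal observer's
# threshold on the same stimuli, as the abstract describes.
```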

  19. MIXREG: a computer program for mixed-effects regression analysis with autocorrelated errors.

    PubMed

    Hedeker, D; Gibbons, R D

    1996-05-01

    MIXREG is a program that provides estimates for a mixed-effects regression model (MRM) for normally-distributed response data including autocorrelated errors. This model can be used for analysis of unbalanced longitudinal data, where individuals may be measured at a different number of timepoints, or even at different timepoints. Autocorrelated errors of a general form or following an AR(1), MA(1), or ARMA(1,1) form are allowable. This model can also be used for analysis of clustered data, where the mixed-effects model assumes data within clusters are dependent. The degree of dependency is estimated jointly with estimates of the usual model parameters, thus adjusting for clustering. MIXREG uses maximum marginal likelihood estimation, utilizing both the EM algorithm and a Fisher-scoring solution. For the scoring solution, the covariance matrix of the random effects is expressed in its Gaussian decomposition, and the diagonal matrix reparameterized using the exponential transformation. Estimation of the individual random effects is accomplished using an empirical Bayes approach. Examples illustrating usage and features of MIXREG are provided.
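
    MIXREG itself is a standalone program; as a rough modern analogue, here is a mixed-effects regression with a random intercept and slope on unbalanced longitudinal toy data using statsmodels. Note that MixedLM does not fit the AR(1)/MA(1)/ARMA(1,1) residual structures MIXREG supports, so this sketch covers only the random-effects part.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Unbalanced longitudinal toy data: subjects measured at varying numbers
# of timepoints, with subject-specific intercepts and slopes.
rng = np.random.default_rng(10)
rows = []
for subj in range(60):
    b0 = rng.normal(0, 1.0)             # random intercept
    b1 = rng.normal(0, 0.2)             # random slope
    for t in sorted(rng.choice(10, size=rng.integers(3, 9), replace=False)):
        y = 2.0 + 0.5 * t + b0 + b1 * t + rng.normal(0, 0.5)
        rows.append((subj, float(t), y))
df = pd.DataFrame(rows, columns=["subj", "time", "y"])

# Random intercept and slope per subject, fitted by (RE)ML
m = smf.mixedlm("y ~ time", df, groups="subj", re_formula="~time").fit()
print(m.summary())
```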

  20. The perception of the relationship between environment and health according to data from Italian Behavioural Risk Factor Surveillance System (PASSI).

    PubMed

    Sampaolo, Letizia; Tommaso, Giulia; Gherardi, Bianca; Carrozzi, Giuliano; Freni Sterrantino, Anna; Ottone, Marta; Goldoni, Carlo Alberto; Bertozzi, Nicoletta; Scaringi, Meri; Bolognesi, Lara; Masocco, Maria; Salmaso, Stefania; Lauriola, Paolo

    2017-01-01

    "OBJECTIVES: to identify groups of people in relation to the perception of environmental risk and to assess the main characteristics using data collected in the environmental module of the surveillance network Italian Behavioral Risk Factor Surveillance System (PASSI). perceptive profiles were identified using a latent class analysis; later they were included as outcome in multinomial logistic regression models to assess the association between environmental risk perception and demographic, health, socio-economic and behavioural variables. the latent class analysis allowed to split the sample in "worried", "indifferent", and "positive" people. The multinomial logistic regression model showed that the "worried" profile typically includes people of Italian nationality, living in highly urbanized areas, with a high level of education, and with economic difficulties; they pay special attention to their own health and fitness, but they have a negative perception of their own psychophysical state. the application of advanced statistical analysis enable to appraise PASSI data in order to characterize the perception of environmental risk, making the planning of interventions related to risk communication possible. ".

  1. High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis

    PubMed Central

    Daye, Z. John; Chen, Jinbo; Li, Hongzhe

    2011-01-01

    Summary We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in and apply our method to an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis. PMID:22547833
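
    A crude sketch of the general idea (not the authors' doubly regularized estimator): alternate a weighted lasso for the mean with a lasso on log squared residuals for the variance, so predictors driving the error variance are detected and downweighted. Data and tuning values are invented, and this assumes a scikit-learn version in which Lasso.fit accepts sample_weight.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(15)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]            # sparse mean signal
gamma = np.zeros(p)
gamma[3] = 1.0                         # one predictor drives the variance
sigma = np.exp(0.5 * (X @ gamma))
y = X @ beta + sigma * rng.normal(size=n)

w = np.ones(n)
for _ in range(5):
    # Weighted lasso for the mean, then a lasso for the log-variance
    mean_fit = Lasso(alpha=0.05).fit(X, y, sample_weight=w)
    resid2 = (y - mean_fit.predict(X)) ** 2
    var_fit = Lasso(alpha=0.05).fit(X, np.log(resid2 + 1e-8))
    w = 1.0 / np.exp(var_fit.predict(X))   # inverse-variance weights

print("mean-model support:", np.flatnonzero(mean_fit.coef_))
print("variance-model support:", np.flatnonzero(var_fit.coef_))
```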

  2. Project risk management in the construction of high-rise buildings

    NASA Astrophysics Data System (ADS)

    Titarenko, Boris; Hasnaoui, Amir; Titarenko, Roman; Buzuk, Liliya

    2018-03-01

    This paper presents project risk management methods that allow risks in the construction of high-rise buildings to be identified more effectively and managed throughout the life cycle of the project. One of the project risk management processes is a quantitative analysis of risks, which usually includes assessing the potential impact of project risks and their probabilities. This paper reviews the most popular methods of risk probability assessment and seeks to show the advantages of the robust approach over traditional methods. Within the framework of the project risk management model, the robust approach of P. Huber is applied and extended to the regression analysis of project data. The suggested algorithms for estimating the parameters of statistical models yield reliable estimates. The theoretical problems of developing robust models built on the methodology of minimax estimates are reviewed, and an algorithm for the situation of asymmetric "contamination" is developed.
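
    A minimal illustration of why a Huber-type robust estimator suits asymmetrically "contaminated" project data: a few one-sided outliers distort the OLS slope but barely move the Huber M-estimate. The cost and area variables are invented, and scikit-learn's HuberRegressor stands in for the paper's estimator.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

# Toy project data with asymmetric contamination: a handful of grossly
# overrun cost observations pull the least-squares fit upward.
rng = np.random.default_rng(11)
floor_area = rng.uniform(1.0, 10.0, 80)            # predictor (1000 m2)
cost = 5.0 + 2.0 * floor_area + rng.normal(0, 0.5, 80)
cost[:6] += rng.uniform(10, 25, 6)                 # one-sided outliers

X = floor_area.reshape(-1, 1)
ols = LinearRegression().fit(X, cost)
hub = HuberRegressor(epsilon=1.35).fit(X, cost)    # 1.35 = classic Huber tuning
print(f"OLS slope:   {ols.coef_[0]:.2f}")
print(f"Huber slope: {hub.coef_[0]:.2f}  (closer to the true 2.0)")
```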

  3. MAGMA: Generalized Gene-Set Analysis of GWAS Data

    PubMed Central

    de Leeuw, Christiaan A.; Mooij, Joris M.; Heskes, Tom; Posthuma, Danielle

    2015-01-01

    By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn’s Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn’s Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn’s Disease data was found to be considerably faster as well. PMID:25885710

  4. MAGMA: generalized gene-set analysis of GWAS data.

    PubMed

    de Leeuw, Christiaan A; Mooij, Joris M; Heskes, Tom; Posthuma, Danielle

    2015-04-01

    By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.

  5. Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

    PubMed

    Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

    2017-01-01

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM with two integrative genomics analysis examples, a breast cancer dataset and an ovarian cancer dataset, assessing the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12, whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.

  6. Predicting U.S. Army Reserve Unit Manning Using Market Demographics

    DTIC Science & Technology

    2015-06-01

    develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... The logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit... manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.

  7. Analyzing Student Learning Outcomes: Usefulness of Logistic and Cox Regression Models. IR Applications, Volume 5

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2005-01-01

    Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…

  8. A mixed-effects regression model for longitudinal multivariate ordinal data.

    PubMed

    Liu, Li C; Hedeker, Donald

    2006-03-01

    A mixed-effects item response theory model that allows for three-level multivariate ordinal outcomes and accommodates multiple random subject effects is proposed for analysis of multivariate ordinal outcomes in longitudinal studies. This model allows for the estimation of different item factor loadings (item discrimination parameters) for the multiple outcomes. The covariates in the model do not have to follow the proportional odds assumption and can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is proposed utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher scoring solution, which provides standard errors for all model parameters, is used. An analysis of a longitudinal substance use data set, where four items of substance use behavior (cigarette use, alcohol use, marijuana use, and getting drunk or high) are repeatedly measured over time, is used to illustrate application of the proposed model.

  9. Estimating life expectancies for US small areas: a regression framework

    NASA Astrophysics Data System (ADS)

    Congdon, Peter

    2014-01-01

    Analysis of area mortality variations and estimation of area life tables raise methodological questions relevant to assessing spatial clustering, and socioeconomic inequalities in mortality. Existing small area analyses of US life expectancy variation generally adopt ad hoc amalgamations of counties to alleviate potential instability of mortality rates involved in deriving life tables, and use conventional life table analysis which takes no account of correlated mortality for adjacent areas or ages. The alternative strategy here uses structured random effects methods that recognize correlations between adjacent ages and areas, and allows retention of the original county boundaries. This strategy generalizes to include effects of area category (e.g. poverty status, ethnic mix), allowing estimation of life tables according to area category, and providing additional stabilization of estimated life table functions. This approach is used here to estimate stabilized mortality rates, derive life expectancies in US counties, and assess trends in clustering and in inequality according to county poverty category.

  10. The Necessity-Concerns-Framework: A Multidimensional Theory Benefits from Multidimensional Analysis

    PubMed Central

    Phillips, L. Alison; Diefenbach, Michael; Kronish, Ian M.; Negron, Rennie M.; Horowitz, Carol R.

    2014-01-01

    Background Patients’ medication-related concerns and necessity-beliefs predict adherence. Evaluation of the potentially complex interplay of these two dimensions has been limited because of methods that reduce them to a single dimension (difference scores). Purpose We use polynomial regression to assess the multidimensional effect of stroke-event survivors’ medication-related concerns and necessity-beliefs on their adherence to stroke-prevention medication. Methods Survivors (n=600) rated their concerns, necessity-beliefs, and adherence to medication. Confirmatory and exploratory polynomial regression determined the best-fitting multidimensional model. Results As posited by the Necessity-Concerns Framework (NCF), the greatest and lowest adherence was reported by those with strong necessity-beliefs/weak concerns and strong concerns/weak necessity-beliefs, respectively. However, as could not be assessed using a difference-score model, patients with ambivalent beliefs were less adherent than those exhibiting indifference. Conclusions Polynomial regression allows for assessment of the multidimensional nature of the NCF. Clinicians/Researchers should be aware that concerns and necessity dimensions are not polar opposites. PMID:24500078

  11. The necessity-concerns framework: a multidimensional theory benefits from multidimensional analysis.

    PubMed

    Phillips, L Alison; Diefenbach, Michael A; Kronish, Ian M; Negron, Rennie M; Horowitz, Carol R

    2014-08-01

    Patients' medication-related concerns and necessity-beliefs predict adherence. Evaluation of the potentially complex interplay of these two dimensions has been limited because of methods that reduce them to a single dimension (difference scores). We use polynomial regression to assess the multidimensional effect of stroke-event survivors' medication-related concerns and necessity beliefs on their adherence to stroke-prevention medication. Survivors (n = 600) rated their concerns, necessity beliefs, and adherence to medication. Confirmatory and exploratory polynomial regression determined the best-fitting multidimensional model. As posited by the necessity-concerns framework (NCF), the greatest and lowest adherence was reported by those with strong necessity-beliefs/weak concerns and strong concerns/weak necessity-beliefs, respectively. However, as could not be assessed using a difference-score model, patients with ambivalent beliefs were less adherent than those exhibiting indifference. Polynomial regression allows for assessment of the multidimensional nature of the NCF. Clinicians/Researchers should be aware that concerns and necessity dimensions are not polar opposites.
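
    A minimal sketch of the polynomial (response-surface) approach described above, assuming a data frame with hypothetical adherence, concerns, and necessity columns; the papers' exact model-selection steps are not reproduced.

    ```python
    # Sketch: full second-order polynomial keeps concerns and necessity as
    # separate dimensions instead of collapsing them into a difference score.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("survivors.csv")  # hypothetical n=600 survey data

    poly = smf.ols(
        "adherence ~ concerns + necessity + I(concerns**2) "
        "+ I(necessity**2) + concerns:necessity",
        data=df,
    ).fit()
    print(poly.summary())

    # A difference-score model, for comparison, forces a single dimension:
    diff = smf.ols("adherence ~ I(necessity - concerns)", data=df).fit()
    print(diff.rsquared, poly.rsquared)
    ```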

  12. Computational tools for exact conditional logistic regression.

    PubMed

    Corcoran, C; Mehta, C; Patel, N; Senchaudhuri, P

    Logistic regression analyses are often challenged by the inability of unconditional likelihood-based approximations to yield consistent, valid estimates and p-values for model parameters. This can be due to sparseness or separability in the data. Conditional logistic regression, though useful in such situations, can also be computationally unfeasible when the sample size or number of explanatory covariates is large. We review recent developments that allow efficient approximate conditional inference, including Monte Carlo sampling and saddlepoint approximations. We demonstrate through real examples that these methods enable the analysis of significantly larger and more complex data sets. We find in this investigation that for these moderately large data sets Monte Carlo seems a better alternative, as it provides unbiased estimates of the exact results and can be executed in less CPU time than can the single saddlepoint approximation. Moreover, the double saddlepoint approximation, while computationally the easiest to obtain, offers little practical advantage. It produces unreliable results and cannot be computed when a maximum likelihood solution does not exist. Copyright 2001 John Wiley & Sons, Ltd.
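
    For orientation, a sketch of conditional logistic regression on simulated matched-set data using statsmodels; the exact and Monte Carlo conditional methods reviewed in the paper require specialised software (e.g. LogXact) and are not shown here.

    ```python
    # Sketch: conditional logistic regression for stratified/sparse data.
    import numpy as np
    from statsmodels.discrete.conditional_models import ConditionalLogit

    rng = np.random.default_rng(0)
    n, n_strata = 400, 100
    strata = np.repeat(np.arange(n_strata), n // n_strata)  # matched sets
    x = rng.normal(size=(n, 2))
    logit = 0.8 * x[:, 0] - 0.5 * x[:, 1]
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    # Conditioning on each stratum's outcome total removes the stratum
    # nuisance parameters from the likelihood.
    fit = ConditionalLogit(y, x, groups=strata).fit()
    print(fit.summary())
    ```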

  13. Feasibility study of palm-based fuels for hybrid rocket motor applications

    NASA Astrophysics Data System (ADS)

    Tarmizi Ahmad, M.; Abidin, Razali; Taha, A. Latif; Anudip, Amzaryi

    2018-02-01

    This paper describes a combined analysis of pure palm-based wax as a solid fuel for a hybrid rocket engine. The calorific value of the pure palm wax was measured using a bomb calorimeter, and an experimental rocket engine and static test stand facility were established. After initial measurement and calibration, repeated test procedures were performed. The installed instrumentation allows measurement of fuel regression rates, oxidizer mass flow rates, and rocket motor performance. Similar tests were also carried out with stearic acid (a palm oil by-product) dissolved with nitrocellulose and beeswax solution. Calculations and experiments show that useful regression rates and thrust can be achieved even with pure palm-based wax. Additionally, the palm-based wax was blended with beeswax, which has a higher nominal melting temperature, to raise the melting point of the fuel without affecting its regression rate. Calorific measurements and ballistic experiments were performed on this new fuel formulation, which promises applications across a wide range of operating temperatures.
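
    Regression-rate measurements of this kind are commonly reduced to the hybrid-fuel power law r = a·Gox^n. The sketch below fits that law to made-up flux/rate pairs, not the paper's data.

    ```python
    # Sketch: fit r = a * Gox**n by linearising, ln r = ln a + n ln Gox.
    import numpy as np

    gox = np.array([40.0, 60.0, 85.0, 110.0, 150.0])  # oxidizer flux, kg/m^2/s
    r = np.array([0.9, 1.2, 1.5, 1.8, 2.2])           # regression rate, mm/s

    n_exp, ln_a = np.polyfit(np.log(gox), np.log(r), 1)
    a = np.exp(ln_a)
    print(f"r ≈ {a:.3f} * Gox^{n_exp:.3f}")
    ```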

  14. EUCLID: an outcome analysis tool for high-dimensional clinical studies

    NASA Astrophysics Data System (ADS)

    Gayou, Olivier; Parda, David S.; Miften, Moyed

    2007-03-01

    Treatment management decisions in three-dimensional conformal radiation therapy (3DCRT) and intensity-modulated radiation therapy (IMRT) are usually made based on the dose distributions in the target and surrounding normal tissue. These decisions may include, for example, the choice of one treatment over another and the level of tumour dose escalation. Furthermore, biological predictors such as tumour control probability (TCP) and normal tissue complication probability (NTCP), whose parameters available in the literature are only population-based estimates, are often used to assess and compare plans. However, a number of other clinical, biological and physiological factors also affect the outcome of radiotherapy treatment and are often not considered in the treatment planning and evaluation process. A statistical outcome analysis tool, EUCLID, for direct use by radiation oncologists and medical physicists was developed. The tool builds a mathematical model to predict an outcome probability based on a large number of clinical, biological, physiological and dosimetric factors. EUCLID can first analyse a large set of patients, such as from a clinical trial, to derive regression correlation coefficients between these factors and a given outcome. It can then apply such a model to an individual patient at the time of treatment to derive the probability of that outcome, allowing the physician to individualize the treatment based on medical evidence that encompasses a wide range of factors. The software's flexibility allows the clinicians to explore several avenues to select the best predictors of a given outcome. Its link to record-and-verify systems and data spreadsheets allows for a rapid and practical data collection and manipulation. A wide range of statistical information about the study population, including demographics and correlations between different factors, is available. A large number of one- and two-dimensional plots, histograms and survival curves allow for an easy visual analysis of the population. Several visual and analytical methods are available to quantify the predictive power of the multivariate regression model. The EUCLID tool can be readily integrated with treatment planning and record-and-verify systems.

  15. EUCLID: an outcome analysis tool for high-dimensional clinical studies.

    PubMed

    Gayou, Olivier; Parda, David S; Miften, Moyed

    2007-03-21

    Treatment management decisions in three-dimensional conformal radiation therapy (3DCRT) and intensity-modulated radiation therapy (IMRT) are usually made based on the dose distributions in the target and surrounding normal tissue. These decisions may include, for example, the choice of one treatment over another and the level of tumour dose escalation. Furthermore, biological predictors such as tumour control probability (TCP) and normal tissue complication probability (NTCP), whose parameters available in the literature are only population-based estimates, are often used to assess and compare plans. However, a number of other clinical, biological and physiological factors also affect the outcome of radiotherapy treatment and are often not considered in the treatment planning and evaluation process. A statistical outcome analysis tool, EUCLID, for direct use by radiation oncologists and medical physicists was developed. The tool builds a mathematical model to predict an outcome probability based on a large number of clinical, biological, physiological and dosimetric factors. EUCLID can first analyse a large set of patients, such as from a clinical trial, to derive regression correlation coefficients between these factors and a given outcome. It can then apply such a model to an individual patient at the time of treatment to derive the probability of that outcome, allowing the physician to individualize the treatment based on medical evidence that encompasses a wide range of factors. The software's flexibility allows the clinicians to explore several avenues to select the best predictors of a given outcome. Its link to record-and-verify systems and data spreadsheets allows for a rapid and practical data collection and manipulation. A wide range of statistical information about the study population, including demographics and correlations between different factors, is available. A large number of one- and two-dimensional plots, histograms and survival curves allow for an easy visual analysis of the population. Several visual and analytical methods are available to quantify the predictive power of the multivariate regression model. The EUCLID tool can be readily integrated with treatment planning and record-and-verify systems.
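
    A toy version of the two-stage workflow EUCLID automates, with hypothetical clinical and dosimetric feature names; EUCLID itself is a dedicated tool, and this sketch only mirrors the derive-then-apply idea.

    ```python
    # Sketch: derive a regression model from a trial population, then
    # score an individual patient at planning time.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    cohort = pd.read_csv("trial_patients.csv")   # hypothetical trial data
    features = ["age", "tumour_volume", "mean_lung_dose", "fev1"]

    model = LogisticRegression(max_iter=1000)
    model.fit(cohort[features], cohort["complication"])  # 0/1 outcome

    new_patient = pd.DataFrame(
        [{"age": 63, "tumour_volume": 41.0, "mean_lung_dose": 17.5, "fev1": 2.1}]
    )
    print("Predicted outcome probability:",
          model.predict_proba(new_patient)[0, 1])
    ```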

  16. Computerized dynamic posturography: the influence of platform stability on postural control.

    PubMed

    Palm, Hans-Georg; Lang, Patricia; Strobel, Johannes; Riesner, Hans-Joachim; Friemert, Benedikt

    2014-01-01

    Postural stability can be quantified using posturography systems, which allow different foot platform stability settings to be selected. It is unclear, however, how platform stability and postural control are mathematically correlated. Twenty subjects performed tests on the Biodex Stability System at all 13 stability levels. Overall stability index, medial-lateral stability index, and anterior-posterior stability index scores were calculated, and data were analyzed using analysis of variance and linear regression analysis. A decrease in platform stability from the static level to the second least stable level was associated with a linear decrease in postural control. The overall stability index scores were 1.5 ± 0.8 degrees (static), 2.2 ± 0.9 degrees (level 8), and 3.6 ± 1.7 degrees (level 2). The slope of the regression lines was 0.17 for the men and 0.10 for the women. A linear correlation was demonstrated between platform stability and postural control. The influence of stability levels seems to be almost twice as high in men as in women.

  17. A canonical correlation neural network for multicollinearity and functional data.

    PubMed

    Gou, Zhenkun; Fyfe, Colin

    2004-03-01

    We review a recent neural implementation of Canonical Correlation Analysis and show, using ideas suggested by Ridge Regression, how to make the algorithm robust. The network is shown to operate on data sets which exhibit multicollinearity. We develop a second model which performs well not only on multicollinear data but also on general data sets. This model allows us to vary a single parameter so that the network is capable of anything from Partial Least Squares regression (at one extreme) to Canonical Correlation Analysis (at the other), and every intermediate operation between the two. On multicollinear data, the parameter setting is shown to be important, but on more general data no particular parameter setting is required. Finally, we develop a second penalty term which acts on such data as a smoother, in that the resulting weight vectors are much smoother and more interpretable than the weights without the robustification term. We illustrate our algorithms on both artificial and real data.
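
    A linear-algebra sketch (not the paper's neural network) of the single-parameter blend the abstract describes: ridging the covariance matrices moves the solution between CCA (tau=0) and a PLS-like maximal-covariance solution (tau=1).

    ```python
    # Sketch: blended CCA/PLS via a ridge parameter on the covariances.
    import numpy as np
    from scipy.linalg import eigh, block_diag

    def blended_cca(X, Y, tau=0.5):
        X = X - X.mean(0); Y = Y - Y.mean(0)
        n, p = X.shape; q = Y.shape[1]
        Cxx, Cyy = X.T @ X / n, Y.T @ Y / n
        Cxy = X.T @ Y / n
        # Ridge toward identity: robust under multicollinearity.
        Bx = (1 - tau) * Cxx + tau * np.eye(p)
        By = (1 - tau) * Cyy + tau * np.eye(q)
        A = np.block([[np.zeros((p, p)), Cxy], [Cxy.T, np.zeros((q, q))]])
        vals, vecs = eigh(A, block_diag(Bx, By))
        w = vecs[:, -1]                      # leading canonical pair
        return vals[-1], w[:p], w[p:]

    rho, wx, wy = blended_cca(np.random.randn(200, 5),
                              np.random.randn(200, 3), tau=0.0)
    print(rho)
    ```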

  18. Multivariate meta-analysis using individual participant data.

    PubMed

    Riley, R D; Price, M J; Jackson, D; Wardle, M; Gueyffier, F; Wang, J; Staessen, J A; White, I R

    2015-06-01

    When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment-covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. © 2014 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.
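
    For one study with IPD, the bootstrapping idea reduces to resampling patients and recomputing both effect estimates; a small sketch on simulated data (treatment effects on two hypothetical blood-pressure outcomes) follows.

    ```python
    # Sketch: bootstrap estimate of the within-study correlation between
    # two outcome effect estimates, from one study's (simulated) IPD.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    treat = rng.integers(0, 2, n)
    sbp = -5.0 * treat + rng.normal(0, 10, n)   # outcome 1
    dbp = -3.0 * treat + rng.normal(0, 7, n)    # outcome 2

    def effects(idx):
        t = treat[idx]
        return (sbp[idx][t == 1].mean() - sbp[idx][t == 0].mean(),
                dbp[idx][t == 1].mean() - dbp[idx][t == 0].mean())

    boot = np.array([effects(rng.integers(0, n, n)) for _ in range(1000)])
    print("within-study correlation:", np.corrcoef(boot.T)[0, 1])
    ```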

  19. Heritability estimations for inner muscular fat in Hereford cattle using random regressions

    USDA-ARS?s Scientific Manuscript database

    Random regressions make it possible to obtain genetic predictions and parameter estimates across a gradient of environments, allowing a more accurate and beneficial use of animals as breeders in specific environments. The objective of this study was to use random regression models to estimate heritabil...

  20. Adaptation of a weighted regression approach to evaluate water quality trends in an estuary

    EPA Science Inventory

    To improve the description of long-term changes in water quality, a weighted regression approach developed to describe trends in pollutant transport in rivers was adapted to analyze a long-term water quality dataset from Tampa Bay, Florida. The weighted regression approach allows...

  1. SEPARABLE FACTOR ANALYSIS WITH APPLICATIONS TO MORTALITY DATA

    PubMed Central

    Fosdick, Bailey K.; Hoff, Peter D.

    2014-01-01

    Human mortality data sets can be expressed as multiway data arrays, the dimensions of which correspond to categories by which mortality rates are reported, such as age, sex, country and year. Regression models for such data typically assume an independent error distribution or an error model that allows for dependence along at most one or two dimensions of the data array. However, failing to account for other dependencies can lead to inefficient estimates of regression parameters, inaccurate standard errors and poor predictions. An alternative to assuming independent errors is to allow for dependence along each dimension of the array using a separable covariance model. However, the number of parameters in this model increases rapidly with the dimensions of the array and, for many arrays, maximum likelihood estimates of the covariance parameters do not exist. In this paper, we propose a submodel of the separable covariance model that estimates the covariance matrix for each dimension as having factor analytic structure. This model can be viewed as an extension of factor analysis to array-valued data, as it uses a factor model to estimate the covariance along each dimension of the array. We discuss properties of this model as they relate to ordinary factor analysis, describe maximum likelihood and Bayesian estimation methods, and provide a likelihood ratio testing procedure for selecting the factor model ranks. We apply this methodology to the analysis of data from the Human Mortality Database, and show in a cross-validation experiment how it outperforms simpler methods. Additionally, we use this model to impute mortality rates for countries that have no mortality data for several years. Unlike other approaches, our methodology is able to estimate similarities between the mortality rates of countries, time periods and sexes, and use this information to assist with the imputations. PMID:25489353

  2. Continuity of cannabis use and violent offending over the life course.

    PubMed

    Schoeler, T; Theobald, D; Pingault, J-B; Farrington, D P; Jennings, W G; Piquero, A R; Coid, J W; Bhattacharyya, S

    2016-06-01

    Although the association between cannabis use and violence has been reported in the literature, the precise nature of this relationship, especially the directionality of the association, is unclear. Young males from the Cambridge Study in Delinquent Development (n = 411) were followed up between the ages of 8 and 56 years to prospectively investigate the association between cannabis use and violence. A multi-wave (eight assessments, T1-T8) follow-up design was employed that allowed temporal sequencing of the variables of interest and the analysis of violent outcome measures obtained from two sources: (i) criminal records (violent conviction); and (ii) self-reports. A combination of analytic approaches allowing inferences as to the directionality of associations was employed, including multivariate logistic regression analysis, fixed-effects analysis and cross-lagged modelling. Multivariable logistic regression revealed that compared with never-users, continued exposure to cannabis (use at age 18, 32 and 48 years) was associated with a higher risk of subsequent violent behaviour, as indexed by convictions [odds ratio (OR) 7.1, 95% confidence interval (CI) 2.19-23.59] or self-reports (OR 8.9, 95% CI 2.37-46.21). This effect persisted after controlling for other putative risk factors for violence. In predicting violence, fixed-effects analysis and cross-lagged modelling further indicated that this effect could not be explained by other unobserved time-invariant factors. Furthermore, these analyses uncovered a bi-directional relationship between cannabis use and violence. Together, these results provide strong indication that cannabis use predicts subsequent violent offending, suggesting a possible causal effect, and provide empirical evidence that may have implications for public policy.

  3. Development of a chromatographic method with multi-criteria decision making design for simultaneous determination of nifedipine and atenolol in content uniformity testing.

    PubMed

    Ahmed, Sameh; Alqurshi, Abdulmalik; Mohamed, Abdel-Maaboud Ismail

    2018-07-01

    A new robust and reliable high-performance liquid chromatography (HPLC) method with a multi-criteria decision making (MCDM) approach was developed to allow simultaneous quantification of atenolol (ATN) and nifedipine (NFD) in content uniformity testing. Felodipine (FLD) was used as an internal standard (I.S.) in this study. A novel marriage between a new interactive response optimizer and an HPLC method was suggested for multiple response optimizations of target responses. An interactive response optimizer was used as a decision and prediction tool for the optimal settings of target responses, according to specified criteria, based on Derringer's desirability. Four independent variables were considered in this study: acetonitrile percentage, buffer pH, buffer concentration, and column temperature. Eight responses were optimized: retention times of ATN, NFD, and FLD; resolutions between ATN/NFD and NFD/FLD; and plate numbers for ATN, NFD, and FLD. Multiple regression analysis was applied in order to scan the influences of the most significant variables for the regression models. The experimental design was set to give minimum retention times and maximum resolution and plate numbers. The interactive response optimizer allowed prediction of optimum conditions according to these criteria with a good composite desirability value of 0.98156. The developed method was validated according to the International Conference on Harmonization (ICH) guidelines with the aid of the experimental design. The developed MCDM-HPLC method showed superior robustness and resolution in a short analysis time, allowing successful simultaneous content uniformity testing of ATN and NFD in marketed capsules. The current work presents an interactive response optimizer as an efficient platform to optimize, predict responses, and validate HPLC methodology with a tolerable design space for assays in quality control laboratories. Copyright © 2018 Elsevier B.V. All rights reserved.
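
    The composite desirability such an optimizer maximises is Derringer's geometric mean of per-response desirabilities; a sketch with invented targets and limits (not the paper's specification) follows.

    ```python
    # Sketch of Derringer's desirability: map each response to [0, 1],
    # then combine with a geometric mean.
    import numpy as np

    def d_minimize(y, target, worst):
        # 1 at-or-below target, 0 at-or-above worst, linear in between.
        return np.clip((worst - y) / (worst - target), 0.0, 1.0)

    def d_maximize(y, worst, target):
        return np.clip((y - worst) / (target - worst), 0.0, 1.0)

    # Hypothetical predicted responses for one candidate condition:
    d = [
        d_minimize(6.2, target=5.0, worst=10.0),    # retention time, min
        d_maximize(3.1, worst=1.5, target=3.5),     # resolution ATN/NFD
        d_maximize(5200, worst=2000, target=6000),  # plate number
    ]
    composite = np.prod(d) ** (1 / len(d))  # geometric mean
    print(f"composite desirability D = {composite:.4f}")
    ```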

  4. In-Situ Measurement of Hall Thruster Erosion Using a Fiber Optic Regression Probe

    NASA Technical Reports Server (NTRS)

    Polzin, Kurt A.; Korman, Valentin

    2008-01-01

    One potential life-limiting mechanism in a Hall thruster is the erosion of the ceramic material comprising the discharge channel. This is especially true for missions that require long thrusting periods and can be problematic for lifetime qualification, especially when attempting to qualify a thruster by analysis rather than a test lasting the full duration of the mission. In addition to lifetime, several analytical and numerical models include electrode erosion as a mechanism contributing to enhanced transport properties. However, there is still a great deal of dispute over the importance of erosion to transport in Hall thrusters. The capability to perform an in-situ measurement of discharge channel erosion is useful in addressing both the lifetime and transport concerns. An in-situ measurement would allow for real-time data regarding the erosion rates at different operating points, providing a quick method for empirically anchoring any analysis geared towards lifetime qualification. Erosion rate data over a thruster's operating envelope would also be useful in the modeling of the detailed physics inside the discharge chamber. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor was tested using a linear Hall thruster geometry, which served as a means of producing plasma erosion of a ceramic discharge chamber. The mass flow rate, discharge voltage, and applied magnetic field strength could be varied, allowing for erosion measurements over a broad thruster operating envelope. Results are presented demonstrating the ability of the REAST sensor to capture not only the insulator erosion rates but also changes in these rates as a function of the discharge parameters.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thein, S.L.; Weatherall, D.J.; Sampietro, M.

    "Heterocellular hereditary persistence of fetal hemoglobin" (HPFH) is the term used to describe the genetically determined persistence of fetal hemoglobin (Hb F) production into adult life, in the absence of any related hematological disorder. Whereas some forms are caused by mutations in the β-globin gene cluster on chromosome 11, others segregate independently. While the latter are of particular interest with respect to the regulation of globin gene switching, it has not been possible to determine their chromosomal location, mainly because their mode of inheritance is not clear, but also because several other factors are known to modify Hb F production. The authors have examined a large Asian Indian pedigree which includes individuals with heterocellular HPFH associated with β-thalassemia and/or α-thalassemia. Segregation analysis was conducted on the HPFH trait FC, defined to be the percentage of Hb F-containing cells (F-cells), using the class D regressive model. The results provide evidence for the presence of a major gene, dominant or codominant, which controls the FC values with residual familial correlations. The major gene was detected when the effects of genetic modifiers, notably β-thalassemia and the XmnI-Gγ polymorphism, are accounted for in this analysis. Linkage with the β-globin gene cluster is excluded. The transmission of the FC values in this pedigree is informative enough to allow detection of linkage with an appropriate marker(s). The analytical approach outlined in this study, using simple regression to allow for genetic modifiers and thus allowing the mode of inheritance of a trait to be dissected out, may be useful as a model for segregation and linkage analyses of other complex phenotypes. 39 refs., 4 figs., 6 tabs.

  6. Selection of higher order regression models in the analysis of multi-factorial transcription data.

    PubMed

    Prazeres da Costa, Olivia; Hoffman, Arthur; Rey, Johannes W; Mansmann, Ulrich; Buch, Thorsten; Tresch, Achim

    2014-01-01

    Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment investigates the influence of three variable factors: the genetic background of the mice from which the macrophages were obtained, Yersinia enterocolitica infection (two strains, and a mock control), and treatment/non-treatment with interferon-γ. We set up four different linear regression models in a hierarchical order. We introduce the eruption plot as a new practical tool for model selection complementary to global testing. It visually compares the size and significance of effect estimates between two nested models. Using this methodology we were able to select the most appropriate model by keeping only relevant factors showing additional explanatory power. Application to experimental data allowed us to qualify the interaction of factors as either neutral (no interaction), alleviating (co-occurring effects are weaker than expected from the single effects), or aggravating (stronger than expected). We find a biologically meaningful gene cluster of putative C2TA target genes that appear to be co-regulated with MHC class II genes. We introduced the eruption plot as a tool for visual model comparison to identify relevant higher order interactions in the analysis of expression data obtained under the influence of multiple factors. We conclude that model selection in higher order linear regression models should generally be performed for the analysis of multi-factorial microarray data.
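
    A sketch of the underlying nested-model comparison, assuming a hypothetical long-format expression table with the three experimental factors; the paper's eruption plot itself is not reproduced.

    ```python
    # Sketch: compare nested higher-order linear models for one gene's
    # expression via an F-test between main-effects and interaction models.
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    df = pd.read_csv("macrophage_expression.csv")  # hypothetical data

    m_main = smf.ols("expr ~ C(background) + C(infection) + C(ifng)", df).fit()
    m_int = smf.ols("expr ~ C(background) * C(infection) * C(ifng)", df).fit()

    # Do the interaction terms add explanatory power?
    print(anova_lm(m_main, m_int))
    ```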

  7. Evaluation of drainage-area ratio method used to estimate streamflow for the Red River of the North Basin, North Dakota and Minnesota

    USGS Publications Warehouse

    Emerson, Douglas G.; Vecchia, Aldo V.; Dahl, Ann L.

    2005-01-01

    The drainage-area ratio method commonly is used to estimate streamflow for sites where no streamflow data were collected. To evaluate the validity of the drainage-area ratio method and to determine if an improved method could be developed to estimate streamflow, a multiple-regression technique was used to determine if drainage area, main channel slope, and precipitation were significant variables for estimating streamflow in the Red River of the North Basin. A separate regression analysis was performed for streamflow for each of three seasons: winter, spring, and summer. Drainage area and summer precipitation were the most significant variables. However, the regression equations generally overestimated streamflows for North Dakota stations and underestimated streamflows for Minnesota stations. To correct the bias in the residuals for the two groups of stations, indicator variables were included to allow both the intercept and the coefficient for the logarithm of drainage area to depend on the group. Drainage area was the only significant variable in the revised regression equations. The exponents for the drainage-area ratio were 0.85 for the winter season, 0.91 for the spring season, and 1.02 for the summer season.
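
    The revised equations reduce to the drainage-area ratio with season-specific exponents; a small helper using the exponents reported above, with an invented numerical example:

    ```python
    # Sketch of the drainage-area ratio estimate with the seasonal
    # exponents reported above (0.85 winter, 0.91 spring, 1.02 summer).
    SEASON_EXPONENT = {"winter": 0.85, "spring": 0.91, "summer": 1.02}

    def estimate_flow(q_gaged, area_gaged, area_ungaged, season):
        """Q_ungaged = Q_gaged * (A_ungaged / A_gaged) ** exponent."""
        return q_gaged * (area_ungaged / area_gaged) ** SEASON_EXPONENT[season]

    # Example: 340 cfs at a 1200 mi^2 gage, 450 mi^2 ungaged site, spring.
    print(estimate_flow(340.0, 1200.0, 450.0, "spring"))
    ```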

  8. Mapping the Structure-Function Relationship in Glaucoma and Healthy Patients Measured with Spectralis OCT and Humphrey Perimetry

    PubMed Central

    Muñoz–Negrete, Francisco J.; Oblanca, Noelia; Rebolleda, Gema

    2018-01-01

    Purpose To study the structure-function relationship in glaucoma and healthy patients assessed with Spectralis OCT and Humphrey perimetry using new statistical approaches. Materials and Methods Eighty-five eyes were prospectively selected and divided into 2 groups: glaucoma (44) and healthy patients (41). Three different statistical approaches were carried out: (1) factor analysis of the threshold sensitivities (dB) (automated perimetry) and the macular thickness (μm) (Spectralis OCT), subsequently applying Pearson's correlation to the obtained regions, (2) nonparametric regression analysis relating the values in each pair of regions that showed significant correlation, and (3) nonparametric spatial regressions using three models designed for the purpose of this study. Results In the glaucoma group, a map that relates structural and functional damage was drawn. The strongest correlation with visual fields was observed in the peripheral nasal region of both superior and inferior hemigrids (r = 0.602 and r = 0.458, resp.). The estimated functions obtained with the nonparametric regressions provided the mean sensitivity that corresponds to each given macular thickness. These functions allowed for accurate characterization of the structure-function relationship. Conclusions Both maps and point-to-point functions obtained linking structure and function damage contribute to a better understanding of this relationship and may help in the future to improve glaucoma diagnosis. PMID:29850196

  9. Refining cost-effectiveness analyses using the net benefit approach and econometric methods: an example from a trial of anti-depressant treatment.

    PubMed

    Sabes-Figuera, Ramon; McCrone, Paul; Kendricks, Antony

    2013-04-01

    Economic evaluation analyses can be enhanced by employing regression methods, which allow for the identification of important sub-groups, adjustment for imperfect randomisation in clinical trials, and analysis of non-randomised data. The aim was to explore the benefits of combining regression techniques and the standard Bayesian approach to refine cost-effectiveness analyses using data from randomised clinical trials. Data from a randomised trial of anti-depressant treatment were analysed, and a regression model was used to explore the factors that have an impact on the net benefit (NB) statistic with the aim of using these findings to adjust the cost-effectiveness acceptability curves. Exploratory sub-sample analyses were carried out to explore possible differences in cost-effectiveness. The analysis found that having suffered a previous similar depression is strongly correlated with a lower NB, independent of the outcome measure or follow-up point. In patients with a previous similar depression, adding a selective serotonin reuptake inhibitor (SSRI) to supportive care for mild-to-moderate depression is probably cost-effective at the threshold used by the English National Institute for Health and Clinical Excellence to make recommendations. This analysis highlights the need for incorporation of econometric methods into cost-effectiveness analyses using the NB approach.
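
    A sketch of net-benefit regression under an assumed willingness-to-pay threshold; column names (qaly, cost, ssri, prev_depression) are hypothetical and the trial's actual covariates may differ.

    ```python
    # Sketch: NB_i = lambda * effect_i - cost_i per patient, then regress
    # NB on treatment arm and covariates.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("trial.csv")   # hypothetical patient-level trial data
    wtp = 20000.0                   # willingness-to-pay per QALY (assumed)

    df["nb"] = wtp * df["qaly"] - df["cost"]

    # The treatment coefficient estimates incremental net benefit;
    # the interaction identifies the sub-group effect described above.
    fit = smf.ols("nb ~ ssri * prev_depression + age + baseline_severity",
                  data=df).fit()
    print(fit.summary())
    ```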

  10. Improvement of calculation method for electrical parameters of short network of ore-thermal furnaces

    NASA Astrophysics Data System (ADS)

    Aliferov, A. I.; Bikeev, R. A.; Goreva, L. P.

    2017-10-01

    The paper describes a new calculation method for active and inductive resistance of split interleaved current leads packages in ore-thermal electric furnaces. The method is developed on basis of regression analysis of dependencies of active and inductive resistances of the packages on their geometrical parameters, mutual disposition and interleaving pattern. These multi-parametric calculations have been performed with ANSYS software. The proposed method allows solving split current lead electrical parameters minimization and balancing problems for ore-thermal furnaces.

  11. Conditional Poisson models: a flexible alternative to conditional logistic case cross-over analysis.

    PubMed

    Armstrong, Ben G; Gasparrini, Antonio; Tobias, Aurelio

    2014-11-24

    The time-stratified case cross-over approach is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes. These are almost always analyzed using conditional logistic regression on data expanded to case-control (case crossover) format, but this has some limitations. In particular, adjusting for overdispersion and auto-correlation in the counts is not possible. It has been established that a Poisson model for counts with stratum indicators gives identical estimates to those from conditional logistic regression and does not have these limitations, but it is little used, probably because of the overheads of estimating many stratum parameters. The conditional Poisson model avoids estimating stratum parameters by conditioning on the total event count in each stratum, thus simplifying the computation and increasing the number of strata for which fitting is feasible compared with the standard unconditional Poisson model. Unlike the conditional logistic model, the conditional Poisson model does not require expanding the data, and can adjust for overdispersion and auto-correlation. It is available in Stata, R, and other packages. By applying the models to real data and using simulations, we demonstrate that conditional Poisson models are simpler to code and faster to run than conditional logistic analyses and can be fitted to larger data sets than is possible with standard Poisson models. Allowing for overdispersion or autocorrelation was possible with the conditional Poisson model, but when not required this model gave identical estimates to those from conditional logistic regression. Conditional Poisson regression models provide an alternative to case crossover analysis of stratified time series data with some advantages. The conditional Poisson model can also be used in other contexts in which primary control for confounding is by fine stratification.
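
    A sketch on simulated stratified count data, assuming statsmodels' ConditionalPoisson (present in recent statsmodels versions) as one of the implementations the abstract alludes to; the strata play the role of year-month-weekday cells.

    ```python
    # Sketch: conditional Poisson regression for time-stratified data.
    import numpy as np
    from statsmodels.discrete.conditional_models import ConditionalPoisson

    rng = np.random.default_rng(1)
    n, n_strata = 600, 60
    strata = np.repeat(np.arange(n_strata), n // n_strata)
    pollution = rng.normal(size=n)
    mu = np.exp(0.05 * pollution + rng.normal(size=n_strata)[strata])
    deaths = rng.poisson(mu)

    # Conditioning on each stratum's total count removes the stratum
    # intercepts, so no per-stratum parameters are estimated.
    fit = ConditionalPoisson(deaths, pollution[:, None], groups=strata).fit()
    print(fit.summary())
    ```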

  12. Optimizing Prophylactic CPAP in Patients Without Obstructive Sleep Apnoea for High-Risk Abdominal Surgeries: A Meta-regression Analysis.

    PubMed

    Singh, Preet Mohinder; Borle, Anuradha; Shah, Dipal; Sinha, Ashish; Makkar, Jeetinder Kaur; Trikha, Anjan; Goudra, Basavana Gouda

    2016-04-01

    Prophylactic continuous positive airway pressure (CPAP) can prevent pulmonary adverse events following upper abdominal surgeries. The present meta-regression evaluates and quantifies the effect of the degree/duration of CPAP on the incidence of postoperative pulmonary events. Medical databases were searched for randomized controlled trials involving adult patients, comparing the outcome in those receiving prophylactic postoperative CPAP versus no CPAP, undergoing high-risk abdominal surgeries. Our meta-analysis evaluated the relationship between the postoperative pulmonary complications and the use of CPAP. Furthermore, meta-regression was used to quantify the effect of cumulative duration and degree of CPAP on the measured outcomes. Seventy-three potentially relevant studies were identified, of which 11 had appropriate data, allowing us to compare a total of 362 and 363 patients in CPAP and control groups, respectively. Qualitatively, the odds ratios for CPAP showed a protective effect against pneumonia [0.39 (0.19-0.78)], atelectasis [0.51 (0.32-0.80)], and pulmonary complications [0.37 (0.24-0.56)] with zero heterogeneity. For prevention of pulmonary complications, the odds ratio was better for continuous than for intermittent CPAP. Meta-regression demonstrated a positive correlation between the degree of CPAP and the incidence of pneumonia with a regression coefficient of +0.61 (95% CI 0.02-1.21, P = 0.048, τ² = 0.078, r² = 7.87%). Overall, adverse effects were similar with or without the use of CPAP. Prophylactic postoperative use of continuous CPAP significantly reduces the incidence of postoperative pneumonia, atelectasis and pulmonary complications in patients undergoing high-risk abdominal surgeries. Quantitatively, increasing the CPAP levels does not necessarily enhance the protective effect against pneumonia. Instead, the protective effect diminishes with increasing degree of CPAP.
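
    For illustration, a fixed-effect meta-regression of study effect sizes on CPAP level via weighted least squares; the numbers are invented, and the paper's random-effects machinery (the τ² term) is not reproduced.

    ```python
    # Sketch: study log-odds-ratios regressed on CPAP level, each study
    # weighted by the inverse of its variance.
    import numpy as np
    import statsmodels.api as sm

    log_or = np.array([-1.1, -0.7, -0.9, -0.2, 0.1])    # per-study effects
    se = np.array([0.45, 0.38, 0.50, 0.33, 0.40])       # standard errors
    cpap_cmH2O = np.array([5.0, 7.5, 7.5, 10.0, 12.0])  # study CPAP level

    X = sm.add_constant(cpap_cmH2O)
    fit = sm.WLS(log_or, X, weights=1.0 / se**2).fit()
    print(fit.params)   # slope ~ change in log OR per cmH2O of CPAP
    ```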

  13. Why do they leave? Factors associated with job termination among personal assistant workers in home care.

    PubMed

    Butler, Sandra S; Simpson, Nan; Brennan, Mark; Turner, Winston

    2010-11-01

    Recruiting and retaining an adequate number of personal support workers in home care is both challenging and essential to allowing elders to age in place. A mixed-method, longitudinal study examined turnover in a sample of 261 personal support workers in Maine; 70 workers (26.8%) left their employment in the first year of the study. Logistic regression analysis indicated that younger age and lack of health insurance were significant predictors of turnover. Analysis of telephone interviews revealed three overarching themes related to termination: job not worthwhile, personal reasons, and burnout. Implications of study findings for gerontological social workers are outlined.

  14. Building intelligent communication systems for handicapped aphasiacs.

    PubMed

    Fu, Yu-Fen; Ho, Cheng-Seen

    2010-01-01

    This paper presents an intelligent system allowing handicapped aphasiacs to perform basic communication tasks. It has the following three key features: (1) A 6-sensor data glove measures the finger gestures of a patient in terms of the bending degrees of his fingers. (2) A finger language recognition subsystem recognizes language components from the finger gestures. It employs multiple regression analysis to automatically extract proper finger features so that the recognition model can be fast and correctly constructed by a radial basis function neural network. (3) A coordinate-indexed virtual keyboard allows the users to directly access the letters on the keyboard at a practical speed. The system serves as a viable tool for natural and affordable communication for handicapped aphasiacs through continuous finger language input.
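
    A sketch of the recognition stage only: an RBF network with k-means centres and a least-squares linear read-out, on synthetic six-sensor data; the paper's regression-based feature-extraction step is omitted, and all names are hypothetical.

    ```python
    # Sketch: RBF network for gesture classification from glove features.
    import numpy as np
    from sklearn.cluster import KMeans

    def rbf_design(X, centres, gamma):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    rng = np.random.default_rng(0)
    X = rng.random((300, 6))         # bending degrees from 6 sensors
    y = rng.integers(0, 5, 300)      # 5 gesture classes (placeholder labels)
    Y = np.eye(5)[y]                 # one-hot targets

    centres = KMeans(n_clusters=20, n_init=10,
                     random_state=0).fit(X).cluster_centers_
    H = rbf_design(X, centres, gamma=4.0)
    W, *_ = np.linalg.lstsq(H, Y, rcond=None)   # linear output layer

    pred = rbf_design(X, centres, 4.0) @ W
    print("training accuracy:", (pred.argmax(1) == y).mean())
    ```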

  15. Weighted functional linear regression models for gene-based association analysis.

    PubMed

    Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

    2018-01-01

    Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values with weighted models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10-6), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
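
    Frequency-dependent variant weights of this kind are often taken from a beta density of the minor allele frequency; the sketch uses the common Beta(1, 25) choice (a SKAT-style default), which may differ from the paper's parameterisation.

    ```python
    # Sketch: beta-density weights give rare variants more influence.
    import numpy as np
    from scipy.stats import beta

    maf = np.array([0.002, 0.01, 0.05, 0.20, 0.45])
    weights = beta.pdf(maf, 1, 25)   # larger weights at low frequencies
    print(np.round(weights, 2))
    ```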

  16. Principal component regression analysis with SPSS.

    PubMed

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces the indices used for multicollinearity diagnosis, the basic principle of principal component regression, and the method for determining the 'best' equation. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0, covering all calculation steps of principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance of multicollinearity, and with SPSS it is simple, fast, and accurate.
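
    The same analysis outside SPSS, as a sketch: standardise, extract principal components, then regress the response on the component scores.

    ```python
    # Sketch: principal component regression on deliberately collinear data.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    X[:, 4] = X[:, 0] + 0.01 * rng.normal(size=100)  # near-duplicate column
    y = X @ np.array([1.0, 0.5, 0.0, -0.7, 1.0]) + rng.normal(size=100)

    pcr = make_pipeline(StandardScaler(), PCA(n_components=3),
                        LinearRegression())
    pcr.fit(X, y)
    print("R^2 on training data:", pcr.score(X, y))
    ```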

  17. Preoperative Thromboelastometry as a Predictor of Transfusion Requirements during Adult Living Donor Liver Transplantation.

    PubMed

    Fayed, Nirmeen; Mourad, Wessam; Yassen, Khaled; Görlinger, Klaus

    2015-03-01

    The ability to predict transfusion requirements may improve perioperative bleeding management as an integral part of a patient blood management program. Therefore, the aim of our study was to evaluate preoperative thromboelastometry as a predictor of transfusion requirements for adult living donor liver transplant recipients. The correlation between preoperative thromboelastometry variables in 100 adult living donor liver transplant recipients and intraoperative blood transfusion requirements was examined by univariate and multivariate linear regression analysis. Thresholds of thromboelastometric parameters for prediction of packed red blood cells (PRBCs), fresh frozen plasma (FFP), platelets, and cryoprecipitate transfusion requirements were determined with receiver operating characteristics analysis. The attending anesthetists were blinded to the preoperative thromboelastometric analysis. However, a thromboelastometry-guided transfusion algorithm with predefined trigger values was used intraoperatively. The transfusion triggers in this algorithm did not change during the study period. Univariate analysis confirmed significant correlations between PRBCs, FFP, platelets or cryoprecipitate transfusion requirements and most thromboelastometric variables. Backward stepwise logistic regression indicated that EXTEM coagulation time (CT), maximum clot firmness (MCF) and INTEM CT, clot formation time (CFT) and MCF are independent predictors for PRBC transfusion. EXTEM CT, CFT and FIBTEM MCF are independent predictors for FFP transfusion. Only EXTEM and INTEM MCF were independent predictors of platelet transfusion. EXTEM CFT and MCF, INTEM CT, CFT and MCF as well as FIBTEM MCF are independent predictors for cryoprecipitate transfusion. Thromboelastometry-based regression equation accounted for 63% of PRBC, 83% of FFP, 61% of cryoprecipitate, and 44% of platelet transfusion requirements. Preoperative thromboelastometric analysis is helpful to predict transfusion requirements in adult living donor liver transplant recipients. This may allow for better preparation and less cross-matching prior to surgery. The findings of our study need to be re-validated in a second prospective patient population.
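
    A sketch of the receiver-operating-characteristic threshold step on simulated data, using the Youden index as one common cut-off rule; the paper's actual thresholds were derived from patient data, and the variable here is a hypothetical stand-in for EXTEM CT.

    ```python
    # Sketch: ROC analysis of a preoperative variable against whether
    # PRBCs were transfused, choosing the Youden-optimal cut-off.
    import numpy as np
    from sklearn.metrics import roc_curve, roc_auc_score

    rng = np.random.default_rng(0)
    transfused = rng.integers(0, 2, 100)                       # 0/1 labels
    extem_ct = 60 + 25 * transfused + rng.normal(0, 15, 100)   # seconds

    fpr, tpr, thresholds = roc_curve(transfused, extem_ct)
    cut = thresholds[np.argmax(tpr - fpr)]   # Youden index J = tpr - fpr
    print(f"AUC={roc_auc_score(transfused, extem_ct):.2f}, "
          f"cut-off≈{cut:.0f} s")
    ```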

  18. Fragile--Handle with Care: Regression Analyses That Include Categorical Data.

    ERIC Educational Resources Information Center

    Brown, Diane Peacock

    In education and the social sciences, problems of interest to researchers and users of research often involve variables that do not meet the assumptions of regression, such as measurement on an equal-interval scale relative to a true zero point. Various coding schemes exist that allow the use of regression while still answering the researcher's questions of…

  19. Analyzing industrial energy use through ordinary least squares regression models

    NASA Astrophysics Data System (ADS)

    Golden, Allyson Katherine

    Extensive research has been performed using regression analysis and calibrated simulations to create baseline energy consumption models for residential buildings and commercial institutions. However, few attempts have been made to discuss the applicability of these methodologies to establish baseline energy consumption models for industrial manufacturing facilities. In the few studies of industrial facilities, the presented linear change-point and degree-day regression analyses illustrate ideal cases. It follows that there is a need in the established literature to discuss the methodologies and to determine their applicability for establishing baseline energy consumption models of industrial manufacturing facilities. The thesis determines the effectiveness of simple inverse linear statistical regression models when establishing baseline energy consumption models for industrial manufacturing facilities. Ordinary least squares change-point and degree-day regression methods are used to create baseline energy consumption models for nine different case studies of industrial manufacturing facilities located in the southeastern United States. The influence of ambient dry-bulb temperature and production on total facility energy consumption is observed. The energy consumption behavior of industrial manufacturing facilities is only sometimes sufficiently explained by temperature, production, or a combination of the two variables. This thesis also provides methods for generating baseline energy models that are straightforward and accessible to anyone in the industrial manufacturing community. The methods outlined in this thesis may be easily replicated by anyone that possesses basic spreadsheet software and general knowledge of the relationship between energy consumption and weather, production, or other influential variables. With the help of simple inverse linear regression models, industrial manufacturing facilities may better understand their energy consumption and production behavior, and identify opportunities for energy and cost savings. This thesis study also utilizes change-point and degree-day baseline energy models to disaggregate facility annual energy consumption into separate industrial end-user categories. The baseline energy model provides a suitable and economical alternative to sub-metering individual manufacturing equipment. One case study describes the conjoined use of baseline energy models and facility information gathered during a one-day onsite visit to perform an end-point energy analysis of an injection molding facility conducted by the Alabama Industrial Assessment Center. Applying baseline regression model results to the end-point energy analysis allowed the AIAC to better approximate the annual energy consumption of the facility's HVAC system.
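
    A minimal change-point fit of the kind described, on invented monthly temperature/energy data; real baselining would add production variables and the model-selection checks the thesis discusses.

    ```python
    # Sketch: three-parameter cooling change-point model — flat base load
    # below the balance temperature, linear increase above it.
    import numpy as np
    from scipy.optimize import curve_fit

    def change_point(t, base, slope, t_balance):
        return base + slope * np.maximum(t - t_balance, 0.0)

    # Hypothetical monthly mean dry-bulb temperature (F) vs energy (MWh).
    t = np.array([38, 45, 52, 60, 68, 75, 82, 88, 80, 66, 50, 41], float)
    e = np.array([210, 212, 208, 215, 240, 275, 310, 335, 300, 235,
                  209, 211], float)

    p, _ = curve_fit(change_point, t, e, p0=[210.0, 4.0, 60.0])
    print(f"base={p[0]:.0f} MWh, slope={p[1]:.1f} MWh/degF, "
          f"balance point={p[2]:.0f} degF")
    ```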

  20. Regression analysis of longitudinal data with correlated censoring and observation times.

    PubMed

    Li, Yang; He, Xin; Wang, Haiying; Sun, Jianguo

    2016-07-01

    Longitudinal data occur in many fields such as the medical follow-up studies that involve repeated measurements. For their analysis, most existing approaches assume that the observation or follow-up times are independent of the response process either completely or given some covariates. In practice, it is apparent that this may not be true. In this paper, we present a joint analysis approach that allows the possible mutual correlations that can be characterized by time-dependent random effects. Estimating equations are developed for the parameter estimation and the resulted estimators are shown to be consistent and asymptotically normal. The finite sample performance of the proposed estimators is assessed through a simulation study and an illustrative example from a skin cancer study is provided.

  1. Tobacco control policies to promote awareness and smoke-free environments in residence and workplace to reduce passive tobacco smoking in Bangladesh and its correlates.

    PubMed

    Sultana, Papia; Rahman, Md Tahidur; Roy, Dulal Chandra; Akter, Shamima; Jung, Jenny; Rahman, Md Mizanur; Akter, Jahanara

    2018-01-01

    Bangladesh is one of the highest tobacco-consuming countries in the world, with a reported 21.2% of the population being daily smokers, 24.3% smokeless tobacco users, and 36.3% of adults passive smokers. Given the high prevalence and established harmful effects of passive tobacco smoking, this study aimed to estimate the pattern of smoking policies in residences and workplaces and to identify the associated socio-economic and demographic correlates in Bangladesh. Secondary data on 9629 respondents collected by the Global Adult Tobacco Survey (GATS) 2010 were used. Along with descriptive analysis, a binary logistic regression model was used to analyze the socio-demographic and economic correlates of tobacco smoking policy. The prevalence of passive tobacco smoking was 74.3% among males and 25.8% among females. Among the passive tobacco smokers, 22.2% reported that smoking was allowed at their home and 29.8% reported that there was no smoking policy at their home. Similarly, 26.0% of passive tobacco smokers reported that smoking was allowed at their workplace and 27.5% reported that there was no smoking policy there. Logistic regression analysis indicated that the odds of smoking being allowed at home were 4.85 times higher for tobacco smokers than for non-smokers (OR = 4.85, 95% CI = 4.13, 5.71), 1.18 times higher in rural than in urban areas (OR = 1.18, 95% CI = 1.06, 1.32), and lower for respondents with college/university or higher education than for those with no formal schooling (OR = 0.35, 95% CI = 0.24, 0.52). Smoking was 1.70 times more likely to be allowed at the workplace for tobacco smokers than for their non-smoking counterparts (OR = 1.70, 95% CI = 1.36, 2.14) and less likely to be allowed for respondents with college/university or higher education (OR = 0.26, 95% CI = 0.14, 0.45) than for those with no formal schooling. To reduce passive smoking, advocacy about the adverse effects of active and passive tobacco smoking should focus on less educated people and rural populations. Smoking policies should also be reformed by introducing designated smoking zones at workplaces and in residential buildings.

  2. Use of partial least squares regression for the multivariate calibration of hazardous air pollutants in open-path FT-IR spectrometry

    NASA Astrophysics Data System (ADS)

    Hart, Brian K.; Griffiths, Peter R.

    1998-06-01

    Partial least squares (PLS) regression has been evaluated as a robust calibration technique for over 100 hazardous air pollutants (HAPs) measured by open-path Fourier transform infrared (OP/FT-IR) spectrometry. PLS has the advantage over the currently recommended calibration method of classical least squares (CLS) in that it can look at the whole usable spectrum (700-1300 cm⁻¹, 2000-2150 cm⁻¹, and 2400-3000 cm⁻¹) and detect several analytes simultaneously. Up to one hundred HAPs synthetically added to OP/FT-IR backgrounds have been simultaneously calibrated and detected using PLS. PLS also has the advantage of requiring less preprocessing of spectra than is required in CLS calibration schemes, allowing PLS to provide user-independent, real-time analysis of OP/FT-IR spectra.
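
    A sketch of a PLS calibration on synthetic mixture spectra; sklearn's PLSRegression stands in for whatever chemometrics package the study used, and the dimensions are arbitrary.

    ```python
    # Sketch: PLS calibration of many collinear spectral channels onto
    # multiple analyte concentrations simultaneously.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(0)
    n_spectra, n_channels, n_analytes = 80, 600, 4
    C = rng.uniform(0, 1, (n_spectra, n_analytes))    # concentrations
    K = rng.random((n_analytes, n_channels))          # pure-component spectra
    A = C @ K + 0.01 * rng.normal(size=(n_spectra, n_channels))  # mixtures

    pls = PLSRegression(n_components=8)
    pls.fit(A, C)                 # calibrate on the whole spectral window
    print("calibration R^2:", pls.score(A, C))
    ```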

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, J.; Moon, T.J.; Howell, J.R.

    This paper presents an analysis of the heat transfer occurring during an in-situ curing process in which infrared energy is applied to the surface of a polymer composite during winding. The material system is Hercules prepreg AS4/3501-6. Thermoset composites undergo an exothermic chemical reaction during the curing process. An Eulerian thermochemical model is developed for the heat transfer analysis of helical winding. The model incorporates heat generation due to the chemical reaction. Several assumptions are made leading to a two-dimensional, thermochemical model. For simplicity, 360° heating around the mandrel is considered. In order to generate the appropriate process windows, the developed heat transfer model is combined with a simple winding time model. The process windows allow for a proper selection of process variables such as infrared energy input and winding velocity to give a desired end-product state. Steady-state temperatures are found for each combination of the process variables. A regression analysis is carried out to relate the process variables to the resulting steady-state temperatures. Using regression equations, process windows for a wide range of cylinder diameters are found. A general procedure to find process windows for Hercules AS4/3501-6 prepreg tape is coded in a FORTRAN program.

  4. Reevaluation of Stratospheric Ozone Trends From SAGE II Data Using a Simultaneous Temporal and Spatial Analysis

    NASA Technical Reports Server (NTRS)

    Damadeo, R. P.; Zawodny, J. M.; Thomason, L. W.

    2014-01-01

    This paper details a new method of regression for sparsely sampled data sets for use with time-series analysis, in particular the Stratospheric Aerosol and Gas Experiment (SAGE) II ozone data set. Non-uniform spatial, temporal, and diurnal sampling present in the data set results in biased values for the long-term trend if not accounted for. This new method is performed close to the native resolution of the measurements and is a simultaneous temporal and spatial analysis that accounts for potential diurnal ozone variation. Results show that biases introduced by the way data are prepared for use with traditional methods can be as high as 10%. Derived long-term changes show declines in ozone similar to other studies but very different trends in the presumed recovery period, with differences of up to 2% per decade. The regression model allows for a variable turnaround time and reveals a hemispheric asymmetry in derived trends in the middle to upper stratosphere. Similar methodology is also applied to SAGE II aerosol optical depth data to create a new volcanic proxy that covers the SAGE II mission period. Ultimately, this technique may be extensible towards the inclusion of multiple data sets without the need for homogenization.

  5. Using Social Network Analysis to Better Understand Compulsive Exercise Behavior Among a Sample of Sorority Members.

    PubMed

    Patterson, Megan S; Goodson, Patricia

    2017-05-01

    Compulsive exercise, a form of unhealthy exercise often associated with prioritizing exercise and feeling guilty when exercise is missed, is a common precursor to and symptom of eating disorders. College-aged women are at high risk of exercising compulsively compared with other groups. Social network analysis (SNA) is a theoretical perspective and methodology allowing researchers to observe the effects of relational dynamics on the behaviors of people. SNA was used to assess the relationship between compulsive exercise and body dissatisfaction, physical activity, and network variables. Descriptive statistics were conducted using SPSS, and quadratic assignment procedure (QAP) analyses were conducted using UCINET. QAP regression analysis revealed a statistically significant model (R² = .375, P < .0001) predicting compulsive exercise behavior. Physical activity, body dissatisfaction, and network variables were statistically significant predictor variables in the QAP regression model. In our sample, women who are connected to "important" or "powerful" people in their network are likely to have higher compulsive exercise scores. This result provides healthcare practitioners key target points for intervention within similar groups of women. For scholars researching eating disorders and associated behaviors, this study supports looking into group dynamics and network structure in conjunction with body dissatisfaction and exercise frequency.
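
    A sketch of the QAP idea: the observed regression coefficient is compared with coefficients recomputed over many node-label permutations; the network, measures, and effect sizes are invented stand-ins for the study's variables.

    ```python
    # Sketch: QAP-style permutation test for a network regression term.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 60
    A = (rng.random((n, n)) < 0.1).astype(float)
    A = np.triu(A, 1); A = A + A.T                  # undirected ties
    centrality = A.sum(1)                           # degree centrality
    body_dissat = rng.normal(size=n)
    y = 0.4 * centrality + 0.8 * body_dissat + rng.normal(size=n)

    def ols_beta(X, y):
        X1 = np.column_stack([np.ones(len(y)), X])
        return np.linalg.lstsq(X1, y, rcond=None)[0][1]  # network coef

    obs = ols_beta(np.column_stack([centrality, body_dissat]), y)
    null = []
    for _ in range(2000):
        p = rng.permutation(n)                      # permute node labels
        perm_cent = A[p][:, p].sum(1)
        null.append(ols_beta(np.column_stack([perm_cent, body_dissat]), y))
    p_val = (np.abs(np.array(null)) >= abs(obs)).mean()
    print(f"beta={obs:.2f}, QAP p={p_val:.3f}")
    ```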

  6. Regression Analysis by Example. 5th Edition

    ERIC Educational Resources Information Center

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  7. Proteomic peptide profiling for preemptive diagnosis of acute graft-versus-host disease after allogeneic stem cell transplantation.

    PubMed

    Weissinger, E M; Metzger, J; Dobbelstein, C; Wolff, D; Schleuning, M; Kuzmina, Z; Greinix, H; Dickinson, A M; Mullen, W; Kreipe, H; Hamwi, I; Morgan, M; Krons, A; Tchebotarenko, I; Ihlenburg-Schwarz, D; Dammann, E; Collin, M; Ehrlich, S; Diedrich, H; Stadler, M; Eder, M; Holler, E; Mischak, H; Krauter, J; Ganser, A

    2014-04-01

    Allogeneic hematopoietic stem cell transplantation is one curative treatment for hematological malignancies, but is compromised by life-threatening complications, such as severe acute graft-versus-host disease (aGvHD). Prediction of severe aGvHD as early as possible is crucial to allow timely initiation of treatment. Here we report on a multicentre validation of an aGvHD-specific urinary proteomic classifier (aGvHD_MS17) in 423 patients. Samples (n=1106) were collected prospectively between day +7 and day +130 and analyzed using capillary electrophoresis coupled on-line to mass spectrometry. Integration of aGvHD_MS17 analysis with demographic and clinical variables using a logistic regression model led to correct classification of patients developing severe aGvHD 14 days before any clinical signs with 82.4% sensitivity and 77.3% specificity. Multivariate regression analysis showed that aGvHD_MS17 positivity was the only strong predictor for aGvHD grade III or IV (P<0.0001). The classifier consists of 17 peptides derived from albumin, β2-microglobulin, CD99, fibronectin and various collagen α-chains, indicating inflammation, activation of T cells and changes in the extracellular matrix as early signs of GvHD-induced organ damage. This study is currently the largest demonstration of accurate and investigator-independent prediction of patients at risk for severe aGvHD, thus allowing preemptive therapy based on proteomic profiling.

  8. Role of Surgical Services in Profitability of Hospitals in California: An Analysis of Office of Statewide Health Planning and Development Annual Financial Data.

    PubMed

    Moazzez, Ashkan; de Virgilio, Christian

    2016-10-01

    With constant changes in health-care laws and payment methods, profitability and financial sustainability of hospitals are of utmost importance. The purpose of this study is to determine the relationship between surgical services and hospital profitability. The Office of Statewide Health Planning and Development annual financial databases for the years 2009 to 2011 were used for this study. The hospitals' characteristics and income statement elements were extracted for statistical analysis using bivariate and multivariate linear regression. A total of 989 financial records of 339 hospitals were included. On bivariate analysis, the number of inpatient and ambulatory operating rooms (ORs), the number of cases done both as inpatient and outpatient in each OR, and the average minutes used in inpatient ORs were significantly related to the net income of the hospital. On multivariate regression analysis, when controlling for hospitals' payer mix and the study year, only the number of inpatient cases done in the inpatient ORs (β = 832, P = 0.037) and the number of ambulatory ORs (β = 1,485,466, P = 0.001) were significantly related to the net income of the hospital. These findings suggest that hospitals can maximize their profitability by diverting and allocating outpatient surgeries to ambulatory ORs, to allow for more inpatient surgeries.

  9. Demonstration of a Fiber Optic Regression Probe

    NASA Technical Reports Server (NTRS)

    Korman, Valentin; Polzin, Kurt A.

    2010-01-01

    The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high-temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by the fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for empirically anchoring any analysis geared towards lifetime qualification. Erosion rate data over an operating envelope could also be useful in modeling the detailed physical processes. The sensor has been embedded in many regressing media for the purposes of proof-of-concept testing. A gross demonstration of its capabilities was performed using a sanding wheel to remove layers of metal. A longer-term demonstration measurement involved the placement of the sensor in a brake pad, monitoring the removal of pad material associated with the normal wear-and-tear of driving. It was used to measure the regression rates of the combustible media in small model rocket motors and road flares. Finally, a test was performed using a sand blaster to remove small amounts of material at a time. This test was aimed at demonstrating the unit's present resolution, and is compared with laser profilometry data obtained simultaneously. At the lowest resolution levels, this unit should be useful in locally quantifying the erosion rates of the channel walls in plasma thrusters.

  10. Dough performance, quality and shelf life of flat bread supplemented with fractions of germinated date seed.

    PubMed

    Hejri-Zarifi, Sudiyeh; Ahmadian-Kouchaksaraei, Zahra; Pourfarzad, Amir; Khodaparast, Mohammad Hossein Haddad

    2014-12-01

    Germinated palm date seeds were milled into two fractions: germ and residue. Dough rheological characteristics, baking properties (specific volume and sensory evaluation), and textural properties (on the first day and during storage for 5 days) were determined in Barbari flat bread. Germ and residue fractions were incorporated at various levels ranging from 0.5 to 3 g/100 g of wheat flour. Water absorption, arrival time and gelatinization temperature were decreased by the germ fraction, accompanied by an increase in the mixing tolerance index and degree of softening at most levels. Although improvement in dough stability was observed, specific volume of bread was not affected by either fraction. Texture analysis of bread samples during 5 days of storage indicated that both fractions of germinated date seeds were able to diminish bread staling. The non-linear Avrami regression equation was chosen as a useful mathematical model to study bread hardening kinetics. In addition, principal component analysis (PCA) allowed discrimination among dough and bread specialties. Partial least squares regression (PLSR) models were applied to determine the relationships between sensory and instrumental data.
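    The Avrami fit mentioned above can be reproduced with standard nonlinear least squares. Below is a minimal sketch, assuming the common Avrami form used for staling kinetics, (F_t - F_0)/(F_inf - F_0) = 1 - exp(-k t^n); the hardness values are hypothetical, not data from the study.

```python
# Fit an Avrami-type hardening curve to hypothetical bread-firmness data.
import numpy as np
from scipy.optimize import curve_fit

days = np.array([0, 1, 2, 3, 4, 5], dtype=float)
hardness = np.array([2.1, 3.0, 3.6, 4.0, 4.2, 4.3])  # hypothetical firmness (N)

def avrami(t, f0, finf, k, n):
    """Avrami-type hardening: fraction transformed = 1 - exp(-k t^n)."""
    return f0 + (finf - f0) * (1.0 - np.exp(-k * t**n))

popt, _ = curve_fit(avrami, days, hardness, p0=[2.0, 4.5, 0.5, 1.0], maxfev=10000)
f0, finf, k, n = popt
print(f"F0={f0:.2f}, Finf={finf:.2f}, k={k:.3f}, n={n:.2f}")
```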

  11. Detection of Butter Adulteration with Lard by Employing (1)H-NMR Spectroscopy and Multivariate Data Analysis.

    PubMed

    Fadzillah, Nurrulhidayah Ahmad; Man, Yaakob bin Che; Rohman, Abdul; Rosman, Arieff Salleh; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi

    2015-01-01

    The authentication of food products with respect to components not allowed for certain religions, such as lard, is very important. In this study, we used proton Nuclear Magnetic Resonance ((1)H-NMR) spectroscopy for the analysis of butter adulterated with lard, simultaneously quantifying all proton-bearing compounds and, consequently, all relevant sample classes. Since the spectra obtained were too complex to be analyzed visually by the naked eye, a classification of the spectra was carried out. Multivariate calibration by partial least squares (PLS) regression was used for modelling the relationship between the actual and predicted lard values. The model yielded the highest regression coefficient (R(2)) of 0.998, the lowest root mean square error of calibration (RMSEC) of 0.0091%, and a root mean square error of prediction (RMSEP) of 0.0090. Cross-validation testing evaluated the predictive power of the model. The PLS model was shown to be a good model, as the intercepts of R(2)Y and Q(2)Y were 0.0853 and -0.309, respectively.
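    For readers unfamiliar with PLS calibration of spectral data, the following is a minimal sketch using scikit-learn; the synthetic spectra and lard fractions are stand-ins, not the study's (1)H-NMR data.

```python
# PLS calibration: predict adulterant fraction from spectral intensities.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
spectra = rng.normal(size=(40, 200))            # stand-in for NMR intensities
lard_pct = rng.uniform(0, 5, size=40)           # stand-in adulteration levels
spectra += np.outer(lard_pct, rng.normal(size=200)) * 0.1  # inject a signal

pls = PLSRegression(n_components=5)
pred = cross_val_predict(pls, spectra, lard_pct, cv=5).ravel()
rmsep = np.sqrt(np.mean((pred - lard_pct) ** 2))
r2 = np.corrcoef(pred, lard_pct)[0, 1] ** 2
print(f"R2={r2:.3f}, RMSEP={rmsep:.4f}")
```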

  12. Biodiversity patterns along ecological gradients: unifying β-diversity indices.

    PubMed

    Szava-Kovats, Robert C; Pärtel, Meelis

    2014-01-01

    Ecologists have developed an abundance of conceptions and mathematical expressions to define β-diversity, the link between local (α) and regional-scale (γ) richness, in order to characterize patterns of biodiversity along ecological (i.e., spatial and environmental) gradients. These patterns are often realized by regression of β-diversity indices against one or more ecological gradients. This practice, however, is subject to two shortcomings that can undermine the validity of the biodiversity patterns. First, many β-diversity indices are constrained to range between fixed lower and upper limits. As such, regression analysis of β-diversity indices against ecological gradients can result in regression curves that extend beyond these mathematical constraints, thus creating an interpretational dilemma. Second, despite being a function of the same measured α- and γ-diversity, the resultant biodiversity pattern depends on the choice of β-diversity index. We propose a simple logistic transformation that rids beta-diversity indices of their mathematical constraints, thus eliminating the possibility of an uninterpretable regression curve. Moreover, this transformation results in identical biodiversity patterns for three commonly used classical beta-diversity indices. As a result, this transformation eliminates the difficulties of both shortcomings, while allowing the researcher to use whichever beta-diversity index deemed most appropriate. We believe this method can help unify the study of biodiversity patterns along ecological gradients.
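    The proposed transformation is essentially a logit applied to a proportion-type β-diversity index. A minimal sketch follows, assuming a proportional index β = 1 - ᾱ/γ bounded in [0, 1]; the choice of index and the richness values are illustrative only.

```python
import numpy as np

def logit(p, eps=1e-9):
    """Map a proportion in (0, 1) to the unbounded real line."""
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

alpha_mean = np.array([12.0, 20.0, 8.0])   # hypothetical mean local richness
gamma = np.array([30.0, 25.0, 40.0])       # hypothetical regional richness
beta_prop = 1.0 - alpha_mean / gamma       # proportional beta, bounded in [0, 1]
print(logit(beta_prop))                    # unbounded values, safe for regression
```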

  13. Biodiversity Patterns along Ecological Gradients: Unifying β-Diversity Indices

    PubMed Central

    Szava-Kovats, Robert C.; Pärtel, Meelis

    2014-01-01

    Ecologists have developed an abundance of conceptions and mathematical expressions to define β-diversity, the link between local (α) and regional-scale (γ) richness, in order to characterize patterns of biodiversity along ecological (i.e., spatial and environmental) gradients. These patterns are often realized by regression of β-diversity indices against one or more ecological gradients. This practice, however, is subject to two shortcomings that can undermine the validity of the biodiversity patterns. First, many β-diversity indices are constrained to range between fixed lower and upper limits. As such, regression analysis of β-diversity indices against ecological gradients can result in regression curves that extend beyond these mathematical constraints, thus creating an interpretational dilemma. Second, despite being a function of the same measured α- and γ-diversity, the resultant biodiversity pattern depends on the choice of β-diversity index. We propose a simple logistic transformation that rids beta-diversity indices of their mathematical constraints, thus eliminating the possibility of an uninterpretable regression curve. Moreover, this transformation results in identical biodiversity patterns for three commonly used classical beta-diversity indices. As a result, this transformation eliminates the difficulties of both shortcomings, while allowing the researcher to use whichever beta-diversity index deemed most appropriate. We believe this method can help unify the study of biodiversity patterns along ecological gradients. PMID:25330181

  14. Analyzing degradation data with a random effects spline regression model

    DOE PAGES

    Fugate, Michael Lynn; Hamada, Michael Scott; Weaver, Brian Phillip

    2017-03-17

    This study proposes using a random effects spline regression model to analyze degradation data. Spline regression avoids having to specify a parametric function for the true degradation of an item. A distribution for the spline regression coefficients captures the variation of the true degradation curves from item to item. We illustrate the proposed methodology with a real example using a Bayesian approach. The Bayesian approach allows prediction of the degradation of a population over time, and estimation of reliability is straightforward.
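    The paper's full Bayesian model is not reproduced here, but a two-stage approximation conveys the idea: fit a B-spline regression per item, then summarize the spread of coefficients across items as the random-effects distribution. All data below are simulated for illustration.

```python
# Two-stage stand-in for a random effects spline regression.
import numpy as np
from scipy.interpolate import BSpline

k = 3                                   # cubic splines
knots = np.concatenate(([0] * (k + 1), [0.5], [1] * (k + 1)))
n_basis = len(knots) - k - 1

def design(x):
    # Evaluate each B-spline basis function at the points x.
    return np.column_stack(
        [BSpline(knots, np.eye(n_basis)[j], k)(x) for j in range(n_basis)]
    )

rng = np.random.default_rng(1)
coefs = []
for _ in range(20):                     # 20 hypothetical degrading items
    t = np.sort(rng.uniform(0, 1, 15))
    y = 1.0 - 0.8 * t**2 + rng.normal(0, 0.03, 15)   # noisy degradation path
    B = design(t)
    coefs.append(np.linalg.lstsq(B, y, rcond=None)[0])

coefs = np.array(coefs)
mu, cov = coefs.mean(axis=0), np.cov(coefs.T)   # population-level summary
print(mu.round(3))
```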

  15. Analyzing degradation data with a random effects spline regression model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fugate, Michael Lynn; Hamada, Michael Scott; Weaver, Brian Phillip

    This study proposes using a random effects spline regression model to analyze degradation data. Spline regression avoids having to specify a parametric function for the true degradation of an item. A distribution for the spline regression coefficients captures the variation of the true degradation curves from item to item. We illustrate the proposed methodology with a real example using a Bayesian approach. The Bayesian approach allows prediction of the degradation of a population over time, and estimation of reliability is straightforward.

  16. Correlation of embryonic skeletal muscle myotube physical characteristics with contractile force generation on an atomic force microscope-based bio-microelectromechanical systems device

    NASA Astrophysics Data System (ADS)

    Pirozzi, K. L.; Long, C. J.; McAleer, C. W.; Smith, A. S. T.; Hickman, J. J.

    2013-08-01

    Rigorous analysis of muscle function in in vitro systems is needed for both acute and chronic biomedical applications. Forces generated by skeletal myotubes on bio-microelectromechanical cantilevers were calculated using a modified version of Stoney's thin-film equation and finite element analysis (FEA), and then regressed against physical parameters. The Stoney's equation results closely matched the more computationally intensive FEA, and the force correlated with cross-sectional area (CSA). Normalizing force to measured CSA significantly improved the statistical sensitivity and now allows for close comparison of in vitro data to in vivo measurements for applications in exercise physiology, robotics, and modeling neuromuscular diseases.

  17. Assessment of heart rate variability based on mobile device for planning physical activity

    NASA Astrophysics Data System (ADS)

    Svirin, I. S.; Epishina, E. V.; Voronin, V. V.; Semenishchev, E. A.; Solodova, E. N.; Nabilskaya, N. V.

    2015-05-01

    In this paper we present a method for the functional analysis of the human heart based on electrocardiography (ECG) signals. The approach uses the apparatus of analytical and differential geometry together with correlation and regression analysis. The ECG contains information on the current condition of the cardiovascular system as well as on pathological changes in the heart. Mathematical processing of heart rate variability yields a large set of mathematical and statistical characteristics. These characteristics of the heart rate are used in research settings to study the physiological changes that determine functional changes in an individual. The proposed method is implemented for current Android- and iOS-based mobile devices.

  18. Comparison of different functional EIT approaches to quantify tidal ventilation distribution.

    PubMed

    Zhao, Zhanqi; Yun, Po-Jen; Kuo, Yen-Liang; Fu, Feng; Dai, Meng; Frerichs, Inez; Möller, Knut

    2018-01-30

    The aim of the study was to examine the pros and cons of different types of functional EIT (fEIT) for quantifying tidal ventilation distribution in a clinical setting. fEIT images were calculated with (1) the standard deviation of the pixel time curve, (2) regression coefficients of global and local impedance time curves, or (3) mean tidal variations. To characterize temporal heterogeneity of tidal ventilation distribution, another fEIT image of pixel inspiration times is also proposed. fEIT-regression is very robust to signals with different phase information. When the respiratory signal should be distinguished from the heart-beat related signal, or during high-frequency oscillatory ventilation, fEIT-regression is superior to other types. fEIT-tidal variation is the most stable image type with respect to baseline shift. We recommend using this type of fEIT image for preliminary evaluation of the acquired EIT data. However, all these fEITs would be misleading in their assessment of ventilation distribution in the presence of temporal heterogeneity. The analysis software provided by the currently available commercial EIT equipment only offers either fEIT of standard deviation or tidal variation. Considering the pros and cons of each fEIT type, we recommend embedding more types into the analysis software to allow physicians to deal with more complex clinical applications using on-line EIT measurements.

  19. Comparison of energy expenditure to walk or run a mile in adult normal weight and overweight men and women.

    PubMed

    Loftin, Mark; Waddell, Dwight E; Robinson, James H; Owens, Scott G

    2010-10-01

    We compared the energy expenditure to walk or run a mile in adult normal weight walkers (NWW), overweight walkers (OW), and marathon runners (MR). The sample consisted of 19 NWW, 11 OW, and 20 MR adults. Energy expenditure was measured at the preferred walking speed (NWW and OW) and the running speed of a recently completed marathon (MR). Body composition was assessed via dual-energy x-ray absorptiometry. Analysis of variance was used to compare groups, with Scheffe's procedure used for post hoc analysis. Multiple regression analysis was used to predict energy expenditure. Results indicated that OW exhibited significantly higher (p < 0.05) mass and fat weight than NWW or MR. Similar values were found between NWW and MR. Absolute energy expenditure to walk or run a mile was similar between groups (NWW 93.9 ± 15.0, OW 98.4 ± 29.9, MR 99.3 ± 10.8 kcal); however, significant differences were noted when energy expenditure was expressed relative to mass (MR > NWW > OW). When energy expenditure was expressed per kilogram of fat-free mass, similar values were found across groups. Multiple regression analysis yielded mass and gender as significant predictors of energy expenditure (R = 0.795, SEE = 10.9 kcal). We suggest that walking is an excellent physical activity for energy expenditure in overweight individuals who are capable of walking without predisposing conditions such as osteoarthritis or cardiovascular risk factors. Moreover, from a practical perspective, our regression equation (kcal = mass (kg) × 0.789 - gender (men = 1, women = 2) × 7.634 + 51.109) allows for the prediction of energy expenditure for a given distance (mile) rather than for a given time (minutes).
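    Since the regression equation is stated explicitly in the abstract, it can be applied directly; a minimal sketch:

```python
def kcal_per_mile(mass_kg: float, gender: int) -> float:
    """Energy cost of one mile from the paper's regression
    (gender coded men = 1, women = 2)."""
    return mass_kg * 0.789 - gender * 7.634 + 51.109

print(kcal_per_mile(70.0, 1))   # ~98.7 kcal for a 70 kg man
```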

  20. Heterogeneous Suppression of Sequential Effects in Random Sequence Generation, but Not in Operant Learning.

    PubMed

    Shteingart, Hanan; Loewenstein, Yonatan

    2016-01-01

    There is a long history of experiments in which participants are instructed to generate a long sequence of binary random numbers. The scope of this line of research has shifted over the years from identifying the basic psychological principles and/or the heuristics that lead to deviations from randomness, to one of predicting future choices. In this paper, we used generalized linear regression and the framework of Reinforcement Learning in order to address both points. In particular, we used logistic regression analysis in order to characterize the temporal sequence of participants' choices. Surprisingly, a population analysis indicated that the contribution of the most recent trial has only a weak effect on behavior, compared to more preceding trials, a result that seems irreconcilable with standard sequential effects that decay monotonically with the delay. However, when considering each participant separately, we found that the magnitudes of the sequential effect are a monotonically decreasing function of the delay, yet these individual sequential effects are largely averaged out in a population analysis because of heterogeneity. The substantial behavioral heterogeneity in this task is further demonstrated quantitatively by considering the predictive power of the model. We show that a heterogeneous model of sequential dependencies captures the structure available in random sequence generation. Finally, we show that the results of the logistic regression analysis can be interpreted in the framework of reinforcement learning, allowing us to compare the sequential effects in the random sequence generation task to those in an operant learning task. We show that in contrast to the random sequence generation task, sequential effects in operant learning are far more homogeneous across the population. These results suggest that in the random sequence generation task, different participants adopt different cognitive strategies to suppress sequential dependencies when generating the "random" sequences.
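    The lagged logistic regression at the core of this kind of analysis can be sketched as follows, assuming a binary choice sequence and five lags; the simulated sequence is a stand-in for participant data.

```python
# Logistic regression of the current binary choice on the previous L choices.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
seq = (rng.uniform(size=500) < 0.5).astype(int)   # stand-in "random" sequence
L = 5
X = np.column_stack([seq[L - j - 1 : len(seq) - j - 1] for j in range(L)])
y = seq[L:]

model = LogisticRegression().fit(X, y)
print(model.coef_)   # one weight per lag; decay with lag = sequential effect
```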

  1. Methods for estimating annual exceedance-probability discharges for streams in Iowa, based on data through water year 2010

    USGS Publications Warehouse

    Eash, David A.; Barnes, Kimberlee K.; Veilleux, Andrea G.

    2013-01-01

    A statewide study was performed to develop regional regression equations for estimating selected annual exceedance-probability statistics for ungaged stream sites in Iowa. The study area comprises streamgages located within Iowa and 50 miles beyond the State’s borders. Annual exceedance-probability estimates were computed for 518 streamgages by using the expected moments algorithm to fit a Pearson Type III distribution to the logarithms of annual peak discharges for each streamgage using annual peak-discharge data through 2010. The estimation of the selected statistics included a Bayesian weighted least-squares/generalized least-squares regression analysis to update regional skew coefficients for the 518 streamgages. Low-outlier and historic information were incorporated into the annual exceedance-probability analyses, and a generalized Grubbs-Beck test was used to detect multiple potentially influential low flows. Also, geographic information system software was used to measure 59 selected basin characteristics for each streamgage. Regional regression analysis, using generalized least-squares regression, was used to develop a set of equations for each flood region in Iowa for estimating discharges for ungaged stream sites with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities, which are equivalent to annual flood-frequency recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively. A total of 394 streamgages were included in the development of regional regression equations for three flood regions (regions 1, 2, and 3) that were defined for Iowa based on landform regions and soil regions. Average standard errors of prediction range from 31.8 to 45.2 percent for flood region 1, 19.4 to 46.8 percent for flood region 2, and 26.5 to 43.1 percent for flood region 3. The pseudo coefficients of determination for the generalized least-squares equations range from 90.8 to 96.2 percent for flood region 1, 91.5 to 97.9 percent for flood region 2, and 92.4 to 96.0 percent for flood region 3. The regression equations are applicable only to stream sites in Iowa with flows not significantly affected by regulation, diversion, channelization, backwater, or urbanization and with basin characteristics within the range of those used to develop the equations. These regression equations will be implemented within the U.S. Geological Survey StreamStats Web-based geographic information system tool. StreamStats allows users to click on any ungaged site on a river and compute estimates of the eight selected statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged sites also are provided by the Web-based tool. StreamStats also allows users to click on any streamgage in Iowa and estimates computed for these eight selected statistics are provided for the streamgage.

  2. The use of segmented regression in analysing interrupted time series studies: an example in pre-hospital ambulance care.

    PubMed

    Taljaard, Monica; McKenzie, Joanne E; Ramsay, Craig R; Grimshaw, Jeremy M

    2014-06-19

    An interrupted time series design is a powerful quasi-experimental approach for evaluating effects of interventions introduced at a specific point in time. To utilize the strength of this design, a modification to standard regression analysis, such as segmented regression, is required. In segmented regression analysis, the change in intercept and/or slope from pre- to post-intervention is estimated and used to test causal hypotheses about the intervention. We illustrate segmented regression using data from a previously published study that evaluated the effectiveness of a collaborative intervention to improve quality in pre-hospital ambulance care for acute myocardial infarction (AMI) and stroke. In the original analysis, a standard regression model was used with time as a continuous variable. We contrast the results from this standard regression analysis with those from segmented regression analysis. We discuss the limitations of the former and advantages of the latter, as well as the challenges of using segmented regression in analysing complex quality improvement interventions. Based on the estimated change in intercept and slope from pre- to post-intervention using segmented regression, we found insufficient evidence of a statistically significant effect on quality of care for stroke, although potential clinically important effects for AMI cannot be ruled out. Segmented regression analysis is the recommended approach for analysing data from an interrupted time series study. Several modifications to the basic segmented regression analysis approach are available to deal with challenges arising in the evaluation of complex quality improvement interventions.
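    A minimal sketch of the segmented regression model follows, with a level-change term and a slope-change term added to standard OLS; the data and the intervention time are simulated for illustration.

```python
# Segmented regression for an interrupted time series.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
T0, n = 24, 48                                       # intervention at month 24
df = pd.DataFrame({"time": np.arange(n)})
df["post"] = (df["time"] >= T0).astype(int)          # level change
df["time_after"] = np.maximum(0, df["time"] - T0)    # slope change
df["y"] = (50 + 0.2 * df["time"] + 5 * df["post"]
           + 0.5 * df["time_after"] + rng.normal(0, 2, n))

fit = smf.ols("y ~ time + post + time_after", data=df).fit()
print(fit.params)   # 'post' = change in intercept, 'time_after' = change in slope
```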

  3. Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

    ERIC Educational Resources Information Center

    Kim, Rae Seon

    2011-01-01

    When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

  4. Ripening-dependent metabolic changes in the volatiles of pineapple (Ananas comosus (L.) Merr.) fruit: II. Multivariate statistical profiling of pineapple aroma compounds based on comprehensive two-dimensional gas chromatography-mass spectrometry.

    PubMed

    Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg

    2015-03-01

    Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and to describe the development of volatiles during postharvest storage, respectively. In this way, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest-ripened for 6 days after simulated sea-freight export when PCA was performed with only two principal components. However, PCA considering the third principal component as well allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.

  5. Meteorological adjustment of yearly mean values for air pollutant concentration comparison

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.; Neustadter, H. E.

    1976-01-01

    Using multiple linear regression analysis, models which estimate mean concentrations of Total Suspended Particulate (TSP), sulfur dioxide, and nitrogen dioxide as a function of several meteorologic variables, two rough economic indicators, and a simple trend in time are studied. Meteorologic data were obtained and do not include inversion heights. The goodness of fit of the estimated models is partially reflected by the squared coefficient of multiple correlation which indicates that, at the various sampling stations, the models accounted for about 23 to 47 percent of the total variance of the observed TSP concentrations. If the resulting model equations are used in place of simple overall means of the observed concentrations, there is about a 20 percent improvement in either: (1) predicting mean concentrations for specified meteorological conditions; or (2) adjusting successive yearly averages to allow for comparisons devoid of meteorological effects. An application to source identification is presented using regression coefficients of wind velocity predictor variables.

  6. Sensitive and selective cocaine electrochemical detection using disposable sensors.

    PubMed

    Asturias-Arribas, Laura; Alonso-Lomillo, M Asunción; Domínguez-Renedo, Olga; Arcos-Martínez, M Julia

    2014-06-27

    This paper describes the voltammetric determination of cocaine in the presence of three different interferences that could be found in street samples, using disposable sensors. The electrochemical analysis of this alkaloid can be affected by the presence of codeine, paracetamol or caffeine, whose oxidation peaks may overlap and lead to false positives. This work describes two different solutions to this problem. On the one hand, the modification of disposable carbon sensors with carbon nanotubes allows the voltammetric quantification of cocaine by using ordinary least squares regression in the concentration range from 10 to 155 μmol L(-1), with a reproducibility of 5.6% (RSD, n = 7). On the other hand, partial least squares regression is used for the resolution of the overlapped voltammetric signals when using screen-printed carbon electrodes without any modification. Both procedures have been successfully applied to the evaluation of the purity of cocaine street samples. Copyright © 2014 Elsevier B.V. All rights reserved.
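    The ordinary least squares calibration step can be sketched as follows, assuming a linear peak-current response over the stated concentration range; all numbers are hypothetical.

```python
# OLS calibration: peak current vs. concentration, then inverse prediction.
import numpy as np

conc = np.array([10, 35, 60, 85, 110, 135, 155], dtype=float)  # umol/L
current = 0.042 * conc + 0.31 + np.random.default_rng(4).normal(0, 0.05, 7)

slope, intercept = np.polyfit(conc, current, 1)
unknown_current = 2.8                              # measured signal of a sample
print((unknown_current - intercept) / slope)       # estimated concentration
```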

  7. Analytical and regression models of glass rod drawing process

    NASA Astrophysics Data System (ADS)

    Alekseeva, L. B.

    2018-03-01

    The process of drawing glass rods (light guides) is studied. The process parameters affecting the quality of the light guide have been determined. To solve the problem, mathematical models based on the general equations of continuum mechanics are used. The conditions for stable flow of the drawing process have been found; these are determined by the stability of the motion of the glass mass in the formation zone to small uncontrolled perturbations. The sensitivity of the formation zone to perturbations of the drawing speed and viscosity is estimated. Experimental models of the drawing process, based on regression analysis methods, have also been obtained. These models make it possible to tune a specific production process to obtain light guides of the required quality. They allow one to find the optimum combination of process parameters in the chosen region and to determine the required accuracy of maintaining them at a specified level.

  8. Specific prognostic factors for secondary pancreatic infection in severe acute pancreatitis.

    PubMed

    Armengol-Carrasco, M; Oller, B; Escudero, L E; Roca, J; Gener, J; Rodríguez, N; del Moral, P; Moreno, P

    1999-01-01

    The aim of the present study was to investigate whether there are specific prognostic factors to predict the development of secondary pancreatic infection (SPI) in severe acute pancreatitis, in order to perform computed tomography-guided fine needle aspiration with bacteriological sampling at the right moment and confirm the diagnosis. Twenty-five clinical and laboratory parameters were determined sequentially in 150 patients with severe acute pancreatitis (SAP), and univariate and multivariate regression analyses were performed looking for correlation with the development of SPI. Only the APACHE II score and C-reactive protein levels were related to the development of SPI in the multivariate analysis. A regression equation was designed using these two parameters, and empiric cut-off points defined the subgroup of patients at high risk of developing secondary pancreatic infection. The results showed that it is possible to predict SPI during SAP, allowing bacteriological confirmation and early treatment of this severe condition.

  9. Automated fine structure image analysis method for discrimination of diabetic retinopathy stage using conjunctival microvasculature images

    PubMed Central

    Khansari, Maziyar M; O’Neill, William; Penn, Richard; Chau, Felix; Blair, Norman P; Shahidi, Mahnaz

    2016-01-01

    The conjunctiva is a densely vascularized mucus membrane covering the sclera of the eye with a unique advantage of accessibility for direct visualization and non-invasive imaging. The purpose of this study is to apply an automated quantitative method for discrimination of different stages of diabetic retinopathy (DR) using conjunctival microvasculature images. Fine structural analysis of conjunctival microvasculature images was performed by ordinary least squares regression and Fisher linear discriminant analysis. Conjunctival images were discriminated between groups of non-diabetic subjects and diabetic subjects at different stages of DR. The automated method's discrimination rates were higher than those determined by human observers. The method allowed sensitive and rapid discrimination by assessment of conjunctival microvasculature images and can be potentially useful for DR screening and monitoring. PMID:27446692

  10. Time series regression model for infectious disease and weather.

    PubMed

    Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro

    2015-10-01

    Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
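    A minimal sketch of one proposed modification follows, assuming weekly counts and a temperature exposure: a negative binomial GLM with the logarithm of lagged case counts as a covariate to absorb autocorrelation from contagion. The data are simulated and the lag structure is deliberately simplified.

```python
# Negative binomial time series regression of cases on weather.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 156                                              # three years of weeks
temp = 15 + 10 * np.sin(2 * np.pi * np.arange(n) / 52) + rng.normal(0, 1, n)
cases = rng.poisson(20 + 10 * np.maximum(temp - 20, 0))

y = cases[1:]
X = sm.add_constant(np.column_stack([temp[1:], np.log(cases[:-1] + 1)]))
fit = sm.GLM(y, X, family=sm.families.NegativeBinomial()).fit()
print(fit.params)   # intercept, temperature effect, autocorrelation term
```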

  11. The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bihn T. Pham; Jeffrey J. Einerson

    2010-06-01

    This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory's Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks, and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e., gas mixture content) to effectively maintain the target quantity (fuel temperature) within a given range.

  12. An empirical study using permutation-based resampling in meta-regression

    PubMed Central

    2012-01-01

    Background In meta-regression, as the number of trials in the analyses decreases, the risk of false positives or false negatives increases. This is partly due to the assumption of normality that may not hold in small samples. Creation of a distribution from the observed trials using permutation methods to calculate P values may allow for less spurious findings. Permutation has not been empirically tested in meta-regression. The objective of this study was to perform an empirical investigation to explore the differences in results for meta-analyses on a small number of trials using standard large sample approaches versus permutation-based methods for meta-regression. Methods We isolated a sample of randomized controlled clinical trials (RCTs) for interventions that have a small number of trials (herbal medicine trials). Trials were then grouped by herbal species and condition and assessed for methodological quality using the Jadad scale, and data were extracted for each outcome. Finally, we performed meta-analyses on the primary outcome of each group of trials and meta-regression for methodological quality subgroups within each meta-analysis. We used large sample methods and permutation methods in our meta-regression modeling. We then compared final models and final P values between methods. Results We collected 110 trials across 5 intervention/outcome pairings and 5 to 10 trials per covariate. When applying large sample methods and permutation-based methods in our backwards stepwise regression the covariates in the final models were identical in all cases. The P values for the covariates in the final model were larger in 78% (7/9) of the cases for permutation and identical for 22% (2/9) of the cases. Conclusions We present empirical evidence that permutation-based resampling may not change final models when using backwards stepwise regression, but may increase P values in meta-regression of multiple covariates for a relatively small number of trials. PMID:22587815

  13. A comparison of regression methods for model selection in individual-based landscape genetic analysis.

    PubMed

    Shirk, Andrew J; Landguth, Erin L; Cushman, Samuel A

    2018-01-01

    Anthropogenic migration barriers fragment many populations and limit the ability of species to respond to climate-induced biome shifts. Conservation actions designed to conserve habitat connectivity and mitigate barriers are needed to unite fragmented populations into larger, more viable metapopulations, and to allow species to track their climate envelope over time. Landscape genetic analysis provides an empirical means to infer landscape factors influencing gene flow and thereby inform such conservation actions. However, there are currently many methods available for model selection in landscape genetics, and considerable uncertainty as to which provide the greatest accuracy in identifying the true landscape model influencing gene flow among competing alternative hypotheses. In this study, we used population genetic simulations to evaluate the performance of seven regression-based model selection methods on a broad array of landscapes that varied by the number and type of variables contributing to resistance, the magnitude and cohesion of resistance, as well as the functional relationship between variables and resistance. We also assessed the effect of transformations designed to linearize the relationship between genetic and landscape distances. We found that linear mixed effects models had the highest accuracy in every way we evaluated model performance; however, other methods also performed well in many circumstances, particularly when landscape resistance was high and the correlation among competing hypotheses was limited. Our results provide guidance for which regression-based model selection methods provide the most accurate inferences in landscape genetic analysis and thereby best inform connectivity conservation actions. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.

  14. On Budyko curve as a consequence of climate-soil-vegetation equilibrium hypothesis

    NASA Astrophysics Data System (ADS)

    Pande, S.

    2012-04-01

    A hypothesis that the Budyko curve is a consequence of stable equilibriums of climate-soil-vegetation co-evolution is tested at biome scale. We assume that i) the distribution of vegetation, soil and climate within a biome is a distribution of equilibriums of similar soil-vegetation dynamics, and that this dynamics differs across biomes, and ii) soil and vegetation are in dynamic equilibrium with climate while in static equilibrium with each other. In order to test the hypothesis, a two-stage regression is considered using the MOPEX/Hydrologic Synthesis Project dataset for basins in the eastern United States. In the first stage, multivariate regression (Seemingly Unrelated Regression) is performed for each biome with soil (estimated porosity and slope of the soil water retention curve) and vegetation characteristics (5-week NDVI gradient) as dependent variables and aridity index, vegetation and soil characteristics as independent variables for the respective dependent variables. The regression residuals of the first stage, along with aridity index, then serve as second-stage independent variables, while the ratio of actual vaporization to precipitation (vapor index) serves as the dependent variable. Insignificance, if revealed, of a first-stage parameter allows us to reject the role of the corresponding soil or vegetation characteristic in the co-evolution hypothesis. Meanwhile, the significance of a second-stage regression parameter corresponding to a first-stage residual allows us to reject the hypothesis that the Budyko curve is a locus solely of climate-soil-vegetation co-evolution equilibrium points. Results suggest a lack of evidence for soil-vegetation co-evolution in Prairies and Mixed/SouthEast Forests (unlike in Deciduous Forests), though climate plays a dominant role in explaining within-biome soil and vegetation characteristics across all the biomes. Preliminary results indicate an absence of effects beyond climate-soil-vegetation co-evolution in explaining the ratio of annual total minimum monthly flows to precipitation in Deciduous Forests, though the other three biome types show the presence of effects beyond co-evolution. Such an analysis can yield insights into the nature of hydrologic change when assessed along the Budyko curve, as well as non-co-evolutionary effects such as anthropogenic effects on basin-scale annual water balances.

  15. Estimation of pyrethroid pesticide intake using regression modeling of food groups based on composite dietary samples

    EPA Science Inventory

    Population-based estimates of pesticide intake are needed to characterize exposure for particular demographic groups based on their dietary behaviors. Regression modeling performed on measurements of selected pesticides in composited duplicate diet samples allowed (1) estimation ...

  16. Estimation of pyrethroid pesticide intake using regression modeling of food groups based on composite dietary samples..

    EPA Science Inventory

    Population-based estimates of pesticide intake are needed to characterize exposure for particular demographic groups based on their dietary behaviors. Regression modeling performed on measurements of selected pesticides in composited duplicate diet samples allowed (1) estimation ...

  17. Estimation of pyrethroid pesticide intake using regression modeling of food groups based on composite dietary samples.

    EPA Science Inventory

    Population-based estimates of pesticide intake are needed to characterize exposure for particular demographic groups based on their dietary behaviors. Regression modeling performed on measurements of selected pesticides in composited duplicate diet samples allowed (1) estimation ...

  18. Epidemiologic programs for computers and calculators. A microcomputer program for multiple logistic regression by unconditional and conditional maximum likelihood methods.

    PubMed

    Campos-Filho, N; Franco, E L

    1989-02-01

    A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.

  19. Epidemiologic Evaluation of Measurement Data in the Presence of Detection Limits

    PubMed Central

    Lubin, Jay H.; Colt, Joanne S.; Camann, David; Davis, Scott; Cerhan, James R.; Severson, Richard K.; Bernstein, Leslie; Hartge, Patricia

    2004-01-01

    Quantitative measurements of environmental factors greatly improve the quality of epidemiologic studies but can pose challenges because of the presence of upper or lower detection limits or interfering compounds, which do not allow for precise measured values. We consider the regression of an environmental measurement (dependent variable) on several covariates (independent variables). Various strategies are commonly employed to impute values for interval-measured data, including assignment of one-half the detection limit to nondetected values or of “fill-in” values randomly selected from an appropriate distribution. On the basis of a limited simulation study, we found that the former approach can be biased unless the percentage of measurements below detection limits is small (5–10%). The fill-in approach generally produces unbiased parameter estimates but may produce biased variance estimates and thereby distort inference when 30% or more of the data are below detection limits. Truncated data methods (e.g., Tobit regression) and multiple imputation offer two unbiased approaches for analyzing measurement data with detection limits. If interest resides solely on regression parameters, then Tobit regression can be used. If individualized values for measurements below detection limits are needed for additional analysis, such as relative risk regression or graphical display, then multiple imputation produces unbiased estimates and nominal confidence intervals unless the proportion of missing data is extreme. We illustrate various approaches using measurements of pesticide residues in carpet dust in control subjects from a case–control study of non-Hodgkin lymphoma. PMID:15579415
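    The fill-in multiple-imputation approach can be sketched as follows, assuming lognormal concentrations and a single detection limit; the distribution fit to the observed values is deliberately crude and the "analysis" is just a mean, for brevity.

```python
# Multiple imputation of values below a detection limit (LOD).
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
true = rng.lognormal(mean=0.0, sigma=1.0, size=200)
lod = 0.5
observed = np.where(true >= lod, true, np.nan)     # censor below LOD

logs = np.log(observed[~np.isnan(observed)])
mu, sigma = logs.mean(), logs.std(ddof=1)          # crude lognormal fit

m, estimates = 20, []
p_lod = stats.norm.cdf((np.log(lod) - mu) / sigma)
for _ in range(m):
    u = rng.uniform(0, p_lod, size=np.isnan(observed).sum())
    fill = np.exp(mu + sigma * stats.norm.ppf(u))  # draws truncated above at LOD
    completed = observed.copy()
    completed[np.isnan(completed)] = fill
    estimates.append(completed.mean())             # stand-in "analysis" step

print(np.mean(estimates))   # pooled estimate across imputations
```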

  20. Handling limited datasets with neural networks in medical applications: A small-data approach.

    PubMed

    Shaikhina, Torgyn; Khovanova, Natalia A

    2017-01-01

    Single-centre studies in the medical domain are often characterised by limited samples due to the complexity and high costs of patient data collection. Machine learning methods for regression modelling of small datasets (less than 10 observations per predictor variable) remain scarce. Our work bridges this gap by developing a novel framework for application of artificial neural networks (NNs) for regression tasks involving small medical datasets. In order to address the sporadic fluctuations and validation issues that appear in regression NNs trained on small datasets, the methods of multiple runs and surrogate data analysis were proposed in this work. The approach was compared to the state-of-the-art ensemble NNs; the effect of dataset size on NN performance was also investigated. The proposed framework was applied for the prediction of compressive strength (CS) of femoral trabecular bone in patients suffering from severe osteoarthritis. The NN model was able to estimate the CS of osteoarthritic trabecular bone from its structural and biological properties with a standard error of 0.85 MPa. When evaluated on independent test samples, the NN achieved accuracy of 98.3%, outperforming an ensemble NN model by 11%. We reproduce this result on CS data of another porous solid (concrete) and demonstrate that the proposed framework allows for an NN modelled with as few as 56 samples to generalise on 300 independent test samples with 86.5% accuracy, which is comparable to the performance of an NN developed with an 18 times larger dataset (1030 samples). The significance of this work is two-fold: the practical application allows for non-destructive prediction of bone fracture risk, while the novel methodology extends beyond the task considered in this study and provides a general framework for application of regression NNs to medical problems characterised by limited dataset sizes. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  1. Time-series panel analysis (TSPA): multivariate modeling of temporal associations in psychotherapy process.

    PubMed

    Ramseyer, Fabian; Kupper, Zeno; Caspar, Franz; Znoj, Hansjörg; Tschacher, Wolfgang

    2014-10-01

    Processes occurring in the course of psychotherapy are characterized by the simple fact that they unfold in time and that the multiple factors engaged in change processes vary highly between individuals (idiographic phenomena). Previous research, however, has neglected the temporal perspective by its traditional focus on static phenomena, which were mainly assessed at the group level (nomothetic phenomena). To support a temporal approach, the authors introduce time-series panel analysis (TSPA), a statistical methodology explicitly focusing on the quantification of temporal, session-to-session aspects of change in psychotherapy. TSPA models are initially built at the level of individuals and are subsequently aggregated at the group level, thus allowing the exploration of prototypical models. TSPA is based on vector auto-regression (VAR), an extension of univariate auto-regression models to multivariate time-series data. The application of TSPA is demonstrated in a sample of 87 outpatient psychotherapy patients who were monitored by postsession questionnaires. Prototypical mechanisms of change were derived from the aggregation of individual multivariate models of psychotherapy process. In a second step, the associations between mechanisms of change (TSPA) and pre- to postsymptom change were explored. TSPA allowed a prototypical process pattern to be identified, where the patient's alliance and self-efficacy were linked by a temporal feedback loop. Furthermore, the therapist's stability over time in both mastery and clarification interventions was positively associated with better outcomes. TSPA is a statistical tool that sheds new light on temporal mechanisms of change. Through this approach, clinicians may gain insight into prototypical patterns of change in psychotherapy. PsycINFO Database Record (c) 2014 APA, all rights reserved.
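    The vector auto-regression core of TSPA can be sketched for a single patient with two process variables; the variable names and dynamics below are hypothetical, chosen to mimic the alliance/self-efficacy feedback loop described in the abstract.

```python
# First-order VAR on two simulated session-to-session process variables.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(7)
n = 40                                   # sessions for one patient
alliance = np.zeros(n)
efficacy = np.zeros(n)
for t in range(1, n):                    # simulated feedback loop
    alliance[t] = 0.4 * alliance[t-1] + 0.3 * efficacy[t-1] + rng.normal(0, 0.5)
    efficacy[t] = 0.3 * alliance[t-1] + 0.4 * efficacy[t-1] + rng.normal(0, 0.5)

data = pd.DataFrame({"alliance": alliance, "efficacy": efficacy})
fit = VAR(data).fit(maxlags=1)
print(fit.params)   # cross-lagged coefficients = temporal associations
```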

  2. A rotor optimization using regression analysis

    NASA Technical Reports Server (NTRS)

    Giansante, N.

    1984-01-01

    The design and development of helicopter rotors is subject to the many design variables and their interactions that affect rotor operation. Until recently, selection of rotor design variables to achieve specified rotor operational qualities has been a costly, time consuming, repetitive task. For the past several years, Kaman Aerospace Corporation has successfully applied multiple linear regression analysis, coupled with optimization and sensitivity procedures, in the analytical design of rotor systems. It is concluded that approximating equations can be developed rapidly for a multiplicity of objective and constraint functions and optimizations can be performed in a rapid and cost effective manner; the number and/or range of design variables can be increased by expanding the data base and developing approximating functions to reflect the expanded design space; the order of the approximating equations can be expanded easily to improve correlation between analyzer results and the approximating equations; gradients of the approximating equations can be calculated easily and these gradients are smooth functions reducing the risk of numerical problems in the optimization; the use of approximating functions allows the problem to be started easily and rapidly from various initial designs to enhance the probability of finding a global optimum; and the approximating equations are independent of the analysis or optimization codes used.

  3. The median hazard ratio: a useful measure of variance and general contextual effects in multilevel survival analysis.

    PubMed

    Austin, Peter C; Wagner, Philippe; Merlo, Juan

    2017-03-15

    Multilevel data occur frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models (MLRM). MLRM incorporate cluster-specific random effects which allow one to partition the total individual variance into between-cluster variation and between-individual variation. Statistically, MLRM account for the dependency of the data within clusters and provide correct estimates of uncertainty around regression coefficients. Substantively, the magnitude of the effect of clustering provides a measure of the General Contextual Effect (GCE). When outcomes are binary, the GCE can also be quantified by measures of heterogeneity like the Median Odds Ratio (MOR) calculated from a multilevel logistic regression model. Time-to-event outcomes within a multilevel structure occur commonly in epidemiological and medical research. However, the Median Hazard Ratio (MHR) that corresponds to the MOR in multilevel (i.e., 'frailty') Cox proportional hazards regression is rarely used. Analogously to the MOR, the MHR is the median relative change in the hazard of the occurrence of the outcome when comparing identical subjects from two randomly selected different clusters that are ordered by risk. We illustrate the application and interpretation of the MHR in a case study analyzing the hazard of mortality in patients hospitalized for acute myocardial infarction at hospitals in Ontario, Canada. We provide R code for computing the MHR. The MHR is a useful and intuitive measure for expressing cluster heterogeneity in the outcome and, thereby, estimating general contextual effects in multilevel survival analysis. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
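
    The paper supplies R code for the MHR; the sketch below mirrors the same quantity in Python, under the usual assumption that the cluster random effects on the log-hazard scale are normally distributed with variance sigma2, in which case MHR = exp(sqrt(2 * sigma2) * Phi^-1(0.75)), directly analogous to the MOR.

    ```python
    from math import exp, sqrt
    from scipy.stats import norm

    def median_hazard_ratio(sigma2: float) -> float:
        """Median hazard ratio from the frailty (random-effect) variance sigma2."""
        return exp(sqrt(2.0 * sigma2) * norm.ppf(0.75))

    # e.g. a between-hospital variance of 0.25 on the log-hazard scale
    print(round(median_hazard_ratio(0.25), 2))  # -> 1.61
    ```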

  4. Identification of patients with gout: elaboration of a questionnaire for epidemiological studies.

    PubMed

    Richette, P; Clerson, P; Bouée, S; Chalès, G; Doherty, M; Flipo, R M; Lambert, C; Lioté, F; Poiraud, T; Schaeverbeke, T; Bardin, T

    2015-09-01

    In France, the prevalence of gout is currently unknown. We aimed to design a questionnaire to detect gout that would be suitable for use in a telephone survey by non-physicians and assessed its performance. We designed a 62-item questionnaire covering comorbidities, clinical features and treatment of gout. In a case-control study, we enrolled patients with a history of arthritis who had undergone arthrocentesis for synovial fluid analysis and crystal detection. Cases were patients with crystal-proven gout and controls were patients who had arthritis and effusion with no monosodium urate crystals in synovial fluid. The questionnaire was administered by phone to cases and controls by non-physicians who were unaware of the patient diagnosis. Logistic regression analysis and classification and regression trees were used to select items discriminating cases and controls. We interviewed 246 patients (102 cases and 142 controls). Two logistic regression models (sensitivity 88.0% and 87.5%; specificity 93.0% and 89.8%, respectively) and one classification and regression tree model (sensitivity 81.4%, specificity 93.7%) revealed 11 informative items that allowed for classifying 90.0%, 88.8% and 88.5% of patients, respectively. We developed a questionnaire to detect gout containing 11 items that is fast and suitable for use in a telephone survey by non-physicians. The questionnaire demonstrated good properties for discriminating patients with and without gout. It will be administered in a large sample of the general population to estimate the prevalence of gout in France. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
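
    A minimal sketch of the two model families used here for item selection, logistic regression and a classification and regression tree, run on simulated yes/no items standing in for the study questionnaire (scikit-learn is our tooling choice; the authors' software is not stated).

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import recall_score

    rng = np.random.default_rng(1)
    X = rng.integers(0, 2, size=(246, 11)).astype(float)  # 11 yes/no items, 246 patients
    y = (X[:, 0] + X[:, 3] + rng.normal(0, 0.5, 246) > 1).astype(int)  # 1 = gout, 0 = control

    logit = LogisticRegression(max_iter=1000).fit(X, y)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

    for name, model in [("logistic", logit), ("CART", tree)]:
        pred = model.predict(X)
        sensitivity = recall_score(y, pred)
        specificity = recall_score(y, pred, pos_label=0)
        print(f"{name}: sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
    ```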

  5. Predicting attention-deficit/hyperactivity disorder severity from psychosocial stress and stress-response genes: a random forest regression approach

    PubMed Central

    van der Meer, D; Hoekstra, P J; van Donkelaar, M; Bralten, J; Oosterlaan, J; Heslenfeld, D; Faraone, S V; Franke, B; Buitelaar, J K; Hartman, C A

    2017-01-01

    Identifying genetic variants contributing to attention-deficit/hyperactivity disorder (ADHD) is complicated by the involvement of numerous common genetic variants with small effects, interacting with each other as well as with environmental factors, such as stress exposure. Random forest regression is well suited to explore this complexity, as it allows for the analysis of many predictors simultaneously, taking into account any higher-order interactions among them. Using random forest regression, we predicted ADHD severity, measured by Conners’ Parent Rating Scales, from 686 adolescents and young adults (of which 281 were diagnosed with ADHD). The analysis included 17 374 single-nucleotide polymorphisms (SNPs) across 29 genes previously linked to hypothalamic–pituitary–adrenal (HPA) axis activity, together with information on exposure to 24 individual long-term difficulties or stressful life events. The model explained 12.5% of variance in ADHD severity. The most important SNP, which also showed the strongest interaction with stress exposure, was located in a region regulating the expression of telomerase reverse transcriptase (TERT). Other high-ranking SNPs were found in or near NPSR1, ESR1, GABRA6, PER3, NR3C2 and DRD4. Chronic stressors were more influential than single, severe, life events. Top hits were partly shared with conduct problems. We conclude that random forest regression may be used to investigate how multiple genetic and environmental factors jointly contribute to ADHD. It is able to implicate novel SNPs of interest, interacting with stress exposure, and may explain inconsistent findings in ADHD genetics. This exploratory approach may be best combined with more hypothesis-driven research; top predictors and their interactions with one another should be replicated in independent samples. PMID:28585928
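
    A minimal sketch of the random forest regression approach: predict a continuous severity score from many SNP predictors plus a stress-exposure count, then rank predictors by importance. The data are simulated stand-ins (the study used 17,374 SNPs and 24 stressors), and the SNP-by-stress interaction built into the toy outcome is purely illustrative.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(2)
    n, n_snps = 686, 200
    snps = rng.integers(0, 3, size=(n, n_snps)).astype(float)  # 0/1/2 minor-allele counts
    stress = rng.poisson(3, size=n).astype(float)              # count of stressful events
    X = np.column_stack([snps, stress])
    y = 0.5 * snps[:, 0] * stress + 0.3 * snps[:, 1] + rng.normal(0, 1, n)  # toy SNP x stress effect

    rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0).fit(X, y)
    print("out-of-bag R^2:", round(rf.oob_score_, 3))
    top = np.argsort(rf.feature_importances_)[::-1][:5]
    print("highest-ranking predictor columns:", top)  # column n_snps is the stress count
    ```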

  6. The median hazard ratio: a useful measure of variance and general contextual effects in multilevel survival analysis

    PubMed Central

    Wagner, Philippe; Merlo, Juan

    2016-01-01

    Multilevel data occur frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models (MLRM). MLRM incorporate cluster‐specific random effects which allow one to partition the total individual variance into between‐cluster variation and between‐individual variation. Statistically, MLRM account for the dependency of the data within clusters and provide correct estimates of uncertainty around regression coefficients. Substantively, the magnitude of the effect of clustering provides a measure of the General Contextual Effect (GCE). When outcomes are binary, the GCE can also be quantified by measures of heterogeneity like the Median Odds Ratio (MOR) calculated from a multilevel logistic regression model. Time‐to‐event outcomes within a multilevel structure occur commonly in epidemiological and medical research. However, the Median Hazard Ratio (MHR) that corresponds to the MOR in multilevel (i.e., ‘frailty’) Cox proportional hazards regression is rarely used. Analogously to the MOR, the MHR is the median relative change in the hazard of the occurrence of the outcome when comparing identical subjects from two randomly selected different clusters that are ordered by risk. We illustrate the application and interpretation of the MHR in a case study analyzing the hazard of mortality in patients hospitalized for acute myocardial infarction at hospitals in Ontario, Canada. We provide R code for computing the MHR. The MHR is a useful and intuitive measure for expressing cluster heterogeneity in the outcome and, thereby, estimating general contextual effects in multilevel survival analysis. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:27885709

  7. A Comparison between Multiple Regression Models and CUN-BAE Equation to Predict Body Fat in Adults

    PubMed Central

    Fuster-Parra, Pilar; Bennasar-Veny, Miquel; Tauler, Pedro; Yañez, Aina; López-González, Angel A.; Aguiló, Antoni

    2015-01-01

    Background Because the accurate measure of body fat (BF) is difficult, several prediction equations have been proposed. The aim of this study was to compare different multiple regression models to predict BF, including the recently reported CUN-BAE equation. Methods Multiple regression models using body mass index (BMI) and body adiposity index (BAI) as predictors of BF will be compared. These models will also be compared with the CUN-BAE equation. For all the analyses, a sample including all the participants and another one including only the overweight and obese subjects will be considered. The BF reference measure was made using Bioelectrical Impedance Analysis. Results The simplest models including only BMI or BAI as independent variables showed that BAI is a better predictor of BF. However, adding the variable sex to both models made BMI a better predictor than the BAI. For both the whole group of participants and the group of overweight and obese participants, using simple models (BMI, age and sex as variables) allowed obtaining similar correlations with BF as when the more complex CUN-BAE was used (ρ = 0.87 vs. ρ = 0.86 for the whole sample and ρ = 0.88 vs. ρ = 0.89 for overweight and obese subjects, the second value being the one for CUN-BAE). Conclusions There are models simpler than the CUN-BAE equation that fit BF as well as CUN-BAE does. Therefore, it could be considered that CUN-BAE overfits. Using a simple linear regression model, the BAI, as the only variable, predicts BF better than BMI. However, when the sex variable is introduced, BMI becomes the indicator of choice to predict BF. PMID:25821960

  8. A comparison between multiple regression models and CUN-BAE equation to predict body fat in adults.

    PubMed

    Fuster-Parra, Pilar; Bennasar-Veny, Miquel; Tauler, Pedro; Yañez, Aina; López-González, Angel A; Aguiló, Antoni

    2015-01-01

    Because the accurate measure of body fat (BF) is difficult, several prediction equations have been proposed. The aim of this study was to compare different multiple regression models to predict BF, including the recently reported CUN-BAE equation. Multiple regression models using body mass index (BMI) and body adiposity index (BAI) as predictors of BF will be compared. These models will also be compared with the CUN-BAE equation. For all the analyses, a sample including all the participants and another one including only the overweight and obese subjects will be considered. The BF reference measure was made using Bioelectrical Impedance Analysis. The simplest models including only BMI or BAI as independent variables showed that BAI is a better predictor of BF. However, adding the variable sex to both models made BMI a better predictor than the BAI. For both the whole group of participants and the group of overweight and obese participants, using simple models (BMI, age and sex as variables) allowed obtaining similar correlations with BF as when the more complex CUN-BAE was used (ρ = 0.87 vs. ρ = 0.86 for the whole sample and ρ = 0.88 vs. ρ = 0.89 for overweight and obese subjects, the second value being the one for CUN-BAE). There are models simpler than the CUN-BAE equation that fit BF as well as CUN-BAE does. Therefore, it could be considered that CUN-BAE overfits. Using a simple linear regression model, the BAI, as the only variable, predicts BF better than BMI. However, when the sex variable is introduced, BMI becomes the indicator of choice to predict BF.

  9. Shear strength of fillet welds in aluminum alloy 2219. [for use on the solid rocket motor and external tank

    NASA Technical Reports Server (NTRS)

    Lovoy, C. V.

    1978-01-01

    Fillet size is discussed in terms of theoretical or design dimensions versus as-welded dimensions, drawing attention to the inherent conservatism in the design load sustaining capabilities of fillet welds. Emphasis is placed on components for the solid rocket motor, external tank, and other aerospace applications. Problems associated with inspection of fillet welds are addressed and a comparison is drawn between defect counts obtained by radiographic inspection and by visual examination of the fracture plane. Fillet weld quality is related linearly to ultimate shear strength. Correlation coefficients are obtained by simple straight line regression analysis between the variables of ultimate shear strength and accumulative discontinuity summation. Shear strength allowables are found to be equivalent to 57 percent of butt weld A allowables (F_tu).

  10. Application of mathematical model methods for optimization tasks in construction materials technology

    NASA Astrophysics Data System (ADS)

    Fomina, E. V.; Kozhukhova, N. I.; Sverguzova, S. V.; Fomin, A. E.

    2018-05-01

    In this paper, the regression equations method for design of construction material was studied. Regression and polynomial equations representing the correlation between the studied parameters were proposed. The logic design and software interface of the regression equations method focused on parameter optimization to provide the energy saving effect at the stage of autoclave aerated concrete design considering the replacement of traditionally used quartz sand by coal mining by-product such as argillite. The mathematical model, represented by a quadratic polynomial for the design of experiment, was obtained using calculated and experimental data. This allowed estimation of the relationship between the composition and final properties of the aerated concrete. The response surface graphically presented in a nomogram allowed the estimation of concrete properties in response to variation of composition within the x-space. The optimal range of argillite content was obtained, leading to a reduction of raw materials demand, development of target plastic strength of aerated concrete as well as a reduction of curing time before autoclave treatment. Generally, this method allows the design of autoclave aerated concrete with required performance without additional resource and time costs.
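
    A sketch of the core step, fitting a second-order (quadratic) polynomial response surface to design-of-experiments data with ordinary least squares; the factor names and values are illustrative assumptions, not the paper's mix design.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(3)
    X = rng.uniform(0, 1, size=(27, 2))  # e.g. coded argillite fraction and water/solid ratio
    y = 2 + 3 * X[:, 0] - 4 * X[:, 0] ** 2 + 1.5 * X[:, 0] * X[:, 1] + rng.normal(0, 0.1, 27)

    quad = PolynomialFeatures(degree=2, include_bias=False)
    model = LinearRegression().fit(quad.fit_transform(X), y)
    terms = quad.get_feature_names_out(["argillite", "w_s"])
    print(dict(zip(terms, model.coef_.round(2))))
    # scanning the fitted surface over the x-space reproduces the nomogram-style optimum search
    ```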

  11. HYDRORECESSION: A toolbox for streamflow recession analysis

    NASA Astrophysics Data System (ADS)

    Arciniega, S.

    2015-12-01

    Streamflow recession curves are hydrological signatures that allow study of the relationship between groundwater storage and baseflow and/or low flows at the catchment scale. Recent studies have shown that streamflow recession analysis can be quite sensitive to the combination of different models, extraction techniques and parameter estimation methods. In order to better characterize streamflow recession curves, new methodologies combining multiple approaches have been recommended. The HYDRORECESSION toolbox, presented here, is a Matlab graphical user interface developed to analyse streamflow recession time series with the support of different tools that allow parameterizing linear and nonlinear storage-outflow relationships through four of the most useful recession models (Maillet, Boussinesq, Coutagne and Wittenberg). The toolbox includes four parameter-fitting techniques (linear regression, lower envelope, data binning and mean squared error) and three different methods to extract hydrograph recession segments (Vogel, Brutsaert and Aksoy). In addition, the toolbox has a module that separates the baseflow component from the observed hydrograph using the inverse reservoir algorithm. Potential applications provided by HYDRORECESSION include model parameter analysis, hydrological regionalization and classification, baseflow index estimates, catchment-scale recharge and low-flows modelling, among others. HYDRORECESSION is freely available for non-commercial and academic purposes.
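
    One recession-analysis step the toolbox automates can be sketched as follows: fit the nonlinear storage-outflow relation -dQ/dt = a * Q**b to a recession limb by linear regression in log-log space (the Brutsaert-Nieber style of fit). The streamflow series below is synthetic.

    ```python
    import numpy as np

    t = np.arange(60.0)               # days
    Q = 10.0 * (1 + 0.05 * t) ** -2   # synthetic recession limb, m^3/s

    dQdt = np.diff(Q)                 # daily dQ/dt (negative during recession)
    Qmid = 0.5 * (Q[1:] + Q[:-1])     # discharge at interval midpoints

    # log(-dQ/dt) = log(a) + b * log(Q): slope gives b, intercept gives log(a)
    b, log_a = np.polyfit(np.log(Qmid), np.log(-dQdt), 1)
    print(f"b = {b:.2f}, a = {np.exp(log_a):.4f}")
    ```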

  12. The kinetics-based enzyme-linked immunosorbent assay for coronavirus antibodies in cats: calibration to the indirect immunofluorescence assay and computerized standardization of results through normalization to control values.

    PubMed Central

    Barlough, J E; Jacobson, R H; Downing, D R; Lynch, T J; Scott, F W

    1987-01-01

    The computer-assisted, kinetics-based enzyme-linked immunosorbent assay for coronavirus antibodies in cats was calibrated to the conventional indirect immunofluorescence assay by linear regression analysis and computerized interpolation (generation of "immunofluorescence assay-equivalent" titers). Procedures were developed for normalization and standardization of kinetics-based enzyme-linked immunosorbent assay results through incorporation of five different control sera of predetermined ("expected") titer in daily runs. When used with such sera and with computer assistance, the kinetics-based enzyme-linked immunosorbent assay minimized both within-run and between-run variability while allowing also for efficient data reduction and statistical analysis and reporting of results. PMID:3032390

  13. The kinetics-based enzyme-linked immunosorbent assay for coronavirus antibodies in cats: calibration to the indirect immunofluorescence assay and computerized standardization of results through normalization to control values.

    PubMed

    Barlough, J E; Jacobson, R H; Downing, D R; Lynch, T J; Scott, F W

    1987-01-01

    The computer-assisted, kinetics-based enzyme-linked immunosorbent assay for coronavirus antibodies in cats was calibrated to the conventional indirect immunofluorescence assay by linear regression analysis and computerized interpolation (generation of "immunofluorescence assay-equivalent" titers). Procedures were developed for normalization and standardization of kinetics-based enzyme-linked immunosorbent assay results through incorporation of five different control sera of predetermined ("expected") titer in daily runs. When used with such sera and with computer assistance, the kinetics-based enzyme-linked immunosorbent assay minimized both within-run and between-run variability while allowing also for efficient data reduction and statistical analysis and reporting of results.

  14. Estimating air drying times of lumber with multiple regression

    Treesearch

    William T. Simpson

    2004-01-01

    In this study, the applicability of a multiple regression equation for estimating air drying times of red oak, sugar maple, and ponderosa pine lumber was evaluated. The equation allows prediction of estimated air drying times from historic weather records of temperature and relative humidity at any desired location.

  15. Trust, confidence, procedural fairness, outcome fairness, moral conviction, and the acceptance of GM field experiments.

    PubMed

    Siegrist, Michael; Connor, Melanie; Keller, Carmen

    2012-08-01

    In 2005, Swiss citizens endorsed a moratorium on gene technology, resulting in the prohibition of the commercial cultivation of genetically modified crops and the growth of genetically modified animals until 2013. However, scientific research was not affected by this moratorium, and in 2008, GMO field experiments were conducted that allowed us to examine the factors that influence their acceptance by the public. In this study, trust and confidence items were analyzed using principal component analysis. The analysis revealed the following three factors: "economy/health and environment" (value similarity based trust), "trust and honesty of industry and scientists" (value similarity based trust), and "competence" (confidence). The results of a regression analysis showed that all the three factors significantly influenced the acceptance of GM field experiments. Furthermore, risk communication scholars have suggested that fairness also plays an important role in the acceptance of environmental hazards. We, therefore, included measures for outcome fairness and procedural fairness in our model. However, the impact of fairness may be moderated by moral conviction. That is, fairness may be significant for people for whom GMO is not an important issue, but not for people for whom GMO is an important issue. The regression analysis showed that, in addition to the trust and confidence factors, moral conviction, outcome fairness, and procedural fairness were significant predictors. The results suggest that the influence of procedural fairness is even stronger for persons having high moral convictions compared with persons having low moral convictions. © 2012 Society for Risk Analysis.

  16. Design and analysis of forward and reverse models for predicting defect accumulation, defect energetics, and irradiation conditions

    DOE PAGES

    Stewart, James A.; Kohnert, Aaron A.; Capolungo, Laurent; ...

    2018-03-06

    The complexity of radiation effects in a material’s microstructure makes developing predictive models a difficult task. In principle, a complete list of all possible reactions between defect species being considered can be used to elucidate damage evolution mechanisms and its associated impact on microstructure evolution. However, a central limitation is that many models use a limited and incomplete catalog of defect energetics and associated reactions. Even for a given model, estimating its input parameters remains a challenge, especially for complex material systems. Here, we present a computational analysis to identify the extent to which defect accumulation, energetics, and irradiation conditions can be determined via forward and reverse regression models constructed and trained from large data sets produced by cluster dynamics simulations. A global sensitivity analysis, via Sobol’ indices, concisely characterizes parameter sensitivity and demonstrates how this can be connected to variability in defect evolution. Based on this analysis and depending on the definition of what constitutes the input and output spaces, forward and reverse regression models are constructed and allow for the direct calculation of defect accumulation, defect energetics, and irradiation conditions. Here, this computational analysis, exercised on a simplified cluster dynamics model, demonstrates the ability to design predictive surrogate and reduced-order models, and provides guidelines for improving model predictions within the context of forward and reverse engineering of mathematical models for radiation effects in a materials’ microstructure.
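
    The global sensitivity step uses Sobol' indices; below is a minimal sketch with the SALib package (a tooling choice of ours, not necessarily the authors') applied to a toy stand-in for the cluster dynamics response.

    ```python
    import numpy as np
    from SALib.sample import saltelli
    from SALib.analyze import sobol

    problem = {
        "num_vars": 3,
        "names": ["dose_rate", "temperature", "sink_strength"],  # illustrative inputs
        "bounds": [[0.1, 1.0], [300.0, 600.0], [0.01, 0.1]],
    }

    X = saltelli.sample(problem, 1024)
    # toy "defect accumulation" response standing in for a cluster dynamics code
    Y = X[:, 0] * np.exp(-X[:, 1] / 400.0) / X[:, 2]

    Si = sobol.analyze(problem, Y)
    print("first-order indices:", np.round(Si["S1"], 3))
    print("total-order indices:", np.round(Si["ST"], 3))
    ```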

  17. Design and analysis of forward and reverse models for predicting defect accumulation, defect energetics, and irradiation conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stewart, James A.; Kohnert, Aaron A.; Capolungo, Laurent

    The complexity of radiation effects in a material’s microstructure makes developing predictive models a difficult task. In principle, a complete list of all possible reactions between defect species being considered can be used to elucidate damage evolution mechanisms and its associated impact on microstructure evolution. However, a central limitation is that many models use a limited and incomplete catalog of defect energetics and associated reactions. Even for a given model, estimating its input parameters remains a challenge, especially for complex material systems. Here, we present a computational analysis to identify the extent to which defect accumulation, energetics, and irradiation conditions can be determined via forward and reverse regression models constructed and trained from large data sets produced by cluster dynamics simulations. A global sensitivity analysis, via Sobol’ indices, concisely characterizes parameter sensitivity and demonstrates how this can be connected to variability in defect evolution. Based on this analysis and depending on the definition of what constitutes the input and output spaces, forward and reverse regression models are constructed and allow for the direct calculation of defect accumulation, defect energetics, and irradiation conditions. Here, this computational analysis, exercised on a simplified cluster dynamics model, demonstrates the ability to design predictive surrogate and reduced-order models, and provides guidelines for improving model predictions within the context of forward and reverse engineering of mathematical models for radiation effects in a materials’ microstructure.

  18. Applied Multiple Linear Regression: A General Research Strategy

    ERIC Educational Resources Information Center

    Smith, Brandon B.

    1969-01-01

    Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)

  19. Trait Mindfulness as a Limiting Factor for Residual Depressive Symptoms: An Explorative Study Using Quantile Regression

    PubMed Central

    Radford, Sholto; Eames, Catrin; Brennan, Kate; Lambert, Gwladys; Crane, Catherine; Williams, J. Mark G.; Duggan, Danielle S.; Barnhofer, Thorsten

    2014-01-01

    Mindfulness has been suggested to be an important protective factor for emotional health. However, this effect might vary with regard to context. This study applied a novel statistical approach, quantile regression, in order to investigate the relation between trait mindfulness and residual depressive symptoms in individuals with a history of recurrent depression, while taking into account symptom severity and number of episodes as contextual factors. Rather than fitting to a single indicator of central tendency, quantile regression allows exploration of relations across the entire range of the response variable. Analysis of self-report data from 274 participants with a history of three or more previous episodes of depression showed that relatively higher levels of mindfulness were associated with relatively lower levels of residual depressive symptoms. This relationship was most pronounced near the upper end of the response distribution and moderated by the number of previous episodes of depression at the higher quantiles. The findings suggest that with lower levels of mindfulness, residual symptoms are less constrained and more likely to be influenced by other factors. Further, the limiting effect of mindfulness on residual symptoms is most salient in those with higher numbers of episodes. PMID:24988072
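
    A sketch of the quantile-regression idea with statsmodels: estimate the mindfulness-symptom slope at several quantiles of the response rather than only at the mean. The variable names and the heteroscedastic toy data are simulated stand-ins for the study's self-report measures.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    n = 274
    mindfulness = rng.normal(0, 1, n)
    # toy data in which the mindfulness effect is strongest near the upper tail
    symptoms = 10 - 1.5 * mindfulness + rng.gumbel(0, 2 + (mindfulness < 0), n)
    df = pd.DataFrame({"symptoms": symptoms, "mindfulness": mindfulness})

    for q in (0.25, 0.50, 0.90):
        fit = smf.quantreg("symptoms ~ mindfulness", df).fit(q=q)
        print(f"q = {q}: slope = {fit.params['mindfulness']:.2f}")
    ```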

  20. Toxocara infection in the United States: the relevance of poverty, geography and demography as risk factors, and implications for estimating county prevalence.

    PubMed

    Congdon, Peter; Lloyd, Patsy

    2011-02-01

    To estimate Toxocara infection rates by age, gender and ethnicity for US counties using data from the National Health and Nutrition Examination Survey (NHANES). After initial analysis to account for missing data, a binary regression model is applied to obtain relative risks of Toxocara infection for 20,396 survey subjects. The regression incorporates interplay between demographic attributes (age, ethnicity and gender), family poverty and geographic context (region, metropolitan status). Prevalence estimates for counties are then made, distinguishing between subpopulations in poverty and not in poverty. Even after allowing for elevated infection risk associated with poverty, seropositivity is elevated among Black non-Hispanics and other ethnic groups. There are also distinct effects of region. When regression results are translated into county prevalence estimates, the main influences on variation in county rates are percentages of non-Hispanic Blacks and county poverty. For targeting prevention it is important to assess implications of national survey data for small area prevalence. Using data from NHANES, the study confirms that both individual level risk factors and geographic contextual factors affect chances of Toxocara infection.

  1. Assessing Principal Component Regression Prediction of Neurochemicals Detected with Fast-Scan Cyclic Voltammetry

    PubMed Central

    2011-01-01

    Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook’s distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards. PMID:21966586
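
    A compact sketch of principal component regression itself: compress the correlated current measurements with PCA, then regress analyte concentration on the retained scores. The simulated voltammograms and the choice of three components are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(5)
    n_train, n_points = 25, 100                         # training scans x potential steps
    template = np.sin(np.linspace(0, np.pi, n_points))  # toy voltammogram shape
    conc = rng.uniform(0.1, 1.0, n_train)               # known concentrations (arbitrary units)
    scans = np.outer(conc, template) + rng.normal(0, 0.01, (n_train, n_points))

    pcr = make_pipeline(PCA(n_components=3), LinearRegression()).fit(scans, conc)
    new_scan = 0.6 * template + rng.normal(0, 0.01, n_points)
    print("predicted concentration:", pcr.predict(new_scan[None, :]).round(3))
    ```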

  2. Semiparametric regression analysis of interval-censored competing risks data.

    PubMed

    Mao, Lu; Lin, Dan-Yu; Zeng, Donglin

    2017-09-01

    Interval-censored competing risks data arise when each study subject may experience an event or failure from one of several causes and the failure time is not observed directly but rather is known to lie in an interval between two examinations. We formulate the effects of possibly time-varying (external) covariates on the cumulative incidence or sub-distribution function of competing risks (i.e., the marginal probability of failure from a specific cause) through a broad class of semiparametric regression models that captures both proportional and non-proportional hazards structures for the sub-distribution. We allow each subject to have an arbitrary number of examinations and accommodate missing information on the cause of failure. We consider nonparametric maximum likelihood estimation and devise a fast and stable EM-type algorithm for its computation. We then establish the consistency, asymptotic normality, and semiparametric efficiency of the resulting estimators for the regression parameters by appealing to modern empirical process theory. In addition, we show through extensive simulation studies that the proposed methods perform well in realistic situations. Finally, we provide an application to a study on HIV-1 infection with different viral subtypes. © 2017, The International Biometric Society.

  3. Assessing principal component regression prediction of neurochemicals detected with fast-scan cyclic voltammetry.

    PubMed

    Keithley, Richard B; Wightman, R Mark

    2011-06-07

    Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook's distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards.

  4. Calypso: a user-friendly web-server for mining and visualizing microbiome-environment interactions.

    PubMed

    Zakrzewski, Martha; Proietti, Carla; Ellis, Jonathan J; Hasan, Shihab; Brion, Marie-Jo; Berger, Bernard; Krause, Lutz

    2017-03-01

    Calypso is an easy-to-use online software suite that allows non-expert users to mine, interpret and compare taxonomic information from metagenomic or 16S rDNA datasets. Calypso has a focus on multivariate statistical approaches that can identify complex environment-microbiome associations. The software enables quantitative visualizations, statistical testing, multivariate analysis, supervised learning, factor analysis, multivariable regression, network analysis and diversity estimates. Comprehensive help pages, tutorials and videos are provided via a wiki page. The web-interface is accessible via http://cgenome.net/calypso/ . The software is programmed in Java, PERL and R and the source code is available from Zenodo ( https://zenodo.org/record/50931 ). The software is freely available for non-commercial users. l.krause@uq.edu.au. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  5. Studying Gene and Gene-Environment Effects of Uncommon and Common Variants on Continuous Traits: A Marker-Set Approach Using Gene-Trait Similarity Regression

    PubMed Central

    Tzeng, Jung-Ying; Zhang, Daowen; Pongpanich, Monnat; Smith, Chris; McCarthy, Mark I.; Sale, Michèle M.; Worrall, Bradford B.; Hsu, Fang-Chi; Thomas, Duncan C.; Sullivan, Patrick F.

    2011-01-01

    Genomic association analyses of complex traits demand statistical tools that are capable of detecting small effects of common and rare variants and modeling complex interaction effects and yet are computationally feasible. In this work, we introduce a similarity-based regression method for assessing the main genetic and interaction effects of a group of markers on quantitative traits. The method uses genetic similarity to aggregate information from multiple polymorphic sites and integrates adaptive weights that depend on allele frequencies to accommodate common and uncommon variants. Collapsing information at the similarity level instead of the genotype level avoids canceling signals that have the opposite etiological effects and is applicable to any class of genetic variants without the need for dichotomizing the allele types. To assess gene-trait associations, we regress trait similarities for pairs of unrelated individuals on their genetic similarities and assess association by using a score test whose limiting distribution is derived in this work. The proposed regression framework allows for covariates, has the capacity to model both main and interaction effects, can be applied to a mixture of different polymorphism types, and is computationally efficient. These features make it an ideal tool for evaluating associations between phenotype and marker sets defined by linkage disequilibrium (LD) blocks, genes, or pathways in whole-genome analysis. PMID:21835306

  6. Time-resolved perfusion imaging at the angiography suite: preclinical comparison of a new flat-detector application to computed tomography perfusion.

    PubMed

    Jürgens, Julian H W; Schulz, Nadine; Wybranski, Christian; Seidensticker, Max; Streit, Sebastian; Brauner, Jan; Wohlgemuth, Walter A; Deuerling-Zheng, Yu; Ricke, Jens; Dudeck, Oliver

    2015-02-01

    The objective of this study was to compare the parameter maps of a new flat-panel detector application for time-resolved perfusion imaging in the angiography room (FD-CTP) with computed tomography perfusion (CTP) in an experimental tumor model. Twenty-four VX2 tumors were implanted into the hind legs of 12 rabbits. Three weeks later, FD-CTP (Artis zeego; Siemens) and CTP (SOMATOM Definition AS +; Siemens) were performed. The parameter maps for the FD-CTP were calculated using prototype software, and those for the CTP were calculated with VPCT-body software on a dedicated syngo MultiModality Workplace. The parameters were compared using Pearson product-moment correlation coefficient and linear regression analysis. The Pearson product-moment correlation coefficient showed good correlation values for both the intratumoral blood volume of 0.848 (P < 0.01) and the blood flow of 0.698 (P < 0.01). The linear regression analysis of the perfusion between FD-CTP and CTP showed for the blood volume a regression equation y = 4.44x + 36.72 (P < 0.01) and for the blood flow y = 0.75x + 14.61 (P < 0.01). This preclinical study provides evidence that FD-CTP allows a time-resolved (dynamic) perfusion imaging of tumors similar to CTP, which provides the basis for clinical applications such as the assessment of tumor response to locoregional therapies directly in the angiography suite.

  7. Relationship Between Column-Density and Surface Mixing Ratio: Statistical Analysis of O3 and NO2 Data from the July 2011 Maryland DISCOVER-AQ Mission

    NASA Technical Reports Server (NTRS)

    Flynn, Clare; Pickering, Kenneth E.; Crawford, James H.; Lamsol, Lok; Krotkov, Nickolay; Herman, Jay; Weinheimer, Andrew; Chen, Gao; Liu, Xiong; Szykman, James

    2014-01-01

    To investigate the ability of column (or partial column) information to represent surface air quality, results of linear regression analyses between surface mixing ratio data and column abundances for O3 and NO2 are presented for the July 2011 Maryland deployment of the DISCOVER-AQ mission. Data collected by the P-3B aircraft, ground-based Pandora spectrometers, Aura/OMI satellite instrument, and simulations for July 2011 from the CMAQ air quality model during this deployment provide a large and varied data set, allowing this problem to be approached from multiple perspectives. O3 columns typically exhibited a statistically significant and high degree of correlation with surface data (R² > 0.64) in the P-3B data set, a moderate degree of correlation (0.16 < R² < 0.64) in the CMAQ data set, and a low degree of correlation (R² < 0.16) in the Pandora and OMI data sets. NO2 columns typically exhibited a low to moderate degree of correlation with surface data in each data set. The results of linear regression analyses for O3 exhibited smaller errors relative to the observations than NO2 regressions. These results suggest that O3 partial column observations from future satellite instruments with sufficient sensitivity to the lower troposphere can be meaningful for surface air quality analysis.
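
    The per-dataset comparison reduces to simple linear regression between paired column and surface values; here is a sketch with simulated O3 pairs, reporting the R² statistic the study uses to grade agreement.

    ```python
    import numpy as np
    from scipy.stats import linregress

    rng = np.random.default_rng(6)
    column = rng.uniform(30, 60, 80)               # O3 partial column, arbitrary units
    surface = 1.2 * column + rng.normal(0, 5, 80)  # surface O3 mixing ratio, toy relation

    fit = linregress(column, surface)
    print(f"slope = {fit.slope:.2f}, R^2 = {fit.rvalue ** 2:.2f}")
    ```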

  8. External Tank Liquid Hydrogen (LH2) Prepress Regression Analysis Independent Review Technical Consultation Report

    NASA Technical Reports Server (NTRS)

    Parsons, Vickie S.

    2009-01-01

    The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center NESC on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.

  9. Belief in miracles and attitudes towards voluntary euthanasia.

    PubMed

    Sharp, Shane

    2017-04-01

    Results of logistic regression analysis of data from the General Social Survey (N = 1,799) find that those who have a strong belief in miracles are more likely to say that a person with an incurable illness should not be allowed to accept medical treatments that painlessly hasten death than those who have a less strong belief in miracles or do not believe in miracles, net of respondents' religious affiliations, frequency of religious attendance, views of the Bible, and other sociodemographic controls. Results highlight the need to consider specific religious beliefs when predicting individuals' attitudes towards voluntary euthanasia.

  10. Quantifying Main Trends in Lysozyme Nucleation: The Effects of Precipitant Concentration, Supersaturation and Impurities

    NASA Technical Reports Server (NTRS)

    Burke, Michael W.; Leardi, Riccardo; Judge, Russell A.; Pusey, Marc L.; Whitaker, Ann F. (Technical Monitor)

    2001-01-01

    Full factorial experimental design incorporating multi-linear regression analysis of the experimental data allows quick identification of main trends and effects using a limited number of experiments. In this study these techniques were employed to identify the effect of precipitant concentration, supersaturation, and the presence of an impurity, the physiological lysozyme dimer, on the nucleation rate and crystal dimensions of the tetragonal form of chicken egg white lysozyme. Decreasing precipitant concentration, increasing supersaturation, and increasing impurity were found to increase crystal numbers. The crystal axial ratio decreased with increasing precipitant concentration, independent of impurity.
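
    A sketch of a two-level full factorial design with a main-effects multiple linear regression, the screening approach described here; the factor names follow the abstract, while the crystal counts are simulated.

    ```python
    import itertools
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # coded levels (-1, +1) for precipitant, supersaturation, and dimer impurity
    design = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)
    rng = np.random.default_rng(7)
    counts = (50 - 8 * design[:, 0] + 10 * design[:, 1] + 5 * design[:, 2]
              + rng.normal(0, 1, 8))  # toy crystal numbers

    effects = LinearRegression().fit(design, counts).coef_
    for name, eff in zip(["precipitant", "supersaturation", "impurity"], effects):
        print(f"{name}: {eff:+.1f} crystals per coded unit")
    ```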

  11. Refining Stimulus Parameters in Assessing Infant Speech Perception Using Visual Reinforcement Infant Speech Discrimination: Sensation Level.

    PubMed

    Uhler, Kristin M; Baca, Rosalinda; Dudas, Emily; Fredrickson, Tammy

    2015-01-01

    Speech perception measures have long been considered an integral piece of the audiological assessment battery. Currently, a prelinguistic, standardized measure of speech perception is missing in the clinical assessment battery for infants and young toddlers. Such a measure would allow systematic assessment of speech perception abilities of infants as well as the potential to investigate the impact early identification of hearing loss and early fitting of amplification have on the auditory pathways. To investigate the impact of sensation level (SL) on the ability of infants with normal hearing (NH) to discriminate /a-i/ and /ba-da/ and to determine if performance on the two contrasts is significantly different in predicting the discrimination criterion. The design was based on a survival analysis model for event occurrence and a repeated measures logistic model for binary outcomes. The outcome for survival analysis was the minimum SL for criterion and the outcome for the logistic regression model was the presence/absence of achieving the criterion. Criterion achievement was designated when an infant's proportion correct score was >0.75 on the discrimination performance task. Twenty-two infants with NH sensitivity participated in this study. There were 9 males and 13 females, aged 6-14 mo. Testing took place over two to three sessions. The first session consisted of a hearing test, threshold assessment of the two speech sounds (/a/ and /i/), and if time and attention allowed, visual reinforcement infant speech discrimination (VRISD). The second session consisted of VRISD assessment for the two test contrasts (/a-i/ and /ba-da/). The presentation level started at 50 dBA. If the infant was unable to successfully achieve criterion (>0.75) at 50 dBA, the presentation level was increased to 70 dBA followed by 60 dBA. Data examination included an event analysis, which provided the probability of criterion distribution across SL. The second stage of the analysis was a repeated measures logistic regression where SL and contrast were used to predict the likelihood of speech discrimination criterion. Infants were able to reach criterion for the /a-i/ contrast at statistically lower SLs when compared to /ba-da/. There were six infants who never reached criterion for /ba-da/ and one never reached criterion for /a-i/. The conditional probability of not reaching criterion by 70 dB SL was 0% for /a-i/ and 21% for /ba-da/. The predictive logistic regression model showed that children were more likely to discriminate the /a-i/ contrast even when controlling for SL. Nearly all normal-hearing infants can demonstrate discrimination criterion of a vowel contrast at 60 dB SL, while a level of ≥70 dB SL may be needed to allow all infants to demonstrate discrimination criterion of a difficult consonant contrast. American Academy of Audiology.

  12. Alcohol consumption and its related harms in The Netherlands since 1960: relationships with planned and unplanned factors.

    PubMed

    Knibbe, Ronald A; Derickx, Mieke; Allamani, Allaman; Massini, Giulia

    2014-10-01

    To establish which unplanned (social developments) and planned (alcohol policy measures) factors are related to per capita consumption and alcohol-related harms in the Netherlands. Linear regression was used to establish which of the planned and unplanned factors were most strongly connected with alcohol consumption and harms. Artificial Neural Network (ANN) analysis was used to inspect the interconnections between all variables. Mothers' age at birth was most strongly associated with the increase in consumption. The ban on selling alcoholic beverages at petrol stations was associated with a decrease in consumption. The linear regression of harms did not show any relation between alcohol policy measures and harms. The ANN analyses indicate a very high interconnectedness between all variables, allowing no causal inferences. Exceptions are the relation between the price of beer and wine and the consumption of these beverages, and the relation between a decrease in transport mortality and the increased use of breathalyzer tests and a restriction of paracommercial selling. Unplanned factors are most strongly associated with per capita consumption and harms. ANN analysis indicates that the price of alcoholic beverages, breath testing, and restriction of sales may have had some influence. The study's limitations are noted.

  13. Optimisation of the formulation of a bubble bath by a chemometric approach market segmentation and optimisation.

    PubMed

    Marengo, Emilio; Robotti, Elisa; Gennaro, Maria Carla; Bertetto, Mariella

    2003-03-01

    The optimisation of the formulation of a commercial bubble bath was performed by chemometric analysis of Panel Test results. A first Panel Test was performed to choose the best essence among the four proposed to the consumers; the essence chosen was used in the revised commercial bubble bath. Afterwards, the effect of changing the amount of four components of the bubble bath (the amount of primary surfactant, the essence, the hydratant and the colouring agent) was studied by a fractional factorial design. The segmentation of the bubble bath market was performed by a second Panel Test, in which the consumers were requested to evaluate the samples coming from the experimental design. The results were then treated by Principal Component Analysis. The market had two segments: people preferring a product with a rich formulation and people preferring a poor product. The final target, i.e. the optimisation of the formulation for each segment, was obtained by the calculation of regression models relating the subjective evaluations given by the Panel to the compositions of the samples. The regression models allowed the identification of the best formulations for the two segments of the market.

  14. Predictive sparse modeling of fMRI data for improved classification, regression, and visualization using the k-support norm.

    PubMed

    Belilovsky, Eugene; Gkirtzou, Katerina; Misyrlis, Michail; Konova, Anna B; Honorio, Jean; Alia-Klein, Nelly; Goldstein, Rita Z; Samaras, Dimitris; Blaschko, Matthew B

    2015-12-01

    We explore various sparse regularization techniques for analyzing fMRI data, such as the ℓ1 norm (often called LASSO in the context of a squared loss function), elastic net, and the recently introduced k-support norm. Employing sparsity regularization allows us to handle the curse of dimensionality, a problem commonly found in fMRI analysis. In this work we consider sparse regularization in both the regression and classification settings. We perform experiments on fMRI scans from cocaine-addicted as well as healthy control subjects. We show that in many cases, use of the k-support norm leads to better predictive performance, solution stability, and interpretability as compared to other standard approaches. We additionally analyze the advantages of using the absolute loss function versus the standard squared loss which leads to significantly better predictive performance for the regularization methods tested in almost all cases. Our results support the use of the k-support norm for fMRI analysis and on the clinical side, the generalizability of the I-RISA model of cocaine addiction. Copyright © 2015 Elsevier Ltd. All rights reserved.
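
    Of the regularizers discussed, the ℓ1 (LASSO) and elastic net penalties are available off the shelf; below is a sketch on a toy features-outnumber-scans problem. The k-support norm itself is not in scikit-learn, so only the standard comparators are shown.

    ```python
    import numpy as np
    from sklearn.linear_model import Lasso, ElasticNet

    rng = np.random.default_rng(8)
    n, p = 60, 500                      # far fewer scans than features, as in fMRI
    X = rng.normal(0, 1, (n, p))
    beta = np.zeros(p)
    beta[:5] = 2.0                      # only five informative features
    y = X @ beta + rng.normal(0, 1, n)

    for model in (Lasso(alpha=0.1), ElasticNet(alpha=0.1, l1_ratio=0.5)):
        fit = model.fit(X, y)
        print(type(model).__name__, "non-zero coefficients:", int((fit.coef_ != 0).sum()))
    ```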

  15. Effect of acute hypoxia on cognition: A systematic review and meta-regression analysis.

    PubMed

    McMorris, Terry; Hale, Beverley J; Barwood, Martin; Costello, Joseph; Corbett, Jo

    2017-03-01

    A systematic meta-regression analysis of the effects of acute hypoxia on the performance of central executive and non-executive tasks, and the effects of the moderating variables, arterial partial pressure of oxygen (PaO2) and hypobaric versus normobaric hypoxia, was undertaken. Studies were included if they were performed on healthy humans; within-subject design was used; data were reported giving the PaO2 or that allowed the PaO2 to be estimated (e.g. arterial oxygen saturation and/or altitude); and the duration of being in a hypoxic state prior to cognitive testing was ≤6 days. Twenty-two experiments met the criteria for inclusion and demonstrated a moderate, negative mean effect size (g = -0.49, 95% CI -0.64 to -0.34, p < 0.001). There were no significant differences between central executive and non-executive, perception/attention and short-term memory, tasks. Low (35-60 mmHg) PaO2 was the key predictor of cognitive performance (R² = 0.45, p < 0.001) and this was independent of whether the exposure was in hypobaric hypoxic or normobaric hypoxic conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.

    PubMed

    Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu

    2015-06-01

    High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.

  17. Statistical evaluation of stability data: criteria for change-over-time and data variability.

    PubMed

    Bar, Raphael

    2003-01-01

    In a recently issued ICH Q1E guidance on evaluation of stability data of drug substances and products, the need to perform a statistical extrapolation of a shelf-life of a drug product or a retest period for a drug substance is based heavily on whether data exhibit a change-over-time and/or variability. However, this document suggests neither measures nor acceptance criteria of these two parameters. This paper demonstrates a useful application of simple statistical parameters for determining whether sets of stability data from either accelerated or long-term storage programs exhibit a change-over-time and/or variability. These parameters are all derived from a simple linear regression analysis first performed on the stability data. The p-value of the slope of the regression line is taken as a measure for change-over-time, and a value of 0.25 is suggested as a limit to insignificant change of the quantitative stability attributes monitored. The minimal process capability index, Cpk, calculated from the standard deviation of the regression line, is suggested as a measure for variability with a value of 2.5 as a limit for an insignificant variability. The usefulness of the above two parameters, p-value and Cpk, was demonstrated on stability data of a refrigerated drug product and on pooled data of three batches of a drug substance. In both cases, the determined parameters allowed characterization of the data in terms of change-over-time and variability. Consequently, complete evaluation of the stability data could be pursued according to the ICH guidance. It is believed that the application of the above two parameters with their acceptance criteria will allow a more unified evaluation of stability data.
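
    A sketch of the two screening statistics proposed: the p-value of the regression slope as the change-over-time measure, and a Cpk built from the regression standard deviation as the variability measure. The assay values and specification limits are illustrative.

    ```python
    import numpy as np
    from scipy.stats import linregress

    months = np.array([0, 3, 6, 9, 12, 18, 24], dtype=float)
    assay = np.array([100.1, 99.8, 100.0, 99.7, 99.9, 99.5, 99.6])  # % of label claim
    lsl, usl = 95.0, 105.0                                          # specification limits

    fit = linregress(months, assay)
    resid_sd = np.std(assay - (fit.intercept + fit.slope * months), ddof=2)
    mean = assay.mean()
    cpk = min(usl - mean, mean - lsl) / (3 * resid_sd)

    print(f"slope p-value = {fit.pvalue:.3f}  (> 0.25 suggests insignificant change-over-time)")
    print(f"Cpk = {cpk:.1f}  (>= 2.5 suggests insignificant variability)")
    ```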

  18. Explaining the heterogeneous scrapie surveillance figures across Europe: a meta-regression approach.

    PubMed

    Del Rio Vilas, Victor J; Hopp, Petter; Nunes, Telmo; Ru, Giuseppe; Sivam, Kumar; Ortiz-Pelaez, Angel

    2007-06-28

    Two annual surveys, the abattoir and the fallen stock, monitor the presence of scrapie across Europe. A simple comparison between the prevalence estimates in different countries reveals that, in 2003, the abattoir survey appears to detect more scrapie in some countries. This is contrary to evidence suggesting the greater ability of the fallen stock survey to detect the disease. We applied meta-analysis techniques to study this apparent heterogeneity in the behaviour of the surveys across Europe. Furthermore, we conducted a meta-regression analysis to assess the effect of country-specific characteristics on the variability. We have chosen the odds ratios between the two surveys to inform the underlying relationship between them and to allow comparisons between the countries under the meta-regression framework. Baseline risks, those of the slaughtered populations across Europe, and country-specific covariates, available from the European Commission Report, were inputted in the model to explain the heterogeneity. Our results show the presence of significant heterogeneity in the odds ratios between countries and no reduction in the variability after adjustment for the different risks in the baseline populations. Three countries contributed the most to the overall heterogeneity: Germany, Ireland and The Netherlands. The inclusion of country-specific covariates did not, in general, reduce the variability except for one variable: the proportion of the total adult sheep population sampled as fallen stock by each country. A large residual heterogeneity remained in the model indicating the presence of substantial effect variability between countries. The meta-analysis approach was useful to assess the level of heterogeneity in the implementation of the surveys and to explore the reasons for the variation between countries.

  19. Malignant testicular tumour incidence and mortality trends

    PubMed Central

    Wojtyła-Buciora, Paulina; Więckowska, Barbara; Krzywinska-Wiewiorowska, Małgorzata; Gromadecka-Sutkiewicz, Małgorzata

    2016-01-01

    Aim of the study In Poland testicular tumours are the most frequent cancer among men aged 20–44 years. Testicular tumour incidence since the 1980s and 1990s has been diversified geographically, with an increased risk of mortality in Wielkopolska Province, which was highlighted at the turn of the 1980s and 1990s. The aim of the study was the comparative analysis of the tendencies in incidence and death rates due to malignant testicular tumours observed among men in Poland and in Wielkopolska Province. Material and methods Data from the National Cancer Registry were used for calculations. The incidence/mortality rates among men due to malignant testicular cancer as well as the tendencies in incidence/death ratio observed in Poland and Wielkopolska were established based on regression equations. The analysis was deepened by adopting the multiple linear regression model. A p-value < 0.05 was arbitrarily adopted as the criterion of statistical significance, and for multiple comparisons it was modified according to the Bonferroni adjustment to a value of p < 0.0028. Calculations were performed with the use of PQStat v1.4.8 package. Results The incidence of malignant testicular neoplasms observed among men in Poland and in Wielkopolska Province indicated a significant rising tendency. The multiple linear regression model confirmed that the year variable is a strong incidence forecast factor only within the territory of Poland. A corresponding analysis of mortality rates among men in Poland and in Wielkopolska Province did not show any statistically significant correlations. Conclusions Late diagnosis of Polish patients calls for undertaking appropriate educational activities that would facilitate earlier reporting of the patients, thus increasing their chances for recovery. Introducing preventive examinations in the regions of increased risk of testicular tumour may allow earlier diagnosis. PMID:27095941

  20. Pre and post PET-CT impact on oesophageal cancer management: a retrospective analysis.

    NASA Astrophysics Data System (ADS)

    Azmi, NA; Razak, HRA; Vinjamuri, S.

    2017-05-01

    Assessment of retrospective cancer incidence, prevalence, and crude survival rates for oesophageal cancer, allowing comparison between the periods before and after PET-CT introduction, forms part of a four-phase cost-effectiveness research project. It provides baseline data for assessing the potential cost-effectiveness of PET or PET-CT for staging. Data for a total of 849 patients with various stages of oesophageal cancer between 2001 and 2008 were received from NWCIS databases. The fundamental activity is retrospective analysis of patient data. Where appropriate, results are presented with 95 percent confidence intervals (CI). Differences between patient groups and variables were assessed using the chi-square test. Where necessary, multiple logistic regression was used to adjust for potential confounders such as age and sex. All p-values are two-sided, and any value lower than 0.05 was considered to indicate a statistically significant result. Patients were categorised into two groups: 2001-2003 (pre PET) and 2004-2008 (post PET). This categorisation allows a better comparison of patients' survival trends between the two groups. Rates are presented as percentages, grouped by tumour characteristics and other variables associated with demographic profile, diagnosis, staging, and treatment. The results allowed comparison of oesophageal cancer trends between the pre and post PET-CT periods, such as changes in incidence or survival. These data were used to normalise the decision tree model so that cost-effectiveness analysis could be performed across the whole population.

  1. Forecasting daily patient volumes in the emergency department.

    PubMed

    Jones, Spencer S; Thomas, Alun; Evans, R Scott; Welch, Shari J; Haug, Peter J; Snow, Gregory L

    2008-02-01

    Shifts in the supply of and demand for emergency department (ED) resources make the efficient allocation of ED resources increasingly important. Forecasting is a vital activity that guides decision-making in many areas of economic, industrial, and scientific planning, but has gained little traction in the health care industry. There are few studies that explore the use of forecasting methods to predict patient volumes in the ED. The goals of this study are to explore and evaluate the use of several statistical forecasting methods to predict daily ED patient volumes at three diverse hospital EDs and to compare the accuracy of these methods to the accuracy of a previously proposed forecasting method. Daily patient arrivals at three hospital EDs were collected for the period January 1, 2005, through March 31, 2007. The authors evaluated the use of seasonal autoregressive integrated moving average, time series regression, exponential smoothing, and artificial neural network models to forecast daily patient volumes at each facility. Forecasts were made for horizons ranging from 1 to 30 days in advance. The forecast accuracy achieved by the various forecasting methods was compared to the forecast accuracy achieved when using a benchmark forecasting method already available in the emergency medicine literature. All time series methods considered in this analysis provided improved in-sample model goodness of fit. However, post-sample analysis revealed that time series regression models that augment linear regression models by accounting for serial autocorrelation offered only small improvements in terms of post-sample forecast accuracy, relative to multiple linear regression models, while seasonal autoregressive integrated moving average, exponential smoothing, and artificial neural network forecasting models did not provide consistently accurate forecasts of daily ED volumes. This study confirms the widely held belief that daily demand for ED services is characterized by seasonal and weekly patterns. The authors compared several time series forecasting methods to a benchmark multiple linear regression model. The results suggest that the existing methodology proposed in the literature, multiple linear regression based on calendar variables, is a reasonable approach to forecasting daily patient volumes in the ED. However, the authors conclude that regression-based models that incorporate calendar variables, account for site-specific special-day effects, and allow for residual autocorrelation provide a more appropriate, informative, and consistently accurate approach to forecasting daily ED patient volumes.
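    The benchmark the study endorses, multiple linear regression on calendar variables, is straightforward to reproduce; a minimal sketch on simulated daily arrivals follows, with a Durbin-Watson check on residual autocorrelation (all data and effect sizes below are invented for illustration).

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.stattools import durbin_watson

    # Simulated daily ED arrivals with weekly and yearly structure, a
    # stand-in for real census data.
    rng = np.random.default_rng(0)
    dates = pd.date_range("2005-01-01", "2007-03-31", freq="D")
    dow, month = dates.dayofweek, dates.month
    arrivals = (120 + 10 * (dow == 0) - 8 * (dow >= 5)
                + 5 * np.sin(2 * np.pi * month / 12)
                + rng.normal(0, 6, len(dates)))
    df = pd.DataFrame({"arrivals": arrivals, "dow": dow, "month": month})

    # Benchmark multiple linear regression on calendar variables.
    fit = smf.ols("arrivals ~ C(dow) + C(month)", data=df).fit()
    print("R^2:", round(fit.rsquared, 3))

    # Durbin-Watson near 2 suggests little residual autocorrelation; values
    # well below 2 would motivate the regression-with-AR-errors variants
    # the study compares.
    print("Durbin-Watson:", durbin_watson(fit.resid))
    ```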

  2. Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)

    DTIC Science & Technology

    1987-10-01

    Table of contents fragment (recovered from the scanned report): Repair FADAC Printed Circuit Board; 3. Data Analysis Techniques: a. Multiple Linear Regression; Analysis/Discussion: 1. Example of Regression Analysis; 2. Regression results for all tasks; Table 9. Task Grouping for Analysis; Table 10. Remove/Replace H60A3 Power Pack.

  3. Regression analysis utilizing subjective evaluation of emotional experience in PET studies on emotions.

    PubMed

    Aalto, Sargo; Wallius, Esa; Näätänen, Petri; Hiltunen, Jaana; Metsähonkala, Liisa; Sipilä, Hannu; Karlsson, Hasse

    2005-09-01

    A methodological study on subject-specific regression analysis (SSRA) exploring the correlation between the neural response and the subjective evaluation of emotional experience in eleven healthy females is presented. The target emotions, i.e., amusement and sadness, were induced using validated film clips, regional cerebral blood flow (rCBF) was measured using positron emission tomography (PET), and the subjective intensity of the emotional experience during the PET scanning was measured using a category ratio (CR-10) scale. Reliability analysis of the rating data indicated that the subjects rated the intensity of their emotional experience fairly consistently on the CR-10 scale (Cronbach alphas 0.70-0.97). A two-phase random-effects analysis was performed to ensure the generalizability and inter-study comparability of the SSRA results. Random-effects SSRAs using Statistical non-Parametric Mapping 99 (SnPM99) showed that rCBF correlated with the self-rated intensity of the emotional experience mainly in the brain regions that were identified in the random-effects subtraction analyses using the same imaging data. Our results give preliminary evidence of a linear association between the neural responses related to amusement and sadness and the self-evaluated intensity of the emotional experience in several regions involved in the emotional response. SSRA utilizing subjective evaluation of emotional experience turned out to be a feasible and promising method of analysis. It allows versatile exploration of the neurobiology of emotions and the neural correlates of actual and individual emotional experience. Thus, SSRA might be able to capture the idiosyncratic aspects of the emotional response better than traditional subtraction analysis.
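    The two-phase random-effects logic, a within-subject regression followed by a second-level test on the subject slopes, can be illustrated schematically; the numbers below are simulated and stand in for the study's rCBF and rating data.

    ```python
    import numpy as np
    from scipy import stats

    # Phase 1: regress the response on the intensity rating within each
    # subject. Phase 2: test the subject-level slopes against zero.
    rng = np.random.default_rng(0)
    slopes = []
    for _ in range(11):                      # 11 subjects, as in the study
        intensity = rng.uniform(0, 10, 8)    # CR-10 ratings over 8 scans
        rcbf = 50 + 0.8 * intensity + rng.normal(0, 2, 8)
        slopes.append(np.polyfit(intensity, rcbf, 1)[0])

    t, p = stats.ttest_1samp(slopes, 0.0)
    print(f"mean slope = {np.mean(slopes):.2f}, t = {t:.2f}, p = {p:.4f}")
    ```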

  4. AGSuite: Software to conduct feature analysis of artificial grammar learning performance.

    PubMed

    Cook, Matthew T; Chubala, Chrissy M; Jamieson, Randall K

    2017-10-01

    To simplify the problem of studying how people learn natural language, researchers use the artificial grammar learning (AGL) task. In this task, participants study letter strings constructed according to the rules of an artificial grammar and subsequently attempt to discriminate grammatical from ungrammatical test strings. Although the data from these experiments are usually analyzed by comparing the mean discrimination performance between experimental conditions, this practice discards information about the individual items and participants that could otherwise help uncover the particular features of strings associated with grammaticality judgments. However, feature analysis is tedious to compute, often complicated, and ill-defined in the literature. Moreover, the data violate the assumption of independence underlying standard linear regression models, leading to Type I error inflation. To solve these problems, we present AGSuite, a free Shiny application for researchers studying AGL. The suite's intuitive Web-based user interface allows researchers to generate strings from a database of published grammars, compute feature measures (e.g., Levenshtein distance) for each letter string, and conduct a feature analysis on the strings using linear mixed effects (LME) analyses. The LME analysis solves the inflation of Type I errors that afflicts more common methods of repeated measures regression analysis. Finally, the software can generate a number of graphical representations of the data to support an accurate interpretation of results. We hope the ease and availability of these tools will encourage researchers to take full advantage of item-level variance in their datasets in the study of AGL. We moreover discuss the broader applicability of the tools for researchers looking to conduct feature analysis in any field.
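    As a rough sketch of the kind of feature analysis AGSuite automates, one can compute an edit-distance feature per test string and fit a random-intercept mixed model. The grammar, strings, and endorsement data below are simulated, and a linear (rather than generalized) mixed model is used for simplicity.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    def levenshtein(s: str, t: str) -> int:
        """Dynamic-programming edit distance between two strings."""
        d = np.zeros((len(s) + 1, len(t) + 1), dtype=int)
        d[:, 0] = np.arange(len(s) + 1)
        d[0, :] = np.arange(len(t) + 1)
        for i in range(1, len(s) + 1):
            for j in range(1, len(t) + 1):
                cost = int(s[i - 1] != t[j - 1])
                d[i, j] = min(d[i-1, j] + 1, d[i, j-1] + 1, d[i-1, j-1] + cost)
        return int(d[-1, -1])

    # Synthetic trial-level data: 10 participants x 6 test strings, with the
    # feature of interest being distance to the nearest training string.
    rng = np.random.default_rng(1)
    training = ["MXRVXT", "VXTVTM", "MXRTTV"]        # illustrative strings
    tests = ["".join(rng.choice(list("MXRVT"), 6)) for _ in range(6)]
    rows = [(f"p{i}", min(levenshtein(s, t) for t in training))
            for i in range(10) for s in tests]
    df = pd.DataFrame(rows, columns=["participant", "lev"])
    df["endorsed"] = (df["lev"] + rng.normal(0, 1.0, len(df)) < 3).astype(int)

    # Random-intercept mixed model, a linear-probability simplification of
    # the item/participant LME analyses the suite performs.
    fit = smf.mixedlm("endorsed ~ lev", df, groups=df["participant"]).fit()
    print(fit.summary())
    ```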

  5. Inter-Method Reliability of School Effectiveness Measures: A Comparison of Value-Added and Regression Discontinuity Estimates

    ERIC Educational Resources Information Center

    Perry, Thomas

    2017-01-01

    Value-added (VA) measures are currently the predominant approach used to compare the effectiveness of schools. Recent educational effectiveness research, however, has developed alternative approaches including the regression discontinuity (RD) design, which also allows estimation of absolute school effects. Initial research suggests RD is a viable…

  6. LOCATING NEARBY SOURCES OF AIR POLLUTION BY NONPARAMETRIC REGRESSION OF ATMOSPHERIC CONCENTRATIONS ON WIND DIRECTION. (R826238)

    EPA Science Inventory

    The relationship of the concentration of air pollutants to wind direction has been determined by nonparametric regression using a Gaussian kernel. The results are smooth curves with error bars that allow for the accurate determination of the wind direction where the concentrat...
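    A minimal sketch of the underlying idea: Nadaraya-Watson regression of concentration on wind direction with a Gaussian kernel applied to the wrapped angular difference. The data, bandwidth, and source direction below are synthetic assumptions, not the study's.

    ```python
    import numpy as np

    def kernel_regression(theta_obs, conc_obs, theta_grid, bw_deg=15.0):
        """Nadaraya-Watson smoother of concentration vs wind direction."""
        diff = np.deg2rad(theta_grid[:, None] - theta_obs[None, :])
        # Wrap differences so 359 deg and 1 deg are treated as 2 deg apart.
        diff = np.arctan2(np.sin(diff), np.cos(diff))
        w = np.exp(-0.5 * (np.rad2deg(diff) / bw_deg) ** 2)
        return (w @ conc_obs) / w.sum(axis=1)

    # Synthetic example: a source roughly upwind at 120 degrees.
    rng = np.random.default_rng(0)
    theta = rng.uniform(0, 360, 500)
    conc = 5 + 20 * np.exp(-0.5 * ((theta - 120) / 10) ** 2) \
             + rng.normal(0, 1, 500)
    grid = np.arange(0, 360, 2.0)
    smooth = kernel_regression(theta, conc, grid)
    print("peak direction:", grid[np.argmax(smooth)])   # near 120 degrees
    ```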

  7. Creep-Rupture Data Analysis - Engineering Application of Regression Techniques. Ph.D. Thesis - North Carolina State Univ.

    NASA Technical Reports Server (NTRS)

    Rummler, D. R.

    1976-01-01

    The results are presented of investigations to apply regression techniques to the development of methodology for creep-rupture data analysis. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for space shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.

  8. Multilevel covariance regression with correlated random effects in the mean and variance structure.

    PubMed

    Quintero, Adrian; Lesaffre, Emmanuel

    2017-09-01

    Multivariate regression methods generally assume a constant covariance matrix for the observations. When a heteroscedastic model is needed, the parametric and nonparametric covariance regression approaches in the literature can be restrictive. We propose a multilevel regression model for the mean and covariance structure, including random intercepts in both components and allowing for correlation between them. The implied conditional covariance function can differ across clusters as a result of the random effect in the variance structure. In addition, allowing for correlation between the random intercepts in the mean and covariance makes the model convenient for skewed response distributions. Furthermore, it permits direct analysis of the relation between the mean response level and the variability in each cluster. Parameter estimation is carried out via Gibbs sampling. We compare the performance of our model to other covariance modelling approaches in a simulation study. Finally, the proposed model is applied to the RN4CAST dataset to identify the variables that impact burnout of nurses in Belgium. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Resting-state functional magnetic resonance imaging: the impact of regression analysis.

    PubMed

    Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi

    2015-01-01

    To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
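    Global signal regression itself is a simple operation, sketched below on synthetic time courses: regress each voxel on the global mean and keep the residuals. The toy example also reproduces the induced anticorrelation the authors caution about; the data are random, not fMRI.

    ```python
    import numpy as np

    def regress_out_global(ts):
        """Remove the global mean signal from each voxel time course by OLS.
        ts: array of shape (n_timepoints, n_voxels)."""
        g = ts.mean(axis=1, keepdims=True)        # global mean time course
        X = np.hstack([np.ones_like(g), g])       # intercept + global signal
        beta, *_ = np.linalg.lstsq(X, ts, rcond=None)
        return ts - X @ beta                      # residual time courses

    # Toy data: 200 time points, 1000 voxels sharing a common fluctuation.
    rng = np.random.default_rng(0)
    common = rng.normal(size=(200, 1))
    ts = 0.5 * common + rng.normal(size=(200, 1000))
    clean = regress_out_global(ts)

    # After global regression, pairwise correlations are pushed negative on
    # average: the anticorrelation artefact discussed in the abstract.
    c = np.corrcoef(clean.T)
    print((c.sum() - np.trace(c)) / (c.size - len(c)))
    ```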

  10. Changes in Georgia restaurant and bar smoking policies from 2006 to 2012.

    PubMed

    Chandora, Rachna D; Whitney, Carrie F; Weaver, Scott R; Eriksen, Michael P

    2015-05-14

    The purpose of this study is to examine the change in smoking policy status among Georgia restaurants and bars from 2006 to 2012 and to identify restaurant and bar characteristics that are associated with allowing smoking. Data were obtained from similar cross-sectional indoor air surveys conducted in 2006 and 2012 in Georgia. Both surveys were designed to gather information about restaurant and bar smoking policies. Weighted χ² analyses were performed to identify changes in smoking policy status and other variables from 2006 to 2012. Weighted logistic regression analysis was used to test for significant associations between an establishment's smoking policy and other characteristics. The percentage of restaurants and bars in Georgia that allowed smoking nearly doubled, from 9.1% in 2006 to 17.6% in 2012. The analyses also showed a significant increase in the percentage of establishments that allow smoking when minors are present. Having a liquor license was a significant predictor of allowing smoking. The Smokefree Air Act was enacted in 2005 to protect the health and welfare of Georgia citizens, but study results suggest that policy makers should reevaluate the law and consider strengthening it to make restaurants and bars 100% smokefree without exemptions.

  11. Principal polynomial analysis.

    PubMed

    Laparra, Valero; Jiménez, Sandra; Tuia, Devis; Camps-Valls, Gustau; Malo, Jesus

    2014-11-01

    This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves, instead of straight lines. Contrary to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA shows a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, such an inverse can be obtained in closed form. Invertibility is an important advantage over other learning methods, because it permits understanding of the identified features in the input domain, where the data have physical meaning. Moreover, it allows evaluation of the performance of dimensionality reduction in sensible (input-domain) units. Volume preservation also allows an easy computation of information theoretic quantities, such as the reduction in multi-information after the transform. Third, the analytical nature of PPA leads to a clear geometrical interpretation of the manifold: it allows the computation of Frenet-Serret frames (local features) and of generalized curvatures at any point of the space. And fourth, the analytical Jacobian allows the computation of the metric induced by the data, thus generalizing the Mahalanobis distance. These properties are demonstrated theoretically and illustrated experimentally. The performance of PPA is evaluated in dimensionality and redundancy reduction, on both synthetic and real datasets from the UCI repository.
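    A schematic of a single PPA-style step, under simplifying assumptions: take the leading principal direction as in PCA, then fit a univariate polynomial predicting the orthogonal residual from the projection, bending the component along the data's curvature. This illustrates the idea only, not the authors' full algorithm.

    ```python
    import numpy as np

    # Synthetic curved manifold: points near a parabola with small noise.
    rng = np.random.default_rng(0)
    t = rng.uniform(-2, 2, 400)
    X = np.c_[t, 0.6 * t**2] + rng.normal(0, 0.05, (400, 2))
    Xc = X - X.mean(axis=0)

    # Leading principal direction (ordinary PCA step).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    proj = Xc @ Vt[0]                # scores along the first direction
    resid = Xc @ Vt[1]               # orthogonal residual coordinate

    # Univariate polynomial regression of the residual on the projection:
    # the "principal polynomial" that curves the first component.
    coeffs = np.polyfit(proj, resid, deg=2)
    pred = np.polyval(coeffs, proj)
    print("residual variance before/after:",
          resid.var().round(4), (resid - pred).var().round(4))
    ```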

  12. Factors associated with furry pet ownership among patients with asthma.

    PubMed

    Downes, Martin J; Roy, Angkana; McGinn, Thomas G; Wisnivesky, Juan P

    2010-09-01

    Exposure to indoor allergens is an established risk factor for poor asthma control. Current guidelines recommend removing pets from the home of patients with asthma. This cross-sectional study was conducted to determine the prevalence of furry pet ownership in asthmatics compared to non-asthmatics and to identify factors associated with furry pet ownership among those with asthma. Secondary analysis assessed characteristics among asthmatics that might be associated with allowing a furry pet into the bedroom. Using data from The National Asthma Survey collected from 2003 to 2004, we carried out univariate and multiple regression analyses, in 2009, to identify independent predictors of furry pet ownership in asthma sufferers after controlling for potential confounders. Overall, asthmatics were more likely to own a furry pet than nonasthmatic individuals in the general population (49.9% versus 44.8%, p < .001). Multivariate analysis showed that female sex, older age, white race, and high income were independent predictors of furry pet ownership among asthmatics. Additionally, 68.7% of patients with asthma who own a furry pet allowed them into their bedroom. Higher income and carrying out ≤ 2 environmental control practices in the home were associated with increased likelihood of allowing a furry pet into the bedroom. Furry pet ownership is equally or more common among asthmatics compared to those without asthma. The majority of asthmatics with furry pets allow them into the bedroom. Recognizing and addressing these problems may help decrease asthma morbidity.

  13. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  14. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    PubMed

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.

  15. Species Composition at the Sub-Meter Level in Discontinuous Permafrost in Subarctic Sweden

    NASA Astrophysics Data System (ADS)

    Anderson, S. M.; Palace, M. W.; Layne, M.; Varner, R. K.; Crill, P. M.

    2013-12-01

    Northern latitudes are experiencing rapid warming. Wetlands underlain by permafrost are particularly vulnerable to warming, which results in changes in vegetative cover. Specific species have been associated with greenhouse gas emissions; knowledge of shifts in species composition therefore allows changes in such emissions to be quantified. Species composition varies on the sub-meter scale based on topography and other microsite environmental parameters. This complexity, and the need to scale vegetation to the landscape level, is vital to our estimation of carbon dioxide (CO2) and methane (CH4) emissions and dynamics. Stordalen Mire (68°21'N, 18°49'E), near Abisko, is located at the edge of the discontinuous permafrost zone, providing a unique opportunity to analyze multiple vegetation communities in close proximity. We randomly selected 25 1x1 m plots representative of five major cover types: semi-wet, wet, hummock, tall graminoid, and tall shrub. We used a quadrat with 64 sub-plots and measured areal percent cover for 24 species, and calculated species richness and the Shannon index to characterize diversity within plots. We collected ground-based remote sensing (RS) imagery at each plot using an ADC-lite (near infrared, red, green) and a GoPro (red, blue, green), normalizing each image against a Teflon white chip placed in the frame. Textural analysis was conducted on each image for entropy, angular second moment, and lacunarity. We found statistically significant differences in species composition and diversity indices between vegetation cover types and determined which species were indicative of each cover type. A logistic regression on the RS parameters was able to significantly classify vegetation cover types, and a multiple linear regression with forward stepwise variable selection indicated that Betula nana (dwarf birch) (r² = 0.48, p < 0.0001) and Sphagnum (r² = 0.59, p < 0.0001) were significantly related to the RS parameters. This ground-based remote sensing approach allows quick quantification of vegetation cover and species, and provides a framework for scaling to satellite image data to estimate species composition and shifts at the landscape level in northern-latitude wetlands.

  16. cp-R, an interface to the R programming language for clinical laboratory method comparisons.

    PubMed

    Holmes, Daniel T

    2015-02-01

    Clinical scientists frequently need to compare two different bioanalytical methods as part of assay validation/monitoring. As a matter of necessity, regression methods for quantitative comparison in clinical chemistry, hematology, and other clinical laboratory disciplines must allow for error in both the x and y variables. Traditionally, the methods popularized by 1) Deming and 2) Passing and Bablok have been recommended. While commercial tools exist, no simple open-source tool is available. The purpose of this work was to develop an entirely open-source, GUI-driven program for bioanalytical method comparisons capable of performing these regression methods and able to produce highly customized graphical output. The GUI is written in Python and PyQt4, with R scripts performing the regression and graphical functions. The program can be run from source code or as a pre-compiled binary executable. The software performs three forms of regression and offers weighting where applicable. Confidence bands of the regression are calculated using bootstrapping for the Deming and Passing-Bablok methods. Users can customize regression plots according to the tools available in R and can produce output in any of jpg, png, tiff, or bmp at any desired resolution, or in ps and pdf vector formats. Bland-Altman plots and some regression diagnostic plots are also generated. Correctness of regression parameter estimates was confirmed against existing R packages. The program allows for rapid and highly customizable graphical output capable of conforming to the publication requirements of any clinical chemistry journal. Quick method comparisons can also be performed and the results cut and pasted into spreadsheet or word-processing applications. We present a simple and intuitive open-source tool for quantitative method comparison in a clinical laboratory environment. Copyright © 2014 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
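    For reference, the Deming slope has a closed form in terms of the sample (co)variances. A minimal sketch, assuming a known error-variance ratio lam (the ratio of y-error to x-error variance; lam = 1 gives orthogonal regression), with synthetic method-comparison data:

    ```python
    import numpy as np

    def deming(x, y, lam=1.0):
        """Deming regression slope and intercept with error in both axes."""
        mx, my = x.mean(), y.mean()
        sxx = np.sum((x - mx) ** 2)
        syy = np.sum((y - my) ** 2)
        sxy = np.sum((x - mx) * (y - my))
        slope = (syy - lam * sxx
                 + np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) \
                / (2 * sxy)
        return slope, my - slope * mx

    # Two hypothetical assays measuring the same analyte, noisy in both.
    rng = np.random.default_rng(0)
    truth = rng.uniform(1, 10, 80)
    x = truth + rng.normal(0, 0.3, 80)
    y = 1.05 * truth + 0.2 + rng.normal(0, 0.3, 80)
    print(deming(x, y))   # slope near 1.05, intercept near 0.2
    ```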

  17. Estimation of variance in Cox's regression model with shared gamma frailties.

    PubMed

    Andersen, P K; Klein, J P; Knudsen, K M; Tabanera y Palacios, R

    1997-12-01

    The Cox regression model with a shared frailty factor allows for unobserved heterogeneity or for statistical dependence between the observed survival times. Estimation in this model when the frailties are assumed to follow a gamma distribution is reviewed, and we address the problem of obtaining variance estimates for regression coefficients, frailty parameter, and cumulative baseline hazards using the observed nonparametric information matrix. A number of examples are given comparing this approach with fully parametric inference in models with piecewise constant baseline hazards.

  18. Boosting structured additive quantile regression for longitudinal childhood obesity data.

    PubMed

    Fenske, Nora; Fahrmeir, Ludwig; Hothorn, Torsten; Rzehak, Peter; Höhle, Michael

    2013-07-25

    Childhood obesity and the investigation of its risk factors has become an important public health issue. Our work is based on and motivated by a German longitudinal study including 2,226 children with up to ten measurements on their body mass index (BMI) and risk factors from birth to the age of 10 years. We introduce boosting of structured additive quantile regression as a novel distribution-free approach for longitudinal quantile regression. The quantile-specific predictors of our model include conventional linear population effects, smooth nonlinear functional effects, varying-coefficient terms, and individual-specific effects, such as intercepts and slopes. Estimation is based on boosting, a computer intensive inference method for highly complex models. We propose a component-wise functional gradient descent boosting algorithm that allows for penalized estimation of the large variety of different effects, particularly leading to individual-specific effects shrunken toward zero. This concept allows us to flexibly estimate the nonlinear age curves of upper quantiles of the BMI distribution, both on population and on individual-specific level, adjusted for further risk factors and to detect age-varying effects of categorical risk factors. Our model approach can be regarded as the quantile regression analog of Gaussian additive mixed models (or structured additive mean regression models), and we compare both model classes with respect to our obesity data.
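    A rough analogue of boosted quantile regression is available in scikit-learn via gradient boosting of the pinball loss. The sketch below estimates a 90th-percentile age curve from simulated skewed data; it omits the paper's structured additive terms and individual-specific effects.

    ```python
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    # Simulated skewed BMI-like response as a function of age.
    rng = np.random.default_rng(0)
    age = rng.uniform(0, 10, 1000)
    bmi = 15 + 0.4 * age + rng.gamma(2.0, 1.0, 1000)
    X = age.reshape(-1, 1)

    # Boosting the check (pinball) loss targets the 0.9 quantile directly.
    q90 = GradientBoostingRegressor(loss="quantile", alpha=0.9,
                                    n_estimators=300, max_depth=2,
                                    learning_rate=0.05).fit(X, bmi)
    grid = np.linspace(0, 10, 5).reshape(-1, 1)
    print(q90.predict(grid))   # estimated upper-quantile age curve
    ```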

  19. [A SAS macro program for batch processing of univariate Cox regression analysis for large databases].

    PubMed

    Yang, Rendong; Xiong, Jie; Peng, Yangqin; Peng, Xiaoning; Zeng, Xiaomin

    2015-02-01

    To realize batch processing of univariate Cox regression analysis for large databases with a SAS macro program. We wrote a SAS macro program in SAS 9.2 that can filter and integrate results and export P values to Excel. The program was used for screening survival-correlated RNA molecules in ovarian cancer. The SAS macro program accomplished the batch processing of univariate Cox regression analyses and the selection and export of the results. The SAS macro program has potential applications in reducing the workload of statistical analysis and providing a basis for batch processing of univariate Cox regression analysis.
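    An analogous batch workflow in Python (not the authors' SAS macro) might loop over candidate variables, fit one Cox model per variable with lifelines, and export the p-values. The dataset below is simulated and the column names are assumptions.

    ```python
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    # Simulated stand-in for a large molecular dataset: survival time, event
    # indicator, and many candidate features to screen one at a time.
    rng = np.random.default_rng(0)
    n, p = 300, 20
    X = pd.DataFrame(rng.normal(size=(n, p)),
                     columns=[f"gene_{i}" for i in range(p)])
    hazard = np.exp(0.8 * X["gene_0"])            # one truly prognostic gene
    df = X.assign(time=rng.exponential(1 / hazard),
                  event=rng.binomial(1, 0.8, n))

    # Batch univariate Cox screening, mirroring what the macro automates.
    results = []
    for var in X.columns:
        cph = CoxPHFitter().fit(df[["time", "event", var]],
                                duration_col="time", event_col="event")
        results.append({"variable": var,
                        "HR": float(cph.hazard_ratios_[var]),
                        "p": float(cph.summary.loc[var, "p"])})
    out = pd.DataFrame(results).sort_values("p")
    out.to_csv("univariate_cox.csv", index=False)   # export the screen
    print(out.head())
    ```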

  20. Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2009-01-01

    In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…

  1. Selective principal component regression analysis of fluorescence hyperspectral image to assess aflatoxin contamination in corn

    USDA-ARS?s Scientific Manuscript database

    Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...

  2. Functional interaction-based nonlinear models with application to multiplatform genomics data.

    PubMed

    Davenport, Clemontina A; Maity, Arnab; Baladandayuthapani, Veerabhadran

    2018-05-07

    Functional regression allows for a scalar response to be dependent on a functional predictor; however, not much work has been done when a scalar exposure that interacts with the functional covariate is introduced. In this paper, we present 2 functional regression models that account for this interaction and propose 2 novel estimation procedures for the parameters in these models. These estimation methods allow for a noisy and/or sparsely observed functional covariate and are easily extended to generalized exponential family responses. We compute standard errors of our estimators, which allows for further statistical inference and hypothesis testing. We compare the performance of the proposed estimators to each other and to one found in the literature via simulation and demonstrate our methods using a real data example. Copyright © 2018 John Wiley & Sons, Ltd.

  3. Food adulteration analysis without laboratory prepared or determined reference food adulterant values.

    PubMed

    Kalivas, John H; Georgiou, Constantinos A; Moira, Marianna; Tsafaras, Ilias; Petrakis, Eleftherios A; Mousdis, George A

    2014-04-01

    Quantitative analysis of food adulterants is an important health and economic issue that needs to be fast and simple. Spectroscopy has significantly reduced analysis time. However, preparation of analyte calibration samples matrix-matched to the prediction samples is still needed, which can be laborious and costly. Reported in this paper is the application of a newly developed pure component Tikhonov regularization (PCTR) process that does not require laboratory-prepared calibration samples or reference analysis methods and, hence, is a greener calibration method. The PCTR method requires an analyte pure component spectrum and non-analyte spectra. As a food analysis example, synchronous fluorescence spectra of extra virgin olive oil samples adulterated with sunflower oil are used. Results are shown to be better than those obtained using ridge regression with reference calibration samples. The flexibility of PCTR allows the inclusion of reference samples and is generic for use with other instrumental methods and food products. Copyright © 2013 Elsevier Ltd. All rights reserved.
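    Plain Tikhonov regularization, the ingredient PCTR builds on, has a closed-form solution; a minimal sketch on synthetic spectra follows (the pure-component constraints that distinguish PCTR are not shown).

    ```python
    import numpy as np

    def tikhonov(X, y, lam):
        """Closed-form Tikhonov (ridge) solution b = (X'X + lam*I)^-1 X'y."""
        n = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

    # Illustration: 50 synthetic mixture spectra over 200 wavelengths.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 200))
    b_true = np.zeros(200); b_true[40:60] = 0.5   # a single spectral band
    y = X @ b_true + rng.normal(0, 0.1, 50)
    b = tikhonov(X, y, lam=1.0)
    print(np.corrcoef(b, b_true)[0, 1])           # recovered band profile
    ```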

  4. A new device to study isoload eccentric exercise.

    PubMed

    Guilhem, Gaël; Cornu, Christophe; Nordez, Antoine; Guével, Arnaud

    2010-12-01

    This study was designed to develop a new device allowing mechanical analysis of eccentric exercise against a constant load, with a view to comparing isoload (IL) and isokinetic (IK) eccentric exercises. A plate-loaded resistance training device was integrated with an IK dynamometer to acquire mechanical parameters (i.e., external torque, angular velocity). To determine the muscular torque produced by the subject, load torque was experimentally measured (TLexp) at 11 different loads from a 30° to 90° angle (0° = lever arm in horizontal position). TLexp was modeled to take the friction effect and torque variations into account. Validity of the modeled load torque (TLmod) was tested by determining the root mean square (RMS) error, bias, and 2SD between the descending part of TLexp (from 30° to 90°) and TLmod. Validity of TLexp was tested by a linear regression and a Passing-Bablok regression. A pilot analysis of 10 subjects was performed to determine the contribution of the torque due to the moment of inertia to the amount of external work (W). Results showed the validity of TLmod (bias = 0%; RMS error = 0.51%) and of TLexp (SEM = 4.1 N·m; intraclass correlation coefficient (ICC) = 1.00; slope = 0.99; y-intercept = -0.13). External work calculation showed satisfactory reproducibility (SEM = 38.3 J; ICC = 0.98), and the contribution of the moment of inertia to W was low (3.2 ± 2.0%). These results validate the new device developed in this study. Such a device could be used in future work to study IL eccentric exercise and to compare the effects of IL and IK eccentric exercises under standardized conditions.

  5. Magnetic Resonance Fingerprinting of Adult Brain Tumors: Initial Experience

    PubMed Central

    Badve, Chaitra; Yu, Alice; Dastmalchian, Sara; Rogers, Matthew; Ma, Dan; Jiang, Yun; Margevicius, Seunghee; Pahwa, Shivani; Lu, Ziang; Schluchter, Mark; Sunshine, Jeffrey; Griswold, Mark; Sloan, Andrew; Gulani, Vikas

    2016-01-01

    Background Magnetic resonance fingerprinting (MRF) allows rapid simultaneous quantification of T1 and T2 relaxation times. This study assesses the utility of MRF in differentiating between common types of adult intra-axial brain tumors. Methods MRF acquisition was performed in 31 patients with untreated intra-axial brain tumors: 17 glioblastomas, 6 WHO grade II lower-grade gliomas and 8 metastases. T1, T2 of the solid tumor (ST), immediate peritumoral white matter (PW), and contralateral white matter (CW) were summarized within each region of interest. Statistical comparisons on mean, standard deviation, skewness and kurtosis were performed using the univariate Wilcoxon rank sum test across various tumor types. Bonferroni correction was used to correct for multiple comparisons testing. Multivariable logistic regression analysis was performed for discrimination between glioblastomas and metastases, and the area under the receiver operator curve (AUC) was calculated. Results Mean T2 values could differentiate solid tumor regions of lower-grade gliomas from metastases (mean ± SD: 172 ± 53 ms and 105 ± 27 ms, respectively; p = 0.004, significant after Bonferroni correction). Mean T1 of PW surrounding lower-grade gliomas differed from PW around glioblastomas (mean ± SD: 1066 ± 218 ms and 1578 ± 331 ms, respectively; p = 0.004, significant after Bonferroni correction). Logistic regression analysis revealed that mean T2 of ST offered the best separation between glioblastomas and metastases, with an AUC of 0.86 (95% CI 0.69–1.00; p < 0.0001). Conclusion MRF allows rapid simultaneous T1, T2 measurement in brain tumors and surrounding tissues. MRF-based relaxometry can identify quantitative differences between solid-tumor regions of lower grade gliomas and metastases and between peritumoral regions of glioblastomas and lower grade gliomas. PMID:28034994

  6. MR Fingerprinting of Adult Brain Tumors: Initial Experience.

    PubMed

    Badve, C; Yu, A; Dastmalchian, S; Rogers, M; Ma, D; Jiang, Y; Margevicius, S; Pahwa, S; Lu, Z; Schluchter, M; Sunshine, J; Griswold, M; Sloan, A; Gulani, V

    2017-03-01

    MR fingerprinting allows rapid simultaneous quantification of T1 and T2 relaxation times. This study assessed the utility of MR fingerprinting in differentiating common types of adult intra-axial brain tumors. MR fingerprinting acquisition was performed in 31 patients with untreated intra-axial brain tumors: 17 glioblastomas, 6 World Health Organization grade II lower grade gliomas, and 8 metastases. T1, T2 of the solid tumor, immediate peritumoral white matter, and contralateral white matter were summarized within each ROI. Statistical comparisons on mean, SD, skewness, and kurtosis were performed by using the univariate Wilcoxon rank sum test across various tumor types. Bonferroni correction was used to correct for multiple-comparison testing. Multivariable logistic regression analysis was performed for discrimination between glioblastomas and metastases, and area under the receiver operator curve was calculated. Mean T2 values could differentiate solid tumor regions of lower grade gliomas from metastases (mean, 172 ± 53 ms, and 105 ± 27 ms, respectively; P = .004, significant after Bonferroni correction). The mean T1 of peritumoral white matter surrounding lower grade gliomas differed from peritumoral white matter around glioblastomas (mean, 1066 ± 218 ms, and 1578 ± 331 ms, respectively; P = .004, significant after Bonferroni correction). Logistic regression analysis revealed that the mean T2 of solid tumor offered the best separation between glioblastomas and metastases with an area under the curve of 0.86 (95% CI, 0.69-1.00; P < .0001). MR fingerprinting allows rapid simultaneous T1 and T2 measurement in brain tumors and surrounding tissues. MR fingerprinting-based relaxometry can identify quantitative differences between solid tumor regions of lower grade gliomas and metastases and between peritumoral regions of glioblastomas and lower grade gliomas. © 2017 by American Journal of Neuroradiology.
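    The discrimination step reduces to a univariable logistic regression and an ROC area. A sketch with synthetic T2 values loosely echoing the reported group means and sample sizes (not patient data):

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    # Synthetic mean solid-tumor T2 per patient: 17 glioblastomas vs
    # 8 metastases, with group parameters chosen for illustration.
    rng = np.random.default_rng(0)
    t2_gbm = rng.normal(130, 30, 17)
    t2_met = rng.normal(105, 27, 8)
    X = np.r_[t2_gbm, t2_met].reshape(-1, 1)
    y = np.r_[np.zeros(17), np.ones(8)]      # 1 = metastasis

    clf = LogisticRegression(max_iter=1000).fit(X, y)
    auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
    print(f"in-sample AUC = {auc:.2f}")
    ```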

  7. Development of a User Interface for a Regression Analysis Software Tool

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and the graphical user interface are discussed in order to illustrate the user interface's overall design approach.

  8. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol lowering drugs

    PubMed Central

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin

    2013-01-01

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436

  9. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol-lowering drugs.

    PubMed

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin

    2013-10-15

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
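    Marginally, a Box-Cox transformation of a single skewed response is a one-liner with scipy; the sketch below normalises a simulated triglyceride-like variable. The paper estimates the transformation parameters jointly with the regression in a Bayesian model, which this does not attempt.

    ```python
    import numpy as np
    from scipy import stats

    # Simulated skewed triglyceride-like values (lognormal stand-in).
    rng = np.random.default_rng(0)
    tg = rng.lognormal(mean=4.9, sigma=0.5, size=500)

    # scipy estimates the Box-Cox lambda by maximum likelihood.
    tg_bc, lam = stats.boxcox(tg)
    print(f"lambda = {lam:.2f}, skewness before = {stats.skew(tg):.2f}, "
          f"after = {stats.skew(tg_bc):.2f}")
    ```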

  10. Confounding adjustment in comparative effectiveness research conducted within distributed research networks.

    PubMed

    Toh, Sengwee; Gagne, Joshua J; Rassen, Jeremy A; Fireman, Bruce H; Kulldorff, Martin; Brown, Jeffrey S

    2013-08-01

    A distributed research network (DRN) of electronic health care databases, in which data reside behind the firewall of each data partner, can support a wide range of comparative effectiveness research (CER) activities. An essential component of a fully functional DRN is the capability to perform robust statistical analyses to produce valid, actionable evidence without compromising patient privacy, data security, or proprietary interests. We describe the strengths and limitations of different confounding adjustment approaches that can be considered in observational CER studies conducted within DRNs, and the theoretical and practical issues to consider when selecting among them in various study settings. Several methods can be used to adjust for multiple confounders simultaneously, either as individual covariates or as confounder summary scores (eg, propensity scores and disease risk scores), including: (1) centralized analysis of patient-level data, (2) case-centered logistic regression of risk set data, (3) stratified or matched analysis of aggregated data, (4) distributed regression analysis, and (5) meta-analysis of site-specific effect estimates. These methods require different granularities of information be shared across sites and afford investigators different levels of analytic flexibility. DRNs are growing in use and sharing of highly detailed patient-level information is not always feasible in DRNs. Methods that incorporate confounder summary scores allow investigators to adjust for a large number of confounding factors without the need to transfer potentially identifiable information in DRNs. They have the potential to let investigators perform many analyses traditionally conducted through a centralized dataset with detailed patient-level information.
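    A minimal sketch of the confounder-summary-score idea combined with a stratified analysis: fit a propensity score from patient-level confounders, then stratify on its quintiles, so that only coarse score strata rather than raw covariates would need to leave a site. Data are simulated.

    ```python
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 2000
    conf = rng.normal(size=(n, 5))                    # patient confounders
    treat = rng.binomial(1, 1 / (1 + np.exp(-conf[:, 0])))

    # Propensity score: probability of treatment given confounders.
    ps = LogisticRegression().fit(conf, treat).predict_proba(conf)[:, 1]

    # Stratified analysis on propensity-score quintiles.
    df = pd.DataFrame({"treat": treat, "ps": ps,
                       "outcome": rng.binomial(1, 0.1 + 0.05 * treat)})
    df["stratum"] = pd.qcut(df["ps"], 5, labels=False)
    print(df.groupby(["stratum", "treat"])["outcome"].mean())
    ```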

  11. Regression Analysis and the Sociological Imagination

    ERIC Educational Resources Information Center

    De Maio, Fernando

    2014-01-01

    Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

  12. Spatial regression analysis on 32 years of total column ozone data

    NASA Astrophysics Data System (ADS)

    Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

    2014-08-01

    Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. The contribution of explanatory variables describing seasonal ozone variation is generally large at mid- to high latitudes. We observe ozone increases with potential vorticity and day length and ozone decreases with geopotential height and variable ozone effects due to the polar vortex in regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. The application of several trend models, each with their own pros and cons, yields a large range of recovery rate estimates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.
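    The "local" regressions can be computed efficiently by solving all grid cells in one vectorized least-squares call, since the regressors share the time axis and only the ozone series differs per cell. A schematic with random data and a coarse stand-in grid (the real analysis uses a 1 × 1.5° grid and physically constructed regressors):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_months, n_lat, n_lon = 384, 45, 60            # 32 years, coarse grid
    # Regressors (e.g., solar, QBO, ENSO, EESC) plus an intercept column.
    X = np.c_[np.ones(n_months), rng.normal(size=(n_months, 5))]
    ozone = rng.normal(size=(n_months, n_lat * n_lon))   # flattened maps

    beta, *_ = np.linalg.lstsq(X, ozone, rcond=None)     # one solve, all cells
    coef_maps = beta.reshape(6, n_lat, n_lon)            # per-regressor maps
    print(coef_maps.shape)
    ```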

  13. Efficiency in the European agricultural sector: environment and resources.

    PubMed

    Moutinho, Victor; Madaleno, Mara; Macedo, Pedro; Robaina, Margarita; Marques, Carlos

    2018-04-22

    This article computes agricultural technical efficiency scores for 27 European countries over the period 2005-2012, using both data envelopment analysis (DEA) and stochastic frontier analysis (SFA) with a generalized cross-entropy (GCE) approach, for comparison purposes. Afterwards, using the scores as the dependent variable, we apply quantile regressions with a set of possible influencing variables within the agricultural sector to explain the technical efficiency scores. Results allow us to conclude that, although DEA and SFA are quite different methodologies and yield different technical efficiency scores, both identify the best- and worst-performing countries similarly. They also suggest that it is important to include resource productivity and subsidies when determining technical efficiency, due to their positive and significant influence.
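    The second-stage analysis, quantile regressions of efficiency scores on sector covariates, might look as follows with statsmodels; the covariate names and data below are placeholders, not the paper's.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Placeholder panel: 27 countries x 8 years of scores and covariates.
    rng = np.random.default_rng(0)
    n = 27 * 8
    df = pd.DataFrame({
        "subsidies": rng.uniform(0, 1, n),
        "resource_productivity": rng.uniform(0.5, 2.0, n),
    })
    df["efficiency"] = (0.5 + 0.1 * df["resource_productivity"]
                        + 0.05 * df["subsidies"]
                        + rng.normal(0, 0.05, n)).clip(0, 1)

    # Fit the same specification at several quantiles of the score.
    for q in (0.25, 0.5, 0.75):
        fit = smf.quantreg("efficiency ~ subsidies + resource_productivity",
                           df).fit(q=q)
        print(q, fit.params.round(3).to_dict())
    ```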

  14. A Spreadsheet Tool for Learning the Multiple Regression F-Test, T-Tests, and Multicollinearity

    ERIC Educational Resources Information Center

    Martin, David

    2008-01-01

    This note presents a spreadsheet tool that allows teachers the opportunity to guide students towards answering on their own questions related to the multiple regression F-test, the t-tests, and multicollinearity. The note demonstrates approaches for using the spreadsheet that might be appropriate for three different levels of statistics classes,…

  15. Measurement of Vibrated Bulk Density of Coke Particle Blends Using Image Texture Analysis

    NASA Astrophysics Data System (ADS)

    Azari, Kamran; Bogoya-Forero, Wilinthon; Duchesne, Carl; Tessier, Jayson

    2017-09-01

    A rapid and nondestructive machine vision sensor was developed for predicting the vibrated bulk density (VBD) of petroleum coke particles based on image texture analysis. It could be used for making corrective adjustments to a paste plant operation to reduce green anode variability (e.g., changes in binder demand). Wavelet texture analysis (WTA) and gray level co-occurrence matrix (GLCM) algorithms were used jointly for extracting the surface textural features of coke aggregates from images. These were correlated with the VBD using partial least-squares (PLS) regression. Coke samples of several sizes and from different sources were used to test the sensor. Variations in the coke surface texture introduced by coke size and source allowed for making good predictions of the VBD of individual coke samples and mixtures of them (blends involving two sources and different sizes). Promising results were also obtained for coke blends collected from an industrial-baked carbon anode manufacturer.
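    The final modeling step, PLS regression from texture features to VBD, takes a few lines with scikit-learn; the feature matrix below is a synthetic stand-in for the WTA and GLCM features described.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    # Synthetic stand-in: 60 coke samples x 40 texture features, with VBD
    # driven by a few of them plus noise.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(60, 40))
    vbd = features[:, :5] @ rng.normal(size=5) + rng.normal(0, 0.1, 60)

    pls = PLSRegression(n_components=4)
    scores = cross_val_score(pls, features, vbd, cv=5, scoring="r2")
    print("cross-validated R^2:", scores.round(2))
    ```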

  16. SandiaMRCR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2012-01-05

    SandiaMCR was developed to identify pure components and their concentrations from spectral data. This software efficiently implements multivariate curve resolution-alternating least squares (MCR-ALS), principal component analysis (PCA), and singular value decomposition (SVD). Version 3.37 also includes the PARAFAC-ALS and Tucker-1 (trilinear analysis) algorithms. The alternating least squares methods can be used to determine the composition without, or with incomplete, prior information on the constituents and their concentrations. The software allows the specification of numerous preprocessing, initialization, data selection, and compression options for the efficient processing of large data sets, including the definition of equality and non-negativity constraints to realistically restrict the solution set, various normalization or weighting options based on the statistics of the data, several initialization choices, and data compression. The software has been designed to provide a practicing spectroscopist the tools required to routinely analyze data in a reasonable time and without requiring expert intervention.
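    The core MCR-ALS iteration, alternating non-negativity-constrained least-squares solves for concentrations and spectra, can be sketched compactly; this is a bare-bones illustration on synthetic data, without SandiaMCR's constraint and preprocessing machinery.

    ```python
    import numpy as np

    # Synthetic mixture data D = C_true @ S_true.T plus noise:
    # 50 mixtures x 200 spectral channels, 3 pure components.
    rng = np.random.default_rng(0)
    S_true = np.abs(rng.normal(size=(200, 3)))
    C_true = np.abs(rng.normal(size=(50, 3)))
    D = C_true @ S_true.T + rng.normal(0, 0.01, (50, 200))

    # Alternating least squares with non-negativity enforced by clipping.
    C = np.abs(rng.normal(size=(50, 3)))       # initial concentration guess
    for _ in range(100):
        S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0].T, 0, None)
        C = np.clip(np.linalg.lstsq(S, D.T, rcond=None)[0].T, 0, None)
    print("residual:", np.linalg.norm(D - C @ S.T).round(3))
    ```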

  17. Determination of urine ionic composition with potentiometric multisensor system.

    PubMed

    Yaroshenko, Irina; Kirsanov, Dmitry; Kartsova, Lyudmila; Sidorova, Alla; Borisova, Irina; Legin, Andrey

    2015-01-01

    The ionic composition of urine is a good indicator of a patient's general condition and allows for the diagnosis of certain medical problems such as urolithiasis. Due to environmental factors and poor nutrition, the number of registered urinary tract cases continuously increases. Most of the methods currently used for urine analysis are expensive, laborious, and require skilled personnel. The present work is a feasibility study of a potentiometric multisensor system of 18 ion-selective and cross-sensitive sensors as an analytical tool for determining urine ionic composition. In total, 136 samples from patients of the Urolithiasis Laboratory and from healthy people were analyzed by the multisensor system as well as by capillary electrophoresis as a reference method. Various chemometric approaches were implemented to relate the data from electrochemical measurements to the reference data. Logistic regression (LR) was applied for classification of samples into healthy and unhealthy, producing reasonable misclassification rates. Projection on Latent Structures (PLS) regression was applied for quantitative analysis of ionic composition from potentiometric data. Mean relative errors of simultaneous prediction of sodium, potassium, ammonium, calcium, magnesium, chloride, sulfate, phosphate, urate, and creatinine from the multisensor system response were in the range of 3-13% for independent test sets. This shows good promise for the development of a fast and inexpensive alternative method for urine analysis. Copyright © 2014 Elsevier B.V. All rights reserved.

  18. Between-centre variability in transfer function analysis, a widely used method for linear quantification of the dynamic pressure–flow relation: The CARNet study

    PubMed Central

    Meel-van den Abeelen, Aisha S.S.; Simpson, David M.; Wang, Lotte J.Y.; Slump, Cornelis H.; Zhang, Rong; Tarumi, Takashi; Rickards, Caroline A.; Payne, Stephen; Mitsis, Georgios D.; Kostoglou, Kyriaki; Marmarelis, Vasilis; Shin, Dae; Tzeng, Yu-Chieh; Ainslie, Philip N.; Gommer, Erik; Müller, Martin; Dorado, Alexander C.; Smielewski, Peter; Yelicich, Bernardo; Puppo, Corina; Liu, Xiuyun; Czosnyka, Marek; Wang, Cheng-Yen; Novak, Vera; Panerai, Ronney B.; Claassen, Jurgen A.H.R.

    2014-01-01

    Transfer function analysis (TFA) is a frequently used method to assess dynamic cerebral autoregulation (CA) using spontaneous oscillations in blood pressure (BP) and cerebral blood flow velocity (CBFV). However, controversies and variations exist in how research groups utilise TFA, causing high variability in interpretation. The objective of this study was to evaluate between-centre variability in TFA outcome metrics. 15 centres analysed the same 70 BP and CBFV datasets from healthy subjects (n = 50 rest; n = 20 during hypercapnia); 10 additional datasets were computer-generated. Each centre used their in-house TFA methods; however, certain parameters were specified to reduce a priori between-centre variability. Hypercapnia was used to assess discriminatory performance and synthetic data to evaluate effects of parameter settings. Results were analysed using the Mann–Whitney test and logistic regression. A large non-homogeneous variation was found in TFA outcome metrics between the centres. Logistic regression demonstrated that 11 centres were able to distinguish between normal and impaired CA with an AUC > 0.85. Further analysis identified TFA settings that are associated with large variation in outcome measures. These results indicate the need for standardisation of TFA settings in order to reduce between-centre variability and to allow accurate comparison between studies. Suggestions on optimal signal processing methods are proposed. PMID:24725709
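    The core TFA computation, gain and phase of the BP-to-CBFV transfer function plus squared coherence averaged over a frequency band, can be sketched with scipy. The window length, band edges, and synthetic signals below are assumptions, and are exactly the kinds of settings the study shows must be standardised.

    ```python
    import numpy as np
    from scipy import signal

    fs = 10.0                                   # assumed sampling rate (Hz)
    rng = np.random.default_rng(0)
    bp = rng.normal(size=3000)                  # synthetic BP series
    cbfv = np.convolve(bp, np.ones(5) / 5, mode="same") \
           + rng.normal(0, 0.5, 3000)           # synthetic CBFV series

    f, Pxx = signal.welch(bp, fs=fs, nperseg=1024)
    f, Pxy = signal.csd(bp, cbfv, fs=fs, nperseg=1024)
    f, Cxy = signal.coherence(bp, cbfv, fs=fs, nperseg=1024)

    H = Pxy / Pxx                               # transfer function estimate
    band = (f >= 0.07) & (f < 0.20)             # assumed low-frequency band
    print("gain:", np.abs(H[band]).mean().round(3),
          "phase (rad):", np.angle(H[band]).mean().round(3),
          "coherence:", Cxy[band].mean().round(3))
    ```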

  19. Grey and white matter differences in brain energy metabolism in first episode schizophrenia: 31P-MRS chemical shift imaging at 4 Tesla.

    PubMed

    Jensen, J Eric; Miller, Jodi; Williamson, Peter C; Neufeld, Richard W J; Menon, Ravi S; Malla, Ashok; Manchanda, Rahul; Schaefer, Betsy; Densmore, Maria; Drost, Dick J

    2006-03-31

    Altered high energy and membrane metabolism, measured with phosphorus magnetic resonance spectroscopy (31P-MRS), has been inconsistently reported in schizophrenic patients in several anatomical brain regions implicated in the pathophysiology of this illness, with little attention to the effects of brain tissue type on the results. Tissue regression analysis correlates brain tissue type to measured metabolite levels, allowing for the extraction of "pure" estimated grey and white matter compartment metabolite levels. We applied this tissue analysis technique to a clinical dataset of first episode schizophrenic patients and matched controls to investigate the effect of brain tissue specificity on altered energy and membrane metabolism. In vivo brain spectra from two regions, (a) the fronto-temporal-striatal region and (b) the frontal lobes, were analyzed from 12 first episode schizophrenic patients and 11 matched controls from a 31P chemical shift imaging (CSI) study at 4 Tesla (T) field strength. Tissue regression analyses using voxels from each region were performed relating metabolite levels to tissue content, examining phosphorus metabolite levels in grey and white matter compartments. Compared with controls, the first episode schizophrenic patient group showed significantly increased β-adenosine triphosphate (β-ATP) levels in white matter and decreased β-ATP levels in grey matter in the fronto-temporal-striatal region. No significant metabolite level differences were found in grey or white matter compartments in the frontal cortex. Tissue regression analysis reveals grey and white matter specific aberrations in high-energy phosphates in first episode schizophrenia. Although past studies report inconsistent regional differences in high-energy phosphate levels in schizophrenia, the present analysis suggests more widespread differences that seem to be strongly related to tissue type. Our data suggest that differences in grey and white matter tissue content between past studies may account for some of the variance in the literature.
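    The tissue regression idea reduces to a least-squares fit of each voxel's metabolite level against its tissue fractions; the fitted coefficients are the estimated pure grey- and white-matter levels. The sketch below illustrates this on synthetic voxels; the no-intercept mixing model and all numbers are assumptions for illustration.

```python
# Sketch: extrapolating "pure" compartment levels from mixed CSI voxels,
# assuming level = a * grey_fraction + b * white_fraction (synthetic data).
import numpy as np

rng = np.random.default_rng(3)
grey = rng.uniform(0, 1, 60)            # grey-matter fraction per voxel
white = 1.0 - grey                      # remainder taken as white matter
atp = 2.0 * grey + 3.1 * white + rng.normal(0, 0.1, 60)   # measured level

X = np.column_stack([grey, white])
(a, b), *_ = np.linalg.lstsq(X, atp, rcond=None)
print(f"pure grey ~ {a:.2f}, pure white ~ {b:.2f}")
```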

  20. Semi-continuous mass closure of the major components of fine particulate matter in Riverside, CA

    NASA Astrophysics Data System (ADS)

    Grover, Brett D.; Eatough, Norman L.; Woolwine, Woods R.; Cannon, Justin P.; Eatough, Delbert J.; Long, Russell W.

    The application of newly developed semi-continuous aerosol monitors allows for the measurement of all the major species of PM2.5 on a 1-h time basis. Temporal resolution of both non-volatile and semi-volatile species is possible. A suite of instruments to measure the major chemical species of PM2.5 allows for semi-continuous mass closure. A newly developed dual-oven Sunset carbon monitor is used to measure non-volatile organic carbon, semi-volatile organic carbon and elemental carbon. Inorganic species, including sulfate and nitrate, can be measured with an ion chromatograph-based sampler. Comparison of the sum of the major chemical species in an urban aerosol with mass measured by a Filter Dynamics Measurement System (FDMS) monitor resulted in excellent agreement. Linear regression analysis resulted in a zero-intercept slope of 0.98±0.01 with R² = 0.86. One-hour temporal resolution of the major species of PM2.5 may reduce the uncertainty in receptor-based source apportionment modeling, will allow for better forecasting of PM2.5 episodes, and may lead to increased understanding of related health effects.

  1. Model Robust Calibration: Method and Application to Electronically-Scanned Pressure Transducers

    NASA Technical Reports Server (NTRS)

    Walker, Eric L.; Starnes, B. Alden; Birch, Jeffery B.; Mays, James E.

    2010-01-01

    This article presents the application of a recently developed statistical regression method to the controlled instrument calibration problem. The statistical method of Model Robust Regression (MRR), developed by Mays, Birch, and Starnes, is shown to improve instrument calibration by reducing the reliance of the calibration on a predetermined parametric (e.g. polynomial, exponential, logarithmic) model. This is accomplished by allowing fits from the predetermined parametric model to be augmented by a certain portion of a fit to the residuals from the initial regression using a nonparametric (locally parametric) regression technique. The method is demonstrated for the absolute scale calibration of silicon-based pressure transducers.
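    The core MRR idea can be sketched in a few lines: fit the parametric model, smooth its residuals nonparametrically, and add back a fraction of that smooth. The kernel smoother, bandwidth and mixing fraction below are illustrative choices; in the actual method they are selected data-adaptively.

```python
# Sketch of model-robust regression: parametric fit + lam * smoothed residuals.
import numpy as np

def kernel_smooth(x_tr, r_tr, x_ev, h=0.5):
    """Nadaraya-Watson smoother of residuals r_tr evaluated at x_ev."""
    w = np.exp(-0.5 * ((x_ev[:, None] - x_tr[None, :]) / h) ** 2)
    return (w * r_tr).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 80)
y = 1.5 * x + 0.4 * np.sin(2 * x) + rng.normal(0, 0.1, x.size)  # true response

coef = np.polyfit(x, y, deg=1)            # step 1: predetermined parametric fit
resid = y - np.polyval(coef, x)           # step 2: residuals of that fit
lam = 0.8                                 # step 3: mixing fraction (assumed fixed)
y_mrr = np.polyval(coef, x) + lam * kernel_smooth(x, resid, x)
print(np.mean((y - y_mrr) ** 2))          # smaller than the purely parametric error
```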

  2. Multivariate Regression Analysis and Slaughter Livestock,

    DTIC Science & Technology

    (*AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  3. Penalized nonparametric scalar-on-function regression via principal coordinates

    PubMed Central

    Reiss, Philip T.; Miller, David L.; Wu, Pei-Shien; Hua, Wen-Yu

    2016-01-01

    A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. PMID:29217963
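    A compact sketch of the method follows: square the distance matrix, double-centre it, take leading eigenvectors as principal coordinates, and ridge-regress the response on them. Euclidean distances stand in for dynamic time warping here, and the fixed penalty replaces the optimal tuning the authors' implementation provides.

```python
# Sketch: classical MDS coordinates from a distance matrix + ridge regression.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
curves = rng.normal(size=(40, 100)).cumsum(axis=1)   # 40 functional predictors
y = curves[:, -1] + rng.normal(0, 0.5, 40)           # scalar response (synthetic)

D = np.linalg.norm(curves[:, None, :] - curves[None, :, :], axis=2)
J = np.eye(40) - np.ones((40, 40)) / 40              # centering matrix
B = -0.5 * J @ (D ** 2) @ J                          # double-centred Gram matrix
vals, vecs = np.linalg.eigh(B)
top = np.argsort(vals)[::-1][:10]                    # 10 leading coordinates
pcoords = vecs[:, top] * np.sqrt(np.clip(vals[top], 0, None))

print(Ridge(alpha=1.0).fit(pcoords, y).score(pcoords, y))
```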

  4. Regression Analysis: Legal Applications in Institutional Research

    ERIC Educational Resources Information Center

    Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.

    2008-01-01

    This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…

  5. RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,

    DTIC Science & Technology

    This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variables, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)

  6. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    PubMed

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend in the incidence rate of acute myocardial infarction (AMI) in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trend (Cochran-Armitage trend P value
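    For reference, the CAT statistic for a rate trend over ordered years compares observed cases with those expected under a common rate, weighted by year scores. The sketch below implements the standard z-form on synthetic counts; all data are invented for illustration.

```python
# Sketch: Cochran-Armitage trend test for cases r out of population n per year.
import numpy as np
from scipy.stats import norm

r = np.array([120, 135, 150, 168, 190])              # annual cases (synthetic)
n = np.array([10000, 10100, 10250, 10400, 10600])    # annual population
w = np.arange(r.size, dtype=float)                   # linear year scores

p_bar = r.sum() / n.sum()                            # pooled rate
T = np.sum(w * (r - n * p_bar))
var_T = p_bar * (1 - p_bar) * (np.sum(n * w**2) - np.sum(n * w)**2 / n.sum())
z = T / np.sqrt(var_T)
print(z, 2 * norm.sf(abs(z)))                        # statistic and two-sided P
```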

  7. Design and analysis of a novel mechanical loading machine for dynamic in vivo axial loading

    NASA Astrophysics Data System (ADS)

    Macione, James; Nesbitt, Sterling; Pandit, Vaibhav; Kotha, Shiva

    2012-02-01

    This paper describes the construction of a loading machine for performing in vivo, dynamic mechanical loading of the rodent forearm. The loading machine utilizes a unique type of electromagnetic actuator with no mechanically resistive components (servotube), allowing highly accurate loads to be created. A regression analysis of the force created by the actuator with respect to the input voltage demonstrates high linear correlation (R2 = 1). When the linear correlation is used to create dynamic loading waveforms in the frequency (0.5-10 Hz) and load (1-50 N) range used for in vivo loading, less than 1% normalized root mean square error (NRMSE) is computed. Larger NRMSE is found at increased frequencies, with 5%-8% occurring at 40 Hz, and reasons are discussed. Amplifiers (strain gauge, linear voltage displacement transducer (LVDT), and load cell) are constructed, calibrated, and integrated, to allow well-resolved dynamic measurements to be recorded at each program cycle. Each of the amplifiers uses an active filter with cutoff frequency at the maximum in vivo loading frequencies (50 Hz) so that electronic noise generated by the servo drive and actuator are reduced. The LVDT and load cell amplifiers allow evaluation of stress-strain relationships to determine if in vivo bone damage is occurring. The strain gauge amplifier allows dynamic force to strain calibrations to occur for animals of different sex, age, and strain. Unique features are integrated into the loading system, including a weightless mode, which allows the limbs of anesthetized animals to be quickly positioned and removed. Although the device is constructed for in vivo axial bone loading, it can be used within constraints, as a general measurement instrument in a laboratory setting.

  8. Design and analysis of a novel mechanical loading machine for dynamic in vivo axial loading.

    PubMed

    Macione, James; Nesbitt, Sterling; Pandit, Vaibhav; Kotha, Shiva

    2012-02-01

    This paper describes the construction of a loading machine for performing in vivo, dynamic mechanical loading of the rodent forearm. The loading machine utilizes a unique type of electromagnetic actuator with no mechanically resistive components (servotube), allowing highly accurate loads to be created. A regression analysis of the force created by the actuator with respect to the input voltage demonstrates high linear correlation (R² = 1). When the linear correlation is used to create dynamic loading waveforms in the frequency (0.5-10 Hz) and load (1-50 N) range used for in vivo loading, less than 1% normalized root mean square error (NRMSE) is computed. Larger NRMSE is found at increased frequencies, with 5%-8% occurring at 40 Hz, and reasons are discussed. Amplifiers (strain gauge, linear voltage displacement transducer (LVDT), and load cell) are constructed, calibrated, and integrated, to allow well-resolved dynamic measurements to be recorded at each program cycle. Each of the amplifiers uses an active filter with cutoff frequency at the maximum in vivo loading frequencies (50 Hz) so that electronic noise generated by the servo drive and actuator are reduced. The LVDT and load cell amplifiers allow evaluation of stress-strain relationships to determine if in vivo bone damage is occurring. The strain gauge amplifier allows dynamic force to strain calibrations to occur for animals of different sex, age, and strain. Unique features are integrated into the loading system, including a weightless mode, which allows the limbs of anesthetized animals to be quickly positioned and removed. Although the device is constructed for in vivo axial bone loading, it can be used within constraints, as a general measurement instrument in a laboratory setting.

  9. Spatio-Temporal Regression Based Clustering of Precipitation Extremes in a Presence of Systematically Missing Covariates

    NASA Astrophysics Data System (ADS)

    Kaiser, Olga; Martius, Olivia; Horenko, Illia

    2017-04-01

    Regression-based Generalized Pareto Distribution (GPD) models are often used to describe the dynamics of hydrological threshold excesses, relying on the explicit availability of all relevant covariates. In real applications, however, the complete set of relevant covariates might not be available. In this context, it has been shown that, under weak assumptions, the influence of systematically missing covariates can be reflected by nonstationary and nonhomogeneous dynamics. We present a data-driven, semiparametric and adaptive approach for spatio-temporal regression-based clustering of threshold excesses in the presence of systematically missing covariates. The nonstationary and nonhomogeneous behavior of threshold excesses is described by a set of local stationary GPD models, whose parameters are expressed as regression models, and by a nonparametric spatio-temporal hidden switching process. Exploiting the nonparametric Finite Element time-series analysis Methodology (FEM) with Bounded Variation of the model parameters (BV) to resolve the spatio-temporal switching process, the approach goes beyond the strong a priori assumptions made in standard latent class models such as mixture models and hidden Markov models. Additionally, the presented FEM-BV-GPD approach provides a pragmatic description of the corresponding spatial dependence structure by grouping together all locations that exhibit similar behavior of the switching process. The performance of the framework is demonstrated on daily accumulated precipitation series over 17 different locations in Switzerland from 1981 to 2013, showing that the introduced approach allows for a better description of the historical data.
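    The local stationary building block of the approach is an ordinary GPD fit to threshold excesses; the sketch below shows that step with scipy on synthetic rainfall. The FEM-BV-GPD framework itself goes further, letting the GPD parameters depend on covariates and on the hidden switching process, which is not reproduced here.

```python
# Sketch: stationary GPD fit to precipitation excesses over a high threshold.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
precip = rng.gamma(shape=0.6, scale=8.0, size=12000)   # synthetic daily totals
u = np.quantile(precip, 0.95)                          # threshold choice (assumed)
excesses = precip[precip > u] - u

shape, loc, scale = stats.genpareto.fit(excesses, floc=0)
print(shape, scale)
```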

  10. A primer for biomedical scientists on how to execute model II linear regression analysis.

    PubMed

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
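    For the calculator/spreadsheet route of point 4, the OLP coefficients themselves are simple, and the recommended bootstrap supplies the confidence intervals that spreadsheets get wrong. A minimal sketch, on synthetic Model II data:

```python
# Sketch: ordinary least products (geometric mean) regression + bootstrap CI.
import numpy as np

rng = np.random.default_rng(8)
x = rng.normal(50, 10, 100) + rng.normal(0, 3, 100)   # x measured with error
y = 1.2 * x + 5 + rng.normal(0, 4, 100)

def olp(x, y):
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * y.std(ddof=1) / x.std(ddof=1)
    return slope, y.mean() - slope * x.mean()

slope, intercept = olp(x, y)
boot = []
for _ in range(2000):                                  # bootstrap the slope
    idx = rng.integers(0, x.size, x.size)
    boot.append(olp(x[idx], y[idx])[0])
print(slope, intercept, np.percentile(boot, [2.5, 97.5]))
```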

  11. Water quality parameter measurement using spectral signatures

    NASA Technical Reports Server (NTRS)

    White, P. E.

    1973-01-01

    Regression analysis is applied to the problem of measuring water quality parameters from remote sensing spectral signature data. The equations necessary to perform regression analysis are presented and methods of testing the strength and reliability of a regression are described. An efficient algorithm for selecting an optimal subset of the independent variables available for a regression is also presented.

  12. Characterization of Microbiota in Children with Chronic Functional Constipation.

    PubMed

    de Meij, Tim G J; de Groot, Evelien F J; Eck, Anat; Budding, Andries E; Kneepkens, C M Frank; Benninga, Marc A; van Bodegraven, Adriaan A; Savelkoul, Paul H M

    2016-01-01

    Disruption of the intestinal microbiota is considered an etiological factor in pediatric functional constipation. Scientifically based selection of potential beneficial probiotic strains in functional constipation therapy is not feasible due to insufficient knowledge of microbiota composition in affected subjects. The aim of this study was to describe microbial composition and diversity in children with functional constipation, compared to healthy controls. Fecal samples from 76 children diagnosed with functional constipation according to the Rome III criteria (median age 8.0 years; range 4.2-17.8) were analyzed by IS-pro, a PCR-based microbiota profiling method. Outcome was compared with intestinal microbiota profiles of 61 healthy children (median 8.6 years; range 4.1-17.9). Microbiota dissimilarity was depicted by principal coordinate analysis (PCoA), diversity was calculated by Shannon diversity index. To determine the most discriminative species, cross validated logistic ridge regression was performed. Applying total microbiota profiles (all phyla together) or per phylum analysis, no disease-specific separation was observed by PCoA and by calculation of diversity indices. By ridge regression, however, functional constipation and controls could be discriminated with 82% accuracy. Most discriminative species were Bacteroides fragilis, Bacteroides ovatus, Bifidobacterium longum, Parabacteroides species (increased in functional constipation) and Alistipes finegoldii (decreased in functional constipation). None of the commonly used unsupervised statistical methods allowed for microbiota-based discrimination of children with functional constipation and controls. By ridge regression, however, both groups could be discriminated with 82% accuracy. Optimization of microbiota-based interventions in constipated children warrants further characterization of microbial signatures linked to clinical subgroups of functional constipation.

  13. Sparse modeling of spatial environmental variables associated with asthma

    PubMed Central

    Chang, Timothy S.; Gangnon, Ronald E.; Page, C. David; Buckingham, William R.; Tandias, Aman; Cowan, Kelly J.; Tomasallo, Carrie D.; Arndt, Brian G.; Hanrahan, Lawrence P.; Guilbert, Theresa W.

    2014-01-01

    Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin’s Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5–50 years over a three-year period. Each patient’s home address was geocoded to one of 3,456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin’s geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. PMID:25533437

  14. Sparse modeling of spatial environmental variables associated with asthma.

    PubMed

    Chang, Timothy S; Gangnon, Ronald E; David Page, C; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W

    2015-02-01

    Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50 years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.

  15. Development and initial validation of the Classroom Motivational Climate Questionnaire (CMCQ).

    PubMed

    Alonso Tapia, Jesús; Fernández Heredia, Blanca

    2008-11-01

    Research on classroom goal-structures (CGS) has shown the usefulness of assessing the classroom motivational climate to evaluate educational interventions and to promote changes in teachers' activity. Accordingly, the Classroom Motivational Climate Questionnaire (CMCQ) for secondary and high-school students was developed. To validate it, confirmatory factor analysis and correlation and regression analyses were performed. Results showed that the CMCQ is a highly reliable instrument that covers many of the types of teaching patterns that favour motivation to learn, correlates as expected with other measures of CGS, predicts satisfaction with the teacher's work well, and allows detection of teachers who should revise their teaching.

  16. Least-squares sequential parameter and state estimation for large space structures

    NASA Technical Reports Server (NTRS)

    Thau, F. E.; Eliazov, T.; Montgomery, R. C.

    1982-01-01

    This paper presents the formulation of simultaneous state and parameter estimation problems for flexible structures in terms of least-squares minimization problems. The approach combines an on-line order determination algorithm with least-squares algorithms for finding estimates of modal approximation functions, modal amplitudes, and modal parameters. It also combines previous results on separable nonlinear least-squares estimation with a regression analysis formulation of the state estimation problem. The technique makes use of sequential Householder transformations, which allow sequential accumulation of the matrices required during the identification process. The technique is used to identify the modal parameters of a flexible beam.

  17. Estimating biogas production of biologically treated municipal solid waste.

    PubMed

    Scaglia, Barbara; Confalonieri, Roberto; D'Imporzano, Giuliana; Adani, Fabrizio

    2010-02-01

    In this work, a respirometric approach, i.e., the Dynamic Respiration Index (DRI), was used to predict the anaerobic biogas potential (ABP) of 46 waste samples coming directly from full-scale MBT plants. A significant linear regression model was obtained by a jackknife approach: ABP = (34.4 ± 2.5) + (0.109 ± 0.003) · DRI. Comparing this model with those of previous works based on a different respirometric approach (Sapromat AT4) gave similar results and allowed a direct comparison of the different limits proposed in the literature for accepting treated waste in landfill. The results indicated that, on average, MBT treatment reduced ABP by 56% after 4 weeks of treatment and by 79% after 12 weeks. A further regression model allowed the Sapromat AT4 limit to be expressed in DRI units, and the kinetics of DRI, with the corresponding ABP reductions, to be described as a function of MBT treatment time.

  18. Multiplication factor versus regression analysis in stature estimation from hand and foot dimensions.

    PubMed

    Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha

    2012-05-01

    Estimation of stature is an important parameter in the identification of human remains in forensic examinations. The present study aims to compare the reliability and accuracy of stature estimation, and to demonstrate the variability between estimated and actual stature, using the multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements (hand length, hand breadth, foot length and foot breadth), taken on the left side of each subject, were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. The derived multiplication factors and regression formulae were applied to the hand and foot measurements in the study sample, and the estimated stature was compared with the actual stature to find the error in estimation. The results indicate that the range of error in stature estimation by the regression analysis method is smaller than that of the multiplication factor method, confirming that regression analysis is better than multiplication factor analysis for stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
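    The two estimators are easy to contrast on synthetic data: a multiplication factor is the mean stature-to-measurement ratio, while regression also fits an intercept, which is what narrows its error range. All numbers below are invented for illustration.

```python
# Sketch: multiplication factor vs linear regression for stature estimation.
import numpy as np

rng = np.random.default_rng(9)
foot = rng.normal(24.5, 1.2, 123)                     # foot length, cm (synthetic)
stature = 5.9 * foot + 20 + rng.normal(0, 3, 123)     # stature, cm (synthetic)

mf = np.mean(stature / foot)                          # multiplication factor
slope, intercept = np.polyfit(foot, stature, 1)       # regression coefficients

err_mf = np.abs(stature - mf * foot)
err_reg = np.abs(stature - (intercept + slope * foot))
print(err_mf.mean(), err_reg.mean())                  # regression error is smaller
```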

  19. Logistic Regression Likelihood Ratio Test Analysis for Detecting Signals of Adverse Events in Post-market Safety Surveillance.

    PubMed

    Nam, Kijoeng; Henderson, Nicholas C; Rohan, Patricia; Woo, Emily Jane; Russek-Cohen, Estelle

    2017-01-01

    The Vaccine Adverse Event Reporting System (VAERS) and other product surveillance systems compile reports of product-associated adverse events (AEs), and these reports may include a wide range of information including age, gender, and concomitant vaccines. Controlling for possible confounding variables such as these is an important task when utilizing surveillance systems to monitor post-market product safety. A common method for handling possible confounders is to compare observed product-AE combinations with adjusted baseline frequencies where the adjustments are made by stratifying on observable characteristics. Though approaches such as these have proven to be useful, in this article we propose a more flexible logistic regression approach which allows for covariates of all types rather than relying solely on stratification. Indeed, a main advantage of our approach is that the general regression framework provides flexibility to incorporate additional information such as demographic factors and concomitant vaccines. As part of our covariate-adjusted method, we outline a procedure for signal detection that accounts for multiple comparisons and controls the overall Type 1 error rate. To demonstrate the effectiveness of our approach, we illustrate our method with an example involving febrile convulsion, and we further evaluate its performance in a series of simulation studies.

  20. Bayesian semi-parametric analysis of Poisson change-point regression models: application to policy making in Cali, Colombia.

    PubMed

    Park, Taeyoung; Krafty, Robert T; Sánchez, Alvaro I

    2012-07-27

    A Poisson regression model with an offset assumes a constant baseline rate after accounting for measured covariates, which may lead to biased estimates of coefficients in an inhomogeneous Poisson process. To correctly estimate the effect of time-dependent covariates, we propose a Poisson change-point regression model with an offset that allows a time-varying baseline rate. When the nonconstant pattern of a log baseline rate is modeled with a nonparametric step function, the resulting semi-parametric model involves a model component of varying dimension and thus requires a sophisticated varying-dimensional inference to obtain correct estimates of model parameters of fixed dimension. To fit the proposed varying-dimensional model, we devise a state-of-the-art MCMC-type algorithm based on partial collapse. The proposed model and methods are used to investigate an association between daily homicide rates in Cali, Colombia and policies that restrict the hours during which the legal sale of alcoholic beverages is permitted. While simultaneously identifying the latent changes in the baseline homicide rate which correspond to the incidence of sociopolitical events, we explore the effect of policies governing the sale of alcohol on homicide rates and seek a policy that balances the economic and cultural dependencies on alcohol sales to the health of the public.

  1. Multicollinearity in spatial genetics: separating the wheat from the chaff using commonality analyses.

    PubMed

    Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C

    2015-01-01

    Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the object of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA allows identifying the location and magnitude of multicollinearity, revealing spurious correlations and thus thoroughly improving the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.
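    For two predictors, commonality analysis needs only the R² values of the full and single-predictor models; the sketch below shows the decomposition into unique and common components on synthetic collinear data.

```python
# Sketch: commonality analysis for two correlated predictors.
import numpy as np

def r2(X, y):
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return 1 - np.var(y - X1 @ beta) / np.var(y)

rng = np.random.default_rng(10)
x1 = rng.normal(size=200)
x2 = 0.7 * x1 + 0.7 * rng.normal(size=200)    # collinear with x1
y = x1 + x2 + rng.normal(size=200)

r2_full = r2(np.column_stack([x1, x2]), y)
u1 = r2_full - r2(x2[:, None], y)             # unique contribution of x1
u2 = r2_full - r2(x1[:, None], y)             # unique contribution of x2
print(u1, u2, r2_full - u1 - u2)              # common (shared) component
```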

  2. The estimated effect of mass or footprint reduction in recent light-duty vehicles on U.S. societal fatality risk per vehicle mile traveled.

    PubMed

    Wenzel, Tom

    2013-10-01

    The National Highway Traffic Safety Administration (NHTSA) recently updated its 2003 and 2010 logistic regression analyses of the effect of a reduction in light-duty vehicle mass on US societal fatality risk per vehicle mile traveled (VMT; Kahane, 2012). Societal fatality risk includes the risk to both the occupants of the case vehicle as well as any crash partner or pedestrians. The current analysis is the most thorough investigation of this issue to date. This paper replicates the Kahane analysis and extends it by testing the sensitivity of his results to changes in the definition of risk, and the data and control variables used in the regression models. An assessment by Lawrence Berkeley National Laboratory (LBNL) indicates that the estimated effect of mass reduction on risk is smaller than in Kahane's previous studies, and is statistically non-significant for all but the lightest cars (Wenzel, 2012a). The estimated effects of a reduction in mass or footprint (i.e. wheelbase times track width) are small relative to other vehicle, driver, and crash variables used in the regression models. The recent historical correlation between mass and footprint is not so large as to prohibit including both variables in the same regression model; excluding footprint from the model, i.e. allowing footprint to decrease with mass, increases the estimated detrimental effect of mass reduction on risk in cars and crossover utility vehicles (CUVs)/minivans, but has virtually no effect on light trucks. Analysis by footprint deciles indicates that risk does not consistently increase with reduced mass for vehicles of similar footprint. Finally, the estimated effects of mass and footprint reduction are sensitive to the measure of exposure used (fatalities per induced exposure crash, rather than per VMT), as well as other changes in the data or control variables used. It appears that the safety penalty from lower mass can be mitigated with careful vehicle design, and that manufacturers can reduce mass as a strategy to increase their vehicles' fuel economy and reduce greenhouse gas emissions without necessarily compromising societal safety. Published by Elsevier Ltd.

  3. G/SPLINES: A hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's genetic algorithm

    NASA Technical Reports Server (NTRS)

    Rogers, David

    1991-01-01

    G/SPLINES are a hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's Genetic Algorithm. In this hybrid, the incremental search is replaced by a genetic search. The G/SPLINE algorithm exhibits performance comparable to that of the MARS algorithm, requires fewer least-squares computations, and allows significantly larger problems to be considered.

  4. Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

    ERIC Educational Resources Information Center

    Williams, Ryan T.

    2012-01-01

    Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…

  5. A Quality Assessment Tool for Non-Specialist Users of Regression Analysis

    ERIC Educational Resources Information Center

    Argyrous, George

    2015-01-01

    This paper illustrates the use of a quality assessment tool for regression analysis. It is designed for non-specialist "consumers" of evidence, such as policy makers. The tool provides a series of questions such consumers of evidence can ask to interrogate regression analysis, and is illustrated with reference to a recent study published…

  6. A retrospective analysis to identify the factors affecting infection in patients undergoing chemotherapy.

    PubMed

    Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung

    2015-12-01

    This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloids and underlying diabetes mellitus. The logistic regression model achieved a sensitivity of 66.7% and a specificity of 88.9%, while the decision tree analysis achieved a sensitivity of 55.0% and a specificity of 89.0%. Overall classification accuracy was 88.0% for logistic regression and 87.2% for decision tree analysis. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy and is therefore concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. REGRESSION ANALYSIS OF SEA-SURFACE-TEMPERATURE PATTERNS FOR THE NORTH PACIFIC OCEAN.

    DTIC Science & Technology

    SEA WATER, *SURFACE TEMPERATURE, *OCEANOGRAPHIC DATA, PACIFIC OCEAN, REGRESSION ANALYSIS, STATISTICAL ANALYSIS, UNDERWATER EQUIPMENT, DETECTION, UNDERWATER COMMUNICATIONS, DISTRIBUTION, THERMAL PROPERTIES, COMPUTERS.

  8. Mixed oxidizer hybrid propulsion system optimization under uncertainty using applied response surface methodology and Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Whitehead, James Joshua

    The analysis documented herein provides an integrated approach for the conduct of optimization under uncertainty (OUU) using Monte Carlo Simulation (MCS) techniques coupled with response surface-based methods for characterization of mixture-dependent variables. This novel methodology provides an innovative means of conducting optimization studies under uncertainty in propulsion system design. Analytic inputs are based upon empirical regression rate information obtained from design of experiments (DOE) mixture studies utilizing a mixed oxidizer hybrid rocket concept. Hybrid fuel regression rate was selected as the target response variable for optimization under uncertainty, with maximization of regression rate chosen as the driving objective. Characteristic operational conditions and propellant mixture compositions from experimental efforts conducted during previous foundational work were combined with elemental uncertainty estimates as input variables. Response surfaces for mixture-dependent variables and their associated uncertainty levels were developed using quadratic response equations incorporating single and two-factor interactions. These analysis inputs, response surface equations and associated uncertainty contributions were applied to a probabilistic MCS to develop dispersed regression rates as a function of operational and mixture input conditions within design space. Illustrative case scenarios were developed and assessed using this analytic approach including fully and partially constrained operational condition sets over all of design mixture space. In addition, optimization sets were performed across an operationally representative region in operational space and across all investigated mixture combinations. These scenarios were selected as representative examples relevant to propulsion system optimization, particularly for hybrid and solid rocket platforms. Ternary diagrams, including contour and surface plots, were developed and utilized to aid in visualization. The concept of Expanded-Durov diagrams was also adopted and adapted to this study to aid in visualization of uncertainty bounds. Regions of maximum regression rate and associated uncertainties were determined for each set of case scenarios. Application of response surface methodology coupled with probabilistic-based MCS allowed for flexible and comprehensive interrogation of mixture and operating design space during optimization cases. Analyses were also conducted to assess sensitivity of uncertainty to variations in key elemental uncertainty estimates. The methodology developed during this research provides an innovative optimization tool for future propulsion design efforts.

  9. EPIBLASTER-fast exhaustive two-locus epistasis detection strategy using graphical processing units

    PubMed Central

    Kam-Thong, Tony; Czamara, Darina; Tsuda, Koji; Borgwardt, Karsten; Lewis, Cathryn M; Erhardt-Lehmann, Angelika; Hemmer, Bernhard; Rieckmann, Peter; Daake, Markus; Weber, Frank; Wolf, Christiane; Ziegler, Andreas; Pütz, Benno; Holsboer, Florian; Schölkopf, Bernhard; Müller-Myhsok, Bertram

    2011-01-01

    Detection of epistatic interaction between loci has been postulated to provide a more in-depth understanding of the complex biological and biochemical pathways underlying human diseases. Studying the interaction between two loci is the natural progression following traditional and well-established single locus analysis. However, the added costs and time duration required for the computation involved have thus far deterred researchers from pursuing a genome-wide analysis of epistasis. In this paper, we propose a method allowing such analysis to be conducted very rapidly. The method, dubbed EPIBLASTER, is applicable to case–control studies and consists of a two-step process in which the difference in Pearson's correlation coefficients is computed between controls and cases across all possible SNP pairs as an indication of significant interaction warranting further analysis. For the subset of interactions deemed potentially significant, a second-stage analysis is performed using the likelihood ratio test from the logistic regression to obtain the P-value for the estimated coefficients of the individual effects and the interaction term. The algorithm is implemented using the parallel computational capability of commercially available graphical processing units to greatly reduce the computation time involved. In the current setup and example data sets (211 cases, 222 controls, 299468 SNPs; and 601 cases, 825 controls, 291095 SNPs), this coefficient evaluation stage can be completed in roughly 1 day. Our method allows for exhaustive and rapid detection of significant SNP pair interactions without imposing significant marginal effects of the single loci involved in the pair. PMID:21150885
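    The first EPIBLASTER stage reduces to two correlation matrices and a difference, which is why it maps so well onto GPUs; the plain-NumPy sketch below mirrors that stage on small synthetic genotype matrices (the original runs on hundreds of thousands of SNPs).

```python
# Sketch: case-control difference in Pearson correlations over all SNP pairs.
import numpy as np

rng = np.random.default_rng(11)
cases = rng.integers(0, 3, size=(211, 500)).astype(float)     # genotype codes 0/1/2
controls = rng.integers(0, 3, size=(222, 500)).astype(float)

def corr(G):
    Z = (G - G.mean(axis=0)) / G.std(axis=0)
    return (Z.T @ Z) / G.shape[0]              # SNP-by-SNP correlation matrix

delta = corr(cases) - corr(controls)
i, j = np.triu_indices(500, k=1)
top = np.argsort(-np.abs(delta[i, j]))[:10]    # pairs passed to the second stage
print(list(zip(i[top].tolist(), j[top].tolist())))
```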

  10. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim: This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background: Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design: Discussion paper. Data sources: English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion: Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for nursing research: Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion: Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  11. The process and utility of classification and regression tree methodology in nursing research.

    PubMed

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-06-01

    This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.

  12. Advantages of the net benefit regression framework for economic evaluations of interventions in the workplace: a case study of the cost-effectiveness of a collaborative mental health care program for people receiving short-term disability benefits for psychiatric disorders.

    PubMed

    Hoch, Jeffrey S; Dewa, Carolyn S

    2014-04-01

    Economic evaluations commonly accompany trials of new treatments or interventions; however, regression methods and their corresponding advantages for the analysis of cost-effectiveness data are not well known. To illustrate regression-based economic evaluation, we present a case study investigating the cost-effectiveness of a collaborative mental health care program for people receiving short-term disability benefits for psychiatric disorders. We implement net benefit regression to illustrate its strengths and limitations. Net benefit regression offers a simple option for cost-effectiveness analyses of person-level data. By placing economic evaluation in a regression framework, regression-based techniques can facilitate the analysis and provide simple solutions to commonly encountered challenges. Economic evaluations of person-level data (eg, from a clinical trial) should use net benefit regression to facilitate analysis and enhance results.
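    The mechanics of net benefit regression fit in a few lines: convert each person's cost and effect into a net benefit at a chosen willingness-to-pay, then regress on the treatment indicator; the coefficient is the incremental net benefit. All data below are synthetic.

```python
# Sketch: net benefit regression on person-level cost-effectiveness data.
import numpy as np

rng = np.random.default_rng(12)
treat = np.repeat([0, 1], 100)                         # control vs intervention
cost = 1000 + 500 * treat + rng.normal(0, 200, 200)
effect = 0.6 + 0.1 * treat + rng.normal(0, 0.2, 200)   # e.g., QALYs

lam = 20000.0                                # willingness-to-pay (assumed)
nb = lam * effect - cost                     # person-level net benefit
X = np.column_stack([np.ones(200), treat])
beta, *_ = np.linalg.lstsq(X, nb, rcond=None)
print(beta[1])                               # incremental net benefit estimate
```

Covariates enter simply by adding columns to X, which is the flexibility the authors highlight.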

  13. Sedimentary sequence evolution in a Foredeep basin: Eastern Venezuela

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bejarano, C.; Funes, D.; Sarzalho, S.

    1996-08-01

    Well log-seismic sequence stratigraphy analysis in the Eastern Venezuela Foreland Basin leads to the study of the evolution of sedimentary sequences onto the Cretaceous-Paleocene passive margin. This basin comprises two different foredeep sub-basins: the Guarico sub-basin to the west, which is older, and the Maturin sub-basin to the east, which is younger. A foredeep switching between these two sub-basins is observed at 12.5 m.y. Seismic interpretation and well log sections across the study area show sedimentary sequences with transgressive sands and coastal onlaps to the east-southeast for the Guarico sub-basin, as well as truncations below the switching sequence (12.5 m.y.); the Maturin sub-basin shows apparent coastal onlaps to the west-northwest, as well as a marine onlap (deeper water) in the west, where it starts to establish. Sequence stratigraphy analysis of these sequences with well logs allowed the study of the evolution of the stratigraphic section from the Paleocene to the middle Miocene (68.0-12.0 m.y.). On the basis of well log patterns, the sequences were divided into regressive-transgressive-regressive sedimentary cycles caused by changes in relative sea level. Facies distributions were analyzed and the sequences were divided into simple sequences or sub-sequences of greater frequency than third-order depositional sequences.

  14. Phytotoxicity and accumulation of chromium in carrot plants and the derivation of soil thresholds for Chinese soils.

    PubMed

    Ding, Changfeng; Li, Xiaogang; Zhang, Taolin; Ma, Yibing; Wang, Xingxiang

    2014-10-01

    Soil environmental quality standards for heavy metals in farmland should be established considering both their effects on crop yield and their accumulation in the edible part. A greenhouse experiment was conducted to investigate the effects of chromium (Cr) on biomass production and Cr accumulation in carrot plants grown in a wide range of soils. The results revealed that carrot yield significantly decreased in 18 of the 20 soils when Cr was added at the level of the soil environmental quality standard of China. The Cr content of carrots grown in the five soils with pH > 8.0 exceeded the maximum allowable level (0.5 mg kg⁻¹) according to the Chinese General Standard for Contaminants in Foods. The relationship between carrot Cr concentration and soil pH was well fitted (R² = 0.70, P < 0.0001) by a linear-linear segmented regression model. The addition of Cr to soil affected carrot yield before food quality. The major soil factors controlling Cr phytotoxicity were identified, and prediction models were developed, using path analysis and stepwise multiple linear regression analysis. Soil Cr thresholds for phytotoxicity that also ensure food safety were then derived on the basis of a 10% yield reduction. Copyright © 2014 Elsevier Inc. All rights reserved.

  15. Effects of homogenization process parameters on physicochemical properties of astaxanthin nanodispersions prepared using a solvent-diffusion technique

    PubMed Central

    Anarjan, Navideh; Jafarizadeh-Malmiri, Hoda; Nehdi, Imededdine Arbi; Sbihi, Hassen Mohamed; Al-Resayes, Saud Ibrahim; Tan, Chin Ping

    2015-01-01

    Nanodispersion systems allow incorporation of lipophilic bioactives, such as astaxanthin (a fat soluble carotenoid) into aqueous systems, which can improve their solubility, bioavailability, and stability, and widen their uses in water-based pharmaceutical and food products. In this study, response surface methodology was used to investigate the influences of homogenization time (0.5–20 minutes) and speed (1,000–9,000 rpm) in the formation of astaxanthin nanodispersions via the solvent-diffusion process. The product was characterized for particle size and astaxanthin concentration using laser diffraction particle size analysis and high performance liquid chromatography, respectively. Relatively high determination coefficients (ranging from 0.896 to 0.969) were obtained for all suggested polynomial regression models. The overall optimal homogenization conditions were determined by multiple response optimization analysis to be 6,000 rpm for 7 minutes. In vitro cellular uptake of astaxanthin from the suggested individual and multiple optimized astaxanthin nanodispersions was also evaluated. The cellular uptake of astaxanthin was found to be considerably increased (by more than five times) as it became incorporated into optimum nanodispersion systems. The lack of a significant difference between predicted and experimental values confirms the suitability of the regression equations connecting the response variables studied to the independent parameters. PMID:25709435

  16. A simplified calculation procedure for mass isotopomer distribution analysis (MIDA) based on multiple linear regression.

    PubMed

    Fernández-Fernández, Mario; Rodríguez-González, Pablo; García Alonso, J Ignacio

    2016-10-01

    We have developed a novel, rapid and easy calculation procedure for Mass Isotopomer Distribution Analysis based on multiple linear regression which allows the simultaneous calculation of the precursor pool enrichment and the fraction of newly synthesized labelled proteins (fractional synthesis) using linear algebra. To test this approach, we used the peptide RGGGLK as a model tryptic peptide containing three subunits of glycine. We selected glycine labelled in two ¹³C atoms (¹³C₂-glycine) as the labelled amino acid to demonstrate that spectral overlap is not a problem in the proposed methodology. The developed methodology was tested first in vitro by changing the precursor pool enrichment from 10 to 40% ¹³C₂-glycine. Secondly, a simulated in vivo synthesis of proteins was designed by combining the natural-abundance RGGGLK peptide and 10 or 20% ¹³C₂-glycine at 1:1, 1:3 and 3:1 ratios. Precursor pool enrichments and fractional synthesis values were calculated with satisfactory precision and accuracy using a simple spreadsheet. This novel approach can provide a relatively rapid and easy means to measure protein turnover based on stable isotope tracers. Copyright © 2016 John Wiley & Sons, Ltd.
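    A stripped-down version of the linear-algebra step is shown below for the three-glycine peptide: with the precursor enrichment fixed, the measured isotopomer pattern is a linear mixture of an unlabelled pattern and a binomial labelled pattern, and the fractional synthesis falls out of a least-squares fit. The paper's full procedure also recovers the enrichment itself; that part is omitted here.

```python
# Sketch: fractional synthesis from an isotopomer pattern by linear regression,
# assuming the precursor pool enrichment p is known (a simplification).
import numpy as np
from scipy.stats import binom

p = 0.20                                    # precursor enrichment (assumed known)
labelled = binom.pmf(np.arange(4), 3, p)    # P(k of 3 glycines labelled)
unlabelled = np.array([1.0, 0.0, 0.0, 0.0]) # pre-existing protein pattern

f_true = 0.40                               # simulated fractional synthesis
measured = f_true * labelled + (1 - f_true) * unlabelled
measured += np.random.default_rng(13).normal(0, 0.002, 4)   # spectral noise

A = np.column_stack([labelled, unlabelled])
coef, *_ = np.linalg.lstsq(A, measured, rcond=None)
print(coef[0])                              # recovered f, close to 0.40
```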

  17. Non-Intrusive Measurement Techniques Applied to the Hybrid Solid Fuel Degradation

    NASA Astrophysics Data System (ADS)

    Cauty, F.

    2004-10-01

    Knowledge of the solid fuel regression rate and the time evolution of the grain geometry is required for hybrid motor design and for control of its operating conditions. Two non-intrusive techniques (NDT) have been applied to hybrid propulsion; both are based on wave propagation through the materials: X-rays and ultrasound. X-ray techniques allow local thickness measurements (attenuated signal level) using small probes, or 2D images (Real Time Radiography), with a trade-off between the size of the field of view and accuracy. Besides the safety hazards associated with high-intensity X-ray systems, the image analysis requires quite complex post-processing techniques. The ultrasound technique is more widely used in energetic material applications, including hybrid fuels. Depending upon the transducer size and the associated equipment, the application domain is large, from tiny samples to the quad-port wagon-wheel grain of the 1.1 MN thrust HPDP motor. The effect of the physical quantities has to be taken into account in the wave propagation analysis. With respect to the various applications, there is no unique and perfect experimental method to measure the fuel regression rate; the best solution could be obtained by combining two techniques at the same time, each technique enhancing the quality of the global data.

  18. CADDIS Volume 4. Data Analysis: Basic Analyses

    EPA Pesticide Factsheets

    Use of statistical tests to determine whether an observation is outside the normal range of expected values. Covers details of CART, regression analysis, the use of quantile regression analysis, CART in causal analysis, and simplifying or pruning the resulting trees.

  19. ℓ(p)-Norm multikernel learning approach for stock market price forecasting.

    PubMed

    Shao, Xigao; Wu, Kun; Liao, Bifeng

    2012-01-01

    Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ(1)-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ(p)-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than ℓ(1)-norm multiple support vector regression model.
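    A simplified sketch of the interleaved optimization idea, assuming the closed-form kernel-weight update of Kloft et al. for ℓp-norm MKL and scikit-learn's SVR with precomputed kernels; the data, kernels, and hyperparameters are illustrative, and the paper's actual decomposition strategy may differ:

      # Sketch: interleaved optimization for l_p-norm multiple kernel SVR.
      import numpy as np
      from sklearn.svm import SVR

      rng = np.random.default_rng(0)
      X = rng.normal(size=(80, 3))
      y = X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=80)

      def rbf(X, gamma):
          d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
          return np.exp(-gamma * d2)

      kernels = [rbf(X, g) for g in (0.1, 1.0, 10.0)]   # candidate base kernels
      p = 2.0
      beta = np.full(len(kernels), (1.0 / len(kernels)) ** (1.0 / p))

      for _ in range(10):                                # interleaved updates
          K = sum(b * Kk for b, Kk in zip(beta, kernels))
          svr = SVR(kernel="precomputed", C=10.0).fit(K, y)
          a = svr.dual_coef_.ravel()                     # signed dual coefficients
          idx = svr.support_
          # squared block norms ||w_k||^2 = beta_k^2 * a' K_k a on support vectors
          w2 = np.array([b**2 * a @ Kk[np.ix_(idx, idx)] @ a
                         for b, Kk in zip(beta, kernels)])
          beta = w2 ** (1.0 / (p + 1))
          beta /= (beta ** p).sum() ** (1.0 / p)         # enforce ||beta||_p = 1

      print(beta)   # learned kernel weights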

  20. An interactive website for analytical method comparison and bias estimation.

    PubMed

    Bahar, Burak; Tuncel, Ayse F; Holmes, Earle W; Holmes, Daniel T

    2017-12-01

    Regulatory standards mandate laboratories to perform studies to ensure accuracy and reliability of their test results. Method comparison and bias estimation are important components of these studies. We developed an interactive website for evaluating the relative performance of two analytical methods using R programming language tools. The website can be accessed at https://bahar.shinyapps.io/method_compare/. The site has an easy-to-use interface that allows both copy-pasting and manual entry of data. It also allows selection of a regression model and creation of regression and difference plots. Available regression models include Ordinary Least Squares, Weighted-Ordinary Least Squares, Deming, Weighted-Deming, Passing-Bablok and Passing-Bablok for large datasets. The server processes the data and generates downloadable reports in PDF or HTML format. Our website provides clinical laboratories a practical way to assess the relative performance of two analytical methods. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
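    As an illustration of one of the listed models, a minimal Deming regression can be written in a few lines; this sketch assumes equal error variances for the two methods (lambda = 1) and uses made-up paired measurements, not the site's implementation:

      # Sketch: Deming regression for method comparison (lambda = 1).
      import numpy as np

      def deming(x, y, lam=1.0):
          mx, my = x.mean(), y.mean()
          sxx = ((x - mx) ** 2).sum()
          syy = ((y - my) ** 2).sum()
          sxy = ((x - mx) * (y - my)).sum()
          slope = (syy - lam * sxx + np.sqrt((syy - lam * sxx) ** 2
                   + 4 * lam * sxy ** 2)) / (2 * sxy)
          return slope, my - slope * mx   # slope, intercept

      x = np.array([1.0, 2.1, 3.0, 4.2, 5.1, 6.0])   # method 1 results
      y = np.array([1.1, 2.0, 3.2, 4.1, 5.3, 6.2])   # method 2 results
      print(deming(x, y))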

  1. Socioeconomic Predictors of the Employment of Migrant Care Workers by Italian Families Assisting Older Alzheimer's Disease Patients: Evidence From the Up-Tech Study.

    PubMed

    Barbabella, Francesco; Chiatti, Carlos; Rimland, Joseph M; Melchiorre, Maria Gabriella; Lamura, Giovanni; Lattanzio, Fabrizia

    2016-05-01

    The availability of family caregivers of older people is decreasing in Italy as the number of migrant care workers (MCWs) hired by families increases. There is little evidence on the influence of socioeconomic factors in the employment of MCWs. We analyzed baseline data from 438 older people with moderate Alzheimer's disease (AD) and their family caregivers enrolled in the Up-Tech trial. We used bivariate analysis and multilevel regressions to investigate the association between independent variables (education, social class, and the availability of a care allowance) and three outcomes: employment of an MCW, hours of care provided by the primary family caregiver, and hours of care provided by the family network (primary and other family caregivers). The availability of a care allowance and the educational level were independently associated with employing MCWs. A significant interaction between education and care allowance was found, suggesting that more educated families are more likely to spend the care allowance to hire an MCW. Socioeconomic inequalities negatively influenced access both to private care and to the care allowance, leading disadvantaged families to directly provide more assistance to AD patients. Care allowance entitlement needs to be reformed in Italy and in countries with similar long-term care and migration systems. © The Author 2015. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  2. An Investigation of the Effects of Reader Characteristics on Reading Comprehension Of a General Chemistry Text

    NASA Astrophysics Data System (ADS)

    Neiles, Kelly Y.

    There is great concern in the scientific community that students in the United States, when compared with other countries, are falling behind in their scientific achievement. Increasing students' reading comprehension of scientific text may be one of the components involved in students' science achievement. To investigate students' reading comprehension, this quantitative study examined the effects of different reader characteristics, namely students' logical reasoning ability, factual chemistry knowledge, working memory capacity, and schema of the chemistry concepts, on reading comprehension of a chemistry text. Students' reading comprehension was measured through their ability to encode the text, access the meanings of words (lexical access), make bridging and elaborative inferences, and integrate the text with their existing schemas to make a lasting mental representation of the text (situational model). Students completed a series of tasks that measured the reader characteristic and reading comprehension variables. Some of the variables were measured using new technologies and software to investigate different cognitive processes, including eye tracking to investigate students' lexical accessing and a Pathfinder program to investigate students' schema of the chemistry concepts. The results from this study were analyzed using canonical correlation and regression analysis. The canonical correlation analysis allows the ten variables described previously to be included in one multivariate analysis. Results indicate that the relationship between the reader characteristic variables and the reading comprehension variables is significant. The resulting canonical function accounts for a greater amount of variance in students' responses than any individual variable. Regression analysis was used to further investigate which reader characteristic variables accounted for the differences in students' responses for each reading comprehension variable. The results from this regression analysis indicated that the two schema measures (measured by the Pathfinder program) accounted for the greatest amount of variance in four of the reading comprehension variables (encoding the text, bridging and elaborative inferences, and delayed recall of a general summary). This research suggests that providing students with background information on chemistry concepts prior to having them read the text may result in better understanding and more effective incorporation of the chemistry concepts into their schema.
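    A minimal sketch of a canonical correlation analysis of this kind, using scikit-learn; the two simulated blocks stand in for the reader-characteristic and reading-comprehension variables:

      # Sketch: canonical correlation between two variable blocks.
      import numpy as np
      from sklearn.cross_decomposition import CCA

      rng = np.random.default_rng(1)
      latent = rng.normal(size=(100, 1))                 # shared underlying ability
      X = latent + 0.5 * rng.normal(size=(100, 4))       # e.g., reasoning, knowledge, memory, schema
      Y = latent + 0.5 * rng.normal(size=(100, 3))       # e.g., encoding, inference, recall

      cca = CCA(n_components=1).fit(X, Y)
      U, V = cca.transform(X, Y)
      print(np.corrcoef(U[:, 0], V[:, 0])[0, 1])         # first canonical correlation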

  3. Pelvic incidence: a predictive factor for three-dimensional acetabular orientation-a preliminary study.

    PubMed

    Boulay, Christophe; Bollini, Gérard; Legaye, Jean; Tardieu, Christine; Prat-Pradal, Dominique; Chabrol, Brigitte; Jouve, Jean-Luc; Duval-Beaupère, Ginette; Pélissier, Jacques

    2014-01-01

    Acetabular cup orientation (inclination and anteversion) is a fundamental topic in orthopaedics and depends on pelvis tilt (a positional parameter), emphasising the notion of a safe range of pelvis tilt. The hypothesis was that pelvic incidence (a morphologic parameter) could yield a more accurate and reliable assessment than pelvis tilt. The aim was to find a predictive equation for the acetabular 3D orientation parameters, with pelvic incidence as the determining variable in the model. The second aim was to consider the asymmetry between the right and left acetabulae. Twelve pelvic anatomic specimens were measured with an electromagnetic Fastrak system (Polhemus Society), providing the 3D positions of anatomical landmarks to allow measurement of acetabular and pelvic parameters. Acetabulum and pelvis data were correlated using a Spearman matrix. A robust linear regression analysis provided predictions of the acetabulum axes. The orientation of each acetabulum could be predicted by the incidence. The incidence is correlated with the morphology of the acetabula. The asymmetry of the acetabular roof was correlated with pelvic incidence. This study allowed analysis of the relationships between acetabular orientation and pelvic incidence. Pelvic incidence (a morphologic parameter) could determine the safe range of pelvis tilt (a positional parameter) for an individual rather than a group.

  4. Pelvic Incidence: A Predictive Factor for Three-Dimensional Acetabular Orientation—A Preliminary Study

    PubMed Central

    Bollini, Gérard; Legaye, Jean; Tardieu, Christine; Prat-Pradal, Dominique; Chabrol, Brigitte; Jouve, Jean-Luc; Duval-Beaupère, Ginette; Pélissier, Jacques

    2014-01-01

    Acetabular cup orientation (inclination and anteversion) is a fundamental topic in orthopaedics and depends on pelvis tilt (a positional parameter), emphasising the notion of a safe range of pelvis tilt. The hypothesis was that pelvic incidence (a morphologic parameter) could yield a more accurate and reliable assessment than pelvis tilt. The aim was to find a predictive equation for the acetabular 3D orientation parameters, with pelvic incidence as the determining variable in the model. The second aim was to consider the asymmetry between the right and left acetabulae. Twelve pelvic anatomic specimens were measured with an electromagnetic Fastrak system (Polhemus Society), providing the 3D positions of anatomical landmarks to allow measurement of acetabular and pelvic parameters. Acetabulum and pelvis data were correlated using a Spearman matrix. A robust linear regression analysis provided predictions of the acetabulum axes. The orientation of each acetabulum could be predicted by the incidence. The incidence is correlated with the morphology of the acetabula. The asymmetry of the acetabular roof was correlated with pelvic incidence. This study allowed analysis of the relationships between acetabular orientation and pelvic incidence. Pelvic incidence (a morphologic parameter) could determine the safe range of pelvis tilt (a positional parameter) for an individual rather than a group. PMID:25006461

  5. A Note on the Relationship between the Number of Indicators and Their Reliability in Detecting Regression Coefficients in Latent Regression Analysis

    ERIC Educational Resources Information Center

    Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.

    2004-01-01

    We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…

  6. Accounting for standard errors of vision-specific latent trait in regression models.

    PubMed

    Wong, Wan Ling; Li, Xiang; Li, Jialiang; Wong, Tien Yin; Cheng, Ching-Yu; Lamoureux, Ecosse L

    2014-07-11

    To demonstrate the effectiveness of a Hierarchical Bayesian (HB) approach in a modeling framework for association effects that accounts for the SEs of vision-specific latent traits assessed using Rasch analysis. A systematic literature review was conducted in four major ophthalmic journals to evaluate Rasch analysis performed on vision-specific instruments. The HB approach was used to synthesize the Rasch model and the multiple linear regression model for the assessment of association effects related to vision-specific latent traits. This novel HB one-stage "joint-analysis" approach allows all model parameters to be estimated simultaneously; in our simulation study, its effectiveness was compared with the frequently used two-stage "separate-analysis" approach (Rasch analysis followed by traditional statistical analyses without adjustment for the SE of the latent trait). Sixty-six reviewed articles performed evaluation and validation of vision-specific instruments using Rasch analysis, and 86.4% (n = 57) performed further statistical analyses on the Rasch-scaled data using traditional statistical methods; none took into consideration the SEs of the estimated Rasch-scaled scores. On real data, the two models differed in effect size estimations and in the identification of "independent risk factors." Simulation results showed that our proposed HB one-stage "joint-analysis" approach produces greater accuracy (an average 5-fold decrease in bias) with comparable power and precision in the estimation of associations when compared with the frequently used two-stage "separate-analysis" procedure, despite accounting for greater uncertainty due to the latent trait. Analyses of patient-reported data using Rasch analysis techniques do not take into account the SE of the latent trait in association analyses. The HB one-stage "joint-analysis" is a better approach, producing accurate effect size estimations and information about the independent association of exposure variables with vision-specific latent traits. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  7. Compressive strength of human openwedges: a selection method

    NASA Astrophysics Data System (ADS)

    Follet, H.; Gotteland, M.; Bardonnet, R.; Sfarghiu, A. M.; Peyrot, J.; Rumelhart, C.

    2004-02-01

    A series of 44 samples of bone wedges of human origin, intended for allograft openwedge osteotomy and obtained without particular precautions during hip arthroplasty, were re-examined. After chemical treatment for viral inactivation, lyophilisation and radio-sterilisation (intended to produce optimal health safety), the compressive strength, independent of age, sex and the height of the sample (or angle of cut), proved to be too widely dispersed [10–158 MPa] in the first study. We propose a method for selecting samples which takes into account their geometry (width, length, thicknesses, cortical surface area). Statistical methods (principal components analysis (PCA), hierarchical cluster analysis, multilinear regression) allowed final selection of 29 samples having a mean compressive strength σmax = 103 ± 26 MPa, with a range of [61–158 MPa]. These results are equivalent to or greater than those of materials currently used in openwedge osteotomy.

  8. Time Series Expression Analyses Using RNA-seq: A Statistical Approach

    PubMed Central

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021
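    Of the approaches listed, the autoregressive time-lagged regression is the simplest to sketch: expression at each time point is regressed on its predecessor. The counts below are invented for a single gene:

      # Sketch: AR(1) time-lagged regression for one gene's time course.
      import numpy as np
      import statsmodels.api as sm

      counts = np.array([120, 150, 210, 260, 250, 300, 340, 330])  # 8 time points
      y = np.log2(counts + 1)

      y_t, y_lag = y[1:], y[:-1]            # pair each time point with its predecessor
      fit = sm.OLS(y_t, sm.add_constant(y_lag)).fit()
      print(fit.params)     # intercept and AR(1) coefficient
      print(fit.pvalues)    # evidence for time dependence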

  9. Methodological Issues in Design and Analysis of a Matched Case-Control Study of a Vaccine’s Effectiveness

    PubMed Central

    Niccolai, Linda M.; Ogden, Lorraine G.; Muehlenbein, Catherine E.; Dziura, James D.; Vázquez, Marietta; Shapiro, Eugene D.

    2007-01-01

    Objective: Case-control studies of the effectiveness of a vaccine are useful to answer important questions, such as the effectiveness of a vaccine over time, that usually are not addressed by pre-licensure clinical trials of the vaccine’s efficacy. This report describes methodological issues related to design and analysis that were used to determine the effects of time since vaccination and age at the time of vaccination. Study Design and Setting: A matched case-control study of the effectiveness of varicella vaccine. Results: Sampling procedures and conditional logistic regression models including interaction terms are described. Conclusion: Use of these methods will allow investigators to assess the effects of a wide range of variables, such as time since vaccination and age at the time of vaccination, on the effectiveness of a vaccine. PMID:17938054
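    A hedged sketch of a conditional logistic regression with an interaction term for matched sets, assuming statsmodels' ConditionalLogit class (present in recent statsmodels releases); the matched-set data are simulated stand-ins, not the varicella study's data:

      # Sketch: conditional logistic regression on matched case-control sets.
      import numpy as np
      import pandas as pd
      from statsmodels.discrete.conditional_models import ConditionalLogit

      rng = np.random.default_rng(2)
      n_sets = 60                                    # matched sets: 1 case + 2 controls
      df = pd.DataFrame({
          "set_id": np.repeat(np.arange(n_sets), 3),
          "case": np.tile([1, 0, 0], n_sets),
          "vaccinated": rng.integers(0, 2, 3 * n_sets),
          "years_since_vacc": rng.uniform(0, 5, 3 * n_sets),
      })
      df["vacc_x_years"] = df["vaccinated"] * df["years_since_vacc"]  # interaction term

      model = ConditionalLogit(
          df["case"],
          df[["vaccinated", "years_since_vacc", "vacc_x_years"]],
          groups=df["set_id"],
      )
      print(model.fit().summary())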

  10. Time series expression analyses using RNA-seq: a statistical approach.

    PubMed

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.

  11. Comparing least-squares and quantile regression approaches to analyzing median hospital charges.

    PubMed

    Olsen, Cody S; Clark, Amy E; Thomas, Andrea M; Cook, Lawrence J

    2012-07-01

    Emergency department (ED) and hospital charges obtained from administrative data sets are useful descriptors of injury severity and the burden to EDs and the health care system. However, charges are typically positively skewed due to costly procedures, long hospital stays, and complicated or prolonged treatment for few patients. The median is not affected by extreme observations and is useful in describing and comparing distributions of hospital charges. A least-squares analysis employing a log transformation is one approach for estimating median hospital charges, corresponding confidence intervals (CIs), and differences between groups; however, this method requires certain distributional properties. An alternate method is quantile regression, which allows estimation and inference related to the median without making distributional assumptions. The objective was to compare the log-transformation least-squares method to the quantile regression approach for estimating median hospital charges, differences in median charges between groups, and associated CIs. The authors performed simulations using repeated sampling of observed statewide ED and hospital charges and charges randomly generated from a hypothetical lognormal distribution. The median and 95% CI and the multiplicative difference between the median charges of two groups were estimated using both least-squares and quantile regression methods. Performance of the two methods was evaluated. In contrast to least squares, quantile regression produced estimates that were unbiased and had smaller mean square errors in simulations of observed ED and hospital charges. Both methods performed well in simulations of hypothetical charges that met least-squares method assumptions. When the data did not follow the assumed distribution, least-squares estimates were often biased, and the associated CIs had lower than expected coverage as sample size increased. Quantile regression analyses of hospital charges provide unbiased estimates even when lognormal and equal variance assumptions are violated. These methods may be particularly useful in describing and analyzing hospital charges from administrative data sets. © 2012 by the Society for Academic Emergency Medicine.
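    A minimal sketch of the quantile-regression side of this comparison, using statsmodels' quantreg at q = 0.5 on log charges; the charge data are simulated from a lognormal-like model, and exponentiating the group coefficient gives the multiplicative difference in medians:

      # Sketch: median regression of skewed charges on a group indicator.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(3)
      n = 500
      group = rng.integers(0, 2, n)
      charges = np.exp(7 + 0.3 * group + rng.normal(0, 1.2, n))   # skewed charges

      df = pd.DataFrame({"log_charge": np.log(charges), "group": group})
      fit = smf.quantreg("log_charge ~ group", df).fit(q=0.5)     # median regression
      print(np.exp(fit.params["group"]))   # multiplicative difference in median charges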

  12. Flood-frequency prediction methods for unregulated streams of Tennessee, 2000

    USGS Publications Warehouse

    Law, George S.; Tasker, Gary D.

    2003-01-01

    Up-to-date flood-frequency prediction methods for unregulated, ungaged rivers and streams of Tennessee have been developed. Prediction methods include the regional-regression method and the newer region-of-influence method. The prediction methods were developed using stream-gage records from unregulated streams draining basins having from 1 percent to about 30 percent total impervious area. These methods, however, should not be used in heavily developed or storm-sewered basins with impervious areas greater than 10 percent. The methods can be used to estimate 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence-interval floods of most unregulated rural streams in Tennessee. A computer application was developed that automates the calculation of flood frequency for unregulated, ungaged rivers and streams of Tennessee. Regional-regression equations were derived by using both single-variable and multivariable regional-regression analysis. Contributing drainage area is the explanatory variable used in the single-variable equations. Contributing drainage area, main-channel slope, and a climate factor are the explanatory variables used in the multivariable equations. Deleted-residual standard error for the single-variable equations ranged from 32 to 65 percent. Deleted-residual standard error for the multivariable equations ranged from 31 to 63 percent. These equations are included in the computer application to allow easy comparison of results produced by the different methods. The region-of-influence method calculates multivariable regression equations for each ungaged site and recurrence interval using basin characteristics from 60 similar sites selected from the study area. Explanatory variables that may be used in regression equations computed by the region-of-influence method include contributing drainage area, main-channel slope, a climate factor, and a physiographic-region factor. Deleted-residual standard error for the region-of-influence method tended to be only slightly smaller than those for the regional-regression method and ranged from 27 to 62 percent.
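    The single-variable regional-regression form reduces to a log-log fit of a flood quantile on contributing drainage area; a sketch with invented basin data, not the Tennessee equations themselves:

      # Sketch: log10(Q100) = a + b * log10(drainage area), fitted across basins.
      import numpy as np

      area = np.array([12, 45, 88, 150, 320, 610, 1200])          # mi^2, illustrative
      q100 = np.array([900, 2600, 4400, 6800, 12000, 20000, 34000])  # ft^3/s, illustrative

      b, a = np.polyfit(np.log10(area), np.log10(q100), 1)        # slope, intercept
      print(a, b)                                                 # regression constants
      print(10 ** (a + b * np.log10(200)))   # prediction at an ungaged 200 mi^2 basin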

  13. A quantile regression approach can reveal the effect of fruit and vegetable consumption on plasma homocysteine levels.

    PubMed

    Verly, Eliseu; Steluti, Josiane; Fisberg, Regina Mara; Marchioni, Dirce Maria Lobo

    2014-01-01

    A reduction in homocysteine concentration due to the use of supplemental folic acid is well recognized, although evidence of the same effect for natural folate sources, such as fruits and vegetables (FV), is lacking, and traditional statistical analysis approaches do not provide further information. As an alternative, quantile regression allows for the exploration of the effects of covariates through percentiles of the conditional distribution of the dependent variable. The aim was to investigate how the associations of FV intake with plasma total homocysteine (tHcy) differ through percentiles of the distribution using quantile regression. A cross-sectional population-based survey was conducted among 499 residents of Sao Paulo City, Brazil. The participants provided food intake and fasting blood samples. Fruit and vegetable intake was predicted by adjusting for day-to-day variation using a proper measurement error model. We performed a quantile regression to verify the association between tHcy and the predicted FV intake. The predicted values of tHcy for each percentile model were calculated considering an increase of 200 g in FV intake for each percentile. The results showed that tHcy was inversely associated with FV intake when assessed by linear regression, whereas the association differed across percentiles when using quantile regression: the relationship with FV consumption was inverse and significant for almost all percentiles of tHcy, and the coefficients increased as the percentile of tHcy increased. A simulated increase of 200 g in FV intake could decrease the tHcy levels across the percentiles, but the higher percentiles of tHcy benefited more. Using this innovative statistical approach, the study confirms that the effect of FV intake on lowering tHcy levels depends on the level of tHcy. From a public health point of view, encouraging people to increase FV intake would benefit people with high levels of tHcy.

  14. The Application of Censored Regression Models in Low Streamflow Analyses

    NASA Astrophysics Data System (ADS)

    Kroll, C.; Luz, J.

    2003-12-01

    Estimation of low streamflow statistics at gauged and ungauged river sites is often a daunting task. This process is further confounded by the presence of intermittent streams, where streamflow within a region is sometimes reported as zero. Streamflows recorded as zero may be zero, or may be less than the measurement detection limit. Such data are often referred to as censored data. Numerous methods have been developed to characterize intermittent streamflow series. Logit regression has been proposed to develop regional models of the probability that annual lowflow series (such as 7-day lowflows) are zero. In addition, Tobit regression, a method of regression that allows for censored dependent variables, has been proposed for lowflow regional regression models in regions where the lowflow statistic of interest is estimated as zero at some sites. While these methods have been proposed, their use in practice has been limited. Here a delete-one jackknife simulation is presented to examine the performance of Logit and Tobit models of 7-day annual minimum flows in 6 USGS water resource regions in the United States. For the Logit model, an assessment is made of whether sites are correctly classified as having at least 10% of 7-day annual lowflows equal to zero. In such a situation, the 7-day, 10-year lowflow (Q710), a commonly employed low streamflow statistic, would be reported as zero. For the Tobit model, a comparison is made between results from the Tobit model, and from performing either ordinary least squares (OLS) or principal component regression (PCR) after the zero sites are dropped from the analysis. Initial results for the Logit model indicate that this method has a high probability of correctly classifying sites into groups with zero and non-zero Q710s. Initial results also indicate the Tobit model produces better results than PCR and OLS when more than 5% of the sites in the region have Q710 values calculated as zero.
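    Since Tobit regression is not built into the common Python statistics packages, a sketch of its censored-likelihood fit can be written directly with scipy; the lowflow data below are simulated, with left-censoring at zero:

      # Sketch: Tobit (left-censored at zero) regression by maximum likelihood.
      import numpy as np
      from scipy import optimize, stats

      rng = np.random.default_rng(4)
      n = 200
      x = rng.normal(size=n)                      # e.g., standardized log drainage area
      y_star = 0.5 + 1.2 * x + rng.normal(0, 1, n)
      y = np.maximum(y_star, 0.0)                 # observed lowflow, censored at zero

      def neg_loglik(theta):
          b0, b1, log_s = theta
          s = np.exp(log_s)
          mu = b0 + b1 * x
          cens = y <= 0
          ll = np.where(cens,
                        stats.norm.logcdf(-mu / s),        # P(y* <= 0) for censored sites
                        stats.norm.logpdf(y, mu, s))       # density for observed flows
          return -ll.sum()

      res = optimize.minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
      print(res.x[:2], np.exp(res.x[2]))   # coefficients and sigma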

  15. Moderation analysis using a two-level regression model.

    PubMed

    Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

    2014-10-01

    Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
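    For reference, the least-squares MMR baseline the paper compares against is a one-line formula fit with a product term; the data here are simulated with a genuine moderation effect:

      # Sketch: moderated multiple regression (MMR) with a product term.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(5)
      n = 300
      x = rng.normal(size=n)                 # predictor (e.g., years of education)
      z = rng.integers(0, 2, n)              # moderator (e.g., group membership)
      y = 1 + 0.5 * x + 0.2 * z + 0.6 * x * z + rng.normal(size=n)

      df = pd.DataFrame({"y": y, "x": x, "z": z})
      fit = smf.ols("y ~ x * z", df).fit()   # expands to x + z + x:z
      print(fit.params["x:z"], fit.pvalues["x:z"])   # the moderation effect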

  16. Classification and regression tree (CART) analysis of endometrial carcinoma: Seeing the forest for the trees.

    PubMed

    Barlin, Joyce N; Zhou, Qin; St Clair, Caryn M; Iasonos, Alexia; Soslow, Robert A; Alektiar, Kaled M; Hensley, Martee L; Leitao, Mario M; Barakat, Richard R; Abu-Rustum, Nadeem R

    2013-09-01

    The objectives of the study are to evaluate which clinicopathologic factors influenced overall survival (OS) in endometrial carcinoma and to determine if the surgical effort to assess para-aortic (PA) lymph nodes (LNs) at initial staging surgery impacts OS. All patients diagnosed with endometrial cancer from 1/1993-12/2011 who had LNs excised were included. PALN assessment was defined by the identification of one or more PALNs on final pathology. A multivariate analysis was performed to assess the effect of PALNs on OS. A form of recursive partitioning called classification and regression tree (CART) analysis was implemented. Variables included: age, stage, tumor subtype, grade, myometrial invasion, total LNs removed, evaluation of PALNs, and adjuvant chemotherapy. The cohort included 1920 patients, with a median age of 62 years. The median number of LNs removed was 16 (range, 1-99). The removal of PALNs was not associated with OS (P=0.450). Using the CART hierarchically, stage I vs. stages II-IV and grades 1-2 vs. grade 3 emerged as predictors of OS. If the tree was allowed to grow, further branching was based on age and myometrial invasion. Total number of LNs removed and assessment of PALNs as defined in this study were not predictive of OS. This innovative CART analysis emphasized the importance of proper stage assignment and a binary grading system in impacting OS. Notably, the total number of LNs removed and specific evaluation of PALNs as defined in this study were not important predictors of OS. Copyright © 2013 Elsevier Inc. All rights reserved.
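    A generic sketch of a CART fit of the kind described, using scikit-learn's recursive partitioning; the cohort variables and outcome are simulated stand-ins, with survival driven mainly by stage and grade to echo the reported hierarchy:

      # Sketch: classification and regression tree on clinicopathologic predictors.
      import numpy as np
      import pandas as pd
      from sklearn.tree import DecisionTreeClassifier, export_text

      rng = np.random.default_rng(6)
      n = 500
      df = pd.DataFrame({
          "age": rng.integers(40, 85, n),
          "stage_I": rng.integers(0, 2, n),        # stage I vs. stages II-IV
          "low_grade": rng.integers(0, 2, n),      # grades 1-2 vs. grade 3
          "nodes_removed": rng.integers(1, 40, n),
      })
      # Simulated 5-year survival driven mainly by stage and grade
      p = 0.35 + 0.3 * df["stage_I"] + 0.2 * df["low_grade"]
      alive = rng.random(n) < p

      tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(df, alive)
      print(export_text(tree, feature_names=list(df.columns)))   # the fitted splits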

  17. Association of Sociodemographic Factors, Smoking-Related Beliefs, and Smoking Restrictions With Intention to Quit Smoking in Korean Adults: Findings From the ITC Korea Survey

    PubMed Central

    Myung, Seung-Kwon; Seo, Hong Gwan; Cheong, Yoo-Seock; Park, Sohee; Lee, Wonkyong B; Fong, Geoffrey T

    2012-01-01

    Background: Few studies have reported the factors associated with intention to quit smoking among Korean adult smokers. This study aimed to examine sociodemographic characteristics, smoking-related beliefs, and smoking-restriction variables associated with intention to quit smoking among Korean adult smokers. Methods: We used data from the International Tobacco Control Korea Survey, which was conducted from November through December 2005 by using random-digit dialing and computer-assisted telephone interviewing of male and female smokers aged 19 years or older in 16 metropolitan areas and provinces of Korea. We performed univariate analysis and multiple logistic regression analysis to identify predictors of intention to quit. Results: A total of 995 respondents were included in the final analysis. Of those, 74.9% (n = 745) intended to quit smoking. In univariate analyses, smokers with an intention to quit were younger, smoked fewer cigarettes per day, had a higher annual income, were more educated, were more likely to have a religious affiliation, drank less alcohol per week, were less likely to have self-exempting beliefs, and were more likely to have self-efficacy beliefs regarding quitting, to believe that smoking had damaged their health, and to report that smoking was never allowed anywhere in their home. In multiple logistic regression analysis, higher education level, having a religious affiliation, and a higher self-efficacy regarding quitting were significantly associated with intention to quit. Conclusions: Sociodemographic factors, smoking-related beliefs, and smoking restrictions at home were associated with intention to quit smoking among Korean adults. PMID:22186157

  18. Association of sociodemographic factors, smoking-related beliefs, and smoking restrictions with intention to quit smoking in Korean adults: findings from the ITC Korea Survey.

    PubMed

    Myung, Seung-Kwon; Seo, Hong Gwan; Cheong, Yoo-Seock; Park, Sohee; Lee, Wonkyong B; Fong, Geoffrey T

    2012-01-01

    Few studies have reported the factors associated with intention to quit smoking among Korean adult smokers. This study aimed to examine sociodemographic characteristics, smoking-related beliefs, and smoking-restriction variables associated with intention to quit smoking among Korean adult smokers. We used data from the International Tobacco Control Korea Survey, which was conducted from November through December 2005 by using random-digit dialing and computer-assisted telephone interviewing of male and female smokers aged 19 years or older in 16 metropolitan areas and provinces of Korea. We performed univariate analysis and multiple logistic regression analysis to identify predictors of intention to quit. A total of 995 respondents were included in the final analysis. Of those, 74.9% (n = 745) intended to quit smoking. In univariate analyses, smokers with an intention to quit were younger, smoked fewer cigarettes per day, had a higher annual income, were more educated, were more likely to have a religious affiliation, drank less alcohol per week, were less likely to have self-exempting beliefs, and were more likely to have self-efficacy beliefs regarding quitting, to believe that smoking had damaged their health, and to report that smoking was never allowed anywhere in their home. In multiple logistic regression analysis, higher education level, having a religious affiliation, and a higher self-efficacy regarding quitting were significantly associated with intention to quit. Sociodemographic factors, smoking-related beliefs, and smoking restrictions at home were associated with intention to quit smoking among Korean adults.

  19. Multiple Correlation versus Multiple Regression.

    ERIC Educational Resources Information Center

    Huberty, Carl J.

    2003-01-01

    Describes differences between multiple correlation analysis (MCA) and multiple regression analysis (MRA), showing how these approaches involve different research questions and study designs, different inferential approaches, different analysis strategies, and different reported information. (SLD)

  20. Functional Relationships and Regression Analysis.

    ERIC Educational Resources Information Center

    Preece, Peter F. W.

    1978-01-01

    Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…

  1. Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Beckstead, Jason W.

    2012-01-01

    The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…

  2. General Nature of Multicollinearity in Multiple Regression Analysis.

    ERIC Educational Resources Information Center

    Liu, Richard

    1981-01-01

    Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)

  3. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
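    A minimal sketch of a binary logistic regression of the kind the article introduces, fitted to simulated data; exponentiated coefficients give the odds ratios used to interpret group membership:

      # Sketch: binary logistic regression for a dichotomous outcome.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(7)
      n = 400
      X = rng.normal(size=(n, 2))                       # two independent variables
      logit_p = -0.5 + 1.2 * X[:, 0] - 0.8 * X[:, 1]
      y = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

      fit = sm.Logit(y, sm.add_constant(X)).fit()
      print(fit.params)               # log-odds coefficients
      print(np.exp(fit.params[1:]))   # odds ratios for the two predictors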

  4. Bayesian isotonic density regression

    PubMed Central

    Wang, Lianming; Dunson, David B.

    2011-01-01

    Density regression models allow the conditional distribution of the response given predictors to change flexibly over the predictor space. Such models are much more flexible than nonparametric mean regression models with nonparametric residual distributions, and are well supported in many applications. A rich variety of Bayesian methods have been proposed for density regression, but it is not clear whether such priors have full support so that any true data-generating model can be accurately approximated. This article develops a new class of density regression models that incorporate stochastic-ordering constraints which are natural when a response tends to increase or decrease monotonically with a predictor. Theory is developed showing large support. Methods are developed for hypothesis testing, with posterior computation relying on a simple Gibbs sampler. Frequentist properties are illustrated in a simulation study, and an epidemiology application is considered. PMID:22822259

  5. Deconvolution of antibody affinities and concentrations by non-linear regression analysis of competitive ELISA data.

    PubMed

    Stevens, F J; Bobrovnik, S A

    2007-12-01

    Physiological responses of the adaptive immune system are polyclonal in nature whether induced by a naturally occurring infection, by vaccination to prevent infection or, in the case of animals, by challenge with antigen to generate reagents of research or commercial significance. The composition of the polyclonal responses is distinct to each individual or animal and changes over time. Differences exist in the affinities of the constituents and their relative proportion of the responsive population. In addition, some of the antibodies bind to different sites on the antigen, whereas other pairs of antibodies are sterically restricted from concurrent interaction with the antigen. Even if generation of a monoclonal antibody is the ultimate goal of a project, the quality of the resulting reagent is ultimately related to the characteristics of the initial immune response. It is probably impossible to quantitatively parse the composition of a polyclonal response to antigen. However, nonlinear regression allows further parameterization of a polyclonal antiserum in the context of certain simplifying assumptions. The antiserum is described as consisting of two competing populations of high- and low-affinity and unknown relative proportions. This simple model allows the quantitative determination of representative affinities and proportions. These parameters may be of use in evaluating responses to vaccines, in evaluating the continuity of antibody production (whether in vaccine recipients or in animals used for the production of antisera), or in optimizing the selection of donors for the production of monoclonal antibodies.
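    The two-population model described can be sketched as a nonlinear least-squares fit; this version assumes simple competitive-binding curves for the high- and low-affinity fractions, with simulated concentrations and signals rather than the paper's exact model:

      # Sketch: two-population competition model fitted with scipy's curve_fit.
      import numpy as np
      from scipy.optimize import curve_fit

      def two_site(c, f_high, k_high, k_low):
          # fraction f_high with dissociation constant k_high, remainder k_low
          return f_high / (1 + c / k_high) + (1 - f_high) / (1 + c / k_low)

      conc = np.logspace(-3, 3, 12)                     # competitor concentration
      true = two_site(conc, 0.6, 0.05, 20.0)
      signal = true + np.random.default_rng(8).normal(0, 0.01, conc.size)

      popt, _ = curve_fit(two_site, conc, signal, p0=[0.5, 0.1, 10.0],
                          bounds=([0, 1e-6, 1e-6], [1, 1e3, 1e4]))
      print(popt)   # recovered proportion and the two affinity constants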

  6. Applying Regression Analysis to Problems in Institutional Research.

    ERIC Educational Resources Information Center

    Bohannon, Tom R.

    1988-01-01

    Regression analysis is one of the most frequently used statistical techniques in institutional research. Principles of least squares, model building, residual analysis, influence statistics, and multi-collinearity are described and illustrated. (Author/MSE)

  7. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies

    PubMed Central

    Vatcheva, Kristina P.; Lee, MinJae; McCormick, Joseph B.; Rahbar, Mohammad H.

    2016-01-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013 illustrated the need for greater attention to identifying and minimizing the effect of multicollinearity in analyses of data from epidemiologic studies. We used simulated datasets and real-life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in regression analysis and to encourage researchers to consider diagnostics for multicollinearity as one of the steps in regression analysis. PMID:27274911

  8. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.

    PubMed

    Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H

    2016-04-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013 illustrated the need for greater attention to identifying and minimizing the effect of multicollinearity in analyses of data from epidemiologic studies. We used simulated datasets and real-life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in regression analysis and to encourage researchers to consider diagnostics for multicollinearity as one of the steps in regression analysis.
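    A common way to run the recommended diagnostic is the variance inflation factor (VIF); a sketch with statsmodels, where one simulated predictor is built to be nearly collinear with another:

      # Sketch: variance inflation factors as a multicollinearity diagnostic.
      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      from statsmodels.stats.outliers_influence import variance_inflation_factor

      rng = np.random.default_rng(9)
      n = 200
      x1 = rng.normal(size=n)
      x2 = x1 + 0.05 * rng.normal(size=n)     # nearly collinear with x1
      x3 = rng.normal(size=n)

      X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))
      for i, name in enumerate(X.columns):
          print(name, variance_inflation_factor(X.values, i))   # VIF > 10 flags trouble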

  9. Quantitative analysis of triacylglycerol regioisomers in fats and oils using reversed-phase high-performance liquid chromatography and atmospheric pressure chemical ionization mass spectrometry.

    PubMed

    Fauconnot, Laëtitia; Hau, Jörg; Aeschlimann, Jean-Marc; Fay, Laurent-Bernard; Dionisi, Fabiola

    2004-01-01

    Positional distribution of fatty acyl chains of triacylglycerols (TGs) in vegetable oils and fats (palm oil, cocoa butter) and animal fats (beef, pork and chicken fats) was examined by reversed-phase high-performance liquid chromatography (RP-HPLC) coupled to atmospheric pressure chemical ionization using a quadrupole mass spectrometer. Quantification of regioisomers was achieved for TGs containing two different fatty acyl chains (palmitic (P), stearic (S), oleic (O), and/or linoleic (L)). For seven pairs of 'AAB/ABA'-type TGs, namely PPS/PSP, PPO/POP, SSO/SOS, POO/OPO, SOO/OSO, PPL/PLP and LLS/LSL, calibration curves were established on the basis of the difference in relative abundances of the fragment ions produced by preferred losses of the fatty acid from the 1/3-position compared to the 2-position. In practice the positional isomers AAB and ABA yield mass spectra showing a significant difference in relative abundance ratios of the ions AA(+) to AB(+). Statistical analysis of the validation data obtained from analysis of TG standards and spiked oils showed that, under repeatability conditions, least-squares regression can be used to establish calibration curves for all pairs. The regression models show linear behavior that allows the determination of the proportion of each regioisomer in an AAB/ABA pair, within a working range from 10 to 1000 µg/mL and a 95% confidence interval of ±3% for three replicates. Copyright 2003 John Wiley & Sons, Ltd.

  10. 3D volumetry comparison using 3T magnetic resonance imaging between normal and adenoma-containing pituitary glands.

    PubMed

    Roldan-Valadez, Ernesto; Garcia-Ulloa, Ana Cristina; Gonzalez-Gutierrez, Omar; Martinez-Lopez, Manuel

    2011-01-01

    Computer-assisted three-dimensional (3D) data allow for an accurate evaluation of volumes compared with traditional measurements. The aim was an in vitro method comparison between geometric volume and 3D volumetry, to obtain reference data for pituitary volumes in normal pituitary glands (PGs) and PGs containing adenomas, in a prospective, transverse, analytical study. Forty-eight subjects underwent brain magnetic resonance imaging (MRI) with 3D sequencing for computer-aided volumetry. PG phantom volumes obtained by both methods were compared. Using the better volumetric method, volumes of normal PGs and PGs with adenoma were compared. Statistical analysis used the Bland-Altman method, t-statistics, effect size and linear regression analysis. Method comparison between 3D volumetry and geometric volume revealed lower bias and precision values for 3D volumetry. A total of 27 patients exhibited normal PGs (mean age, 42.07 ± 16.17 years); length, height, width, geometric volume and 3D volumetry were greater in women than in men. A total of 21 patients exhibited adenomas (mean age, 39.62 ± 10.79 years), and length, height, width, geometric volume and 3D volumetry were greater in men than in women, with significant volumetric differences. Age did not influence pituitary volumes on linear regression analysis. Results from the present study showed that 3D volumetry was more accurate than the geometric method. In addition, the upper normal limits of PGs overlapped with the lower volume limits of early-stage microadenomas.
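    The Bland-Altman statistics used in such method comparisons reduce to a bias and limits of agreement computed from paired differences; a sketch with invented phantom volumes:

      # Sketch: Bland-Altman bias and limits of agreement for a method comparison.
      import numpy as np

      reference = np.array([510, 620, 480, 700, 555, 640])   # true phantom volumes
      method = np.array([498, 610, 470, 688, 560, 628])      # volumes by one method

      diff = method - reference
      bias = diff.mean()                                     # systematic error
      loa = 1.96 * diff.std(ddof=1)                          # half-width of limits of agreement
      print(f"bias = {bias:.1f}, limits of agreement = [{bias - loa:.1f}, {bias + loa:.1f}]")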

  11. LHCb Build and Deployment Infrastructure for run 2

    NASA Astrophysics Data System (ADS)

    Clemencic, M.; Couturier, B.

    2015-12-01

    After the successful run 1 of the LHC, the LHCb Core software team has taken advantage of the long shutdown to consolidate and improve its build and deployment infrastructure. Several of the related projects have already been presented, such as the build system using Jenkins and the LHCb Performance and Regression testing infrastructure. Some components are completely new, like the Software Configuration Database (using the graph database Neo4j) or the new installation packaging using RPM packages. Furthermore, all those parts are integrated to allow easier and quicker releases of the LHCb Software stack, thereby reducing the risk of operational errors. Integration and Regression tests are also now easier to implement, allowing the software checks to be improved further.

  12. Immunohistological Analysis of Intracoronary Thrombus Aspirate in STEMI Patients: Clinical Implications of Pathological Findings.

    PubMed

    Blasco, Ana; Bellas, Carmen; Goicolea, Leyre; Muñiz, Ana; Abraira, Víctor; Royuela, Ana; Mingo, Susana; Oteo, Juan Francisco; García-Touchard, Arturo; Goicolea, Francisco Javier

    2017-03-01

    Thrombus aspiration allows analysis of intracoronary material in patients with ST-segment elevation myocardial infarction. Our objective was to characterize this material by immunohistology and to study its possible association with patient progress. This study analyzed a prospective cohort of 142 patients undergoing primary angioplasty with positive coronary aspiration. Histological examination of aspirated samples included immunohistochemistry stains for the detection of plaque fragments. The statistical analysis comprised histological variables (thrombus age, degree of inflammation, presence of plaque), the patients' clinical and angiographic features, estimation of survival curves, and logistic regression analysis. Among the histological markers, only the presence of plaque (63% of samples) was associated with postinfarction clinical events. Factors associated with 5-year event-free survival were the presence of plaque in the aspirate (82.2% vs 66.0%; P = .033), smoking (82.5% smokers vs 66.7% nonsmokers; P = .036), culprit coronary artery (83.3% circumflex or right coronary artery vs 68.5% anterior descending artery; P = .042), final angiographic flow (80.8% II-III vs 30.0% 0-I; P < .001) and left ventricular ejection fraction ≥ 35% at discharge (83.7% vs 26.7%; P < .001). On multivariable Cox regression analysis with these variables, independent predictors of event-free survival were the presence of plaque (hazard ratio, 0.37; 95%CI, 0.18-0.77; P = .008), and left ventricular ejection fraction (hazard ratio, 0.92; 95%CI, 0.88-0.95; P < .001). The presence of plaque in the coronary aspirate of patients with ST elevation myocardial infarction may be an independent prognostic marker. CD68 immunohistochemical stain is a good method for plaque detection. Copyright © 2016 Sociedad Española de Cardiología. Published by Elsevier España, S.L.U. All rights reserved.

  13. Rank estimation and the multivariate analysis of in vivo fast-scan cyclic voltammetric data

    PubMed Central

    Keithley, Richard B.; Carelli, Regina M.; Wightman, R. Mark

    2010-01-01

    Principal component regression has been used in the past to separate current contributions from different neuromodulators measured with in vivo fast-scan cyclic voltammetry. Traditionally, a percent cumulative variance approach has been used to determine the rank of the training set voltammetric matrix during model development, however this approach suffers from several disadvantages including the use of arbitrary percentages and the requirement of extreme precision of training sets. Here we propose that Malinowski’s F-test, a method based on a statistical analysis of the variance contained within the training set, can be used to improve factor selection for the analysis of in vivo fast-scan cyclic voltammetric data. These two methods of rank estimation were compared at all steps in the calibration protocol including the number of principal components retained, overall noise levels, model validation as determined using a residual analysis procedure, and predicted concentration information. By analyzing 119 training sets from two different laboratories amassed over several years, we were able to gain insight into the heterogeneity of in vivo fast-scan cyclic voltammetric data and study how differences in factor selection propagate throughout the entire principal component regression analysis procedure. Visualizing cyclic voltammetric representations of the data contained in the retained and discarded principal components showed that using Malinowski’s F-test for rank estimation of in vivo training sets allowed for noise to be more accurately removed. Malinowski’s F-test also improved the robustness of our criterion for judging multivariate model validity, even though signal-to-noise ratios of the data varied. In addition, pH change was the majority noise carrier of in vivo training sets while dopamine prediction was more sensitive to noise. PMID:20527815
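    A sketch of rank estimation by Malinowski's F-test on the reduced eigenvalues of a training-set matrix; this follows the commonly cited formulation (REV_j = ev_j / ((r - j + 1)(c - j + 1)), with an F(1, s-k) test of each factor against the pooled smaller factors) and should be read as an assumption-laden illustration rather than reference code:

      # Sketch: rank estimation via Malinowski's F-test on reduced eigenvalues.
      import numpy as np
      from scipy import stats

      def malinowski_rank(D, alpha=0.05):
          r, c = D.shape
          s = min(r, c)
          ev = np.linalg.svd(D, compute_uv=False) ** 2       # eigenvalues of D'D
          j = np.arange(1, s + 1)
          rev = ev / ((r - j + 1) * (c - j + 1))             # reduced eigenvalues
          for k in range(1, s):                              # test factor k against the pool above it
              f = rev[k - 1] / rev[k:].mean()
              p = stats.f.sf(f, 1, s - k)
              if p > alpha:                                  # first insignificant factor
                  return k - 1
          return s

      D = np.random.default_rng(10).normal(size=(50, 30)) @ np.diag([5, 3] + [0.1] * 28)
      print(malinowski_rank(D))   # expected ~2 significant factors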

  14. Stepwise versus Hierarchical Regression: Pros and Cons

    ERIC Educational Resources Information Center

    Lewis, Mitzi

    2007-01-01

    Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…

  15. Interpreting Bivariate Regression Coefficients: Going beyond the Average

    ERIC Educational Resources Information Center

    Halcoussis, Dennis; Phillips, G. Michael

    2010-01-01

    Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…

  16. Regression Commonality Analysis: A Technique for Quantitative Theory Building

    ERIC Educational Resources Information Center

    Nimon, Kim; Reio, Thomas G., Jr.

    2011-01-01

    When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
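    For two predictors, the commonality partition can be computed directly from the R² values of the subset models; a sketch with simulated, deliberately correlated predictors:

      # Sketch: commonality analysis for two predictors from subset-model R^2.
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(11)
      n = 250
      x1 = rng.normal(size=n)
      x2 = 0.6 * x1 + 0.8 * rng.normal(size=n)      # correlated predictors
      y = x1 + 0.5 * x2 + rng.normal(size=n)
      df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})

      r2 = {m: smf.ols(f"y ~ {m}", df).fit().rsquared for m in ("x1", "x2", "x1 + x2")}
      unique_x1 = r2["x1 + x2"] - r2["x2"]          # variance only x1 explains
      unique_x2 = r2["x1 + x2"] - r2["x1"]          # variance only x2 explains
      common = r2["x1 + x2"] - unique_x1 - unique_x2
      print(unique_x1, unique_x2, common)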

  17. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross- validity approach to select sample sizes…

  18. Estimation of 1RM for knee extension based on the maximal isometric muscle strength and body composition.

    PubMed

    Kanada, Yoshikiyo; Sakurai, Hiroaki; Sugiura, Yoshito; Arai, Tomoaki; Koyama, Soichiro; Tanabe, Shigeo

    2017-11-01

    [Purpose] To create a regression formula in order to estimate 1RM for knee extensors, based on the maximal isometric muscle strength measured using a hand-held dynamometer and data regarding the body composition. [Subjects and Methods] Measurement was performed in 21 healthy males in their twenties to thirties. Single regression analysis was performed, with measurement values representing 1RM and the maximal isometric muscle strength as dependent and independent variables, respectively. Furthermore, multiple regression analysis was performed, with data regarding the body composition incorporated as another independent variable, in addition to the maximal isometric muscle strength. [Results] Through single regression analysis with the maximal isometric muscle strength as an independent variable, the following regression formula was created: 1RM (kg)=0.714 + 0.783 × maximal isometric muscle strength (kgf). On multiple regression analysis, only the total muscle mass was extracted. [Conclusion] A highly accurate regression formula to estimate 1RM was created based on both the maximal isometric muscle strength and body composition. Using a hand-held dynamometer and body composition analyzer, it was possible to measure these items in a short time, and obtain clinically useful results.
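    The single-regression step described can be reproduced in a couple of lines; the paired measurements below are invented, not the study's, and simply mirror the 1RM = a + b × isometric-strength form:

      # Sketch: single regression of 1RM on hand-held dynamometer readings.
      import numpy as np

      isometric = np.array([28, 32, 35, 40, 44, 47, 52])   # kgf, illustrative
      one_rm = np.array([23, 26, 28, 32, 35, 38, 41])      # kg, illustrative

      b, a = np.polyfit(isometric, one_rm, 1)
      print(f"1RM = {a:.3f} + {b:.3f} x isometric strength")
      print(a + b * 38)    # predicted 1RM for a new subject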

  19. ℓp-Norm Multikernel Learning Approach for Stock Market Price Forecasting

    PubMed Central

    Shao, Xigao; Wu, Kun; Liao, Bifeng

    2012-01-01

    Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ1-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓp-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than the ℓ1-norm multiple support vector regression model. PMID:23365561

  20. Regression Model Optimization for the Analysis of Experimental Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.

    2009-01-01

    A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
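    The search metric described, the standard deviation of PRESS residuals, can be computed for any candidate model via the hat matrix, since the leave-one-out residual is e_i / (1 - h_ii); a sketch on simulated data:

      # Sketch: PRESS residuals for one candidate regression model.
      import numpy as np

      rng = np.random.default_rng(12)
      n = 40
      x = rng.uniform(-1, 1, n)
      y = 1 + 2 * x + 0.5 * x**2 + 0.05 * rng.normal(size=n)

      X = np.column_stack([np.ones(n), x, x**2])       # one candidate math model
      beta, *_ = np.linalg.lstsq(X, y, rcond=None)
      H = X @ np.linalg.inv(X.T @ X) @ X.T             # hat matrix
      press_resid = (y - X @ beta) / (1 - np.diag(H))  # leave-one-out residuals
      print(press_resid.std(ddof=1))                   # search metric: std of PRESS residuals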

  1. [Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

    PubMed

    Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

    2011-11-01

    To explore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to an increase in injury frequency. 2917 primary and secondary school students were selected from Hefei by a cluster random sampling method and surveyed by questionnaire. The data on the counts of event-based injuries were used to fit modified Poisson regression and negative binomial regression models. The risk factors increasing the frequency of unintentional injury among juvenile students were explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model exhibited over-dispersion (P < 0.0001) according to the Lagrange multiplier test. The over-dispersed data were therefore better fitted by the modified Poisson regression and negative binomial regression models. Both showed that male gender, younger age, a father working outside the hometown, a guardian with education above junior high school level, and smoking might be associated with higher injury frequencies. Given the tendency of injury frequency data to cluster, both modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better, and this model could give a more accurate interpretation of the relevant factors affecting the frequency of injury.
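
    A hedged statsmodels sketch of the two count models on simulated overdispersed counts; "modified Poisson" is implemented here, as is common, as a Poisson mean model with robust sandwich standard errors, which may differ in detail from the authors' implementation.

        import numpy as np
        import statsmodels.api as sm

        # Simulated overdispersed injury counts with one binary risk factor.
        rng = np.random.default_rng(1)
        male = rng.integers(0, 2, 500)
        mu = np.exp(0.2 + 0.5 * male)
        y = rng.negative_binomial(n=2, p=2.0 / (2.0 + mu))  # gamma-Poisson mixture

        X = sm.add_constant(male)

        # "Modified" Poisson: Poisson mean model with robust (sandwich) errors.
        mod_poisson = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")

        # Negative binomial GLM models the extra-Poisson variation directly.
        negbin = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=1.0)).fit()

        print(mod_poisson.params, negbin.params)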

  2. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules

    PubMed Central

    Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Twelve suspicious sonographic features of these nodules were used to assess them. The significant features for diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to the diagnosis of thyroid nodules at a level of p < 0.05 were included in a logistic regression analysis model. The significant features in the logistic regression model for diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic features, echogenicity, and elastography score. According to the results of the logistic regression analysis, a formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930, and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79%, respectively. PMID:29228030
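
    A minimal sklearn sketch of this kind of diagnostic model, with simulated binary features standing in for some of the abstract's predictors; the coefficients and data are illustrative, not the published model.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score

        # Simulated binary sonographic features for 525 nodules.
        rng = np.random.default_rng(2)
        X = rng.integers(0, 2, size=(525, 6))  # calcification, lymph nodes, ...
        logit = -1.0 + X @ np.array([1.2, 1.5, 1.0, 0.8, 0.9, 0.4])
        y = (rng.random(525) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

        # Fit the logistic model and evaluate the area under the ROC curve.
        model = LogisticRegression(max_iter=1000).fit(X, y)
        prob = model.predict_proba(X)[:, 1]
        print("AUC:", roc_auc_score(y, prob))  # the paper reports 0.930 on its data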

  3. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules.

    PubMed

    Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Twelve suspicious sonographic features of these nodules were used to assess them. The significant features for diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to the diagnosis of thyroid nodules at a level of p < 0.05 were included in a logistic regression analysis model. The significant features in the logistic regression model for diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic features, echogenicity, and elastography score. According to the results of the logistic regression analysis, a formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930, and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79%, respectively.

  4. Exploratory Network Meta Regression Analysis of Stroke Prevention in Atrial Fibrillation Fails to Identify Any Interactions with Treatment Effect.

    PubMed

    Batson, Sarah; Sutton, Alex; Abrams, Keith

    2016-01-01

    Patients with atrial fibrillation are at a greater risk of stroke, and therefore the main goal of treatment for patients with atrial fibrillation is to prevent stroke from occurring. There are a number of different stroke prevention treatments available, including warfarin and novel oral anticoagulants. Previous network meta-analyses of novel oral anticoagulants for stroke prevention in atrial fibrillation acknowledge the limitation of heterogeneity across the included trials but have not explored the impact of potentially important treatment-modifying covariates. To explore potentially important treatment-modifying covariates using network meta-regression analyses for stroke prevention in atrial fibrillation, we performed a network meta-analysis for the outcome of ischaemic stroke and conducted an exploratory regression analysis considering potentially important treatment-modifying covariates. These covariates included the proportion of patients with a previous stroke, the proportion of males, mean age, the duration of study follow-up and the patients' underlying risk of ischaemic stroke. None of the covariates explored had an impact on treatment effects relative to placebo. Notably, the exploration of study follow-up as a covariate supported the assumption that differences in trial duration are unimportant in this indication despite the variation across trials in the network. This study is limited by the quantity of data available. Further investigation is warranted and, as justifying further trials may be difficult, it would be desirable to obtain individual patient level data (IPD) to facilitate an effort to relate treatment effects to IPD covariates in order to investigate heterogeneity. Observational data could also be examined to establish whether there are potential trends elsewhere. The approach and methods presented have potentially wide applications within any indication, and highlight the potential benefit of extending decision problems to include additional comparators beyond those of primary interest so as to allow for the exploration of heterogeneity.

  5. Predictors of pain relief following spinal cord stimulation in chronic back and leg pain and failed back surgery syndrome: a systematic review and meta-regression analysis.

    PubMed

    Taylor, Rod S; Desai, Mehul J; Rigoard, Philippe; Taylor, Rebecca J

    2014-07-01

    We sought to assess the extent to which pain relief in chronic back and leg pain (CBLP) following spinal cord stimulation (SCS) is influenced by patient-related factors, including pain location, and technology factors. A number of electronic databases were searched, with citation searching of included papers and recent systematic reviews. All study designs were included. The primary outcome was pain relief following SCS; we also sought pain scores (pre- and post-SCS). Multiple predictive factors were examined: location of pain, history of back surgery, initial level of pain, litigation/worker's compensation, age, gender, duration of pain, duration of follow-up, publication year, continent of data collection, study design, quality score, method of SCS lead implant, and type of SCS lead. Between-study associations between predictive factors and pain relief were assessed by meta-regression. Seventy-four studies (N = 3,025 patients with CBLP) met the inclusion criteria; 63 reported data to allow inclusion in a quantitative analysis. Evidence of substantial statistical heterogeneity (P < 0.0001) in the level of pain relief following SCS was noted. The mean level of pain relief across studies was 58% (95% CI: 53% to 64%, random effects) at an average follow-up of 24 months. Multivariable meta-regression analysis showed no predictive patient or technology factors. SCS was effective in reducing pain irrespective of the location of CBLP. This review supports SCS as an effective pain relieving treatment for CBLP with predominant leg pain, with or without a prior history of back surgery. Randomized controlled trials need to confirm the effectiveness and cost-effectiveness of SCS in the CBLP population with predominant low back pain. © 2013 The Authors. Pain Practice published by Wiley Periodicals, Inc. on behalf of World Institute of Pain.

  6. An open-access CMIP5 pattern library for temperature and precipitation: Description and methodology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lynch, Cary D.; Hartin, Corinne A.; Bond-Lamberty, Benjamin

    Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squared regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90°N/S). Bias and mean errors between modeled and pattern predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5°C, but choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. As a result, this paper describes our library of least squared regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns.

  7. An open-access CMIP5 pattern library for temperature and precipitation: Description and methodology

    DOE PAGES

    Lynch, Cary D.; Hartin, Corinne A.; Bond-Lamberty, Benjamin; ...

    2017-05-15

    Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squared regression methods. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90°N/S). Bias and mean errors between modeled and pattern predicted output from the linear regression method were smaller than patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5°C, but choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. As a result, this paper describes our library of least squared regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns.
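
    A minimal numpy sketch of the least squared regression pattern method on synthetic data: each grid cell's local change is regressed on the global mean temperature, and the field of slopes is the scaling pattern. Grid size and noise levels are assumptions.

        import numpy as np

        # Synthetic run: global mean warming plus a local field that scales with it.
        rng = np.random.default_rng(3)
        n_years, n_lat, n_lon = 100, 18, 36
        t_global = np.linspace(0.0, 3.0, n_years) + rng.normal(0, 0.05, n_years)
        true_pattern = rng.uniform(0.5, 2.0, (n_lat, n_lon))  # deg C per deg C
        local = (t_global[:, None, None] * true_pattern
                 + rng.normal(0, 0.2, (n_years, n_lat, n_lon)))

        # Least-squares pattern: slope of local change on global mean temperature,
        # computed for every grid cell at once.
        A = np.column_stack([np.ones(n_years), t_global])
        coef, *_ = np.linalg.lstsq(A, local.reshape(n_years, -1), rcond=None)
        pattern = coef[1].reshape(n_lat, n_lon)
        print("max abs error:", np.abs(pattern - true_pattern).max())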

  8. A simplified computational fluid-dynamic approach to the oxidizer injector design in hybrid rockets

    NASA Astrophysics Data System (ADS)

    Di Martino, Giuseppe D.; Malgieri, Paolo; Carmicino, Carmine; Savino, Raffaele

    2016-12-01

    Fuel regression rate in hybrid rockets is non-negligibly affected by the oxidizer injection pattern. In this paper a simplified computational approach, developed in an attempt to optimize the oxidizer injector design, is discussed. Numerical simulations of the thermo-fluid-dynamic field in a hybrid rocket are carried out with a commercial solver to investigate several injection configurations, with the aim of increasing the fuel regression rate and minimizing consumption unevenness, while still favoring the establishment of flow recirculation at the motor head end; such recirculation is generated with an axial nozzle injector and has been demonstrated to promote combustion stability as well as larger efficiency and regression rate. All the computations have been performed on the configuration of a lab-scale hybrid rocket motor available at the propulsion laboratory of the University of Naples under typical operating conditions. After a preliminary comparison between the two baseline limiting cases of an axial subsonic nozzle injector and a uniform injection through the prechamber, a parametric analysis has been carried out by varying the oxidizer jet flow divergence angle, as well as the grain port diameter and the oxidizer mass flux, to study the effect of the flow divergence on the heat transfer distribution over the fuel surface. Some experimental firing test data are presented and, under the hypothesis that fuel regression rate and surface heat flux are proportional, the measured fuel consumption axial profiles are compared with the predicted surface heat flux, showing fairly good agreement, which allowed validation of the employed design approach. Finally, an optimized injector design is proposed.

  9. Random regression models on Legendre polynomials to estimate genetic parameters for weights from birth to adult age in Canchim cattle.

    PubMed

    Baldi, F; Albuquerque, L G; Alencar, M M

    2010-08-01

    The objective of this work was to estimate covariance functions for direct and maternal genetic effects, animal and maternal permanent environmental effects, and subsequently, to derive relevant genetic parameters for growth traits in Canchim cattle. Data comprised 49,011 weight records on 2435 females from birth to adult age. The model of analysis included fixed effects of contemporary groups (year and month of birth and at weighing) and age of dam as a quadratic covariable. Mean trends were taken into account by a cubic regression on orthogonal polynomials of animal age. Residual variances were allowed to vary and were modelled by a step function with 1, 4 or 11 classes based on the animal's age. The model fitting four classes of residual variances was the best. A total of 12 random regression models from second to seventh order were used to model direct and maternal genetic effects, animal and maternal permanent environmental effects. The model with direct and maternal genetic effects, animal and maternal permanent environmental effects fitted by quartic, cubic, quintic and linear Legendre polynomials, respectively, was the most adequate to describe the covariance structure of the data. Estimates of direct and maternal heritability obtained by multi-trait (seven traits) and random regression models were very similar. Selection for higher weight at any age, especially after weaning, will produce an increase in mature cow weight. The possibility to modify the growth curve in Canchim cattle to obtain animals with rapid growth at early ages and moderate to low mature cow weight is limited.
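
    A minimal numpy sketch of the Legendre covariates that underlie such random regression models: ages are standardised to the polynomial domain [-1, 1] and expanded into a design matrix. The cubic order shown is the one quoted for the maternal genetic effect, and the ages are invented for illustration.

        import numpy as np

        # Ages at weighing (days), standardised to the Legendre domain [-1, 1].
        age = np.array([1.0, 205.0, 365.0, 550.0, 730.0, 1095.0, 1825.0])
        x = 2.0 * (age - age.min()) / (age.max() - age.min()) - 1.0

        # Legendre covariates up to cubic order: one row per weight record,
        # one column per random regression coefficient.
        Phi = np.polynomial.legendre.legvander(x, deg=3)
        print(Phi.shape)  # (7, 4)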

  10. Spatiotemporal analysis of the relationship between socioeconomic factors and stroke in the Portuguese mainland population under 65 years old.

    PubMed

    Oliveira, André; Cabral, António J R; Mendes, Jorge M; Martins, Maria R O; Cabral, Pedro

    2015-11-04

    Stroke risk has been shown to display varying patterns of geographic distribution among countries but also between regions of the same country. Although stroke is traditionally a disease of older persons, a global 25% increase in incidence was noticed between 1990 and 2010 in persons aged 20-64 years, particularly in low- and middle-income countries. Understanding spatial disparities in the association between socioeconomic factors and stroke is critical to target public health initiatives aiming to mitigate or prevent this disease, including in younger persons. We aimed to identify socioeconomic determinants of geographic disparities in stroke risk in people <65 years old in municipalities of mainland Portugal, and the spatiotemporal variation of the association between these determinants and stroke risk during two study periods (1992-1996 and 2002-2006). Poisson and negative binomial global regression models were used to explore determinants of disease risk. Geographically weighted regression (GWR) represents a distinctive approach, allowing estimation of local regression coefficients. Models for both study periods were identified. Significant variables included educational attainment, work hours per week and unemployment. Local Poisson GWR models achieved the best fit and evidenced spatially varying regression coefficients. Spatiotemporal inequalities were observed in significant variables, with dissimilarities between men and women. This study contributes to a better understanding of the relationship between stroke and socioeconomic factors in the population <65 years of age, an age group seldom analysed separately. It can thus help to improve the targeting of public health initiatives, all the more in a context of economic crisis.

  11. [Is there either agenesis or regression of the Mullerian duct in female bird embryos under the influence of male hormone?].

    PubMed

    Lutz-Ostertag, Y; Lutz, H

    1976-01-01

    The natural occurrence of free-martinism in birds and chorio-allantoic grafting experiments of testis fragments onto female chick host embryos allow the authors to define the manner in which the complete or partial disappearance of the Müllerian ducts is provoked, and to state exactly whether the phenomenon is an agenesis or a regression.

  12. Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.

    PubMed

    Chatzis, Sotirios P; Andreou, Andreas S

    2015-11-01

    Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows better handling of uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using the publicly available benchmark data sets.

  13. Three methods to construct predictive models using logistic regression and likelihood ratios to facilitate adjustment for pretest probability give similar results.

    PubMed

    Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les

    2008-01-01

    To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models, and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. The Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of test errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.

  14. A Three-way Decomposition of a Total Effect into Direct, Indirect, and Interactive Effects

    PubMed Central

    VanderWeele, Tyler J.

    2013-01-01

    Recent theory in causal inference has provided concepts for mediation analysis and effect decomposition that allow one to decompose a total effect into a direct and an indirect effect. Here, it is shown that what is often taken as an indirect effect can in fact be further decomposed into a “pure” indirect effect and a mediated interactive effect, thus yielding a three-way decomposition of a total effect (direct, indirect, and interactive). This three-way decomposition applies to difference scales and also to additive ratio scales and additive hazard scales. Assumptions needed for the identification of each of these three effects are discussed and simple formulae are given for each when regression models allowing for interaction are used. The three-way decomposition is illustrated by examples from genetic and perinatal epidemiology, and discussion is given to what is gained over the traditional two-way decomposition into simply a direct and an indirect effect. PMID:23354283
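
    In counterfactual notation, writing Y(a, m) for the outcome under exposure a and mediator value m, M(a) for the mediator under exposure a, and a* for a reference exposure level, the decomposition can be stated directly (a sketch of the notation, not the paper's regression formulae):

        \begin{align*}
        TE        &= E[Y(a, M(a))]     - E[Y(a^{*}, M(a^{*}))] \\
        PDE       &= E[Y(a, M(a^{*}))] - E[Y(a^{*}, M(a^{*}))] \\
        PIE       &= E[Y(a^{*}, M(a))] - E[Y(a^{*}, M(a^{*}))] \\
        INT_{med} &= TE - PDE - PIE
        \end{align*}

    The last line is definitional: whatever part of the total effect is captured by neither the pure direct nor the pure indirect effect is the mediated interactive effect.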

  15. Characterization of Microbiota in Children with Chronic Functional Constipation

    PubMed Central

    de Meij, Tim G. J.; de Groot, Evelien F. J.; Eck, Anat; Budding, Andries E.; Kneepkens, C. M. Frank; Benninga, Marc A.; van Bodegraven, Adriaan A.; Savelkoul, Paul H. M.

    2016-01-01

    Objectives Disruption of the intestinal microbiota is considered an etiological factor in pediatric functional constipation. Scientifically based selection of potentially beneficial probiotic strains in functional constipation therapy is not feasible due to insufficient knowledge of microbiota composition in affected subjects. The aim of this study was to describe microbial composition and diversity in children with functional constipation, compared to healthy controls. Study Design Fecal samples from 76 children diagnosed with functional constipation according to the Rome III criteria (median age 8.0 years; range 4.2–17.8) were analyzed by IS-pro, a PCR-based microbiota profiling method. Outcome was compared with intestinal microbiota profiles of 61 healthy children (median 8.6 years; range 4.1–17.9). Microbiota dissimilarity was depicted by principal coordinate analysis (PCoA); diversity was calculated by the Shannon diversity index. To determine the most discriminative species, cross-validated logistic ridge regression was performed. Results Applying total microbiota profiles (all phyla together) or per-phylum analysis, no disease-specific separation was observed by PCoA or by calculation of diversity indices. By ridge regression, however, functional constipation and controls could be discriminated with 82% accuracy. The most discriminative species were Bacteroides fragilis, Bacteroides ovatus, Bifidobacterium longum, Parabacteroides species (increased in functional constipation) and Alistipes finegoldii (decreased in functional constipation). Conclusions None of the commonly used unsupervised statistical methods allowed for microbiota-based discrimination of children with functional constipation and controls. By ridge regression, however, both groups could be discriminated with 82% accuracy. Optimization of microbiota-based interventions in constipated children warrants further characterization of microbial signatures linked to clinical subgroups of functional constipation. PMID:27760208
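
    A hedged sklearn sketch of cross-validated ridge (L2) logistic regression of case status on abundance profiles; the data are simulated, so the printed accuracy is illustrative rather than the paper's 82%.

        import numpy as np
        from sklearn.linear_model import LogisticRegressionCV
        from sklearn.model_selection import cross_val_score

        # Simulated log-abundance profiles: 76 cases, 61 controls, 50 taxa,
        # with a handful of taxa shifted in the cases.
        rng = np.random.default_rng(4)
        X = rng.normal(0, 1, (137, 50))
        y = np.r_[np.ones(76), np.zeros(61)].astype(int)
        X[y == 1, :5] += 0.8  # the discriminative species

        # Ridge-penalised logistic regression, penalty strength chosen by CV.
        clf = LogisticRegressionCV(Cs=10, penalty="l2", cv=5, max_iter=5000)
        acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
        print("cross-validated accuracy:", acc.mean())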

  16. Estimated prevalence of erosive tooth wear in permanent teeth of children and adolescents: an epidemiological systematic review and meta-regression analysis.

    PubMed

    Salas, M M S; Nascimento, G G; Huysmans, M C; Demarco, F F

    2015-01-01

    The main purpose of this systematic review was to estimate the prevalence of dental erosion in permanent teeth of children and adolescents. An electronic search was performed up to and including March 2014. Eligibility criteria included population-based studies in permanent teeth of children and adolescents aged 8-19 years old reporting the prevalence, or data that allowed the calculation of prevalence rates, of tooth erosion. Data collection assessed information regarding geographic location, type of index used for clinical examination, sample size, year of publication, age, examined teeth and tissue exposure. The estimated prevalence of erosive wear was determined, followed by a meta-regression analysis. Twenty-two papers were included in the systematic review. The overall estimated prevalence of tooth erosion was 30.4% (95% CI 23.8-37.0). In the multivariate meta-regression model, use of the Tooth Wear Index for clinical examination, studies with samples smaller than 1000 subjects, and those conducted in the Middle East and Africa remained associated with higher dental erosion prevalence rates. Our results demonstrated that the estimated prevalence of erosive wear in permanent teeth of children and adolescents is 30.4%, with high heterogeneity between studies. Additionally, the correct choice of a clinical index for dental erosion detection and the geographic location play an important role in the large variability of erosive tooth wear in permanent teeth of children and adolescents. The prevalence of tooth erosion observed in permanent teeth of children and adolescents was considerably high. Our results demonstrated that the prevalence rate of erosive wear was influenced by methodological and diagnostic factors. When tooth erosion is assessed, the clinical index should be considered. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. Implementation of an integrating sphere for the enhancement of noninvasive glucose detection using quantum cascade laser spectroscopy

    NASA Astrophysics Data System (ADS)

    Werth, Alexandra; Liakat, Sabbir; Dong, Anqi; Woods, Callie M.; Gmachl, Claire F.

    2018-05-01

    An integrating sphere is used to enhance the collection of backscattered light in a noninvasive glucose sensor based on quantum cascade laser spectroscopy. The sphere enhances signal stability by roughly an order of magnitude, allowing us to use a thermoelectrically (TE) cooled detector while maintaining comparable glucose prediction accuracy levels. Using a smaller TE-cooled detector reduces the form factor, creating a mobile sensor. Principal component analysis of spectra taken from human subjects yielded principal components that closely match the absorption peaks of glucose. These principal components are used as regressors in a linear regression algorithm to make glucose concentration predictions, over 75% of which are clinically accurate.
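
    A minimal sketch of the principal-component-regression step described here, on simulated spectra: PCA scores serve as regressors in a linear model. Wavelength count, concentrations and noise levels are assumptions.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.pipeline import make_pipeline

        # Simulated spectra (200 wavelengths) whose dominant direction tracks
        # a hypothetical glucose concentration in mg/dL.
        rng = np.random.default_rng(5)
        conc = rng.uniform(70, 180, 60)
        basis = rng.normal(0, 1, 200)
        spectra = np.outer(conc, basis) + rng.normal(0, 5, (60, 200))

        # Principal component regression: project onto a few PCs, then regress.
        pcr = make_pipeline(PCA(n_components=5), LinearRegression())
        pcr.fit(spectra, conc)
        print("R^2 on training spectra:", pcr.score(spectra, conc))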

  18. 3D force/torque characterization of emergency cricothyroidotomy procedure using an instrumented scalpel.

    PubMed

    Ryason, Adam; Sankaranarayanan, Ganesh; Butler, Kathryn L; DeMoya, Marc; De, Suvranu

    2016-08-01

    Emergency Cricothyroidotomy (CCT) is a surgical procedure performed to secure a patient's airway. This high-stakes, but seldom-performed procedure is an ideal candidate for a virtual reality simulator to enhance physician training. For the first time, this study characterizes the force/torque characteristics of the cricothyroidotomy procedure, to guide development of a virtual reality CCT simulator for use in medical training. We analyze the upper force and torque thresholds experienced at the human-scalpel interface. We then group individual surgical cuts based on style of cut and cut medium and perform a regression analysis to create two models that allow us to predict the style of cut performed and the cut medium.

  19. Quantifying Main Trends in Lysozyme Nucleation: The Effect of Precipitant Concentration and Impurities

    NASA Technical Reports Server (NTRS)

    Burke, Michael W.; Judge, Russell A.; Pusey, Marc L.; Rose, M. Franklin (Technical Monitor)

    2000-01-01

    Full factorial experiment design incorporating multi-linear regression analysis of the experimental data allows the main trends and effects to be quickly identified while using only a limited number of experiments. These techniques were used to identify the effect of precipitant concentration and the presence of an impurity, the physiological lysozyme dimer, on the nucleation rate and crystal dimensions of the tetragonal form of chicken egg white lysozyme. Increasing precipitant concentration was found to decrease crystal numbers, the magnitude of this effect also depending on the supersaturation. The presence of the dimer generally increased nucleation. The crystal axial ratio decreased with increasing precipitant concentration independent of impurity.

  20. A new analytical method for quantification of olive and palm oil in blends with other vegetable edible oils based on the chromatographic fingerprints from the methyl-transesterified fraction.

    PubMed

    Jiménez-Carvelo, Ana M; González-Casado, Antonio; Cuadros-Rodríguez, Luis

    2017-03-01

    A new analytical method for the quantification of olive oil and palm oil in blends with other vegetable edible oils (canola, safflower, corn, peanut, seeds, grapeseed, linseed, sesame and soybean) using normal phase liquid chromatography and applying chemometric tools was developed. The procedure for obtaining the chromatographic fingerprint of the methyl-transesterified fraction of each blend is described. The multivariate quantification methods used were Partial Least Squares Regression (PLS-R) and Support Vector Regression (SVR). The quantification results were evaluated by several parameters, such as the Root Mean Square Error of Validation (RMSEV), Mean Absolute Error of Validation (MAEV) and Median Absolute Error of Validation (MdAEV). It has to be highlighted that with the newly proposed analytical method the chromatographic analysis takes only eight minutes, and the results obtained showed the potential of this method and allowed quantification of mixtures of olive oil and palm oil with other vegetable oils. Copyright © 2016 Elsevier B.V. All rights reserved.
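
    A hedged sklearn sketch of the two multivariate quantification methods (PLS-R and SVR) on simulated chromatographic fingerprints; RMSE on a held-out split stands in for the paper's RMSEV, and all profiles are invented.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.svm import SVR
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import mean_squared_error

        # Simulated fingerprints (300 retention-time points) for blends with a
        # known olive-oil fraction between 0 and 1.
        rng = np.random.default_rng(6)
        frac = rng.uniform(0, 1, 80)
        olive, other = rng.normal(0, 1, 300), rng.normal(0, 1, 300)
        X = np.outer(frac, olive) + np.outer(1 - frac, other)
        X += rng.normal(0, 0.05, X.shape)

        X_tr, X_te, y_tr, y_te = train_test_split(X, frac, random_state=0)
        for model in (PLSRegression(n_components=4), SVR(kernel="rbf", C=10.0)):
            pred = model.fit(X_tr, y_tr).predict(X_te).ravel()
            rmse = mean_squared_error(y_te, pred) ** 0.5  # analogue of RMSEV
            print(type(model).__name__, "RMSE:", round(rmse, 4))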

  1. Disentangling WTP per QALY data: different analytical approaches, different answers.

    PubMed

    Gyrd-Hansen, Dorte; Kjaer, Trine

    2012-03-01

    A large random sample of the Danish general population was asked to value health improvements by way of both the time trade-off elicitation technique and willingness-to-pay (WTP) using contingent valuation methods. The data demonstrate a high degree of heterogeneity across respondents in their relative valuations on the two scales. This has implications for data analysis. We show that the estimates of WTP per QALY are highly sensitive to the analytical strategy. For both open-ended and dichotomous choice data we demonstrate that choice of aggregated approach (ratios of means) or disaggregated approach (means of ratios) affects estimates markedly as does the interpretation of the constant term (which allows for disproportionality across the two scales) in the regression analyses. We propose that future research should focus on why some respondents are unwilling to trade on the time trade-off scale, on how to interpret the constant value in the regression analyses, and on how best to capture the heterogeneity in preference structures when applying mixed multinomial logit. Copyright © 2011 John Wiley & Sons, Ltd.

  2. Artificial neural networks environmental forecasting in comparison with multiple linear regression technique: From heavy metals to organic micropollutants screening in agricultural soils

    NASA Astrophysics Data System (ADS)

    Bonelli, Maria Grazia; Ferrini, Mauro; Manni, Andrea

    2016-12-01

    The assessment of contamination by metals and organic micropollutants in agricultural soils is a difficult challenge due to the extensive area over which a very large number of samples must be collected and analyzed. With regard to measurement methods for dioxins and dioxin-like PCBs and the subsequent treatment of data, the European Community advises the development of low-cost and fast methods allowing routine analysis of a great number of samples, providing rapid measurement of these compounds in the environment, feeds and food. The aim of the present work has been to find a method suitable to describe the relations occurring between organic and inorganic contaminants and use the values of the latter in order to forecast the former. In practice, the use of a portable metal soil analyzer coupled with an efficient statistical procedure enables the required objective to be achieved. Compared to Multiple Linear Regression, the Artificial Neural Networks technique has shown itself to be an excellent forecasting method, even though there is no linear correlation between the variables to be analyzed.
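
    A minimal sklearn sketch of the comparison on simulated soils whose pollutant level depends nonlinearly on the metal readings of a portable analyzer; network size, metals and noise levels are assumptions.

        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.neural_network import MLPRegressor
        from sklearn.model_selection import train_test_split

        # Simulated data: the organic-micropollutant level is a nonlinear
        # function of five metal concentrations, so a linear model underfits.
        rng = np.random.default_rng(13)
        metals = rng.uniform(0, 1, (400, 5))  # e.g., Pb, Cu, Zn, Cr, Ni
        pcb = (np.sin(3 * metals[:, 0]) + metals[:, 1] * metals[:, 2]
               + rng.normal(0, 0.05, 400))

        X_tr, X_te, y_tr, y_te = train_test_split(metals, pcb, random_state=0)
        mlr = LinearRegression().fit(X_tr, y_tr)
        ann = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000,
                           random_state=0).fit(X_tr, y_tr)
        print("MLR R^2:", round(mlr.score(X_te, y_te), 3))
        print("ANN R^2:", round(ann.score(X_te, y_te), 3))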

  3. Quantile regression analysis of body mass and wages.

    PubMed

    Johar, Meliyanni; Katayama, Hajime

    2012-05-01

    Using the National Longitudinal Survey of Youth 1979, we explore the relationship between body mass and wages. We use quantile regression to provide a broad description of the relationship across the wage distribution. We also allow the relationship to vary by the degree of social skills involved in different jobs. Our results find that for female workers body mass and wages are negatively correlated at all points in their wage distribution. The strength of the relationship is larger at higher-wage levels. For male workers, the relationship is relatively constant across wage distribution but heterogeneous across ethnic groups. When controlling for the endogeneity of body mass, we find that additional body mass has a negative causal impact on the wages of white females earning more than the median wages and of white males around the median wages. Among these workers, the wage penalties are larger for those employed in jobs that require extensive social skills. These findings may suggest that labor markets reward white workers for good physical shape differently, depending on the level of wages and the type of job a worker has. Copyright © 2011 John Wiley & Sons, Ltd.
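
    A minimal statsmodels sketch of quantile regression of log wages on body mass, fit at several quantiles of simulated data constructed so that the coefficient varies across the wage distribution.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        # Simulated workers: the body-mass penalty grows in the upper tail.
        rng = np.random.default_rng(7)
        n = 1000
        bmi = rng.normal(26, 4, n)
        u = rng.random(n)  # worker's position in the conditional distribution
        df = pd.DataFrame({"bmi": bmi,
                           "log_wage": 3.0 + (-0.002 - 0.01 * u) * bmi + 0.5 * u})

        # The same linear specification at the 25th, 50th and 75th percentiles.
        for q in (0.25, 0.50, 0.75):
            fit = smf.quantreg("log_wage ~ bmi", df).fit(q=q)
            print(f"q={q}: bmi coefficient {fit.params['bmi']:.4f}")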

  4. Dreams Fulfilled and Shattered: Determinants of Segmented Assimilation in the Second Generation*

    PubMed Central

    Haller, William; Portes, Alejandro; Lynch, Scott M.

    2013-01-01

    We summarize prior theories on the adaptation process of the contemporary immigrant second generation as a prelude to presenting additive and interactive models showing the impact of family variables, school contexts and academic outcomes on the process. For this purpose, we regress indicators of educational and occupational achievement in early adulthood on predictors measured three and six years earlier. The Children of Immigrants Longitudinal Study (CILS), used for the analysis, allows us to establish a clear temporal order among exogenous predictors and the two dependent variables. We also construct a Downward Assimilation Index (DAI), based on six indicators and regress it on the same set of predictors. Results confirm a pattern of segmented assimilation in the second generation, with a significant proportion of the sample experiencing downward assimilation. Predictors of the latter are the obverse of those of educational and occupational achievement. Significant interaction effects emerge between these predictors and early school contexts, defined by different class and racial compositions. Implications of these results for theory and policy are examined. PMID:24223437

  5. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

    PubMed

    Bennett, Bradley C; Husby, Chad E

    2008-03-28

    Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet the requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
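
    A minimal scipy sketch of the per-family binomial test, with hypothetical counts: under the null, the proportion of medicinal species in any one family equals the flora-wide proportion.

        from scipy.stats import binomtest

        # Hypothetical flora: 3500 species, 420 medicinal, so the null
        # proportion of medicinal species in any family is 420/3500 = 0.12.
        p_null = 420 / 3500

        # One family with 90 species, 22 used medicinally: is that more than
        # the flora-wide proportion predicts?
        result = binomtest(k=22, n=90, p=p_null, alternative="two-sided")
        print("p-value:", result.pvalue)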

  6. Neuromuscular dose-response studies: determining sample size.

    PubMed

    Kopman, A F; Lien, C A; Naguib, M

    2011-02-01

    Investigators planning dose-response studies of neuromuscular blockers have rarely used a priori power analysis to determine the minimal sample size their protocols require. Institutional Review Boards and peer-reviewed journals now generally ask for this information. This study outlines a proposed method for meeting these requirements. The slopes of the dose-response relationships of eight neuromuscular blocking agents were determined using regression analysis. These values were substituted for γ in the Hill equation. When this is done, the coefficient of variation (COV) around the mean value of the ED₅₀ for each drug is easily calculated. Using these values, we performed an a priori one-sample two-tailed t-test of the means to determine the required sample size when the allowable error in the ED₅₀ was varied from ±10-20%. The COV averaged 22% (range 15-27%). We used a COV value of 25% in determining the sample size. If the allowable error in finding the mean ED₅₀ is ±15%, a sample size of 24 is needed to achieve a power of 80%. Increasing 'accuracy' beyond this point requires increasingly greater sample sizes (e.g. an 'n' of 37 for a ±12% error). On the basis of the results of this retrospective analysis, a total sample size of not less than 24 subjects should be adequate for determining a neuromuscular blocking drug's clinical potency with a reasonable degree of assurance.
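
    The abstract's calculation can be reproduced with a standard power routine, assuming a one-sample two-tailed t-test: an allowable error of ±15% with a COV of 25% corresponds to a standardized effect size of 0.15/0.25 = 0.6.

        from statsmodels.stats.power import TTestPower

        # Standardized effect size implied by the allowable error and the COV.
        effect_size = 0.15 / 0.25

        # Solve for the sample size at 80% power and a two-sided alpha of 0.05.
        n = TTestPower().solve_power(effect_size=effect_size, alpha=0.05,
                                     power=0.80, alternative="two-sided")
        print("required sample size:", n)  # approximately 24, as in the abstract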

  7. An optimal set of landmarks for metopic craniosynostosis diagnosis from shape analysis of pediatric CT scans of the head

    NASA Astrophysics Data System (ADS)

    Mendoza, Carlos S.; Safdar, Nabile; Myers, Emmarie; Kittisarapong, Tanakorn; Rogers, Gary F.; Linguraru, Marius George

    2013-02-01

    Craniosynostosis (premature fusion of skull sutures) is a severe condition present in one of every 2000 newborns. Metopic craniosynostosis, accounting for 20-27% of cases, is diagnosed qualitatively in terms of skull shape abnormality, a subjective call of the surgeon. In this paper we introduce a new quantitative diagnostic feature for metopic craniosynostosis derived optimally from shape analysis of CT scans of the skull. We built a robust shape analysis pipeline that is capable of obtaining local shape differences in comparison to normal anatomy. Spatial normalization using 7-degree-of-freedom registration of the base of the skull is followed by a novel bone labeling strategy based on graph-cuts according to labeling priors. The statistical shape model built from 94 normal subjects allows matching a patient's anatomy to its most similar normal subject. Subsequently, the computation of local malformations from a normal subject allows characterization of the points of maximum malformation on each of the frontal bones adjacent to the metopic suture, and on the suture itself. Our results show that the malformations at these locations vary significantly (p<0.001) between abnormal/normal subjects and that an accurate diagnosis can be achieved using linear regression from these automatic measurements with an area under the curve for the receiver operating characteristic of 0.97.

  8. The Precision Efficacy Analysis for Regression Sample Size Method.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.; Barcikowski, Robert S.

    The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…

  9. Regularized quantile regression for SNP marker estimation of pig growth curves.

    PubMed

    Barroso, L M A; Nascimento, M; Nascimento, A C C; Silva, F F; Serão, N V L; Cruz, C D; Resende, M D V; Silva, F L; Azevedo, C F; Lopes, P S; Guimarães, S E F

    2017-01-01

    Genomic growth curves are generally defined only in terms of population mean; an alternative approach that has not yet been exploited in genomic analyses of growth curves is the Quantile Regression (QR). This methodology allows for the estimation of marker effects at different levels of the variable of interest. We aimed to propose and evaluate a regularized quantile regression for SNP marker effect estimation of pig growth curves, as well as to identify the chromosome regions of the most relevant markers and to estimate the genetic individual weight trajectory over time (genomic growth curve) under different quantiles (levels). The regularized quantile regression (RQR) enabled the discovery, at different levels of interest (quantiles), of the most relevant markers allowing for the identification of QTL regions. We found the same relevant markers simultaneously affecting different growth curve parameters (mature weight and maturity rate): two (ALGA0096701 and ALGA0029483) for RQR(0.2), one (ALGA0096701) for RQR(0.5), and one (ALGA0003761) for RQR(0.8). Three average genomic growth curves were obtained and the behavior was explained by the curve in quantile 0.2, which differed from the others. RQR allowed for the construction of genomic growth curves, which is the key to identifying and selecting the most desirable animals for breeding purposes. Furthermore, the proposed model enabled us to find, at different levels of interest (quantiles), the most relevant markers for each trait (growth curve parameter estimates) and their respective chromosomal positions (identification of new QTL regions for growth curves in pigs). These markers can be exploited under the context of marker assisted selection while aiming to change the shape of pig growth curves.
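
    A hedged sklearn sketch of L1-regularized quantile regression on simulated SNP genotypes at the abstract's three quantiles; marker effects, penalty strength and dimensions are invented for illustration.

        import numpy as np
        from sklearn.linear_model import QuantileRegressor

        # Simulated genotypes (0/1/2) for 300 pigs and 200 markers; two markers
        # drive weight, more strongly in the upper quantiles.
        rng = np.random.default_rng(8)
        X = rng.integers(0, 3, size=(300, 200)).astype(float)
        u = rng.random(300)
        weight = (60 + X[:, 0] * (2 + 4 * u) + X[:, 1] * 3 * u
                  + rng.normal(0, 2, 300))

        # The L1 penalty shrinks most marker effects to zero, leaving the most
        # relevant markers at each level (quantile) of interest.
        for q in (0.2, 0.5, 0.8):
            fit = QuantileRegressor(quantile=q, alpha=0.1, solver="highs").fit(X, weight)
            top = np.argsort(np.abs(fit.coef_))[::-1][:3]
            print(f"q={q}: most relevant markers {top}")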

  10. Effect of Contact Damage on the Strength of Ceramic Materials.

    DTIC Science & Technology

    1982-10-01

    A dimensional analysis identifies the variables that are important to erosion, and a multivariate linear regression analysis is used to fit the data to the dimensional analysis; the exponents of the governing equations are computed by multivariable regression of room-temperature data.

  11. Methods for estimating magnitude and frequency of 1-, 3-, 7-, 15-, and 30-day flood-duration flows in Arizona

    USGS Publications Warehouse

    Kennedy, Jeffrey R.; Paretti, Nicholas V.; Veilleux, Andrea G.

    2014-01-01

    Regression equations, which allow predictions of n-day flood-duration flows for selected annual exceedance probabilities at ungaged sites, were developed using generalized least-squares regression and flood-duration flow frequency estimates at 56 streamgaging stations within a single, relatively uniform physiographic region in the central part of Arizona, between the Colorado Plateau and Basin and Range Province, called the Transition Zone. Drainage area explained most of the variation in the n-day flood-duration annual exceedance probabilities, but mean annual precipitation and mean elevation were also significant variables in the regression models. Standard error of prediction for the regression equations varies from 28 to 53 percent and generally decreases with increasing n-day duration. Outside the Transition Zone there are insufficient streamgaging stations to develop regression equations, but flood-duration flow frequency estimates are presented at select streamgaging stations.

  12. Systolic time interval v heart rate regression equations using atropine: reproducibility studies.

    PubMed Central

    Kelman, A W; Sumner, D J; Whiting, B

    1981-01-01

    1. Systolic time intervals (STI) were recorded in six normal male subjects over a period of 3 weeks. On one day per week, each subject received incremental doses of atropine intravenously to increase heart rate, allowing the determination of individual STI v HR regression equations. On the other days STI were recorded with the subjects resting, in the supine position. 2. There were highly significant regression relationships between heart rate and both LVET and QS2, but not between heart rate and PEP. 3. The regression relationships showed little intra-subject variability, but a large degree of inter-subject variability: they proved adequate to correct the STI for the daily fluctuations in heart rate. 4. Administration of small doses of atropine intravenously provides a satisfactory and convenient method of deriving individual STI v HR regression equations which can be applied over a period of weeks. PMID:7248136

  13. Systolic time interval v heart rate regression equations using atropine: reproducibility studies.

    PubMed

    Kelman, A W; Sumner, D J; Whiting, B

    1981-07-01

    1. Systolic time intervals (STI) were recorded in six normal male subjects over a period of 3 weeks. On one day per week, each subject received incremental doses of atropine intravenously to increase heart rate, allowing the determination of individual STI v HR regression equations. On the other days STI were recorded with the subjects resting, in the supine position. 2. There were highly significant regression relationships between heart rate and both LVET and QS2, but not between heart rate and PEP. 3. The regression relationships showed little intra-subject variability, but a large degree of inter-subject variability: they proved adequate to correct the STI for the daily fluctuations in heart rate. 4. Administration of small doses of atropine intravenously provides a satisfactory and convenient method of deriving individual STI v HR regression equations which can be applied over a period of weeks.

  14. Dietary patterns and risk of ductal carcinoma of the breast: a factor analysis in Uruguay.

    PubMed

    Ronco, Alvaro L; De Stefani, Eduardo; Deneo-Pellegrini, Hugo; Boffetta, Paolo; Aune, Dagfinn; Silva, Cecilia; Landó, Gabriel; Luaces, María E; Acosta, Gisele; Mendilaharsu, María

    2010-01-01

    Breast cancer (BC) shows very high incidence rates in Uruguayan women. The present factor analysis of ductal carcinoma of the breast, the most frequent histological type of this malignancy both in Uruguay and in the World, was conducted at a prepaid hospital of Montevideo, Uruguay. We identified 111 cases with ductal BC and 222 controls with normal mammograms. A factor analysis was conducted using 39 food groups, allowing retention of six factors analyzed through logistic regression in order to obtain odds ratios (OR) associated with ductal BC. The low fat and non-alcoholic beverage patterns were inversely associated (OR=0.30 and OR=0.45, respectively) with risk. Conversely, the fatty cheese pattern was positively associated (OR=4.17) as well as the fried white meat (OR=2.28) and Western patterns (OR 2.13). Ductal BC shared similar dietary risk patterns as those identified by studies not discriminating between histologic type of breast cancer.

  15. Disconcordance in Statistical Models of Bisphenol A and Chronic Disease Outcomes in NHANES 2003-08

    PubMed Central

    Casey, Martin F.; Neidell, Matthew

    2013-01-01

    Background Bisphenol A (BPA), a high production chemical commonly found in plastics, has drawn great attention from researchers due to the substance’s potential toxicity. Using data from three National Health and Nutrition Examination Survey (NHANES) cycles, we explored the consistency and robustness of BPA’s reported effects on coronary heart disease and diabetes. Methods And Findings We report the use of three different statistical models in the analysis of BPA: (1) logistic regression, (2) log-linear regression, and (3) dose-response logistic regression. In each variation, confounders were added in six blocks to account for demographics, urinary creatinine, source of BPA exposure, healthy behaviours, and phthalate exposure. Results were sensitive to the variations in functional form of our statistical models, but no single model yielded consistent results across NHANES cycles. Reported ORs were also found to be sensitive to inclusion/exclusion criteria. Further, observed effects, which were most pronounced in NHANES 2003-04, could not be explained away by confounding. Conclusions Limitations in the NHANES data and a poor understanding of the mode of action of BPA have made it difficult to develop informative statistical models. Given the sensitivity of effect estimates to functional form, researchers should report results using multiple specifications with different assumptions about BPA measurement, thus allowing for the identification of potential discrepancies in the data. PMID:24223205

  16. Prediction of erodibility in Oxisols using iron oxides, soil color and diffuse reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Arantes Camargo, Livia; Marques, José, Jr.

    2015-04-01

    The prediction of erodibility using indirect methods such as diffuse reflectance spectroscopy could facilitate the characterization of spatial variability in large areas and optimize the implementation of conservation practices. The aim of this study was to evaluate the prediction of interrill erodibility (Ki) and rill erodibility (Kr) from iron oxide content and soil color using multiple linear regression, and from diffuse reflectance spectroscopy (DRS) using partial least squares regression (PLSR). The soils were collected from three geomorphic surfaces, analyzed for chemical, physical and mineralogical properties, and scanned in the visible and infrared spectral range. Maps of the spatial distribution of Ki and Kr were built by geostatistics from the values calculated by the most accurate calibrated models. Interrill and rill erodibility presented negative correlations with iron extracted by dithionite-citrate-bicarbonate, hematite, and chroma, confirming the influence of iron oxides on soil structural stability. Hematite and hue were the attributes that contributed most to the calibration models by multiple linear regression for the prediction of Ki (R2 = 0.55) and Kr (R2 = 0.53). Diffuse reflectance spectroscopy via PLSR allowed interrill and rill erodibility to be predicted with high accuracy (R2adj = 0.76 and 0.81, respectively, and RPD > 2.0) in the visible range of the spectrum (380-800 nm), and their spatial variability to be characterized by geostatistics.

  17. Common pitfalls in statistical analysis: Linear regression analysis

    PubMed Central

    Aggarwal, Rakesh; Ranganathan, Priya

    2017-01-01

    In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022

  18. Application of Fourier transform infrared spectroscopy and orthogonal projections to latent structures/partial least squares regression for estimation of procyanidins average degree of polymerisation.

    PubMed

    Passos, Cláudia P; Cardoso, Susana M; Barros, António S; Silva, Carlos M; Coimbra, Manuel A

    2010-02-28

    Fourier transform infrared (FTIR) spectroscopy has been emphasised as a widespread technique for the quick assessment of food components. In this work, procyanidins were extracted with methanol and acetone/water from the seeds of white and red grape varieties. A fractionation by graded methanol/chloroform precipitations allowed 26 samples to be obtained, which were characterised using thiolysis as pre-treatment followed by HPLC-UV and MS detection. The average degree of polymerisation (DPn) of the procyanidins in the samples ranged from 2 to 11 flavan-3-ol residues. FTIR spectroscopy within the wavenumber region of 1800-700 cm(-1) allowed a partial least squares (PLS1) regression model with 8 latent variables (LVs) to be built for the estimation of the DPn, giving a RMSECV of 11.7%, with a R(2) of 0.91 and a RMSEP of 2.58. The application of orthogonal projection to latent structures (O-PLS1) clarifies the interpretation of the regression model vectors. Moreover, the O-PLS procedure removed 88% of the variation not correlated with the DPn, allowing the increase of the absorbance peaks at 1203 and 1099 cm(-1) to be related to the increase of the DPn due to the higher proportion of substitutions in the aromatic ring of the polymerised procyanidin molecules. Copyright 2009 Elsevier B.V. All rights reserved.

  19. Regression-Based Norms for a Bi-factor Model for Scoring the Brief Test of Adult Cognition by Telephone (BTACT).

    PubMed

    Gurnani, Ashita S; John, Samantha E; Gavett, Brandon E

    2015-05-01

    The current study developed regression-based normative adjustments for a bi-factor model of the Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation, alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
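
    A minimal sketch of how such regression-based norms are applied in practice: the demographically predicted score is subtracted from the observed score and scaled by the residual standard deviation. The coefficients below are illustrative placeholders, not the published models.

```python
def adjusted_z(observed, age, education,
               b0=0.0, b_age=-0.02, b_edu=0.05, residual_sd=0.9):
    """Demographically adjusted z-score from a regression-based norm."""
    predicted = b0 + b_age * age + b_edu * education
    return (observed - predicted) / residual_sd

# A 62-year-old with 16 years of education scoring 0.3 on the bi-factor scale
print(round(adjusted_z(observed=0.3, age=62, education=16), 2))
```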

  20. Modeling absolute differences in life expectancy with a censored skew-normal regression approach

    PubMed Central

    Clough-Gorr, Kerri; Zwahlen, Marcel

    2015-01-01

    Parameter estimates from commonly used multivariable parametric survival regression models do not directly quantify differences in years of life expectancy. Gaussian linear regression models give results in terms of absolute mean differences, but are not appropriate in modeling life expectancy, because in many situations time to death has a negatively skewed distribution. A regression approach using a skew-normal distribution would be an alternative to parametric survival models in the modeling of life expectancy, because parameter estimates can be interpreted in terms of survival time differences while allowing for skewness of the distribution. In this paper we show how to use the skew-normal regression so that censored and left-truncated observations are accounted for. With this we model differences in life expectancy using data from the Swiss National Cohort Study and from official life expectancy estimates and compare the results with those derived from commonly used survival regression models. We conclude that a censored skew-normal survival regression approach for left-truncated observations can be used to model differences in life expectancy across covariates of interest. PMID:26339544
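
    A minimal sketch of the likelihood construction, assuming right-censoring only: observed deaths contribute the skew-normal density and censored subjects its survival function. Data and starting values are simulated placeholders.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)
n = 500
x = rng.binomial(1, 0.5, n)                       # covariate of interest
t = 80 - 5 * x + stats.skewnorm.rvs(-4, scale=10, size=n, random_state=rng)
c = rng.uniform(70, 120, n)                       # administrative censoring times
y, event = np.minimum(t, c), (t <= c)             # event=True when death observed

def negloglik(params):
    b0, b1, log_scale, shape = params
    mu, scale = b0 + b1 * x, np.exp(log_scale)
    logpdf = stats.skewnorm.logpdf(y, shape, loc=mu, scale=scale)   # deaths
    logsf = stats.skewnorm.logsf(y, shape, loc=mu, scale=scale)     # censored
    return -np.sum(np.where(event, logpdf, logsf))

res = optimize.minimize(negloglik, x0=[80.0, 0.0, np.log(10.0), -1.0],
                        method="Nelder-Mead")
print(res.x[1])   # b1 estimates the survival-time difference across x
```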

  1. Widen NomoGram for multinomial logistic regression: an application to staging liver fibrosis in chronic hepatitis C patients.

    PubMed

    Ardoino, Ilaria; Lanzoni, Monica; Marano, Giuseppe; Boracchi, Patrizia; Sagrini, Elisabetta; Gianstefani, Alice; Piscaglia, Fabio; Biganzoli, Elia M

    2017-04-01

    The interpretation of regression model results can often benefit from the generation of nomograms, 'user friendly' graphical devices especially useful for assisting decision-making processes. However, in the case of multinomial regression models, whenever categorical responses with more than two classes are involved, nomograms cannot be drawn in the conventional way. This difficulty in presenting and interpreting the outcome often limits the use of multinomial regression in decision-making support. In the present paper, we illustrate the derivation of a non-conventional nomogram for multinomial regression models, intended to overcome this issue. Although it may appear less straightforward at first sight, the proposed methodology allows an easy interpretation of the results of multinomial regression models and makes them more accessible to clinicians and general practitioners. The development of a prediction model based on multinomial logistic regression, and of the pertinent graphical tool, is illustrated by means of an example involving the prediction of the extent of liver fibrosis in hepatitis C patients from routinely available markers.
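
    A minimal sketch of the model underlying such a nomogram (not of the graphical device itself): a multinomial logistic regression over three fibrosis classes fitted on two routinely available markers; the data and marker names are simulated placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))          # e.g. scaled platelet count and AST (hypothetical)
stage = rng.integers(0, 3, 300)        # fibrosis extent: mild / moderate / severe

clf = LogisticRegression(max_iter=1000).fit(X, stage)   # multinomial by default
print(clf.predict_proba(X[:3]))        # per-patient class probabilities
# A nomogram is essentially a graphical rendering of these linear predictors.
```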

  2. MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.

    PubMed

    Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin

    2015-04-01

    Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype model. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  3. myPhyloDB: a local web server for the storage and analysis of metagenomic data.

    PubMed

    Manter, Daniel K; Korsa, Matthew; Tebbe, Caleb; Delgado, Jorge A

    2016-01-01

    myPhyloDB v.1.1.2 is a user-friendly personal database with a browser interface designed to facilitate the storage, processing, analysis, and distribution of microbial community populations (e.g. 16S metagenomics data). MyPhyloDB archives raw sequencing files, and allows for easy selection of project(s)/sample(s) of any combination from all available data in the database. The data processing capabilities of myPhyloDB are also flexible enough to allow the upload and storage of pre-processed data, or use of the built-in Mothur pipeline to automate the processing of raw sequencing data. myPhyloDB provides several analytical (e.g. analysis of covariance, t-tests, linear regression, differential abundance (DESeq2), and principal coordinates analysis (PCoA)) and normalization (rarefaction, DESeq2, and proportion) tools for the comparative analysis of taxonomic abundance, species richness and species diversity for projects of various types (e.g. human-associated, human gut microbiome, air, soil, and water) for any taxonomic level(s) desired. Finally, since myPhyloDB is a local web server, users can quickly distribute data between colleagues and end-users by simply granting others access to their personal myPhyloDB database. myPhyloDB is available at http://www.ars.usda.gov/services/software/download.htm?softwareid=472 and more information along with tutorials can be found on our website http://www.myphylodb.org. Database URL: http://www.myphylodb.org. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the United States.

  4. Analysis of cohort studies with multivariate and partially observed disease classification data.

    PubMed

    Chatterjee, Nilanjan; Sinha, Samiran; Diver, W Ryan; Feigelson, Heather Spencer

    2010-09-01

    Complex diseases like cancers can often be classified into subtypes using various pathological and molecular traits of the disease. In this article, we develop methods for analysis of disease incidence in cohort studies incorporating data on multiple disease traits using a two-stage semiparametric Cox proportional hazards regression model that allows one to examine the heterogeneity in the effect of the covariates by the levels of the different disease traits. For inference in the presence of missing disease traits, we propose a generalization of an estimating equation approach for handling missing cause of failure in competing-risk data. We prove asymptotic unbiasedness of the estimating equation method under a general missing-at-random assumption and propose a novel influence-function-based sandwich variance estimator. The methods are illustrated using simulation studies and a real data application involving the Cancer Prevention Study II nutrition cohort.

  5. Where do they come from and where do they go: implications of geographic origins of medical students.

    PubMed

    Pretorius, Richard W; Lichter, Michael I; Okazaki, Goroh; Sellick, John A

    2010-10-01

    Can prior places of residence listed on a medical school application predict where a physician will practice in midcareer? Geographic data were analyzed for a cohort of 399 graduates from a single U.S. medical school. Applicants with origins in the local region had a 40.4% to 49.5% probability of practicing locally in midcareer--an increased likelihood of 6.1 to 7.3 (P < .001) by bivariate analysis. In a logistic regression analysis, residence at birth (odds ratio [OR] = 2.6, P = .019) and at college graduation (OR = 2.8, P = .001) were significant predictors of midcareer practice location, but residence at high school graduation and on application to medical school were not. Midcareer practice location is related to geographic origins. Using multiple indicators of geographic origins available at the time of application can allow admissions committees to make higher-quality decisions.

  6. Low iron stores: a risk factor for excessive hair loss in non-menopausal women.

    PubMed

    Deloche, Claire; Bastien, Philippe; Chadoutaud, Stéphanie; Galan, Pilar; Bertrais, Sandrine; Hercberg, Serge; de Lacharrière, Olivier

    2007-01-01

    Iron deficiency has been suspected to represent one of the possible causes of excessive hair loss in women. The aim of our study was to assess this relationship in a very large population of 5110 women aged between 35 and 60 years. Hair loss was evaluated using a standardized questionnaire sent to all volunteers. The iron status was assessed by a serum ferritin assay carried out in each volunteer. Multivariate analysis allowed us to identify three categories: "absence of hair loss" (43%), "moderate hair loss" (48%) and "excessive hair loss" (9%). Among the women affected by excessive hair loss, a larger proportion of women (59%) had low iron stores (< 40 microg/L) compared to the remainder of the population (48%). Analysis of variance and logistic regression show that a low iron store represents a risk factor for hair loss in non-menopausal women.

  7. Comprehensive Chemical Fingerprinting of High-Quality Cocoa at Early Stages of Processing: Effectiveness of Combined Untargeted and Targeted Approaches for Classification and Discrimination.

    PubMed

    Magagna, Federico; Guglielmetti, Alessandro; Liberto, Erica; Reichenbach, Stephen E; Allegrucci, Elena; Gobino, Guido; Bicchi, Carlo; Cordero, Chiara

    2017-08-02

    This study investigates chemical information of volatile fractions of high-quality cocoa (Theobroma cacao L. Malvaceae) from different origins (Mexico, Ecuador, Venezuela, Colombia, Java, Trinidad, and Sao Tomè) produced for fine chocolate. This study explores the evolution of the entire pattern of volatiles in relation to cocoa processing (raw, roasted, steamed, and ground beans). Advanced chemical fingerprinting (e.g., combined untargeted and targeted fingerprinting) with comprehensive two-dimensional gas chromatography coupled with mass spectrometry allows advanced pattern recognition for classification, discrimination, and sensory-quality characterization. The entire data set is analyzed for 595 reliable two-dimensional peak regions, including 130 known analytes and 13 potent odorants. Multivariate analysis with unsupervised exploration (principal component analysis) and simple supervised discrimination methods (Fisher ratios and linear regression trees) reveals informative patterns of similarities and differences and identifies characteristic compounds related to sample origin and manufacturing step.

  8. Impact of LANDSAT MSS sensor differences on change detection analysis

    NASA Technical Reports Server (NTRS)

    Likens, W. C.; Wrigley, R. C.

    1983-01-01

    Some 512 by 512 pixel subwindows for simultaneously acquired scene pairs obtained by LANDSAT 2, 3, and 4 multispectral band scanners were coregistered using LANDSAT 4 scenes as the base to which the other images were registered. Scattergrams between the coregistered scenes (a form of contingency analysis) were used to radiometrically compare data from the various sensors. Mode values were derived and used to visually fit a linear regression. Root mean square errors of the registration varied between .1 and 1.5 pixels. There appear to be no major problems preventing the use of LANDSAT 4 MSS with previous MSS sensors for change detection, provided the noise interference can be removed or minimized. Data normalizations for change detection should be based on the data rather than solely on calibration information. This allows simultaneous normalization of the atmosphere as well as the radiometry.

  9. Metabolomic approach for determination of key volatile compounds related to beef flavor in glutathione-Maillard reaction products.

    PubMed

    Lee, Sang Mi; Kwon, Goo Young; Kim, Kwang-Ok; Kim, Young-Suk

    2011-10-10

    A non-targeted analysis, combining gas chromatography coupled with time-of-flight mass spectrometry (GC-TOF/MS) and sensory evaluation, was applied to investigate the relationship between volatile compounds and the sensory attributes of glutathione-Maillard reaction products (GSH-MRPs) prepared under different reaction conditions. Volatile compounds in GSH-MRPs correlating to the sensory attributes were determined using partial least-squares (PLS) regression. Volatile compounds such as 2-methylfuran-3-thiol, 3-sulfanylpentan-2-one, furan-2-ylmethanethiol, 2-propylpyrazine, 1-furan-2-ylpropan-2-one, 1H-pyrrole, 2-methylthiophene, and 2-(furan-2-ylmethyldisulfanylmethyl)furan could be identified as possible key contributors to the beef-related attributes of GSH-MRPs. In this study, we demonstrated that unbiased non-targeted analysis based on a metabolomic approach allows the identification of key volatile compounds related to beef flavor in GSH-MRPs. Copyright © 2011 Elsevier B.V. All rights reserved.

  10. The effect of hospital organizational characteristics on postoperative complications.

    PubMed

    Knight, Margaret

    2013-12-01

    To determine if there is a relationship between the risk of postoperative complications and the nonclinical hospital characteristics of bed size, ownership structure, relative urbanicity, regional location, teaching status, and area income status. This study involved a secondary analysis of 2006 administrative hospital data from a number of U.S. states. This data, gathered annually by the Agency for Healthcare Research and Quality (AHRQ) via the National Inpatient Sample (NIS) Healthcare Utilization Project (HCUP), was analyzed using probit regressions to measure the effects of several nonclinical hospital categories on seven diagnostic groupings. The study model included postoperative complications as well as additional potentially confounding variables. The results showed mixed outcomes for each of the hospital characteristic groupings. Subdividing these groupings to correspond with the HCUP data analysis allowed a greater understanding of how hospital characteristics may affect postoperative outcomes. Nonclinical hospital characteristics do affect the various postoperative complications, but they do so inconsistently.

  11. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life studies have an important role in health care, especially for chronic diseases, in clinical judgment and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normally distributed the results can be misleading. The aim of this study was to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare it to linear regression. A cross-sectional study was conducted on 119 breast cancer patients admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors, and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis, and dyspnea was significant only for the first quartile. Also, emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about predictors of breast cancer patient quality of life.

  12. Statistical tools for transgene copy number estimation based on real-time PCR.

    PubMed

    Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal

    2007-11-01

    As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, real-time PCR based transgene copy number estimation tends to be ambiguous and subjective, stemming from the lack of proper statistical analysis and data quality control needed to render a reliable copy number estimate with a prediction value. Despite the recent progress in statistical analysis of real-time PCR, few publications have integrated these advancements into real-time PCR based transgene copy number determination. Three experimental designs and four statistical models with integrated data quality control are presented. In the first method, external calibration curves are established for the transgene based on serially-diluted templates. The Ct numbers from a control transgenic event and a putative transgenic event are compared to derive the transgene copy number or zygosity estimation. Simple linear regression and two-group t-test procedures were combined to model the data from this design. For the second experimental design, standard curves were generated for both an internal reference gene and the transgene, and the copy number of the transgene was compared with that of the internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third experimental design, transgene copy number is compared with the reference gene without a standard curve, based directly on fluorescence data. Two different multiple regression models were proposed to analyze the data, based on two different approaches to amplification efficiency integration. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. These statistical methods make real-time PCR-based transgene copy number estimation more reliable and precise, and proper confidence intervals are necessary for unambiguous prediction of transgene copy number. The four different statistical methods are compared for their advantages and disadvantages. Moreover, the statistical methods can also be applied to other real-time PCR-based quantification assays, including transfection efficiency analysis and pathogen quantification.
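
    A minimal sketch of the first design (external calibration curve), assuming the standard log-linear relation between Ct and template amount; the Ct values are illustrative, not data from the study.

```python
import numpy as np
from scipy import stats

log10_template = np.arange(6.0)                         # serial 10-fold dilutions
ct = np.array([34.1, 30.8, 27.4, 24.0, 20.7, 17.3])     # illustrative Ct values

slope, intercept, r, p, se = stats.linregress(log10_template, ct)
efficiency = 10 ** (-1 / slope) - 1                     # ~1.0 for perfect doubling

# Compare a putative event against a single-copy control event
ct_control, ct_putative = 24.0, 23.0                    # illustrative mean Ct values
copy_ratio = (1 + efficiency) ** (ct_control - ct_putative)
print(f"slope = {slope:.2f}, efficiency = {efficiency:.2f}, "
      f"estimated copy number ratio = {copy_ratio:.2f}")
```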

  13. The microcomputer scientific software series 2: general linear model--regression.

    Treesearch

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  14. The Dissolution Behavior of Borosilicate Glasses in Far-From Equilibrium Conditions

    DOE PAGES

    Neeway, James J.; Rieke, Peter C.; Parruzot, Benjamin P.; ...

    2018-02-10

    An area of agreement in the waste glass corrosion community is that, at far-from-equilibrium conditions, the dissolution of borosilicate glasses used to immobilize nuclear waste is known to be a function of both temperature and pH. The aim of this work is to study the effects of temperature and pH on the dissolution rate of three model nuclear waste glasses (SON68, ISG, AFCI). The dissolution rate data are then used to parameterize a kinetic rate model based on Transition State Theory that has been developed to model glass corrosion behavior in dilute conditions. To do this, experiments were conducted at temperatures of 23, 40, 70, and 90 °C and pH(22 °C) values of 9, 10, 11, and 12 with the single-pass flow-through (SPFT) test method. Both the absolute dissolution rates and the rate model parameters are compared with previous results. Rate model parameters for the three glasses studied here are nearly equivalent within error and in relative agreement with previous studies, though quantifiable differences exist. The glass dissolution rates were analyzed with a linear multivariate regression (LMR) and a nonlinear multivariate regression performed with the use of the Glass Corrosion Modeling Tool (GCMT), with which a robust uncertainty analysis is performed. This robust analysis highlights the high degree of correlation of various parameters in the kinetic rate model. As more data are obtained on borosilicate glasses with varying compositions, a mathematical description of the effect of glass composition on the rate parameter values should be possible. This would allow for the possibility of calculating the forward dissolution rate of glass based solely on composition. In addition, the method of determination of parameter uncertainty and correlation provides a framework for other rate models that describe the dissolution rates of other amorphous and crystalline materials in a wide range of chemical conditions. As a result, the higher level of uncertainty analysis would provide a basis for comparison of different rate models and allow for a better means of quantifiably comparing the various models.
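
    A minimal sketch of parameterizing such a rate law in its common dilute-condition form, rate = k0 * 10^(eta * pH) * exp(-Ea / (R * T)), by nonlinear least squares on the log rate; the parameter values are placeholders, not the SON68/ISG/AFCI results.

```python
import numpy as np
from scipy.optimize import curve_fit

R = 8.314  # J/(mol K)

def log_rate(X, log_k0, eta, Ea):
    pH, T = X
    return log_k0 + eta * pH * np.log(10) - Ea / (R * T)

pH = np.repeat([9.0, 10.0, 11.0, 12.0], 4)              # test matrix
T = np.tile([296.15, 313.15, 343.15, 363.15], 4)        # 23, 40, 70, 90 degC in K
y = log_rate((pH, T), np.log(1e7), 0.4, 80e3)           # synthetic "measured" log rates
y += np.random.default_rng(2).normal(0, 0.05, y.size)

params, cov = curve_fit(log_rate, (pH, T), y, p0=(10.0, 0.3, 60e3))
print(params)                   # log k0, pH power-law coefficient, activation energy
print(np.sqrt(np.diag(cov)))    # standard errors; off-diagonals of cov show the
                                # parameter correlations the abstract highlights
```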

  15. The dissolution behavior of borosilicate glasses in far-from equilibrium conditions

    NASA Astrophysics Data System (ADS)

    Neeway, James J.; Rieke, Peter C.; Parruzot, Benjamin P.; Ryan, Joseph V.; Asmussen, R. Matthew

    2018-04-01

    An area of agreement in the waste glass corrosion community is that, at far-from-equilibrium conditions, the dissolution of borosilicate glasses used to immobilize nuclear waste is known to be a function of both temperature and pH. The aim of this work is to study the effects of temperature and pH on the dissolution rate of three model nuclear waste glasses (SON68, ISG, AFCI). The dissolution rate data are then used to parameterize a kinetic rate model based on Transition State Theory that has been developed to model glass corrosion behavior in dilute conditions. To do this, experiments were conducted at temperatures of 23, 40, 70, and 90 °C and pH (22 °C) values of 9, 10, 11, and 12 with the single-pass flow-through (SPFT) test method. Both the absolute dissolution rates and the rate model parameters are compared with previous results. Rate model parameters for the three glasses studied here are nearly equivalent within error and in relative agreement with previous studies though quantifiable differences exist. The glass dissolution rates were analyzed with a linear multivariate regression (LMR) and a nonlinear multivariate regression performed with the use of the Glass Corrosion Modeling Tool (GCMT), with which a robust uncertainty analysis is performed. This robust analysis highlights the high degree of correlation of various parameters in the kinetic rate model. As more data are obtained on borosilicate glasses with varying compositions, a mathematical description of the effect of glass composition on the rate parameter values should be possible. This would allow for the possibility of calculating the forward dissolution rate of glass based solely on composition. In addition, the method of determination of parameter uncertainty and correlation provides a framework for other rate models that describe the dissolution rates of other amorphous and crystalline materials in a wide range of chemical conditions. The higher level of uncertainty analysis would provide a basis for comparison of different rate models and allow for a better means of quantifiably comparing the various models.

  16. The Dissolution Behavior of Borosilicate Glasses in Far-From Equilibrium Conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Neeway, James J.; Rieke, Peter C.; Parruzot, Benjamin P.

    An area of agreement in the waste glass corrosion community is that, at far-from-equilibrium conditions, the dissolution of borosilicate glasses used to immobilize nuclear waste is known to be a function of both temperature and pH. The aim of this work is to study the effects of temperature and pH on the dissolution rate of three model nuclear waste glasses (SON68, ISG, AFCI). The dissolution rate data are then used to parameterize a kinetic rate model based on Transition State Theory that has been developed to model glass corrosion behavior in dilute conditions. To do this, experiments were conducted at temperatures of 23, 40, 70, and 90 °C and pH(22 °C) values of 9, 10, 11, and 12 with the single-pass flow-through (SPFT) test method. Both the absolute dissolution rates and the rate model parameters are compared with previous results. Rate model parameters for the three glasses studied here are nearly equivalent within error and in relative agreement with previous studies, though quantifiable differences exist. The glass dissolution rates were analyzed with a linear multivariate regression (LMR) and a nonlinear multivariate regression performed with the use of the Glass Corrosion Modeling Tool (GCMT), with which a robust uncertainty analysis is performed. This robust analysis highlights the high degree of correlation of various parameters in the kinetic rate model. As more data are obtained on borosilicate glasses with varying compositions, a mathematical description of the effect of glass composition on the rate parameter values should be possible. This would allow for the possibility of calculating the forward dissolution rate of glass based solely on composition. In addition, the method of determination of parameter uncertainty and correlation provides a framework for other rate models that describe the dissolution rates of other amorphous and crystalline materials in a wide range of chemical conditions. As a result, the higher level of uncertainty analysis would provide a basis for comparison of different rate models and allow for a better means of quantifiably comparing the various models.

  17. Predicting SF-6D utility scores from the Oswestry disability index and numeric rating scales for back and leg pain.

    PubMed

    Carreon, Leah Y; Glassman, Steven D; McDonough, Christine M; Rampersaud, Raja; Berven, Sigurd; Shainline, Michael

    2009-09-01

    Cross-sectional cohort. The purpose of this study is to provide a model to allow estimation of utility from the Short Form (SF)-6D using data from the Oswestry Disability Index (ODI), Back Pain Numeric Rating Scale (BPNRS), and the Leg Pain Numeric Rating Scale (LPNRS). Cost-utility analysis provides important information about the relative value of interventions and requires a measure of utility not often available from clinical trial data. The ODI and numeric rating scales for back (BPNRS) and leg pain (LPNRS) are widely used disease-specific measures for health-related quality of life in patients with lumbar degenerative disorders. The purpose of this study is to provide a model to allow estimation of utility from the SF-6D using data from the ODI, BPNRS, and the LPNRS. SF-36, ODI, BPNRS, and LPNRS were prospectively collected before surgery and at 12 and 24 months after surgery in 2640 patients undergoing lumbar fusion for degenerative disorders. Spearman correlation coefficients for paired observations from multiple time points between ODI, BPNRS, and LPNRS, and SF-6D utility scores were determined. Regression modeling was done to compute the SF-6D score from the ODI, BPNRS, and LPNRS. Using a separate, independent dataset of 2174 patients in which actual SF-6D and ODI scores were available, the SF-6D was estimated for each subject and compared to their actual SF-6D. In the development sample, the mean age was 52.5 +/- 15 years and 34% were male. In the validation sample, the mean age was 52.9 +/- 14.2 years and 44% were male. Correlations between the SF-6D and the ODI, BPNRS, and LPNRS were statistically significant (P < 0.0001) with correlation coefficients of 0.82, 0.78, and 0.72, respectively. The regression equation using ODI, BPNRS, and LPNRS to predict SF-6D had an R2 of 0.69 and a root mean square error of 0.076. The model using ODI alone had an R2 of 0.67 and a root mean square error of 0.078. The correlation coefficient between the observed and estimated SF-6D score was 0.80. In the validation analysis, there was no statistically significant difference (P = 0.11) between the actual mean SF-6D (0.55 +/- 0.12) and the estimated mean SF-6D score (0.55 +/- 0.10) using the ODI regression model. This regression-based algorithm may be used to predict SF-6D scores in studies of lumbar degenerative disease that have collected ODI but not utility scores.

  18. Predicting SF-6D utility scores from the Oswestry Disability Index and Numeric Rating Scales for Back and Leg Pain

    PubMed Central

    Carreon, Leah Y.; Glassman, Steven D.; McDonough, Christine M.; Rampersaud, Raja; Berven, Sigurd; Shainline, Michael

    2012-01-01

    Study Design: Cross-sectional cohort. Objective: The purpose of this study is to provide a model to allow estimation of utility from the SF-6D using data from the ODI, BPNRS, and the LPNRS. Summary of Background Data: Cost-utility analysis provides important information about the relative value of interventions and requires a measure of utility not often available from clinical trial data. The Oswestry Disability Index (ODI) and numeric rating scales for back (BPNRS) and leg pain (LPNRS) are widely used disease-specific measures for health-related quality of life in patients with lumbar degenerative disorders. The purpose of this study is to provide a model to allow estimation of utility from the SF-6D using data from the ODI, BPNRS, and the LPNRS. Methods: SF-36, ODI, BPNRS and LPNRS were prospectively collected pre-operatively and at 12 and 24 months post-operatively in 2640 patients undergoing lumbar fusion for degenerative disorders. Spearman correlation coefficients for paired observations from multiple time points between ODI, BPNRS and LPNRS and SF-6D utility scores were determined. Regression modeling was done to compute the SF-6D score from the ODI, BPNRS and LPNRS. Using a separate, independent dataset of 2174 patients in which actual SF-6D and ODI scores were available, the SF-6D was estimated for each subject and compared to their actual SF-6D. Results: In the development sample, the mean age was 52.5 ± 15 years and 34% were male. In the validation sample, the mean age was 52.9 ± 14.2 years and 44% were male. Correlations between the SF-6D and the ODI, BPNRS and LPNRS were statistically significant (p<0.0001) with correlation coefficients of 0.82, 0.78, and 0.72 respectively. The regression equation using ODI, BPNRS and LPNRS to predict SF-6D had an R2 of 0.69 and a root mean square error (RMSE) of 0.076. The model using ODI alone had an R2 of 0.67 and a RMSE of 0.078. The correlation coefficient between the observed and estimated SF-6D score was 0.80. In the validation analysis, there was no statistically significant difference (p=0.11) between actual mean SF-6D (0.55 ± 0.12) and the estimated mean SF-6D score (0.55 ± 0.10) using the ODI regression model. Conclusion: This regression-based algorithm may be used to predict SF-6D scores in studies of lumbar degenerative disease that have collected ODI but not utility scores. PMID:19730215

  19. USAF (United States Air Force) Stability and Control DATCOM (Data Compendium)

    DTIC Science & Technology

    1978-04-01

    ...regression analysis of mathematical statistics. In general, a regression analysis involves the study of a group of variables to determine their effect on a given parameter. Because of the empirical nature of this...

  20. Regression-Based Approach For Feature Selection In Classification Issues. Application To Breast Cancer Detection And Recurrence

    NASA Astrophysics Data System (ADS)

    Belciug, Smaranda; Serbanescu, Mircea-Sebastian

    2015-09-01

    Feature selection is considered a key factor in classification/decision problems. It is currently used in designing intelligent decision systems to choose the features that allow the best performance. This paper proposes a regression-based approach to select the most important predictors in order to significantly increase classification performance. Application to breast cancer detection and recurrence using publicly available datasets proved the efficiency of this technique.

  1. A Method of Calculating Functional Independence Measure at Discharge from Functional Independence Measure Effectiveness Predicted by Multiple Regression Analysis Has a High Degree of Predictive Accuracy.

    PubMed

    Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru

    2017-09-01

    Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from the mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By entering the predicted mFIM effectiveness obtained through multiple regression analysis into this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from the mFIM effectiveness predicted by multiple regression analysis gave a higher degree of predictive accuracy for mFIM at discharge than direct prediction. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
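
    A worked example of the paper's formula, with an illustrative admission score and predicted effectiveness (not values from the study):

```python
m_fim_admission = 50.0          # motor FIM at admission (13-91 points)
pred_effectiveness = 0.6        # predicted by the multiple regression model

m_fim_discharge = pred_effectiveness * (91 - m_fim_admission) + m_fim_admission
print(m_fim_discharge)          # 74.6 points
```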

  2. Use of Multiple Regression and Use-Availability Analyses in Determining Habitat Selection by Gray Squirrels (Sciurus Carolinensis)

    Treesearch

    John W. Edwards; Susan C. Loeb; David C. Guynn

    1994-01-01

    Multiple regression and use-availability analyses are two methods for examining habitat selection. Use-availability analysis is commonly used to evaluate macrohabitat selection whereas multiple regression analysis can be used to determine microhabitat selection. We compared these techniques using behavioral observations (n = 5534) and telemetry locations (n = 2089) of...

  3. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    PubMed

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  4. Logistic regression model can reduce unnecessary artificial liver support in hepatitis B virus-associated acute-on-chronic liver failure: decision curve analysis.

    PubMed

    Qin, Gang; Bian, Zhao-Lian; Shen, Yi; Zhang, Lei; Zhu, Xiao-Hong; Liu, Yan-Mei; Shao, Jian-Guo

    2016-06-04

    Several models have been proposed to predict the short-term outcome of acute-on-chronic liver failure (ACLF) after treatment. We aimed to determine whether better decisions for artificial liver support system (ALSS) treatment could be made with a model than without, through decision curve analysis (DCA). The medical profiles of a cohort of 232 patients with hepatitis B virus (HBV)-associated ACLF were retrospectively analyzed to explore the role of plasma prothrombin activity (PTA), model for end-stage liver disease (MELD) and logistic regression model (LRM) in identifying patients who could benefit from ALSS. The accuracy and reliability of PTA, MELD and LRM were evaluated with previously reported cutoffs. DCA was performed to evaluate the clinical role of these models in predicting the treatment outcome. With the cut-off value of 0.2, LRM had sensitivity of 92.6 %, specificity of 42.3 % and an area under the receiving operating characteristic curve (AUC) of 0.68, which showed superior discrimination over PTA and MELD. DCA revealed that the LRM-guided ALSS treatment was superior over other strategies including "treating all" and MELD-guided therapy, for the midrange threshold probabilities of 16 to 64 %. The use of LRM-guided ALSS treatment could increase both the accuracy and efficiency of this procedure, allowing the avoidance of unnecessary ALSS.
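
    A minimal sketch of the decision-curve computation, assuming the standard net-benefit definition; the outcome labels and predicted probabilities are placeholders to be supplied by the fitted LRM.

```python
import numpy as np

def net_benefit(y, p_hat, pt):
    """Net benefit of treating patients whose predicted risk >= threshold pt."""
    y, p_hat = np.asarray(y), np.asarray(p_hat)
    treat = p_hat >= pt
    tp = np.sum(treat & (y == 1)) / len(y)     # true positives per patient
    fp = np.sum(treat & (y == 0)) / len(y)     # false positives per patient
    return tp - fp * pt / (1 - pt)

thresholds = np.linspace(0.16, 0.64, 25)       # the midrange reported above
# nb_lrm = [net_benefit(outcome, lrm_prob, pt) for pt in thresholds]
# nb_all = [net_benefit(outcome, np.ones(len(outcome)), pt) for pt in thresholds]
# The strategy with the highest net benefit at a given threshold is preferred.
```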

  5. Risk factors for indications of intraoperative blood transfusion among patients undergoing surgical treatment for colorectal adenocarcinoma.

    PubMed

    Gonçalves, Iara; Linhares, Marcelo; Bordin, Jose; Matos, Delcio

    2009-01-01

    Identification of risk factors for requiring transfusions during surgery for colorectal cancer may lead to preventive actions or alternative measures, towards decreasing the use of blood components in these procedures, and to rationalization of resource use in hemotherapy services. This was a retrospective case-control study using data from 383 patients who were treated surgically for colorectal adenocarcinoma at 'Fundação Pio XII', in Barretos-SP, Brazil, between 1999 and 2003. The aim was to recognize significant risk factors for requiring intraoperative blood transfusion in colorectal cancer surgical procedures. Univariate analyses were performed using Fisher's exact test or the chi-squared test for dichotomous variables and Student's t test for continuous variables, followed by multivariate analysis using multiple logistic regression. In the univariate analyses, height (P = 0.06), glycemia (P = 0.05), previous abdominal or pelvic surgery (P = 0.031), abdominoperineal surgery (P < 0.001), extended surgery (P < 0.001) and intervention with radical intent (P < 0.001) were considered significant. In the multivariate analysis using logistic regression, intervention with radical intent (OR = 10.249, P < 0.001, 95% CI = 3.071-34.212) and abdominoperineal amputation (OR = 3.096, P = 0.04, 95% CI = 1.445-6.623) were considered to be independently significant. This investigation allows the conclusion that radical intervention and the abdominoperineal procedure in the surgical treatment of colorectal adenocarcinoma are risk factors for requiring intraoperative blood transfusion.

  6. Determination of 1-(3-dimethylaminopropyl)-3-ethylcarbodiimide hydrochloride by flow-injection analysis based on a specific condensation reaction between malonic acid and ethylenediamine.

    PubMed

    Seno, Kunihiko; Matumura, Kazuki; Oshita, Koji; Oshima, Mitsuko; Motomizu, Shoji

    2009-03-01

    A sensitive and rapid flow-injection analysis was developed for the determination of 1-(3-dimethylaminopropyl)-3-ethylcarbodiimide hydrochloride (EDC.HCl), which is used as a dehydration or condensation reagent in the formation of amides (peptides) and esters. EDC.HCl could be determined by flow-injection analysis based on a specific condensation reaction between malonic acid and ethylenediamine in aqueous media. The reaction was accelerated at 60 degrees C, and the absorbance of the product was detected at 262 nm. The calibration graph of EDC.HCl showed good linearity in the range from 0 to 0.1% (0 to 0.0005 M), with the regression equation y = 1.52 x 10(9)x (y, peak area; x, % concentration of EDC.HCl). The proposed method allowed high-throughput analysis, with a sample throughput of 12 samples per hour. The limit of detection (LOD) and the relative standard deviation (RSD) were 2 x 10(-6) M and 1.0%, respectively. The reaction proceeds in aqueous solution and is specific for EDC.HCl.

  7. Genetic aspect of Alzheimer disease: Results of complex segregation analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sadovnick, A.D.; Lee, I.M.L.; Bailey-Wilson, J.E.

    1994-09-01

    The study was designed to evaluate the possibility that a single major locus will explain the segregation of Alzheimer disease (AD). The data were from the population-based AD Genetic Database and consisted of 402 consecutive, unrelated probands, diagnosed to have either 'probable' or 'autopsy confirmed' AD, and their 2,245 first-degree relatives. In this analysis, a relative was considered affected with AD only when there were sufficient medical/autopsy data to support diagnosis of AD being the most likely cause of the dementia. Transmission probability models allowing for a genotype-dependent and logistically distributed age-of-onset were used. The program REGTL in the S.A.G.E. computer program package was used for a complex segregation analysis. The models included correction for single ascertainment. Regressive familial effects were not estimated. The data were analyzed to test the single major locus (SML), random transmission and no transmission (environmental) hypotheses. The results of the complex segregation analysis showed that (1) the SML model was the best fit, and (2) the non-genetic models could be rejected.

  8. [Application of SAS macros to evaluate multiplicative and additive interaction in logistic and Cox regression in clinical practices].

    PubMed

    Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q

    2016-05-01

    Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, the generalized linear model differs from the general linear model, and interaction comprises multiplicative interaction and additive interaction. The former is of only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, attributable proportion due to interaction and synergy index alongside the interaction terms of logistic and Cox regressions, and the Wald, delta and profile-likelihood confidence intervals were used to evaluate additive interaction, for reference in big data analysis in clinical epidemiology and in analyses of genetic multiplicative and additive interactions.
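
    A minimal sketch of the additive-interaction measures typically reported by such macros, computed from the coefficients of a logistic (or Cox) model with two binary exposures and their product term; the coefficient values are placeholders, not output of the SAS macros.

```python
import numpy as np

# Log-odds coefficients for binary exposures A and B and their product term
beta_a, beta_b, beta_ab = 0.40, 0.55, 0.30          # illustrative values

or10 = np.exp(beta_a)                               # exposed to A only
or01 = np.exp(beta_b)                               # exposed to B only
or11 = np.exp(beta_a + beta_b + beta_ab)            # exposed to both

reri = or11 - or10 - or01 + 1                       # relative excess risk due to interaction
ap = reri / or11                                    # attributable proportion due to interaction
s = (or11 - 1) / ((or10 - 1) + (or01 - 1))          # synergy index; > 1 indicates synergy
print(f"RERI = {reri:.2f}, AP = {ap:.2f}, S = {s:.2f}")
```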

  9. Prediction by regression and intrarange data scatter in surface-process studies

    USGS Publications Warehouse

    Toy, T.J.; Osterkamp, W.R.; Renard, K.G.

    1993-01-01

    Modeling is a major component of contemporary earth science, and regression analysis occupies a central position in the parameterization, calibration, and validation of geomorphic and hydrologic models. Although this methodology can be used in many ways, we are primarily concerned with the prediction of values for one variable from another variable. Examination of the literature reveals considerable inconsistency in the presentation of the results of regression analysis and the occurrence of patterns in the scatter of data points about the regression line. Both circumstances confound utilization and evaluation of the models. Statisticians are well aware of various problems associated with the use of regression analysis and offer improved practices; often, however, their guidelines are not followed. After a review of the aforementioned circumstances and until standard criteria for model evaluation become established, we recommend, as a minimum, inclusion of scatter diagrams, the standard error of the estimate, and sample size in reporting the results of regression analyses for most surface-process studies. © 1993 Springer-Verlag.

  10. Quantile regression for the statistical analysis of immunological data with many non-detects.

    PubMed

    Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth

    2012-07-07

    Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
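
    A minimal sketch using statsmodels, assuming non-detects are set to the detection limit and a quantile above the non-detect fraction is modeled; the data are simulated placeholders, not the clinical trial data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 200
group = rng.binomial(1, 0.5, n)                             # e.g. treatment arm
conc = np.exp(0.5 + 0.8 * group + rng.normal(0, 1.2, n))    # skewed immunological marker
DL = 1.0                                                    # detection limit
df = pd.DataFrame({"y": np.maximum(conc, DL), "group": group})

fit = smf.quantreg("y ~ group", df).fit(q=0.75)   # model the 75th percentile
print(fit.params)   # group effect on the 75th percentile, robust to the non-detects
```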

  11. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580, and 600 nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with human skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
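
    A minimal sketch of the regression step, with placeholder extinction coefficients (the real ones are tabulated, and the mapping from coefficients to concentrations uses the Monte Carlo derived conversion vectors described above):

```python
import numpy as np

# Rows: the six wavelengths (500-600 nm); columns: melanin, HbO2, Hb, baseline.
E = np.array([[0.9, 0.3, 0.4, 1.0],
              [0.8, 0.6, 0.5, 1.0],
              [0.7, 0.9, 0.8, 1.0],
              [0.6, 0.7, 0.9, 1.0],
              [0.5, 0.8, 0.6, 1.0],
              [0.4, 0.2, 0.3, 1.0]])     # placeholder extinction coefficients

true = np.array([0.2, 0.5, 0.3, 0.1])    # synthetic chromophore contributions
absorbance = E @ true + np.random.default_rng(4).normal(0, 0.01, 6)

coeffs, *_ = np.linalg.lstsq(E, absorbance, rcond=None)
print(coeffs)   # regression coefficients, then converted to concentrations per pixel
```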

  12. Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis

    PubMed Central

    Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.

    2006-01-01

    In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709
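
    A minimal sketch of the core idea, assuming flattened image volumes: the second modality enters the design matrix voxel-by-voxel, so a separate general linear model is estimated at each voxel. Shapes and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n_subj, n_vox = 40, 5000
Y = rng.normal(size=(n_subj, n_vox))          # response modality (e.g. fMRI maps)
Z = rng.normal(size=(n_subj, n_vox))          # covariate modality (e.g. T1 density)
age = rng.normal(50, 10, n_subj)              # scalar covariate

betas = np.empty((n_vox, 3))
for v in range(n_vox):                        # voxel-wise general linear model
    Xv = np.column_stack([np.ones(n_subj), age, Z[:, v]])
    betas[v], *_ = np.linalg.lstsq(Xv, Y[:, v], rcond=None)
print(betas[:3])   # per-voxel intercept, age effect, and cross-modality effect
```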

  13. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    EPA Pesticide Factsheets

    Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.

  14. Subjective Global Assessment-Dialysis Malnutrition Score and cardiovascular risk in hemodialysis patients: an observational cohort study.

    PubMed

    Spatola, Leonardo; Finazzi, Silvia; Calvetta, Albania; Reggiani, Francesco; Morenghi, Emanuela; Santostasi, Silvia; Angelini, Claudio; Badalamenti, Salvatore; Mugnai, Giacomo

    2018-06-23

    Malnutrition is an important risk factor for cardiovascular mortality in hemodialysis (HD) patients. However, current malnutrition biomarkers seem unable to accurately estimate the role of malnutrition in predicting cardiovascular risk. Our aim was to investigate the role of the Subjective Global Assessment-Dialysis Malnutrition Score (SGA-DMS) compared to two well-recognized comorbidity scores-Charlson Comorbidity Index (CCI) and modified CCI (excluding age-factor) (mCCI)-in predicting cardiovascular events in HD patients. In 86 maintenance HD patients followed from June 2015 to June 2017, we analyzed biohumoral data and clinical scores as risk factors for cardiovascular events (acute heart failure, acute coronary syndrome and stroke). Their impact on outcome was investigated by linear regression, Cox regression models and ROC analysis. Cardiovascular events occurred in 26/86 (30%) patients during the 2-year follow-up. Linear regression showed only age and dialysis vintage to be positively related to SGA-DMS: B 0.21 (95% CI 0.01; 0.30) p 0.05, and B 0.24 (0.09; 0.34) p 0.02, respectively, while serum albumin, normalized protein catabolic rate (nPCR) and dialysis dose (Kt/V) were negatively related to SGA-DMS: B - 1.29 (- 3.29; - 0.81) p 0.02; B - 0.08 (- 1.52; - 0.35) p 0.04 and B - 2.63 (- 5.25; - 0.22) p 0.03, respectively. At Cox regression analysis, SGA-DMS was not a risk predictor for cardiovascular events: HR 1.09 (0.9; 1.22), while both CCI and mCCI were significant predictors: HR 1.43 (1.13; 1.87) and HR 1.57 (1.20; 2.06), also in adjusted Cox models. ROC analysis reported similar AUCs for CCI and mCCI: 0.72 (0.60; 0.89) p 0.00 and 0.70 (0.58; 0.82) p 0.00, respectively, compared to SGA-DMS 0.56 (0.49; 0.72) p 0.14. SGA-DMS is not a superior or significant prognostic tool compared to CCI and mCCI in assessing cardiovascular risk in HD patients, even though it allows both malnutrition and comorbidity status to be appraised.

  15. Local spatial variations analysis of smear-positive tuberculosis in Xinjiang using Geographically Weighted Regression model.

    PubMed

    Wei, Wang; Yuan-Yuan, Jin; Ci, Yan; Ahan, Alayi; Ming-Qin, Cao

    2016-10-06

    The spatial interplay between socioeconomic factors and tuberculosis (TB) cases contributes to the understanding of regional tuberculosis burdens. Historically, local Poisson Geographically Weighted Regression (GWR) has allowed for the identification of the geographic disparities of TB cases and their relevant socioeconomic determinants, estimating local regression coefficients for the relations between the incidence of TB and its socioeconomic determinants. Therefore, the aims of this study were to: (1) identify the socioeconomic determinants of geographic disparities of smear-positive TB in Xinjiang, China; (2) confirm whether the incidence of smear-positive TB and its associated socioeconomic determinants demonstrate spatial variability; (3) compare the performance of two main models: Ordinary Least Squares (OLS) regression and the local GWR model. Reported smear-positive TB cases in Xinjiang were extracted from the TB surveillance system database during 2004-2010. The average number of smear-positive TB cases notified in Xinjiang was collected from 98 districts/counties. The population density (POPden), proportion of minorities (PROmin), number of infectious disease network reporting agencies (NUMagen), proportion of agricultural population (PROagr), and per capita annual gross domestic product (per capita GDP) were gathered from the Xinjiang Statistical Yearbook covering a period from 2004 to 2010. The OLS model and GWR model were then utilized to investigate socioeconomic determinants of smear-positive TB cases. Geoda 1.6.7 and GWR 4.0 software were used for data analysis. Our findings indicate that the relations between the average number of smear-positive TB cases notified in Xinjiang and their socioeconomic determinants (POPden, PROmin, NUMagen, PROagr, and per capita GDP) were significantly spatially non-stationary: in some areas more smear-positive TB cases were related to higher socioeconomic determinant regression coefficients, while in other areas they were related to lower coefficients. We also found that the GWR model could better geographically differentiate the relationships between the average number of smear-positive TB cases and their socioeconomic determinants, and that it fit the dataset better (adjusted R2 = 0.912, AICc = 1107.22) than the OLS model (adjusted R2 = 0.768, AICc = 1196.74). POPden, PROmin, NUMagen, PROagr, and per capita GDP are socioeconomic determinants of smear-positive TB cases. Comprehending the spatial heterogeneity of POPden, PROmin, NUMagen, PROagr, per capita GDP, and smear-positive TB cases could provide valuable information for TB prevention and control strategies.
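
    A minimal sketch of a local GWR fit at a single location with a Gaussian kernel (a Gaussian-response sketch rather than the Poisson variant used in the study); a full analysis repeats this at every district centroid and calibrates the bandwidth, e.g. by AICc. Coordinates, covariates, and bandwidth are placeholders.

```python
import numpy as np

def gwr_at(u, coords, X, y, bandwidth):
    """Weighted least squares at location u with Gaussian distance decay."""
    d = np.linalg.norm(coords - u, axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    Xd = np.column_stack([np.ones(len(y)), X])
    XtW = Xd.T * w                              # equivalent to Xd.T @ diag(w)
    return np.linalg.solve(XtW @ Xd, XtW @ y)   # local coefficients

rng = np.random.default_rng(6)
coords = rng.uniform(0, 100, (98, 2))    # 98 district/county centroids
X = rng.normal(size=(98, 2))             # e.g. scaled POPden and PROagr
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(0, 0.2, 98)
print(gwr_at(coords[0], coords, X, y, bandwidth=25.0))
```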

  16. VoxelStats: A MATLAB Package for Multi-Modal Voxel-Wise Brain Image Analysis.

    PubMed

    Mathotaarachchi, Sulantha; Wang, Seqian; Shin, Monica; Pascoal, Tharick A; Benedet, Andrea L; Kang, Min Su; Beaudry, Thomas; Fonov, Vladimir S; Gauthier, Serge; Labbe, Aurélie; Rosa-Neto, Pedro

    2016-01-01

    In healthy individuals, behavioral outcomes are highly associated with variability in brain regional structure or neurochemical phenotypes. Similarly, in the context of neurodegenerative conditions, neuroimaging reveals that cognitive decline is linked to the magnitude of atrophy, neurochemical declines, or concentrations of abnormal protein aggregates across brain regions. However, modeling the effects of multiple regional abnormalities as determinants of cognitive decline at the voxel level remains largely unexplored by multimodal imaging research, given the high computational cost of estimating regression models for every single voxel across various imaging modalities. VoxelStats is a voxel-wise computational framework designed to overcome these computational limitations and to perform statistical operations on multiple scalar variables and imaging modalities at the voxel level. The VoxelStats package has been developed in MATLAB(®) and supports imaging formats such as Nifti-1, ANALYZE, and MINC v2. Prebuilt functions in VoxelStats enable the user to perform voxel-wise general and generalized linear models and mixed-effects models with multiple volumetric covariates. Importantly, VoxelStats can take scalar values or image volumes as response variables and can accommodate volumetric statistical covariates as well as their interaction effects with other variables. Furthermore, the package includes built-in functionality to perform voxel-wise receiver operating characteristic analysis and paired and unpaired group contrast analysis. Validation of VoxelStats was conducted by comparing its linear regression functionality with existing toolboxes such as glim_image and RMINC. The validation results were identical to those of existing methods, and the additional functionality was demonstrated by generating feature case assessments (t-statistics, odds ratio, and true positive rate maps). In summary, VoxelStats expands the current methods for multimodal imaging analysis by allowing the estimation of advanced regional association metrics at the voxel level.
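
    The expensive step such a framework vectorizes is mass-univariate regression: the same design matrix fitted at every voxel. A minimal NumPy sketch of that operation (synthetic data, hypothetical 'age' covariate), not the VoxelStats implementation itself:

```python
# Hedged sketch: one least-squares fit per voxel, fully vectorized, plus
# voxel-wise t-statistics for a covariate of interest. Data are simulated.
import numpy as np

rng = np.random.default_rng(2)
n_subjects, n_voxels = 40, 10_000
age = rng.uniform(55, 85, n_subjects)
design = np.column_stack([np.ones(n_subjects), age])  # intercept + covariate
Y = rng.normal(size=(n_subjects, n_voxels))           # e.g. atrophy per voxel

betas, *_ = np.linalg.lstsq(design, Y, rcond=None)    # shape (2, n_voxels)

# t-statistic for the age effect at every voxel
dof = n_subjects - design.shape[1]
resid = Y - design @ betas
sigma2 = (resid ** 2).sum(axis=0) / dof
var_scale = np.linalg.inv(design.T @ design)[1, 1]
t_age = betas[1] / np.sqrt(sigma2 * var_scale)
print("largest |t| over voxels:", np.abs(t_age).max().round(2))
```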

  17. Efficacy of a numerical value of a fixed-effect estimator in stochastic frontier analysis as an indicator of hospital production structure.

    PubMed

    Kawaguchi, Hiroyuki; Hashimoto, Hideki; Matsuda, Shinya

    2012-09-22

    The casemix-based payment system has been adopted in many countries, although it often needs complementary adjustment to take account of each hospital's unique production structure, such as teaching and research duties and non-profit motives. It has been challenging to numerically evaluate the impact of such structural heterogeneity on production separately from production inefficiency. The current study adopted stochastic frontier analysis and proposed a method to assess unique components of hospital production structures using a fixed-effect variable. There were two stages of analysis in this study. In the first stage, we estimated the efficiency score from the hospital production function using a true fixed-effect model (TFEM) in stochastic frontier analysis. The use of a TFEM allowed us to differentiate the unobserved heterogeneity of individual hospitals as hospital-specific fixed effects. In the second stage, we regressed the obtained fixed-effect variable on structural components of the hospitals to test whether the variable was explicitly related to the characteristics and local disadvantages of the hospitals. In the first analysis, the estimated efficiency score was approximately 0.6. The mean value of the fixed-effect estimator was 0.784, with a standard deviation of 0.137 and a range of 0.437 to 1.212. The second-stage regression confirmed that the value of the fixed effect was significantly correlated with advanced technology and the local conditions of the sample hospitals. The obtained fixed-effect estimator may reflect hospitals' unique production structures after accounting for production inefficiency. The values of fixed-effect estimators can be used as evaluation tools to improve fairness in the reimbursement system for various functions of hospitals based on casemix classification.

  18. Using "Excel" for White's Test--An Important Technique for Evaluating the Equality of Variance Assumption and Model Specification in a Regression Analysis

    ERIC Educational Resources Information Center

    Berenson, Mark L.

    2013-01-01

    There is consensus in the statistical literature that severe departures from the assumptions of regression modeling invalidate its use for purposes of inference. These assumptions are usually evaluated subjectively through visual, graphic displays in a residual analysis but such an approach, taken alone, may be insufficient…
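
    For readers working outside Excel, the same diagnostic is available in one call in Python's statsmodels; a minimal sketch on simulated heteroscedastic data (all variable names invented):

```python
# Hedged sketch: White's test regresses squared OLS residuals on the
# regressors, their squares, and cross products; a significant statistic
# flags heteroscedasticity and/or model misspecification.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(0, 10, n)
y = 2 + 0.5 * x + rng.normal(0, 0.2 + 0.1 * x, n)  # error variance grows with x

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(fit.resid, X)
print(f"White LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")
```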

  19. Rex fortran 4 system for combinatorial screening or conventional analysis of multivariate regressions

    Treesearch

    L.R. Grosenbaugh

    1967-01-01

    Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...

  20. Determination of riverbank erosion probability using Locally Weighted Logistic Regression

    NASA Astrophysics Data System (ADS)

    Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos

    2015-04-01

    Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of vulnerable locations. An alternative to the usual hydrodynamic models for predicting vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Riverbank erosion can thus be modelled by a regression on independent variables considered to affect the erosion process. The impact of such variables may vary spatially; therefore, a non-stationary regression model is preferred over a stationary equivalent, and Locally Weighted Regression (LWR) is proposed as a suitable choice. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The method can be extended to predict the binary presence or absence of erosion by combining it with logistic regression, a type of regression analysis for predicting the outcome of a categorical dependent variable (e.g. a binary response) from one or more predictor variables; the combination is referred to as Locally Weighted Logistic Regression (LWLR). The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function, with local weights assigned to the observations, so the model predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any values of the independent variables. The erosion occurrence probability can be evaluated in conjunction with the model deviance for the independent variables tested; the most straightforward measure of goodness of fit is the G statistic, a simple and effective way to evaluate the efficiency of the logistic regression model and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of riverbank slope, river cross-section width and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied, as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along the riverbanks and can be used to assist in managing erosion and flooding events. Acknowledgements: This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers).
The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
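
    A compact sketch of the LWLR idea under stated assumptions (scikit-learn's LogisticRegression with sample weights, a tricubic kernel as in the study, synthetic sites and covariates):

```python
# Hedged sketch: at each site, fit a logistic model in which nearby
# observations receive large tricubic weights, then predict that site's
# erosion probability. All data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 12                                    # e.g. riverbank measurement sites
coords = rng.uniform(0, 5, (n, 2))
X = rng.normal(size=(n, 2))               # e.g. bank slope, cross-section width
y = (X[:, 1] > np.median(X[:, 1])).astype(int)   # erosion presence/absence

def tricubic(d, dmax):
    return np.clip(1 - (d / dmax) ** 3, 0, None) ** 3

def erosion_probability(i, dmax=4.0):
    d = np.linalg.norm(coords - coords[i], axis=1)
    w = tricubic(d, dmax) + 1e-9          # avoid an all-zero weight vector
    model = LogisticRegression().fit(X, y, sample_weight=w)
    return model.predict_proba(X[i:i + 1])[0, 1]

print([round(erosion_probability(i), 2) for i in range(n)])
```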

  1. Sensitivity of regression calibration to non-perfect validation data with application to the Norwegian Women and Cancer Study.

    PubMed

    Buonaccorsi, John P; Dalen, Ingvild; Laake, Petter; Hjartåker, Anette; Engeset, Dagrun; Thoresen, Magne

    2015-04-15

    Measurement error occurs when we observe error-prone surrogates rather than true values. It is common in observational studies, and especially so in epidemiology, in nutritional epidemiology in particular. Correcting for measurement error has become common, and regression calibration is the most popular way to account for measurement error in continuous covariates. We consider its use in the context where there are validation data, which are used to calibrate the true values given the observed covariates. We allow for the case that the true value itself may not be observed in the validation data, but instead a so-called reference measure is observed. The regression calibration method relies on certain assumptions. This paper examines possible biases in regression calibration estimators when some of these assumptions are violated. More specifically, we allow for the fact that (i) the reference measure may be an 'alloyed gold standard' (i.e., not unbiased) for the true value; (ii) there may be correlated random subject effects contributing to the surrogate and reference measures in the validation data; and (iii) the calibration model itself may not be the same in the validation study as in the main study, that is, it is not transportable. We expand on previous work to provide a general result, which characterizes the potential bias in regression calibration estimators resulting from any combination of the aforementioned violations. We then illustrate some of the general results with data from the Norwegian Women and Cancer Study. Copyright © 2015 John Wiley & Sons, Ltd.
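
    A toy sketch of the regression calibration estimator itself, under the idealized assumptions the paper relaxes (an unbiased reference measure, a transportable calibration model); names and noise levels are invented:

```python
# Hedged sketch: fit E[X|W] in the validation data, substitute the calibrated
# exposure in the main-study outcome regression, and compare with the naive
# fit that ignores measurement error. All data are simulated.
import numpy as np

rng = np.random.default_rng(5)
n_main, n_val = 1000, 200
x_true = rng.normal(0, 1, n_main + n_val)          # true exposure (unobserved)
w = x_true + rng.normal(0, 0.5, x_true.size)       # error-prone surrogate
y = 1.5 * x_true + rng.normal(0, 1, x_true.size)   # outcome, true slope 1.5

r = x_true[:n_val] + rng.normal(0, 0.1, n_val)     # reference measure (validation only)

a, b = np.polyfit(w[:n_val], r, 1)                 # calibration model E[X|W]
x_hat = a * w[n_val:] + b                          # calibrated exposure, main study

beta = np.polyfit(x_hat, y[n_val:], 1)[0]
naive = np.polyfit(w[n_val:], y[n_val:], 1)[0]
print(f"naive slope = {naive:.2f}, calibrated slope = {beta:.2f} (true 1.5)")
```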

  2. Kernel machine methods for integrative analysis of genome-wide methylation and genotyping studies.

    PubMed

    Zhao, Ni; Zhan, Xiang; Huang, Yen-Tsung; Almli, Lynn M; Smith, Alicia; Epstein, Michael P; Conneely, Karen; Wu, Michael C

    2018-03-01

    Many large GWAS consortia are expanding to simultaneously examine the joint role of DNA methylation and genotype in the same subjects. However, integrating information from both data types is challenging. In this paper, we propose a composite kernel machine regression model to test the joint epigenetic and genetic effect. Our approach works at the gene level, which allows for a common unit of analysis across data types. The model compares the pairwise similarities in the phenotype to the pairwise similarities in the genotype and methylation values; high correspondence is suggestive of association. A composite kernel is constructed to measure the similarities in the genotype and methylation values between pairs of samples. We demonstrate through simulations and real data applications that the proposed approach correctly controls type I error and is more robust and powerful than using only the genotype or methylation data in detecting trait-associated genes. We applied our method to investigate the genetic and epigenetic regulation of gene expression in response to stressful life events, using data collected from the Grady Trauma Project. Within the kernel machine testing framework, our method allows for heterogeneity in effect sizes, nonlinear and interactive effects, as well as rapid P-value computation. © 2017 WILEY PERIODICALS, INC.
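
    A sketch of the composite kernel construction at the heart of the method (the variance-component score test built on top of it is omitted); matrix sizes and the trace normalization are assumptions:

```python
# Hedged sketch: linear kernels summarize pairwise genotype and methylation
# similarity; a convex combination weighted by rho merges the two sources.
import numpy as np

rng = np.random.default_rng(6)
n, p_snp, p_cpg = 100, 50, 30
G = rng.integers(0, 3, (n, p_snp)).astype(float)   # genotypes coded 0/1/2
M = rng.uniform(0, 1, (n, p_cpg))                  # methylation beta values

def linear_kernel(Z):
    Zc = Z - Z.mean(axis=0)                        # center each feature
    K = Zc @ Zc.T
    return K * Z.shape[0] / np.trace(K)            # scale-normalize

K_g, K_m = linear_kernel(G), linear_kernel(M)

rho = 0.5                                          # weight on the genetic kernel
K = rho * K_g + (1 - rho) * K_m                    # composite similarity matrix
print("composite kernel shape:", K.shape)
```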

  3. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by the data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone, menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed with the MATLAB Compiler. Currently, this package gives the user a choice of 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.

  4. Applications of statistics to medical science, III. Correlation and regression.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.

  5. Flexible link functions in nonparametric binary regression with Gaussian process priors.

    PubMed

    Li, Dan; Wang, Xia; Lin, Lizhen; Dey, Dipak K

    2016-09-01

    In many scientific fields, it is a common practice to collect a sequence of 0-1 binary responses from a subject across time, space, or a collection of covariates. Researchers are interested in finding out how the expected binary outcome is related to covariates, and aim at better prediction of future 0-1 outcomes. Gaussian processes have been widely used to model nonlinear systems; in particular, to model the latent structure in a binary regression model, allowing a nonlinear functional relationship between covariates and the expectation of binary outcomes. A critical issue in modeling binary response data is the appropriate choice of link functions. Commonly adopted link functions such as probit or logit links have fixed skewness and lack the flexibility to allow the data to determine the degree of the skewness. To address this limitation, we propose a flexible binary regression model which combines a generalized extreme value link function with a Gaussian process prior on the latent structure. Bayesian computation is employed in model estimation. Posterior consistency of the resulting posterior distribution is demonstrated. The flexibility and gains of the proposed model are illustrated through detailed simulation studies and two real data examples. Empirical results show that the proposed model outperforms a set of alternative models, which only have either a Gaussian process prior on the latent regression function or a Dirichlet prior on the link function. © 2015, The International Biometric Society.

  6. Flexible Link Functions in Nonparametric Binary Regression with Gaussian Process Priors

    PubMed Central

    Li, Dan; Lin, Lizhen; Dey, Dipak K.

    2015-01-01

    Summary In many scientific fields, it is a common practice to collect a sequence of 0-1 binary responses from a subject across time, space, or a collection of covariates. Researchers are interested in finding out how the expected binary outcome is related to covariates, and aim at better prediction of future 0-1 outcomes. Gaussian processes have been widely used to model nonlinear systems; in particular, to model the latent structure in a binary regression model, allowing a nonlinear functional relationship between covariates and the expectation of binary outcomes. A critical issue in modeling binary response data is the appropriate choice of link functions. Commonly adopted link functions such as probit or logit links have fixed skewness and lack the flexibility to allow the data to determine the degree of the skewness. To address this limitation, we propose a flexible binary regression model which combines a generalized extreme value link function with a Gaussian process prior on the latent structure. Bayesian computation is employed in model estimation. Posterior consistency of the resulting posterior distribution is demonstrated. The flexibility and gains of the proposed model are illustrated through detailed simulation studies and two real data examples. Empirical results show that the proposed model outperforms a set of alternative models, which only have either a Gaussian process prior on the latent regression function or a Dirichlet prior on the link function. PMID:26686333

  7. Generating patient specific pseudo-CT of the head from MR using atlas-based regression

    NASA Astrophysics Data System (ADS)

    Sjölund, J.; Forsberg, D.; Andersson, M.; Knutsson, H.

    2015-01-01

    Radiotherapy planning and attenuation correction of PET images require simulation of radiation transport. The necessary physical properties are typically derived from computed tomography (CT) images, but in some cases, including stereotactic neurosurgery and combined PET/MR imaging, only magnetic resonance (MR) images are available. With these applications in mind, we describe how a realistic, patient-specific, pseudo-CT of the head can be derived from anatomical MR images. We refer to the method as atlas-based regression, because of its similarity to atlas-based segmentation. Given a target MR and an atlas database comprising MR and CT pairs, atlas-based regression works by registering each atlas MR to the target MR, applying the resulting displacement fields to the corresponding atlas CTs and, finally, fusing the deformed atlas CTs into a single pseudo-CT. We use a deformable registration algorithm known as the Morphon and augment it with a certainty mask that allows tailoring of the influence certain regions are allowed to have on the registration. Moreover, we propose a novel method of fusion, wherein the collection of deformed CTs is iteratively registered to their joint mean; we found that the resulting mean CT becomes more similar to the target CT. However, the voxelwise median provided even better results, at least as good as earlier work that required special MR imaging techniques. This makes atlas-based regression a good candidate for clinical use.

  8. Empirical methods for assessing meaningful neuropsychological change following epilepsy surgery.

    PubMed

    Sawrie, S M; Chelune, G J; Naugle, R I; Lüders, H O

    1996-11-01

    Traditional methods for assessing the neurocognitive effects of epilepsy surgery are confounded by practice effects, test-retest reliability issues, and regression to the mean. This study employs 2 methods for assessing individual change that allow direct comparison of changes across both individuals and test measures. Fifty-one medically intractable epilepsy patients completed a comprehensive neuropsychological battery twice, approximately 8 months apart, prior to any invasive monitoring or surgical intervention. First, a Reliable Change (RC) index score was computed for each test score to take into account the reliability of that measure, and a cutoff score was empirically derived to establish the limits of statistically reliable change. These indices were subsequently adjusted for expected practice effects. The second approach used a regression technique to establish "change norms" along a common metric that models both expected practice effects and regression to the mean. The RC index scores provide the clinician with a statistical means of determining whether a patient's retest performance is "significantly" changed from baseline. The regression norms for change allow the clinician to evaluate the magnitude of a given patient's change on 1 or more variables along a common metric that takes into account the reliability and stability of each test measure. Case data illustrate how these methods provide an empirically grounded means for evaluating neurocognitive outcomes following medical interventions such as epilepsy surgery.
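
    Both change metrics reduce to a few lines of arithmetic; a sketch with illustrative (not the study's) reliability and practice-effect values:

```python
# Hedged sketch: a practice-adjusted Reliable Change index and a
# regression-based change norm on simulated test-retest scores.
import numpy as np

rng = np.random.default_rng(7)
n = 51
t1 = rng.normal(100, 15, n)              # baseline scores
t2 = t1 + 3 + rng.normal(0, 6, n)        # retest, with a practice effect

r_xx = 0.85                              # assumed test-retest reliability
sem = t1.std(ddof=1) * np.sqrt(1 - r_xx) # standard error of measurement
s_diff = np.sqrt(2) * sem                # standard error of the difference
practice = 3.0                           # assumed expected practice gain

rc = (t2 - t1 - practice) / s_diff       # practice-adjusted RC index
print("reliably changed (|RC| > 1.645):", int((np.abs(rc) > 1.645).sum()))

# Regression-based change norm: predict retest from baseline, standardize
slope, intercept = np.polyfit(t1, t2, 1)
z_change = t2 - (slope * t1 + intercept)
z_change /= z_change.std(ddof=1)         # common metric across measures
```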

  9. Association between molecular markers and behavioral phenotypes in the immatures of a butterfly.

    PubMed

    De Nardin, Janaína; Buffon, Vanessa; Revers, Luís Fernando; de Araújo, Aldo Mellender

    2018-01-01

    Newly hatched caterpillars of the butterfly Heliconius erato phyllis routinely cannibalize eggs. In a manifestation of kin recognition they cannibalize sibling eggs less frequently than unrelated eggs. Previous work has estimated the heritability of kin recognition in H. erato phyllis to lie between 14 and 48%. It has furthermore been shown that the inheritance of kin recognition is compatible with a quantitative model with a threshold. Here we present the results of a preliminary study, in which we tested for associations between behavioral kin recognition phenotypes and AFLP and SSR markers. We implemented two experimental approaches: (1) a cannibalism test using sibling eggs only, which allowed for only two behavioral outcomes (cannibal and non-cannibal), and (2) a cannibalism test using two sibling eggs and one unrelated egg, which allowed four outcomes [cannibal who does not recognize siblings, cannibal who recognizes siblings, "super-cannibal" (cannibal of both eggs), and "super non-cannibal" (does not cannibalize eggs at all)]. Single-marker analyses were performed using χ2 tests and logistic regression with null markers as covariates. Results of the χ2 tests identified 72 associations for experimental design 1 and 73 associations for design 2. Logistic regression analysis of the markers found to be significant in the χ2 test resulted in 20 associations for design 1 and 11 associations for design 2. Experiment 2 identified markers that were more frequently present or absent in cannibals who recognize siblings and super non-cannibals; i.e. in both phenotypes capable of kin recognition.

  10. Association between molecular markers and behavioral phenotypes in the immatures of a butterfly

    PubMed Central

    De Nardin, Janaína; Buffon, Vanessa; Revers, Luís Fernando; de Araújo, Aldo Mellender

    2018-01-01

    Abstract Newly hatched caterpillars of the butterfly Heliconius erato phyllis routinely cannibalize eggs. In a manifestation of kin recognition they cannibalize sibling eggs less frequently than unrelated eggs. Previous work has estimated the heritability of kin recognition in H. erato phyllis to lie between 14 and 48%. It has furthermore been shown that the inheritance of kin recognition is compatible with a quantitative model with a threshold. Here we present the results of a preliminary study, in which we tested for associations between behavioral kin recognition phenotypes and AFLP and SSR markers. We implemented two experimental approaches: (1) a cannibalism test using sibling eggs only, which allowed for only two behavioral outcomes (cannibal and non-cannibal), and (2) a cannibalism test using two sibling eggs and one unrelated egg, which allowed four outcomes [cannibal who does not recognize siblings, cannibal who recognizes siblings, “super-cannibal” (cannibal of both eggs), and “super non-cannibal” (does not cannibalize eggs at all)]. Single-marker analyses were performed using χ2 tests and logistic regression with null markers as covariates. Results of the χ2 tests identified 72 associations for experimental design 1 and 73 associations for design 2. Logistic regression analysis of the markers found to be significant in the χ2 test resulted in 20 associations for design 1 and 11 associations for design 2. Experiment 2 identified markers that were more frequently present or absent in cannibals who recognize siblings and super non-cannibals; i.e. in both phenotypes capable of kin recognition. PMID:29583155

  11. Phenomapping of rangelands in South Africa using time series of RapidEye data

    NASA Astrophysics Data System (ADS)

    Parplies, André; Dubovyk, Olena; Tewes, Andreas; Mund, Jan-Peter; Schellberg, Jürgen

    2016-12-01

    Phenomapping is an approach which allows the derivation of spatial patterns of vegetation phenology and rangeland productivity based on time series of vegetation indices. In our study, we propose a new spatial mapping approach which combines phenometrics derived from high resolution (HR) satellite time series with spatial logistic regression modeling to discriminate land management systems in rangelands. From the RapidEye time series for selected rangelands in South Africa, we calculated bi-weekly noise-reduced Normalized Difference Vegetation Index (NDVI) images. For the growing season of 2011-2012, we further derived principal phenology metrics such as the start, end and length of the growing season and related phenological variables such as the amplitude, left derivative and small integral of the NDVI curve. We then mapped these phenometrics across two different tenure systems, communal and commercial, at the very detailed spatial resolution of 5 m. The results of a binary logistic regression (BLR) showed that the amplitude and the left derivative of the NDVI curve were statistically significant; these indicators are useful for discriminating commercial from communal rangeland systems. We conclude that phenomapping combined with spatial modeling is a powerful tool that allows efficient aggregation of phenology and productivity metrics for spatially explicit analysis of the relationships of crop phenology with site conditions and management. This approach has particular potential for disaggregated and patchy environments such as the farming systems in semi-arid South Africa, where phenology varies considerably among and within years. Further, we see a strong perspective for phenomapping to support spatially explicit modelling of vegetation.

  12. Optimizing methods for linking cinematic features to fMRI data.

    PubMed

    Kauttonen, Janne; Hlushchuk, Yevhen; Tikka, Pia

    2015-04-15

    One of the challenges of naturalistic neurosciences using movie-viewing experiments is how to interpret observed brain activations in relation to the multiplicity of time-locked stimulus features. As previous studies have shown less inter-subject synchronization across viewers of random video footage than of story-driven films, new methods need to be developed for the analysis of less story-driven contents. To optimize the linkage between our fMRI data, collected during viewing of a deliberately non-narrative silent film 'At Land' by Maya Deren (1944), and its annotated content, we combined the method of elastic-net regularization with model-driven linear regression and the well-established data-driven independent component analysis (ICA) and inter-subject correlation (ISC) methods. In the linear regression analysis, both IC and region-of-interest (ROI) time-series were fitted with the time-series of a total of 36 binary-valued and one real-valued tactile annotation of film features. Elastic-net regularization and cross-validation were applied in the ordinary least-squares linear regression in order to avoid over-fitting due to the multicollinearity of regressors; the results were compared against both partial least-squares (PLS) regression and un-regularized full-model regression. A non-parametric permutation testing scheme was applied to evaluate the statistical significance of the regressions. We found statistically significant correlation between the annotation model and 9 of 40 ICs. Regression analysis was also repeated for a large set of cubic ROIs covering the grey matter. Both IC- and ROI-based regression analyses revealed activations in parietal and occipital regions, with additional smaller clusters in the frontal lobe. Furthermore, we found elastic-net based regression more sensitive than PLS and un-regularized regression, since it detected a larger number of significant ICs and ROIs. Along with the ISC ranking methods, our regression analysis proved to be a feasible method for ordering the ICs based on their functional relevance to the annotated cinematic features. The novelty of our method lies in applying a data-driven approach to all content features simultaneously, in contrast to hypothesis-driven manual pre-selection and observation of individual regressors biased by choice. We found the combination of regularized regression and ICA especially useful when analyzing fMRI data obtained using a non-narrative movie stimulus with a large set of complex and correlated features. Copyright © 2015. Published by Elsevier Inc.
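
    The regularization step generalizes readily; a sketch with scikit-learn's cross-validated elastic net on invented annotation regressors and one synthetic component time-series:

```python
# Hedged sketch: regress an IC/ROI time-series on many correlated binary
# annotation regressors, letting cross-validation pick the penalty strength.
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(8)
n_vols, n_feats = 300, 37                 # fMRI volumes x annotated features
X = rng.integers(0, 2, (n_vols, n_feats)).astype(float)  # binary annotations
y = 0.8 * X[:, 0] - 0.5 * X[:, 5] + rng.normal(0, 1, n_vols)  # time-series

enet = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5).fit(X, y)
print("selected alpha:", round(enet.alpha_, 4))
print("regressors kept (nonzero):", np.flatnonzero(enet.coef_))
```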

  13. Housing Archetype Analysis for Home Energy-Efficient Retrofit in the Great Lakes Region

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, S. K.; Mrozowski, T.; Harrell-Seyburn, A.

    This project report details activities and results of the 'Market Characterization' project undertaken by the Cost Effective Energy Retrofit (CEER) team targeted toward the DOE goal of achieving 30%-50% reduction in existing building energy use. CEER consists of members from the Dow Chemical Company, Michigan State University, Ferris State University and Habitat for Humanity Kent County. The purpose of this market characterization project was to identify housing archetypes which are dominant within the Great Lakes region and therefore offer significant potential for energy-efficient retrofit research and implementation due to the substantial number of homes possessing similar characteristics. Understanding the characteristics of housing groups referred to as 'archetypes' by vintage, style, and construction characteristics can allow research teams to focus their retrofit research and develop prescriptive solutions for those structure types which are prevalent and offer high potential uptake within a region or market. Key research activities included: literature review, statistical analysis of national and regional data of the American Housing Survey (AHS) collected by the U.S. Census Bureau, analysis of Michigan-specific data, development of a housing taxonomy of architectural styles, case studies of two local markets (i.e., Ann Arbor and Grand Rapids in Michigan) and development of a suggested framework (or process) for characterizing local markets. In order to gain a high-level perspective, national and regional data from the U.S. Census Bureau were analyzed using cross tabulations, multiple regression models, and logistic regression to characterize the housing stock and determine dominant house types using 21 variables.

  14. Analysis of hyperspectral field radiometric data for monitoring nitrogen concentration in rice crops

    NASA Astrophysics Data System (ADS)

    Stroppiana, D.; Boschetti, M.; Confalonieri, R.; Bocchi, S.; Brivio, P. A.

    2005-10-01

    Monitoring crop conditions and assessing nutrition requirements is fundamental for implementing sustainable agriculture. Rational nitrogen fertilization is of particular importance in rice crops in order to guarantee high production levels while minimising the impact on the environment. In fact, the typical flooded condition of rice fields can be a significant source of greenhouse gasses. Information on plant nitrogen concentration can be used, coupled with information about the phenological stage, to plan strategies for a rational and spatially differentiated fertilization schedule. A field experiment was carried out in a rice field in Northern Italy in order to evaluate the potential of field radiometric measurements for the prediction of rice nitrogen concentration. The results indicate that rice reflectance is influenced by nitrogen supply at certain wavelengths, although N concentration cannot be accurately predicted from the reflectance measured at a single wavelength. Regression analysis highlighted that the visible region of the spectrum is most sensitive to plant nitrogen concentration when reflectance measures are combined into a spectral index. An automated procedure allowed the analysis of all possible combinations of the narrow spectral bands, derived by spectral resampling of the field measurements, into a Normalized Difference Index (NDI). The derived index appeared to be least influenced by plant biomass and Leaf Area Index (LAI), providing a useful approach to detect rice nutritional status. The validation of the regression model showed that the model is able to predict rice N concentration (R² = 0.55 [p < 0.01], RRMSE = 29.4; modelling efficiency close to the optimum value).
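
    The exhaustive NDI search is straightforward to reproduce; a sketch on synthetic spectra (the band count, reflectance ranges, and the simple R² criterion are assumptions):

```python
# Hedged sketch: test every band pair (i, j) as a Normalized Difference
# Index NDI = (R_i - R_j) / (R_i + R_j) against N concentration and keep
# the pair whose simple regression has the highest R^2.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(9)
n_samples, n_bands = 40, 60
R = rng.uniform(0.05, 0.6, (n_samples, n_bands))  # resampled reflectance
n_conc = rng.uniform(1.0, 4.0, n_samples)         # plant N concentration (%)

best_pair, best_r2 = None, -np.inf
for i, j in combinations(range(n_bands), 2):
    ndi = (R[:, i] - R[:, j]) / (R[:, i] + R[:, j])
    r2 = np.corrcoef(ndi, n_conc)[0, 1] ** 2      # R^2 of simple regression
    if r2 > best_r2:
        best_pair, best_r2 = (i, j), r2

print("best band pair:", best_pair, "R^2 =", round(best_r2, 3))
```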

  15. [Predicting very early rebleeding after acute variceal bleeding based on classification and regression tree analysis (CART).]

    PubMed

    Altamirano, J; Augustin, S; Muntaner, L; Zapata, L; González-Angulo, A; Martínez, B; Flores-Arroyo, A; Camargo, L; Genescá, J

    2010-01-01

    Variceal bleeding (VB) is the main cause of death among cirrhotic patients. About 30-50% of early rebleeding occurs within a few days of the acute episode of VB, so it is necessary to stratify patients at high risk of very early rebleeding (VER) for more aggressive therapies. However, there are few, and incompletely understood, prognostic models for this purpose. Our aims were to determine the risk factors associated with VER after an acute VB, and to assess and compare a novel prognostic model generated by Classification and Regression Tree analysis (CART) with classically used models (MELD and Child-Pugh [CP]). Sixty consecutive cirrhotic patients with acute variceal bleeding were studied. CART analysis, MELD and Child-Pugh scores were performed at admission. Receiver operating characteristic (ROC) curves were constructed to evaluate the predictive performance of the models. The very early rebleeding rate was 13%. Variables associated with VER were: serum albumin (p = 0.027), creatinine (p = 0.021) and transfused blood units in the first 24 hrs (p = 0.05). The areas under the ROC curve for MELD, Child-Pugh and CART were 0.46, 0.50 and 0.82, respectively. The cut-off values identified by CART for the significant variables were: 1) albumin 2.85 mg/dL, 2) packed red cells 2 units, and 3) creatinine 1.65 mg/dL. Serum albumin, creatinine and the number of transfused blood units were associated with VER, and a simple CART algorithm combining these variables allows an accurate predictive assessment of VER after acute variceal bleeding. Key words: cirrhosis, variceal bleeding, esophageal varices, prognosis, portal hypertension.
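
    A sketch of the kind of shallow tree such an analysis produces, using scikit-learn's CART implementation on simulated patients (the outcome rule below is a toy, not the study's data):

```python
# Hedged sketch: fit and print a depth-limited classification tree over the
# three predictors reported as significant above.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(10)
n = 60
X = np.column_stack([
    rng.uniform(1.5, 4.5, n),     # serum albumin
    rng.integers(0, 6, n),        # packed red cell units in first 24 h
    rng.uniform(0.5, 3.0, n),     # serum creatinine
])
rebleed = (X[:, 0] < 2.85) & (X[:, 1] >= 2)   # toy very-early-rebleeding rule

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, rebleed)
print(export_text(tree, feature_names=["albumin", "prbc_units", "creatinine"]))
```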

  16. An open-access CMIP5 pattern library for temperature and precipitation: description and methodology

    NASA Astrophysics Data System (ADS)

    Lynch, Cary; Hartin, Corinne; Bond-Lamberty, Ben; Kravitz, Ben

    2017-05-01

    Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing spatial patterns to be generated for multiple models for any future emission scenario. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squares regression methods. We explore the differences and statistical significance between patterns generated by each method and assess the performance of the generated patterns across methods and scenarios. Differences in patterns across seasons between methods and epochs were largest at high latitudes (60-90° N/S). Bias and mean errors between modeled and pattern-predicted output were smaller for the linear regression method than for patterns generated by the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5 °C, but the choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. This paper describes our library of least squares regression patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns. The dataset and netCDF data generation code are available at doi:10.5281/zenodo.495632.
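
    The least squares variant of pattern generation amounts to one regression per grid cell against global mean temperature; a vectorized sketch on synthetic fields (grid size and noise levels invented):

```python
# Hedged sketch: estimate a pattern (degC of local change per degC of global
# mean change) by least squares at every grid cell, vectorized over cells.
import numpy as np

rng = np.random.default_rng(11)
n_years, nlat, nlon = 95, 12, 24
t_global = np.linspace(0, 4, n_years) + rng.normal(0, 0.1, n_years)
true_pattern = rng.uniform(0.5, 2.0, (nlat, nlon))
local = (true_pattern[None] * t_global[:, None, None]
         + rng.normal(0, 0.3, (n_years, nlat, nlon)))   # local temperature field

# slope per cell = cov(global, local) / var(global)
x = t_global - t_global.mean()
y = local - local.mean(axis=0)
pattern = np.einsum("t,tij->ij", x, y) / (x ** 2).sum()

print("recovered pattern range:",
      pattern.min().round(2), "to", pattern.max().round(2))
```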

  17. Using Structured Additive Regression Models to Estimate Risk Factors of Malaria: Analysis of 2010 Malawi Malaria Indicator Survey Data

    PubMed Central

    Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

    2014-01-01

    Background After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. Methods We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Results Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. Conclusions The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities. PMID:24991915

  18. Predictive factors of early moderate/severe ovarian hyperstimulation syndrome in non-polycystic ovarian syndrome patients: a statistical model.

    PubMed

    Ashrafi, Mahnaz; Bahmanabadi, Akram; Akhond, Mohammad Reza; Arabipoor, Arezoo

    2015-11-01

    To evaluate demographic, medical history and clinical cycle characteristics of infertile non-polycystic ovary syndrome (NPCOS) women, with the purpose of investigating their associations with the prevalence of moderate-to-severe ovarian hyperstimulation syndrome (OHSS). In this retrospective study of 7073 in vitro fertilization and/or intracytoplasmic sperm injection (IVF/ICSI) cycles, 86 NPCOS patients who developed moderate-to-severe OHSS during IVF/ICSI treatment between January 2008 and December 2010 at Royan Institute were analyzed. To review the OHSS risk factors, 172 NPCOS patients treated in the same period who did not develop OHSS were randomly selected by computer as the control group. We used multiple logistic regression in a backward manner to build a prediction model. The regression analysis revealed that age [odds ratio (OR) 0.9, confidence interval (CI) 0.81-0.99], antral follicle count (OR 4.3, CI 2.7-6.9), infertility cause (tubal factor, OR 11.5, CI 1.1-51.3), hypothyroidism (OR 3.8, CI 1.5-9.4) and a positive history of ovarian surgery (OR 0.2, CI 0.05-0.9) were the most important predictors of OHSS. The regression model had an area under the curve of 0.94, an acceptable discriminative performance equal to that of two strong predictive variables, the number of follicles and the serum estradiol level on the day of human chorionic gonadotropin administration. The predictive regression model based on primary characteristics of NPCOS patients had equal specificity compared with these two strong predictive variables; therefore, it may be beneficial to apply this model before the beginning of the ovarian stimulation protocol.

  19. Calibration Adjustment of the Mid-infrared Analyzer for an Accurate Determination of the Macronutrient Composition of Human Milk.

    PubMed

    Billard, Hélène; Simon, Laure; Desnots, Emmanuelle; Sochard, Agnès; Boscher, Cécile; Riaublanc, Alain; Alexandre-Gouabau, Marie-Cécile; Boquien, Clair-Yves

    2016-08-01

    Human milk composition analysis seems essential to adapt human milk fortification for preterm neonates. The Miris human milk analyzer (HMA), based on mid-infrared methodology, is convenient for a single determination of macronutrients. However, HMA measurements are not fully comparable with reference methods (RMs). The primary aim of this study was to compare HMA results with results from biochemical RMs over a large range of protein, fat, and carbohydrate contents and to establish a calibration adjustment. Human milk was fractionated into protein, fat, and skim milk fractions covering large ranges of protein (0-3 g/100 mL), fat (0-8 g/100 mL), and carbohydrate (5-8 g/100 mL). For each macronutrient, a calibration curve was plotted by linear regression using measurements obtained with the HMA and RMs. For fat, 53 measurements were performed, and the linear regression equation was HMA = 0.79RM + 0.28 (R² = 0.92). For true protein (29 measurements), the linear regression equation was HMA = 0.9RM + 0.23 (R² = 0.98). For carbohydrate (15 measurements), the linear regression equation was HMA = 0.59RM + 1.86 (R² = 0.95). A homogenization step with a disruptor coupled to a sonication step was necessary to obtain better accuracy of the measurements. Good repeatability (coefficient of variation < 7%) and reproducibility (coefficient of variation < 17%) were obtained after calibration adjustment. New calibration curves were developed for the Miris HMA, allowing accurate measurements over large ranges of macronutrient content. This is necessary for reliable use of this device in individualizing nutrition for preterm newborns. © The Author(s) 2015.
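
    The adjustment itself is a fit-then-invert step; a sketch echoing the reported fat equation (HMA = 0.79RM + 0.28), with invented sample points:

```python
# Hedged sketch: regress analyzer readings on reference values, then invert
# the fitted line to map raw readings back to the reference scale.
import numpy as np

rng = np.random.default_rng(12)
rm = np.array([1.0, 2.0, 3.5, 5.0, 6.5, 8.0])          # reference method, g/100 mL
hma = 0.79 * rm + 0.28 + rng.normal(0, 0.1, rm.size)   # analyzer readings

slope, intercept = np.polyfit(rm, hma, 1)              # HMA = slope*RM + intercept

def corrected(reading):
    """Map a raw analyzer reading back to the reference scale."""
    return (reading - intercept) / slope

print("raw 4.0 g/100 mL ->", round(corrected(4.0), 2), "g/100 mL corrected")
```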

  20. Using structured additive regression models to estimate risk factors of malaria: analysis of 2010 Malawi malaria indicator survey data.

    PubMed

    Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

    2014-01-01

    After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities.

  1. INNOVATIVE INSTRUMENTATION AND ANALYSIS OF THE TEMPERATURE MEASUREMENT FOR HIGH TEMPERATURE GASIFICATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seong W. Lee

    During this reporting period, the literature survey, including the gasifier temperature measurement literature, the ultrasonic application and its background in cleaning applications, and the spray coating process, was completed. The gasifier simulator (cold model) testing was successfully conducted. Four factors (blower voltage, ultrasonic application, injection time interval, particle weight) were considered as factors that may affect the temperature measurement. Analysis of Variance (ANOVA) was applied to analyze the test data and showed that all four factors are significant to the temperature measurements in the gasifier simulator (cold model). The regression analysis for the case with the normalized room temperature shows that a linear model fits the temperature data with 82% accuracy (18% error); for the case without the normalized room temperature, the accuracy is 72.5% (27.5% error). The nonlinear regression analysis indicates a better fit than the linear regression: the nonlinear model's accuracy is 88.7% (11.3% error) for the normalized room temperature case. The hot-model thermocouple sleeve design and fabrication are completed, as are the design and fabrication of the gasifier simulator (hot model). System tests of the gasifier simulator (hot model) have been conducted and some modifications have been made; based on these tests and the analysis of the results, the gasifier simulator (hot model) has met the proposed design requirements and is ready for system testing. The ultrasonic cleaning method is under evaluation and will be further studied for the gasifier simulator (hot model) application. The progress of this project has been on schedule.

  2. The Economic Value of Mangroves: A Meta-Analysis

    Treesearch

    Marwa Salem; D. Evan Mercer

    2012-01-01

    This paper presents a synthesis of the mangrove ecosystem valuation literature through a meta-regression analysis. The main contribution of this study is that it is the first meta-analysis focusing solely on mangrove forests, whereas previous studies have included different types of wetlands. The number of studies included in the regression analysis is 44 for a total...

  3. Reducing the Complexity of an Agent-Based Local Heroin Market Model

    PubMed Central

    Heard, Daniel; Bobashev, Georgiy V.; Morris, Robert J.

    2014-01-01

    This project explores techniques for reducing the complexity of an agent-based model (ABM). The analysis involved a model developed from the ethnographic research of Dr. Lee Hoffer in the Larimer area heroin market, which involved drug users, drug sellers, homeless individuals and police. The authors used statistical techniques to create a reduced version of the original model which maintained simulation fidelity while reducing computational complexity. This involved identifying key summary quantities of individual customer behavior as well as overall market activity and replacing some agents with probability distributions and regressions. The model was then extended to allow external market interventions in the form of police busts. Extensions of this research perspective, as well as its strengths and limitations, are discussed. PMID:25025132

  4. Pentobarbital quantitation using EMIT serum barbiturate assay reagents: application to monitoring of high-dose pentobarbital therapy.

    PubMed

    Pape, B E; Cary, P L; Clay, L C; Godolphin, W

    1983-01-01

    Pentobarbital serum concentrations associated with a high-dose therapeutic regimen were determined using EMIT immunoassay reagents. Replicate analyses of serum controls resulted in a within-assay coefficient of variation of 5.0% and a between-assay coefficient of variation of 10%. Regression analysis of 44 serum samples analyzed by this technique (y) and a reference procedure (x) gave y = 0.98x + 3.6 (r = 0.98; x = ultraviolet spectroscopy) and y = 1.04x + 2.4 (r = 0.96; x = high-performance liquid chromatography). Clinical evaluation of the results indicates the immunoassay is sufficiently sensitive and selective for pentobarbital to allow accurate quantitation within the therapeutic range associated with high-dose therapy.

  5. Computer-aided molecular modeling techniques for predicting the stability of drug cyclodextrin inclusion complexes in aqueous solutions

    NASA Astrophysics Data System (ADS)

    Faucci, Maria Teresa; Melani, Fabrizio; Mura, Paola

    2002-06-01

    Molecular modeling was used to investigate factors influencing complex formation between cyclodextrins and guest molecules and to predict complex stability through a theoretical model based on the search for a correlation between experimental stability constants (Ks) and theoretical parameters describing complexation (docking energy, host-guest contact surfaces, intermolecular interaction fields) calculated from complex structures at a conformational energy minimum, obtained through stochastic methods based on molecular dynamics simulations. Naproxen, ibuprofen, ketoprofen and ibuproxam were used as model drug molecules. Multiple regression analysis allowed identification of the factors significant for complex stability. A mathematical model (r = 0.897) related log Ks to the complex docking energy and the lipophilic molecular fields of cyclodextrin and drug.

  6. The Influence of Assay Design, Blinding, and Gymnema sylvestre on Sucrose Detection by Humans.

    PubMed

    Aleman, Max G; Marconi, Lauren J; Nguyen, Nam H; Park, Jae M; Patino, Maria M; Wang, Yuchi; Watkins, Celeste S; Shelley, Chris

    2016-01-01

    The detection and grading of tastes corresponding to different taste modalities can be tested in engaging laboratory sessions using students themselves as test subjects. This article describes a series of experiments in which data pertaining to the detection of salty and sweet tastes are obtained, and the ability of the herb Gymnema sylvestre to disrupt the detection of sucrose is quantified. The effects of blinding and different assay designs on EC50 estimation are also investigated. The data obtained allow for substantial data analysis, including non-linear regression using fixed and free parameters to quantify dose-response relationships, and the use of often under-utilized permutation tests to determine significant differences when the underlying data display heteroscedasticity.
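
    The dose-response fitting the exercise calls for is standard non-linear regression; a sketch with a four-parameter logistic (Hill) model and made-up response data:

```python
# Hedged sketch: estimate an EC50 by fitting a four-parameter logistic
# curve with scipy's curve_fit. Concentrations and responses are invented.
import numpy as np
from scipy.optimize import curve_fit

def hill(c, bottom, top, ec50, n):
    return bottom + (top - bottom) / (1 + (ec50 / c) ** n)

conc = np.array([0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30])   # arbitrary units
resp = np.array([0.05, 0.08, 0.15, 0.35, 0.62, 0.85, 0.95, 0.98])

params, _ = curve_fit(hill, conc, resp, p0=[0, 1, 1, 1])
print("EC50 estimate:", round(params[2], 3))
```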

  7. The Influence of Assay Design, Blinding, and Gymnema sylvestre on Sucrose Detection by Humans

    PubMed Central

    Aleman, Max G.; Marconi, Lauren J.; Nguyen, Nam H.; Park, Jae M.; Patino, Maria M.; Wang, Yuchi; Watkins, Celeste S.; Shelley, Chris

    2016-01-01

    The detection and grading of tastes corresponding to different taste modalities can be tested in engaging laboratory sessions using students themselves as test subjects. This article describes a series of experiments in which data pertaining to the detection of salty and sweet tastes are obtained, and the ability of the herb Gymnema sylvestre to disrupt the detection of sucrose is quantified. The effects of blinding and different assay designs on EC50 estimation are also investigated. The data obtained allow for substantial data analysis, including non-linear regression using fixed and free parameters to quantify dose-response relationships, and the use of often under-utilized permutation tests to determine significant differences when the underlying data display heteroscedasticity. PMID:27980466

  8. Estimation of diffusion coefficients from voltammetric signals by support vector and gaussian process regression

    PubMed Central

    2014-01-01

    Background Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
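
    A sketch of the two regression families on a stand-in feature matrix (downsampled voltammograms are simulated as generic features; the target scale and kernels are assumptions):

```python
# Hedged sketch: train SVR and GPR with nonlinear kernels and compare
# held-out accuracy for predicting a diffusion-coefficient-like target.
import numpy as np
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(13)
X = rng.normal(size=(200, 10))            # e.g. downsampled voltammograms
D = 5 + 0.8 * X[:, 0] + 0.3 * X[:, 1] ** 2 + rng.normal(0, 0.2, 200)
# D plays the role of a diffusion coefficient in convenient units

X_tr, X_te, D_tr, D_te = train_test_split(X, D, random_state=0)

svr = SVR(kernel="rbf", C=10.0).fit(X_tr, D_tr)
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                               normalize_y=True).fit(X_tr, D_tr)

for name, model in [("SVR", svr), ("GPR", gpr)]:
    print(name, "held-out R^2:", round(model.score(X_te, D_te), 3))
```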

  9. Introduction to methodology of dose-response meta-analysis for binary outcome: With application on software.

    PubMed

    Zhang, Chao; Jia, Pengli; Yu, Liu; Xu, Chang

    2018-05-01

    Dose-response meta-analysis (DRMA) is widely applied to investigate the dose-specific relationship between independent and dependent variables. Such methods have been in use for over 30 years and are increasingly employed in healthcare and clinical decision-making. In this article, we give an overview of the methodology used in DRMA. We summarize the commonly used regression models and pooling methods, and use an example to illustrate how to conduct a DRMA with these methods. Five regression models (linear, piecewise, natural polynomial, fractional polynomial, and restricted cubic spline regression) are illustrated for fitting the dose-response relationship, and two pooling approaches, the one-stage and the two-stage approach, are illustrated for combining the dose-response relationship across studies. The example showed similar results among these models. Several dose-response meta-analysis methods can be used to investigate the relationship between exposure level and the risk of an outcome; however, the methodology of DRMA still needs improvement. © 2018 Chinese Cochrane Center, West China Hospital of Sichuan University and John Wiley & Sons Australia, Ltd.
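
    As a sketch of one of the five models, the following fits a restricted (natural) cubic spline to dose-specific log relative risks from a single hypothetical study via inverse-variance weighted least squares; a full DRMA would additionally model the covariance between estimates and pool such fits across studies.

    ```python
    # Minimal sketch of a restricted cubic spline dose-response fit for
    # one hypothetical study; illustrative only, not a complete DRMA.
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical dose levels, log relative risks, and standard errors.
    df = pd.DataFrame({
        "dose":  [0, 5, 10, 20, 40],
        "logrr": [0.00, 0.08, 0.15, 0.22, 0.25],
        "se":    [0.01, 0.05, 0.06, 0.07, 0.09],
    })

    # Natural cubic spline basis via patsy's cr(); weights 1/se^2 give
    # an inverse-variance weighted fit.
    fit = smf.wls("logrr ~ cr(dose, df=3)", data=df,
                  weights=1.0 / df["se"] ** 2).fit()
    print(fit.params)
    ```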

  10. Computationally efficient algorithm for Gaussian Process regression in case of structured samples

    NASA Astrophysics Data System (ADS)

    Belyaev, M.; Burnaev, E.; Kapushev, Y.

    2016-04-01

    Surrogate modeling is widely used in many engineering problems. Data sets often have a Cartesian product structure (for instance, a factorial design of experiments with missing points), in which case the data set can be very large. One of the most popular approximation algorithms, Gaussian Process regression, can then hardly be applied due to its computational complexity. In this paper, a computationally efficient approach for constructing Gaussian Process regression for data sets with Cartesian product structure is presented. Efficiency is achieved by exploiting the special structure of the data set and using tensor operations. The proposed algorithm has low computational as well as memory complexity compared to existing algorithms. We also introduce a regularization procedure that takes the anisotropy of the data set into account and avoids degeneracy of the regression model.
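
    The gain from grid structure in such algorithms comes from a standard trick: a product kernel on a Cartesian grid gives a covariance matrix that factors as a Kronecker product, so the big linear solve reduces to per-axis eigendecompositions. Below is a minimal numpy sketch of that trick for a fully observed 2-D grid under an assumed RBF kernel; it illustrates the idea only, not the paper's algorithm, which additionally handles missing points and anisotropy.

    ```python
    # Minimal sketch of Kronecker-structured GP regression on a fully
    # observed 2-D grid; illustrative stand-in, not the paper's method.
    import numpy as np

    def rbf(a, b, ls):
        """Squared-exponential kernel matrix between 1-D coordinate arrays."""
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

    rng = np.random.default_rng(0)
    x1 = np.linspace(0, 1, 40)              # grid axis 1
    x2 = np.linspace(0, 1, 60)              # grid axis 2
    Y = np.sin(6 * x1)[:, None] * np.cos(4 * x2)[None, :] \
        + 0.1 * rng.normal(size=(40, 60))   # noisy responses on the grid
    noise = 0.1 ** 2

    # Product kernel => full covariance K = kron(K1, K2), never formed.
    K1, K2 = rbf(x1, x1, 0.2), rbf(x2, x2, 0.2)
    l1, Q1 = np.linalg.eigh(K1)
    l2, Q2 = np.linalg.eigh(K2)

    # Solve (K + noise*I) alpha = y via the Kronecker eigendecomposition:
    # cost O(n1^3 + n2^3) instead of O((n1*n2)^3).
    T = Q1.T @ Y @ Q2
    A = Q1 @ (T / (np.outer(l1, l2) + noise)) @ Q2.T   # alpha as a matrix

    # Posterior mean at a new point (s1, s2): kron(k1*, k2*) @ alpha.
    s1, s2 = 0.37, 0.81
    mean = rbf(np.array([s1]), x1, 0.2) @ A @ rbf(np.array([s2]), x2, 0.2).T
    print(mean.item())
    ```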

  11. Predictors of postoperative outcomes of cubital tunnel syndrome treatments using multiple logistic regression analysis.

    PubMed

    Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki

    2017-05-01

    This retrospective study was designed to investigate prognostic factors for postoperative outcomes of cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgery were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery using Messina's criteria, an outcome measure specific to CubTS. Bivariate analysis was performed to select candidate prognostic factors, and multiple logistic regression analysis was then conducted to identify associations between postoperative severity and the selected factors. Both the bivariate and the multiple logistic regression analyses revealed only preoperative severity as an independent risk factor for poor prognosis; no other factor showed a significant association. Although conflicting results exist regarding the prognosis of CubTS, this study supports evidence from previous studies and concludes that early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
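
    A minimal sketch of this kind of multiple logistic regression in statsmodels, with simulated data and hypothetical variable names (not the study's dataset):

    ```python
    # Minimal sketch of a multiple logistic regression for a binary
    # outcome; data and variable names are invented for illustration.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    n = 83
    df = pd.DataFrame({
        "poor_outcome": rng.integers(0, 2, n),     # 1 = poor Messina grade
        "preop_severity": rng.integers(1, 5, n),   # hypothetical 1-4 scale
        "age": rng.normal(55, 12, n).round(),
        "duration_months": rng.exponential(12, n).round(),
    })

    fit = smf.logit(
        "poor_outcome ~ preop_severity + age + duration_months", data=df
    ).fit(disp=0)
    print(np.exp(fit.params))   # odds ratios for each prognostic factor
    ```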

  12. Drug treatment rates with beta-blockers and ACE-inhibitors/angiotensin receptor blockers and recurrences in takotsubo cardiomyopathy: A meta-regression analysis.

    PubMed

    Brunetti, Natale Daniele; Santoro, Francesco; De Gennaro, Luisa; Correale, Michele; Gaglione, Antonio; Di Biase, Matteo

    2016-07-01

    In a recent paper, Singh et al. analyzed the effect of drug treatment on recurrence of takotsubo cardiomyopathy (TTC) in a comprehensive meta-analysis. The study found that recurrence rates were independent of beta-blocker (BB) prescription rates but inversely correlated with ACEi/ARB prescription rates; the authors therefore concluded that ACEi/ARB, rather than BB, may reduce the risk of recurrence. We aimed to re-analyze the data reported in the study, now weighted for population size, in a meta-regression analysis. After multiple meta-regression analysis, we found a significant association between rates of ACEi prescription and rates of TTC recurrence; the regression was not statistically significant for BBs. On the basis of our re-analysis, we confirm that rates of TTC recurrence are lower in populations of patients with higher rates of ACEi/ARB treatment. That does not necessarily imply that ACEi prevents recurrence of TTC, but merely that, for example, recurrence rates are lower in cohorts that are more compliant with therapy, or that are prescribed ACEi more often because they are more carefully followed. Randomized prospective studies are surely warranted. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
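
    A minimal sketch of a population-size-weighted meta-regression of this kind, using weighted least squares on invented cohort-level rates:

    ```python
    # Minimal sketch: recurrence rate regressed on ACEi/ARB prescription
    # rate across hypothetical cohorts, weighted by cohort size.
    import pandas as pd
    import statsmodels.formula.api as smf

    cohorts = pd.DataFrame({
        "recurrence_rate": [0.06, 0.04, 0.09, 0.03, 0.05],
        "acei_rate":       [0.55, 0.70, 0.35, 0.80, 0.60],
        "n_patients":      [120, 340, 75, 410, 150],
    })

    fit = smf.wls("recurrence_rate ~ acei_rate", data=cohorts,
                  weights=cohorts["n_patients"]).fit()
    print(fit.summary().tables[1])
    ```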

  13. A general framework for the use of logistic regression models in meta-analysis.

    PubMed

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy. © The Author(s) 2014.
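
    A minimal sketch of the one-stage idea on summary contingency-table data: a binomial GLM on (events, non-events) per arm maximises the exact binomial likelihood. Trial enters as a fixed effect here for simplicity; the random-effects models discussed in the paper require a mixed model instead. Data are invented.

    ```python
    # Minimal sketch of a one-stage binomial GLM on summary 2x2 tables;
    # fixed trial effects only, invented data.
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Hypothetical contingency-table data: events and totals per arm.
    arms = pd.DataFrame({
        "trial":  [1, 1, 2, 2, 3, 3],
        "treat":  [0, 1, 0, 1, 0, 1],
        "events": [12, 7, 30, 21, 9, 5],
        "total":  [100, 98, 205, 210, 60, 62],
    })

    # (successes, failures) response maximises the binomial likelihood
    # directly, with no normality assumption on effect estimates.
    fit = smf.glm("events + I(total - events) ~ treat + C(trial)",
                  data=arms, family=sm.families.Binomial()).fit()
    print(fit.params["treat"])   # pooled log odds ratio of treatment
    ```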

  14. A pocket-sized metabolic analyzer for assessment of resting energy expenditure.

    PubMed

    Zhao, Di; Xian, Xiaojun; Terrera, Mirna; Krishnan, Ranganath; Miller, Dylan; Bridgeman, Devon; Tao, Kevin; Zhang, Lihua; Tsow, Francis; Forzani, Erica S; Tao, Nongjian

    2014-04-01

    The assessment of metabolic parameters related to energy expenditure has proven value for weight management; however, these measurements remain too difficult and costly for monitoring individuals at home. The objective of this study is to evaluate the accuracy of a new pocket-sized metabolic analyzer device for assessing energy expenditure at rest (REE) and during sedentary activities (EE). The new device performs indirect calorimetry by measuring an individual's oxygen consumption (VO2) and carbon dioxide production (VCO2) rates, which allows the determination of resting- and sedentary-activity-related energy expenditure. VO2 and VCO2 values of 17 volunteer adult subjects were measured during resting and sedentary activities in order to compare the metabolic analyzer with the Douglas bag method, the gold standard for indirect calorimetry. Metabolic parameters of VO2, VCO2, and energy expenditure were compared using linear regression analysis, paired t-tests, and Bland-Altman plots. Linear regression of the measured VO2 and VCO2 values and the calculated energy expenditure, comparing the new analyzer with the Douglas bag method, gave the following parameters (linear regression slope, LRS0, and coefficient of determination, r²), each with p = 0: LRS0 (SD) = 1.00 (0.01), r² = 0.9933 for VO2; LRS0 (SD) = 1.00 (0.01), r² = 0.9929 for VCO2; and LRS0 (SD) = 1.00 (0.01), r² = 0.9942 for energy expenditure. In addition, paired t-tests did not show statistically significant differences between the methods at a significance level of α = 0.05 for VO2, VCO2, REE, and EE. Furthermore, the Bland-Altman plot for REE showed good agreement between methods, with 100% of the results within ±2 SD, equivalent to ≤10% error. The findings demonstrate that the new pocket-sized metabolic analyzer device is accurate for determining VO2, VCO2, and energy expenditure. Copyright © 2013 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
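
    A minimal sketch of the agreement analyses used above (linear regression, paired t-test, and Bland-Altman bias with limits of agreement), run on simulated paired REE measurements rather than the study's data:

    ```python
    # Minimal sketch of method-agreement statistics on simulated paired
    # REE measurements; illustrative only.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    ree_douglas = rng.normal(1600, 200, 17)            # kcal/day, Douglas bag
    ree_device = ree_douglas + rng.normal(0, 40, 17)   # pocket analyzer

    # Linear regression and paired t-test between methods.
    slope, intercept, r, p, se = stats.linregress(ree_douglas, ree_device)
    t_stat, t_p = stats.ttest_rel(ree_douglas, ree_device)

    # Bland-Altman statistics: bias and limits of agreement (±1.96 SD,
    # often rounded to ±2 SD as in the abstract).
    diff = ree_device - ree_douglas
    bias, sd = diff.mean(), diff.std(ddof=1)
    print(f"slope={slope:.3f}, r^2={r**2:.4f}, paired-t p={t_p:.3f}")
    print(f"bias={bias:.1f}, limits=({bias - 1.96*sd:.1f}, "
          f"{bias + 1.96*sd:.1f})")
    ```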

  15. Examination of influential observations in penalized spline regression

    NASA Astrophysics Data System (ADS)

    Türkan, Semra

    2013-10-01

    In parametric and nonparametric regression models alike, the results of regression analysis are affected by anomalous observations in the data set, so detecting such observations is a major step in regression analysis. These observations can be identified by well-known influence measures, one of which is Pena's statistic. In this study, Pena's approach is formulated for penalized spline regression in terms of ordinary residuals and leverages. Real and artificial data are used to illustrate the effectiveness of Pena's statistic relative to Cook's distance in detecting influential observations. The results clearly reveal that the proposed measure is superior to Cook's distance for detecting such observations in large data sets.
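
    For reference, here is a minimal sketch of the baseline measure the paper compares against: Cook's distance from an ordinary least-squares fit with one planted anomaly. Pena's statistic and its penalized-spline extension are not implemented here.

    ```python
    # Minimal sketch of Cook's distance flagging a planted anomaly in an
    # OLS fit; invented data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    x = np.linspace(0, 10, 50)
    y = 2.0 + 0.5 * x + rng.normal(0, 0.5, 50)
    y[25] += 5.0                      # plant one anomalous observation

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    cooks_d = fit.get_influence().cooks_distance[0]
    print(np.argmax(cooks_d))         # index 25 should stand out
    ```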

  16. Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verdoolaege, G., E-mail: geert.verdoolaege@ugent.be; Laboratory for Plasma Physics, Royal Military Academy, B-1000 Brussels; Shabbir, A.

    Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression, which is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of applying GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.
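
    For context, here is a minimal sketch of the standard least-squares baseline that GLS is compared against: an ordinary log-linear power-law fit of a confinement scaling law, on invented data with hypothetical variable names. GLS itself is not implemented here.

    ```python
    # Minimal sketch of a standard log-linear power-law scaling fit;
    # invented engineering parameters, not a real multi-machine database.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n = 100
    ip = rng.uniform(0.5, 15, n)        # plasma current (MA), hypothetical
    ne = rng.uniform(1, 10, n)          # density (1e19 m^-3), hypothetical
    tau = 0.05 * ip**0.9 * ne**0.4 * np.exp(rng.normal(0, 0.1, n))

    # tau = C * ip^a * ne^b becomes linear in log space.
    X = sm.add_constant(np.column_stack([np.log(ip), np.log(ne)]))
    fit = sm.OLS(np.log(tau), X).fit()
    print(fit.params)   # [log C, exponent of ip, exponent of ne]
    ```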

  17. Methods for estimating magnitude and frequency of floods in Arizona, developed with unregulated and rural peak-flow data through water year 2010

    USGS Publications Warehouse

    Paretti, Nicholas V.; Kennedy, Jeffrey R.; Turney, Lovina A.; Veilleux, Andrea G.

    2014-01-01

    The regional regression equations were integrated into the U.S. Geological Survey's StreamStats program. StreamStats is a national map-based web application that allows the public to easily access published flood-frequency and basin-characteristic statistics. The interactive web application allows a user to select a point within a watershed (gaged or ungaged) and retrieve flood-frequency estimates derived from the current regional regression equations and geographic information system data for the selected basin. StreamStats provides users with an efficient and accurate means of retrieving the most up-to-date flood-frequency and basin-characteristic data; it is intended to provide consistent statistics, minimize user error, and reduce the need for large datasets and costly geographic information system software.

  18. Enhancement of Local Climate Analysis Tool

    NASA Astrophysics Data System (ADS)

    Horsfall, F. M.; Timofeyeva, M. M.; Dutton, J.

    2012-12-01

    The National Oceanic and Atmospheric Administration (NOAA) National Weather Service (NWS) will enhance its Local Climate Analysis Tool (LCAT) to incorporate specific capabilities to meet the needs of various users, including the energy, health, and other communities. LCAT is an online interactive tool that provides quick and easy access to climate data and allows users to conduct analyses at the local level, such as time series analysis, trend analysis, compositing, and correlation and regression techniques, with others to be incorporated as needed. LCAT applies principles of artificial intelligence to connect human and computer perspectives on the application of data and scientific techniques while processing multiple simultaneous users' tasks. Future development includes expanding the types of data currently imported by LCAT (historical data at stations and climate divisions) to gridded reanalysis and General Circulation Model (GCM) data, which are available on global grids and thus will allow climate studies to be conducted at international locations. We will describe ongoing activities to incorporate NOAA Climate Forecast System reanalysis data (CFSR) and NOAA model output data, including output from the National Multi Model Ensemble Prediction System (NMME) and longer-term projection models, as well as plans to integrate LCAT into the Earth System Grid Federation (ESGF) and its protocols for accessing model output and observational data, to ensure there is no redundancy in the development of tools that facilitate scientific advancement and the use of climate model information in applications. Validation and inter-comparison of forecast models will be included as part of the enhancement to LCAT. To ensure sustained development, we will investigate options for open-sourcing LCAT development, in particular through the University Corporation for Atmospheric Research (UCAR).

  19. Medical Geography: a Promising Field of Application for Geostatistics.

    PubMed

    Goovaerts, P

    2009-01-01

    The analysis of health data and putative covariates, such as environmental, socio-economic, behavioral, or demographic factors, is a promising application for geostatistics. It presents, however, several methodological challenges that arise from the fact that data are typically aggregated over irregular spatial supports and consist of a numerator and a denominator (i.e., population size). This paper presents an overview of recent developments in the field of health geostatistics, with an emphasis on three main steps in the analysis of areal health data: estimation of the underlying disease risk, detection of areas with significantly higher risk, and analysis of relationships with putative risk factors. The analysis is illustrated using age-adjusted cervix cancer mortality rates recorded over the 1970-1994 period for 118 counties of four states in the Western USA. Poisson kriging allows the filtering of noisy mortality rates computed from small population sizes, enhancing the correlation with two putative explanatory variables: the percentage of inhabitants living below the federally defined poverty line, and the percentage of Hispanic females. The area-to-point kriging formulation creates continuous maps of mortality risk, reducing the visual bias associated with the interpretation of choropleth maps. Stochastic simulation is used to generate realizations of cancer mortality maps, which allows one to quantify numerically how uncertainty about the spatial distribution of health outcomes translates into uncertainty about the location of clusters of high values or the correlation with covariates. Last, geographically weighted regression highlights the non-stationarity in the explanatory power of the covariates: the higher mortality values along the coast are better explained by the two covariates than the lower risk recorded in Utah.
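
    A minimal sketch of the core of geographically weighted regression, the last technique mentioned: at each target location, a weighted least-squares fit is computed with Gaussian kernel weights that decay with distance. Coordinates, covariates, and rates below are invented stand-ins, not the paper's data.

    ```python
    # Minimal sketch of geographically weighted regression: local WLS
    # fits with Gaussian distance-decay weights; invented data.
    import numpy as np

    rng = np.random.default_rng(6)
    n = 118
    coords = rng.uniform(0, 100, size=(n, 2))          # centroids (km)
    poverty = rng.uniform(5, 30, n)                    # % below poverty line
    hispanic = rng.uniform(0, 40, n)                   # % Hispanic females
    mortality = (3 + 0.1 * poverty + 0.05 * hispanic
                 + rng.normal(0, 0.5, n))              # rate per 100,000

    X = np.column_stack([np.ones(n), poverty, hispanic])
    bandwidth = 25.0                                   # kernel bandwidth (km)

    def local_coefs(target):
        """WLS coefficients at one location, Gaussian distance decay."""
        d2 = ((coords - target) ** 2).sum(axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        W = np.diag(w)
        return np.linalg.solve(X.T @ W @ X, X.T @ W @ mortality)

    # Local coefficients vary over space, revealing non-stationarity.
    betas = np.array([local_coefs(c) for c in coords])
    print(betas[:, 1].min(), betas[:, 1].max())   # range of poverty effect
    ```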

  20. Marginal analysis in assessing factors contributing time to physician in the Emergency Department using operations data.

    PubMed

    Pathan, Sameer A; Bhutta, Zain A; Moinudheen, Jibin; Jenkins, Dominic; Silva, Ashwin D; Sharma, Yogdutt; Saleh, Warda A; Khudabakhsh, Zeenat; Irfan, Furqan B; Thomas, Stephen H

    2016-01-01

    Background: Standard Emergency Department (ED) operations goals include minimizing the time interval (tMD) between a patient's initial ED presentation and initial physician evaluation. This study assessed factors known (or suspected) to influence tMD, with a two-step goal. The first step was generation of a multivariate model identifying parameters associated with prolongation of tMD at a single study center. The second step used this center-specific multivariate tMD model as the basis for a predictive marginal probability analysis; the marginal model allowed prediction of the degree of operational benefit that would be effected by specific ED operations improvements. Methods: The study was conducted using one month (May 2015) of data obtained from an ED administrative database (EDAD) in an urban academic tertiary ED with an annual census of approximately 500,000; during the study month, the ED saw 39,593 cases. The EDAD data were used to generate a multivariate linear regression model assessing the effects of various demographic and operational covariates on the dependent variable tMD. Predictive marginal probability analysis was used to calculate the relative contributions of key covariates and to demonstrate the likely tMD impact of modifying those covariates through operational improvements. Analyses were conducted with Stata 14MP, with significance defined at p < 0.05 and confidence intervals (CIs) reported at the 95% level. Results: In an acceptable linear regression model that accounted for just over half of the overall variance in tMD (adjusted r² = 0.51), important contributors to tMD included shift census (p = 0.008), shift time of day (p = 0.002), and physician coverage number (p = 0.004). These strong associations remained even after adjusting for each other and for other covariates. Marginal predictive probability analysis was used to predict the overall tMD impact (improvement from 50 to 43 minutes, p < 0.001) of consistent staffing with 22 physicians. Conclusions: The analysis identified expected variables contributing to tMD, with regression demonstrating the significance and effect magnitude of alterations in covariates including patient census, shift time of day, and number of physicians. Marginal analysis provided an operationally useful demonstration of the need to adjust physician coverage numbers, prompting changes at the study ED. The methods used in this analysis may prove useful in other EDs wishing to analyze operations information with the goal of predicting which interventions may have the most benefit.
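
    A minimal sketch of the predictive-margins idea described above: fit a linear model for tMD, then compare the average prediction under observed staffing with the average prediction when every shift is set to 22 physicians, other covariates left as observed. Variable names and data are invented.

    ```python
    # Minimal sketch of predictive margins from a linear tMD model;
    # simulated data and hypothetical variable names.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(7)
    n = 5000
    df = pd.DataFrame({
        "shift_census": rng.poisson(60, n),
        "hour": rng.integers(0, 24, n),
        "n_physicians": rng.integers(15, 25, n),
    })
    df["tmd"] = (20 + 0.6 * df["shift_census"] - 1.1 * df["n_physicians"]
                 + rng.normal(0, 8, n))           # minutes, simulated

    fit = smf.ols("tmd ~ shift_census + C(hour) + n_physicians",
                  data=df).fit()

    # Predictive margin: average predicted tMD if every shift had 22
    # physicians, other covariates held at their observed values.
    cf = df.assign(n_physicians=22)
    print(fit.predict(df).mean(), fit.predict(cf).mean())
    ```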
