Sample records for standard regression techniques

  1. Comparison of standard maximum likelihood classification and polytomous logistic regression used in remote sensing

    Treesearch

    John Hogland; Nedret Billor; Nathaniel Anderson

    2013-01-01

    Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
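
    As a rough illustration of the comparison described above, the following Python sketch fits a linear discriminant classifier (the Gaussian maximum likelihood classifier of remote sensing packages) and a multinomial (polytomous) logistic regression to the same synthetic data; the data set and settings are hypothetical, not the study's imagery.

    ```python
    # Hypothetical comparison of discriminant analysis ("maximum likelihood
    # classification") and polytomous (multinomial) logistic regression on
    # synthetic data; not the authors' remote sensing pipeline.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                               n_classes=4, n_clusters_per_class=1, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)       # Gaussian ML classifier
    plr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)  # multinomial logit (PLR)

    print("LDA accuracy:", accuracy_score(y_te, lda.predict(X_te)))
    print("PLR accuracy:", accuracy_score(y_te, plr.predict(X_te)))
    ```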

  2. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    PubMed

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  3. Heterogeneity in drug abuse among juvenile offenders: is mixture regression more informative than standard regression?

    PubMed

    Montgomery, Katherine L; Vaughn, Michael G; Thompson, Sanna J; Howard, Matthew O

    2013-11-01

    Research on juvenile offenders has largely treated this population as a homogeneous group. However, recent findings suggest that this at-risk population may be considerably more heterogeneous than previously believed. This study compared mixture regression analyses with standard regression techniques in an effort to explain how known factors such as distress, trauma, and personality are associated with drug abuse among juvenile offenders. Researchers recruited 728 juvenile offenders from Missouri juvenile correctional facilities for participation in this study. Researchers investigated past-year substance use in relation to the following variables: demographic characteristics (gender, ethnicity, age, familial use of public assistance), antisocial behavior, and mental illness symptoms (psychopathic traits, psychiatric distress, and prior trauma). Results indicated that standard and mixed regression approaches identified significant variables related to past-year substance use among this population; however, the mixture regression methods provided greater specificity in results. Mixture regression analytic methods may help policy makers and practitioners better understand and intervene with the substance-related subgroups of juvenile offenders.
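
    The contrast drawn above can be illustrated with a small sketch: a single standard regression fit versus a two-component mixture of regressions estimated by EM on synthetic data with a latent subgroup. The data, starting values, and two-component choice are assumptions for illustration, not the study's model or variables.

    ```python
    # Sketch of a two-component mixture of linear regressions fitted by EM,
    # compared with a single standard OLS fit; synthetic data with a latent
    # subgroup, not the study's juvenile-offender data.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 600
    x = rng.uniform(0, 10, n)
    group = rng.random(n) < 0.5                      # latent subgroup membership
    y = np.where(group, 1.0 + 2.0 * x, 8.0 - 0.5 * x) + rng.normal(0, 1.0, n)
    X = np.column_stack([np.ones(n), x])

    # Standard regression: one homogeneous population assumed
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

    # EM for a two-component Gaussian mixture of regressions
    K = 2
    beta = np.array([[0.0, 1.0], [5.0, -1.0]])       # initial coefficients per component
    sigma = np.array([1.0, 1.0])
    pi = np.array([0.5, 0.5])

    def normal_pdf(resid, s):
        return np.exp(-0.5 * (resid / s) ** 2) / (s * np.sqrt(2 * np.pi))

    for _ in range(200):
        # E-step: responsibility of each component for each observation
        dens = np.column_stack([pi[k] * normal_pdf(y - X @ beta[k], sigma[k])
                                for k in range(K)])
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted least squares per component
        for k in range(K):
            w = np.sqrt(resp[:, k])
            beta[k], *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
            res = y - X @ beta[k]
            sigma[k] = np.sqrt(np.sum(resp[:, k] * res ** 2) / resp[:, k].sum())
        pi = resp.mean(axis=0)

    print("single OLS fit:    ", beta_ols)
    print("mixture components:", beta, "weights:", pi)
    ```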

  4. Statistical Evaluation of Time Series Analysis Techniques

    NASA Technical Reports Server (NTRS)

    Benignus, V. A.

    1973-01-01

    The performance of a modified version of NASA's multivariate spectrum analysis program is discussed. A multiple regression model was used to make the revisions. Performance improvements were documented and compared to the standard fast Fourier transform by Monte Carlo techniques.

  5. On the Bias-Amplifying Effect of Near Instruments in Observational Studies

    ERIC Educational Resources Information Center

    Steiner, Peter M.; Kim, Yongnam

    2014-01-01

    In contrast to randomized experiments, the estimation of unbiased treatment effects from observational data requires an analysis that conditions on all confounding covariates. Conditioning on covariates can be done via standard parametric regression techniques or nonparametric matching like propensity score (PS) matching. The regression or…

  6. Poisson Mixture Regression Models for Heart Disease Prediction.

    PubMed

    Mufudza, Chipo; Erol, Hamza

    2016-01-01

    Early control of heart disease can be achieved through efficient prediction and diagnosis. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models are addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model, as indicated by its lower Bayesian Information Criterion value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model overall for heart disease prediction, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise for the available clusters. It is concluded that heart disease prediction can be done effectively by identifying the major risks componentwise using a Poisson mixture regression model.

  7. Poisson Mixture Regression Models for Heart Disease Prediction

    PubMed Central

    Erol, Hamza

    2016-01-01

    Early control of heart disease can be achieved through efficient prediction and diagnosis. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models are addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model, as indicated by its lower Bayesian Information Criterion value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model overall for heart disease prediction, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise for the available clusters. It is concluded that heart disease prediction can be done effectively by identifying the major risks componentwise using a Poisson mixture regression model. PMID:27999611

  8. Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.

    PubMed

    Ritz, Christian; Van der Vliet, Leana

    2009-09-01

    The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions, variance homogeneity and normality, that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although commonly used with linear regression analysis, transformation of the response variable only is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the depreciation of less-desirable and less-flexible analytical techniques, such as linear interpolation.
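
    The following is a minimal sketch of the idea: Box-Cox transform a strictly positive response whose variance shrinks at the high-effect end, then fit a nonlinear concentration-response model on the transformed scale. The log-logistic model, synthetic data, and parameter values are assumptions for illustration, not the protocols evaluated in the paper.

    ```python
    # Sketch: Box-Cox transforming a response before a nonlinear
    # concentration-response fit; synthetic data, hypothetical model.
    import numpy as np
    from scipy import stats
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(1)
    conc = np.repeat([0.1, 0.5, 1, 5, 10, 50, 100], 8)
    true = 100.0 / (1.0 + (conc / 10.0) ** 1.5)          # log-logistic response
    y = true + rng.normal(0, 0.03 * true + 0.3)          # variance shrinks at high effect

    # Box-Cox requires a strictly positive response
    y_bc, lam = stats.boxcox(y)
    print("estimated Box-Cox lambda:", lam)

    def loglogistic(c, top, ec50, slope):
        return top / (1.0 + (c / ec50) ** slope)

    def loglogistic_bc(c, top, ec50, slope):
        # model evaluated on the same Box-Cox scale used for fitting
        return stats.boxcox(loglogistic(c, top, ec50, slope), lmbda=lam)

    params, _ = curve_fit(loglogistic_bc, conc, y_bc,
                          p0=[100.0, 10.0, 1.0], bounds=(1e-6, np.inf))
    print("fitted top, EC50, slope:", params)
    ```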

  9. Prediction models for clustered data: comparison of a random intercept and standard regression model

    PubMed Central

    2013-01-01

    Background When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Methods Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. Results The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. Conclusion The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters. PMID:23414436

  10. Prediction models for clustered data: comparison of a random intercept and standard regression model.

    PubMed

    Bouwmeester, Walter; Twisk, Jos W R; Kappen, Teus H; van Klei, Wilton A; Moons, Karel G M; Vergouwe, Yvonne

    2013-02-15

    When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters.
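
    A minimal sketch of the prediction comparison follows, using synthetic clustered data. Cluster indicators are entered as ordinary fixed-effect dummies as a rough stand-in for the paper's random-intercept model, and discrimination is summarized by the c-index (area under the ROC curve); none of the variables correspond to the actual study data.

    ```python
    # Sketch: prediction for clustered data, contrasting a standard logistic model
    # with one that also uses cluster effects (simple fixed-effect dummies as a
    # rough stand-in for a random-intercept model); synthetic data.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(2)
    n_clusters, n_per = 19, 80
    cluster = np.repeat(np.arange(n_clusters), n_per)
    u = rng.normal(0, 0.8, n_clusters)               # cluster-level effects (ICC > 0)
    x = rng.normal(size=cluster.size)                # patient-level predictor
    logit = -0.5 + 1.0 * x + u[cluster]
    y = (rng.random(cluster.size) < 1 / (1 + np.exp(-logit))).astype(int)

    df = pd.DataFrame({"x": x, "cluster": cluster, "y": y})
    X_std = df[["x"]]
    X_clu = pd.get_dummies(df[["x", "cluster"]], columns=["cluster"], drop_first=True)

    m_std = LogisticRegression(max_iter=1000).fit(X_std, df["y"])
    m_clu = LogisticRegression(max_iter=1000).fit(X_clu, df["y"])

    # Discrimination (c-index) with and without cluster effects used for prediction
    print("standard model c-index:     ", roc_auc_score(df["y"], m_std.predict_proba(X_std)[:, 1]))
    print("cluster-aware model c-index:", roc_auc_score(df["y"], m_clu.predict_proba(X_clu)[:, 1]))
    ```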

  11. Simultaneous Estimation of Regression Functions for Marine Corps Technical Training Specialties.

    DTIC Science & Technology

    1985-01-03

    ...applies Bayesian techniques for simultaneous estimation to the specification of regression weights for selection tests used in various technical training courses...

  12. Measurement of lung volumes from supine portable chest radiographs.

    PubMed

    Ries, A L; Clausen, J L; Friedman, P J

    1979-12-01

    Lung volumes in supine nonambulatory patients are physiological parameters often difficult to measure with current techniques (plethysmograph, gas dilution). Existing radiographic methods for measuring lung volumes require standard upright chest radiographs. Accordingly, in 31 normal supine adults, we determined helium-dilution functional residual and total lung capacities and measured planimetric lung field areas (LFA) from corresponding portable anteroposterior and lateral radiographs. Low radiation dose methods, which delivered less than 10% of that from standard portable X-ray technique, were utilized. Correlation between lung volume and radiographic LFA was highly significant (r = 0.96, SEE = 10.6%). Multiple-step regressions using height and chest diameter correction factors reduced variance, but weight and radiographic magnification factors did not. In 17 additional subjects studied for validation, the regression equations accurately predicted radiographic lung volume. Thus, this technique can provide accurate and rapid measurement of lung volume in studies involving supine patients.

  13. Estimating standard errors in feature network models.

    PubMed

    Frank, Laurence E; Heiser, Willem J

    2007-05-01

    Feature network models are graphical structures that represent proximity data in a discrete space while using the same formalism that is the basis of least squares methods employed in multidimensional scaling. Existing methods to derive a network model from empirical data only give the best-fitting network and yield no standard errors for the parameter estimates. The additivity properties of networks make it possible to consider the model as a univariate (multiple) linear regression problem with positivity restrictions on the parameters. In the present study, both theoretical and empirical standard errors are obtained for the constrained regression parameters of a network model with known features. The performance of both types of standard error is evaluated using Monte Carlo techniques.
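
    Because the network parameters reduce to a positivity-constrained regression, a simple way to sketch the idea is nonnegative least squares with bootstrap (empirical) standard errors, as below. This uses synthetic data and a generic constrained fit, not the feature network estimator or the theoretical standard errors derived in the paper.

    ```python
    # Sketch: regression with positivity restrictions (nonnegative least squares)
    # and bootstrap standard errors for the constrained parameters; synthetic data.
    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(3)
    n, p = 200, 4
    X = rng.random((n, p))
    beta_true = np.array([1.5, 0.0, 0.7, 2.0])       # some parameters on the boundary
    y = X @ beta_true + rng.normal(0, 0.3, n)

    beta_hat, _ = nnls(X, y)

    # Empirical (bootstrap) standard errors of the constrained estimates
    B = 1000
    boot = np.empty((B, p))
    for b in range(B):
        idx = rng.integers(0, n, n)
        boot[b], _ = nnls(X[idx], y[idx])

    print("NNLS estimates:      ", beta_hat)
    print("bootstrap std errors:", boot.std(axis=0))
    ```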

  14. Normalization Approaches for Removing Systematic Biases Associated with Mass Spectrometry and Label-Free Proteomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Callister, Stephen J.; Barry, Richard C.; Adkins, Joshua N.

    2006-02-01

    Central tendency, linear regression, locally weighted regression, and quantile techniques were investigated for normalization of peptide abundance measurements obtained from high-throughput liquid chromatography-Fourier transform ion cyclotron resonance mass spectrometry (LC-FTICR MS). Arbitrary abundances of peptides were obtained from three sample sets, including a standard protein sample, two Deinococcus radiodurans samples taken from different growth phases, and two mouse striatum samples from control and methamphetamine-stressed mice (strain C57BL/6). The selected normalization techniques were evaluated in both the absence and presence of biological variability by estimating extraneous variability prior to and following normalization. Prior to normalization, replicate runs from each sample set were observed to be statistically different, while following normalization replicate runs were no longer statistically different. Although all techniques reduced systematic bias, assigned ranks among the techniques revealed significant trends. For most LC-FTICR MS analyses, linear regression normalization ranked either first or second among the four techniques, suggesting that this technique was more generally suitable for reducing systematic biases.
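
    A minimal sketch of linear regression normalization between two replicate runs follows: regress one run's log abundances on the other's and remove the fitted systematic trend. The synthetic abundances and the simple two-run setup are assumptions for illustration, not the LC-FTICR MS data sets used in the study.

    ```python
    # Sketch of linear-regression normalization between two replicate LC-MS runs:
    # regress one run's log abundances on the other's and remove the fitted
    # systematic trend; synthetic peptide abundances.
    import numpy as np

    rng = np.random.default_rng(4)
    n_peptides = 500
    log_true = rng.normal(20, 2, n_peptides)                        # "true" log2 abundances
    run1 = log_true + rng.normal(0, 0.3, n_peptides)
    run2 = 1.08 * log_true - 1.5 + rng.normal(0, 0.3, n_peptides)   # systematic bias

    # Fit run2 as a linear function of run1 and correct run2 toward run1's scale
    slope, intercept = np.polyfit(run1, run2, 1)
    run2_norm = (run2 - intercept) / slope

    print("mean difference before normalization:", np.mean(run2 - run1))
    print("mean difference after normalization: ", np.mean(run2_norm - run1))
    ```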

  15. Techniques for estimating monthly mean streamflow at gaged sites and monthly streamflow duration characteristics at ungaged sites in central Nevada

    USGS Publications Warehouse

    Hess, G.W.; Bohman, L.R.

    1996-01-01

    Techniques for estimating monthly mean streamflow at gaged sites and monthly streamflow duration characteristics at ungaged sites in central Nevada were developed using streamflow records at six gaged sites and basin physical and climatic characteristics. Streamflow data at gaged sites were related by regression techniques to concurrent flows at nearby gaging stations so that monthly mean streamflows for periods of missing or no record can be estimated for gaged sites in central Nevada. The standard error of estimate for relations at these sites ranged from 12 to 196 percent. Also, monthly streamflow data for selected percent exceedence levels were used in regression analyses with basin and climatic variables to determine relations for ungaged basins for annual and monthly percent exceedence levels. Analyses indicate that the drainage area and percent of drainage area at altitudes greater than 10,000 feet are the most significant variables. For the annual percent exceedence, the standard error of estimate of the relations for ungaged sites ranged from 51 to 96 percent and standard error of prediction for ungaged sites ranged from 96 to 249 percent. For the monthly percent exceedence values, the standard error of estimate of the relations ranged from 31 to 168 percent, and the standard error of prediction ranged from 115 to 3,124 percent. Reliability and limitations of the estimating methods are described.

  16. A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

    NASA Astrophysics Data System (ADS)

    Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

    2018-04-01

    In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research examined the relationship between 20 topsoil variates, analyzed prior to planting, and paddy yields at standard fertilizer rates. Data were from the multi-location rice trials carried out by MARDI at major paddy granaries in Peninsular Malaysia from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model with the fuzzy c-means method. Analyses of normality and multicollinearity indicate that the data are normally scattered without multicollinearity among the independent variables. Fuzzy c-means analysis groups the paddy yield into two clusters before the multiple linear regression model is applied. Comparison of the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model, with a lower mean square error.

  17. Estimating peak discharges, flood volumes, and hydrograph shapes of small ungaged urban streams in Ohio

    USGS Publications Warehouse

    Sherwood, J.M.

    1986-01-01

    Methods are presented for estimating peak discharges, flood volumes and hydrograph shapes of small (less than 5 sq mi) urban streams in Ohio. Examples of how to use the various regression equations and estimating techniques also are presented. Multiple-regression equations were developed for estimating peak discharges having recurrence intervals of 2, 5, 10, 25, 50, and 100 years. The significant independent variables affecting peak discharge are drainage area, main-channel slope, average basin-elevation index, and basin-development factor. Standard errors of regression and prediction for the peak discharge equations range from +/-37% to +/-41%. An equation also was developed to estimate the flood volume of a given peak discharge. Peak discharge, drainage area, main-channel slope, and basin-development factor were found to be the significant independent variables affecting flood volumes for given peak discharges. The standard error of regression for the volume equation is +/-52%. A technique is described for estimating the shape of a runoff hydrograph by applying a specific peak discharge and the estimated lagtime to a dimensionless hydrograph. An equation for estimating the lagtime of a basin was developed. Two variables--main-channel length divided by the square root of the main-channel slope and basin-development factor--have a significant effect on basin lagtime. The standard error of regression for the lagtime equation is +/-48%. The data base for the study was established by collecting rainfall-runoff data at 30 basins distributed throughout several metropolitan areas of Ohio. Five to eight years of data were collected at a 5-min record interval. The USGS rainfall-runoff model A634 was calibrated for each site. The calibrated models were used in conjunction with long-term rainfall records to generate a long-term streamflow record for each site. Each annual peak-discharge record was fitted to a Log-Pearson Type III frequency curve. Multiple-regression techniques were then used to analyze the peak discharge data as a function of the basin characteristics of the 30 sites. (Author's abstract)

  18. Multiple regression technique for Pth degree polynominals with and without linear cross products

    NASA Technical Reports Server (NTRS)

    Davis, J. W.

    1973-01-01

    A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products which evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent of error evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique which show the output formats and typical plots comparing computer results to each set of input data.
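
    The following sketch fits a degree-P polynomial surface in several variables with and without pairwise (linear) cross-product terms, reporting the residual standard deviation and maximum absolute percent error mentioned above. The data and degree are hypothetical, and this is not the original NASA programs.

    ```python
    # Sketch: degree-P polynomial regression in several variables, fitted with and
    # without pairwise (linear) cross-product terms; synthetic data.
    import itertools
    import numpy as np

    rng = np.random.default_rng(5)
    n, n_vars, degree = 300, 3, 3
    X = rng.uniform(-1, 1, (n, n_vars))
    y = 10 + X[:, 0] - 0.5 * X[:, 1] ** 2 + 0.3 * X[:, 2] ** 3 \
        + 1.5 * X[:, 0] * X[:, 1] + rng.normal(0, 0.1, n)

    def design(X, degree, cross_products):
        cols = [np.ones(len(X))]
        for j in range(X.shape[1]):                # pure powers x_j, x_j^2, ..., x_j^P
            for k in range(1, degree + 1):
                cols.append(X[:, j] ** k)
        if cross_products:                         # pairwise linear cross products x_i * x_j
            for i, j in itertools.combinations(range(X.shape[1]), 2):
                cols.append(X[:, i] * X[:, j])
        return np.column_stack(cols)

    for cross in (False, True):
        A = design(X, degree, cross)
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        print(f"cross products={cross}: residual std = {resid.std():.3f}, "
              f"max |% error| = {np.max(np.abs(resid / y)) * 100:.2f}")
    ```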

  19. Robust logistic regression to narrow down the winner's curse for rare and recessive susceptibility variants.

    PubMed

    Kesselmeier, Miriam; Lorenzo Bermejo, Justo

    2017-11-01

    Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  20. Technique for estimating the 2- to 500-year flood discharges on unregulated streams in rural Missouri

    USGS Publications Warehouse

    Alexander, Terry W.; Wilson, Gary L.

    1995-01-01

    A generalized least-squares regression technique was used to relate the 2- to 500-year flood discharges from 278 selected streamflow-gaging stations to statistically significant basin characteristics. The regression relations (estimating equations) were defined for three hydrologic regions (I, II, and III) in rural Missouri. Ordinary least-squares regression analyses indicate that drainage area (Regions I, II, and III) and main-channel slope (Regions I and II) are the only basin characteristics needed for computing the 2- to 500-year design-flood discharges at gaged or ungaged stream locations. The resulting generalized least-squares regression equations provide a technique for estimating the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year flood discharges on unregulated streams in rural Missouri. The regression equations for Regions I and II were developed from stream-flow-gaging stations with drainage areas ranging from 0.13 to 11,500 square miles and 0.13 to 14,000 square miles, and main-channel slopes ranging from 1.35 to 150 feet per mile and 1.20 to 279 feet per mile. The regression equations for Region III were developed from streamflow-gaging stations with drainage areas ranging from 0.48 to 1,040 square miles. Standard errors of estimate for the generalized least-squares regression equations in Regions I, II, and III ranged from 30 to 49 percent.

  21. Cost-effectiveness of the streamflow-gaging program in Wyoming

    USGS Publications Warehouse

    Druse, S.A.; Wahl, K.L.

    1988-01-01

    This report documents the results of a cost-effectiveness study of the streamflow-gaging program in Wyoming. Regression analysis or hydrologic flow-routing techniques were considered for 24 combinations of stations from a 139-station network operated in 1984 to investigate suitability of techniques for simulating streamflow records. Only one station was determined to have sufficient accuracy in the regression analysis to consider discontinuance of the gage. The evaluation of the gaging-station network, which included the use of associated uncertainty in streamflow records, is limited to the nonwinter operation of the 47 stations operated by the Riverton Field Office of the U.S. Geological Survey. The current (1987) travel routes and measurement frequencies require a budget of $264,000 and result in an average standard error in streamflow records of 13.2%. Changes in routes and station visits using the same budget could optimally reduce the standard error by 1.6%. Budgets evaluated ranged from $235,000 to $400,000. A $235,000 budget increased the optimal average standard error per station from 11.6 to 15.5%, and a $400,000 budget could reduce it to 6.6%. For all budgets considered, lost record accounts for about 40% of the average standard error. (USGS)

  22. Analysis of the Magnitude and Frequency of Peak Discharges for the Navajo Nation in Arizona, Utah, Colorado, and New Mexico

    USGS Publications Warehouse

    Waltemeyer, Scott D.

    2006-01-01

    Estimates of the magnitude and frequency of peak discharges are necessary for reliable flood-hazard mapping in the Navajo Nation in Arizona, Utah, Colorado, and New Mexico. The Bureau of Indian Affairs, U.S. Army Corps of Engineers, and Navajo Nation requested that the U.S. Geological Survey update estimates of peak discharge magnitude for gaging stations in the region and update regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites using data collected through 1999 at 146 gaging stations, an additional 13 years of peak-discharge data since a 1997 investigation, which used gaging-station data through 1986. The equations for estimation of peak discharges at ungaged sites were developed for flood regions 8, 11, high elevation, and 6 and are delineated on the basis of the hydrologic codes from the 1997 investigation. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 82 of the 146 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated the peak discharge having a recurrence interval of less than 1.4 years in the probability-density function. Within each region, logarithms of the peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then was applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction for the 100-year peak discharge in region 8 was 53 percent. The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 45 to 83 percent for the 100-year flood. The estimated standard error of prediction for a hybrid method for region 11 was large in the 1997 investigation. No distinction of floods produced from a high-elevation region was presented in the 1997 investigation. Overall, the equations based on generalized least-squares regression techniques are considered to be more reliable than those in the 1997 report because of the increased length of record and improved GIS method. Flood-frequency relations can be transferred to an ungaged site on the same stream either by direct application of the regional regression equation or, for an ungaged site on a stream that has a gaging station upstream or downstream, by using the drainage-area ratio and the drainage-area exponent from the regional regression equation of the respective region.
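
    A minimal sketch of the core frequency computation follows: fit a log-Pearson Type III distribution to annual peaks and read off the 100-year quantile. The synthetic record is hypothetical, and the low-discharge threshold, skew adjustments, and regional regression steps described above are omitted.

    ```python
    # Sketch: log-Pearson Type III flood-frequency fit for annual peak discharges
    # and the 100-year (1% exceedance) quantile; synthetic peaks only.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    peaks = rng.lognormal(mean=6.0, sigma=0.8, size=45)   # hypothetical annual peaks, cfs

    logq = np.log10(peaks)
    mean, std = logq.mean(), logq.std(ddof=1)
    skew = stats.skew(logq, bias=False)                   # station skew of the logs

    # Pearson Type III fitted to log10(Q); 100-year flood = 0.99 non-exceedance quantile
    q100_log = stats.pearson3.ppf(0.99, skew, loc=mean, scale=std)
    print("estimated 100-year peak discharge:", 10 ** q100_log)
    ```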

  23. Linear regression analysis and its application to multivariate chromatographic calibration for the quantitative analysis of two-component mixtures.

    PubMed

    Dinç, Erdal; Ozdemir, Abdil

    2005-01-01

    A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of the multivariate chromatographic calibration technique is based on linear regression equations constructed from the relationship between concentration and peak area at a five-wavelength set. The algorithm of this calibration model, which has a simple mathematical content, is briefly described. This approach is a powerful mathematical tool for optimum chromatographic multivariate calibration and for eliminating fluctuations arising from instrumental and experimental conditions. The multivariate chromatographic calibration reduces the multivariate linear regression functions to a univariate data set. The model was validated by analyzing various synthetic binary mixtures and by using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The results were compared with those obtained by a classical HPLC method; the proposed multivariate chromatographic calibration was observed to give better results than classical HPLC.

  24. The effect of high leverage points on the logistic ridge regression estimator having multicollinearity

    NASA Astrophysics Data System (ADS)

    Ariffin, Syaiba Balqish; Midi, Habshah

    2014-06-01

    This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
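
    For illustration, logistic ridge regression can be sketched as L2-penalised logistic regression; in scikit-learn the parameter C is the inverse of the ridge penalty. The nearly collinear synthetic predictors below are an assumption for illustration, and the sketch does not reproduce the article's estimator or its leverage diagnostics.

    ```python
    # Sketch: logistic ridge regression as L2-penalised logistic regression on
    # nearly collinear synthetic predictors.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(7)
    n = 300
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(0, 0.05, n)                 # nearly collinear with x1
    X = np.column_stack([x1, x2])
    y = (rng.random(n) < 1 / (1 + np.exp(-(0.5 * x1 + 0.5 * x2)))).astype(int)

    mle = LogisticRegression(C=1e6, max_iter=5000).fit(X, y)   # essentially unpenalised ML
    ridge = LogisticRegression(C=0.1, max_iter=5000).fit(X, y)  # strong L2 (ridge) penalty

    print("ML coefficients:   ", mle.coef_)          # typically unstable / large
    print("ridge coefficients:", ridge.coef_)        # shrunk toward zero, more stable
    ```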

  25. Practical Guidance for Conducting Mediation Analysis With Multiple Mediators Using Inverse Odds Ratio Weighting

    PubMed Central

    Nguyen, Quynh C.; Osypuk, Theresa L.; Schmidt, Nicole M.; Glymour, M. Maria; Tchetgen Tchetgen, Eric J.

    2015-01-01

    Despite the recent flourishing of mediation analysis techniques, many modern approaches are difficult to implement or applicable to only a restricted range of regression models. This report provides practical guidance for implementing a new technique utilizing inverse odds ratio weighting (IORW) to estimate natural direct and indirect effects for mediation analyses. IORW takes advantage of the odds ratio's invariance property and condenses information on the odds ratio for the relationship between the exposure (treatment) and multiple mediators, conditional on covariates, by regressing exposure on mediators and covariates. The inverse of the covariate-adjusted exposure-mediator odds ratio association is used to weight the primary analytical regression of the outcome on treatment. The treatment coefficient in such a weighted regression estimates the natural direct effect of treatment on the outcome, and indirect effects are identified by subtracting direct effects from total effects. Weighting renders treatment and mediators independent, thereby deactivating indirect pathways of the mediators. This new mediation technique accommodates multiple discrete or continuous mediators. IORW is easily implemented and is appropriate for any standard regression model, including quantile regression and survival analysis. An empirical example is given using data from the Moving to Opportunity (1994–2002) experiment, testing whether neighborhood context mediated the effects of a housing voucher program on obesity. Relevant Stata code (StataCorp LP, College Station, Texas) is provided. PMID:25693776
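
    A rough sketch of the weighting recipe summarized above follows: regress exposure on the mediator(s) and covariates, weight exposed subjects by the inverse of their fitted odds, and compare an unweighted (total-effect) with a weighted (direct-effect) outcome regression. The data and variable names are synthetic and hypothetical, the authors' exact weight definition and variance procedure may differ, and their published Stata code is the authoritative implementation.

    ```python
    # Rough sketch of mediation analysis by inverse odds (ratio) weighting:
    # regress exposure on mediator + covariates, weight exposed subjects by the
    # inverse of their fitted odds, then compare a weighted (direct-effect) and
    # unweighted (total-effect) outcome regression. Synthetic data; hypothetical
    # variable names; not the authors' exact estimator.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    n = 2000
    c = rng.normal(size=n)                                  # covariate
    a = rng.binomial(1, 1 / (1 + np.exp(-0.3 * c)))         # binary exposure (treatment)
    m = 0.8 * a + 0.3 * c + rng.normal(size=n)              # mediator
    y = 0.5 * a + 1.0 * m + 0.4 * c + rng.normal(size=n)    # continuous outcome

    df = pd.DataFrame({"y": y, "a": a, "m": m, "c": c})

    # Step 1: exposure model (exposure regressed on mediator and covariate)
    exp_model = sm.Logit(df["a"], sm.add_constant(df[["m", "c"]])).fit(disp=0)
    odds = np.exp(exp_model.fittedvalues)                   # fitted odds of exposure
    w = np.where(df["a"] == 1, 1.0 / odds, 1.0)             # inverse odds weights for exposed

    # Step 2: total effect (unweighted) and natural direct effect (weighted)
    X_out = sm.add_constant(df[["a", "c"]])
    total = sm.OLS(df["y"], X_out).fit().params["a"]
    direct = sm.WLS(df["y"], X_out, weights=w).fit().params["a"]
    print("total effect:   ", total)
    print("direct effect:  ", direct)
    print("indirect effect:", total - direct)               # total minus direct
    ```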

  26. FIRE: an SPSS program for variable selection in multiple linear regression analysis via the relative importance of predictors.

    PubMed

    Lorenzo-Seva, Urbano; Ferrando, Pere J

    2011-03-01

    We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

  27. Regression analysis of current-status data: an application to breast-feeding.

    PubMed

    Grummer-strawn, L M

    1993-09-01

    "Although techniques for calculating mean survival time from current-status data are well known, their use in multiple regression models is somewhat troublesome. Using data on current breast-feeding behavior, this article considers a number of techniques that have been suggested in the literature, including parametric, nonparametric, and semiparametric models as well as the application of standard schedules. Models are tested in both proportional-odds and proportional-hazards frameworks....I fit [the] models to current status data on breast-feeding from the Demographic and Health Survey (DHS) in six countries: two African (Mali and Ondo State, Nigeria), two Asian (Indonesia and Sri Lanka), and two Latin American (Colombia and Peru)." excerpt

  28. Differences in head impulse test results due to analysis techniques.

    PubMed

    Cleworth, Taylor W; Carpenter, Mark G; Honegger, Flurin; Allum, John H J

    2017-01-01

    Different analysis techniques are used to define vestibulo-ocular reflex (VOR) gain between eye and head angular velocity during the video head impulse test (vHIT). Comparisons would aid selection of gain techniques best related to head impulse characteristics and promote standardisation. Compare and contrast known methods of calculating vHIT VOR gain. We examined lateral canal vHIT responses recorded from 20 patients twice within 13 weeks of acute unilateral peripheral vestibular deficit onset. Ten patients were tested with an ICS Impulse system (GN Otometrics) and 10 with an EyeSeeCam (ESC) system (Interacoustics). Mean gain and variance were computed with area, average sample gain, and regression techniques over specific head angular velocity (HV) and acceleration (HA) intervals. Results for the same gain technique were not different between measurement systems. Area and average sample gain yielded equally lower variances than regression techniques. Gains computed over the whole impulse duration were larger than those computed for increasing HV. Gain over decreasing HV was associated with larger variances. Gains computed around peak HV were smaller than those computed around peak HA. The median gain over 50-70 ms was not different from gain around peak HV. However, depending on technique used, the gain over increasing HV was different from gain around peak HA. Conversion equations between gains obtained with standard ICS and ESC methods were computed. For low gains, the conversion was dominated by a constant that needed to be added to ESC gains to equal ICS gains. We recommend manufacturers standardize vHIT gain calculations using 2 techniques: area gain around peak HA and peak HV.

  29. Microbioassay of Antimicrobial Agents

    PubMed Central

    Simon, Harold J.; Yin, E. Jong

    1970-01-01

    A previously described agar-diffusion technique for microbioassay of antimicrobial agents has been modified to increase sensitivity of the technique and to extend the range of antimicrobial agents to which it is applicable. This microtechnique requires only 0.02 ml of an unknown test sample for assay, and is capable of measuring minute concentrations of antibiotics in buffer, serum, and urine. In some cases, up to a 20-fold increase in sensitivity is gained relative to other published standardized methods and the error of this method is less than ±5%. Buffer standard curves have been established for this technique, concurrently with serum standard curves, yielding information on antimicrobial serum-binding and demonstrating linearity of the data points compared to the estimated regression line for the microconcentration ranges covered by this technique. This microassay technique is particularly well suited for pediatric research and for other investigations where sample volumes are small and quantitative accuracy is desired. Dilution of clinical samples to attain concentrations falling within the range of this assay makes the technique readily adaptable and suitable for general clinical pharmacological studies. The microassay technique has been standardized in buffer solutions and in normal human serum pools for the following antimicrobials: ampicillin, methicillin, penicillin G, oxacillin, cloxacillin, dicloxacillin, cephaloglycin, cephalexin, cephaloridine, cephalothin, erythromycin, rifamycin amino methyl piperazine, kanamycin, neomycin, streptomycin, colistin, polymyxin B, doxycycline, minocycline, oxytetracycline, tetracycline, and chloramphenicol. PMID:4986725

  30. Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation

    NASA Astrophysics Data System (ADS)

    Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.

    A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OC_non-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC)_pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OC_non-comb is allowed to obtain a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OC_non-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.
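
    Two of the slope estimators discussed above are easy to sketch: the ratio of averages and Deming regression with an assumed error-variance ratio. The synthetic OC and EC values and the assumed error variances below are illustrative only, not the study's monitoring data.

    ```python
    # Sketch: two slope estimators for the EC tracer method, a ratio of averages
    # and Deming regression (errors in both OC and EC, with an assumed
    # error-variance ratio); synthetic OC/EC data.
    import numpy as np

    rng = np.random.default_rng(9)
    n = 400
    ec_true = rng.gamma(2.0, 0.5, n)                 # "true" EC, ug/m3
    oc_true = 0.3 + 2.2 * ec_true                    # primary OC/EC ratio = 2.2
    ec = ec_true + rng.normal(0, 0.10, n)            # measurement error in EC
    oc = oc_true + rng.normal(0, 0.15, n)            # measurement error in OC

    # Ratio of averages (assumes non-combustion OC = 0)
    roa = oc.mean() / ec.mean()

    # Deming regression with delta = var(err_oc) / var(err_ec) assumed known
    delta = 0.15 ** 2 / 0.10 ** 2
    sxx = np.var(ec, ddof=1)
    syy = np.var(oc, ddof=1)
    sxy = np.cov(ec, oc, ddof=1)[0, 1]
    slope = (syy - delta * sxx
             + np.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    intercept = oc.mean() - slope * ec.mean()

    print("ratio of averages (OC/EC)_pri:", roa)
    print("Deming slope, intercept:      ", slope, intercept)
    ```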

  31. Something old, something new, something borrowed, something blue: a framework for the marriage of health econometrics and cost-effectiveness analysis.

    PubMed

    Hoch, Jeffrey S; Briggs, Andrew H; Willan, Andrew R

    2002-07-01

    Economic evaluation is often seen as a branch of health economics divorced from mainstream econometric techniques. Instead, it is perceived as relying on statistical methods for clinical trials. Furthermore, the statistic of interest in cost-effectiveness analysis, the incremental cost-effectiveness ratio, is not amenable to regression-based methods, hence the traditional reliance on comparing aggregate measures across the arms of a clinical trial. In this paper, we explore the potential for health economists undertaking cost-effectiveness analysis to exploit the plethora of established econometric techniques through the use of the net-benefit framework - a recently suggested reformulation of the cost-effectiveness problem that avoids the reliance on cost-effectiveness ratios and their associated statistical problems. This allows the formulation of the cost-effectiveness problem within a standard regression type framework. We provide an example with empirical data to illustrate how a regression type framework can enhance the net-benefit method. We go on to suggest that practical advantages of the net-benefit regression approach include being able to use established econometric techniques, adjust for imperfect randomisation, and identify important subgroups in order to estimate the marginal cost-effectiveness of an intervention. Copyright 2002 John Wiley & Sons, Ltd.
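
    A minimal sketch of the net-benefit regression idea follows: convert each patient's cost and effect into a net benefit at a chosen willingness-to-pay and regress it on the treatment arm, so the treatment coefficient is the incremental net benefit. The trial data and willingness-to-pay value are hypothetical.

    ```python
    # Sketch of the net-benefit regression framing: net benefit = WTP * effect - cost,
    # regressed on the randomised arm indicator; synthetic trial data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(10)
    n = 500
    treat = rng.binomial(1, 0.5, n)                         # randomised arm indicator
    effect = 0.70 + 0.05 * treat + rng.normal(0, 0.2, n)    # e.g. QALYs
    cost = 4000 + 1500 * treat + rng.normal(0, 800, n)      # e.g. cost in dollars

    wtp = 50_000                                            # willingness to pay per unit effect
    net_benefit = wtp * effect - cost

    X = sm.add_constant(treat)
    fit = sm.OLS(net_benefit, X).fit()
    print("incremental net benefit:", fit.params[1])        # > 0 favours the new treatment
    print("95% CI:", fit.conf_int()[1])
    ```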

  32. Practical guidance for conducting mediation analysis with multiple mediators using inverse odds ratio weighting.

    PubMed

    Nguyen, Quynh C; Osypuk, Theresa L; Schmidt, Nicole M; Glymour, M Maria; Tchetgen Tchetgen, Eric J

    2015-03-01

    Despite the recent flourishing of mediation analysis techniques, many modern approaches are difficult to implement or applicable to only a restricted range of regression models. This report provides practical guidance for implementing a new technique utilizing inverse odds ratio weighting (IORW) to estimate natural direct and indirect effects for mediation analyses. IORW takes advantage of the odds ratio's invariance property and condenses information on the odds ratio for the relationship between the exposure (treatment) and multiple mediators, conditional on covariates, by regressing exposure on mediators and covariates. The inverse of the covariate-adjusted exposure-mediator odds ratio association is used to weight the primary analytical regression of the outcome on treatment. The treatment coefficient in such a weighted regression estimates the natural direct effect of treatment on the outcome, and indirect effects are identified by subtracting direct effects from total effects. Weighting renders treatment and mediators independent, thereby deactivating indirect pathways of the mediators. This new mediation technique accommodates multiple discrete or continuous mediators. IORW is easily implemented and is appropriate for any standard regression model, including quantile regression and survival analysis. An empirical example is given using data from the Moving to Opportunity (1994-2002) experiment, testing whether neighborhood context mediated the effects of a housing voucher program on obesity. Relevant Stata code (StataCorp LP, College Station, Texas) is provided. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  33. The measurement of linear frequency drift in oscillators

    NASA Astrophysics Data System (ADS)

    Barnes, J. A.

    1985-04-01

    A linear drift in frequency is an important element in most stochastic models of oscillator performance. Quartz crystal oscillators often have drifts in excess of a part in ten to the tenth power per day. Even commercial cesium beam devices often show drifts of a few parts in ten to the thirteenth per year. There are many ways to estimate the drift rates from data samples (e.g., regress the phase on a quadratic; regress the frequency on a linear; compute the simple mean of the first difference of frequency; use Kalman filters with a drift term as one element in the state vector; and others). Although most of these estimators are unbiased, they vary in efficiency (i.e., confidence intervals). Further, the estimation of confidence intervals using the standard analysis of variance (typically associated with the specific estimating technique) can give amazingly optimistic results. The source of these problems is not an error in, say, the regression techniques, but rather the problems arise from correlations within the residuals. That is, the oscillator model is often not consistent with constraints on the analysis technique or, in other words, some specific analysis techniques are often inappropriate for the task at hand. The appropriateness of a specific analysis technique is critically dependent on the oscillator model and can often be checked with a simple whiteness test on the residuals.
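
    The estimators listed above can be sketched directly; the following compares a quadratic fit to phase, a linear fit to frequency, and the mean first difference of frequency on a simulated record with linear drift and white frequency noise. The noise model and parameter values are assumptions for illustration.

    ```python
    # Sketch of three drift-rate estimators on a simulated oscillator record with
    # linear frequency drift plus white frequency noise.
    import numpy as np

    rng = np.random.default_rng(11)
    tau = 1.0                                   # sample interval, days
    n = 365
    t = np.arange(n) * tau
    true_drift = 1e-13                          # fractional frequency per day
    freq = true_drift * t + rng.normal(0, 5e-13, n)   # white frequency noise
    phase = np.cumsum(freq) * tau               # phase (time error) from frequency

    # (1) quadratic fit to phase: drift = 2 * quadratic coefficient
    drift_phase = 2 * np.polyfit(t, phase, 2)[0]
    # (2) linear fit to frequency: drift = slope
    drift_freq = np.polyfit(t, freq, 1)[0]
    # (3) mean first difference of frequency
    drift_diff = np.mean(np.diff(freq)) / tau

    print(drift_phase, drift_freq, drift_diff)
    ```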

  34. Considerations for monitoring raptor population trends based on counts of migrants

    USGS Publications Warehouse

    Titus, K.; Fuller, M.R.; Ruos, J.L.; Meyburg, B-U.; Chancellor, R.D.

    1989-01-01

    Various problems were identified with standardized hawk count data as annually collected at six sites. Some of the hawk lookouts increased their hours of observation from 1979-1985, thereby confounding the total counts. Data recording and missing data hamper coding of data and their use with modern analytical techniques. Coefficients of variation among years in counts averaged about 40%. The advantages and disadvantages of various analytical techniques are discussed including regression, non-parametric rank correlation trend analysis, and moving averages.

  35. Estimating the magnitude of peak flows for streams in Kentucky for selected recurrence intervals

    USGS Publications Warehouse

    Hodgkins, Glenn A.; Martin, Gary R.

    2003-01-01

    This report gives estimates of, and presents techniques for estimating, the magnitude of peak flows for streams in Kentucky for recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years. A flowchart in this report guides the user to the appropriate estimates and (or) estimating techniques for a site on a specific stream. Estimates of peak flows are given for 222 U.S. Geological Survey streamflow-gaging stations in Kentucky. In the development of the peak-flow estimates at gaging stations, a new generalized skew coefficient was calculated for the State. This single statewide value of 0.011 (with a standard error of prediction of 0.520) is more appropriate for Kentucky than the national skew isoline map in Bulletin 17B of the Interagency Advisory Committee on Water Data. Regression equations are presented for estimating the peak flows on ungaged, unregulated streams in rural drainage basins. The equations were developed by use of generalized-least-squares regression procedures at 187 U.S. Geological Survey gaging stations in Kentucky and 51 stations in surrounding States. Kentucky was divided into seven flood regions. Total drainage area is used in the final regression equations as the sole explanatory variable, except in Regions 1 and 4 where main-channel slope also was used. The smallest average standard errors of prediction were in Region 3 (from -13.1 to +15.0 percent) and the largest average standard errors of prediction were in Region 5 (from -37.6 to +60.3 percent). One section of this report describes techniques for estimating peak flows for ungaged sites on gaged, unregulated streams in rural drainage basins. Another section references two previous U.S. Geological Survey reports for peak-flow estimates on ungaged, unregulated, urban streams. Estimating peak flows at ungaged sites on regulated streams is beyond the scope of this report, because peak flows on regulated streams are dependent upon variable human activities.

  36. Validity and reliability of dental age estimation of teeth root translucency based on digital luminance determination.

    PubMed

    Ramsthaler, Frank; Kettner, Mattias; Verhoff, Marcel A

    2014-01-01

    In forensic anthropological casework, estimating age-at-death is key to profiling unknown skeletal remains. The aim of this study was to examine the reliability of a new, simple, fast, and inexpensive digital odontological method for age-at-death estimation. The method is based on the original Lamendin method, which is a widely used technique in the repertoire of odontological aging methods in forensic anthropology. We examined 129 single root teeth employing a digital camera and imaging software for the measurement of the luminance of the teeth's translucent root zone. Variability in luminance detection was evaluated using statistical technical error of measurement analysis. The method revealed stable values largely unrelated to observer experience, whereas requisite formulas proved to be camera-specific and should therefore be generated for an individual recording setting based on samples of known chronological age. Multiple regression analysis showed a highly significant influence of the coefficients of the variables "arithmetic mean" and "standard deviation" of luminance for the regression formula. For the use of this primer multivariate equation for age-at-death estimation in casework, a standard error of the estimate of 6.51 years was calculated. Step-by-step reduction of the number of embedded variables to linear regression analysis employing the best contributor "arithmetic mean" of luminance yielded a regression equation with a standard error of 6.72 years (p < 0.001). The results of this study not only support the premise of root translucency as an age-related phenomenon, but also demonstrate that translucency reflects a number of other influencing factors in addition to age. This new digital measuring technique of the zone of dental root luminance can broaden the array of methods available for estimating chronological age, and furthermore facilitate measurement and age classification due to its low dependence on observer experience.

  37. Human Language Technology: Opportunities and Challenges

    DTIC Science & Technology

    2005-01-01

    ...because of the connections to and reliance on signal processing. Audio diarization critically includes indexing of speakers [12], since speaker... to reduce inter-speaker variability in training. Standard techniques include vocal-tract length normalization, adaptation of acoustic models using... maximum likelihood linear regression (MLLR), and speaker-adaptive training based on MLLR. The acoustic models are mixtures of Gaussians, typically with...

  38. An Ecological Study of Community-Level Correlates of Suicide Mortality Rates in the Flemish Region of Belgium, 1996-2005

    ERIC Educational Resources Information Center

    Hooghe, Marc; Vanhoutte, Bram

    2011-01-01

    An ecological study of age-standardized suicide rates in Belgian communities (1996-2005) was conducted using spatial regression techniques. Community characteristics were significantly related to suicide rates. There was mixed support for the social integration perspective: single person households were associated with higher suicide rates, while…

  19. The Effects of Equal Status Cross-Sex Contact on Students' Sex Stereotyped Attitudes and Behavior.

    ERIC Educational Resources Information Center

    Lockheed, Marlaine E.; Harris, Abigail M.

    Standard least squares regression techniques are used to estimate the effects of non-sex-role stereotypes, equal-status cross-sex interaction and female leadership on changes in children's sex stereotyped attitudes. Included are a pretest, experimental treatment, and post-test. Teachers of approximately 400 fourth and fifth grade children received…

  20. Estimation of Flood Discharges at Selected Recurrence Intervals for Streams in New Hampshire

    USGS Publications Warehouse

    Olson, Scott A.

    2009-01-01

    This report provides estimates of flood discharges at selected recurrence intervals for streamgages in and adjacent to New Hampshire and equations for estimating flood discharges at recurrence intervals of 2-, 5-, 10-, 25-, 50-, 100-, and 500-years for ungaged, unregulated, rural streams in New Hampshire. The equations were developed using generalized least-squares regression. Flood-frequency and drainage-basin characteristics from 117 streamgages were used in developing the equations. The drainage-basin characteristics used as explanatory variables in the regression equations include drainage area, mean April precipitation, percentage of wetland area, and main channel slope. The average standard errors of prediction for estimating the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence interval flood discharges with these equations are 30.0, 30.8, 32.0, 34.2, 36.0, 38.1, and 43.4 percent, respectively. Flood discharges at selected recurrence intervals for selected streamgages were computed following the guidelines in Bulletin 17B of the U.S. Interagency Advisory Committee on Water Data. To determine the flood-discharge exceedence probabilities at streamgages in New Hampshire, a new generalized skew coefficient map covering the State was developed. The standard error of the data on the new map is 0.298. To improve estimates of flood discharges at selected recurrence intervals for 20 streamgages with short-term records (10 to 15 years), record extension using the two-station comparison technique was applied. The two-station comparison method uses data from a streamgage with a long-term record to adjust the frequency characteristics at a streamgage with a short-term record. A technique for adjusting a flood-discharge frequency curve computed from a streamgage record with results from the regression equations is described in this report. Also, a technique is described for estimating flood discharge at a selected recurrence interval for an ungaged site upstream or downstream from a streamgage using a drainage-area adjustment. The final regression equations and the flood-discharge frequency data used in this study will be available in StreamStats. StreamStats is a World Wide Web application providing automated regression-equation solutions for user-selected sites on streams.
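
    The drainage-area adjustment mentioned above is, in its simplest form, a ratio of drainage areas raised to a regression exponent. A hedged sketch, with placeholder discharges and an assumed exponent rather than values from the New Hampshire report:

```python
# Minimal sketch of a drainage-area adjustment; the exponent and discharges
# below are hypothetical, not values from this report.
def adjust_peak_flow(q_gaged, area_gaged, area_ungaged, exponent):
    """Transfer a peak discharge from a streamgage to a nearby ungaged site on
    the same stream using a drainage-area ratio raised to the regression
    exponent for drainage area."""
    return q_gaged * (area_ungaged / area_gaged) ** exponent

# Example: 100-year peak of 5,000 ft^3/s at a gage draining 120 mi^2,
# transferred to an ungaged site draining 95 mi^2 (hypothetical exponent 0.8).
print(adjust_peak_flow(5000.0, 120.0, 95.0, 0.8))
```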

  1. Estimation of Magnitude and Frequency of Floods for Streams on the Island of Oahu, Hawaii

    USGS Publications Warehouse

    Wong, Michael F.

    1994-01-01

    This report describes techniques for estimating the magnitude and frequency of floods for the island of Oahu. The log-Pearson Type III distribution and methodology recommended by the Interagency Committee on Water Data was used to determine the magnitude and frequency of floods at 79 gaging stations that had 11 to 72 years of record. Multiple regression analysis was used to construct regression equations to transfer the magnitude and frequency information from gaged sites to ungaged sites. Oahu was divided into three hydrologic regions to define relations between peak discharge and drainage-basin and climatic characteristics. Regression equations are provided to estimate the 2-, 5-, 10-, 25-, 50-, and 100-year peak discharges at ungaged sites. Significant basin and climatic characteristics included in the regression equations are drainage area, median annual rainfall, and the 2-year, 24-hour rainfall intensity. Drainage areas for sites used in this study ranged from 0.03 to 45.7 square miles. Standard error of prediction for the regression equations ranged from 34 to 62 percent. Peak-discharge data collected through water year 1988, geographic information system (GIS) technology, and generalized least-squares regression were used in the analyses. The use of GIS seems to be a more flexible and consistent means of defining and calculating basin and climatic characteristics than using manual methods. Standard errors of estimate for the regression equations in this report are an average of 8 percent less than those published in previous studies.
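
    A bare-bones illustration of the log-Pearson Type III flood-frequency fit referred to above, under simplifying assumptions: it uses made-up annual peaks, a station skew with no regional weighting, and none of the Bulletin 17B refinements such as low-outlier screening:

```python
# Simplified log-Pearson Type III frequency estimate (method of moments on
# log-transformed annual peaks); peaks are synthetic placeholders in ft^3/s.
import numpy as np
from scipy import stats

peaks = np.array([820, 1450, 2300, 990, 3100, 1750, 1200, 4100,
                  2650, 1900, 860, 2200, 3500, 1600, 2900], dtype=float)

logq = np.log10(peaks)
mean, std = logq.mean(), logq.std(ddof=1)
skew = stats.skew(logq, bias=False)          # station skew, no generalized-skew weighting

for t in (2, 5, 10, 25, 50, 100):
    p = 1.0 - 1.0 / t                        # annual non-exceedance probability
    q_t = 10 ** stats.pearson3.ppf(p, skew, loc=mean, scale=std)
    print(f"{t:>3}-year peak discharge: {q_t:,.0f} ft^3/s")
```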

  2. Data Mining Methods Applied to Flight Operations Quality Assurance Data: A Comparison to Standard Statistical Methods

    NASA Technical Reports Server (NTRS)

    Stolzer, Alan J.; Halford, Carl

    2007-01-01

    In a previous study, multiple regression techniques were applied to Flight Operations Quality Assurance-derived data to develop parsimonious model(s) for fuel consumption on the Boeing 757 airplane. The present study examined several data mining algorithms, including neural networks, on the fuel consumption problem and compared them to the multiple regression results obtained earlier. Using regression methods, parsimonious models were obtained that explained approximately 85% of the variation in fuel flow. In general data mining methods were more effective in predicting fuel consumption. Classification and Regression Tree methods reported correlation coefficients of .91 to .92, and General Linear Models and Multilayer Perceptron neural networks reported correlation coefficients of about .99. These data mining models show great promise for use in further examining large FOQA databases for operational and safety improvements.
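
    An illustrative comparison in the spirit of the study above, assuming synthetic flight variables rather than FOQA data: a multiple linear regression and a regression tree are scored by the correlation between predicted and observed fuel flow.

```python
# Toy comparison (synthetic data, not FOQA data) of multiple regression versus
# a regression tree for predicting fuel flow.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000
altitude = rng.uniform(10_000, 40_000, n)
airspeed = rng.uniform(250, 500, n)
weight = rng.uniform(150_000, 250_000, n)
fuel_flow = (0.02 * weight + 3.0 * airspeed - 0.05 * altitude
             + 0.0004 * airspeed * weight / 1000 + rng.normal(0, 300, n))

X = np.column_stack([altitude, airspeed, weight])
X_tr, X_te, y_tr, y_te = train_test_split(X, fuel_flow, random_state=0)

for name, model in [("linear regression", LinearRegression()),
                    ("regression tree", DecisionTreeRegressor(min_samples_leaf=20))]:
    pred = model.fit(X_tr, y_tr).predict(X_te)
    r = np.corrcoef(y_te, pred)[0, 1]       # correlation coefficient, as reported above
    print(f"{name}: r = {r:.3f}")
```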

  3. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis

    PubMed Central

    Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760

  4. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis.

    PubMed

    Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.
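
    A sketch of a PLS regression with correlated covariates and multiple outcomes, loosely mirroring the analysis described above; all variables and effect sizes are simulated, not the Bangladesh trial data.

```python
# PLS regression sketch: correlated maternal covariates, several newborn-size
# outcomes fitted jointly. Data are simulated placeholders.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n = 1000
mat_age = rng.normal(24, 5, n)
parity = rng.poisson(2, n)
muac = 23 + 0.05 * mat_age + rng.normal(0, 1.5, n)     # correlated with maternal age
preterm = rng.binomial(1, 0.15, n)
X = np.column_stack([mat_age, parity, muac, preterm])

birth_weight = 2800 + 25 * (muac - 23) - 350 * preterm + rng.normal(0, 300, n)
length = 47 + 0.4 * (muac - 23) - 2.0 * preterm + rng.normal(0, 2, n)
head_circ = 33 + 0.2 * (muac - 23) - 1.2 * preterm + rng.normal(0, 1.2, n)
Y = np.column_stack([birth_weight, length, head_circ])

pls = PLSRegression(n_components=2)
pls.fit(StandardScaler().fit_transform(X), StandardScaler().fit_transform(Y))
print(pls.x_scores_.shape)   # scores for the first two PLS components
print(pls.coef_.shape)       # standardized coefficients for each outcome
```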

  5. The Highly Adaptive Lasso Estimator

    PubMed Central

    Benkeser, David; van der Laan, Mark

    2017-01-01

    Estimation of a regression function is a common goal of statistical learning. We propose a novel nonparametric regression estimator that, in contrast to many existing methods, does not rely on local smoothness assumptions, nor is it constructed using local smoothing techniques. Instead, our estimator respects global smoothness constraints by virtue of falling in a class of right-hand continuous functions with left-hand limits that have variation norm bounded by a constant. Using empirical process theory, we establish a fast minimal rate of convergence of our proposed estimator and illustrate how such an estimator can be constructed using standard software. In simulations, we show that the finite-sample performance of our estimator is competitive with other popular machine learning techniques across a variety of data generating mechanisms. We also illustrate competitive performance in real data examples using several publicly available data sets. PMID:29094111

  6. Hyperspectral face recognition with spatiospectral information fusion and PLS regression.

    PubMed

    Uzair, Muhammad; Mahmood, Arif; Mian, Ajmal

    2015-03-01

    Hyperspectral imaging offers new opportunities for face recognition via improved discrimination along the spectral dimension. However, it poses new challenges, including low signal-to-noise ratio, interband misalignment, and high data dimensionality. Due to these challenges, the literature on hyperspectral face recognition is not only sparse but is limited to ad hoc dimensionality reduction techniques and lacks comprehensive evaluation. We propose a hyperspectral face recognition algorithm using a spatiospectral covariance for band fusion and partial least square regression for classification. Moreover, we extend 13 existing face recognition techniques, for the first time, to perform hyperspectral face recognition. We formulate hyperspectral face recognition as an image-set classification problem and evaluate the performance of seven state-of-the-art image-set classification techniques. We also test six state-of-the-art grayscale and RGB (color) face recognition algorithms after applying fusion techniques on hyperspectral images. Comparison with the 13 extended and five existing hyperspectral face recognition techniques on three standard data sets shows that the proposed algorithm outperforms all by a significant margin. Finally, we perform band selection experiments to find the most discriminative bands in the visible and near infrared response spectrum.

  7. Body Composition of Bangladeshi Children: Comparison and Development of Leg-to-Leg Bioelectrical Impedance Equation

    PubMed Central

    Khan, I.; Hawlader, Sophie Mohammad Delwer Hossain; Arifeen, Shams El; Moore, Sophie; Hills, Andrew P.; Wells, Jonathan C.; Persson, Lars-Åke; Kabir, Iqbal

    2012-01-01

    The aim of this study was to investigate the validity of the Tanita TBF 300A leg-to-leg bioimpedance analyzer for estimating fat-free mass (FFM) in Bangladeshi children aged 4-10 years and to develop novel prediction equations for use in this population, using deuterium dilution as the reference method. Two hundred Bangladeshi children were enrolled. The isotope dilution technique with deuterium oxide was used for estimation of total body water (TBW). FFM estimated by Tanita was compared with results of the deuterium oxide dilution technique. Novel prediction equations were created for estimating FFM, using linear regression models, fitting the child's height and impedance as predictors. There was a significant difference in FFM and percentage of body fat (BF%) between methods (p<0.01), with Tanita underestimating TBW in boys (p=0.001) and underestimating BF% in girls (p<0.001). A basic linear regression model with height and impedance explained 83% of the variance in FFM estimated by the deuterium oxide dilution technique. The best-fit equation to predict FFM from linear regression modelling was achieved by adding weight, sex, and age to the basic model, bringing the adjusted R2 to 89% (standard error=0.90, p<0.001). These data suggest the Tanita analyzer may be a valid field-assessment technique in Bangladeshi children when using population-specific prediction equations, such as the ones developed here. PMID:23082630
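
    An illustrative fit of a population-specific FFM prediction equation of the kind described above; the predictors match those named in the abstract, but the data and coefficients are simulated, not the study's.

```python
# Sketch: linear regression predicting reference FFM from height, impedance,
# weight, sex, and age. All values are synthetic placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
height = rng.uniform(95, 145, n)          # cm
impedance = rng.uniform(600, 900, n)      # ohms
weight = rng.uniform(12, 35, n)           # kg
sex = rng.binomial(1, 0.5, n)             # 1 = boy, 0 = girl
age = rng.uniform(4, 10, n)               # years
ffm = (0.6 * height ** 2 / impedance + 0.25 * weight
       + 0.6 * sex + 0.3 * age + rng.normal(0, 0.9, n))   # reference FFM (kg)

X = sm.add_constant(np.column_stack([height, impedance, weight, sex, age]))
fit = sm.OLS(ffm, X).fit()
print(fit.rsquared_adj)        # analogous to the adjusted R^2 reported above
print(fit.params)              # intercept and coefficients of the prediction equation
```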

  8. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  9. Support vector methods for survival analysis: a comparison between ranking and regression approaches.

    PubMed

    Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K

    2011-10-01

    To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. In a second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategy, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparison of the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches for survival data. We compare svm-based survival models based on ranking constraints, based on regression constraints and models based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significantly different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints above models only based on ranking constraints. This work gives empirical evidence that svm-based models using regression constraints perform significantly better than svm-based models based on ranking constraints. Our experiments show a comparable performance for methods including only regression or both regression and ranking constraints on clinical data. On high dimensional data, the former model performs better. However, this approach does not have a theoretical link with standard statistical models for survival data. This link can be made by means of transformation models when ranking constraints are included. Copyright © 2011 Elsevier B.V. All rights reserved.
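
    The concordance index, the first of the three evaluation measures listed above, can be computed directly from observed times, censoring indicators, and a prognostic (risk) score. A minimal sketch with toy values:

```python
# Minimal concordance-index computation for right-censored survival data.
import numpy as np

def concordance_index(time, event, risk_score):
    """Fraction of usable pairs in which the subject with the shorter observed
    survival time has the higher risk score. `event` is 1 for an observed
    event and 0 for a censored observation; ties in risk count as 0.5."""
    concordant, usable = 0.0, 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            # a pair is usable only if the earlier time corresponds to an event
            if time[i] < time[j] and event[i] == 1:
                usable += 1
                if risk_score[i] > risk_score[j]:
                    concordant += 1.0
                elif risk_score[i] == risk_score[j]:
                    concordant += 0.5
    return concordant / usable

time = np.array([5.0, 8.0, 3.0, 12.0, 7.0])
event = np.array([1, 0, 1, 1, 0])
risk = np.array([2.1, 0.5, 3.0, 0.2, 1.0])   # higher score = predicted worse prognosis
print(concordance_index(time, event, risk))
```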

  10. Fluoroscopic removal of retrievable self-expandable metal stents in patients with malignant oesophageal strictures: Experience with a non-endoscopic removal system.

    PubMed

    Kim, Pyeong Hwa; Song, Ho-Young; Park, Jung-Hoon; Zhou, Wei-Zhong; Na, Han Kyu; Cho, Young Chul; Jun, Eun Jung; Kim, Jun Ki; Kim, Guk Bae

    2017-03-01

    To evaluate clinical outcomes of fluoroscopic removal of retrievable self-expandable metal stents (SEMSs) for malignant oesophageal strictures, to compare clinical outcomes of three different removal techniques, and to identify predictive factors of successful removal by the standard technique (primary technical success). A total of 137 stents were removed from 128 patients with malignant oesophageal strictures. Primary overall technical success and removal-related complications were evaluated. Logistic regression models were constructed to identify predictive factors of primary technical success. Primary technical success rate was 78.8 % (108/137). Complications occurred in six (4.4 %) cases. Stent location in the upper oesophagus (P=0.004), stricture length over 8 cm (P=0.030), and proximal granulation tissue (P<0.001) were negative predictive factors of primary technical success. If granulation tissue was present at the proximal end, eversion technique was more frequently required (P=0.002). Fluoroscopic removal of retrievable SEMSs for malignant oesophageal strictures using three different removal techniques appeared to be safe and easy. The standard technique is safe and effective in the majority of patients. The presence of proximal granulation tissue, stent location in the upper oesophagus, and stricture length over 8 cm were negative predictive factors for primary technical success by standard extraction and may require a modified removal technique. • Fluoroscopic retrievable SEMS removal is safe and effective. • Standard removal technique by traction is effective in the majority of patients. • Three negative predictive factors of primary technical success were identified. • Caution should be exercised during the removal in those situations. • Eversion technique is effective in cases of proximal granulation tissue.

  11. RESULTS OF COMPUTATIONS MADE FOR DASA-USNRDL FALLOUT SYMPOSIUM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Read, R.; Wagner, L.; Moorehead, E.

    1962-11-01

    The regression techniques introduced by the Civil Defense Research Project for estimating fallout particle deposition coordinates, their standard ellipses, and isointensity contours have been applied to some of the homework problems assigned for the DASA-USNRDL Fallout Symposium. The results are reported and the estimates are contrasted with estimates based on the assumption that winds are invariant with time. (auth).

  12. Strain-gage bridge calibration and flight loads measurements on a low-aspect-ratio thin wing

    NASA Technical Reports Server (NTRS)

    Peele, E. L.; Eckstrom, C. V.

    1975-01-01

    Strain-gage bridges were used to make in-flight measurements of bending moment, shear, and torque loads on a low-aspect-ratio, thin, swept wing having a full depth honeycomb sandwich type structure. Standard regression analysis techniques were employed in the calibration of the strain bridges. Comparisons of the measured loads with theoretical loads are included.

  13. Multiple Linear Regression Analysis of Factors Affecting Real Property Price Index From Case Study Research In Istanbul/Turkey

    NASA Astrophysics Data System (ADS)

    Denli, H. H.; Koc, Z.

    2015-12-01

    Valuation of real properties based on standards is difficult to apply consistently across time and location. Regression analysis constructs mathematical models which describe or explain relationships that may exist between variables. The problem of identifying price differences of properties to obtain a price index can be converted into a regression problem, and standard techniques of regression analysis can be used to estimate the index. When regression analysis is applied to real estate valuation using the characteristics and quantifiers observed in the current market, it helps identify the factors or variables that are effective in the formation of value. In this study, prices of housing for sale in Zeytinburnu, a district in Istanbul, are associated with their characteristics to find a price index, based on information received from a real estate web page. The variables used for the analysis are age, size in m2, number of floors in the building, the floor on which the unit is located, and number of rooms. The price of the estate represents the dependent variable, whereas the rest are independent variables. Prices from 60 real estates have been used for the analysis. Locations with the same price values were identified and plotted on the map, and equivalence curves were drawn to delineate zones of equal value.
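
    A hedged sketch of the multiple linear regression described above; the column names and listing values are invented for illustration, not the Zeytinburnu data.

```python
# Hedonic price regression sketch: price modeled on age, size, floors in the
# building, floor of the unit, and number of rooms. Data are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

listings = pd.DataFrame({
    "price":        [310_000, 255_000, 420_000, 198_000, 365_000,
                     275_000, 330_000, 240_000, 385_000, 215_000],
    "age":          [12, 25, 3, 30, 8, 18, 10, 22, 5, 28],   # building age, years
    "size_m2":      [95, 80, 130, 70, 115, 90, 100, 75, 120, 72],
    "floors_total": [6, 5, 10, 4, 8, 6, 7, 5, 9, 4],
    "floor":        [3, 2, 7, 1, 5, 4, 4, 2, 6, 1],
    "rooms":        [3, 2, 4, 2, 3, 3, 3, 2, 4, 2],
})

model = smf.ols("price ~ age + size_m2 + floors_total + floor + rooms",
                data=listings).fit()
print(model.params)   # contribution of each attribute to the price index
```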

  14. Analysis of Binary Adherence Data in the Setting of Polypharmacy: A Comparison of Different Approaches

    PubMed Central

    Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.

    2009-01-01

    Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
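
    A sketch of the GEE approach recommended above for per-medication binary adherence clustered within person, assuming a simulated data frame rather than the North Carolina study data:

```python
# GEE with a binomial family and exchangeable working correlation for binary
# adherence measured on multiple medications per person. Data are simulated.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
rows = []
for person in range(150):
    age = rng.uniform(65, 90)
    n_meds = rng.integers(2, 9)                       # medications per person
    base = -0.5 + 0.02 * (age - 75) + rng.normal(0, 0.8)   # person-level effect
    for _ in range(n_meds):
        p = 1 / (1 + np.exp(-(base + rng.normal(0, 0.3))))
        rows.append({"person": person, "age": age, "n_meds": n_meds,
                     "adherent": rng.binomial(1, p)})
data = pd.DataFrame(rows)

gee = smf.gee("adherent ~ age + n_meds", groups="person", data=data,
              family=sm.families.Binomial(),
              cov_struct=sm.cov_struct.Exchangeable()).fit()
print(gee.summary())
```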

  15. Regionalization of harmonic-mean streamflows in Kentucky

    USGS Publications Warehouse

    Martin, Gary R.; Ruhl, Kevin J.

    1993-01-01

    Harmonic-mean streamflow (Qh), defined as the reciprocal of the arithmetic mean of the reciprocal daily streamflow values, was determined for selected stream sites in Kentucky. Daily mean discharges for the available period of record through the 1989 water year at 230 continuous record streamflow-gaging stations located in and adjacent to Kentucky were used in the analysis. Periods of record affected by regulation were identified and analyzed separately from periods of record unaffected by regulation. Record-extension procedures were applied to short-term stations to reduce time-sampling error and, thus, improve estimates of the long-term Qh. Techniques to estimate the Qh at ungaged stream sites in Kentucky were developed. A regression model relating Qh to total drainage area and streamflow-variability index was presented with example applications. The regression model has a standard error of estimate of 76 percent and a standard error of prediction of 78 percent.
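
    The definition above translates directly into a one-line computation; the daily discharges below are placeholder values.

```python
# Harmonic-mean streamflow Qh: reciprocal of the arithmetic mean of the
# reciprocal daily flows (ft^3/s). Zero-flow days would make Qh undefined and
# need separate handling.
import numpy as np

daily_q = np.array([12.0, 8.5, 30.2, 55.0, 4.1, 6.3, 18.7, 9.9])
qh = 1.0 / np.mean(1.0 / daily_q)
print(qh)   # the harmonic mean is dominated by the low-flow days
```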

  16. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

    PubMed

    Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

    2016-12-20

    Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  17. Guidelines and Procedures for Computing Time-Series Suspended-Sediment Concentrations and Loads from In-Stream Turbidity-Sensor and Streamflow Data

    USGS Publications Warehouse

    Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.

    2009-01-01

    In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
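
    A simplified sketch of the model-selection step described above: fit a turbidity-only regression for suspended-sediment concentration (SSC), then a turbidity-plus-streamflow model, and keep the second only if streamflow is statistically significant and reduces the model error. The data are synthetic, and the error metric used here (residual standard error of the log10 model) is a stand-in for the report's model standard percentage error, not its exact formula.

```python
# Simple versus multiple linear regression for computing SSC from turbidity
# (and optionally streamflow); all data are synthetic placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 120
turbidity = rng.uniform(5, 400, n)            # formazin nephelometric units
streamflow = rng.uniform(50, 3000, n)         # ft^3/s
log_ssc = (0.2 + 0.9 * np.log10(turbidity)
           + 0.1 * np.log10(streamflow) + rng.normal(0, 0.08, n))

X1 = sm.add_constant(np.log10(turbidity))
X2 = sm.add_constant(np.column_stack([np.log10(turbidity), np.log10(streamflow)]))
simple = sm.OLS(log_ssc, X1).fit()
multiple = sm.OLS(log_ssc, X2).fit()

use_multiple = (multiple.pvalues[-1] < 0.05
                and np.sqrt(multiple.mse_resid) < np.sqrt(simple.mse_resid))
chosen = multiple if use_multiple else simple
ssc_series = 10 ** chosen.predict(X2 if use_multiple else X1)   # SSC time series
print("streamflow variable retained:", use_multiple)
```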

  18. Deep learning ensemble with asymptotic techniques for oscillometric blood pressure estimation.

    PubMed

    Lee, Soojeong; Chang, Joon-Hyuk

    2017-11-01

    This paper proposes a deep learning based ensemble regression estimator with asymptotic techniques, and offers a method that can decrease uncertainty for oscillometric blood pressure (BP) measurements using the bootstrap and Monte-Carlo approach. While the former is used to estimate SBP and DBP, the latter attempts to determine confidence intervals (CIs) for SBP and DBP based on oscillometric BP measurements. This work originally employs deep belief networks (DBN)-deep neural networks (DNN) to effectively estimate BPs based on oscillometric measurements. However, there are some inherent problems with these methods. First, it is not easy to determine the best DBN-DNN estimator, and worthy information might be omitted when selecting one DBN-DNN estimator and discarding the others. Additionally, our input feature vectors, obtained from only five measurements per subject, represent a very small sample size; this is a critical weakness when using the DBN-DNN technique and can cause overfitting or underfitting, depending on the structure of the algorithm. To address these problems, an ensemble with an asymptotic approach (based on combining the bootstrap with the DBN-DNN technique) is utilized to generate the pseudo features needed to estimate the SBP and DBP. In the first stage, the bootstrap-aggregation technique is used to create ensemble parameters. Afterward, the AdaBoost approach is employed for the second-stage SBP and DBP estimation. We then use the bootstrap and Monte-Carlo techniques in order to determine the CIs based on the target BP estimated using the DBN-DNN ensemble regression estimator with the asymptotic technique in the third stage. The proposed method can mitigate estimation uncertainty, such as a large standard deviation of error (SDE). Comparing the proposed DBN-DNN ensemble regression estimator with the DBN-DNN single regression estimator, we identify that the SDEs of the SBP and DBP are reduced by 0.58 and 0.57 mmHg, respectively. These results indicate that the proposed method actually enhances the performance by 9.18% and 10.88% compared with the DBN-DNN single estimator. The proposed methodology improves the accuracy of BP estimation and reduces the uncertainty for BP estimation. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Proposed standard-weight equations for brook trout

    USGS Publications Warehouse

    Hyatt, M.W.; Hubert, W.A.

    2001-01-01

    Weight and length data were obtained for 113 populations of brook trout Salvelinus fontinalis across the species' geographic range in North America to estimate a standard-weight (Ws) equation for this species. Estimation was done by applying the regression-line-percentile technique to fish of 120-620 mm total length (TL). The proposed metric-unit (g and mm) equation is log10Ws = -5.186 + 3.103 log10TL; the English-unit (lb and in) equivalent is log10Ws = -3.483 + 3.103 log10TL. No systematic length bias was evident in the relative-weight values calculated from these equations.
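
    A worked example of the proposed metric-unit standard-weight equation above, log10(Ws) = -5.186 + 3.103 log10(TL), applied to a hypothetical fish to obtain a relative-weight condition index:

```python
# Standard weight Ws and relative weight Wr = 100 * W / Ws for brook trout,
# using the proposed metric-unit equation; the measured fish is hypothetical.
import math

def standard_weight_g(total_length_mm):
    """Standard weight (g) from the proposed brook trout Ws equation;
    applicable to fish of 120-620 mm TL."""
    return 10 ** (-5.186 + 3.103 * math.log10(total_length_mm))

tl_mm, weight_g = 300.0, 310.0          # hypothetical measured fish
ws = standard_weight_g(tl_mm)
wr = 100.0 * weight_g / ws              # relative weight (condition index)
print(round(ws, 1), round(wr, 1))
```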

  20. Extending the Distributed Lag Model framework to handle chemical mixtures.

    PubMed

    Bello, Ghalib A; Arora, Manish; Austin, Christine; Horton, Megan K; Wright, Robert O; Gennings, Chris

    2017-07-01

    Distributed Lag Models (DLMs) are used in environmental health studies to analyze the time-delayed effect of an exposure on an outcome of interest. Given the increasing need for analytical tools for evaluation of the effects of exposure to multi-pollutant mixtures, this study attempts to extend the classical DLM framework to accommodate and evaluate multiple longitudinally observed exposures. We introduce 2 techniques for quantifying the time-varying mixture effect of multiple exposures on an outcome of interest. Lagged WQS, the first technique, is based on Weighted Quantile Sum (WQS) regression, a penalized regression method that estimates mixture effects using a weighted index. We also introduce Tree-based DLMs, a nonparametric alternative for assessment of lagged mixture effects. This technique is based on the Random Forest (RF) algorithm, a nonparametric, tree-based estimation technique that has shown excellent performance in a wide variety of domains. In a simulation study, we tested the feasibility of these techniques and evaluated their performance in comparison to standard methodology. Both methods exhibited relatively robust performance, accurately capturing pre-defined non-linear functional relationships in different simulation settings. Further, we applied these techniques to data on perinatal exposure to environmental metal toxicants, with the goal of evaluating the effects of exposure on neurodevelopment. Our methods identified critical neurodevelopmental windows showing significant sensitivity to metal mixtures. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Methods for trend analysis: Examples with problem/failure data

    NASA Technical Reports Server (NTRS)

    Church, Curtis K.

    1989-01-01

    Statistics play an important role in quality control and reliability. Consequently, the Trend Analysis Techniques standard recommended a variety of statistical methodologies that could be applied to time series data. The major goal of the working handbook, using data from the MSFC Problem Assessment System, is to illustrate some of the techniques in the NASA standard and some different techniques, and to identify patterns in the data. The techniques used for trend estimation are regression (exponential, power, reciprocal, straight line) and Kendall's rank correlation coefficient. The important details of a statistical strategy for estimating a trend component are covered in the examples. However, careful analysis and interpretation are necessary because of small samples and frequent zero problem reports in a given time period. Further investigations to deal with these issues are being conducted.
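
    A sketch of two of the trend-estimation ingredients named above, Kendall's rank correlation as a monotone-trend test and an exponential regression fit; the monthly problem-report counts are invented.

```python
# Kendall's tau trend test plus an exponential trend fit on synthetic monthly
# problem-report counts.
import numpy as np
from scipy import stats

counts = np.array([14, 11, 12, 9, 10, 8, 9, 7, 6, 7, 5, 4], dtype=float)
months = np.arange(1, len(counts) + 1)

tau, p_value = stats.kendalltau(months, counts)
print(f"Kendall tau = {tau:.2f}, p = {p_value:.3f}")   # negative tau => downward trend

# Exponential trend: fit log(counts) = a + b*t by ordinary least squares.
# This simple transform requires strictly positive counts; months with zero
# reports would need special handling, as the record notes.
b, a = np.polyfit(months, np.log(counts), 1)
print(f"estimated monthly change: {100 * (np.exp(b) - 1):.1f}%")
```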

  2. The impact of global signal regression on resting state correlations: Are anti-correlated networks introduced?

    PubMed Central

    Murphy, Kevin; Birn, Rasmus M.; Handwerker, Daniel A.; Jones, Tyler B.; Bandettini, Peter A.

    2009-01-01

    Low-frequency fluctuations in fMRI signal have been used to map several consistent resting state networks in the brain. Using the posterior cingulate cortex as a seed region, functional connectivity analyses have found not only positive correlations in the default mode network but negative correlations in another resting state network related to attentional processes. The interpretation is that the human brain is intrinsically organized into dynamic, anti-correlated functional networks. Global variations of the BOLD signal are often considered nuisance effects and are commonly removed using a general linear model (GLM) technique. This global signal regression method has been shown to introduce negative activation measures in standard fMRI analyses. The topic of this paper is whether such a correction technique could be the cause of anti-correlated resting state networks in functional connectivity analyses. Here we show that, after global signal regression, correlation values to a seed voxel must sum to a negative value. Simulations also show that small phase differences between regions can lead to spurious negative correlation values. A combination breath holding and visual task demonstrates that the relative phase of global and local signals can affect connectivity measures and that, experimentally, global signal regression leads to bell-shaped correlation value distributions, centred on zero. Finally, analyses of negatively correlated networks in resting state data show that global signal regression is most likely the cause of anti-correlations. These results call into question the interpretation of negatively correlated regions in the brain when using global signal regression as an initial processing step. PMID:18976716

  3. The impact of global signal regression on resting state correlations: are anti-correlated networks introduced?

    PubMed

    Murphy, Kevin; Birn, Rasmus M; Handwerker, Daniel A; Jones, Tyler B; Bandettini, Peter A

    2009-02-01

    Low-frequency fluctuations in fMRI signal have been used to map several consistent resting state networks in the brain. Using the posterior cingulate cortex as a seed region, functional connectivity analyses have found not only positive correlations in the default mode network but negative correlations in another resting state network related to attentional processes. The interpretation is that the human brain is intrinsically organized into dynamic, anti-correlated functional networks. Global variations of the BOLD signal are often considered nuisance effects and are commonly removed using a general linear model (GLM) technique. This global signal regression method has been shown to introduce negative activation measures in standard fMRI analyses. The topic of this paper is whether such a correction technique could be the cause of anti-correlated resting state networks in functional connectivity analyses. Here we show that, after global signal regression, correlation values to a seed voxel must sum to a negative value. Simulations also show that small phase differences between regions can lead to spurious negative correlation values. A combination breath holding and visual task demonstrates that the relative phase of global and local signals can affect connectivity measures and that, experimentally, global signal regression leads to bell-shaped correlation value distributions, centred on zero. Finally, analyses of negatively correlated networks in resting state data show that global signal regression is most likely the cause of anti-correlations. These results call into question the interpretation of negatively correlated regions in the brain when using global signal regression as an initial processing step.
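
    A small simulation in the spirit of the result above: after regressing the global (mean) signal out of every voxel, correlations between a seed voxel and the remaining voxels sum to a negative value even though no anti-correlated network was simulated. The data are synthetic and the setup is deliberately minimal.

```python
# Demonstration of the mathematical consequence of global signal regression
# described above, on synthetic "voxel" time series.
import numpy as np

rng = np.random.default_rng(6)
n_vox, n_t = 200, 300
global_sig = rng.normal(0, 1, n_t)
voxels = 0.5 * global_sig + rng.normal(0, 1, (n_vox, n_t))   # shared + private signal

# Global signal regression: remove each voxel's best fit to the mean time series.
g = voxels.mean(axis=0)
beta = voxels @ g / (g @ g)
residuals = voxels - np.outer(beta, g)

seed = residuals[0]
corrs = np.array([np.corrcoef(seed, residuals[i])[0, 1] for i in range(1, n_vox)])
print(corrs.sum())          # negative, consistent with the derivation above
print((corrs < 0).mean())   # a sizeable fraction of voxels now appear anti-correlated
```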

  4. Analysis of the Magnitude and Frequency of Peak Discharge and Maximum Observed Peak Discharge in New Mexico and Surrounding Areas

    USGS Publications Warehouse

    Waltemeyer, Scott D.

    2008-01-01

    Estimates of the magnitude and frequency of peak discharges are necessary for the reliable design of bridges, culverts, and open-channel hydraulic analysis, and for flood-hazard mapping in New Mexico and surrounding areas. The U.S. Geological Survey, in cooperation with the New Mexico Department of Transportation, updated estimates of peak-discharge magnitude for gaging stations in the region and updated regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites by use of data collected through 2004 for 293 gaging stations on unregulated streams that have 10 or more years of record. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 140 of the 293 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated the peak discharge by having a recurrence interval of less than 1.4 years in the probability-density function. Within each of the nine regions, logarithms of the maximum peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics by using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then were applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 38 to 93 percent (mean value is 62, and median value is 59) for the 100-year flood. The 1996 investigation standard error of prediction for the flood regions ranged from 41 to 96 percent (mean value is 67, and median value is 68) for the 100-year flood that was analyzed by using generalized least-squares regression analysis. Overall, the equations based on generalized least-squares regression techniques are more reliable than those in the 1996 report because of the increased length of record and improved geographic information system (GIS) method to determine basin and climatic characteristics. Flood-frequency estimates can be made for ungaged sites upstream or downstream from gaging stations by using a method that transfers flood-frequency data at the gaging station to the ungaged site by using a drainage-area ratio adjustment equation. The peak discharge for a given recurrence interval at the gaging station, drainage-area ratio, and the drainage-area exponent from the regional regression equation of the respective region is used to transfer the peak discharge for the recurrence interval to the ungaged site. Maximum observed peak discharge as related to drainage area was determined for New Mexico. Extreme events are commonly used in the design and appraisal of bridge crossings and other structures. Bridge-scour evaluations are commonly made by using the 500-year peak discharge for these appraisals. 
Peak-discharge data collected at 293 gaging stations and 367 miscellaneous sites were used to develop a maximum peak-discharge relation as an alternative method of estimating peak discharge of an extreme event such as a maximum probable flood.

  5. Lateral-Directional Parameter Estimation on the X-48B Aircraft Using an Abstracted, Multi-Objective Effector Model

    NASA Technical Reports Server (NTRS)

    Ratnayake, Nalin A.; Waggoner, Erin R.; Taylor, Brian R.

    2011-01-01

    The problem of parameter estimation on hybrid-wing-body aircraft is complicated by the fact that many design candidates for such aircraft involve a large number of aerodynamic control effectors that act in coplanar motion. This adds to the complexity already present in the parameter estimation problem for any aircraft with a closed-loop control system. Decorrelation of flight and simulation data must be performed in order to ascertain individual surface derivatives with any sort of mathematical confidence. Non-standard control surface configurations, such as clamshell surfaces and drag-rudder modes, further complicate the modeling task. In this paper, time-decorrelation techniques are applied to a model structure selected through stepwise regression for simulated and flight-generated lateral-directional parameter estimation data. A virtual effector model that uses mathematical abstractions to describe the multi-axis effects of clamshell surfaces is developed and applied. Comparisons are made between time history reconstructions and observed data in order to assess the accuracy of the regression model. The Cramér-Rao lower bounds of the estimated parameters are used to assess the uncertainty of the regression model relative to alternative models. Stepwise regression was found to be a useful technique for lateral-directional model design for hybrid-wing-body aircraft, as suggested by available flight data. Based on the results of this study, linear regression parameter estimation methods using abstracted effectors are expected to perform well for hybrid-wing-body aircraft properly equipped for the task.

  6. Comparison of Predictive Modeling Methods of Aircraft Landing Speed

    NASA Technical Reports Server (NTRS)

    Diallo, Ousmane H.

    2012-01-01

    Expected increases in air traffic demand have stimulated the development of air traffic control tools intended to assist the air traffic controller in accurately and precisely spacing aircraft landing at congested airports. Such tools will require an accurate landing-speed prediction to increase throughput while decreasing necessary controller interventions for avoiding separation violations. There are many practical challenges to developing an accurate landing-speed model that has acceptable prediction errors. This paper discusses the development of a near-term implementation, using readily available information, to estimate/model final approach speed from the top of the descent phase of flight to the landing runway. As a first approach, all variables found to contribute directly to the landing-speed prediction model are used to build a multi-regression technique of the response surface equation (RSE). Data obtained from operations of a major airline for a passenger transport aircraft type to the Dallas/Fort Worth International Airport are used to predict the landing speed. The approach was promising because it decreased the standard deviation of the landing-speed error prediction by at least 18% from the standard deviation of the baseline error, depending on the gust condition at the airport. However, when the number of variables is reduced to the most likely obtainable at other major airports, the RSE model shows little improvement over the existing methods. Consequently, a neural network that relies on a nonlinear regression technique is utilized as an alternative modeling approach. For the reduced number of variables cases, the standard deviation of the neural network models' errors represents an over 5% reduction compared to the RSE model errors, and at least a 10% reduction over the baseline predicted landing-speed error standard deviation. Overall, the constructed models predict the landing-speed more accurately and precisely than the current state-of-the-art.

  7. An Improved Framework for Confound Regression and Filtering for Control of Motion Artifact in the Preprocessing of Resting-State Functional Connectivity Data

    PubMed Central

    Satterthwaite, Theodore D.; Elliott, Mark A.; Gerraty, Raphael T.; Ruparel, Kosha; Loughead, James; Calkins, Monica E.; Eickhoff, Simon B.; Hakonarson, Hakon; Gur, Ruben C.; Gur, Raquel E.; Wolf, Daniel H.

    2013-01-01

    Several recent reports in large, independent samples have demonstrated the influence of motion artifact on resting-state functional connectivity MRI (rsfc-MRI). Standard rsfc-MRI preprocessing typically includes regression of confounding signals and band-pass filtering. However, substantial heterogeneity exists in how these techniques are implemented across studies, and no prior study has examined the effect of differing approaches for the control of motion-induced artifacts. To better understand how in-scanner head motion affects rsfc-MRI data, we describe the spatial, temporal, and spectral characteristics of motion artifacts in a sample of 348 adolescents. Analyses utilize a novel approach for describing head motion on a voxelwise basis. Next, we systematically evaluate the efficacy of a range of confound regression and filtering techniques for the control of motion-induced artifacts. Results reveal that the effectiveness of preprocessing procedures on the control of motion is heterogeneous, and that improved preprocessing provides a substantial benefit beyond typical procedures. These results demonstrate that the effect of motion on rsfc-MRI can be substantially attenuated through improved preprocessing procedures, but not completely removed. PMID:22926292

  8. Liquid scintillation counting for ¹⁴C uptake of single algal cells isolated from natural samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rivkin, R.B.; Seliger, H.H.

    1981-07-01

    Short-term rates of ¹⁴C uptake for single cells and small numbers of isolated algal cells of five phytoplankton species from natural populations were measured by liquid scintillation counting. Regression analysis of uptake rates per cell for cells isolated from unialgal cultures of seven species of dinoflagellates, ranging in volume from ca. 10³ to 10⁷ μm³, gave results identical to uptake rates per cell measured by conventional ¹⁴C techniques. Relative standard errors of regression coefficients ranged between 3 and 10%, indicating that for any species there was little variation in photosynthesis per cell.

  9. Application of a parameter-estimation technique to modeling the regional aquifer underlying the eastern Snake River plain, Idaho

    USGS Publications Warehouse

    Garabedian, Stephen P.

    1986-01-01

    A nonlinear, least-squares regression technique for the estimation of ground-water flow model parameters was applied to the regional aquifer underlying the eastern Snake River Plain, Idaho. The technique uses a computer program to simulate two-dimensional, steady-state ground-water flow. Hydrologic data for the 1980 water year were used to calculate recharge rates, boundary fluxes, and spring discharges. Ground-water use was estimated from irrigated land maps and crop consumptive-use figures. These estimates of ground-water withdrawal, recharge rates, and boundary flux, along with leakance, were used as known values in the model calibration of transmissivity. Leakance values were adjusted between regression solutions by comparing model-calculated to measured spring discharges. In other simulations, recharge and leakance also were calibrated as prior-information regression parameters, which limits the variation of these parameters using a normalized standard error of estimate. Results from a best-fit model indicate a wide areal range in transmissivity from about 0.05 to 44 feet squared per second and in leakance from about 2.2×10⁻⁹ to 6.0×10⁻⁸ feet per second per foot. Along with parameter values, model statistics also were calculated, including the coefficient of correlation between calculated and observed head (0.996), the standard error of the estimates for head (40 feet), and the parameter coefficients of variation (about 10-40 percent). Additional boundary flux was added in some areas during calibration to achieve proper fit to ground-water flow directions. Model fit improved significantly when areas that violated model assumptions were removed. It also improved slightly when y-direction (northwest-southeast) transmissivity values were larger than x-direction (northeast-southwest) transmissivity values. The model was most sensitive to changes in recharge, and in some areas, to changes in transmissivity, particularly near the spring discharge area from Milner Dam to King Hill.

  10. Estimating the concrete compressive strength using hard clustering and fuzzy clustering based regression techniques.

    PubMed

    Nagwani, Naresh Kumar; Deo, Shirish V

    2014-01-01

    Understanding of the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, and proportioning new mixtures and for the quality assurance. Regression techniques are most widely used for prediction tasks where relationship between the independent variables and dependent (prediction) variable is identified. The accuracy of the regression techniques for prediction can be improved if clustering can be used along with regression. Clustering along with regression will ensure the more accurate curve fitting between the dependent and independent variables. In this work cluster regression technique is applied for estimating the compressive strength of the concrete and a novel state of the art is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression ensures less prediction errors for estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group the similar characteristics concrete data and then in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. It is found from experiments that clustering along with regression techniques gives minimum errors for predicting compressive strength of concrete; also fuzzy clustering algorithm C-means performs better than K-means algorithm.

  11. Estimating the Concrete Compressive Strength Using Hard Clustering and Fuzzy Clustering Based Regression Techniques

    PubMed Central

    Nagwani, Naresh Kumar; Deo, Shirish V.

    2014-01-01

    Understanding of the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, and proportioning new mixtures and for the quality assurance. Regression techniques are most widely used for prediction tasks where relationship between the independent variables and dependent (prediction) variable is identified. The accuracy of the regression techniques for prediction can be improved if clustering can be used along with regression. Clustering along with regression will ensure the more accurate curve fitting between the dependent and independent variables. In this work cluster regression technique is applied for estimating the compressive strength of the concrete and a novel state of the art is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression ensures less prediction errors for estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group the similar characteristics concrete data and then in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. It is found from experiments that clustering along with regression techniques gives minimum errors for predicting compressive strength of concrete; also fuzzy clustering algorithm C-means performs better than K-means algorithm. PMID:25374939
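
    A sketch of the two-stage cluster-then-regress idea described in the two records above, using K-means for the clustering stage (the papers also evaluate fuzzy C-means, which is not shown here). The concrete-mix data are simulated placeholders.

```python
# Stage 1: cluster mixtures with similar characteristics; stage 2: fit a
# separate regression within each cluster and predict locally.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
n = 300
cement = rng.uniform(150, 500, n)        # kg/m^3
water = rng.uniform(120, 230, n)         # kg/m^3
age = rng.choice([7, 28, 90], n)         # days
strength = 0.08 * cement - 0.15 * water + 0.2 * age + rng.normal(0, 4, n)  # MPa
X = np.column_stack([cement, water, age])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

models = {k: LinearRegression().fit(X[labels == k], strength[labels == k])
          for k in np.unique(labels)}
pred = np.array([models[k].predict(x.reshape(1, -1))[0] for x, k in zip(X, labels)])
rmse = np.sqrt(np.mean((strength - pred) ** 2))
print(f"within-cluster regression RMSE: {rmse:.2f} MPa")
```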

  12. PM10 modeling in the Oviedo urban area (Northern Spain) by using multivariate adaptive regression splines

    NASA Astrophysics Data System (ADS)

    Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza

    2014-10-01

    The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of these numerical calculations, using the multivariate adaptive regression splines (MARS) technique, conclusions of this research work are exposed.

  13. System Identification Applied to Dynamic CFD Simulation and Wind Tunnel Data

    NASA Technical Reports Server (NTRS)

    Murphy, Patrick C.; Klein, Vladislav; Frink, Neal T.; Vicroy, Dan D.

    2011-01-01

    Demanding aerodynamic modeling requirements for military and civilian aircraft have provided impetus for researchers to improve computational and experimental techniques. Model validation is a key component of these research endeavors, so this study is an initial effort to extend conventional time-history comparisons by comparing model parameter estimates and their standard errors using system identification methods. An aerodynamic model of an aircraft performing one-degree-of-freedom roll oscillatory motion about its body axes is developed. The model includes linear aerodynamics and deficiency function parameters characterizing an unsteady effect. For estimation of the unknown parameters, two techniques, harmonic analysis and two-step linear regression, were applied to roll-oscillatory wind tunnel data and to computational fluid dynamics (CFD) simulated data. The model used for this study is a highly swept-wing unmanned aerial combat vehicle. Differences in response predictions, parameter estimates, and standard errors are compared and discussed.
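
    For a one-degree-of-freedom oscillatory test, the harmonic-analysis step amounts to regressing the measured response onto in-phase and out-of-phase components at the known oscillation frequency. The sketch below does that with a plain least-squares fit on synthetic data with made-up amplitudes; it is not the study's two-step regression or its aerodynamic model.

    ```python
    # Harmonic analysis by linear least squares: project a measured response onto
    # cos/sin components at a known oscillation frequency.
    import numpy as np

    omega = 2.0                                   # forcing frequency, rad/s (assumed known)
    t = np.linspace(0.0, 20.0, 2000)
    rng = np.random.default_rng(2)
    response = 1.3 * np.cos(omega * t) - 0.4 * np.sin(omega * t) + rng.normal(0, 0.05, t.size)

    # Design matrix: in-phase, out-of-phase, and constant-offset terms.
    A = np.column_stack([np.cos(omega * t), np.sin(omega * t), np.ones_like(t)])
    coef, *_ = np.linalg.lstsq(A, response, rcond=None)
    print("in-phase:", coef[0], "out-of-phase:", coef[1])
    ```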

  14. Proposed standard-weight (Ws) equation and length-categorization standards for brown trout (Salmo trutta) in lentic habitats

    USGS Publications Warehouse

    Hyatt, M.W.; Hubert, W.A.

    2001-01-01

    We developed a standard-weight (Ws) equation for brown trout (Salmo trutta) in lentic habitats by applying the regression-line-percentile technique to samples from 49 populations in North America. The proposed Ws equation is log10 Ws = -5.422 + 3.194 log10 TL, when Ws is in grams and TL is total length in millimeters. The English-unit equivalent is log10 Ws = -3.592 + 3.194 log10 TL, when Ws is in pounds and TL is total length in inches. The equation is applicable for fish of 140-750 mm TL. Proposed length-category standards to evaluate fish within populations are: stock, 200 mm (8 in); quality, 300 mm (12 in); preferred, 400 mm (16 in); memorable, 500 mm (20 in); and trophy, 600 mm (24 in).
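
    The proposed equation can be applied directly; a minimal helper using the metric coefficients quoted above (the function name and range check are ours):

    ```python
    # Standard weight (grams) of a lentic brown trout from total length (mm),
    # using the metric coefficients given in the abstract; valid for 140-750 mm TL.
    import math

    def standard_weight_g(total_length_mm: float) -> float:
        if not 140 <= total_length_mm <= 750:
            raise ValueError("equation applicable for 140-750 mm TL")
        return 10 ** (-5.422 + 3.194 * math.log10(total_length_mm))

    print(round(standard_weight_g(300), 1))   # expected weight of a 300 mm (quality-length) fish
    ```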

  15. Multi-fidelity Gaussian process regression for prediction of random fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parussini, L.; Venturi, D., E-mail: venturi@ucsc.edu; Perdikaris, P.

    We propose a new multi-fidelity Gaussian process regression (GPR) approach for prediction of random fields based on observations of surrogate models or hierarchies of surrogate models. Our method builds upon recent work on recursive Bayesian techniques, in particular recursive co-kriging, and extends it to vector-valued fields and various types of covariances, including separable and non-separable ones. The framework we propose is general and can be used to perform uncertainty propagation and quantification in model-based simulations, multi-fidelity data fusion, and surrogate-based optimization. We demonstrate the effectiveness of the proposed recursive GPR techniques through various examples. Specifically, we study the stochastic Burgers equation and the stochastic Oberbeck–Boussinesq equations describing natural convection within a square enclosure. In both cases we find that the standard deviation of the Gaussian predictors as well as the absolute errors relative to benchmark stochastic solutions are very small, suggesting that the proposed multi-fidelity GPR approaches can yield highly accurate results.
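
    A minimal two-level sketch of the recursive idea, assuming scikit-learn: fit a Gaussian process to plentiful low-fidelity samples, then fit a second Gaussian process to the high-fidelity residual at the few expensive points. This is a simplified scalar-valued stand-in (the scaling factor of recursive co-kriging is omitted), not the vector-field framework of the paper; both model functions are invented.

    ```python
    # Two-fidelity Gaussian process regression: GP on cheap data, then a GP
    # correction trained on the discrepancy at the (few) expensive points.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    def f_low(x):   # cheap surrogate model (hypothetical)
        return np.sin(8 * x)

    def f_high(x):  # expensive high-fidelity model (hypothetical)
        return 1.2 * np.sin(8 * x) + 0.3 * x

    x_lo = np.linspace(0, 1, 40).reshape(-1, 1)
    x_hi = np.linspace(0, 1, 8).reshape(-1, 1)

    gp_lo = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(0.1)).fit(x_lo, f_low(x_lo).ravel())
    # Correction GP: learn the high-fidelity residual about the low-fidelity prediction.
    resid = f_high(x_hi).ravel() - gp_lo.predict(x_hi)
    gp_delta = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(0.1)).fit(x_hi, resid)

    x_test = np.linspace(0, 1, 200).reshape(-1, 1)
    y_pred = gp_lo.predict(x_test) + gp_delta.predict(x_test)   # multi-fidelity prediction
    ```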

  16. Improving streamflow estimates through the use of LANDSAT. [Wisconsin and Pecatonica-Sugar River basins

    NASA Technical Reports Server (NTRS)

    Allord, G. J. (Principal Investigator); Scarpace, F. L.

    1981-01-01

    Estimates of low flow and flood frequency in several southwestern Wisconsin basins were improved by determining land cover from LANDSAT imagery. With the use of estimates of land cover in multiple-regression techniques, the standard errors of estimate (SE) for the least annual 7-day low flow at the 2- and 10-year recurrence intervals at ungaged sites were each lowered by 9%. The SEs of flood frequency in the 'Driftless Area' of Wisconsin for the 10-, 50-, and 100-year recurrence intervals were lowered by 14%. Four of nine basin characteristics determined from satellite imagery were significant variables in the multiple-regression techniques, whereas only 1 of the 12 characteristics determined from topographic maps was significant. The percentages of land cover categories in each basin were determined by merging basin boundaries, digitized from quadrangles, with a classified LANDSAT scene. Both the basin boundary X-Y polygon coordinates and the satellite coordinates were converted to latitude-longitude for merging compatibility.

  17. Biases and Standard Errors of Standardized Regression Coefficients

    ERIC Educational Resources Information Center

    Yuan, Ke-Hai; Chan, Wai

    2011-01-01

    The paper obtains consistent standard errors (SE) and biases of order O(1/n) for the sample standardized regression coefficients with both random and given predictors. Analytical results indicate that the formulas for SEs given in popular text books are consistent only when the population value of the regression coefficient is zero. The sample…

  18. Two SPSS programs for interpreting multiple regression results.

    PubMed

    Lorenzo-Seva, Urbano; Ferrando, Pere J; Chico, Eliseo

    2010-02-01

    When multiple regression is used in explanation-oriented designs, it is very important to determine both the usefulness of the predictor variables and their relative importance. Standardized regression coefficients are routinely provided by commercial programs. However, they generally function rather poorly as indicators of relative importance, especially in the presence of substantially correlated predictors. We provide two user-friendly SPSS programs that implement currently recommended techniques and recent developments for assessing the relevance of the predictors. The programs also allow the user to take into account the effects of measurement error. The first program, MIMR-Corr.sps, uses a correlation matrix as input, whereas the second program, MIMR-Raw.sps, uses the raw data and computes bootstrap confidence intervals of different statistics. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from http://brm.psychonomic-journals.org/content/supplemental.

  19. The effect of different distance measures in detecting outliers using clustering-based algorithm for circular regression model

    NASA Astrophysics Data System (ADS)

    Di, Nur Faraidah Muhammad; Satari, Siti Zanariah

    2017-05-01

    Outlier detection in linear data sets has been studied extensively, but only a small amount of work has been done on outlier detection in circular data. In this study, we propose multiple-outlier detection for circular regression models based on a clustering algorithm. Clustering techniques rely on a distance measure to define the distance between data points. Here, we introduce a similarity distance based on the Euclidean distance for the circular model and obtain a cluster tree using the single-linkage clustering algorithm. Then, a stopping rule for the cluster tree, based on the mean direction and circular standard deviation of the tree height, is proposed. We classify cluster groups that exceed the stopping rule as potential outliers. Our aim is to demonstrate the effectiveness of the proposed algorithms with the similarity distances in detecting the outliers. It is found that the proposed methods perform well and are applicable to the circular regression model.

  20. Supervised Learning for Dynamical System Learning.

    PubMed

    Hefny, Ahmed; Downey, Carlton; Gordon, Geoffrey J

    2015-01-01

    Recently there has been substantial interest in spectral methods for learning dynamical systems. These methods are popular since they often offer a good tradeoff between computational and statistical efficiency. Unfortunately, they can be difficult to use and extend in practice: e.g., they can make it difficult to incorporate prior information such as sparsity or structure. To address this problem, we present a new view of dynamical system learning: we show how to learn dynamical systems by solving a sequence of ordinary supervised learning problems, thereby allowing users to incorporate prior knowledge via standard techniques such as L1 regularization. Many existing spectral methods are special cases of this new framework, using linear regression as the supervised learner. We demonstrate the effectiveness of our framework by showing examples where nonlinear regression or lasso let us learn better state representations than plain linear regression does; the correctness of these instances follows directly from our general analysis.
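
    The core idea, recasting dynamical system learning as ordinary supervised regression so that penalties such as L1 can be used, can be sketched as below, assuming scikit-learn; the linear system and the one-step regression targets are invented for illustration and are far simpler than the spectral instances discussed in the paper.

    ```python
    # Learn one-step dynamics by regressing x_{t+1} on x_t, with an L1 penalty
    # (lasso) to encourage a sparse learned model.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(3)
    A_true = np.array([[0.9, 0.1], [0.0, 0.8]])        # sparse true dynamics (hypothetical)
    x = np.zeros((500, 2))
    x[0] = rng.normal(size=2)
    for t in range(499):
        x[t + 1] = A_true @ x[t] + rng.normal(0, 0.01, 2)

    X_t, X_next = x[:-1], x[1:]
    models = [Lasso(alpha=1e-3).fit(X_t, X_next[:, j]) for j in range(2)]  # one regressor per state
    A_hat = np.vstack([m.coef_ for m in models])
    print(np.round(A_hat, 2))   # close to A_true, with small entries shrunk toward zero
    ```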

  1. Multiplication factor versus regression analysis in stature estimation from hand and foot dimensions.

    PubMed

    Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha

    2012-05-01

    Estimation of stature is an important step in the identification of human remains in forensic examinations. The present study aims to compare the reliability and accuracy of stature estimation, and to demonstrate the variability between estimated and actual stature, using multiplication-factor and regression-analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements (hand length, hand breadth, foot length, and foot breadth), taken on the left side of each subject, were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. The derived multiplication factors and regression formulas were applied to the hand and foot measurements in the study sample. The stature estimated from the multiplication factors and from regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in stature estimation from the regression analysis method is less than that of the multiplication factor method, confirming that regression analysis is better than multiplication factor analysis for stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
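
    The contrast between the two estimators can be illustrated with synthetic numbers (not the study's North Indian sample): the multiplication factor is the mean stature-to-measurement ratio, while the alternative is a fitted simple linear regression.

    ```python
    # Stature estimation: mean multiplication factor versus simple linear regression.
    import numpy as np

    rng = np.random.default_rng(4)
    foot_length = rng.normal(25.0, 1.5, 200)                       # cm, synthetic
    stature = 60.0 + 4.0 * foot_length + rng.normal(0, 3.0, 200)   # cm, synthetic

    mf = np.mean(stature / foot_length)                  # multiplication factor
    slope, intercept = np.polyfit(foot_length, stature, 1)

    est_mf = mf * foot_length
    est_reg = intercept + slope * foot_length
    print("mean abs error, multiplication factor:", np.mean(np.abs(est_mf - stature)))
    print("mean abs error, regression:           ", np.mean(np.abs(est_reg - stature)))
    ```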

  2. The Prediction Properties of Inverse and Reverse Regression for the Simple Linear Calibration Problem

    NASA Technical Reports Server (NTRS)

    Parker, Peter A.; Vining, G. Geoffrey; Wilson, Sara R.; Szarka, John L., III; Johnson, Nels G.

    2010-01-01

    The calibration of measurement systems is a fundamental but under-studied problem within industrial statistics. The origins of this problem go back to basic chemical analysis based on NIST standards. In today's world these issues extend to mechanical, electrical, and materials engineering. Often, these new scenarios do not provide "gold standards" such as the standard weights provided by NIST. This paper considers the classic "forward regression followed by inverse regression" approach. In this approach the initial experiment treats the "standards" as the regressor and the observed values as the response to calibrate the instrument. The analyst then must invert the resulting regression model in order to use the instrument to make actual measurements in practice. This paper compares this classical approach to "reverse regression," which treats the standards as the response and the observed measurements as the regressor in the calibration experiment. Such an approach is intuitively appealing because it avoids the need for the inverse regression. However, it also violates some of the basic regression assumptions.
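
    The two calibration strategies can be written down in a few lines; the sketch below, with made-up standards and readings, shows the classical approach (regress observations on standards, then invert) next to reverse regression (regress standards on observations).

    ```python
    # Classical (forward-then-inverse) calibration versus reverse regression.
    import numpy as np

    standards = np.array([1.0, 2.0, 3.0, 4.0, 5.0])              # known reference values
    rng = np.random.default_rng(5)
    observed = 0.2 + 0.95 * standards + rng.normal(0, 0.05, 5)   # instrument readings

    # Forward regression: observed = b0 + b1 * standard, then invert for a new reading.
    b1, b0 = np.polyfit(standards, observed, 1)
    new_reading = 3.1
    x_classical = (new_reading - b0) / b1

    # Reverse regression: standard = c0 + c1 * observed, used directly.
    c1, c0 = np.polyfit(observed, standards, 1)
    x_reverse = c0 + c1 * new_reading

    print(x_classical, x_reverse)   # similar point estimates; their statistical properties differ
    ```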

  3. Creep-Rupture Data Analysis - Engineering Application of Regression Techniques. Ph.D. Thesis - North Carolina State Univ.

    NASA Technical Reports Server (NTRS)

    Rummler, D. R.

    1976-01-01

    Results are presented from investigations applying regression techniques to the development of a methodology for creep-rupture data analysis. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for space shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.

  4. The Role of Inflation and Price Escalation Adjustments in Properly Estimating Program Costs: F-35 Case Study

    DTIC Science & Technology

    2016-03-01

    ...regression models that yield hedonic price indexes is closely related to standard techniques for developing cost estimating relationships (CERs) ... (October 2014) ... analysis) and derives a price index from the coefficients on variables reflecting the year of purchase. In CER development, the ... index. The relevant cost metric in both cases is unit recurring flyaway (URF) costs. For the current project, we develop a “Baseline” CER model, taking ...

  5. The Outlier Detection for Ordinal Data Using Scalling Technique of Regression Coefficients

    NASA Astrophysics Data System (ADS)

    Adnan, Arisman; Sugiarto, Sigit

    2017-06-01

    The aim of this study is to detect outliers by using the coefficients of ordinal logistic regression (OLR) for the case of k-category responses, where the scores range from 1 (the best) to 8 (the worst). We detect them by using the sum of the moduli of the ordinal regression coefficients calculated by the jackknife technique. This technique is improved by scaling the regression coefficients to their means. The R language has been used on a set of ordinal data from a reference distribution. Furthermore, we compare this approach with studentised residual plots of the jackknife technique for ANOVA (analysis of variance) and OLR. This study shows that the jackknife technique, along with proper scaling, can reveal outliers in ordinal regression reasonably well.
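
    The jackknife step (refit the model with each observation left out and examine how the coefficients move) is shown below for an ordinary linear regression as a stand-in for the ordinal model, which would require an ordinal-logit routine; scaling the per-deletion coefficient sums by their mean mirrors the scaling idea described above. The data and the planted outlier are synthetic.

    ```python
    # Jackknife of regression coefficients: delete one case at a time, refit,
    # and flag cases whose deletion shifts the (scaled) coefficient sum most.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(6)
    X = rng.normal(size=(60, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.5, 60)
    y[10] += 8.0                                    # plant one outlier

    stat = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = LinearRegression().fit(X[keep], y[keep]).coef_
        stat[i] = np.sum(np.abs(coef))              # sum of moduli of the coefficients

    scaled = stat / stat.mean()                     # scale to the mean, as described above
    print(np.argsort(np.abs(scaled - 1))[-3:])      # cases with the largest influence
    ```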

  6. Application of neural networks and sensitivity analysis to improved prediction of trauma survival.

    PubMed

    Hunter, A; Kennedy, L; Henry, J; Ferguson, I

    2000-05-01

    The performance of trauma departments is widely audited by applying predictive models that assess probability of survival, and examining the rate of unexpected survivals and deaths. Although the TRISS methodology, a logistic regression modelling technique, is still the de facto standard, it is known that neural network models perform better. A key issue when applying neural network models is the selection of input variables. This paper proposes a novel form of sensitivity analysis, which is simpler to apply than existing techniques, and can be used for both numeric and nominal input variables. The technique is applied to the audit survival problem, and used to analyse the TRISS variables. The conclusions discuss the implications for the design of further improved scoring schemes and predictive models.

  7. Cuffless and Continuous Blood Pressure Estimation from the Heart Sound Signals

    PubMed Central

    Peng, Rong-Chao; Yan, Wen-Rong; Zhang, Ning-Ling; Lin, Wan-Hua; Zhou, Xiao-Lin; Zhang, Yuan-Ting

    2015-01-01

    Cardiovascular disease, like hypertension, is one of the top killers of human life and early detection of cardiovascular disease is of great importance. However, traditional medical devices are often bulky and expensive, and unsuitable for home healthcare. In this paper, we proposed an easy and inexpensive technique to estimate continuous blood pressure from the heart sound signals acquired by the microphone of a smartphone. A cold-pressor experiment was performed in 32 healthy subjects, with a smartphone to acquire heart sound signals and with a commercial device to measure continuous blood pressure. The Fourier spectrum of the second heart sound and the blood pressure were regressed using a support vector machine, and the accuracy of the regression was evaluated using 10-fold cross-validation. Statistical analysis showed that the mean correlation coefficients between the predicted values from the regression model and the measured values from the commercial device were 0.707, 0.712, and 0.748 for systolic, diastolic, and mean blood pressure, respectively, and that the mean errors were less than 5 mmHg, with standard deviations less than 8 mmHg. These results suggest that this technique is of potential use for cuffless and continuous blood pressure monitoring and it has promising application in home healthcare services. PMID:26393591
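
    A rough sketch of the regression step, assuming scikit-learn and synthetic spectra in place of the smartphone recordings: support vector regression from second-heart-sound spectral features to blood pressure, scored by 10-fold cross-validation as in the abstract.

    ```python
    # Support vector regression from heart-sound spectral features to blood pressure,
    # evaluated with 10-fold cross-validation (entirely synthetic data).
    import numpy as np
    from sklearn.svm import SVR
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(7)
    spectra = rng.normal(size=(32, 64))               # 32 subjects, 64 spectral bins (synthetic)
    sbp = 110 + 3 * spectra[:, :5].sum(axis=1) + rng.normal(0, 4, 32)   # synthetic systolic BP

    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=1.0))
    pred = cross_val_predict(model, spectra, sbp, cv=10)
    print("mean error (mmHg):", np.mean(pred - sbp), "SD:", np.std(pred - sbp))
    ```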

  8. Cuffless and Continuous Blood Pressure Estimation from the Heart Sound Signals.

    PubMed

    Peng, Rong-Chao; Yan, Wen-Rong; Zhang, Ning-Ling; Lin, Wan-Hua; Zhou, Xiao-Lin; Zhang, Yuan-Ting

    2015-09-17

    Cardiovascular disease, like hypertension, is one of the top killers of human life and early detection of cardiovascular disease is of great importance. However, traditional medical devices are often bulky and expensive, and unsuitable for home healthcare. In this paper, we proposed an easy and inexpensive technique to estimate continuous blood pressure from the heart sound signals acquired by the microphone of a smartphone. A cold-pressor experiment was performed in 32 healthy subjects, with a smartphone to acquire heart sound signals and with a commercial device to measure continuous blood pressure. The Fourier spectrum of the second heart sound and the blood pressure were regressed using a support vector machine, and the accuracy of the regression was evaluated using 10-fold cross-validation. Statistical analysis showed that the mean correlation coefficients between the predicted values from the regression model and the measured values from the commercial device were 0.707, 0.712, and 0.748 for systolic, diastolic, and mean blood pressure, respectively, and that the mean errors were less than 5 mmHg, with standard deviations less than 8 mmHg. These results suggest that this technique is of potential use for cuffless and continuous blood pressure monitoring and it has promising application in home healthcare services.

  9. Techniques for estimating flood-peak discharges from urban basins in Missouri

    USGS Publications Warehouse

    Becker, L.D.

    1986-01-01

    Techniques are defined for estimating the magnitude and frequency of future flood peak discharges of rainfall-induced runoff from small urban basins in Missouri. These techniques were developed from an initial analysis of flood records of 96 gaged sites in Missouri and adjacent states. Final regression equations are based on a balanced, representative sampling of 37 gaged sites in Missouri. This sample included 9 statewide urban study sites, 18 urban sites in St. Louis County, and 10 predominantly rural sites statewide. Short-term records were extended on the basis of long-term climatic records and use of a rainfall-runoff model. Linear least-squares regression analyses were used with log-transformed variables to relate flood magnitudes of selected recurrence intervals (dependent variables) to selected drainage basin indexes (independent variables). For gaged urban study sites within the State, the flood peak estimates are from the frequency curves defined from the synthesized long-term discharge records. Flood frequency estimates are made for ungaged sites by using regression equations that require determination of the drainage basin size and either the percentage of impervious area or a basin development factor. Alternative sets of equations are given for the 2-, 5-, 10-, 25-, 50-, and 100-yr recurrence interval floods. The average standard errors of estimate range from about 33% for the 2-yr flood to 26% for the 100-yr flood. The techniques for estimation are applicable to flood flows that are not significantly affected by storage caused by manmade activities. Flood peak discharge estimating equations are considered applicable for sites on basins draining approximately 0.25 to 40 sq mi. (Author's abstract)

  10. Updated techniques for estimating monthly streamflow-duration characteristics at ungaged and partial-record sites in central Nevada

    USGS Publications Warehouse

    Hess, Glen W.

    2002-01-01

    Techniques for estimating monthly streamflow-duration characteristics at ungaged and partial-record sites in central Nevada have been updated. These techniques were developed using streamflow records at six continuous-record sites, basin physical and climatic characteristics, and concurrent streamflow measurements at four partial-record sites. Two methods, the basin-characteristic method and the concurrent-measurement method, were developed to provide estimating techniques for selected streamflow characteristics at ungaged and partial-record sites in central Nevada. In the first method, logarithmic-regression analyses were used to relate monthly mean streamflows (from all months and by month) from continuous-record gaging sites of various percent exceedence levels or monthly mean streamflows (by month) to selected basin physical and climatic variables at ungaged sites. Analyses indicate that the total drainage area and percent of drainage area at altitudes greater than 10,000 feet are the most significant variables. For the equations developed from all months of monthly mean streamflow, the coefficient of determination averaged 0.84 and the standard error of estimate of the relations for the ungaged sites averaged 72 percent. For the equations derived from monthly means by month, the coefficient of determination averaged 0.72 and the standard error of estimate of the relations averaged 78 percent. If standard errors are compared, the relations developed in this study appear generally to be less accurate than those developed in a previous study. However, the new relations are based on additional data and the slight increase in error may be due to the wider range of streamflow for a longer period of record, 1995-2000. In the second method, streamflow measurements at partial-record sites were correlated with concurrent streamflows at nearby gaged sites by the use of linear-regression techniques. Statistical measures of results using the second method typically indicated greater accuracy than for the first method. However, to make estimates for individual months, the concurrent-measurement method requires several years additional streamflow data at more partial-record sites. Thus, exceedence values for individual months are not yet available due to the low number of concurrent-streamflow-measurement data available. Reliability, limitations, and applications of both estimating methods are described herein.

  11. Design of surface-water data networks for regional information

    USGS Publications Warehouse

    Moss, Marshall E.; Gilroy, E.J.; Tasker, Gary D.; Karlinger, M.R.

    1982-01-01

    This report describes a technique, Network Analysis of Regional Information (NARI), and the existing computer procedures that have been developed for the specification of the regional information-cost relation for several statistical parameters of streamflow. The measure of information used is the true standard error of estimate of a regional logarithmic regression. The cost is a function of the number of stations at which hydrologic data are collected and the number of years for which the data are collected. The technique can be used to obtain either (1) a minimum cost network that will attain a prespecified accuracy and reliability or (2) a network that maximizes information given a set of budgetary and time constraints.

  12. Real time flaw detection and characterization in tube through partial least squares and SVR: Application to eddy current testing

    NASA Astrophysics Data System (ADS)

    Ahmed, Shamim; Miorelli, Roberto; Calmon, Pierre; Anselmi, Nicola; Salucci, Marco

    2018-04-01

    This paper describes a Learning-By-Examples (LBE) technique for performing quasi-real-time flaw localization and characterization within a conductive tube based on Eddy Current Testing (ECT) signals. Within the LBE framework, the combination of full-factorial (i.e., GRID) sampling and Partial Least Squares (PLS) feature extraction (i.e., GRID-PLS) is applied to generate a suitable training set in the offline phase. Support Vector Regression (SVR) is utilized for model development and inversion during the offline and online phases, respectively. The performance and robustness of the proposed GRID-PLS/SVR strategy on a noisy test set are evaluated and compared with the standard GRID/SVR approach.
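
    A compact stand-in for the two phases, assuming scikit-learn: compress the raw signals with partial least squares and train an SVR on the reduced features offline, then invert a new signal online. The synthetic signals emulate only the shape of the workflow, not the GRID sampling or any ECT physics.

    ```python
    # Offline phase of an LBE-style scheme: PLS feature extraction followed by SVR.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.svm import SVR

    rng = np.random.default_rng(8)
    signals = rng.normal(size=(200, 120))                                 # synthetic ECT-like signals
    flaw_depth = signals[:, :10].mean(axis=1) + rng.normal(0, 0.05, 200)  # synthetic target

    pls = PLSRegression(n_components=5).fit(signals, flaw_depth)
    features = pls.transform(signals)                  # low-dimensional PLS features
    svr = SVR(kernel="rbf", C=10.0).fit(features, flaw_depth)

    # Online phase: transform a new signal and invert it in quasi real time.
    new_signal = rng.normal(size=(1, 120))
    print(svr.predict(pls.transform(new_signal)))
    ```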

  13. Approaches to reduce urinary tract injury during management of placenta accreta, increta, and percreta: a systematic review.

    PubMed

    Tam Tam, Kiran Babu; Dozier, James; Martin, James Nello

    2012-04-01

    A systematic review of the literature was conducted to answer the following question: are there enhancements to standard peripartum hysterectomy technique that minimize unintentional urinary tract (UT) injury in pregnancies complicated by invasive placental attachment (INPLAT)? A PubMed search of English language articles on INPLAT published by June 2010 was conducted. Data regarding the following parameters was required for inclusion in the quantitative analysis of the review's objective: (1) type of INPLAT, (2) details pertaining to medical and surgical management of INPLAT, and (3) complications, if any, associated with management. An attempt was made to identify approaches that may lower the risk of unintentional UT injury. Most cases (285 of 292) were managed by hysterectomy. There were 83 (29%) cases of unintentional UT injury. Antenatal diagnosis of INPLAT lowered the rate of UT injury (39% vs. 63%; P = 0.04). Information regarding surgical technique or medical management was available for 90 cases; 14 of these underwent a standard hysterectomy technique. Methotrexate treatment and 11 modifications of the surgical technique were associated with 16% unintentional UT injury rate as opposed to 57% for standard hysterectomy (P = 0.002). The use of ureteral stents reduced risk of urologic injury (P = 0.01). Multiple logistic regression analysis identified antenatal diagnosis as the significant predictor of an intact UT. Antenatal diagnosis of INPLAT is paramount to minimize UT injury. Utilization of management modifications identified in this review may reduce urologic injury due to INPLAT.

  14. Subsonic Aircraft With Regression and Neural-Network Approximators Designed

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Hopkins, Dale A.

    2004-01-01

    At the NASA Glenn Research Center, NASA Langley Research Center's Flight Optimization System (FLOPS) and the design optimization testbed COMETBOARDS with regression and neural-network-analysis approximators have been coupled to obtain a preliminary aircraft design methodology. For a subsonic aircraft, the optimal design, that is the airframe-engine combination, is obtained by the simulation. The aircraft is powered by two high-bypass-ratio engines with a nominal thrust of about 35,000 lbf. It is to carry 150 passengers at a cruise speed of Mach 0.8 over a range of 3000 n mi and to operate on a 6000-ft runway. The aircraft design utilized a neural network and a regression-approximations-based analysis tool, along with a multioptimizer cascade algorithm that uses sequential linear programming, sequential quadratic programming, the method of feasible directions, and then sequential quadratic programming again. Optimal aircraft weight versus the number of design iterations is shown. The central processing unit (CPU) time to solution is given. It is shown that the regression-method-based analyzer exhibited a smoother convergence pattern than the FLOPS code. The optimum weight obtained by the approximation technique and the FLOPS code differed by 1.3 percent. Prediction by the approximation technique exhibited no error for the aircraft wing area and turbine entry temperature, whereas it was within 2 percent for most other parameters. Cascade strategy was required by FLOPS as well as the approximators. The regression method had a tendency to hug the data points, whereas the neural network exhibited a propensity to follow a mean path. The performance of the neural network and regression methods was considered adequate. It was at about the same level for small, standard, and large models with redundancy ratios (defined as the number of input-output pairs to the number of unknown coefficients) of 14, 28, and 57, respectively. In an SGI octane workstation (Silicon Graphics, Inc., Mountainview, CA), the regression training required a fraction of a CPU second, whereas neural network training was between 1 and 9 min, as given. For a single analysis cycle, the 3-sec CPU time required by the FLOPS code was reduced to milliseconds by the approximators. For design calculations, the time with the FLOPS code was 34 min. It was reduced to 2 sec with the regression method and to 4 min by the neural network technique. The performance of the regression and neural network methods was found to be satisfactory for the analysis and design optimization of the subsonic aircraft.

  15. Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.

    NASA Technical Reports Server (NTRS)

    Ohring, G.

    1972-01-01

    Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.
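
    Stepwise screening of predictors, the mechanism the abstract uses to find the best radiation predictors, can be sketched as a greedy forward-selection loop over candidate columns; the data here are synthetic and the stopping rule is a simple R^2-gain threshold rather than the F-tests of classical stepwise procedures.

    ```python
    # Greedy forward stepwise regression: at each step add the predictor that most
    # improves R^2, stopping when the improvement falls below a threshold.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(9)
    X = rng.normal(size=(300, 20))                              # 20 candidate channels (synthetic)
    y = 2 * X[:, 3] - 1.5 * X[:, 7] + rng.normal(0, 0.5, 300)   # depends on only two of them

    selected, remaining, best_r2 = [], list(range(X.shape[1])), 0.0
    while remaining:
        scores = [(LinearRegression().fit(X[:, selected + [j]], y)
                   .score(X[:, selected + [j]], y), j) for j in remaining]
        r2, j_best = max(scores)
        if r2 - best_r2 < 0.01:                                 # stop when the gain is negligible
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best_r2 = r2

    print("selected predictors:", selected, "R^2:", round(best_r2, 3))
    ```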

  16. Decision trees in epidemiological research.

    PubMed

    Venkatasubramaniam, Ashwini; Wolfson, Julian; Mitchell, Nathan; Barnes, Timothy; JaKa, Meghan; French, Simone

    2017-01-01

    In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.
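
    A minimal illustration of the CART side of the comparison, assuming scikit-learn and entirely synthetic data: a shallow regression tree partitions the sample into relatively homogeneous subgroups, and the subgroup-defining splits can be read directly from the fitted tree.

    ```python
    # A shallow CART-style regression tree that partitions a sample into subgroups.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor, export_text

    rng = np.random.default_rng(10)
    age = rng.integers(8, 15, 400)                    # hypothetical covariates
    snacks = rng.integers(0, 6, 400)
    outcome = np.where((age >= 12) & (snacks >= 3), 5.0, 2.0) + rng.normal(0, 0.5, 400)

    X = np.column_stack([age, snacks])
    tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=30).fit(X, outcome)
    print(export_text(tree, feature_names=["age", "snacks"]))   # the subgroup-defining splits
    ```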

  17. Fusion of multiscale wavelet-based fractal analysis on retina image for stroke prediction.

    PubMed

    Che Azemin, M Z; Kumar, Dinesh K; Wong, T Y; Wang, J J; Kawasaki, R; Mitchell, P; Arjunan, Sridhar P

    2010-01-01

    In this paper, we present a novel method of analyzing the retinal vasculature, using the Fourier fractal dimension to extract the complexity of the retinal vasculature enhanced at different wavelet scales. Logistic regression was used as a fusion method to model the classifier for 5-year stroke prediction. The efficacy of this technique has been tested using standard pattern-recognition performance evaluation, receiver operating characteristic (ROC) analysis, and a medical prediction statistic, the odds ratio. A stroke prediction model was developed using the proposed system.

  18. Population heterogeneity in the salience of multiple risk factors for adolescent delinquency.

    PubMed

    Lanza, Stephanie T; Cooper, Brittany R; Bray, Bethany C

    2014-03-01

    To present mixture regression analysis as an alternative to more standard regression analysis for predicting adolescent delinquency. We demonstrate how mixture regression analysis allows for the identification of population subgroups defined by the salience of multiple risk factors. We identified population subgroups (i.e., latent classes) of individuals based on their coefficients in a regression model predicting adolescent delinquency from eight previously established risk indices drawn from the community, school, family, peer, and individual levels. The study included N = 37,763 10th-grade adolescents who participated in the Communities That Care Youth Survey. Standard, zero-inflated, and mixture Poisson and negative binomial regression models were considered. Standard and mixture negative binomial regression models were selected as optimal. The five-class regression model was interpreted based on the class-specific regression coefficients, indicating that risk factors had varying salience across classes of adolescents. Standard regression showed that all risk factors were significantly associated with delinquency. Mixture regression provided more nuanced information, suggesting a unique set of risk factors that were salient for different subgroups of adolescents. Implications for the design of subgroup-specific interventions are discussed. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.

  19. A no-gold-standard technique for objective assessment of quantitative nuclear-medicine imaging methods

    PubMed Central

    Jha, Abhinav K; Caffo, Brian; Frey, Eric C

    2016-01-01

    The objective optimization and evaluation of nuclear-medicine quantitative imaging methods using patient data is highly desirable but often hindered by the lack of a gold standard. Previously, a regression-without-truth (RWT) approach has been proposed for evaluating quantitative imaging methods in the absence of a gold standard, but this approach implicitly assumes that bounds on the distribution of true values are known. Several quantitative imaging methods in nuclear-medicine imaging measure parameters where these bounds are not known, such as the activity concentration in an organ or the volume of a tumor. We extended upon the RWT approach to develop a no-gold-standard (NGS) technique for objectively evaluating such quantitative nuclear-medicine imaging methods with patient data in the absence of any ground truth. Using the parameters estimated with the NGS technique, a figure of merit, the noise-to-slope ratio (NSR), can be computed, which can rank the methods on the basis of precision. An issue with NGS evaluation techniques is the requirement of a large number of patient studies. To reduce this requirement, the proposed method explored the use of multiple quantitative measurements from the same patient, such as the activity concentration values from different organs in the same patient. The proposed technique was evaluated using rigorous numerical experiments and using data from realistic simulation studies. The numerical experiments demonstrated that the NSR was estimated accurately using the proposed NGS technique when the bounds on the distribution of true values were not precisely known, thus serving as a very reliable metric for ranking the methods on the basis of precision. In the realistic simulation study, the NGS technique was used to rank reconstruction methods for quantitative single-photon emission computed tomography (SPECT) based on their performance on the task of estimating the mean activity concentration within a known volume of interest. Results showed that the proposed technique provided accurate ranking of the reconstruction methods for 97.5% of the 50 noise realizations. Further, the technique was robust to the choice of evaluated reconstruction methods. The simulation study pointed to possible violations of the assumptions made in the NGS technique under clinical scenarios. However, numerical experiments indicated that the NGS technique was robust in ranking methods even when there was some degree of such violation. PMID:26982626

  20. A no-gold-standard technique for objective assessment of quantitative nuclear-medicine imaging methods.

    PubMed

    Jha, Abhinav K; Caffo, Brian; Frey, Eric C

    2016-04-07

    The objective optimization and evaluation of nuclear-medicine quantitative imaging methods using patient data is highly desirable but often hindered by the lack of a gold standard. Previously, a regression-without-truth (RWT) approach has been proposed for evaluating quantitative imaging methods in the absence of a gold standard, but this approach implicitly assumes that bounds on the distribution of true values are known. Several quantitative imaging methods in nuclear-medicine imaging measure parameters where these bounds are not known, such as the activity concentration in an organ or the volume of a tumor. We extended upon the RWT approach to develop a no-gold-standard (NGS) technique for objectively evaluating such quantitative nuclear-medicine imaging methods with patient data in the absence of any ground truth. Using the parameters estimated with the NGS technique, a figure of merit, the noise-to-slope ratio (NSR), can be computed, which can rank the methods on the basis of precision. An issue with NGS evaluation techniques is the requirement of a large number of patient studies. To reduce this requirement, the proposed method explored the use of multiple quantitative measurements from the same patient, such as the activity concentration values from different organs in the same patient. The proposed technique was evaluated using rigorous numerical experiments and using data from realistic simulation studies. The numerical experiments demonstrated that the NSR was estimated accurately using the proposed NGS technique when the bounds on the distribution of true values were not precisely known, thus serving as a very reliable metric for ranking the methods on the basis of precision. In the realistic simulation study, the NGS technique was used to rank reconstruction methods for quantitative single-photon emission computed tomography (SPECT) based on their performance on the task of estimating the mean activity concentration within a known volume of interest. Results showed that the proposed technique provided accurate ranking of the reconstruction methods for 97.5% of the 50 noise realizations. Further, the technique was robust to the choice of evaluated reconstruction methods. The simulation study pointed to possible violations of the assumptions made in the NGS technique under clinical scenarios. However, numerical experiments indicated that the NGS technique was robust in ranking methods even when there was some degree of such violation.

  1. Integration of measurements with atmospheric dispersion models: Source term estimation for dispersal of (239)Pu due to non-nuclear detonation of high explosive

    NASA Astrophysics Data System (ADS)

    Edwards, L. L.; Harvey, T. F.; Freis, R. P.; Pitovranov, S. E.; Chernokozhin, E. V.

    1992-10-01

    The accuracy associated with assessing the environmental consequences of an accidental release of radioactivity is highly dependent on our knowledge of the source term characteristics and, in the case when the radioactivity is condensed on particles, the particle size distribution, all of which are generally poorly known. This paper reports on the development of a numerical technique that integrates the radiological measurements with atmospheric dispersion modeling. This results in a more accurate particle-size distribution and particle injection height estimation when compared with measurements of high explosive dispersal of (239)Pu. The estimation model is based on a non-linear least squares regression scheme coupled with the ARAC three-dimensional atmospheric dispersion models. The viability of the approach is evaluated by estimation of ADPIC model input parameters such as the ADPIC particle size mean aerodynamic diameter, the geometric standard deviation, and largest size. Additionally, we estimate an optimal 'coupling coefficient' between the particles and an explosive cloud rise model. The experimental data are taken from the Clean Slate 1 field experiment conducted during 1963 at the Tonopah Test Range in Nevada. The regression technique optimizes the agreement between the measured and model predicted concentrations of (239)Pu by varying the model input parameters within their respective ranges of uncertainties. The technique generally estimated the measured concentrations within a factor of 1.5, with the worst estimate being within a factor of 5, which is very good in view of the complexity of the concentration measurements, the uncertainties associated with the meteorological data, and the limitations of the models. The best fit also suggests a smaller mean diameter and a smaller geometric standard deviation on the particle size as well as a slightly weaker particle-to-cloud coupling than previously reported.
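
    The estimation machinery, adjusting uncertain model inputs so that predicted concentrations match measurements in a least-squares sense, can be sketched with SciPy's nonlinear least-squares solver and a deliberately toy forward model standing in for the dispersion code; all parameter names and values are invented.

    ```python
    # Nonlinear least-squares estimation of source-term parameters from measured
    # concentrations, with a toy forward model in place of a dispersion code.
    import numpy as np
    from scipy.optimize import least_squares

    x_obs = np.linspace(1.0, 10.0, 15)                 # downwind sampler positions (km), synthetic

    def forward(params, x):
        """Toy forward model: release strength q and spread sigma -> concentration."""
        q, sigma = params
        return q * np.exp(-x**2 / (2.0 * sigma**2))

    true = np.array([5.0, 3.0])
    rng = np.random.default_rng(11)
    measured = forward(true, x_obs) * (1 + rng.normal(0, 0.1, x_obs.size))  # noisy measurements

    def residuals(params):
        return forward(params, x_obs) - measured

    fit = least_squares(residuals, x0=[1.0, 1.0], bounds=([0.1, 0.5], [20.0, 10.0]))
    print("estimated q, sigma:", fit.x)
    ```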

  2. Methods for estimating the magnitude and frequency of peak discharges of rural, unregulated streams in Virginia

    USGS Publications Warehouse

    Bisese, James A.

    1995-01-01

    Methods are presented for estimating the peak discharges of rural, unregulated streams in Virginia. A Pearson Type III distribution is fitted to the logarithms of the unregulated annual peak-discharge records from 363 stream-gaging stations in Virginia to estimate the peak discharge at these stations for recurrence intervals of 2 to 500 years. Peak-discharge characteristics for 284 unregulated stations are divided into eight regions based on physiographic province, and regressed on basin characteristics, including drainage area, main channel length, main channel slope, mean basin elevation, percentage of forest cover, mean annual precipitation, and maximum rainfall intensity. Regression equations for each region are computed by use of the generalized least-squares method, which accounts for spatial and temporal correlation between nearby gaging stations. This regression technique weights the significance of each station to the regional equation based on the length of records collected at each station, the correlation between annual peak discharges among the stations, and the standard deviation of the annual peak discharge for each station. Drainage area proved to be the only significant explanatory variable in four regions, while other regions have as many as three significant variables. Standard errors of the regression equations range from 30 to 80 percent. Alternate equations using drainage area only are provided for the five regions with more than one significant explanatory variable. Methods and sample computations are provided to estimate peak discharges at gaged and ungaged sites in Virginia for recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, and to adjust the regression estimates for sites on gaged streams where nearby gaging-station records are available.

  3. A review of the performance and structural considerations of paraffin wax hybrid rocket fuels with additives

    NASA Astrophysics Data System (ADS)

    Veale, Kirsty; Adali, Sarp; Pitot, Jean; Brooks, Michael

    2017-12-01

    Paraffin wax as a hybrid rocket fuel has not been comprehensively characterised, especially regarding the structural feasibility of the material in launch applications. Preliminary structural testing has shown paraffin wax to be a brittle, low strength material, and at risk of failure under launch loading conditions. Structural enhancing additives have been identified, but their effect on motor performance has not always been considered, nor has any standard method of testing been identified between research institutes. A review of existing regression rate measurement techniques on paraffin wax based fuels and the results obtained with various additives are collated and discussed in this paper. The review includes 2D slab motors that enable visualisation of liquefying fuel droplet entrainment and the effect of an increased viscosity on the droplet entrainment mechanism, which can occur with the addition of structural enhancing polymers. An increased viscosity has been shown to reduce the regression rate of liquefying fuels. Viscosity increasing additives that have been tested include EVA and LDPE. Both these additives increase the structural properties of paraffin wax, where the elongation and UTS are improved. Other additives, such as metal hydrides, aluminium and boron generally offer improvements on the regression rate. However, very little consideration has been given to the structural effects these additives have on the wax grain. A 40% aluminised grain, for example, offers a slight increase in the UTS but reduces the elongation of paraffin wax. Geometrically accurate lab-scale motors have also been used to determine the regression rate properties of various additives in paraffin wax. A concise review of all available regression rate testing techniques and results on paraffin wax based hybrid propellants, as well as existing structural testing data, is presented in this paper.

  4. Some methods of computing platform transmitter terminal location estimates. [ARGOS system; whale tracking

    NASA Technical Reports Server (NTRS)

    Hoisington, C. M.

    1984-01-01

    A position estimation algorithm was developed to track a humpback whale tagged with an ARGOS platform after a transmitter deployment failure and the whale's diving behavior precluded standard methods. The algorithm is especially useful where a transmitter location program exists; it determines the classical Keplerian elements from the ARGOS spacecraft position vectors included with the probationary file messages. A minimum of three distinct messages is required. Once the spacecraft orbit is determined, the whale is located using standard least squares regression techniques. Experience suggests that in instances where circumstances inherent in the experiment yield message data unsuitable for the standard ARGOS reduction (message data may be too sparse, span an insufficient period, or include variable-length messages), System ARGOS can still provide much valuable location information if the user is willing to accept the increased location uncertainties.

  5. Adjuvant corneal crosslinking to prevent hyperopic LASIK regression.

    PubMed

    Aslanides, Ioannis M; Mukherjee, Achyut N

    2013-01-01

    To report the long term outcomes, safety, stability, and efficacy in a pilot series of simultaneous hyperopic laser assisted in situ keratomileusis (LASIK) and corneal crosslinking (CXL). A small cohort series of five eyes, with clinically suboptimal topography and/or thickness, underwent LASIK surgery with immediate riboflavin application under the flap, followed by UV light irradiation. Postoperative assessment was performed at 1, 3, 6, and 12 months, with late follow up at 4 years, and results were compared with a matched cohort that received LASIK only. The average age of the LASIK-CXL group was 39 years (26-46), and the average spherical equivalent hyperopic refractive error was +3.45 diopters (standard deviation 0.76; range 2.5 to 4.5). All eyes maintained refractive stability over the 4 years. There were no complications related to CXL, and topographic and clinical outcomes were as expected for standard LASIK. This limited series suggests that simultaneous LASIK and CXL for hyperopia is safe. Outcomes of the small cohort suggest that this technique may be promising for ameliorating hyperopic regression, presumed to be biomechanical in origin, and may also address ectasia risk.

  6. Peak-flow characteristics of Wyoming streams

    USGS Publications Warehouse

    Miller, Kirk A.

    2003-01-01

    Peak-flow characteristics for unregulated streams in Wyoming are described in this report. Frequency relations for annual peak flows through water year 2000 at 364 streamflow-gaging stations in and near Wyoming were evaluated and revised or updated as needed. Analyses of historical floods, temporal trends, and generalized skew were included in the evaluation. Physical and climatic basin characteristics were determined for each gaging station using a geographic information system. Gaging stations with similar peak-flow and basin characteristics were grouped into six hydrologic regions. Regional statistical relations between peak-flow and basin characteristics were explored using multiple-regression techniques. Generalized least squares regression equations for estimating magnitudes of annual peak flows with selected recurrence intervals from 1.5 to 500 years were developed for each region. Average standard errors of estimate range from 34 to 131 percent. Average standard errors of prediction range from 35 to 135 percent. Several statistics for evaluating and comparing the errors in these estimates are described. Limitations of the equations are described. Methods for applying the regional equations for various circumstances are listed and examples are given.

  7. Subsampled Hessian Newton Methods for Supervised Learning.

    PubMed

    Wang, Chien-Chih; Huang, Chun-Heng; Lin, Chih-Jen

    2015-08-01

    Newton methods can be applied in many supervised learning approaches. However, for large-scale data, the use of the whole Hessian matrix can be time-consuming. Recently, subsampled Newton methods have been proposed to reduce the computational time by using only a subset of data for calculating an approximation of the Hessian matrix. Unfortunately, we find that in some situations, the running speed is worse than the standard Newton method because cheaper but less accurate search directions are used. In this work, we propose some novel techniques to improve the existing subsampled Hessian Newton method. The main idea is to solve a two-dimensional subproblem per iteration to adjust the search direction to better minimize the second-order approximation of the function value. We prove the theoretical convergence of the proposed method. Experiments on logistic regression, linear SVM, maximum entropy, and deep networks indicate that our techniques significantly reduce the running time of the subsampled Hessian Newton method. The resulting algorithm becomes a compelling alternative to the standard Newton method for large-scale data classification.
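
    To make the idea concrete, here is a bare-bones subsampled-Hessian Newton iteration for L2-regularized logistic regression in NumPy: the gradient uses all the data while the Hessian is formed from a random subset each iteration. This shows only the basic scheme, without the two-dimensional subproblem refinement the paper proposes; the data are synthetic.

    ```python
    # Subsampled-Hessian Newton method for L2-regularized logistic regression:
    # full gradient, Hessian approximated on a random 10% subsample per iteration.
    import numpy as np

    rng = np.random.default_rng(12)
    n, d = 5000, 20
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = (rng.uniform(size=n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)
    lam = 1e-2

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    w = np.zeros(d)
    for it in range(10):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n + lam * w                  # full-data gradient
        idx = rng.choice(n, size=n // 10, replace=False)    # subsample for the Hessian
        Xs, ps = X[idx], sigmoid(X[idx] @ w)
        H = Xs.T @ (Xs * (ps * (1 - ps))[:, None]) / idx.size + lam * np.eye(d)
        w -= np.linalg.solve(H, grad)                       # Newton step with subsampled Hessian
        print(it, np.linalg.norm(grad))
    ```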

  8. Automated processing of label-free Raman microscope images of macrophage cells with standardized regression for high-throughput analysis.

    PubMed

    Milewski, Robert J; Kumagai, Yutaro; Fujita, Katsumasa; Standley, Daron M; Smith, Nicholas I

    2010-11-19

    Macrophages represent the front lines of our immune system; they recognize and engulf pathogens or foreign particles, thus initiating the immune response. Imaging macrophages presents unique challenges, as most optical techniques require labeling or staining of the cellular compartments in order to resolve organelles, and such stains or labels have the potential to perturb the cell, particularly in cases where incomplete information exists regarding the precise cellular reaction under observation. Label-free imaging techniques such as Raman microscopy are thus valuable tools for studying the transformations that occur in immune cells upon activation, both on the molecular and organelle levels. Due to extremely low signal levels, however, Raman microscopy requires sophisticated image processing techniques for noise reduction and signal extraction. To date, efficient, automated algorithms for resolving sub-cellular features in noisy, multi-dimensional image sets have not been explored extensively. We show that hybrid z-score normalization and standard regression (Z-LSR) can highlight the spectral differences within the cell and provide image contrast dependent on spectral content. In contrast to typical Raman imaging processing methods using multivariate analysis, such as singular value decomposition (SVD), our implementation of the Z-LSR method can operate nearly in real time. In spite of its computational simplicity, Z-LSR can automatically remove background and bias in the signal, improve the resolution of spatially distributed spectral differences and enable sub-cellular features to be resolved in Raman microscopy images of mouse macrophage cells. Significantly, the Z-LSR processed images automatically exhibited subcellular architectures whereas SVD, in general, requires human assistance in selecting the components of interest. The computational efficiency of Z-LSR enables automated resolution of sub-cellular features in large Raman microscopy data sets without compromise in image quality or information loss in associated spectra. These results motivate further use of label free microscopy techniques in real-time imaging of live immune cells.

  9. Failure of Standard Training Sets in the Analysis of Fast-Scan Cyclic Voltammetry Data.

    PubMed

    Johnson, Justin A; Rodeberg, Nathan T; Wightman, R Mark

    2016-03-16

    The use of principal component regression, a multivariate calibration method, in the analysis of in vivo fast-scan cyclic voltammetry data allows for separation of overlapping signal contributions, permitting evaluation of the temporal dynamics of multiple neurotransmitters simultaneously. To accomplish this, the technique relies on information about current-concentration relationships across the scan-potential window gained from analysis of training sets. The ability of the constructed models to resolve analytes depends critically on the quality of these data. Recently, the use of standard training sets obtained under conditions other than those of the experimental data collection (e.g., with different electrodes, animals, or equipment) has been reported. This study evaluates the analyte resolution capabilities of models constructed using this approach from both a theoretical and experimental viewpoint. A detailed discussion of the theory of principal component regression is provided to inform this discussion. The findings demonstrate that the use of standard training sets leads to misassignment of the current-concentration relationships across the scan-potential window. This directly results in poor analyte resolution and, consequently, inaccurate quantitation, which may lead to erroneous conclusions being drawn from experimental data. Thus, it is strongly advocated that training sets be obtained under the experimental conditions to allow for accurate data analysis.
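
    The calibration model itself is straightforward to reproduce in outline, assuming scikit-learn and synthetic voltammograms in place of real training sweeps: principal components are extracted from the training data and a regression maps the component scores to concentration. The abstract's central point, that the training sweeps must come from the same experimental conditions as the data being analyzed, is a property of the data, not of this code.

    ```python
    # Principal component regression: project voltammograms onto a few principal
    # components, then regress concentration on the component scores.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(13)
    template = np.sin(np.linspace(0, np.pi, 100))          # stand-in for a voltammogram shape
    conc = rng.uniform(0.1, 1.0, 50)                       # training concentrations (a.u.)
    training = np.outer(conc, template) + rng.normal(0, 0.02, (50, 100))

    pcr = make_pipeline(PCA(n_components=3), LinearRegression()).fit(training, conc)
    unknown = 0.6 * template + rng.normal(0, 0.02, 100)
    print(pcr.predict(unknown.reshape(1, -1)))             # estimated concentration
    ```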

  10. Use of Tc-99m-galactosyl-neoglycoalbumin (Tc-NGA) to determine hepatic blood flow

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stadalnik, R.C.; Vera, D.R.; Woodle, E.S.

    1984-01-01

    Tc-NGA is a new liver radiopharmaceutical which binds to a hepatocyte-specific membrane receptor. Three characteristics of Tc-NGA can be exploited in the measurement of hepatic blood flow (HBF): 1) ability to alter the affinity of Tc-NGA for its receptor by changing the galactose:albumin ratio; 2) ability to achieve a high specific activity with Tc-99m labeling; and 3) ability to administer a high molar dose of Tc-NGA without physiologic side effects. In addition, kinetic modeling of Tc-NGA dynamic data can provide estimates of hepatic receptor concentration. In experimental studies in young pigs, HBF was determined using two techniques: 1) kinetic modeling of dynamic data using moderate affinity, low specific activity Tc-NGA (Group A, n=12); and 2) clearance (CL) technique using high affinity, high specific activity Tc-NGA (Group B, n=4). In both groups, HBF was determined simultaneously by continuous infusion of indocyanine green (CI-ICG) with hepatic vein sampling. Regression analysis of HBF measurements obtained with the Tc-NGA kinetic modeling technique and the CI-ICG technique (Group A) revealed good correlation between the two techniques (r=0.802, p=0.02). Similarly, HBF determination by the clearance technique (Group B) provided highly accurate measurements when compared to the CI-ICG technique. Hepatic blood flow measurements by the clearance technique (CL-NGA) fell within one standard deviation of the error associated with each CI-ICG HBF measurement (all CI-ICG standard deviations were less than 10%).

  11. Near-infrared spectral image analysis of pork marbling based on Gabor filter and wide line detector techniques.

    PubMed

    Huang, Hui; Liu, Li; Ngadi, Michael O; Gariépy, Claude; Prasher, Shiv O

    2014-01-01

    Marbling is an important quality attribute of pork. Detection of pork marbling usually involves subjective scoring, which is inefficient and costly for the processor. In this study, the ability to predict pork marbling using near-infrared (NIR) hyperspectral imaging (900-1700 nm) and appropriate image processing techniques was studied. Near-infrared images were collected from pork after marbling evaluation according to the current standard chart from the National Pork Producers Council. Image analysis techniques (Gabor filter, wide line detector, and spectral averaging) were applied to extract texture, line, and spectral features, respectively, from NIR images of pork. Samples were grouped into calibration and validation sets. Wavelength selection was performed on the calibration set by a stepwise regression procedure. Prediction models of pork marbling scores were built using multiple linear regression based on derivatives of mean spectra and line features at key wavelengths. The derivatives of both texture and spectral features produced good results, with correlation coefficients of validation of 0.90 and 0.86, respectively, using wavelengths of 961, 1186, and 1220 nm. The results revealed the great potential of the Gabor filter for analyzing NIR images of pork for effective and efficient objective evaluation of pork marbling.

  12. The Effective Dynamic Ranges for Glaucomatous Visual Field Progression With Standard Automated Perimetry and Stimulus Sizes III and V.

    PubMed

    Wall, Michael; Zamba, Gideon K D; Artes, Paul H

    2018-01-01

    It has been shown that threshold estimates below approximately 20 dB have little effect on the ability to detect visual field progression in glaucoma. We aimed to compare stimulus size V to stimulus size III, in areas of visual damage, to confirm these findings by using (1) a different dataset, (2) different techniques of progression analysis, and (3) an analysis to evaluate the effect of censoring on mean deviation (MD). In the Iowa Variability in Perimetry Study, 120 glaucoma subjects were tested every 6 months for 4 years with size III SITA Standard and size V Full Threshold. Progression was determined with three complementary techniques: pointwise linear regression (PLR), permutation of PLR, and linear regression of the MD index. All analyses were repeated on "censored" datasets in which threshold estimates below a given criterion value were set to equal the criterion value. Our analyses confirmed previous observations that threshold estimates below 20 dB contribute much less to visual field progression than estimates above this range. These findings were broadly similar with stimulus sizes III and V. Censoring of threshold values < 20 dB has relatively little impact on the rates of visual field progression in patients with mild to moderate glaucoma. Size V, which has lower retest variability, performs at least as well as size III for longitudinal glaucoma progression analysis and appears to have a larger useful dynamic range owing to the upper sensitivity limit being higher.
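
    A minimal sketch of the censoring and pointwise-regression ideas is given below; the 20 dB floor matches the criterion discussed in the abstract, but the visit schedule and threshold values are invented for illustration.

      # Sketch of censoring threshold estimates and fitting a pointwise linear regression
      # (slope in dB/year) at one visual field location. Data are synthetic.
      import numpy as np

      def censor(thresholds_db, floor_db=20.0):
          """Set all threshold estimates below the criterion to the criterion value."""
          return np.maximum(thresholds_db, floor_db)

      def pointwise_slope(years, thresholds_db):
          """Ordinary least-squares slope of sensitivity versus time for one location."""
          slope, intercept = np.polyfit(years, thresholds_db, deg=1)
          return slope

      years = np.arange(0, 4.5, 0.5)                      # visits every 6 months for 4 years
      raw = np.array([28, 27, 25, 24, 22, 19, 17, 15, 14], dtype=float)
      print(pointwise_slope(years, raw))                  # uncensored rate of change
      print(pointwise_slope(years, censor(raw)))          # rate after censoring at 20 dB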

  13. Regression Verification Using Impact Summaries

    NASA Technical Reports Server (NTRS)

    Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana

    2013-01-01

    Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an off-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version. Regression verification addresses the problem of proving equivalence of closely related program versions [19]. These techniques compare two programs with a large degree of syntactic similarity to prove that portions of one program version are equivalent to the other. Regression verification can be used for guaranteeing backward compatibility, and for showing behavioral equivalence in programs with syntactic differences, e.g., when a program is refactored to improve its performance, maintainability, or readability. Existing regression verification techniques leverage similarities between program versions by using abstraction and decomposition techniques to improve scalability of the analysis [10, 12, 19]. The abstractions and decomposition in these techniques, e.g., summaries of unchanged code [12] or semantically equivalent methods [19], compute an over-approximation of the program behaviors. The equivalence checking results of these techniques are sound but not complete: they may characterize programs as not functionally equivalent when, in fact, they are equivalent. In this work we describe a novel approach that leverages the impact of the differences between two programs for scaling regression verification. We partition program behaviors of each version into (a) behaviors impacted by the changes and (b) behaviors not impacted (unimpacted) by the changes. Only the impacted program behaviors are used during equivalence checking. We then prove that checking equivalence of the impacted program behaviors is equivalent to checking equivalence of all program behaviors for a given depth bound.
In this work we use symbolic execution to generate the program behaviors and leverage control- and data-dependence information to facilitate the partitioning of program behaviors. The impacted program behaviors are termed impact summaries. The dependence analyses that facilitate the generation of the impact summaries, we believe, could be used in conjunction with other abstraction and decomposition based approaches [10, 12] as a complementary reduction technique. An evaluation of our regression verification technique shows that our approach is capable of leveraging similarities between program versions to reduce the size of the queries and the time required to check for logical equivalence. The main contributions of this work are: - A regression verification technique to generate impact summaries that can be checked for functional equivalence using an off-the-shelf decision procedure. - A proof that our approach is sound and complete with respect to the depth bound of symbolic execution. - An implementation of our technique using the LLVM compiler infrastructure, the KLEE symbolic virtual machine [4], and a variety of Satisfiability Modulo Theory (SMT) solvers, e.g., STP [7] and Z3 [6]. - An empirical evaluation on a set of C artifacts which shows that the use of impact summaries can reduce the cost of regression verification.
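
    To make the equivalence-checking step concrete, the toy sketch below asks an off-the-shelf SMT solver (Z3, via its Python bindings) whether two hand-written expressions standing in for impact summaries can ever disagree; the formulas are invented and are not the summaries produced by the authors' tool.

      # Toy illustration of equivalence checking with an off-the-shelf SMT solver (Z3).
      # The two "impact summaries" below are invented formulas standing in for the
      # path conditions and outputs collected by symbolic execution.
      from z3 import Int, If, Solver, unsat

      x = Int('x')
      # Version 1: branch written with an if-then-else
      out_v1 = If(x > 0, x + 1, 1 - x)
      # Version 2: refactored but intended to be behaviorally equivalent
      out_v2 = If(x <= 0, 1 - x, x + 1)

      s = Solver()
      s.add(out_v1 != out_v2)          # search for an input where the versions disagree
      print("equivalent" if s.check() == unsat else "not equivalent")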

  14. Techniques for estimating flood-peak discharges of rural, unregulated streams in Ohio

    USGS Publications Warehouse

    Koltun, G.F.; Roberts, J.W.

    1990-01-01

    Multiple-regression equations are presented for estimating flood-peak discharges having recurrence intervals of 2, 5, 10, 25, 50, and 100 years at ungaged sites on rural, unregulated streams in Ohio. The average standard errors of prediction for the equations range from 33.4% to 41.4%. Peak discharge estimates determined by log-Pearson Type III analysis using data collected through the 1987 water year are reported for 275 streamflow-gaging stations. Ordinary least-squares multiple-regression techniques were used to divide the State into three regions and to identify a set of basin characteristics that help explain station-to-station variation in the log-Pearson estimates. Contributing drainage area, main-channel slope, and storage area were identified as suitable explanatory variables. Generalized least-squares procedures, which include historical flow data and account for differences in the variance of flows at different gaging stations, spatial correlation among gaging station records, and variable lengths of station record, were used to estimate the regression parameters. Weighted peak-discharge estimates computed as a function of the log-Pearson Type III and regression estimates are reported for each station. A method is provided to adjust regression estimates for ungaged sites by use of weighted and regression estimates for a gaged site located on the same stream. Limitations and shortcomings cited in an earlier report on the magnitude and frequency of floods in Ohio are addressed in this study. Geographic bias is no longer evident for the Maumee River basin of northwestern Ohio. No bias is found to be associated with the forested-area characteristic for the range used in the regression analysis (0.0 to 99.0%), nor is this characteristic significant in explaining peak discharges. Surface-mined area likewise is not significant in explaining peak discharges, and the regression equations are not biased when applied to basins having approximately 30% or less surface-mined area. Analyses of residuals indicate that the equations tend to overestimate flood-peak discharges for basins having approximately 30% or more surface-mined area. (USGS)
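
    The ordinary least-squares step of such a study can be sketched as a log-space regression of peak discharge on basin characteristics, as below; the synthetic data, variable names, and coefficients are illustrative only, and the generalized least-squares weighting used in the report is not reproduced.

      # Minimal ordinary least-squares fit of a log-space flood-frequency regression,
      # Q100 = a * A^b1 * S^b2 * (St + 1)^b3, using synthetic basin data.
      import numpy as np

      rng = np.random.default_rng(1)
      n = 40
      area = rng.uniform(5, 500, n)          # contributing drainage area, mi^2
      slope = rng.uniform(5, 100, n)         # main-channel slope, ft/mi
      storage = rng.uniform(0, 10, n)        # storage area, percent of basin
      q100 = 150 * area**0.8 * slope**0.3 * (storage + 1)**-0.2 * rng.lognormal(0, 0.2, n)

      # Design matrix in log10 space: intercept, log A, log S, log(St + 1)
      X = np.column_stack([np.ones(n), np.log10(area), np.log10(slope), np.log10(storage + 1)])
      y = np.log10(q100)
      coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
      print(coefs)        # [log10(a), b1, b2, b3] estimated by OLS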

  15. Linear regression models for solvent accessibility prediction in proteins.

    PubMed

    Wagner, Michael; Adamczak, Rafał; Porollo, Aleksey; Meller, Jarosław

    2005-04-01

    The relative solvent accessibility (RSA) of an amino acid residue in a protein structure is a real number that represents the solvent exposed surface area of this residue in relative terms. The problem of predicting the RSA from the primary amino acid sequence can therefore be cast as a regression problem. Nevertheless, RSA prediction has so far typically been cast as a classification problem. Consequently, various machine learning techniques have been used within the classification framework to predict whether a given amino acid exceeds some (arbitrary) RSA threshold and would thus be predicted to be "exposed," as opposed to "buried." We have recently developed novel methods for RSA prediction using nonlinear regression techniques which provide accurate estimates of the real-valued RSA and outperform classification-based approaches with respect to commonly used two-class projections. However, while their performance seems to provide a significant improvement over previously published approaches, these Neural Network (NN) based methods are computationally expensive to train and involve several thousand parameters. In this work, we develop alternative regression models for RSA prediction which are computationally much less expensive, involve orders-of-magnitude fewer parameters, and are still competitive in terms of prediction quality. In particular, we investigate several regression models for RSA prediction using linear L1-support vector regression (SVR) approaches as well as standard linear least squares (LS) regression. Using rigorously derived validation sets of protein structures and extensive cross-validation analysis, we compare the performance of the SVR with that of LS regression and NN-based methods. In particular, we show that the flexibility of the SVR (as encoded by metaparameters such as the error insensitivity and the error penalization terms) can be very beneficial to optimize the prediction accuracy for buried residues. We conclude that the simple and computationally much more efficient linear SVR performs comparably to nonlinear models and thus can be used in order to facilitate further attempts to design more accurate RSA prediction methods, with applications to fold recognition and de novo protein structure prediction methods.
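
    A minimal comparison of linear epsilon-insensitive support vector regression with ordinary least squares, in the spirit of the abstract, is sketched below using scikit-learn on synthetic features; the feature construction and metaparameter values are assumptions, not the authors' settings.

      # Sketch comparing linear support vector regression (epsilon-insensitive loss)
      # with ordinary least squares on a synthetic real-valued target such as RSA.
      import numpy as np
      from sklearn.svm import LinearSVR
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import mean_absolute_error

      rng = np.random.default_rng(0)
      X = rng.normal(size=(2000, 40))                     # stand-in for sequence-window features
      y = X @ rng.normal(size=40) * 0.1 + rng.normal(scale=0.1, size=2000)
      y = np.clip(y + 0.5, 0, 1)                          # keep target in the [0, 1] RSA range

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
      svr = LinearSVR(epsilon=0.05, C=1.0, max_iter=10000).fit(X_tr, y_tr)
      ls = LinearRegression().fit(X_tr, y_tr)
      print("SVR MAE:", mean_absolute_error(y_te, svr.predict(X_te)))
      print("LS  MAE:", mean_absolute_error(y_te, ls.predict(X_te)))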

  16. Techniques for estimating flood-peak discharges of rural, unregulated streams in Ohio

    USGS Publications Warehouse

    Koltun, G.F.

    2003-01-01

    Regional equations for estimating 2-, 5-, 10-, 25-, 50-, 100-, and 500-year flood-peak discharges at ungaged sites on rural, unregulated streams in Ohio were developed by means of ordinary and generalized least-squares (GLS) regression techniques. One-variable, simple equations and three-variable, full-model equations were developed on the basis of selected basin characteristics and flood-frequency estimates determined for 305 streamflow-gaging stations in Ohio and adjacent states. The average standard errors of prediction ranged from about 39 to 49 percent for the simple equations, and from about 34 to 41 percent for the full-model equations. Flood-frequency estimates determined by means of log-Pearson Type III analyses are reported along with weighted flood-frequency estimates, computed as a function of the log-Pearson Type III estimates and the regression estimates. Values of explanatory variables used in the regression models were determined from digital spatial data sets by means of a geographic information system (GIS), with the exception of drainage area, which was determined by digitizing the area within basin boundaries manually delineated on topographic maps. Use of GIS-based explanatory variables represents a major departure in methodology from that described in previous reports on estimating flood-frequency characteristics of Ohio streams. Examples are presented illustrating application of the regression equations to ungaged sites on ungaged and gaged streams. A method is provided to adjust regression estimates for ungaged sites by use of weighted and regression estimates for a gaged site on the same stream. A region-of-influence method, which employs a computer program to estimate flood-frequency characteristics for ungaged sites based on data from gaged sites with similar characteristics, was also tested and compared to the GLS full-model equations. For all recurrence intervals, the GLS full-model equations had superior prediction accuracy relative to the simple equations and therefore are recommended for use.

  17. Regression Techniques for Determining the Effective Impervious Area in Southern California Watersheds

    NASA Astrophysics Data System (ADS)

    Sultana, R.; Mroczek, M.; Dallman, S.; Sengupta, A.; Stein, E. D.

    2016-12-01

    The portion of the Total Impervious Area (TIA) that is hydraulically connected to the storm drainage network is called the Effective Impervious Area (EIA). The remaining fraction of impervious area, called the non-effective impervious area, drains onto pervious surfaces and does not contribute to runoff for smaller events. Using TIA instead of EIA in models and calculations can lead to overestimates of runoff volumes and peak discharges and to oversizing of the drainage system, since it is assumed that all impervious areas produce urban runoff that is directly connected to storm drains. This makes EIA a better predictor of actual runoff from urban catchments for hydraulic design of storm drain systems and for modeling non-point source pollution. Compared to TIA, EIA is considerably more difficult to determine, since it cannot be found by using remote sensing techniques, readily available EIA datasets, or aerial imagery interpretation alone. For this study, EIA percentages were calculated by two successive regression methods for five watersheds (with areas of 8.38 to 158 mi2) located in Southern California, using rainfall-runoff event data for the years 2004 to 2007. Runoff generated by the smaller storm events is considered to emanate only from the effective impervious areas. Therefore, larger events that were considered to have runoff from both impervious and pervious surfaces were successively removed in the regression methods using criteria of (1) 1 mm and (2) max(2 standard deviations, 1 mm) above the regression line. MSE is calculated from actual runoff and runoff predicted by the regression. Analysis of standard deviations showed that the max(2 standard deviations, 1 mm) criterion better fit the regression line and is the preferred method for predicting the EIA percentage. The estimated EIAs were approximately 43% to 78% of the TIA, which shows that use of EIA instead of TIA can have a significant impact on the cost of building urban hydraulic systems and stormwater capture devices.
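
    The successive regression idea can be sketched as below: fit a rainfall-runoff regression, drop events whose runoff exceeds the fitted line by more than a criterion, and refit until no events are removed; the 1 mm criterion follows the abstract, while the synthetic data and stopping logic are assumptions.

      # Sketch of a successive rainfall-runoff regression for estimating effective
      # impervious area (EIA): events lying more than a criterion above the fitted
      # line are removed and the regression is refit until no events are dropped.
      import numpy as np

      def successive_regression(rain_mm, runoff_mm, criterion_mm=1.0):
          keep = np.ones(rain_mm.size, dtype=bool)
          while True:
              slope, intercept = np.polyfit(rain_mm[keep], runoff_mm[keep], deg=1)
              residuals = runoff_mm - (slope * rain_mm + intercept)
              drop = keep & (residuals > criterion_mm)        # events with pervious-area runoff
              if not drop.any():
                  return slope, intercept, keep               # slope approximates the EIA fraction
              keep &= ~drop

      rng = np.random.default_rng(2)
      rain = rng.uniform(2, 40, 80)
      runoff = 0.25 * rain + rng.normal(0, 0.4, 80)           # ~25% effective impervious area
      runoff[rain > 25] += rng.uniform(1, 5, (rain > 25).sum())   # large events add pervious runoff
      print(successive_regression(rain, runoff))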

  18. Estimation of flood discharges at selected annual exceedance probabilities for unregulated, rural streams in Vermont, with a section on Vermont regional skew regression

    USGS Publications Warehouse

    Olson, Scott A.; with a section by Veilleux, Andrea G.

    2014-01-01

    This report provides estimates of flood discharges at selected annual exceedance probabilities (AEPs) for streamgages in and adjacent to Vermont and equations for estimating flood discharges at AEPs of 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent (recurrence intervals of 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-years, respectively) for ungaged, unregulated, rural streams in Vermont. The equations were developed using generalized least-squares regression. Flood-frequency and drainage-basin characteristics from 145 streamgages were used in developing the equations. The drainage-basin characteristics used as explanatory variables in the regression equations include drainage area, percentage of wetland area, and the basin-wide mean of the average annual precipitation. The average standard errors of prediction for estimating the flood discharges at the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent AEP with these equations are 34.9, 36.0, 38.7, 42.4, 44.9, 47.3, 50.7, and 55.1 percent, respectively. Flood discharges at selected AEPs for streamgages were computed by using the Expected Moments Algorithm. To improve estimates of the flood discharges for given exceedance probabilities at streamgages in Vermont, a new generalized skew coefficient was developed. The new generalized skew for the region is a constant, 0.44. The mean square error of the generalized skew coefficient is 0.078. This report describes a technique for using results from the regression equations to adjust an AEP discharge computed from a streamgage record. This report also describes a technique for using a drainage-area adjustment to estimate flood discharge at a selected AEP for an ungaged site upstream or downstream from a streamgage. The final regression equations and the flood-discharge frequency data used in this study will be available in StreamStats. StreamStats is a World Wide Web application providing automated regression-equation solutions for user-selected sites on streams.

  19. Replica analysis of overfitting in regression models for time-to-event data

    NASA Astrophysics Data System (ADS)

    Coolen, A. C. C.; Barrett, J. E.; Paga, P.; Perez-Vicente, C. J.

    2017-09-01

    Overfitting, which happens when the number of parameters in a model is too large compared to the number of data points available for determining these parameters, is a serious and growing problem in survival analysis. While modern medicine presents us with data of unprecedented dimensionality, these data cannot yet be used effectively for clinical outcome prediction. Standard error measures in maximum likelihood regression, such as p-values and z-scores, are blind to overfitting, and even for Cox’s proportional hazards model (the main tool of medical statisticians), one finds in literature only rules of thumb on the number of samples required to avoid overfitting. In this paper we present a mathematical theory of overfitting in regression models for time-to-event data, which aims to increase our quantitative understanding of the problem and provide practical tools with which to correct regression outcomes for the impact of overfitting. It is based on the replica method, a statistical mechanical technique for the analysis of heterogeneous many-variable systems that has been used successfully for several decades in physics, biology, and computer science, but not yet in medical statistics. We develop the theory initially for arbitrary regression models for time-to-event data, and verify its predictions in detail for the popular Cox model.

  20. Estimation of Standard Error of Regression Effects in Latent Regression Models Using Binder's Linearization. Research Report. ETS RR-07-09

    ERIC Educational Resources Information Center

    Li, Deping; Oranje, Andreas

    2007-01-01

    Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…

  1. Breast volume assessment: comparing five different techniques.

    PubMed

    Bulstrode, N; Bellamy, E; Shrotria, S

    2001-04-01

    Breast volume assessment is not routinely performed pre-operatively because as yet there is no accepted technique. There have been a variety of methods published, but this is the first study to compare these techniques. We compared volume measurements obtained from mammograms (previously compared to mastectomy specimens) with estimates of volume obtained from four other techniques: thermoplastic moulding, magnetic resonance imaging, Archimedes principle and anatomical measurements. We also assessed the acceptability of each method to the patient. Measurements were performed on 10 women, which produced results for 20 breasts. We were able to calculate regression lines between volume measurements obtained from mammography and the other four methods: (1) magnetic resonance imaging (MRI), 379 + (0.75 MRI) [r=0.48]; (2) thermoplastic moulding, 132 + (1.46 thermoplastic moulding) [r=0.82]; (3) anatomical measurements, 168 + (1.55 anatomical measurements) [r=0.83]; (4) Archimedes principle, 359 + (0.6 Archimedes principle) [r=0.61]; all units in cc. The regression curves for the different techniques are variable and it is difficult to reliably compare results. A standard method of volume measurement should be used when comparing volumes before and after intervention or between individual patients, and it is unreliable to compare volume measurements using different methods. Calculating the breast volume from mammography has previously been compared to mastectomy samples and shown to be reasonably accurate. However, we feel thermoplastic moulding shows promise and should be further investigated as it gives not only a volume assessment but a three-dimensional impression of the breast shape, which may be valuable in assessing cosmesis following breast-conserving surgery.

  2. Bankfull characteristics of Ohio streams and their relation to peak streamflows

    USGS Publications Warehouse

    Sherwood, James M.; Huitger, Carrie A.

    2005-01-01

    Regional curves, simple-regression equations, and multiple-regression equations were developed to estimate bankfull width, bankfull mean depth, bankfull cross-sectional area, and bankfull discharge of rural, unregulated streams in Ohio. The methods are based on geomorphic, basin, and flood-frequency data collected at 50 study sites on unregulated natural alluvial streams in Ohio, of which 40 sites are near streamflow-gaging stations. The regional curves and simple-regression equations relate the bankfull characteristics to drainage area. The multiple-regression equations relate the bankfull characteristics to drainage area, main-channel slope, main-channel elevation index, median bed-material particle size, bankfull cross-sectional area, and local-channel slope. Average standard errors of prediction for bankfull width equations range from 20.6 to 24.8 percent; for bankfull mean depth, 18.8 to 20.6 percent; for bankfull cross-sectional area, 25.4 to 30.6 percent; and for bankfull discharge, 27.0 to 78.7 percent. The simple-regression (drainage-area only) equations have the highest average standard errors of prediction. The multiple-regression equations in which the explanatory variables included drainage area, main-channel slope, main-channel elevation index, median bed-material particle size, bankfull cross-sectional area, and local-channel slope have the lowest average standard errors of prediction. Field surveys were done at each of the 50 study sites to collect the geomorphic data. Bankfull indicators were identified and evaluated, cross-section and longitudinal profiles were surveyed, and bed- and bank-material were sampled. Field data were analyzed to determine various geomorphic characteristics such as bankfull width, bankfull mean depth, bankfull cross-sectional area, bankfull discharge, streambed slope, and bed- and bank-material particle-size distribution. The various geomorphic characteristics were analyzed by means of a combination of graphical and statistical techniques. The logarithms of the annual peak discharges for the 40 gaged study sites were fit by a Pearson Type III frequency distribution to develop flood-peak discharges associated with recurrence intervals of 2, 5, 10, 25, 50, and 100 years. The peak-frequency data were related to geomorphic, basin, and climatic variables by multiple-regression analysis. Simple-regression equations were developed to estimate 2-, 5-, 10-, 25-, 50-, and 100-year flood-peak discharges of rural, unregulated streams in Ohio from bankfull channel cross-sectional area. The average standard errors of prediction are 31.6, 32.6, 35.9, 41.5, 46.2, and 51.2 percent, respectively. The study and methods developed are intended to improve understanding of the relations between geomorphic, basin, and flood characteristics of streams in Ohio and to aid in the design of hydraulic structures, such as culverts and bridges, where stability of the stream and structure is an important element of the design criteria. The study was done in cooperation with the Ohio Department of Transportation and the U.S. Department of Transportation, Federal Highway Administration.

  3. Experiments to Determine Whether Recursive Partitioning (CART) or an Artificial Neural Network Overcomes Theoretical Limitations of Cox Proportional Hazards Regression

    NASA Technical Reports Server (NTRS)

    Kattan, Michael W.; Hess, Kenneth R.; Kattan, Michael W.

    1998-01-01

    New computationally intensive tools for medical survival analyses include recursive partitioning (also called CART) and artificial neural networks. A challenge that remains is to better understand the behavior of these techniques in an effort to know when they will be effective tools. Theoretically, they may overcome limitations of the traditional multivariable survival technique, the Cox proportional hazards regression model. Experiments were designed to test whether the new tools would, in practice, overcome these limitations. Two datasets in which theory suggests CART and the neural network should outperform the Cox model were selected. The first was a published leukemia dataset manipulated to have a strong interaction that CART should detect. The second was a published cirrhosis dataset with pronounced nonlinear effects that a neural network should fit. Repeated sampling of 50 training and testing subsets was applied to each technique. The concordance index C was calculated as a measure of predictive accuracy by each technique on the testing dataset. In the interaction dataset, CART outperformed Cox (P less than 0.05) with a C improvement of 0.1 (95% CI, 0.08 to 0.12). In the nonlinear dataset, the neural network outperformed the Cox model (P less than 0.05), but by a very slight amount (0.015). As predicted by theory, CART and the neural network were able to overcome limitations of the Cox model. Experiments like these are important to increase our understanding of when one of these new techniques will outperform the standard Cox model. Further research is necessary to predict which technique will do best a priori and to assess the magnitude of superiority.
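
    The concordance index used to compare the techniques can be computed with a few lines of code; the small function below implements the standard pairwise definition on invented survival data and is not the authors' evaluation code.

      # Minimal computation of the concordance index (C) used to compare predictive
      # accuracy of survival models; inputs here are synthetic, not the study data.
      import numpy as np

      def concordance_index(time, event, risk):
          """Fraction of usable pairs in which the higher-risk subject fails earlier."""
          concordant, usable = 0.0, 0
          n = len(time)
          for i in range(n):
              for j in range(n):
                  # a pair is usable if subject i failed and did so before subject j's time
                  if event[i] and time[i] < time[j]:
                      usable += 1
                      if risk[i] > risk[j]:
                          concordant += 1.0
                      elif risk[i] == risk[j]:
                          concordant += 0.5
          return concordant / usable

      time = np.array([5, 8, 12, 3, 9, 15], dtype=float)
      event = np.array([1, 1, 0, 1, 0, 1])
      risk = np.array([2.1, 1.4, 0.5, 2.8, 0.9, 0.3])     # higher = predicted higher hazard
      print(concordance_index(time, event, risk))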

  4. Membrane Introduction Mass Spectrometry Combined with an Orthogonal Partial-Least Squares Calibration Model for Mixture Analysis.

    PubMed

    Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu

    2017-01-01

    The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethylbenzene, and xylene (BTEX), but overlapping spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models, with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits good recoveries of 73.86-122.20% and repeatability relative standard deviations (RSDs) of 1.14-4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for quantitative BTEX mixture analysis in monitoring and predicting water pollution.
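
    A calibration sketch in the same spirit is shown below with scikit-learn's PLS regression (scikit-learn does not provide OPLS, so ordinary PLS stands in); the synthetic spectra, component count, and cross-validation setup are assumptions.

      # Calibration sketch with partial-least squares regression (scikit-learn has PLS
      # but not OPLS, so PLS stands in here); spectra and concentrations are synthetic.
      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import cross_val_predict
      from sklearn.metrics import mean_squared_error

      rng = np.random.default_rng(3)
      spectra = rng.normal(size=(60, 200))                # 60 mixture spectra, 200 m/z channels
      conc = rng.uniform(0, 10, size=(60, 4))             # BTEX-like 4-component concentrations
      spectra += conc @ rng.normal(size=(4, 200)) * 0.5   # embed a linear mixture signal

      pls = PLSRegression(n_components=6)
      pred_cv = cross_val_predict(pls, spectra, conc, cv=5)          # RMSECV-style check
      print("RMSECV:", mean_squared_error(conc, pred_cv) ** 0.5)
      pls.fit(spectra, conc)
      print("RMSEC :", mean_squared_error(conc, pls.predict(spectra)) ** 0.5)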

  5. DFT study on oxidation of HS(CH2)mSH (m = 1-8) in oxidative desulfurization

    NASA Astrophysics Data System (ADS)

    Song, Y. Z.; Song, J. J.; Zhao, T. T.; Chen, C. Y.; He, M.; Du, J.

    2016-06-01

    Density functional theory was employed to calculate HS(CH2)mSH (m = 1-8) and its derivatives with the B3LYP method at the 6-31++G(d,p) level. Using the HOMO and LUMO eigenvalues of HS(CH2)mSH, the standard electrode potentials were estimated by a stepwise multiple linear regression (MLR) technique and obtained as E° = 1.500 + 7.167 × 10^-3 HOMO - 0.229 LUMO, with a high correlation coefficient of 0.973 and an F value of 43.973.

  6. A comparison between standard methods and structural nested modelling when bias from a healthy worker survivor effect is suspected: an iron-ore mining cohort study.

    PubMed

    Björ, Ove; Damber, Lena; Jonsson, Håkan; Nilsson, Tohr

    2015-07-01

    Iron-ore miners are exposed to extremely dusty and physically arduous work environments. The demanding activities of mining select healthier workers with longer work histories (i.e., the healthy worker survivor effect, HWSE), and this could have a reversing effect on the exposure-response association. The objective of this study was to evaluate an iron-ore mining cohort to determine whether the effect of respirable dust was confounded by the presence of an HWSE. When an HWSE exists, standard modelling methods, such as Cox regression analysis, produce biased results. We compared results from g-estimation of accelerated failure-time modelling adjusted for HWSE with corresponding unadjusted Cox regression modelling results. For all-cause mortality when adjusting for the HWSE, cumulative exposure from respirable dust was associated with a 6% decrease of life expectancy if exposed ≥15 years, compared with never being exposed. Respirable dust continued to be associated with mortality after censoring outcomes known to be associated with dust when adjusting for the HWSE. In contrast, results based on Cox regression analysis did not support that an association was present. The adjustment for the HWSE made a difference when estimating the risk of mortality from respirable dust. The results of this study, therefore, support the recommendation that standard methods of analysis should be complemented with structural modelling analysis techniques, such as g-estimation of accelerated failure-time modelling, to adjust for the HWSE.

  7. Determination of glomerular filtration rate (GFR) from fractional renal accumulation of iodinated contrast material: a convenient and rapid single-kidney CT-GFR technique.

    PubMed

    Yuan, XiaoDong; Tang, Wei; Shi, WenWei; Yu, Libao; Zhang, Jing; Yuan, Qing; You, Shan; Wu, Ning; Ao, Guokun; Ma, Tingting

    2018-07-01

    To develop a convenient and rapid single-kidney CT-GFR technique. One hundred and twelve patients referred for multiphasic renal CT and 99mTc-DTPA renal dynamic imaging Gates-GFR measurement were prospectively included and randomly divided into two groups of 56 patients each: the training group and the validation group. On the basis of the nephrographic phase images, the fractional renal accumulation (FRA) was calculated and correlated with the Gates-GFR in the training group. From this correlation a formula was derived for single-kidney CT-GFR calculation, which was validated by a paired t test and linear regression analysis with the single-kidney Gates-GFR in the validation group. In the training group, the FRA (x-axis) correlated well (r = 0.95, p < 0.001) with single-kidney Gates-GFR (y-axis), producing a regression equation of y = 1665x + 1.5 for single-kidney CT-GFR calculation. In the validation group, the difference between the methods of single-kidney GFR measurements was 0.38 ± 5.57 mL/min (p = 0.471); the regression line is identical to the diagonal (intercept = 0 and slope = 1) (p = 0.727 and p = 0.473, respectively), with a standard deviation of residuals of 5.56 mL/min. A convenient and rapid single-kidney CT-GFR technique was presented and validated in this investigation. • The new CT-GFR method takes about 2.5 min of patient time. • The CT-GFR method demonstrated identical results to the Gates-GFR method. • The CT-GFR method is based on the fractional renal accumulation of iodinated CM. • The CT-GFR method is achieved without additional radiation dose to the patient.

  8. Techniques for Estimating the Magnitude and Frequency of Peak Flows on Small Streams in Minnesota Based on Data through Water Year 2005

    USGS Publications Warehouse

    Lorenz, David L.; Sanocki, Chris A.; Kocian, Matthew J.

    2010-01-01

    Knowledge of the peak flow of floods of a given recurrence interval is essential for regulation and planning of water resources and for design of bridges, culverts, and dams along Minnesota's rivers and streams. Statistical techniques are needed to estimate peak flow at ungaged sites because long-term streamflow records are available at relatively few places. Because of the need to have up-to-date peak-flow frequency information in order to estimate peak flows at ungaged sites, the U.S. Geological Survey (USGS) conducted a peak-flow frequency study in cooperation with the Minnesota Department of Transportation and the Minnesota Pollution Control Agency. Estimates of peak-flow magnitudes for 1.5-, 2-, 5-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals are presented for 330 streamflow-gaging stations in Minnesota and adjacent areas in Iowa and South Dakota based on data through water year 2005. The peak-flow frequency information was subsequently used in regression analyses to develop equations relating peak flows for selected recurrence intervals to various basin and climatic characteristics. Two statistically derived techniques-regional regression equation and region of influence regression-can be used to estimate peak flow on ungaged streams smaller than 3,000 square miles in Minnesota. Regional regression equations were developed for selected recurrence intervals in each of six regions in Minnesota: A (northwestern), B (north central and east central), C (northeastern), D (west central and south central), E (southwestern), and F (southeastern). The regression equations can be used to estimate peak flows at ungaged sites. The region of influence regression technique dynamically selects streamflow-gaging stations with characteristics similar to a site of interest. Thus, the region of influence regression technique allows use of a potentially unique set of gaging stations for estimating peak flow at each site of interest. Two methods of selecting streamflow-gaging stations, similarity and proximity, can be used for the region of influence regression technique. The regional regression equation technique is the preferred technique as an estimate of peak flow in all six regions for ungaged sites. The region of influence regression technique is not appropriate for regions C, E, and F because the interrelations of some characteristics of those regions do not agree with the interrelations throughout the rest of the State. Both the similarity and proximity methods for the region of influence technique can be used in the other regions (A, B, and D) to provide additional estimates of peak flow. The peak-flow-frequency estimates and basin characteristics for selected streamflow-gaging stations and regional peak-flow regression equations are included in this report.

  9. Updated Design Standards and Guidance from the What Works Clearinghouse: Regression Discontinuity Designs and Cluster Designs

    ERIC Educational Resources Information Center

    Cole, Russell; Deke, John; Seftor, Neil

    2016-01-01

    The What Works Clearinghouse (WWC) maintains design standards to identify rigorous, internally valid education research. As education researchers advance new methodologies, the WWC must revise its standards to include an assessment of the new designs. Recently, the WWC has revised standards for two emerging study designs: regression discontinuity…

  10. Regression Commonality Analysis: A Technique for Quantitative Theory Building

    ERIC Educational Resources Information Center

    Nimon, Kim; Reio, Thomas G., Jr.

    2011-01-01

    When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

  11. Protocol for Standardizing High-to-Moderate Abundance Protein Biomarker Assessments Through an MRM-with-Standard-Peptides Quantitative Approach.

    PubMed

    Percy, Andrew J; Yang, Juncong; Chambers, Andrew G; Mohammed, Yassene; Miliotis, Tasso; Borchers, Christoph H

    2016-01-01

    Quantitative mass spectrometry (MS)-based approaches are emerging as a core technology for addressing health-related queries in systems biology and in the biomedical and clinical fields. In several 'omics disciplines (proteomics included), an approach centered on selected or multiple reaction monitoring (SRM or MRM)-MS with stable isotope-labeled standards (SIS), at the protein or peptide level, has emerged as the most precise technique for quantifying and screening putative analytes in biological samples. To enable the widespread use of MRM-based protein quantitation for disease biomarker assessment studies and its ultimate acceptance for clinical analysis, the technique must be standardized to facilitate precise and accurate protein quantitation. To that end, we have developed a number of kits for assessing method/platform performance, as well as for screening proposed candidate protein biomarkers in various human biofluids. Collectively, these kits utilize a bottom-up LC-MS methodology with SIS peptides as internal standards and quantify proteins using regression analysis of standard curves. This chapter details the methodology used to quantify 192 plasma proteins of high-to-moderate abundance (covering a 6-order-of-magnitude range, from 31 mg/mL for albumin to 18 ng/mL for peroxiredoxin-2), and a 21-protein subset thereof. We also describe the application of this method to patient samples for biomarker discovery and verification studies. Additionally, we introduce our recently developed Qualis-SIS software, which is used to expedite the analysis and assessment of protein quantitation data in control and patient samples.
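
    The standard-curve regression step can be sketched as a simple linear fit of the measured light/heavy response ratio against known spiked concentrations, followed by back-calculation for an unknown; the concentrations and ratios below are invented for illustration.

      # Sketch of standard-curve quantitation: regress the light/heavy peak-area ratio
      # on known standard concentrations, then invert the fit for an unknown sample.
      # Concentrations and ratios below are invented for illustration.
      import numpy as np

      std_conc = np.array([1, 5, 10, 50, 100, 500])               # ng/mL of spiked standard
      std_ratio = np.array([0.02, 0.11, 0.21, 1.02, 2.05, 9.8])   # measured area ratios

      slope, intercept = np.polyfit(std_conc, std_ratio, deg=1)   # linear standard curve
      r = np.corrcoef(std_conc, std_ratio)[0, 1]
      print(f"curve: ratio = {slope:.4f} * conc + {intercept:.4f}, r = {r:.4f}")

      sample_ratio = 0.87
      sample_conc = (sample_ratio - intercept) / slope            # back-calculate concentration
      print(f"sample concentration ~ {sample_conc:.1f} ng/mL")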

  12. Bayesian inference for the spatio-temporal invasion of alien species.

    PubMed

    Cook, Alex; Marion, Glenn; Butler, Adam; Gibson, Gavin

    2007-08-01

    In this paper we develop a Bayesian approach to parameter estimation in a stochastic spatio-temporal model of the spread of invasive species across a landscape. To date, statistical techniques, such as logistic and autologistic regression, have outstripped stochastic spatio-temporal models in their ability to handle large numbers of covariates. Here we seek to address this problem by making use of a range of covariates describing the bio-geographical features of the landscape. Relative to regression techniques, stochastic spatio-temporal models are more transparent in their representation of biological processes. They also explicitly model temporal change, and therefore do not require the assumption that the species' distribution (or other spatial pattern) has already reached equilibrium as is often the case with standard statistical approaches. In order to illustrate the use of such techniques we apply them to the analysis of data detailing the spread of an invasive plant, Heracleum mantegazzianum, across Britain in the 20th Century using geo-referenced covariate information describing local temperature, elevation and habitat type. The use of Markov chain Monte Carlo sampling within a Bayesian framework facilitates statistical assessments of differences in the suitability of different habitat classes for H. mantegazzianum, and enables predictions of future spread to account for parametric uncertainty and system variability. Our results show that ignoring such covariate information may lead to biased estimates of key processes and implausible predictions of future distributions.

  13. Impact of ensemble learning in the assessment of skeletal maturity.

    PubMed

    Cunha, Pedro; Moura, Daniel C; Guevara López, Miguel Angel; Guerra, Conceição; Pinto, Daniela; Ramos, Isabel

    2014-09-01

    The assessment of the bone age, or skeletal maturity, is an important task in pediatrics that measures the degree of maturation of children's bones. Nowadays, there is no standard clinical procedure for assessing bone age, and the most widely used approaches are the Greulich and Pyle and the Tanner and Whitehouse methods. Computer methods have been proposed to automate the process; however, there is a lack of exploration about how to combine the features of the different parts of the hand, and how to take advantage of ensemble techniques for this purpose. This paper presents a study in which the use of ensemble techniques for improving bone age assessment is evaluated. A new computer method was developed that extracts descriptors for each joint of each finger, which are then combined using different ensemble schemes for obtaining a final bone age value. Three popular ensemble schemes are explored in this study: bagging, stacking and voting. Best results were achieved by bagging with a rule-based regression (M5P), scoring a mean absolute error of 10.16 months. Results show that ensemble techniques improve the prediction performance of most of the evaluated regression algorithms, always achieving the best or comparable-to-best results. Therefore, the success of the ensemble methods allows us to conclude that their use may improve computer-based bone age assessment, offering a scalable option for utilizing multiple regions of interest and combining their output.
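
    A bagging sketch in the same spirit is given below; scikit-learn has no M5P rule-based learner, so its default decision-tree base regressor stands in, and the descriptor matrix and bone-age values are synthetic.

      # Bagging sketch for a bone-age-style regression task; the ensemble's default
      # base learner in scikit-learn is a decision tree, standing in for M5P here.
      import numpy as np
      from sklearn.ensemble import BaggingRegressor
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(4)
      X = rng.normal(size=(300, 20))                 # stand-in for per-joint descriptors
      y = 120 + X[:, 0] * 15 + X[:, 1] * 8 + rng.normal(0, 10, 300)   # bone age in months

      bagged = BaggingRegressor(n_estimators=50, random_state=0)
      mae = -cross_val_score(bagged, X, y, cv=5, scoring="neg_mean_absolute_error")
      print("mean absolute error (months):", mae.mean())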

  14. Standardization and validation of the body weight adjustment regression equations in Olympic weightlifting.

    PubMed

    Kauhanen, Heikki; Komi, Paavo V; Häkkinen, Keijo

    2002-02-01

    The problems in comparing the performances of Olympic weightlifters arise from the fact that the relationship between body weight and weightlifting results is not linear. In the present study, this relationship was examined by using a nonparametric curve fitting technique of robust locally weighted regression (LOWESS) on relatively large data sets of the weightlifting results made in top international competitions. Power function formulas were derived from the fitted LOWESS values to represent the relationship between the 2 variables in a way that directly compares the snatch, clean-and-jerk, and total weightlifting results of a given athlete with those of the world-class weightlifters (golden standards). A residual analysis of several other parametric models derived from the initial results showed that they all experience inconsistencies, yielding either underestimation or overestimation of certain body weights. In addition, the existing handicapping formulas commonly used in normalizing the performances of Olympic weightlifters did not yield satisfactory results when applied to the present data. It was concluded that the devised formulas may provide objective means for the evaluation of the performances of male weightlifters, regardless of their body weights, ages, or performance levels.
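
    The LOWESS-then-power-function idea can be sketched as below with statsmodels; the body weights and totals are synthetic, so the fitted exponent is illustrative rather than a golden standard.

      # Sketch: LOWESS smooth of weightlifting total versus body weight, followed by a
      # power-function fit (total = a * bodyweight^b) to the smoothed values.
      # The data below are synthetic, not competition results.
      import numpy as np
      from statsmodels.nonparametric.smoothers_lowess import lowess

      rng = np.random.default_rng(5)
      bodyweight = rng.uniform(55, 110, 200)                      # kg
      total = 32 * bodyweight**0.67 + rng.normal(0, 12, 200)      # kg lifted, with noise

      smoothed = lowess(total, bodyweight, frac=0.4, return_sorted=True)
      bw_s, tot_s = smoothed[:, 0], smoothed[:, 1]

      # Fit the power function in log space: log(total) = log(a) + b * log(bodyweight)
      b, log_a = np.polyfit(np.log(bw_s), np.log(tot_s), deg=1)
      print(f"total ~ {np.exp(log_a):.2f} * bodyweight^{b:.3f}")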

  15. Predictive sparse modeling of fMRI data for improved classification, regression, and visualization using the k-support norm.

    PubMed

    Belilovsky, Eugene; Gkirtzou, Katerina; Misyrlis, Michail; Konova, Anna B; Honorio, Jean; Alia-Klein, Nelly; Goldstein, Rita Z; Samaras, Dimitris; Blaschko, Matthew B

    2015-12-01

    We explore various sparse regularization techniques for analyzing fMRI data, such as the ℓ1 norm (often called LASSO in the context of a squared loss function), elastic net, and the recently introduced k-support norm. Employing sparsity regularization allows us to handle the curse of dimensionality, a problem commonly found in fMRI analysis. In this work we consider sparse regularization in both the regression and classification settings. We perform experiments on fMRI scans from cocaine-addicted as well as healthy control subjects. We show that in many cases, use of the k-support norm leads to better predictive performance, solution stability, and interpretability as compared to other standard approaches. We additionally analyze the advantages of using the absolute loss function versus the standard squared loss which leads to significantly better predictive performance for the regularization methods tested in almost all cases. Our results support the use of the k-support norm for fMRI analysis and on the clinical side, the generalizability of the I-RISA model of cocaine addiction.
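
    A small sparse-regression sketch on synthetic high-dimensional data is shown below; scikit-learn covers the ℓ1 (LASSO) and elastic net penalties mentioned in the abstract but not the k-support norm, so the comparison here is limited to the first two.

      # Sparse regression sketch on synthetic high-dimensional data (a stand-in for
      # fMRI voxel features); only the l1 and elastic net penalties are shown.
      import numpy as np
      from sklearn.linear_model import Lasso, ElasticNet
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import r2_score

      rng = np.random.default_rng(6)
      X = rng.normal(size=(200, 2000))                 # many more features than samples
      true_w = np.zeros(2000)
      true_w[:20] = rng.normal(size=20)                # only 20 informative "voxels"
      y = X @ true_w + rng.normal(scale=0.5, size=200)

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
      for model in (Lasso(alpha=0.1), ElasticNet(alpha=0.1, l1_ratio=0.5)):
          model.fit(X_tr, y_tr)
          n_active = np.sum(model.coef_ != 0)
          print(type(model).__name__, "R2:", round(r2_score(y_te, model.predict(X_te)), 3),
                "nonzero coefs:", n_active)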

  16. Adjuvant corneal crosslinking to prevent hyperopic LASIK regression

    PubMed Central

    Aslanides, Ioannis M; Mukherjee, Achyut N

    2013-01-01

    Purpose To report the long term outcomes, safety, stability, and efficacy in a pilot series of simultaneous hyperopic laser assisted in situ keratomileusis (LASIK) and corneal crosslinking (CXL). Method A small cohort series of five eyes, with clinically suboptimal topography and/or thickness, underwent LASIK surgery with immediate riboflavin application under the flap, followed by UV light irradiation. Postoperative assessment was performed at 1, 3, 6, and 12 months, with late follow up at 4 years, and results were compared with a matched cohort that received LASIK only. Results The average age of the LASIK-CXL group was 39 years (26–46), and the average spherical equivalent hyperopic refractive error was +3.45 diopters (standard deviation 0.76; range 2.5 to 4.5). All eyes maintained refractive stability over the 4 years. There were no complications related to CXL, and topographic and clinical outcomes were as expected for standard LASIK. Conclusion This limited series suggests that simultaneous LASIK and CXL for hyperopia is safe. Outcomes of the small cohort suggest that this technique may be promising for ameliorating hyperopic regression, presumed to be biomechanical in origin, and may also address ectasia risk. PMID:23576861

  17. A comparison of four streamflow record extension techniques

    USGS Publications Warehouse

    Hirsch, Robert M.

    1982-01-01

    One approach to developing time series of streamflow, which may be used for simulation and optimization studies of water resources development activities, is to extend an existing gage record in time by exploiting the interstation correlation between the station of interest and some nearby (long-term) base station. Four methods of extension are described, and their properties are explored. The methods are regression (REG), regression plus noise (RPN), and two new methods, maintenance of variance extension types 1 and 2 (MOVE.1, MOVE.2). MOVE.1 is equivalent to a method which is widely used in psychology, biometrics, and geomorphology and which has been called by various names, e.g., 'line of organic correlation,' 'reduced major axis,' 'unique solution,' and 'equivalence line.' The methods are examined for bias and standard error of estimate of moments and order statistics, and an empirical examination is made of the preservation of historic low-flow characteristics using 50-year-long monthly records from seven streams. The REG and RPN methods are shown to have serious deficiencies as record extension techniques. MOVE.2 is shown to be marginally better than MOVE.1, according to the various comparisons of bias and accuracy.
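
    A minimal MOVE.1 (line of organic correlation) sketch is given below: the slope is set to the ratio of standard deviations so that the extended record preserves the mean and variance of the short record; the monthly flow series are synthetic.

      # MOVE.1 (line of organic correlation) sketch: extend a short record at the site
      # of interest using a concurrent long record at a base station, preserving the
      # mean and variance of the short record. Data are synthetic monthly flows.
      import numpy as np

      def move1(short_y, concurrent_x, extension_x):
          """Estimate missing y from x with slope = sign(r) * sd(y) / sd(x)."""
          r = np.corrcoef(concurrent_x, short_y)[0, 1]
          slope = np.sign(r) * short_y.std(ddof=1) / concurrent_x.std(ddof=1)
          intercept = short_y.mean() - slope * concurrent_x.mean()
          return intercept + slope * extension_x

      rng = np.random.default_rng(7)
      base_full = rng.lognormal(3.0, 0.6, 600)                          # 50 years of monthly flows
      site_short = 0.8 * base_full[:240] * rng.lognormal(0, 0.2, 240)   # 20 concurrent years
      extended = move1(site_short, base_full[:240], base_full[240:])
      print(extended.mean(), extended.std(ddof=1))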

  18. A Comparison of Four Streamflow Record Extension Techniques

    NASA Astrophysics Data System (ADS)

    Hirsch, Robert M.

    1982-08-01

    One approach to developing time series of streamflow, which may be used for simulation and optimization studies of water resources development activities, is to extend an existing gage record in time by exploiting the interstation correlation between the station of interest and some nearby (long-term) base station. Four methods of extension are described, and their properties are explored. The methods are regression (REG), regression plus noise (RPN), and two new methods, maintenance of variance extension types 1 and 2 (MOVE.1, MOVE.2). MOVE.1 is equivalent to a method which is widely used in psychology, biometrics, and geomorphology and which has been called by various names, e.g., 'line of organic correlation,' 'reduced major axis,' 'unique solution,' and 'equivalence line.' The methods are examined for bias and standard error of estimate of moments and order statistics, and an empirical examination is made of the preservation of historic low-flow characteristics using 50-year-long monthly records from seven streams. The REG and RPN methods are shown to have serious deficiencies as record extension techniques. MOVE.2 is shown to be marginally better than MOVE.1, according to the various comparisons of bias and accuracy.

  19. The Effective Dynamic Ranges for Glaucomatous Visual Field Progression With Standard Automated Perimetry and Stimulus Sizes III and V

    PubMed Central

    Zamba, Gideon K. D.; Artes, Paul H.

    2018-01-01

    Purpose It has been shown that threshold estimates below approximately 20 dB have little effect on the ability to detect visual field progression in glaucoma. We aimed to compare stimulus size V to stimulus size III, in areas of visual damage, to confirm these findings by using (1) a different dataset, (2) different techniques of progression analysis, and (3) an analysis to evaluate the effect of censoring on mean deviation (MD). Methods In the Iowa Variability in Perimetry Study, 120 glaucoma subjects were tested every 6 months for 4 years with size III SITA Standard and size V Full Threshold. Progression was determined with three complementary techniques: pointwise linear regression (PLR), permutation of PLR, and linear regression of the MD index. All analyses were repeated on "censored" datasets in which threshold estimates below a given criterion value were set to equal the criterion value. Results Our analyses confirmed previous observations that threshold estimates below 20 dB contribute much less to visual field progression than estimates above this range. These findings were broadly similar with stimulus sizes III and V. Conclusions Censoring of threshold values < 20 dB has relatively little impact on the rates of visual field progression in patients with mild to moderate glaucoma. Size V, which has lower retest variability, performs at least as well as size III for longitudinal glaucoma progression analysis and appears to have a larger useful dynamic range owing to the upper sensitivity limit being higher. PMID:29356822

  20. Computation of nonlinear least squares estimator and maximum likelihood using principles in matrix calculus

    NASA Astrophysics Data System (ADS)

    Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.

    2017-11-01

    This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However, the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to obtain a linear pseudo model for a nonlinear regression model. In this research article a new technique is developed to obtain the linear pseudo model for the nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et al. used empirical process techniques to study the asymptotics of the LSE (least-squares estimator) for the fitting of nonlinear regression functions in 2006. In Jae Myung [13] provided a good conceptual introduction to maximum likelihood estimation in his work "Tutorial on maximum likelihood estimation".
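
    For readers who want a numerical counterpart to the matrix-calculus treatment, the sketch below computes a nonlinear least-squares estimate for an exponential regression model with scipy's curve_fit; the model form and data are invented and unrelated to the paper's derivations.

      # Numerical nonlinear least-squares estimation for an exponential regression model;
      # this uses scipy's iterative solver rather than the matrix-calculus derivation in
      # the paper, and the data are synthetic.
      import numpy as np
      from scipy.optimize import curve_fit

      def model(x, a, b, c):
          return a * np.exp(-b * x) + c

      rng = np.random.default_rng(8)
      x = np.linspace(0, 5, 60)
      y = model(x, 2.5, 1.3, 0.4) + rng.normal(0, 0.05, x.size)

      params, cov = curve_fit(model, x, y, p0=(1.0, 1.0, 0.0))
      print("NLSE of (a, b, c):", params)
      print("standard errors  :", np.sqrt(np.diag(cov)))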

  1. Spatial Assessment of Model Errors from Four Regression Techniques

    Treesearch

    Lianjun Zhang; Jeffrey H. Gove; Jeffrey H. Gove

    2005-01-01

    Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...

  2. Determining Directional Dependency in Causal Associations

    PubMed Central

    Pornprasertmanit, Sunthud; Little, Todd D.

    2014-01-01

    Directional dependency is a method to determine the likely causal direction of effect between two variables. This article aims to critique and improve upon the use of directional dependency as a technique to infer causal associations. We comment on several issues raised by von Eye and DeShon (2012), including: encouraging the use of the signs of skewness and excess kurtosis of both variables, discouraging the use of D’Agostino’s K2, and encouraging the use of directional dependency to compare variables only within time points. We offer improved steps for determining directional dependency that fix the problems we note. Next, we discuss how to integrate directional dependency into longitudinal data analysis with two variables. We also examine the accuracy of directional dependency evaluations when several regression assumptions are violated. Directional dependency can suggest the direction of a relation if (a) the regression error in the population is normal, (b) any unobserved explanatory variable correlates with the observed variables at .2 or less, (c) a curvilinear relation between both variables is not strong (standardized regression coefficient ≤ .2), (d) there are no bivariate outliers, and (e) both variables are continuous. PMID:24683282

  3. Near-infrared reflectance spectroscopy predicts protein, starch, and seed weight in intact seeds of common bean ( Phaseolus vulgaris L.).

    PubMed

    Hacisalihoglu, Gokhan; Larbi, Bismark; Settles, A Mark

    2010-01-27

    The objective of this study was to explore the potential of near-infrared reflectance (NIR) spectroscopy to determine individual seed composition in common bean ( Phaseolus vulgaris L.). NIR spectra and analytical measurements of seed weight, protein, and starch were collected from 267 individual bean seeds representing 91 diverse genotypes. Partial least-squares (PLS) regression models were developed with 61 bean accessions randomly assigned to a calibration data set and 30 accessions assigned to an external validation set. Protein gave the most accurate PLS regression, with the external validation set having a standard error of prediction (SEP) = 1.6%. PLS regressions for seed weight and starch had sufficient accuracy for seed sorting applications, with SEP = 41.2 mg and 4.9%, respectively. Seed color had a clear effect on the NIR spectra, with black beans having a distinct spectral type. Seed coat color did not impact the accuracy of PLS predictions. This research demonstrates that NIR is a promising technique for simultaneous sorting of multiple seed traits in single bean seeds with no sample preparation.
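
    A minimal sketch of the calibration/external-validation PLS workflow described above, using scikit-learn; the spectra and trait values are random stand-ins, and the split sizes and component count are illustrative assumptions rather than the study's settings.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import mean_squared_error

      # Stand-in data: 267 "seeds" x 500 NIR channels and one trait (e.g. protein %).
      rng = np.random.default_rng(1)
      X = rng.normal(size=(267, 500))
      y = X[:, :10].sum(axis=1) + rng.normal(scale=0.5, size=267)

      # Random calibration / external-validation split, loosely mirroring the study design.
      X_cal, X_val, y_cal, y_val = train_test_split(X, y, test_size=0.33, random_state=0)

      pls = PLSRegression(n_components=8)        # component count is an illustrative choice
      pls.fit(X_cal, y_cal)
      y_pred = np.asarray(pls.predict(X_val)).ravel()

      sep = np.sqrt(mean_squared_error(y_val, y_pred))   # standard error of prediction (SEP)
      print(f"SEP on the external validation set: {sep:.2f}")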

  4. Monthly monsoon rainfall forecasting using artificial neural networks

    NASA Astrophysics Data System (ADS)

    Ganti, Ravikumar

    2014-10-01

    The Indian agriculture sector depends heavily on monsoon rainfall for successful harvesting. In the past, prediction of rainfall was mainly performed using regression models, which provide reasonable accuracy in the modelling and forecasting of complex physical systems. Recently, Artificial Neural Networks (ANNs) have been proposed as efficient tools for modelling and forecasting. A feed-forward multi-layer perceptron type of ANN architecture trained using the popular back-propagation algorithm was employed in this study. Other techniques investigated for modeling monthly monsoon rainfall include linear and non-linear regression models for comparison purposes. The data employed in this study include monthly rainfall and the monthly average of the daily maximum temperature in the North Central region of India. Specifically, four regression models and two ANN models were developed. The performance of the various models was evaluated using a wide variety of standard statistical parameters and scatter plots. The results obtained in this study for forecasting monsoon rainfall using ANNs have been encouraging. India's economy and agricultural activities can be managed more effectively with the help of accurate monsoon rainfall forecasts.
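
    The sketch below shows one way to set up the kind of feed-forward network described, using lagged monthly values as inputs. The synthetic rainfall series, the 12-month lag window, and scikit-learn's MLPRegressor (in place of a hand-coded back-propagation network) are all assumptions made for illustration.

      import numpy as np
      from sklearn.neural_network import MLPRegressor
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.metrics import r2_score

      # Synthetic stand-in for a monthly rainfall series (mm) with a seasonal cycle.
      rng = np.random.default_rng(2)
      months = np.arange(240)
      rain = 100 + 80 * np.sin(2 * np.pi * months / 12) + rng.normal(scale=20, size=240)

      # Use the previous 12 months as predictors of the next month.
      lags = 12
      X = np.array([rain[i:i + lags] for i in range(len(rain) - lags)])
      y = rain[lags:]
      X_train, X_test, y_train, y_test = X[:180], X[180:], y[:180], y[180:]

      ann = make_pipeline(StandardScaler(),
                          MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0))
      ann.fit(X_train, y_train)
      print("Test R^2:", r2_score(y_test, ann.predict(X_test)))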

  5. TU-G-204-01: BEST IN PHYSICS (IMAGING): Dynamic CT Myocardial Perfusion Measurement and Its Comparison to Fractional Flow Reserve

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ziemer, B; Hubbard, L; Groves, E

    2015-06-15

    Purpose: To evaluate a first pass analysis (FPA) technique for CT perfusion measurement in a swine animal and its validation using fractional flow reserve (FFR) as a reference standard. Methods: Swine were placed under anesthesia and relevant physiologic parameters were continuously recorded. Intra-coronary adenosine was administered to induce maximum hyperemia. A pressure wire was advanced distal to the first diagonal branch of the left anterior descending (LAD) artery for FFR measurements and a balloon dilation catheter was inserted over the pressure wire into the proximal LAD to create varying levels of stenosis. Images were acquired with a 320-row wide volume CT scanner. Three main coronary perfusion beds were delineated in the myocardium using arteries extracted from CT angiography images using a minimum energy hypothesis. The integrated density in the perfusion bed was used to calculate perfusion using the FPA technique. The perfusion in the LAD bed over a range of stenosis severity was measured. The measured fractional perfusion was compared to FFR and linear regression was performed. Results: The measured fractional perfusion using the FPA technique (P-FPA) and FFR were related as P-FPA = 1.06 FFR - 0.06 (r² = 0.86). The perfusion measurements were calculated with only three to five total CT volume scans, which drastically reduces the radiation dose as compared with the existing techniques requiring 15–20 volume scans. Conclusion: The measured perfusion using the first pass analysis technique showed good correlation with FFR measurements as a reference standard. The technique for perfusion measurement can potentially make a substantial reduction in radiation dose as compared with the existing techniques.

  6. Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution.

    ERIC Educational Resources Information Center

    Kromrey, Jeffrey D.; Hines, Constance V.

    1995-01-01

    The accuracy of four empirical techniques to estimate shrinkage in multiple regression was studied through Monte Carlo simulation. None of the techniques provided unbiased estimates of the population squared multiple correlation coefficient, but the normalized jackknife and bootstrap techniques demonstrated marginally acceptable performance with…

  7. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.

    2015-05-01

    The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other type of LIBS data is the set of calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results at p ≤ 0.05, and all techniques except kNN produced statistically indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
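
    For readers who want to reproduce this style of comparison, the sketch below cross-validates a few of the linear models named above (lasso, elastic net, linear SVR, PLS-1) on random stand-in "spectra" and reports PRESS. The hyperparameters, component counts, and data are placeholders, not the values used in the study.

      import numpy as np
      from sklearn.linear_model import Lasso, ElasticNet
      from sklearn.svm import LinearSVR
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.model_selection import cross_val_predict

      # Stand-in for LIBS spectra: 100 samples x 6144 channels, one oxide abundance.
      rng = np.random.default_rng(3)
      X = rng.normal(size=(100, 6144))
      y = X[:, :5].sum(axis=1) + rng.normal(scale=0.3, size=100)

      models = {
          "lasso":       make_pipeline(StandardScaler(), Lasso(alpha=0.1, max_iter=10000)),
          "elastic net": make_pipeline(StandardScaler(), ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000)),
          "SVR-Lin":     make_pipeline(StandardScaler(), LinearSVR(C=1.0, max_iter=10000)),
          "PLS-1":       PLSRegression(n_components=5),
      }

      for name, model in models.items():
          pred = np.asarray(cross_val_predict(model, X, y, cv=5)).ravel()
          press = np.sum((y - pred) ** 2)        # predicted residual sum of squares
          print(f"{name:12s} PRESS = {press:.1f}")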

  8. Deriving Global Convection Maps From SuperDARN Measurements

    NASA Astrophysics Data System (ADS)

    Gjerloev, J. W.; Waters, C. L.; Barnes, R. J.

    2018-04-01

    A new statistical modeling technique for determining the global ionospheric convection is described. The principal component regression (PCR)-based technique is based on Super Dual Auroral Radar Network (SuperDARN) observations and is an advanced version of the PCR technique that Waters et al. (https://doi.org/10.1002/2015JA021596) used for the SuperMAG data. While SuperMAG ground magnetic field perturbations are vector measurements, SuperDARN provides line-of-sight measurements of the ionospheric convection flow. Each line-of-sight flow has a known azimuth (or direction), which must be converted into the actual vector flow. However, the component perpendicular to the azimuth direction is unknown. Our method uses historical data from the SuperDARN database and PCR to determine a fill-in model convection distribution for any given universal time. The fill-in data process is driven by a list of state descriptors (magnetic indices and the solar zenith angle). The final solution is then derived from a spherical cap harmonic fit to the SuperDARN measurements and the fill-in model. When compared with the standard SuperDARN fill-in model, we find that our fill-in model provides improved solutions, and the final solutions are in better agreement with the SuperDARN measurements. Our solutions are far less dynamic than the standard SuperDARN solutions, which we interpret as being due to a lack of magnetosphere-ionosphere inertia and communication delays in the standard SuperDARN technique, whereas these are inherently included in our approach. We argue that the magnetosphere-ionosphere system has inertia that prevents the global convection from changing abruptly in response to an interplanetary magnetic field change.

  9. Reduction of interferences in graphite furnace atomic absorption spectrometry by multiple linear regression modelling

    NASA Astrophysics Data System (ADS)

    Grotti, Marco; Abelmoschi, Maria Luisa; Soggia, Francesco; Tiberiade, Christian; Frache, Roberto

    2000-12-01

    The multivariate effects of Na, K, Mg and Ca as nitrates on the electrothermal atomisation of manganese, cadmium and iron were studied by multiple linear regression modelling. Since the models proved to efficiently predict the effects of the considered matrix elements in a wide range of concentrations, they were applied to correct the interferences occurring in the determination of trace elements in seawater after pre-concentration of the analytes. In order to obtain a statistically significant number of samples, a large volume of the certified seawater reference materials CASS-3 and NASS-3 was treated with Chelex-100 resin; then, the chelating resin was separated from the solution, divided into several sub-samples, each of them was eluted with nitric acid and analysed by electrothermal atomic absorption spectrometry (for trace element determinations) and inductively coupled plasma optical emission spectrometry (for matrix element determinations). To minimise any other systematic error besides that due to matrix effects, accuracy of the pre-concentration step and contamination levels of the procedure were checked by inductively coupled plasma mass spectrometric measurements. Analytical results obtained by applying the multiple linear regression models were compared with those obtained with other calibration methods, such as external calibration using acid-based standards, external calibration using matrix-matched standards and the analyte addition technique. Empirical models proved to efficiently reduce interferences occurring in the analysis of real samples, allowing an improvement of accuracy better than for other calibration methods.

  10. Quantifying the uncertainty of regional and national estimates of soil carbon stocks

    NASA Astrophysics Data System (ADS)

    Papritz, Andreas

    2013-04-01

    At regional and national scales, carbon (C) stocks are frequently estimated by means of regression models. Such statistical models link measurements of carbon stocks, recorded for a set of soil profiles or soil cores, to covariates that characterize soil formation conditions and land management. A prerequisite is that these covariates are available for any location within a region of interest G because they are used along with the fitted regression coefficients to predict the carbon stocks at the nodes of a fine-meshed grid that is laid over G. The mean C stock in G is then estimated by the arithmetic mean of the stock predictions for the grid nodes. Apart from the mean stock, the precision of the estimate is often also of interest, for example to judge whether the mean C stock has changed significantly between two inventories. The standard error of the estimated mean stock in G can be computed from the regression results as well. Two issues are thereby important: (i) How large is the area of G relative to the support of the measurements? (ii) Are the residuals of the regression model spatially auto-correlated or is the assumption of statistical independence tenable? Both issues are correctly handled if one adopts a geostatistical block kriging approach for estimating the mean C stock within a region and its standard error. In the presentation I shall summarize the main ideas of external drift block kriging. To compute the standard error of the mean stock, one has in principle to sum the elements of a potentially very large covariance matrix of point prediction errors, but I shall show that the required term can be approximated very well by Monte Carlo techniques. I shall further illustrate with a few examples how the standard error of the mean stock estimate changes with the size of G and with the strength of the auto-correlation of the regression residuals. As an application, a robust variant of block kriging is used to quantify the mean carbon stock stored in the soils of Swiss forests (Nussbaum et al., 2012). Nussbaum, M., Papritz, A., Baltensweiler, A., and Walthert, L. (2012). Organic carbon stocks of Swiss forest soils. Final report, Institute of Terrestrial Ecosystems, ETH Zürich and Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), pp. 51, http://e-collection.library.ethz.ch/eserv/eth:6027/eth-6027-01.pdf
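
    The Monte Carlo shortcut mentioned for the standard error of the regional mean can be illustrated in a few lines: the variance of the mean prediction error equals the average of all elements of the point-error covariance matrix, and that average can be estimated from randomly sampled node pairs rather than the full matrix. The exponential covariance, range, and grid below are invented for illustration and are not from the cited work.

      import numpy as np

      rng = np.random.default_rng(9)

      # Invented stand-in for the covariance of point prediction errors on a grid over G.
      n = 2000                                         # number of grid nodes
      coords = rng.uniform(0, 100, size=(n, 2))

      def cov(i, j):
          # Exponential covariance of prediction errors between nodes i and j (illustrative).
          d = np.linalg.norm(coords[i] - coords[j], axis=-1)
          return 1.0 * np.exp(-d / 20.0)

      # Var(mean error) = average over all n*n covariance elements; estimate it by
      # averaging the covariance over randomly sampled pairs of nodes instead.
      pairs = rng.integers(0, n, size=(200_000, 2))
      var_mean = cov(pairs[:, 0], pairs[:, 1]).mean()
      print("Approximate standard error of the regional mean:", np.sqrt(var_mean))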

  11. Prediction of heat capacities of solid inorganic salts from group contributions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mostafa, A.T.M.G.; Eakman, J.M.; Yarbro, S.L.

    1997-01-01

    A group contribution technique is proposed to predict the coefficients in the heat capacity correlation, Cp = a + bT + c/T² + dT², for solid inorganic salts. The results from this work are compared with fits to experimental data from the literature. It is shown to give good predictions for both simple and complex solid inorganic salts. Literature heat capacities for a large number (664) of solid inorganic salts covering a broad range of cations (129), anions (17) and ligands (2) have been used in regressions to obtain group contributions for the parameters in the heat capacity temperature function. A mean error of 3.18% is found when predicted values are compared with literature values for heat capacity at 298 K. Estimates of the error standard deviation from the regression for each additivity constant are also determined.

  12. Quantitative analysis of aircraft multispectral-scanner data and mapping of water-quality parameters in the James River in Virginia

    NASA Technical Reports Server (NTRS)

    Johnson, R. W.; Bahn, G. S.

    1977-01-01

    Statistical analysis techniques were applied to develop quantitative relationships between in situ river measurements and the remotely sensed data that were obtained over the James River in Virginia on 28 May 1974. The remotely sensed data were collected with a multispectral scanner and with photographs taken from an aircraft platform. Concentration differences among water quality parameters such as suspended sediment, chlorophyll a, and nutrients indicated significant spectral variations. Calibrated equations from the multiple regression analysis were used to develop maps that indicated the quantitative distributions of water quality parameters and the dispersion characteristics of a pollutant plume entering the turbid river system. Results from further analyses that use only three preselected multispectral scanner bands of data indicated that regression coefficients and standard errors of estimate were not appreciably degraded compared with results from the 10-band analysis.

  13. Engine With Regression and Neural Network Approximators Designed

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Hopkins, Dale A.

    2001-01-01

    At the NASA Glenn Research Center, the NASA engine performance program (NEPP, ref. 1) and the design optimization testbed COMETBOARDS (ref. 2) with regression and neural network analysis-approximators have been coupled to obtain a preliminary engine design methodology. The solution to a high-bypass-ratio subsonic waverotor-topped turbofan engine, which is shown in the preceding figure, was obtained by the simulation depicted in the following figure. This engine is made of 16 components mounted on two shafts with 21 flow stations. The engine is designed for a flight envelope with 47 operating points. The design optimization utilized both neural network and regression approximations, along with the cascade strategy (ref. 3). The cascade used three algorithms in sequence: the method of feasible directions, the sequence of unconstrained minimizations technique, and sequential quadratic programming. The normalized optimum thrusts obtained by the three methods are shown in the following figure: the cascade algorithm with regression approximation is represented by a triangle, a circle is shown for the neural network solution, and a solid line indicates original NEPP results. The solutions obtained from both approximate methods lie within one standard deviation of the benchmark solution for each operating point. The simulation improved the maximum thrust by 5 percent. The performance of the linear regression and neural network methods as alternate engine analyzers was found to be satisfactory for the analysis and operation optimization of air-breathing propulsion engines (ref. 4).

  14. Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity

    PubMed Central

    Kraha, Amanda; Turner, Heather; Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K.

    2012-01-01

    While multicollinearity may increase the difficulty of interpreting multiple regression (MR) results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret MR effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret MR effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses. PMID:22457655

  15. Tools to support interpreting multiple regression in the face of multicollinearity.

    PubMed

    Kraha, Amanda; Turner, Heather; Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K

    2012-01-01

    While multicollinearity may increase the difficulty of interpreting multiple regression (MR) results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret MR effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret MR effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses.

  16. A comparative study between nonlinear regression and nonparametric approaches for modelling Phalaris paradoxa seedling emergence

    USDA-ARS?s Scientific Manuscript database

    Parametric non-linear regression (PNR) techniques commonly are used to develop weed seedling emergence models. Such techniques, however, require statistical assumptions that are difficult to meet. To examine and overcome these limitations, we compared PNR with a nonparametric estimation technique. F...

  17. Visual field progression with frequency-doubling matrix perimetry and standard automated perimetry in patients with glaucoma and in healthy controls.

    PubMed

    Redmond, Tony; O'Leary, Neil; Hutchison, Donna M; Nicolela, Marcelo T; Artes, Paul H; Chauhan, Balwantray C

    2013-12-01

    A new analysis method called permutation of pointwise linear regression measures the significance of deterioration over time at each visual field location, combines the significance values into an overall statistic, and then determines the likelihood of change in the visual field. Because the outcome is a single P value, individualized to that specific visual field and independent of the scale of the original measurement, the method is well suited for comparing techniques with different stimuli and scales. To test the hypothesis that frequency-doubling matrix perimetry (FDT2) is more sensitive than standard automated perimetry (SAP) in identifying visual field progression in glaucoma. Patients with open-angle glaucoma and healthy controls were examined by FDT2 and SAP, both with the 24-2 test pattern, on the same day at 6-month intervals in a longitudinal prospective study conducted in a hospital-based setting. Only participants with at least 5 examinations were included. Data were analyzed with permutation of pointwise linear regression. Permutation of pointwise linear regression is individualized to each participant, in contrast to current analyses in which the statistical significance is inferred from population-based approaches. Analyses were performed with both total deviation and pattern deviation. Sixty-four patients and 36 controls were included in the study. The median age, SAP mean deviation, and follow-up period were 65 years, -2.6 dB, and 5.4 years, respectively, in patients and 62 years, +0.4 dB, and 5.2 years, respectively, in controls. Using total deviation analyses, statistically significant deterioration was identified in 17% of patients with FDT2, in 34% of patients with SAP, and in 14% of patients with both techniques; in controls these percentages were 8% with FDT2, 31% with SAP, and 8% with both. Using pattern deviation analyses, statistically significant deterioration was identified in 16% of patients with FDT2, in 17% of patients with SAP, and in 3% of patients with both techniques; in controls these values were 3% with FDT2 and none with SAP. No evidence was found that FDT2 is more sensitive than SAP in identifying visual field deterioration. In about one-third of healthy controls, age-related deterioration with SAP reached statistical significance.

  18. Estimating sensible heat flux in agricultural screenhouses by the flux-variance and half-order time derivative methods

    NASA Astrophysics Data System (ADS)

    Achiman, Ori; Mekhmandarov, Yonatan; Pirkner, Moran; Tanny, Josef

    2016-04-01

    Previous studies have established that the eddy covariance (EC) technique is reliable for whole canopy flux measurements in agricultural crops covered by porous screens, i.e., screenhouses. Nevertheless, the eddy covariance technique remains difficult to apply on the farm due to costs, operational complexity, and post-processing of data - thereby inviting alternative techniques to be developed. The subject of this research was estimating the sensible heat flux by two turbulent transport techniques, namely, Flux-Variance (FV) and Half-order Time Derivative (HTD), whose instrumentation needs and operational demands are not as elaborate as the EC. The FV is based on the standard deviation of high frequency temperature measurements and a similarity constant CT. The HTD method requires mean air temperature and air velocity data. Measurements were carried out in two types of screenhouses: (i) a banana plantation in a light shading (8%) screenhouse; (ii) a pepper crop in a dense insect-proof (50-mesh) screenhouse. In each screenhouse an EC system was deployed for reference and high frequency air temperature measurements were conducted using miniature thermocouples installed at several levels to identify the optimal measurement height. Quality control analysis showed that turbulence development and flow stationarity conditions in the two structures were suitable for flux measurements by the EC technique. Energy balance closure slopes in the two screenhouses were larger than 0.71, in agreement with results for open fields. Regressions between sensible heat flux measured by EC and estimated by FV yielded CT values that were usually larger than 1, the typical open-field value. In both shading and insect-proof screenhouses the CT value generally increased with height. The optimal measurement height, defined as the height with maximum R2 of the regression between EC and FV sensible heat fluxes, was just above the screen. The CT value at the optimal height was 2.64 and 1.52 for the shading and insect-proof screenhouses, respectively, with R2 = 0.73 in both types of structures. FV data analysis of the temperature signal at frequencies lower than 10 Hz showed that R2 of these regressions was insensitive to the data analysis frequency up to 0.5 Hz. This suggests that turbulent transport in the screenhouses was governed by large scale vortices. Regressions between EC and HTD sensible heat fluxes yielded R2 values that decreased slightly with height and ranged between 0.3 and 0.4 for both screenhouses. The regression slopes also decreased with height and had values between 0.4 and 0.6. We conclude that in screenhouses the FV technique provides a more reliable estimate of the sensible heat flux than the HTD; however, the latter is simpler and more robust in terms of equipment, operation and data analysis and hence may be more attainable for day-to-day use by the growers.

  19. Predicting School Enrollments Using the Modified Regression Technique.

    ERIC Educational Resources Information Center

    Grip, Richard S.; Young, John W.

    This report is based on a study in which a regression model was constructed to increase accuracy in enrollment predictions. A model, known as the Modified Regression Technique (MRT), was used to examine K-12 enrollment over the past 20 years in 2 New Jersey school districts of similar size and ethnicity. To test the model's accuracy, MRT was…

  20. What Are the Odds of that? A Primer on Understanding Logistic Regression

    ERIC Educational Resources Information Center

    Huang, Francis L.; Moon, Tonya R.

    2013-01-01

    The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…

  1. SU-E-T-497: Semi-Automated in Vivo Radiochromic Film Dosimetry Using a Novel Image Processing Algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reyhan, M; Yue, N

    Purpose: To validate an automated image processing algorithm designed to detect the center of radiochromic film used for in vivo film dosimetry against the current gold standard of manual selection. Methods: An image processing algorithm was developed to automatically select the region of interest (ROI) in *.tiff images that contain multiple pieces of radiochromic film (0.5 x 1.3 cm²). After a user has linked a calibration file to the processing algorithm and selected a *.tiff file for processing, an ROI is automatically detected for all films by a combination of thresholding and erosion, which removes edges and any additional markings for orientation. Calibration is applied to the mean pixel values from the ROIs and a *.tiff image is output displaying the original image with an overlay of the ROIs and the measured doses. Validation of the algorithm was determined by comparing in vivo dose determined using the current gold standard (manually drawn ROIs) versus automated ROIs for n=420 scanned films. Bland-Altman analysis, paired t-test, and linear regression were performed to demonstrate agreement between the processes. Results: The measured doses ranged from 0.2-886.6 cGy. Bland-Altman analysis of the two techniques (automatic minus manual) revealed a bias of -0.28 cGy and a 95% confidence interval of (-6.1 cGy, 5.5 cGy). These values demonstrate excellent agreement between the two techniques. Paired t-test results showed no statistical differences between the two techniques, p=0.98. Linear regression with a forced zero intercept demonstrated that Automatic=0.997*Manual, with a Pearson correlation coefficient of 0.999. The minimal differences between the two techniques may be explained by the fact that the hand drawn ROIs were not identical to the automatically selected ones. The average processing time was 6.7 seconds in Matlab on an Intel Core 2 Duo processor. Conclusion: An automated image processing algorithm has been developed and validated, which will help minimize user interaction and processing time of radiochromic film used for in vivo dosimetry.

  2. The Geometry of Enhancement in Multiple Regression

    ERIC Educational Resources Information Center

    Waller, Niels G.

    2011-01-01

    In linear multiple regression, "enhancement" is said to occur when R² = b′r > r′r, where b is a p x 1 vector of standardized regression coefficients and r is a p x 1 vector of correlations between a criterion y and a set of standardized regressors, x. When p = 1 then b ≅ r and…

  3. Logistic regression for risk factor modelling in stuttering research.

    PubMed

    Reed, Phil; Wu, Yaqionq

    2013-06-01

    To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will be able to: (a) summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) follow the steps in performing a logistic regression analysis; (c) describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.
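
    A small worked example of the kind of risk-factor model the article discusses, using statsmodels to obtain coefficients and odds ratios; the outcome (persistence vs. recovery), the two predictors, and the data are entirely hypothetical.

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm

      # Hypothetical data: persistence (1) vs recovery (0) with two illustrative risk factors.
      rng = np.random.default_rng(4)
      n = 200
      family_history = rng.integers(0, 2, size=n)
      age_at_onset = rng.normal(3.5, 1.0, size=n)
      logit = -2.0 + 1.2 * family_history + 0.4 * age_at_onset
      persist = rng.binomial(1, 1 / (1 + np.exp(-logit)))

      X = sm.add_constant(pd.DataFrame({"family_history": family_history,
                                        "age_at_onset": age_at_onset}))
      model = sm.Logit(persist, X).fit(disp=0)
      print(model.summary())
      print("Odds ratios:", np.exp(model.params))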

  4. Method for obtaining electron energy-density functions from Langmuir-probe data using a card-programmable calculator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Longhurst, G.R.

    This paper presents a method for obtaining electron energy density functions from Langmuir probe data taken in cool, dense plasmas where thin-sheath criteria apply and where magnetic effects are not severe. Noise is filtered out by using regression of orthogonal polynomials. The method requires only a programmable calculator (TI-59 or equivalent) to implement and can be used for the most general, nonequilibrium electron energy distribution plasmas. Data from a mercury ion source analyzed using this method are presented and compared with results for the same data using standard numerical techniques.
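
    The noise-filtering idea, regression on orthogonal polynomials, is easy to reproduce with modern tools (no programmable calculator required). The sketch below fits a Legendre-polynomial basis to a synthetic probe characteristic; the curve, noise level, and polynomial degree are illustrative assumptions, not the paper's data.

      import numpy as np

      # Synthetic noisy "probe characteristic": a smooth curve plus measurement noise.
      rng = np.random.default_rng(5)
      v = np.linspace(-1.0, 1.0, 101)                  # bias voltage, rescaled to [-1, 1]
      current = np.exp(2.0 * v) + rng.normal(scale=0.05, size=v.size)

      # Least-squares fit in an orthogonal (Legendre) polynomial basis filters the noise.
      degree = 6                                       # illustrative choice
      coeffs = np.polynomial.legendre.legfit(v, current, degree)
      smoothed = np.polynomial.legendre.legval(v, coeffs)

      # Derivatives needed for energy-distribution work can then be taken analytically.
      d2_coeffs = np.polynomial.legendre.legder(coeffs, 2)
      second_derivative = np.polynomial.legendre.legval(v, d2_coeffs)
      print(smoothed[:5], second_derivative[:5])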

  5. Competing risks models and time-dependent covariates

    PubMed Central

    Barnett, Adrian; Graves, Nick

    2008-01-01

    New statistical models for analysing survival data in an intensive care unit context have recently been developed. Two models that offer significant advantages over standard survival analyses are competing risks models and multistate models. Wolkewitz and colleagues used a competing risks model to examine survival times for nosocomial pneumonia and mortality. Their model was able to incorporate time-dependent covariates and so examine how risk factors that changed with time affected the chances of infection or death. We briefly explain how an alternative modelling technique (using logistic regression) can more fully exploit time-dependent covariates for this type of data. PMID:18423067

  6. Model averaging and muddled multimodel inferences.

    PubMed

    Cade, Brian S

    2015-09-01

    Three flawed practices associated with model averaging coefficients for predictor variables in regression models commonly occur when making multimodel inferences in analyses of ecological data. Model-averaged regression coefficients based on Akaike information criterion (AIC) weights have been recommended for addressing model uncertainty but they are not valid, interpretable estimates of partial effects for individual predictors when there is multicollinearity among the predictor variables. Multicollinearity implies that the scaling of units in the denominators of the regression coefficients may change across models such that neither the parameters nor their estimates have common scales, therefore averaging them makes no sense. The associated sums of AIC model weights recommended to assess relative importance of individual predictors are really a measure of relative importance of models, with little information about contributions by individual predictors compared to other measures of relative importance based on effects size or variance reduction. Sometimes the model-averaged regression coefficients for predictor variables are incorrectly used to make model-averaged predictions of the response variable when the models are not linear in the parameters. I demonstrate the issues with the first two practices using the college grade point average example extensively analyzed by Burnham and Anderson. I show how partial standard deviations of the predictor variables can be used to detect changing scales of their estimates with multicollinearity. Standardizing estimates based on partial standard deviations for their variables can be used to make the scaling of the estimates commensurate across models, a necessary but not sufficient condition for model averaging of the estimates to be sensible. A unimodal distribution of estimates and valid interpretation of individual parameters are additional requisite conditions. The standardized estimates or equivalently the t statistics on unstandardized estimates also can be used to provide more informative measures of relative importance than sums of AIC weights. Finally, I illustrate how seriously compromised statistical interpretations and predictions can be for all three of these flawed practices by critiquing their use in a recent species distribution modeling technique developed for predicting Greater Sage-Grouse (Centrocercus urophasianus) distribution in Colorado, USA. These model averaging issues are common in other ecological literature and ought to be discontinued if we are to make effective scientific contributions to ecological knowledge and conservation of natural resources.

  7. Model averaging and muddled multimodel inferences

    USGS Publications Warehouse

    Cade, Brian S.

    2015-01-01

    Three flawed practices associated with model averaging coefficients for predictor variables in regression models commonly occur when making multimodel inferences in analyses of ecological data. Model-averaged regression coefficients based on Akaike information criterion (AIC) weights have been recommended for addressing model uncertainty but they are not valid, interpretable estimates of partial effects for individual predictors when there is multicollinearity among the predictor variables. Multicollinearity implies that the scaling of units in the denominators of the regression coefficients may change across models such that neither the parameters nor their estimates have common scales, therefore averaging them makes no sense. The associated sums of AIC model weights recommended to assess relative importance of individual predictors are really a measure of relative importance of models, with little information about contributions by individual predictors compared to other measures of relative importance based on effects size or variance reduction. Sometimes the model-averaged regression coefficients for predictor variables are incorrectly used to make model-averaged predictions of the response variable when the models are not linear in the parameters. I demonstrate the issues with the first two practices using the college grade point average example extensively analyzed by Burnham and Anderson. I show how partial standard deviations of the predictor variables can be used to detect changing scales of their estimates with multicollinearity. Standardizing estimates based on partial standard deviations for their variables can be used to make the scaling of the estimates commensurate across models, a necessary but not sufficient condition for model averaging of the estimates to be sensible. A unimodal distribution of estimates and valid interpretation of individual parameters are additional requisite conditions. The standardized estimates or equivalently the t statistics on unstandardized estimates also can be used to provide more informative measures of relative importance than sums of AIC weights. Finally, I illustrate how seriously compromised statistical interpretations and predictions can be for all three of these flawed practices by critiquing their use in a recent species distribution modeling technique developed for predicting Greater Sage-Grouse (Centrocercus urophasianus) distribution in Colorado, USA. These model averaging issues are common in other ecological literature and ought to be discontinued if we are to make effective scientific contributions to ecological knowledge and conservation of natural resources.

  8. Seasonal forecasting of high wind speeds over Western Europe

    NASA Astrophysics Data System (ADS)

    Palutikof, J. P.; Holt, T.

    2003-04-01

    As financial losses associated with extreme weather events escalate, there is interest from end users in the forestry and insurance industries, for example, in the development of seasonal forecasting models with a long lead time. This study uses exceedances of the 90th, 95th, and 99th percentiles of daily maximum wind speed over the period 1958 to present to derive predictands of winter wind extremes. The source data is the 6-hourly NCEP Reanalysis gridded surface wind field. Predictor variables include principal components of Atlantic sea surface temperature and several indices of climate variability, including the NAO and SOI. Lead times of up to a year are considered, in monthly increments. Three regression techniques are evaluated: multiple linear regression (MLR), principal component regression (PCR), and partial least squares regression (PLS). PCR and PLS proved considerably superior to MLR with much lower standard errors. PLS was chosen to formulate the predictive model since it offers more flexibility in experimental design and gave slightly better results than PCR. The results indicate that winter windiness can be predicted with considerable skill one year ahead for much of coastal Europe, but that this deteriorates rapidly in the hinterland. The experiment succeeded in highlighting PLS as a very useful method for developing more precise forecasting models, and in identifying areas of high predictability.

  9. Classification of sodium MRI data of cartilage using machine learning.

    PubMed

    Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R

    2015-11-01

    To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.

  10. Advanced statistics: linear regression, part I: simple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
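
    A compact numerical illustration of the method of least squares reviewed in the article; the small dose-response style dataset is invented for the example.

      import numpy as np

      # Invented example: dose (mg) of a drug versus change in heart rate (beats/min).
      x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
      y = np.array([1.2, 3.9, 6.1, 8.2, 9.8, 12.1])

      # Method of least squares: slope = Sxy / Sxx, intercept = ybar - slope * xbar.
      x_bar, y_bar = x.mean(), y.mean()
      slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
      intercept = y_bar - slope * x_bar
      print(f"Fitted regression line: y = {intercept:.2f} + {slope:.2f} x")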

  11. Estimation of stature from the foot and its segments in a sub-adult female population of North India

    PubMed Central

    2011-01-01

    Background Establishing personal identity is one of the main concerns in forensic investigations. Estimation of stature forms a basic domain of the investigation process in unknown and co-mingled human remains in forensic anthropology case work. The objective of the present study was to set up standards for estimation of stature from the foot and its segments in a sub-adult female population. Methods The sample for the study constituted 149 young females from the Northern part of India. The participants were aged between 13 and 18 years. Besides stature, seven anthropometric measurements that included length of the foot from each toe (T1, T2, T3, T4, and T5 respectively), foot breadth at ball (BBAL) and foot breadth at heel (BHEL) were measured on both feet in each participant using standard methods and techniques. Results The results indicated that statistically significant differences (p < 0.05) between left and right feet occur in both the foot breadth measurements (BBAL and BHEL). Foot length measurements (T1 to T5 lengths) did not show any statistically significant bilateral asymmetry. The correlation between stature and all the foot measurements was found to be positive and statistically significant (p-value < 0.001). Linear regression models and multiple regression models were derived for estimation of stature from the measurements of the foot. The present study indicates that anthropometric measurements of foot and its segments are valuable in the estimation of stature. Foot length measurements estimate stature with greater accuracy when compared to foot breadth measurements. Conclusions The present study concluded that foot measurements have a strong relationship with stature in the sub-adult female population of North India. Hence, the stature of an individual can be successfully estimated from the foot and its segments using different regression models derived in the study. The regression models derived in the study may be applied successfully for the estimation of stature in sub-adult females, whenever foot remains are brought for forensic examination. Stepwise multiple regression models tend to estimate stature more accurately than linear regression models in female sub-adults. PMID:22104433

  12. Estimation of stature from the foot and its segments in a sub-adult female population of North India.

    PubMed

    Krishan, Kewal; Kanchan, Tanuj; Passi, Neelam

    2011-11-21

    Establishing personal identity is one of the main concerns in forensic investigations. Estimation of stature forms a basic domain of the investigation process in unknown and co-mingled human remains in forensic anthropology case work. The objective of the present study was to set up standards for estimation of stature from the foot and its segments in a sub-adult female population. The sample for the study constituted 149 young females from the Northern part of India. The participants were aged between 13 and 18 years. Besides stature, seven anthropometric measurements that included length of the foot from each toe (T1, T2, T3, T4, and T5 respectively), foot breadth at ball (BBAL) and foot breadth at heel (BHEL) were measured on both feet in each participant using standard methods and techniques. The results indicated that statistically significant differences (p < 0.05) between left and right feet occur in both the foot breadth measurements (BBAL and BHEL). Foot length measurements (T1 to T5 lengths) did not show any statistically significant bilateral asymmetry. The correlation between stature and all the foot measurements was found to be positive and statistically significant (p-value < 0.001). Linear regression models and multiple regression models were derived for estimation of stature from the measurements of the foot. The present study indicates that anthropometric measurements of foot and its segments are valuable in the estimation of stature. Foot length measurements estimate stature with greater accuracy when compared to foot breadth measurements. The present study concluded that foot measurements have a strong relationship with stature in the sub-adult female population of North India. Hence, the stature of an individual can be successfully estimated from the foot and its segments using different regression models derived in the study. The regression models derived in the study may be applied successfully for the estimation of stature in sub-adult females, whenever foot remains are brought for forensic examination. Stepwise multiple regression models tend to estimate stature more accurately than linear regression models in female sub-adults.

  13. Validity of linear measurements of the jaws using ultralow-dose MDCT and the iterative techniques of ASIR and MBIR.

    PubMed

    Al-Ekrish, Asma'a A; Al-Shawaf, Reema; Schullian, Peter; Al-Sadhan, Ra'ed; Hörmann, Romed; Widmann, Gerlig

    2016-10-01

    To assess the comparability of linear measurements of dental implant sites recorded from multidetector computed tomography (MDCT) images obtained using standard-dose filtered backprojection (FBP) technique with those from various ultralow doses combined with FBP, adaptive statistical iterative reconstruction (ASIR), and model-based iterative reconstruction (MBIR) techniques. The results of the study may contribute to MDCT dose optimization for dental implant site imaging. MDCT scans of two cadavers were acquired using a standard reference protocol and four ultralow-dose test protocols (TP). The volume CT dose index of the different dose protocols ranged from a maximum of 30.48-36.71 mGy to a minimum of 0.44-0.53 mGy. All scans were reconstructed using FBP, ASIR-50, ASIR-100, and MBIR, and either a bone or standard reconstruction kernel. Linear measurements were recorded from standardized images of the jaws by two examiners. Intra- and inter-examiner reliability of the measurements were analyzed using Cronbach's alpha and inter-item correlation. Agreement between the measurements obtained with the reference-dose/FBP protocol and each of the test protocols was determined with Bland-Altman plots and linear regression. Statistical significance was set at a P-value of 0.05. No systematic variation was found between the linear measurements obtained with the reference protocol and the other imaging protocols. The only exceptions were TP3/ASIR-50 (bone kernel) and TP4/ASIR-100 (bone and standard kernels). The mean measurement differences between these three protocols and the reference protocol were within ±0.1 mm, with the 95 % confidence interval limits being within the range of ±1.15 mm. A nearly 97.5 % reduction in dose did not significantly affect the height and width measurements of edentulous jaws regardless of the reconstruction algorithm used.

  14. Evaluation of linear regression techniques for atmospheric applications: the importance of appropriate weighting

    NASA Astrophysics Data System (ADS)

    Wu, Cheng; Zhen Yu, Jian

    2018-03-01

    Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low R2 XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If the a priori error in one of the variables is unknown, or the stated measurement error cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
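
    As a concrete reference point, the sketch below implements Deming regression from its standard closed-form solution; it is not the authors' Igor Pro program. λ is taken here as the ratio of the error variance in y to that in x (conventions for this ratio differ between references), and the data are synthetic.

      import numpy as np

      def deming(x, y, lam=1.0):
          # Deming regression slope and intercept. lam is the ratio of the error
          # variance in y to the error variance in x; lam = 1 gives orthogonal
          # regression, while ordinary least squares ignores the x errors entirely.
          x_bar, y_bar = x.mean(), y.mean()
          s_xx = np.sum((x - x_bar) ** 2)
          s_yy = np.sum((y - y_bar) ** 2)
          s_xy = np.sum((x - x_bar) * (y - y_bar))
          slope = (s_yy - lam * s_xx +
                   np.sqrt((s_yy - lam * s_xx) ** 2 + 4 * lam * s_xy ** 2)) / (2 * s_xy)
          return slope, y_bar - slope * x_bar

      # Synthetic data with measurement error in both X and Y.
      rng = np.random.default_rng(6)
      truth = np.linspace(0, 10, 50)
      x = truth + rng.normal(scale=0.5, size=truth.size)
      y = 2.0 * truth + 1.0 + rng.normal(scale=0.5, size=truth.size)

      print(deming(x, y, lam=1.0))   # should be close to slope 2, intercept 1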

  15. The use of segmented regression in analysing interrupted time series studies: an example in pre-hospital ambulance care.

    PubMed

    Taljaard, Monica; McKenzie, Joanne E; Ramsay, Craig R; Grimshaw, Jeremy M

    2014-06-19

    An interrupted time series design is a powerful quasi-experimental approach for evaluating effects of interventions introduced at a specific point in time. To utilize the strength of this design, a modification to standard regression analysis, such as segmented regression, is required. In segmented regression analysis, the change in intercept and/or slope from pre- to post-intervention is estimated and used to test causal hypotheses about the intervention. We illustrate segmented regression using data from a previously published study that evaluated the effectiveness of a collaborative intervention to improve quality in pre-hospital ambulance care for acute myocardial infarction (AMI) and stroke. In the original analysis, a standard regression model was used with time as a continuous variable. We contrast the results from this standard regression analysis with those from segmented regression analysis. We discuss the limitations of the former and advantages of the latter, as well as the challenges of using segmented regression in analysing complex quality improvement interventions. Based on the estimated change in intercept and slope from pre- to post-intervention using segmented regression, we found insufficient evidence of a statistically significant effect on quality of care for stroke, although potential clinically important effects for AMI cannot be ruled out. Segmented regression analysis is the recommended approach for analysing data from an interrupted time series study. Several modifications to the basic segmented regression analysis approach are available to deal with challenges arising in the evaluation of complex quality improvement interventions.
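
    A minimal statsmodels sketch of the segmented regression model described: a baseline trend plus terms for the change in level and slope after the intervention. The monthly series, effect sizes, and intervention time are invented, and a real analysis would also need to address autocorrelation in the errors.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      # Invented monthly quality-of-care series with an intervention at month 24.
      rng = np.random.default_rng(7)
      time = np.arange(48)
      post = (time >= 24).astype(int)
      time_after = np.where(post == 1, time - 24, 0)
      y = 50 + 0.2 * time + 5 * post + 0.5 * time_after + rng.normal(scale=2, size=48)

      data = pd.DataFrame({"y": y, "time": time, "post": post, "time_after": time_after})

      # Segmented regression: baseline slope, change in level, and change in slope.
      fit = smf.ols("y ~ time + post + time_after", data=data).fit()
      print(fit.params)   # 'post' = level change, 'time_after' = slope change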

  16. Comparison of data mining techniques applied to fetal heart rate parameters for the early identification of IUGR fetuses.

    PubMed

    Magenes, G; Bellazzi, R; Malovini, A; Signorini, M G

    2016-08-01

    Fetal pathologies can be screened for during pregnancy by means of Fetal Heart Rate (FHR) monitoring and analysis. Noticeable advances in understanding FHR variations were obtained in the last twenty years, thanks to the introduction of quantitative indices extracted from the FHR signal. This study aims to discriminate between normal and Intra Uterine Growth Restricted (IUGR) fetuses by applying data mining techniques to FHR parameters, obtained from recordings in a population of 122 fetuses (61 healthy and 61 IUGRs), through a standard CTG non-stress test. We computed N=12 indices (N=4 related to time domain FHR analysis, N=4 to frequency domain and N=4 to non-linear analysis) and normalized them with respect to the gestational week. We compared, through a 10-fold cross-validation procedure, 15 data mining techniques in order to select the most reliable approach for identifying IUGR fetuses. The results of this comparison highlight that two techniques (Random Forest and Logistic Regression) show the best classification accuracy and that both outperform the best single parameter in terms of mean AUROC on the test sets.

  17. Estimation of Unsteady Aerodynamic Models from Dynamic Wind Tunnel Data

    NASA Technical Reports Server (NTRS)

    Murphy, Patrick; Klein, Vladislav

    2011-01-01

    Demanding aerodynamic modelling requirements for military and civilian aircraft have motivated researchers to improve computational and experimental techniques and to pursue closer collaboration in these areas. Model identification and validation techniques are key components for this research. This paper presents mathematical model structures and identification techniques that have been used successfully to model more general aerodynamic behaviours in single-degree-of-freedom dynamic testing. Model parameters, characterizing aerodynamic properties, are estimated using linear and nonlinear regression methods in both time and frequency domains. Steps in identification including model structure determination, parameter estimation, and model validation, are addressed in this paper with examples using data from one-degree-of-freedom dynamic wind tunnel and water tunnel experiments. These techniques offer a methodology for expanding the utility of computational methods in application to flight dynamics, stability, and control problems. Since flight test is not always an option for early model validation, time history comparisons are commonly made between computational and experimental results and model adequacy is inferred by corroborating results. An extension is offered to this conventional approach where more general model parameter estimates and their standard errors are compared.

  18. Simplified estimation of age-specific reference intervals for skewed data.

    PubMed

    Wright, E M; Royston, P

    1997-12-30

    Age-specific reference intervals are commonly used in medical screening and clinical practice, where interest lies in the detection of extreme values. Many different statistical approaches have been published on this topic. The advantages of a parametric method are that it necessarily produces smooth centile curves, the entire density is estimated, and an explicit formula is available for the centiles. The method proposed here is a simplified version of a recent approach proposed by Royston and Wright. Basic transformations of the data and multiple regression techniques are combined to model the mean, standard deviation and skewness. Using these simple tools, which are implemented in almost all statistical computer packages, age-specific reference intervals may be obtained. The scope of the method is illustrated by fitting models to several real data sets and assessing each model using goodness-of-fit techniques.
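
    The sketch below illustrates the general idea of modeling the mean and standard deviation as smooth functions of age after a normalizing transformation and then forming centiles; it is not the Royston-Wright parameterization, and the simulated data, log transformation, and polynomial orders are illustrative assumptions only.

        import numpy as np
        from scipy import stats

        # Hypothetical measurements y at ages x (years); a log transform reduces skewness.
        rng = np.random.default_rng(2)
        x = rng.uniform(20, 80, 500)
        y = np.exp(0.5 + 0.01 * x + rng.normal(0, 0.2, x.size))
        z = np.log(y)

        # Model the mean as a polynomial in age, and the SD via scaled absolute residuals.
        mean_coef = np.polyfit(x, z, 2)
        resid = z - np.polyval(mean_coef, x)
        sd_coef = np.polyfit(x, np.abs(resid) * np.sqrt(np.pi / 2), 1)

        def reference_interval(age, coverage=0.95):
            """Age-specific reference interval back-transformed to the original scale."""
            zq = stats.norm.ppf(0.5 + coverage / 2)
            mu = np.polyval(mean_coef, age)
            sd = np.polyval(sd_coef, age)
            return np.exp(mu - zq * sd), np.exp(mu + zq * sd)

        print(reference_interval(50))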

  19. Event detection and localization for small mobile robots using reservoir computing.

    PubMed

    Antonelo, E A; Schrauwen, B; Stroobandt, D

    2008-08-01

    Reservoir Computing (RC) techniques use a fixed (usually randomly created) recurrent neural network, or more generally any dynamic system, operating at the edge of stability, in which only a static linear readout layer is trained by standard linear regression methods. In this work, RC is used for detecting complex events in autonomous robot navigation. This can be extended to robot localization tasks that are based solely on a few low-range, high-noise distance sensors. The robot thus builds an implicit map of the environment (after learning) that is used for efficient localization by simply processing the input stream of distance sensors. These techniques are demonstrated both in a simple simulation environment and in the physically realistic Webots simulation of the commercially available e-puck robot, using several complex and even dynamic environments.
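
    A minimal echo state network sketch of this setup, in which only the linear readout is trained (here by ridge regression); the sensor stream, event labels, and reservoir sizes are placeholders rather than the robot data used in the paper.

        import numpy as np

        rng = np.random.default_rng(3)
        n_in, n_res, T = 8, 200, 1000        # e.g. 8 distance sensors, 200 reservoir units

        # Fixed random reservoir, scaled so the dynamics stay near the edge of stability.
        W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.normal(size=(n_res, n_res))
        W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1

        u = rng.uniform(0, 1, (T, n_in))                  # placeholder sensor stream
        y = rng.integers(0, 2, T).astype(float)           # placeholder event labels

        # Drive the reservoir with the input stream and collect its states.
        states = np.zeros((T, n_res))
        xs = np.zeros(n_res)
        for t in range(T):
            xs = np.tanh(W_in @ u[t] + W @ xs)
            states[t] = xs

        # Only the static linear readout is trained, here by ridge regression.
        lam = 1e-2
        W_out = np.linalg.solve(states.T @ states + lam * np.eye(n_res), states.T @ y)
        pred = states @ W_out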

  20. Application of Semiparametric Spline Regression Model in Analyzing Factors that Influence Population Density in Central Java

    NASA Astrophysics Data System (ADS)

    Sumantari, Y. D.; Slamet, I.; Sugiyanto

    2017-06-01

    Semiparametric regression is a statistical analysis method that combines parametric and nonparametric regression. There are various approaches to nonparametric regression, one of which is the spline. Central Java is one of the most densely populated provinces in Indonesia, and population density in this province can be modeled by semiparametric regression because it comprises parametric and nonparametric components. Therefore, the purpose of this paper is to determine the factors that influence population density in Central Java using the semiparametric spline regression model. The results show that the factors which influence population density in Central Java are the number of active Family Planning (FP) participants and the district minimum wage.

  1. Kalman filter approach for uncertainty quantification in time-resolved laser-induced incandescence.

    PubMed

    Hadwin, Paul J; Sipkens, Timothy A; Thomson, Kevin A; Liu, Fengshan; Daun, Kyle J

    2018-03-01

    Time-resolved laser-induced incandescence (TiRe-LII) data can be used to infer spatially and temporally resolved volume fractions and primary particle size distributions of soot-laden aerosols, but these estimates are corrupted by measurement noise as well as uncertainties in the spectroscopic and heat transfer submodels used to interpret the data. Estimates of the temperature, concentration, and size distribution of soot primary particles within a sample aerosol are typically made by nonlinear regression of modeled spectral incandescence decay, or effective temperature decay, to experimental data. In this work, we employ nonstationary Bayesian estimation techniques, specifically the extended Kalman filter and the Schmidt-Kalman filter, to infer aerosol properties from simulated and experimental LII signals. These techniques exploit the time-varying nature of both the measurements and the models, and they reveal how uncertainty in the estimates computed from TiRe-LII data evolves over time. Both techniques perform better than standard deterministic estimates; however, we demonstrate that the Schmidt-Kalman filter produces more realistic uncertainty estimates.
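
    The sketch below shows an extended Kalman filter applied to a toy exponential temperature-decay model with an unknown decay rate; it illustrates the filtering technique only and is not the LII spectroscopic or heat-transfer submodel, and all constants and noise levels are invented for illustration.

        import numpy as np

        # Toy state-space model: effective temperature decaying toward the gas
        # temperature at an unknown rate k (NOT the LII heat-transfer submodel).
        dt, Tg, n_steps = 1e-2, 300.0, 200
        rng = np.random.default_rng(13)
        true_k, T_true = 2.0, 3000.0
        meas = []
        for _ in range(n_steps):
            T_true += -true_k * (T_true - Tg) * dt
            meas.append(T_true + rng.normal(0, 20.0))

        # Extended Kalman filter over the augmented state x = [temperature, decay rate].
        x = np.array([2800.0, 1.0])                    # initial guess
        P = np.diag([200.0**2, 1.0**2])                # initial state covariance
        Q = np.diag([1.0, 1e-6])                       # process noise covariance
        R = 20.0**2                                    # measurement noise variance
        H = np.array([[1.0, 0.0]])

        for z in meas:
            # Predict: propagate the nonlinear model and linearize it (Jacobian F).
            T, k = x
            x = np.array([T - k * (T - Tg) * dt, k])
            F = np.array([[1 - k * dt, -(T - Tg) * dt], [0.0, 1.0]])
            P = F @ P @ F.T + Q
            # Update with the noisy temperature measurement.
            S = H @ P @ H.T + R
            K = P @ H.T / S
            x = x + (K * (z - x[0])).ravel()
            P = (np.eye(2) - K @ H) @ P

        print("estimated decay rate:", x[1])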

  2. Impact of multicollinearity on small sample hydrologic regression models

    NASA Astrophysics Data System (ADS)

    Kroll, Charles N.; Song, Peter

    2013-06-01

    Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed, since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
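
    A minimal sketch of this kind of comparison, contrasting OLS, principal component regression (PCR), and partial least squares (PLS) on a small sample with collinear explanatory variables using scikit-learn; the simulated data are placeholders for the hydrologic variables, and the single-component choices are illustrative.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline

        # Simulated, highly correlated explanatory variables in a small sample.
        rng = np.random.default_rng(4)
        z = rng.normal(size=(30, 1))
        X = z + 0.1 * rng.normal(size=(30, 3))        # three collinear predictors
        y = (X @ [1.0, 0.5, -0.5]) + rng.normal(0, 0.5, 30)

        models = {
            "OLS": LinearRegression(),
            "PCR": make_pipeline(PCA(n_components=1), LinearRegression()),
            "PLS": PLSRegression(n_components=1),
        }
        for name, model in models.items():
            score = cross_val_score(model, X, y, cv=5, scoring="r2")
            print(f"{name}: mean CV R2 = {score.mean():.3f}")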

  3. Statistical approach to Higgs boson couplings in the standard model effective field theory

    NASA Astrophysics Data System (ADS)

    Murphy, Christopher W.

    2018-01-01

    We perform a parameter fit in the standard model effective field theory (SMEFT) with an emphasis on using regularized linear regression to tackle the issue of the large number of parameters in the SMEFT. In regularized linear regression, a positive definite function of the parameters of interest is added to the usual cost function. A cross-validation is performed to try to determine the optimal value of the regularization parameter to use, but it selects the standard model (SM) as the best model to explain the measurements. Nevertheless, as a proof of principle of this technique, we apply it to fitting Higgs boson signal strengths in the SMEFT, including the latest Run-2 results. Results are presented in terms of the eigensystem of the covariance matrix of the least squares estimators as it has a degree of model independence to it. We find several results in this initial work: the SMEFT predicts the total width of the Higgs boson to be consistent with the SM prediction; the ATLAS and CMS experiments at the LHC are currently sensitive to non-resonant double Higgs boson production. Constraints are derived on the viable parameter space for electroweak baryogenesis in the SMEFT, reinforcing the notion that a first order phase transition requires fairly low-scale beyond the SM physics. Finally, we study which future experimental measurements would give the most improvement on the global constraints on the Higgs sector of the SMEFT.
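
    The following sketch shows regularized linear regression with a cross-validated regularization parameter in scikit-learn; the design matrix and response are random placeholders rather than SMEFT predictions and Higgs signal-strength measurements.

        import numpy as np
        from sklearn.linear_model import LassoCV, RidgeCV

        # Placeholder problem with many parameters and few measurements;
        # purely illustrative, not actual SMEFT inputs.
        rng = np.random.default_rng(5)
        X = rng.normal(size=(40, 20))
        beta_true = np.zeros(20)
        beta_true[:3] = [0.3, -0.2, 0.1]
        y = X @ beta_true + rng.normal(0, 0.1, 40)

        # Cross-validation selects the regularization strength; a very large selected
        # penalty (coefficients shrunk toward zero) corresponds to preferring the SM point.
        lasso = LassoCV(cv=5).fit(X, y)
        ridge = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X, y)
        print("lasso alpha:", lasso.alpha_, "nonzero coefficients:", np.sum(lasso.coef_ != 0))
        print("ridge alpha:", ridge.alpha_)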

  4. Development of quantitative screen for 1550 chemicals with GC-MS.

    PubMed

    Bergmann, Alan J; Points, Gary L; Scott, Richard P; Wilson, Glenn; Anderson, Kim A

    2018-05-01

    With hundreds of thousands of chemicals in the environment, effective monitoring requires high-throughput analytical techniques. This paper presents a quantitative screening method for 1550 chemicals based on statistical modeling of responses with identification and integration performed using deconvolution reporting software. The method was evaluated with representative environmental samples. We tested biological extracts, low-density polyethylene, and silicone passive sampling devices spiked with known concentrations of 196 representative chemicals. A multiple linear regression (R2 = 0.80) was developed with molecular weight, logP, polar surface area, and fractional ion abundance to predict chemical responses within a factor of 2.5. Linearity beyond the calibration had R2 > 0.97 for three orders of magnitude. Median limits of quantitation were estimated to be 201 pg/μL (1.9× standard deviation). The number of detected chemicals and the accuracy of quantitation were similar for environmental samples and standard solutions. To our knowledge, this is the most precise method for the largest number of semi-volatile organic chemicals lacking authentic standards. Accessible instrumentation and software make this method cost effective in quantifying a large, customizable list of chemicals. When paired with silicone wristband passive samplers, this quantitative screen will be very useful for epidemiology where binning of concentrations is common. Graphical abstract: A multiple linear regression of chemical responses measured with GC-MS allowed quantitation of 1550 chemicals in samples such as silicone wristbands.

  5. Anesthesia Technique and Outcomes of Mechanical Thrombectomy in Patients With Acute Ischemic Stroke.

    PubMed

    Bekelis, Kimon; Missios, Symeon; MacKenzie, Todd A; Tjoumakaris, Stavropoula; Jabbour, Pascal

    2017-02-01

    The impact of anesthesia technique on the outcomes of mechanical thrombectomy for acute ischemic stroke remains an issue of debate. We investigated the association of general anesthesia with outcomes in patients undergoing mechanical thrombectomy for ischemic stroke. We performed a cohort study involving patients undergoing mechanical thrombectomy for ischemic stroke from 2009 to 2013, who were registered in the New York Statewide Planning and Research Cooperative System database. An instrumental variable (hospital rate of general anesthesia) analysis was used to simulate the effects of randomization and investigate the association of anesthesia technique with case-fatality and length of stay. Among 1174 patients, 441 (37.6%) underwent general anesthesia and 733 (62.4%) underwent conscious sedation. Using an instrumental variable analysis, we identified that general anesthesia was associated with a 6.4% increased case-fatality (95% confidence interval, 1.9%-11.0%) and 8.4 days longer length of stay (95% confidence interval, 2.9-14.0) in comparison to conscious sedation. This corresponded to 15 patients needing to be treated with conscious sedation to prevent 1 death. Our results were robust in sensitivity analysis with mixed effects regression and propensity score-adjusted regression models. Using a comprehensive all-payer cohort of acute ischemic stroke patients undergoing mechanical thrombectomy in New York State, we identified an association of general anesthesia with increased case-fatality and length of stay. These considerations should be taken into account when standardizing acute stroke care. © 2017 American Heart Association, Inc.

  6. A collaborative comparison of objective structured clinical examination (OSCE) standard setting methods at Australian medical schools.

    PubMed

    Malau-Aduli, Bunmi Sherifat; Teague, Peta-Ann; D'Souza, Karen; Heal, Clare; Turner, Richard; Garne, David L; van der Vleuten, Cees

    2017-12-01

    A key issue underpinning the usefulness of the OSCE assessment to medical education is standard setting, but the majority of standard-setting methods remain challenging for performance assessment because they produce varying passing marks. Several studies have compared standard-setting methods; however, most of these studies are limited by their experimental scope, or use data on examinee performance at a single OSCE station or from a single medical school. This collaborative study between 10 Australian medical schools investigated the effect of standard-setting methods on OSCE cut scores and failure rates. This research used 5256 examinee scores from seven shared OSCE stations to calculate cut scores and failure rates using two different compromise standard-setting methods, namely the Borderline Regression and Cohen's methods. The results of this study indicate that Cohen's method yields similar outcomes to the Borderline Regression method, particularly for large examinee cohort sizes. However, with lower examinee numbers on a station, the Borderline Regression method resulted in higher cut scores and larger difference margins in the failure rates. Cohen's method yields similar outcomes to the Borderline Regression method, and its application for benchmarking purposes and in resource-limited settings is justifiable, particularly with large examinee numbers.
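
    A minimal sketch of the Borderline Regression method: station checklist scores are regressed on examiners' global ratings, and the predicted score at the borderline grade is taken as the station cut score; the ratings, scores, and grade scale below are hypothetical.

        import numpy as np

        # Hypothetical station data: global rating (0 = fail .. 4 = excellent, 2 = borderline)
        # and checklist score (0-100) for each examinee.
        rng = np.random.default_rng(6)
        global_rating = rng.integers(0, 5, 300)
        checklist = 40 + 10 * global_rating + rng.normal(0, 8, 300)

        # Borderline Regression: linear regression of checklist score on global rating,
        # evaluated at the borderline grade, gives the cut score.
        slope, intercept = np.polyfit(global_rating, checklist, 1)
        BORDERLINE = 2
        cut_score = intercept + slope * BORDERLINE
        failure_rate = np.mean(checklist < cut_score)
        print(f"cut score = {cut_score:.1f}, failure rate = {failure_rate:.1%}")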

  7. Estimates of Flow Duration, Mean Flow, and Peak-Discharge Frequency Values for Kansas Stream Locations

    USGS Publications Warehouse

    Perry, Charles A.; Wolock, David M.; Artman, Joshua C.

    2004-01-01

    Streamflow statistics of flow duration and peak-discharge frequency were estimated for 4,771 individual locations on streams listed on the 1999 Kansas Surface Water Register. These statistics included the flow-duration values of 90, 75, 50, 25, and 10 percent, as well as the mean flow value. Peak-discharge frequency values were estimated for the 2-, 5-, 10-, 25-, 50-, and 100-year floods. Least-squares multiple regression techniques were used, along with Tobit analyses, to develop equations for estimating flow-duration values of 90, 75, 50, 25, and 10 percent and the mean flow for uncontrolled flow stream locations. The contributing-drainage areas of 149 U.S. Geological Survey streamflow-gaging stations in Kansas and parts of surrounding States that had flow uncontrolled by Federal reservoirs and used in the regression analyses ranged from 2.06 to 12,004 square miles. Logarithmic transformations of climatic and basin data were performed to yield the best linear relation for developing equations to compute flow durations and mean flow. In the regression analyses, the significant climatic and basin characteristics, in order of importance, were contributing-drainage area, mean annual precipitation, mean basin permeability, and mean basin slope. The analyses yielded model standard errors of prediction ranging from 0.43 logarithmic units for the 90-percent duration analysis to 0.15 logarithmic units for the 10-percent duration analysis. The model standard error of prediction was 0.14 logarithmic units for the mean flow. Regression equations used to estimate peak-discharge frequency values were obtained from a previous report, and estimates for the 2-, 5-, 10-, 25-, 50-, and 100-year floods were determined for this report. The regression equations and an interpolation procedure were used to compute flow durations, mean flow, and estimates of peak-discharge frequency for locations along uncontrolled flow streams on the 1999 Kansas Surface Water Register. Flow durations, mean flow, and peak-discharge frequency values determined at available gaging stations were used to interpolate the regression-estimated flows for the stream locations where available. Streamflow statistics for locations that had uncontrolled flow were interpolated using data from gaging stations weighted according to the drainage area and the bias between the regression-estimated and gaged flow information. On controlled reaches of Kansas streams, the streamflow statistics were interpolated between gaging stations using only gaged data weighted by drainage area.

  8. Reconstructions of Soil Moisture for the Upper Colorado River Basin Using Tree-Ring Chronologies

    NASA Astrophysics Data System (ADS)

    Tootle, G.; Anderson, S.; Grissino-Mayer, H.

    2012-12-01

    Soil moisture is an important factor in the global hydrologic cycle, but existing reconstructions of historic soil moisture are limited. Tree-ring chronologies (TRCs) were used to reconstruct annual soil moisture in the Upper Colorado River Basin (UCRB). Gridded soil moisture data were spatially regionalized using principal components analysis and k-nearest neighbor techniques. Moisture sensitive tree-ring chronologies in and adjacent to the UCRB were correlated with regional soil moisture and tested for temporal stability. TRCs that were positively correlated and stable for the calibration period were retained. Stepwise linear regression was applied to identify the best predictor combinations for each soil moisture region. The regressions explained 42-78% of the variability in soil moisture data. We performed reconstructions for individual soil moisture grid cells to enhance understanding of the disparity in reconstructive skill across the regions. Reconstructions that used chronologies based on ponderosa pines (Pinus ponderosa) and pinyon pines (Pinus edulis) explained increased variance in the datasets. Reconstructed soil moisture was standardized and compared with standardized reconstructed streamflow and snow water equivalent from the same region. Soil moisture reconstructions were highly correlated with streamflow and snow water equivalent reconstructions, indicating reconstructions of soil moisture in the UCRB using TRCs successfully represent hydrologic trends, including the identification of periods of prolonged drought.

  9. Prediction of adult height in girls: the Beunen-Malina-Freitas method.

    PubMed

    Beunen, Gaston P; Malina, Robert M; Freitas, Duarte L; Thomis, Martine A; Maia, José A; Claessens, Albrecht L; Gouveia, Elvio R; Maes, Hermine H; Lefevre, Johan

    2011-12-01

    The purpose of this study was to validate and cross-validate the Beunen-Malina-Freitas method for non-invasive prediction of adult height in girls. A sample of 420 girls aged 10-15 years from the Madeira Growth Study were measured at yearly intervals and then 8 years later. Anthropometric dimensions (lengths, breadths, circumferences, and skinfolds) were measured; skeletal age was assessed using the Tanner-Whitehouse 3 method and menarcheal status (present or absent) was recorded. Adult height was measured and predicted using stepwise, forward, and maximum R2 regression techniques. Multiple correlations, mean differences, standard errors of prediction, and error boundaries were calculated. A sample of the Leuven Longitudinal Twin Study was used to cross-validate the regressions. Age-specific coefficients of determination (R2) between predicted and measured adult height varied between 0.57 and 0.96, while standard errors of prediction varied between 1.1 and 3.9 cm. The cross-validation confirmed the validity of the Beunen-Malina-Freitas method in girls aged 12-15 years, but at lower ages the cross-validation was less consistent. We conclude that the Beunen-Malina-Freitas method is valid for the prediction of adult height in girls aged 12-15 years. It is applicable to European populations or populations of European ancestry.

  10. Regression techniques for oceanographic parameter retrieval using space-borne microwave radiometry

    NASA Technical Reports Server (NTRS)

    Hofer, R.; Njoku, E. G.

    1981-01-01

    Variations of conventional multiple regression techniques are applied to the problem of remote sensing of oceanographic parameters from space. The techniques are specifically adapted to the Scanning Multichannel Microwave Radiometer (SMMR) launched on the Seasat and Nimbus 7 satellites to determine ocean surface temperature, wind speed, and atmospheric water content. The retrievals are studied primarily from a theoretical viewpoint, to illustrate the retrieval error structure, the relative importance of different radiometer channels, and the tradeoffs between spatial resolution and retrieval accuracy. Comparisons between regressions using simulated and actual SMMR data are discussed; they show similar behavior.

  11. Adjustment of geochemical background by robust multivariate statistics

    USGS Publications Warehouse

    Zhou, D.

    1985-01-01

    Conventional analyses of exploration geochemical data assume that the background is a constant or slowly changing value, equivalent to a plane or a smoothly curved surface. However, it is better to regard the geochemical background as a rugged surface, varying with changes in geology and environment. This rugged surface can be estimated from observed geological, geochemical and environmental properties by using multivariate statistics. A method of background adjustment was developed and applied to groundwater and stream sediment reconnaissance data collected from the Hot Springs Quadrangle, South Dakota, as part of the National Uranium Resource Evaluation (NURE) program. Source-rock lithology appears to be a dominant factor controlling the chemical composition of groundwater or stream sediments. The most efficacious adjustment procedure is to regress uranium concentration on selected geochemical and environmental variables for each lithologic unit, and then to delineate anomalies by a common threshold set as a multiple of the standard deviation of the combined residuals. Robust versions of regression and RQ-mode principal components analysis techniques were used rather than ordinary techniques to guard against distortion caused by outliers. Anomalies delineated by this background adjustment procedure correspond with uranium prospects much better than do anomalies delineated by conventional procedures. The procedure should be applicable to geochemical exploration at different scales for other metals. © 1985.
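
    The sketch below illustrates the background-adjustment idea with a robust regression (a Huber M-estimator via statsmodels) of concentration on covariates within a single lithologic unit, followed by a residual-based anomaly threshold; the data and the specific robust estimator are illustrative assumptions, not the study's exact procedure.

        import numpy as np
        import statsmodels.api as sm

        # Placeholder data: uranium concentration and two environmental covariates
        # for samples from one lithologic unit.
        rng = np.random.default_rng(7)
        X = rng.normal(size=(200, 2))
        u = 3.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.4, 200)
        u[:5] += 4.0                                   # a few anomalous samples

        # Robust regression of concentration on covariates guards against outliers
        # distorting the estimated background surface.
        rlm = sm.RLM(u, sm.add_constant(X), M=sm.robust.norms.HuberT()).fit()
        resid = u - rlm.fittedvalues

        # Anomalies: residuals above a multiple of the robust residual scale.
        threshold = 2.5 * rlm.scale
        anomalies = np.flatnonzero(resid > threshold)
        print("flagged samples:", anomalies)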

  12. Characterization of acid functional groups of carbon dots by nonlinear regression data fitting of potentiometric titration curves

    NASA Astrophysics Data System (ADS)

    Alves, Larissa A.; de Castro, Arthur H.; de Mendonça, Fernanda G.; de Mesquita, João P.

    2016-05-01

    The oxygenated functional groups present on the surface of carbon dots with an average size of 2.7 ± 0.5 nm were characterized by a variety of techniques. In particular, we discuss the fitting of potentiometric titration curve data using a nonlinear regression method based on the Levenberg-Marquardt algorithm. The results obtained by statistical treatment of the titration curve data showed that the best fit was obtained considering the presence of five Brønsted-Lowry acids on the surface of the carbon dots, with ionization constants characteristic of carboxylic acid, cyclic ester, phenolic and pyrone-like groups. The total number of oxygenated acid groups obtained was 5 mmol g-1, with approximately 65% (∼2.9 mmol g-1) originating from groups with pKa < 6. The methodology showed good reproducibility and stability with standard deviations below 5%. The nature of the groups was independent of small variations in experimental conditions, i.e. the mass of carbon dots titrated and initial concentration of HCl solution. Finally, we believe that the methodology used here, together with other characterization techniques, is a simple, fast and powerful tool to characterize the complex acid-base properties of these interesting and intriguing nanoparticles.
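
    As an illustration of nonlinear regression with the Levenberg-Marquardt algorithm, the sketch below fits a simple two-group Henderson-Hasselbalch-type curve with scipy.optimize.curve_fit; it is not the authors' speciation model, and the functional form, data, and parameter values are synthetic.

        import numpy as np
        from scipy.optimize import curve_fit

        def ionized(pH, n1, pKa1, n2, pKa2):
            """Total ionized acid groups (mmol/g) from two Henderson-Hasselbalch terms."""
            return n1 / (1 + 10 ** (pKa1 - pH)) + n2 / (1 + 10 ** (pKa2 - pH))

        # Synthetic "titration" data standing in for degree of ionization vs. pH.
        rng = np.random.default_rng(8)
        pH = np.linspace(2, 11, 60)
        y = ionized(pH, 2.9, 4.5, 2.1, 9.0) + rng.normal(0, 0.05, pH.size)

        # method="lm" uses the Levenberg-Marquardt algorithm.
        popt, pcov = curve_fit(ionized, pH, y, p0=[2, 4, 2, 9], method="lm")
        perr = np.sqrt(np.diag(pcov))                  # 1-sigma standard errors
        print("estimates:", popt, "std errors:", perr)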

  13. Multicollinearity and Regression Analysis

    NASA Astrophysics Data System (ADS)

    Daoud, Jamal I.

    2017-12-01

    In regression analysis, correlation between the response and the predictor(s) is expected, but correlation among the predictors themselves is undesirable. The number of predictors included in a regression model depends on many factors, among them historical data and experience, and in the end the selection of the most important predictors remains largely a judgment of the researcher. Multicollinearity is a phenomenon in which two or more predictors are correlated; when this happens, the standard errors of the coefficients increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may not be found to be significantly different from zero. In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes, and its consequences for the reliability of the regression model.
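
    Variance inflation factors (VIFs) are a standard diagnostic for this problem; the sketch below computes them with statsmodels for two deliberately correlated predictors (the data are illustrative).

        import numpy as np
        import statsmodels.api as sm
        from statsmodels.stats.outliers_influence import variance_inflation_factor

        # Two deliberately correlated predictors plus one independent predictor.
        rng = np.random.default_rng(9)
        x1 = rng.normal(size=100)
        x2 = x1 + 0.1 * rng.normal(size=100)
        x3 = rng.normal(size=100)
        X = sm.add_constant(np.column_stack([x1, x2, x3]))

        # A VIF above roughly 5-10 is a common rule of thumb for problematic multicollinearity.
        for i, name in enumerate(["const", "x1", "x2", "x3"]):
            print(name, variance_inflation_factor(X, i))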

  14. Estimation of distributional parameters for censored trace level water quality data: 1. Estimation techniques

    USGS Publications Warehouse

    Gilliom, Robert J.; Helsel, Dennis R.

    1986-01-01

    A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.
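
    A minimal sketch of the log-probability regression idea under simplifying assumptions (a single detection limit and Blom-type plotting positions): log concentrations of the uncensored observations are regressed on their normal scores, the censored portion is filled in from the fitted line, and summary statistics are computed. This is an illustration only, not the authors' exact implementation.

        import numpy as np
        from scipy import stats

        # Hypothetical concentrations with a single detection limit; censored values
        # are reported only as "< DL".
        rng = np.random.default_rng(10)
        conc = np.exp(rng.normal(0.0, 1.0, 50))
        DL = 0.5
        uncensored = np.sort(conc[conc >= DL])
        n, k = conc.size, int((conc < DL).sum())

        # Plotting positions for the full sample; the lowest k ranks are the censored values.
        pp = (np.arange(1, n + 1) - 0.375) / (n + 0.25)        # Blom plotting positions
        z = stats.norm.ppf(pp)

        # Regress log(uncensored concentration) on its normal score (the upper n-k ranks) ...
        slope, intercept = np.polyfit(z[k:], np.log(uncensored), 1)

        # ... then fill in the censored portion from the fitted line and estimate moments.
        filled = np.concatenate([np.exp(intercept + slope * z[:k]), uncensored])
        print("mean:", filled.mean(), "sd:", filled.std(ddof=1))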

  15. Estimation of distributional parameters for censored trace level water quality data. 1. Estimation Techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilliom, R.J.; Helsel, D.R.

    1986-02-01

    A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water sample concentrations are below limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring level portion of a lognormal distribution obtained by a least squares regression between logarithms of uncensored concentration observations and their z scores. When method performance was separately evaluated for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification.

  16. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    PubMed

    Zwiener, Isabella; Frisch, Barbara; Binder, Harald

    2014-01-01

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency, or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.
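
    The sketch below applies a rank-based inverse normal transformation to RNA-Seq-like counts before fitting a penalized regression with scikit-learn; the counts and outcome are simulated placeholders, and for a survival endpoint a penalized Cox model would take the place of the lasso used here.

        import numpy as np
        from scipy import stats
        from sklearn.linear_model import LassoCV

        # Placeholder RNA-Seq-like counts: skewed, with strong mean-variance dependency.
        rng = np.random.default_rng(11)
        counts = rng.negative_binomial(n=2, p=0.02, size=(100, 500)).astype(float)
        outcome = rng.normal(size=100)                 # stand-in for the clinical endpoint

        def rank_inverse_normal(x):
            """Per-gene rank-based inverse normal transformation (Blom offsets)."""
            ranks = stats.rankdata(x)
            return stats.norm.ppf((ranks - 0.375) / (len(x) + 0.25))

        # Transform each gene (column) separately before penalized regression.
        X = np.apply_along_axis(rank_inverse_normal, 0, counts)

        fit = LassoCV(cv=5).fit(X, outcome)
        print("genes selected:", np.sum(fit.coef_ != 0))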

  17. Transforming RNA-Seq Data to Improve the Performance of Prognostic Gene Signatures

    PubMed Central

    Zwiener, Isabella; Frisch, Barbara; Binder, Harald

    2014-01-01

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency, or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques. PMID:24416353

  18. Robust geographically weighted regression of modeling the Air Polluter Standard Index (APSI)

    NASA Astrophysics Data System (ADS)

    Warsito, Budi; Yasin, Hasbi; Ispriyanti, Dwi; Hoyyi, Abdul

    2018-05-01

    The Geographically Weighted Regression (GWR) model has been widely applied in many practical fields for exploring the spatial heterogeneity of a regression model. However, the method is inherently not robust to outliers. Outliers commonly exist in data sets and may lead to a distorted estimate of the underlying regression model. One solution for handling outliers in the regression model is to use robust estimation; the resulting model is called Robust Geographically Weighted Regression (RGWR). This research aims to aid the government in the policy-making process related to air pollution mitigation by developing a standard index model for air polluters (Air Polluter Standard Index - APSI) based on the RGWR approach. In this research, we also consider seven variables that are directly related to the air pollution level: the traffic velocity, the population density, the business center aspect, the air humidity, the wind velocity, the air temperature, and the area of the urban forest. The best model is determined by the smallest AIC value. There are significant differences between global regression and RGWR in this case, but basic GWR with a Gaussian kernel is the best model for APSI because it has the smallest AIC.

  19. A Survey of UML Based Regression Testing

    NASA Astrophysics Data System (ADS)

    Fahad, Muhammad; Nadeem, Aamer

    Regression testing is the process of ensuring software quality by analyzing whether changed parts behave as intended and unchanged parts are not affected by the modifications. Since it is a costly process, many techniques have been proposed in the research literature to show testers how to build a regression test suite from an existing test suite at minimum cost. In this paper, we discuss the advantages and drawbacks of using UML diagrams for regression testing and argue that UML models help to identify changes for regression test selection effectively. We survey the existing UML-based regression testing techniques and provide an analysis matrix to give a quick insight into the prominent features of the surveyed work. We also discuss open research issues that remain to be addressed for UML-based regression testing, such as managing and reducing the size of the regression test suite and prioritizing test cases under tight schedules and limited resources.

  20. Estimating peak-flow frequency statistics for selected gaged and ungaged sites in naturally flowing streams and rivers in Idaho

    USGS Publications Warehouse

    Wood, Molly S.; Fosness, Ryan L.; Skinner, Kenneth D.; Veilleux, Andrea G.

    2016-06-27

    The U.S. Geological Survey, in cooperation with the Idaho Transportation Department, updated regional regression equations to estimate peak-flow statistics at ungaged sites on Idaho streams using recent streamflow (flow) data and new statistical techniques. Peak-flow statistics with 80-, 67-, 50-, 43-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities (1.25-, 1.50-, 2.00-, 2.33-, 5.00-, 10.0-, 25.0-, 50.0-, 100-, 200-, and 500-year recurrence intervals, respectively) were estimated for 192 streamgages in Idaho and bordering States with at least 10 years of annual peak-flow record through water year 2013. The streamgages were selected from drainage basins with little or no flow diversion or regulation. The peak-flow statistics were estimated by fitting a log-Pearson type III distribution to records of annual peak flows and applying two additional statistical methods: (1) the Expected Moments Algorithm to help describe uncertainty in annual peak flows and to better represent missing and historical record; and (2) the generalized Multiple Grubbs Beck Test to screen out potentially influential low outliers and to better fit the upper end of the peak-flow distribution. Additionally, a new regional skew was estimated for the Pacific Northwest and used to weight at-station skew at most streamgages. The streamgages were grouped into six regions (numbered 1_2, 3, 4, 5, 6_8, and 7, to maintain consistency in region numbering with a previous study), and the estimated peak-flow statistics were related to basin and climatic characteristics to develop regional regression equations using a generalized least squares procedure. Four out of 24 evaluated basin and climatic characteristics were selected for use in the final regional peak-flow regression equations. Overall, the standard error of prediction for the regional peak-flow regression equations ranged from 22 to 132 percent. Among all regions, regression model fit was best for region 4 in west-central Idaho (average standard error of prediction = 46.4 percent; pseudo-R2 > 92 percent) and region 5 in central Idaho (average standard error of prediction = 30.3 percent; pseudo-R2 > 95 percent). Regression model fit was poor for region 7 in southern Idaho (average standard error of prediction = 103 percent; pseudo-R2 < 78 percent) compared to other regions because few streamgages in region 7 met the criteria for inclusion in the study, and the region's semi-arid climate and associated variability in precipitation patterns cause substantial variability in peak flows. A drainage area ratio-adjustment method, using ratio exponents estimated using generalized least-squares regression, was presented as an alternative to the regional regression equations if peak-flow estimates are desired at an ungaged site that is close to a streamgage selected for inclusion in this study. The alternative drainage area ratio-adjustment method is appropriate for use when the drainage area ratio between the ungaged and gaged sites is between 0.5 and 1.5. The updated regional peak-flow regression equations had lower total error (standard error of prediction) than all regression equations presented in a 1982 study and in four of six regions presented in 2002 and 2003 studies in Idaho. A more extensive streamgage screening process used in the current study resulted in fewer streamgages used in the current study than in the 1982, 2002, and 2003 studies. Fewer streamgages used and the selection of different explanatory variables were likely causes of increased error in some regions compared to previous studies, but overall, regional peak-flow regression model fit was generally improved for Idaho. The revised statistical procedures and increased streamgage screening applied in the current study most likely resulted in a more accurate representation of natural peak-flow conditions. The updated regional peak-flow regression equations will be integrated in the U.S. Geological Survey StreamStats program to allow users to estimate basin and climatic characteristics and peak-flow statistics at ungaged locations of interest. StreamStats estimates peak-flow statistics with quantifiable certainty only when used at sites with basin and climatic characteristics within the range of input variables used to develop the regional regression equations. Both the regional regression equations and StreamStats should be used to estimate peak-flow statistics only in naturally flowing, relatively unregulated streams without substantial local influences to flow, such as large seeps, springs, or other groundwater-surface water interactions that are not widespread or characteristic of the respective region.

  1. London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure

    PubMed Central

    Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith

    2017-01-01

    Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but unimportant differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
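
    A minimal sketch of the recommended analysis, a linear regression of the full LMUP score with heteroskedasticity-robust standard errors, using statsmodels; the data frame and covariates are placeholders, not the Malawi or UK data.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        # Placeholder data: full LMUP score (0-12) and two example covariates.
        rng = np.random.default_rng(12)
        df = pd.DataFrame({
            "lmup": rng.integers(0, 13, 400),
            "age": rng.uniform(15, 45, 400),
            "parity": rng.integers(0, 5, 400),
        })

        # Linear regression of the full LMUP score with robust (heteroskedasticity-
        # consistent) standard errors, as recommended for outcome analysis.
        fit = smf.ols("lmup ~ age + parity", data=df).fit(cov_type="HC3")
        print(fit.summary())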

  2. Methods for estimating annual exceedance probability discharges for streams in Arkansas, based on data through water year 2013

    USGS Publications Warehouse

    Wagner, Daniel M.; Krieger, Joshua D.; Veilleux, Andrea G.

    2016-08-04

    In 2013, the U.S. Geological Survey initiated a study to update regional skew, annual exceedance probability discharges, and regional regression equations used to estimate annual exceedance probability discharges for ungaged locations on streams in the study area with the use of recent geospatial data, new analytical methods, and available annual peak-discharge data through the 2013 water year. An analysis of regional skew using Bayesian weighted least-squares/Bayesian generalized-least squares regression was performed for Arkansas, Louisiana, and parts of Missouri and Oklahoma. The newly developed constant regional skew of -0.17 was used in the computation of annual exceedance probability discharges for 281 streamgages used in the regional regression analysis. Based on analysis of covariance, four flood regions were identified for use in the generation of regional regression models. Thirty-nine basin characteristics were considered as potential explanatory variables, and ordinary least-squares regression techniques were used to determine the optimum combinations of basin characteristics for each of the four regions. Basin characteristics in candidate models were evaluated based on multicollinearity with other basin characteristics (variance inflation factor < 2.5) and statistical significance at the 95-percent confidence level (p ≤ 0.05). Generalized least-squares regression was used to develop the final regression models for each flood region. Average standard errors of prediction of the generalized least-squares models ranged from 32.76 to 59.53 percent, with the largest range in flood region D. Pseudo coefficients of determination of the generalized least-squares models ranged from 90.29 to 97.28 percent, with the largest range also in flood region D. The regional regression equations apply only to locations on streams in Arkansas where annual peak discharges are not substantially affected by regulation, diversion, channelization, backwater, or urbanization. The applicability and accuracy of the regional regression equations depend on the basin characteristics measured for an ungaged location on a stream being within range of those used to develop the equations.

  3. What are the important surgical factors affecting the wound healing after primary total knee arthroplasty?

    PubMed

    Harato, Kengo; Tanikawa, Hidenori; Morishige, Yutaro; Kaneda, Kazuya; Niki, Yasuo

    2016-01-13

    Wound condition after primary total knee arthroplasty (TKA) is an important issue in avoiding postoperative adverse events. Our purpose was to investigate and to clarify the important surgical factors affecting wound score after TKA. A total of 139 knees in 128 patients (mean 73 years) without severe comorbidity were enrolled in the present study. All primary unilateral or bilateral TKAs were done using the same skin incision line, measured resection technique, and wound closure technique using unidirectional barbed suture. In terms of wound healing, the Hollander Wound Evaluation Score (HWES) was assessed on postoperative day 14. We performed multiple regression analysis using a stepwise method to identify the factors affecting HWES. Variables considered in the analysis were age, sex, body mass index (kg/m(2)), HbA1C (%), femorotibial angle (degrees) on plain radiographs, intraoperative patella eversion during the cutting phase of the femur and the tibia in knee flexion, intraoperative anterior translation of the tibia, patella resurfacing, surgical time (min), tourniquet time (min), length of skin incision (cm), postoperative drainage (ml), patellar height on postoperative lateral radiographs, and HWES. HWES was treated as the dependent variable, and the others as independent variables. The average HWES was 5.0 ± 0.8 points. According to the stepwise forward regression analysis, patella eversion during the cutting phase of the femur and the tibia in knee flexion and anterior translation of the tibia were entered in this model, while other factors were not entered. Standardized partial regression coefficients were as follows: 0.57 for anterior translation of the tibia and 0.38 for patella eversion. Fortunately, in the present study using the unidirectional barbed suture, no major wound healing problems occurred. As to the surgical technique, intraoperative patella eversion and anterior translation of the tibia should be avoided for quality cosmesis in primary TKA.

  4. Hypothesis Testing Using Factor Score Regression

    PubMed Central

    Devlieger, Ines; Mayer, Axel; Rosseel, Yves

    2015-01-01

    In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias avoiding method of Skrondal and Laake, and the bias correcting method of Croon. The bias correcting method is extended to include a reliable standard error. The four methods are compared with each other and with structural equation modeling (SEM) by using analytic calculations and two Monte Carlo simulation studies to examine their finite sample characteristics. Several performance criteria are used, such as the bias using the unstandardized and standardized parameterization, efficiency, mean square error, standard error bias, type I error rate, and power. The results show that the bias correcting method, with the newly developed standard error, is the only suitable alternative for SEM. While it has a higher standard error bias than SEM, it has a comparable bias, efficiency, mean square error, power, and type I error rate. PMID:29795886

  5. Studies in Software Cost Model Behavior: Do We Really Understand Cost Model Performance?

    NASA Technical Reports Server (NTRS)

    Lum, Karen; Hihn, Jairus; Menzies, Tim

    2006-01-01

    While there exists extensive literature on software cost estimation techniques, industry practice continues to rely upon standard regression-based algorithms. These software effort models are typically calibrated or tuned to local conditions using local data. This paper cautions that current approaches to model calibration often produce sub-optimal models because of the large variance problem inherent in cost data and because far more effort multipliers are included than the data support. Building optimal models requires that a wider range of models be considered, while correctly calibrating these models requires rejection rules that prune variables and records and the use of multiple criteria for evaluating model performance. The main contribution of this paper is to document a standard method that integrates formal model identification, estimation, and validation. It also documents what we call the large variance problem, which is a leading cause of cost model brittleness or instability.

  6. Large data series: Modeling the usual to identify the unusual

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Downing, D.J.; Fedorov, V.V.; Lawkins, W.F.

    "Standard" approaches such as regression analysis, Fourier analysis, Box-Jenkins procedure, et al., which handle a data series as a whole, are not useful for very large data sets for at least two reasons. First, even with computer hardware available today, including parallel processors and storage devices, there are no effective means for manipulating and analyzing gigabyte, or larger, data files. Second, in general it can not be assumed that a very large data set is "stable" by the usual measures, like homogeneity, stationarity, and ergodicity, that standard analysis techniques require. Both reasons dictate the necessity to use "local" data analysis methods whereby the data is segmented and ordered, where order leads to a sense of "neighbor," and then analyzed segment by segment. The idea of local data analysis is central to the study reported here.

  7. Computed tomography-based volumetric tool for standardized measurement of the maxillary sinus

    PubMed Central

    Giacomini, Guilherme; Pavan, Ana Luiza Menegatti; Altemani, João Mauricio Carrasco; Duarte, Sergio Barbosa; Fortaleza, Carlos Magno Castelo Branco; Miranda, José Ricardo de Arruda

    2018-01-01

    Volume measurements of the maxillary sinus may be useful for identifying diseases affecting the paranasal sinuses. However, the literature shows a lack of consensus among studies measuring the volume. This may be attributable to different computed tomography data acquisition techniques, segmentation methods, and focuses of investigation, among other reasons. Furthermore, methods for volumetrically quantifying the maxillary sinus are commonly manual or semiautomated, which require substantial user expertise and are time-consuming. The purpose of the present study was to develop an automated tool for quantifying the total and air-free volume of the maxillary sinus based on computed tomography images. The quantification tool seeks to standardize maxillary sinus volume measurements, thus allowing better comparisons and determinations of factors that influence maxillary sinus size. The automated tool utilized image processing techniques (watershed, threshold, and morphological operators). The maxillary sinus volume was quantified in 30 patients. To evaluate the accuracy of the automated tool, the results were compared with manual segmentation that was performed by an experienced radiologist using a standard procedure. The mean percent differences between the automated and manual methods were 7.19% ± 5.83% and 6.93% ± 4.29% for total and air-free maxillary sinus volume, respectively. Linear regression and Bland-Altman statistics showed good agreement and low dispersion between both methods. The present automated tool for maxillary sinus volume assessment was rapid, reliable, robust, accurate, and reproducible and may be applied in clinical practice. The tool may be used to standardize measurements of maxillary volume. Such standardization is extremely important for allowing comparisons between studies, providing a better understanding of the role of the maxillary sinus, and determining the factors that influence maxillary sinus size under normal and pathological conditions. PMID:29304130

  8. Improving estimates of streamflow characteristics by using Landsat-1 imagery

    USGS Publications Warehouse

    Hollyday, Este F.

    1976-01-01

    Imagery from the first Earth Resources Technology Satellite (renamed Landsat-1) was used to discriminate physical features of drainage basins in an effort to improve equations used to estimate streamflow characteristics at gaged and ungaged sites. Records of 20 gaged basins in the Delmarva Peninsula of Maryland, Delaware, and Virginia were analyzed for 40 statistical streamflow characteristics. Equations relating these characteristics to basin characteristics were obtained by a technique of multiple linear regression. A control group of equations contains basin characteristics derived from maps. An experimental group of equations contains basin characteristics derived from maps and imagery. Characteristics from imagery were forest, riparian (streambank) vegetation, water, and combined agricultural and urban land use. These basin characteristics were isolated photographically by techniques of film-density discrimination. The area of each characteristic in each basin was measured photometrically. Comparison of equations in the control group with corresponding equations in the experimental group reveals that for 12 out of 40 equations the standard error of estimate was reduced by more than 10 percent. As an example, the standard error of estimate of the equation for the 5-year recurrence-interval flood peak was reduced from 46 to 32 percent. Similarly, the standard error of the equation for the mean monthly flow for September was reduced from 32 to 24 percent, the standard error for the 7-day, 2-year recurrence low flow was reduced from 136 to 102 percent, and the standard error for the 3-day, 2-year flood volume was reduced from 30 to 12 percent. It is concluded that data from Landsat imagery can substantially improve the accuracy of estimates of some streamflow characteristics at sites in the Delmarva Peninsula.

  9. A predictive modeling approach to increasing the economic effectiveness of disease management programs.

    PubMed

    Bayerstadler, Andreas; Benstetter, Franz; Heumann, Christian; Winter, Fabian

    2014-09-01

    Predictive Modeling (PM) techniques are gaining importance in the worldwide health insurance business. Modern PM methods are used for customer relationship management, risk evaluation or medical management. This article illustrates a PM approach that enables the economic potential of (cost-) effective disease management programs (DMPs) to be fully exploited by optimized candidate selection as an example of successful data-driven business management. The approach is based on a Generalized Linear Model (GLM) that is easy to apply for health insurance companies. By means of a small portfolio from an emerging country, we show that our GLM approach is stable compared to more sophisticated regression techniques in spite of the difficult data environment. Additionally, we demonstrate for this example of a setting that our model can compete with the expensive solutions offered by professional PM vendors and outperforms non-predictive standard approaches for DMP selection commonly used in the market.

  10. Optimizing Hybrid Metrology: Rigorous Implementation of Bayesian and Combined Regression.

    PubMed

    Henn, Mark-Alexander; Silver, Richard M; Villarrubia, John S; Zhang, Nien Fan; Zhou, Hui; Barnes, Bryan M; Ming, Bin; Vladár, András E

    2015-01-01

    Hybrid metrology, e.g., the combination of several measurement techniques to determine critical dimensions, is an increasingly important approach to meet the needs of the semiconductor industry. A proper use of hybrid metrology may yield not only more reliable estimates for the quantitative characterization of 3-D structures but also a more realistic estimation of the corresponding uncertainties. Recent developments at the National Institute of Standards and Technology (NIST) feature the combination of optical critical dimension (OCD) measurements and scanning electron microscope (SEM) results. The hybrid methodology offers the potential to make measurements of essential 3-D attributes that may not be otherwise feasible. However, combining techniques gives rise to essential challenges in error analysis and comparing results from different instrument models, especially the effect of systematic and highly correlated errors in the measurement on the χ² function that is minimized. Both hypothetical examples and measurement data are used to illustrate solutions to these challenges.

  11. Memoized Symbolic Execution

    NASA Technical Reports Server (NTRS)

    Yang, Guowei; Pasareanu, Corina S.; Khurshid, Sarfraz

    2012-01-01

    This paper introduces memoized symbolic execution (Memoise), a novel approach for more efficient application of forward symbolic execution, which is a well-studied technique for systematic exploration of program behaviors based on bounded execution paths. Our key insight is that application of symbolic execution often requires several successive runs of the technique on largely similar underlying problems, e.g., running it once to check a program to find a bug, fixing the bug, and running it again to check the modified program. Memoise introduces a trie-based data structure that stores the key elements of a run of symbolic execution. Maintenance of the trie during successive runs allows re-use of previously computed results of symbolic execution without the need for re-computing them as is traditionally done. Experiments using our prototype embodiment of Memoise show the benefits it holds in various standard scenarios of using symbolic execution, e.g., with iterative deepening of exploration depth, to perform regression analysis, or to enhance coverage.

  12. Accurate predictions of iron redox state in silicate glasses: A multivariate approach using X-ray absorption spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dyar, M. Darby; McCanta, Molly; Breves, Elly

    2016-03-01

    Pre-edge features in the K absorption edge of X-ray absorption spectra are commonly used to predict Fe3+ valence state in silicate glasses. However, this study shows that using the entire spectral region from the pre-edge into the extended X-ray absorption fine-structure region provides more accurate results when combined with multivariate analysis techniques. The least absolute shrinkage and selection operator (lasso) regression technique yields %Fe3+ values that are accurate to ±3.6% absolute when the full spectral region is employed. This method can be used across a broad range of glass compositions, is easily automated, and is demonstrated to yield accurate results from different synchrotrons. It will enable future studies involving X-ray mapping of redox gradients on standard thin sections at 1 × 1 μm pixel sizes.
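
    A small sketch of lasso regression applied to full spectra, in the spirit of the approach above, using scikit-learn; the synthetic "spectra", channel count, and the informative-region assumption are placeholders rather than the authors' data.

    ```python
    import numpy as np
    from sklearn.linear_model import LassoCV
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    n_glasses, n_channels = 120, 400           # spectra sampled on 400 energy channels
    true_weights = np.zeros(n_channels)
    true_weights[50:60] = 0.8                  # pretend only a narrow region carries signal
    X = rng.normal(size=(n_glasses, n_channels))
    y = X @ true_weights + rng.normal(scale=0.5, size=n_glasses)   # stand-in for %Fe3+

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    lasso = LassoCV(cv=5).fit(X_train, y_train)   # the L1 penalty zeroes out most channels

    pred = lasso.predict(X_test)
    print("channels retained:", int(np.sum(lasso.coef_ != 0)))
    print("RMSE on held-out glasses:", round(float(np.sqrt(np.mean((pred - y_test) ** 2))), 2))
    ```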

  13. Accurate predictions of iron redox state in silicate glasses: A multivariate approach using X-ray absorption spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dyar, M. Darby; McCanta, Molly; Breves, Elly

    2016-03-01

    Pre-edge features in the K absorption edge of X-ray absorption spectra are commonly used to predict Fe3+ valence state in silicate glasses. However, this study shows that using the entire spectral region from the pre-edge into the extended X-ray absorption fine-structure region provides more accurate results when combined with multivariate analysis techniques. The least absolute shrinkage and selection operator (lasso) regression technique yields %Fe3+ values that are accurate to ±3.6% absolute when the full spectral region is employed. This method can be used across a broad range of glass compositions, is easily automated, and is demonstrated to yield accurate results from different synchrotrons. It will enable future studies involving X-ray mapping of redox gradients on standard thin sections at 1 × 1 μm pixel sizes.

  14. Dynamic multifactor clustering of financial networks

    NASA Astrophysics Data System (ADS)

    Ross, Gordon J.

    2014-02-01

    We investigate the tendency for financial instruments to form clusters when there are multiple factors influencing the correlation structure. Specifically, we consider a stock portfolio which contains companies from different industrial sectors, located in several different countries. Both sector membership and geography combine to create a complex clustering structure where companies seem to first be divided based on sector, with geographical subclusters emerging within each industrial sector. We argue that standard techniques for detecting overlapping clusters and communities are not able to capture this type of structure and show how robust regression techniques can instead be used to remove the influence of both sector and geography from the correlation matrix separately. Our analysis reveals that prior to the 2008 financial crisis, companies did not tend to form clusters based on geography. This changed immediately following the crisis, with geography becoming a more important determinant of clustering structure.

  15. Computerized scintigraphic technique for the evaluation of adult respiratory distress syndrome: initial clinical trials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tatum, J.L.; Burke, T.S.; Sugerman, H.T.

    1982-04-01

    Eleven patients with suspected adult respiratory distress syndrome (ARDS) and five control patients were studied using a computerized gamma imaging and analysis technique and 99mTc-labeled human serum albumin. The heart and right lung were imaged, lung:heart ratio was plotted vs. time, and a linear regression was fitted to the data points displayed. The slope of this fit was termed the "slope index." An index value of 2 standard deviations greater than the control mean was considered positive. Radiographs from the six positive studies revealed typical diffuse air-space disease. Radiographs from two of the five negative studies demonstrated air-space consolidation. Both of these patients had elevated pulmonary capillary wedge pressure, cardiomegaly, and clinical course consistent with cardiogenic pulmonary edema. These preliminary data demonstrated a good correlation between positive slope index and clinical ARDS.

  16. Computerized scintigraphic technique for the evaluation of adult respiratory distress syndrome: initial clinical trials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tatum, J.L.; Burke, T.S.; Sugerman, H.J.

    1982-04-01

    Eleven patients with suspected adult respiratory distress syndrome (ARDS) and five control patients were studied using a computerized gamma imaging and analysis technique and 99mTc-labeled human serum albumin. The heart and right lung were imaged, lung:heart ratio was plotted vs. time, and a linear regression was fitted to the data points displayed. The slope of this fit was termed the "slope index." An index value of 2 standard deviations greater than the control mean was considered positive. Radiographs from the six positive studies revealed typical diffuse air-space disease. Radiographs from two of the five negative studies demonstrated air-space consolidation. Both of these patients had elevated pulmonary capillary wedge pressure, cardiomegaly, and clinical course consistent with cardiogenic pulmonary edema. These preliminary data demonstrated a good correlation between positive slope index and clinical ARDS.

  17. Motion patterns in acupuncture needle manipulation.

    PubMed

    Seo, Yoonjeong; Lee, In-Seon; Jung, Won-Mo; Ryu, Ho-Sun; Lim, Jinwoong; Ryu, Yeon-Hee; Kang, Jung-Won; Chae, Younbyoung

    2014-10-01

    In clinical practice, acupuncture manipulation is highly individualised for each practitioner. Before we establish a standard for acupuncture manipulation, it is important to understand completely the manifestations of acupuncture manipulation in the actual clinic. To examine motion patterns during acupuncture manipulation, we generated a fitted model of practitioners' motion patterns and evaluated their consistencies in acupuncture manipulation. Using a motion sensor, we obtained real-time motion data from eight experienced practitioners while they conducted acupuncture manipulation using their own techniques. We calculated the average amplitude and duration of a sampled motion unit for each practitioner and, after normalisation, we generated a true regression curve of motion patterns for each practitioner using a generalised additive mixed modelling (GAMM). We observed significant differences in rotation amplitude and duration in motion samples among practitioners. GAMM showed marked variations in average regression curves of motion patterns among practitioners but there was strong consistency in motion parameters for individual practitioners. The fitted regression model showed that the true regression curve accounted for an average of 50.2% of variance in the motion pattern for each practitioner. Our findings suggest that there is great inter-individual variability between practitioners, but remarkable intra-individual consistency within each practitioner. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  18. Hybrid ICA-Regression: Automatic Identification and Removal of Ocular Artifacts from Electroencephalographic Signals.

    PubMed

    Mannan, Malik M Naeem; Jeong, Myung Y; Kamran, Muhammad A

    2016-01-01

    Electroencephalography (EEG) is a portable brain-imaging technique with the advantage of high temporal resolution that can be used to record the electrical activity of the brain. However, EEG signals are difficult to analyze because they are contaminated by ocular artifacts, which can lead to misleading conclusions. Contamination by ocular artifacts has also been shown to reduce the classification accuracy of a brain-computer interface (BCI). It is therefore very important to remove or reduce these artifacts before analyzing EEG signals for applications such as BCI. In this paper, a hybrid framework that combines independent component analysis (ICA), regression and higher-order statistics is proposed to identify and eliminate artifactual activities from EEG data. We used simulated, experimental and standard EEG signals to evaluate and analyze the effectiveness of the proposed method. Results demonstrate that the proposed method can effectively remove ocular artifacts while preserving the neuronal signals present in EEG data. A comparison with four methods from the literature, namely ICA, regression analysis, wavelet-ICA (wICA), and regression-ICA (REGICA), confirms the significantly enhanced performance and effectiveness of the proposed method for removing ocular activities from EEG, in terms of lower mean square error and mean absolute error values and higher mutual information between the reconstructed and original EEG.

  19. Hybrid ICA—Regression: Automatic Identification and Removal of Ocular Artifacts from Electroencephalographic Signals

    PubMed Central

    Mannan, Malik M. Naeem; Jeong, Myung Y.; Kamran, Muhammad A.

    2016-01-01

    Electroencephalography (EEG) is a portable brain-imaging technique with the advantage of high temporal resolution that can be used to record the electrical activity of the brain. However, EEG signals are difficult to analyze because they are contaminated by ocular artifacts, which can lead to misleading conclusions. Contamination by ocular artifacts has also been shown to reduce the classification accuracy of a brain-computer interface (BCI). It is therefore very important to remove or reduce these artifacts before analyzing EEG signals for applications such as BCI. In this paper, a hybrid framework that combines independent component analysis (ICA), regression and higher-order statistics is proposed to identify and eliminate artifactual activities from EEG data. We used simulated, experimental and standard EEG signals to evaluate and analyze the effectiveness of the proposed method. Results demonstrate that the proposed method can effectively remove ocular artifacts while preserving the neuronal signals present in EEG data. A comparison with four methods from the literature, namely ICA, regression analysis, wavelet-ICA (wICA), and regression-ICA (REGICA), confirms the significantly enhanced performance and effectiveness of the proposed method for removing ocular activities from EEG, in terms of lower mean square error and mean absolute error values and higher mutual information between the reconstructed and original EEG. PMID:27199714

  20. Using ridge regression in systematic pointing error corrections

    NASA Technical Reports Server (NTRS)

    Guiar, C. N.

    1988-01-01

    A pointing error model is used in the antenna calibration process. Data from spacecraft or radio star observations are used to determine the parameters in the model. However, the regression variables are not truly independent, displaying a condition known as multicollinearity. Ridge regression, a biased estimation technique, is used to combat the multicollinearity problem. Two data sets pertaining to Voyager 1 spacecraft tracking (days 105 and 106 of 1987) were analyzed using both linear least squares and ridge regression methods. The advantages and limitations of employing the technique are presented. The problem is not yet fully resolved.
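
    A small, generic illustration (not the original pointing-error model) of why ridge regression is useful when regressors are nearly collinear: the ordinary least-squares coefficients become unstable, while the ridge estimates are shrunk toward stable values.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge

    rng = np.random.default_rng(3)
    n = 200
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(scale=0.01, size=n)    # nearly collinear with x1
    X = np.column_stack([x1, x2])
    y = 1.0 * x1 + 1.0 * x2 + rng.normal(scale=0.5, size=n)

    ols = LinearRegression().fit(X, y)
    ridge = Ridge(alpha=1.0).fit(X, y)          # biased, but much lower variance

    print("OLS coefficients  :", np.round(ols.coef_, 2))    # typically large, offsetting values
    print("Ridge coefficients:", np.round(ridge.coef_, 2))  # close to the stable (1, 1) pattern
    ```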

  1. Forensic dental age estimation by measuring root dentin translucency area using a new digital technique.

    PubMed

    Acharya, Ashith B

    2014-05-01

    Dentin translucency measurement is an easy yet relatively accurate approach to postmortem age estimation. Translucency area represents a two-dimensional change and may reflect age variations better than length. Manually measuring area is challenging and this paper proposes a new digital method using commercially available computer hardware and software. Area and length were measured on 100 tooth sections (age range, 19-82 years) of 250 μm thickness. Regression analysis revealed lower standard error of estimate and higher correlation with age for length than for area (R = 0.62 vs. 0.60). However, test of regression formulae on a control sample (n = 33, 21-85 years) showed smaller mean absolute difference (8.3 vs. 8.8 years) and greater frequency of smaller errors (73% vs. 67% age estimates ≤ ± 10 years) for area than for length. These suggest that digital area measurements of root translucency may be used as an alternative to length in forensic age estimation. © 2014 American Academy of Forensic Sciences.

  2. Performance evaluation of spectral vegetation indices using a statistical sensitivity function

    USGS Publications Warehouse

    Ji, Lei; Peters, Albert J.

    2007-01-01

    A great number of spectral vegetation indices (VIs) have been developed to estimate biophysical parameters of vegetation. Traditional techniques for evaluating the performance of VIs are regression-based statistics, such as the coefficient of determination and root mean square error. These statistics, however, are not capable of quantifying the detailed relationship between VIs and biophysical parameters because the sensitivity of a VI is usually a function of the biophysical parameter instead of a constant. To better quantify this relationship, we developed a “sensitivity function” for measuring the sensitivity of a VI to biophysical parameters. The sensitivity function is defined as the first derivative of the regression function, divided by the standard error of the dependent variable prediction. The function elucidates the change in sensitivity over the range of the biophysical parameter. The Student's t- or z-statistic can be used to test the significance of VI sensitivity. Additionally, we developed a “relative sensitivity function” that compares the sensitivities of two VIs when the biophysical parameters are unavailable.
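
    A simplified numerical sketch of the sensitivity function defined above (first derivative of the fitted regression function divided by the standard error of the predicted VI), assuming statsmodels; the VI-versus-LAI data and the quadratic regression form are invented.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    lai = np.linspace(0.1, 6.0, 80)             # biophysical parameter (e.g., leaf area index)
    vi = 0.9 * (1 - np.exp(-0.7 * lai)) + rng.normal(scale=0.02, size=lai.size)  # saturating VI

    # Quadratic regression of VI on LAI
    X = sm.add_constant(np.column_stack([lai, lai ** 2]))
    fit = sm.OLS(vi, X).fit()
    _, b1, b2 = fit.params

    se_pred = fit.get_prediction(X).se_mean     # standard error of the predicted VI
    dvi_dlai = b1 + 2 * b2 * lai                # first derivative of the fitted function
    sensitivity = dvi_dlai / se_pred            # large where the VI still responds to LAI

    print("sensitivity near LAI = 1:", round(float(np.interp(1.0, lai, sensitivity)), 1))
    print("sensitivity near LAI = 5:", round(float(np.interp(5.0, lai, sensitivity)), 1))
    ```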

  3. Application of glas laser altimetry to detect elevation changes in East Antarctica

    NASA Astrophysics Data System (ADS)

    Scaioni, M.; Tong, X.; Li, R.

    2013-10-01

    In this paper, the use of the ICESat/GLAS laser altimeter for estimating multi-temporal elevation changes on polar ice sheets is addressed. Because laser spots do not overlap during repeat passes, interpolation methods are required to make comparisons. After reviewing the main methods described in the literature (crossover point analysis, cross-track DEM projection, and space-temporal regression), the last was chosen for its capability of providing more elevation change rate measurements. The standard implementation of the space-temporal linear regression technique has been revisited and improved to better cope with outliers and to check the estimability of the model parameters. GLAS data over the PANDA route in East Antarctica were used for testing. The results obtained are physically meaningful, confirming the trend reported in the literature of constant snow accumulation in the area during the past two decades, unlike most of the continent, which has been losing mass.

  4. Analysis of an experiment aimed at improving the reliability of transmission centre shafts.

    PubMed

    Davis, T P

    1995-01-01

    Smith (1991) presents a paper proposing the use of Weibull regression models to establish dependence of failure data (usually times) on covariates related to the design of the test specimens and test procedures. In his article Smith made the point that good experimental design was as important in reliability applications as elsewhere, and in view of the current interest in design inspired by Taguchi and others, we pay some attention in this article to that topic. A real case study from the Ford Motor Company is presented. Our main approach is to utilize suggestions in the literature for applying standard least squares techniques of experimental analysis even when there is likely to be nonnormal error, and censoring. This approach lacks theoretical justification, but its appeal is its simplicity and flexibility. For completeness we also include some analysis based on the proportional hazards model, and in an attempt to link back to Smith (1991), look at a Weibull regression model.

  5. On the Occurrence of Standardized Regression Coefficients Greater than One.

    ERIC Educational Resources Information Center

    Deegan, John, Jr.

    1978-01-01

    It is demonstrated here that standardized regression coefficients greater than one can legitimately occur. Furthermore, the relationship between the occurrence of such coefficients and the extent of multicollinearity present among the set of predictor variables in an equation is examined. Comments on the interpretation of these coefficients are…

  6. Compliance with Standard Precautions and Associated Factors among Healthcare Workers in Gondar University Comprehensive Specialized Hospital, Northwest Ethiopia

    PubMed Central

    Haile, Tariku Gebre

    2017-01-01

    Background. In many studies, compliance with standard precautions among healthcare workers was reported to be inadequate. Objective. The aim of this study was to assess compliance with standard precautions and associated factors among healthcare workers in northwest Ethiopia. Methods. An institution-based cross-sectional study was conducted from March 01 to April 30, 2014. Simple random sampling technique was used to select participants. Data were entered into Epi info 3.5.1 and were exported to SPSS version 20.0 for statistical analysis. Multivariate logistic regression analyses were computed and adjusted odds ratio with 95% confidence interval was calculated to identify associated factors. Results. The proportion of healthcare workers who always comply with standard precautions was found to be 12%. Being a female healthcare worker (AOR [95% CI] 2.18 [1.12–4.23]), higher infection risk perception (AOR [95% CI] 3.46 [1.67–7.18]), training on standard precautions (AOR [95% CI] 2.90 [1.20–7.02]), accessibility of personal protective equipment (AOR [95% CI] 2.87 [1.41–5.86]), and management support (AOR [95% CI] 2.23 [1.11–4.53]) were found to be statistically significant. Conclusion and Recommendation. Compliance with standard precautions among the healthcare workers is very low. Interventions which include training of healthcare workers on standard precautions and consistent management support are recommended. PMID:28191020

  7. The standard deviation of extracellular water/intracellular water is associated with all-cause mortality and technique failure in peritoneal dialysis patients.

    PubMed

    Tian, Jun-Ping; Wang, Hong; Du, Feng-He; Wang, Tao

    2016-09-01

    The mortality rate of peritoneal dialysis (PD) patients is still high, and the predicting factors for PD patient mortality remain to be determined. This study aimed to explore the relationship between the standard deviation (SD) of extracellular water/intracellular water (E/I) and all-cause mortality and technique failure in continuous ambulatory PD (CAPD) patients. All 152 patients came from the PD Center between January 1st 2006 and December 31st 2007. Clinical data and E/I ratios from at least five visits, defined by bioelectrical impedance analysis, were collected. The patients were followed up till December 31st 2010. The primary outcomes were death from any cause and technique failure. Kaplan-Meier analysis and Cox proportional hazards models were used to identify risk factors for mortality and technique failure in CAPD patients. All patients were followed up for 59.6 ± 23.0 months. The patients were divided into two groups according to their SD of E/I values: lower SD of E/I group (≤0.126) and higher SD of E/I group (>0.126). The patients with higher SD of E/I showed a higher all-cause mortality (log-rank χ² = 10.719, P = 0.001) and technique failure (log-rank χ² = 9.724, P = 0.002) than those with lower SD of E/I. Cox regression analysis found that SD of E/I independently predicted all-cause mortality (HR 3.551, 95% CI 1.442-8.746, P = 0.006) and technique failure (HR 2.487, 95% CI 1.093-5.659, P = 0.030) in CAPD patients after adjustment for confounders except when sensitive C-reactive protein was added into the model. The SD of E/I was a strong independent predictor of all-cause mortality and technique failure in CAPD patients.
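
    A generic sketch of the Cox proportional-hazards step described above, assuming the lifelines package is available; the data frame, the "sd_ei" column standing in for the SD of E/I, and the event-generation rule are invented for illustration.

    ```python
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    rng = np.random.default_rng(5)
    n = 152
    df = pd.DataFrame({
        "sd_ei": rng.gamma(2.0, 0.08, n),              # per-patient SD of the E/I ratio
        "age": rng.integers(30, 85, n),
        "months": rng.exponential(50, n).clip(1, 96),  # follow-up time
    })
    # Hypothetical all-cause mortality indicator loosely tied to sd_ei
    df["death"] = (rng.random(n) < 0.2 + 1.5 * df["sd_ei"]).astype(int)

    cph = CoxPHFitter()
    cph.fit(df, duration_col="months", event_col="death")
    cph.print_summary()     # hazard ratios with 95% confidence intervals, as reported above
    ```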

  8. Compressed air injection technique to standardize block injection pressures.

    PubMed

    Tsui, Ban C H; Li, Lisa X Y; Pillay, Jennifer J

    2006-11-01

    Presently, no standardized technique exists to monitor injection pressures during peripheral nerve blocks. Our objective was to determine if a compressed air injection technique, using an in vitro model based on Boyle's law and typical regional anesthesia equipment, could consistently maintain injection pressures below a 1293 mmHg level associated with clinically significant nerve injury. Injection pressures for 20 and 30 mL syringes with various needle sizes (18G, 20G, 21G, 22G, and 24G) were measured in a closed system. A set volume of air was aspirated into a saline-filled syringe and then compressed and maintained at various percentages while pressure was measured. The needle was inserted into the injection port of a pressure sensor, which had attached extension tubing with an injection plug clamped "off". Using linear regression with all data points, the pressure value and 99% confidence interval (CI) at 50% air compression was estimated. The linearity of Boyle's law was demonstrated with a high correlation, r = 0.99, and a slope of 0.984 (99% CI: 0.967-1.001). The net pressure generated at 50% compression was estimated as 744.8 mmHg, with the 99% CI between 729.6 and 760.0 mmHg. The various syringe/needle combinations had similar results. By creating and maintaining syringe air compression at 50% or less, injection pressures will be substantially below the 1293 mmHg threshold considered to be an associated risk factor for clinically significant nerve injury. This technique may allow simple, real-time and objective monitoring during local anesthetic injections while inherently reducing injection speed.
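
    A back-of-the-envelope check of the Boyle's-law reasoning above: in a closed, saline-filled syringe, compressing the trapped air to half of its original volume roughly doubles its absolute pressure, so the net (gauge) pressure stays near 760 mmHg, under the 1293 mmHg threshold cited. The atmospheric-pressure constant and compression fractions are the only inputs.

    ```python
    P_ATM_MMHG = 760.0

    def net_pressure(compression_fraction):
        """Gauge pressure (mmHg) after compressing trapped air to the given fraction
        of its original volume, assuming isothermal conditions (P1 * V1 = P2 * V2)."""
        absolute = P_ATM_MMHG / compression_fraction
        return absolute - P_ATM_MMHG

    for frac in (0.9, 0.7, 0.5):
        print(f"compress to {frac:.0%} of volume -> net pressure {net_pressure(frac):.0f} mmHg")
    # At 50% compression the net pressure is about 760 mmHg, consistent with the
    # study's regression estimate of roughly 745 mmHg and below the 1293 mmHg threshold.
    ```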

  9. Analysis of Learning Curve Fitting Techniques.

    DTIC Science & Technology

    1987-09-01

    1986. 15. Neter, John, and others. Applied Linear Regression Models. Homewood, IL: Irwin, 19-33. 16. SAS User's Guide: Basics, Version 5 Edition. SAS... Linear Regression Techniques (15:23-52). Random errors are assumed to be normally distributed when using ordinary least-squares, according to Johnston... lot estimated by the improvement curve formula. For a more detailed explanation of the ordinary least-squares technique, see Neter, et al., Applied Linear Regression Models.

  10. The National Streamflow Statistics Program: A Computer Program for Estimating Streamflow Statistics for Ungaged Sites

    USGS Publications Warehouse

    Ries (compiler), Kernell G.; With sections by Atkins, J. B.; Hummel, P.R.; Gray, Matthew J.; Dusenbury, R.; Jennings, M.E.; Kirby, W.H.; Riggs, H.C.; Sauer, V.B.; Thomas, W.O.

    2007-01-01

    The National Streamflow Statistics (NSS) Program is a computer program that should be useful to engineers, hydrologists, and others for planning, management, and design applications. NSS compiles all current U.S. Geological Survey (USGS) regional regression equations for estimating streamflow statistics at ungaged sites in an easy-to-use interface that operates on computers with Microsoft Windows operating systems. NSS expands on the functionality of the USGS National Flood Frequency Program, and replaces it. The regression equations included in NSS are used to transfer streamflow statistics from gaged to ungaged sites through the use of watershed and climatic characteristics as explanatory or predictor variables. Generally, the equations were developed on a statewide or metropolitan-area basis as part of cooperative study programs. Equations are available for estimating rural and urban flood-frequency statistics, such as the 100-year flood, for every state, for Puerto Rico, and for the island of Tutuila, American Samoa. Equations are available for estimating other statistics, such as the mean annual flow, monthly mean flows, flow-duration percentiles, and low-flow frequencies (such as the 7-day, 10-year low flow) for less than half of the states. All equations available for estimating streamflow statistics other than flood-frequency statistics assume rural (non-regulated, non-urbanized) conditions. The NSS output provides indicators of the accuracy of the estimated streamflow statistics. The indicators may include any combination of the standard error of estimate, the standard error of prediction, the equivalent years of record, or 90 percent prediction intervals, depending on what was provided by the authors of the equations. The program includes several other features that can be used only for flood-frequency estimation. These include the ability to generate flood-frequency plots, and plots of typical flood hydrographs for selected recurrence intervals, estimates of the probable maximum flood, extrapolation of the 500-year flood when an equation for estimating it is not available, and weighting techniques to improve flood-frequency estimates for gaging stations and ungaged sites on gaged streams. This report describes the regionalization techniques used to develop the equations in NSS and provides guidance on the applicability and limitations of the techniques. The report also includes a user's manual and a summary of equations available for estimating basin lagtime, which is needed by the program to generate flood hydrographs. The NSS software and accompanying database, and the documentation for the regression equations included in NSS, are available on the Web at http://water.usgs.gov/software/.

  11. SU-G-BRA-08: Diaphragm Motion Tracking Based On KV CBCT Projections with a Constrained Linear Regression Optimization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wei, J; Chao, M

    2016-06-15

    Purpose: To develop a novel strategy to extract the respiratory motion of the thoracic diaphragm from kilovoltage cone beam computed tomography (CBCT) projections by a constrained linear regression optimization technique. Methods: A parabolic function was identified as the geometric model and was employed to fit the shape of the diaphragm on the CBCT projections. The search was initialized by five manually placed seeds on a pre-selected projection image. Temporal redundancy, the property that enables video compression and encoding techniques and is inherent in the dynamics of diaphragm motion, was combined with the geometrical shape of the diaphragm boundary and an associated algebraic constraint that significantly reduced the search space of viable parabolic parameters, so that the fit on subsequent projections could be optimized effectively by constrained linear regression. The algebraic constraints stipulating the kinetic range of the motion, together with a spatial constraint preventing unphysical deviations, yielded the optimal diaphragm contour with minimal initialization. The algorithm was assessed using a fluoroscopic movie acquired at a fixed anterior-posterior direction and kilovoltage CBCT projection image sets from four lung and two liver patients. The automatic tracing by the proposed algorithm and manual tracking by a human operator were compared in both the space and frequency domains. Results: The error between the estimated and manual detections for the fluoroscopic movie was 0.54 mm with a standard deviation (SD) of 0.45 mm, while the average error for the CBCT projections was 0.79 mm with an SD of 0.64 mm for all enrolled patients. The submillimeter accuracy exhibits the promise of the proposed constrained linear regression approach for tracking diaphragm motion on rotational projection images. Conclusion: The new algorithm will provide a potential solution for rendering diaphragm motion and ultimately improving tumor motion management in radiation therapy of cancer patients.
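
    An illustrative sketch (not the authors' implementation) of fitting a parabolic boundary to detected edge points under simple bound constraints on the coefficients, using SciPy's constrained linear least-squares solver; the points, bounds, and units are invented.

    ```python
    import numpy as np
    from scipy.optimize import lsq_linear

    rng = np.random.default_rng(6)
    x = np.linspace(-40, 40, 60)                          # pixel column offsets
    y_true = 0.02 * x ** 2 + 0.1 * x + 150.0              # "true" diaphragm boundary rows
    y_obs = y_true + rng.normal(scale=1.0, size=x.size)   # noisy edge detections

    A = np.column_stack([x ** 2, x, np.ones_like(x)])     # model: y = a*x^2 + b*x + c
    # Bounds keep the fit physically plausible: curvature sign and range, and an
    # apex row confined to a band around the previous frame's estimate.
    bounds = ([0.005, -1.0, 140.0], [0.05, 1.0, 160.0])

    fit = lsq_linear(A, y_obs, bounds=bounds)
    a, b, c = fit.x
    print(f"fitted parabola: y = {a:.3f}*x^2 + {b:.3f}*x + {c:.1f}")
    ```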

  12. Different lasers and techniques for proliferative diabetic retinopathy.

    PubMed

    Moutray, Tanya; Evans, Jennifer R; Lois, Noemi; Armstrong, David J; Peto, Tunde; Azuara-Blanco, Augusto

    2018-03-15

    Diabetic retinopathy (DR) is a chronic progressive disease of the retinal microvasculature associated with prolonged hyperglycaemia. Proliferative DR (PDR) is a sight-threatening complication of DR and is characterised by the development of abnormal new vessels in the retina, optic nerve head or anterior segment of the eye. Argon laser photocoagulation has been the gold standard for the treatment of PDR for many years, using regimens evaluated by the Early Treatment of Diabetic Retinopathy Study (ETDRS). Over the years, there have been modifications of the technique and introduction of new laser technologies. To assess the effects of different types of laser, other than argon laser, and different laser protocols, other than those established by the ETDRS, for the treatment of PDR. We compared different wavelengths; power and pulse duration; pattern, number and location of burns versus standard argon laser undertaken as specified by the ETDRS. We searched the Cochrane Central Register of Controlled Trials (CENTRAL) (which contains the Cochrane Eyes and Vision Trials Register) (2017, Issue 5); Ovid MEDLINE; Ovid Embase; LILACS; the ISRCTN registry; ClinicalTrials.gov and the ICTRP. The date of the search was 8 June 2017. We included randomised controlled trials (RCTs) of pan-retinal photocoagulation (PRP) using standard argon laser for treatment of PDR compared with any other laser modality. We excluded studies of lasers that are not in common use, such as the xenon arc, ruby or krypton laser. We followed Cochrane guidelines and graded the certainty of evidence using the GRADE approach. We identified 11 studies from Europe (6), the USA (2), the Middle East (1) and Asia (2). Five studies compared different types of laser to argon: Nd:YAG (2 studies) or diode (3 studies). Other studies compared modifications to the standard argon laser PRP technique. The studies were poorly reported and we judged all to be at high risk of bias in at least one domain. The sample size varied from 20 to 270 eyes but the majority included 50 participants or fewer. Nd:YAG versus argon laser (2 studies): very low-certainty evidence on vision loss, vision gain, progression and regression of PDR, pain during laser treatment and adverse effects. Diode versus argon laser (3 studies): very low-certainty evidence on vision loss, vision gain, progression and regression of PDR and adverse effects; moderate-certainty evidence that diode laser was more painful (risk ratio (RR) for troublesome pain during laser treatment 3.12, 95% CI 2.16 to 4.51; eyes = 202; studies = 3; I² = 0%). 0.5 second versus 0.1 second exposure (1 study): low-certainty evidence of lower chance of vision loss with 0.5 second compared with 0.1 second exposure, but estimates were imprecise and compatible with no difference or an increased chance of vision loss (RR 0.42, 95% CI 0.08 to 2.04, 44 eyes, 1 RCT); low-certainty evidence that people treated with 0.5 second exposure were more likely to gain vision (RR 2.22, 95% CI 0.68 to 7.28, 44 eyes, 1 RCT), but again the estimates were imprecise. People given 0.5 second exposure were more likely to have regression of PDR compared with 0.1 second laser PRP, again with an imprecise estimate (RR 1.17, 95% CI 0.92 to 1.48, 32 eyes, 1 RCT).
There was very low-certainty evidence on progression of PDR and adverse effects. 'Light intensity' PRP versus classic PRP (1 study): vision loss or gain was not reported, but the mean difference in logMAR acuity at 1 year was -0.09 logMAR (95% CI -0.22 to 0.04, 65 eyes, 1 RCT); and low-certainty evidence that fewer patients had pain during light PRP compared with classic PRP, with an imprecise estimate compatible with increased or decreased pain (RR 0.23, 95% CI 0.03 to 1.93, 65 eyes, 1 RCT). 'Mild scatter' (laser pattern limited to 400 to 600 laser burns in one sitting) PRP versus standard 'full' scatter PRP (1 study): very low-certainty evidence on vision and visual field loss. No information on adverse effects. 'Central' (a more central PRP in addition to mid-peripheral PRP) versus 'peripheral' standard PRP (1 study): low-certainty evidence that people treated with central PRP were more likely to lose 15 or more letters of BCVA compared with peripheral laser PRP (RR 3.00, 95% CI 0.67 to 13.46, 50 eyes, 1 RCT); and less likely to gain 15 or more letters (RR 0.25, 95% CI 0.03 to 2.08), with imprecise estimates compatible with increased or decreased risk. 'Centre sparing' PRP (argon laser distribution limited to 3 disc diameters from the upper temporal and lower margin of the fovea) versus standard 'full scatter' PRP (1 study): low-certainty evidence that people treated with 'centre sparing' PRP were less likely to lose 15 or more ETDRS letters of BCVA compared with 'full scatter' PRP (RR 0.67, 95% CI 0.30 to 1.50, 53 eyes). Low-certainty evidence of similar risk of regression of PDR between groups (RR 0.96, 95% CI 0.73 to 1.27, 53 eyes). Adverse events were not reported. 'Extended targeted' PRP (to include the equator and any capillary non-perfusion areas between the vascular arcades) versus standard PRP (1 study): low-certainty evidence that people in the extended group had a similar or slightly reduced chance of loss of 15 or more letters of BCVA compared with the standard PRP group (RR 0.94, 95% CI 0.70 to 1.28, 270 eyes). Low-certainty evidence that people in the extended group had a similar or slightly increased chance of regression of PDR compared with the standard PRP group (RR 1.11, 95% CI 0.95 to 1.31, 270 eyes). Very low-certainty information on adverse effects. Modern laser techniques and modalities have been developed to treat PDR. However, there is limited evidence available with respect to the efficacy and safety of alternative laser systems or strategies compared with the standard argon laser as described in ETDRS.

  13. Analysis and Interpretation of Findings Using Multiple Regression Techniques

    ERIC Educational Resources Information Center

    Hoyt, William T.; Leierer, Stephen; Millington, Michael J.

    2006-01-01

    Multiple regression and correlation (MRC) methods form a flexible family of statistical techniques that can address a wide variety of different types of research questions of interest to rehabilitation professionals. In this article, we review basic concepts and terms, with an emphasis on interpretation of findings relevant to research questions…

  14. Stock price forecasting for companies listed on Tehran stock exchange using multivariate adaptive regression splines model and semi-parametric splines technique

    NASA Astrophysics Data System (ADS)

    Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad

    2015-11-01

    One of the most important topics of interest to investors is stock price changes. Investors with long-term goals are sensitive to stock prices and their changes and react to them. In this study, we used the multivariate adaptive regression splines (MARS) model and a semi-parametric splines technique for predicting stock prices. The MARS model is an adaptive nonparametric regression method that is well suited to high-dimensional problems with many variables; the semi-parametric technique used here is based on smoothing splines, a nonparametric regression method. We used 40 variables (30 accounting variables and 10 economic variables) to predict stock prices with both the MARS model and the semi-parametric splines technique. After investigating the models, we selected 4 accounting variables (book value per share, predicted earnings per share, P/E ratio, and risk) as influential variables for predicting stock prices with the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS forecast, and P/E ratio) were selected as variables effective in forecasting stock prices.
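
    A minimal sketch of the smoothing-spline (semi-parametric) side of the study using SciPy; the synthetic price-versus-EPS data and the smoothing factor are placeholders, and a MARS fit itself would require a dedicated package and is not shown.

    ```python
    import numpy as np
    from scipy.interpolate import UnivariateSpline

    rng = np.random.default_rng(7)
    eps = np.sort(rng.uniform(0.5, 8.0, 120))       # e.g., predicted earnings per share
    price = 40 + 12 * np.log(eps) + rng.normal(scale=3.0, size=eps.size)

    # The smoothing factor s trades off fidelity to the data against smoothness.
    spline = UnivariateSpline(eps, price, s=len(eps) * 9.0)

    for e in np.linspace(eps.min(), eps.max(), 5):
        print(f"EPS {e:4.1f} -> fitted price {float(spline(e)):6.1f}")
    ```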

  15. Morse Code, Scrabble, and the Alphabet

    ERIC Educational Resources Information Center

    Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss

    2004-01-01

    In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…

  16. Standard weight (Ws) equations for four rare desert fishes

    USGS Publications Warehouse

    Didenko, A.V.; Bonar, Scott A.; Matter, W.J.

    2004-01-01

    Standard weight (Ws) equations have been used extensively to examine body condition in sport fishes. However, development of these equations for nongame fishes has only recently been emphasized. We used the regression-line-percentile technique to develop standard weight equations for four rare desert fishes: flannelmouth sucker Catostomus latipinnis, razorback sucker Xyrauchen texanus, roundtail chub Gila robusta, and humpback chub G. cypha. The Ws equation for flannelmouth suckers of 100-690 mm total length (TL) was developed from 17 populations: log10Ws = -5.180 + 3.068 log10TL. The Ws equation for razorback suckers of 110-885 mm TL was developed from 12 populations: log10Ws = -4.886 + 2.985 log10TL. The Ws equation for roundtail chub of 100-525 mm TL was developed from 20 populations: log10Ws = -5.065 + 3.015 log10TL. The Ws equation for humpback chub of 120-495 mm TL was developed from 9 populations: log10Ws = -5.278 + 3.096 log10TL. These equations meet criteria for acceptable standard weight indexes and can be used to calculate relative weight, an index of body condition.
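
    A worked example applying the flannelmouth sucker equation reported above to compute relative weight (Wr = 100 * W / Ws); the fish's measured length and weight are hypothetical.

    ```python
    import math

    def ws_flannelmouth(total_length_mm):
        """Standard weight (g) from log10(Ws) = -5.180 + 3.068*log10(TL), valid for 100-690 mm TL."""
        return 10 ** (-5.180 + 3.068 * math.log10(total_length_mm))

    tl_mm = 400        # total length of a sampled fish (hypothetical)
    weight_g = 520.0   # measured weight (hypothetical)

    ws = ws_flannelmouth(tl_mm)
    wr = 100 * weight_g / ws
    print(f"Ws = {ws:.0f} g, relative weight Wr = {wr:.0f}")   # Wr near 100 indicates average condition
    ```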

  17. Individual memory change after anterior temporal lobectomy: a base rate analysis using regression-based outcome methodology.

    PubMed

    Martin, R C; Sawrie, S M; Roth, D L; Gilliam, F G; Faught, E; Morawetz, R B; Kuzniecky, R

    1998-10-01

    To characterize patterns of base rate change on measures of verbal and visual memory after anterior temporal lobectomy (ATL) using a newly developed regression-based outcome methodology that accounts for effects of practice and regression towards the mean, and to comment on the predictive utility of baseline memory measures on postoperative memory outcome. Memory change was operationalized using regression-based change norms in a group of left (n = 53) and right (n = 48) ATL patients. All patients were administered tests of episodic verbal (prose recall, list learning) and visual (figure reproduction) memory, and semantic memory before and after ATL. ATL patients displayed a wide range of memory outcome across verbal and visual memory domains. Significant performance declines were noted for 25-50% of left ATL patients on verbal semantic and episodic memory tasks, while one-third of right ATL patients displayed significant declines in immediate and delayed episodic prose recall. Significant performance improvement was noted in an additional one-third of right ATL patients on delayed prose recall. Base rate change was similar between the two ATL groups across immediate and delayed visual memory. Approximately one-fourth of all patients displayed clinically meaningful losses on the visual memory task following surgery. Robust relationships between preoperative memory measures and nonstandardized change scores were attenuated or reversed using standardized memory outcome techniques. Our results demonstrated substantial group variability in memory outcome for ATL patients. These results extend previous research by incorporating known effects of practice and regression to the mean when addressing meaningful neuropsychological change following epilepsy surgery. Our findings also suggest that future neuropsychological outcome studies should take steps towards controlling for regression-to-the-mean before drawing predictive conclusions.
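
    A schematic example of regression-based change norms, assuming statsmodels and entirely invented test scores: the expected post-operative score is predicted from the pre-operative score using a control (test-retest) regression that absorbs practice effects and regression to the mean, and the observed change is then standardized.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)

    # Control (retest) sample capturing practice effects and regression to the mean
    pre_c = rng.normal(50, 10, 100)
    post_c = 5 + 0.9 * pre_c + rng.normal(0, 5, 100)

    norm_fit = sm.OLS(post_c, sm.add_constant(pre_c)).fit()
    see = np.sqrt(norm_fit.scale)                  # standard error of estimate of the norm

    def standardized_change(pre, post):
        """z-score of the observed post score relative to the control-predicted post score."""
        expected = norm_fit.params[0] + norm_fit.params[1] * pre
        return (post - expected) / see

    # A hypothetical left-ATL patient: pre-op 55, post-op 40 on a verbal memory test
    z = standardized_change(55, 40)
    print(f"standardized change z = {z:.2f} (values below about -1.64 would flag a decline)")
    ```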

  18. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  19. An Innovative Technique to Assess Spontaneous Baroreflex Sensitivity with Short Data Segments: Multiple Trigonometric Regressive Spectral Analysis.

    PubMed

    Li, Kai; Rüdiger, Heinz; Haase, Rocco; Ziemssen, Tjalf

    2018-01-01

    Objective: As the multiple trigonometric regressive spectral (MTRS) analysis is extraordinary in its ability to analyze short local data segments down to 12 s, we wanted to evaluate the impact of the data segment settings by applying the technique of MTRS analysis for baroreflex sensitivity (BRS) estimation using a standardized data pool. Methods: Spectral and baroreflex analyses were performed on the EuroBaVar dataset (42 recordings, including lying and standing positions). For this analysis, the technique of MTRS was used. We used different global and local data segment lengths, and chose the global data segments from different positions. Three global data segments of 1 and 2 min and three local data segments of 12, 20, and 30 s were used in MTRS analysis for BRS. Results: All the BRS-values calculated on the three global data segments were highly correlated, both in the supine and standing positions; the different global data segments provided similar BRS estimations. When using different local data segments, all the BRS-values were also highly correlated. However, in the supine position, using short local data segments of 12 s overestimated BRS compared with those using 20 and 30 s. In the standing position, the BRS estimations using different local data segments were comparable. There was no proportional bias for the comparisons between different BRS estimations. Conclusion: We demonstrate that BRS estimation by the MTRS technique is stable when using different global data segments, and MTRS is extraordinary in its ability to evaluate BRS in even short local data segments (20 and 30 s). Because of the non-stationary character of most biosignals, the MTRS technique would be preferable for BRS analysis especially in conditions when only short stationary data segments are available or when dynamic changes of BRS should be monitored.

  20. Post-processing through linear regression

    NASA Astrophysics Data System (ADS)

    van Schaeybroeck, B.; Vannitsem, S.

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz 1963 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
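
    A toy illustration of the simplest scheme compared above, ordinary least-squares post-processing: observations are regressed on raw forecasts over a training period and the fitted relation corrects later forecasts. The synthetic forecast-observation pairs are not the paper's Lorenz-system experiment.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(9)
    n = 300
    truth = rng.normal(20, 5, n)
    forecast = 0.8 * truth + 2.0 + rng.normal(0, 2, n)   # biased, imperfect forecasts

    train, test = slice(0, 200), slice(200, None)        # calibrate on the past, apply later
    ols = LinearRegression().fit(forecast[train].reshape(-1, 1), truth[train])
    corrected = ols.predict(forecast[test].reshape(-1, 1))

    def rmse(a, b):
        return float(np.sqrt(np.mean((a - b) ** 2)))

    print("raw forecast RMSE      :", round(rmse(forecast[test], truth[test]), 2))
    print("corrected forecast RMSE:", round(rmse(corrected, truth[test]), 2))
    ```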

  1. Estimation of liver T₂ in transfusion-related iron overload in patients with weighted least squares T₂ IDEAL.

    PubMed

    Vasanawala, Shreyas S; Yu, Huanzhou; Shimakawa, Ann; Jeng, Michael; Brittain, Jean H

    2012-01-01

    MR imaging of hepatic iron overload can be achieved by estimating T(2) values using multiple-echo sequences. The purpose of this work is to develop and clinically evaluate a weighted least squares algorithm based on the T(2) Iterative Decomposition of water and fat with Echo Asymmetry and Least-squares estimation (IDEAL) technique for volumetric estimation of hepatic T(2) in the setting of iron overload. The weighted least squares T(2) IDEAL technique improves T(2) estimation by automatically decreasing the impact of later, noise-dominated echoes. The technique was evaluated in 37 patients with iron overload. Each patient underwent (i) a standard 2D multiple-echo gradient echo sequence for T(2) assessment with nonlinear exponential fitting, and (ii) a 3D T(2) IDEAL technique, with and without a weighted least squares fit. Regression and Bland-Altman analysis demonstrated strong correlation between conventional 2D and T(2) IDEAL estimation. In cases of severe iron overload, T(2) IDEAL without weighted least squares reconstruction resulted in a relative overestimation of T(2) compared with weighted least squares. Copyright © 2011 Wiley-Liss, Inc.
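
    A simplified sketch of weighted least-squares T2 estimation from multi-echo magnitude data, in which later, noise-dominated echoes receive less weight; this is a generic log-linear illustration with invented echo times and signals, not the T(2) IDEAL algorithm itself.

    ```python
    import numpy as np

    rng = np.random.default_rng(10)
    te_ms = np.array([1.0, 2.5, 4.0, 5.5, 7.0, 8.5])    # echo times (ms)
    t2_true, s0 = 3.0, 1000.0                           # short T2: severe iron overload
    signal = s0 * np.exp(-te_ms / t2_true) + rng.normal(0, 15, te_ms.size)
    signal = np.clip(signal, 1.0, None)                 # magnitude data stay positive

    # Log-linearize: ln(S) = ln(S0) - TE/T2, and weight each echo by its squared
    # signal so the weakest (noisiest) echoes contribute least to the fit.
    A = np.column_stack([np.ones_like(te_ms), -te_ms])
    sqrt_w = signal                                     # sqrt of the weights w_i = signal_i**2
    coef, *_ = np.linalg.lstsq(A * sqrt_w[:, None], np.log(signal) * sqrt_w, rcond=None)
    t2_wls = 1.0 / coef[1]

    print(f"true T2 = {t2_true} ms, weighted least-squares estimate = {t2_wls:.2f} ms")
    ```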

  2. Using soft computing techniques to predict corrected air permeability using Thomeer parameters, air porosity and grain density

    NASA Astrophysics Data System (ADS)

    Nooruddin, Hasan A.; Anifowose, Fatai; Abdulraheem, Abdulazeez

    2014-03-01

    Soft computing techniques have recently become very popular in the oil industry. A number of computational intelligence-based predictive methods have been widely applied in the industry with high prediction capabilities. Some of the popular methods include feed-forward neural networks, radial basis function networks, generalized regression neural networks, functional networks, support vector regression, and adaptive network fuzzy inference systems. A comparative study among the most popular soft computing techniques is presented using a large dataset published in the literature describing multimodal pore systems in the Arab D formation. The inputs to the models are air porosity, grain density, and Thomeer parameters obtained using mercury injection capillary pressure profiles. Corrected air permeability is the target variable. Applying the developed permeability models in a modern reservoir characterization workflow ensures consistency between micro- and macro-scale information, represented mainly by Thomeer parameters and absolute permeability. The dataset was divided into two parts, with 80% of the data used for training and 20% for testing. The target permeability variable was transformed to the logarithmic scale as a pre-processing step and to show better correlations with the input variables. Statistical and graphical analyses of the results, including permeability cross-plots and detailed error measures, were created. In general, the comparative study showed very close results among the developed models. The feed-forward neural network permeability model showed the lowest average relative error, average absolute relative error, standard deviation of error, and root mean square error, making it the best model for such problems. The adaptive network fuzzy inference system also showed very good results.

  3. Combining macula clinical signs and patient characteristics for age-related macular degeneration diagnosis: a machine learning approach.

    PubMed

    Fraccaro, Paolo; Nicolo, Massimo; Bonetto, Monica; Giacomini, Mauro; Weller, Peter; Traverso, Carlo Enrico; Prosperi, Mattia; OSullivan, Dympna

    2015-01-27

    To investigate machine learning methods, ranging from simpler interpretable techniques to complex (non-linear) "black-box" approaches, for automated diagnosis of Age-related Macular Degeneration (AMD). Data from healthy subjects and patients diagnosed with AMD or other retinal diseases were collected during routine visits via an Electronic Health Record (EHR) system. Patients' attributes included demographics and, for each eye, presence/absence of major AMD-related clinical signs (soft drusen, retinal pigment epithelium defects/pigment mottling, depigmentation area, subretinal haemorrhage, subretinal fluid, macula thickness, macular scar, subretinal fibrosis). Interpretable techniques known as white-box methods, including logistic regression and decision trees, as well as less interpretable techniques known as black-box methods, such as support vector machines (SVM), random forests, and AdaBoost, were used to develop models (trained and validated on unseen data) to diagnose AMD. The gold standard was confirmed diagnosis of AMD by physicians. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were used to assess performance. The study population included 487 patients (912 eyes). In terms of AUC, random forests, logistic regression, and AdaBoost showed a mean performance of 0.92, followed by SVM and decision trees (0.90). All machine learning models identified soft drusen and age as the most discriminating variables in clinicians' decision pathways to diagnose AMD. Both black-box and white-box methods performed well in identifying diagnoses of AMD and their decision pathways. Machine learning models developed through the proposed approach, relying on clinical signs identified by retinal specialists, could be embedded into EHR systems to provide physicians with real-time (interpretable) support.
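
    A schematic version of the white-box versus black-box comparison described above, using scikit-learn's logistic regression and random forest with AUC as the metric; the three features and the synthetic label rule (driven mainly by age and soft drusen) are placeholders, not the EHR data.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(11)
    n = 900
    X = np.column_stack([
        rng.integers(50, 95, n),     # age
        rng.integers(0, 2, n),       # soft drusen present
        rng.integers(0, 2, n),       # pigment mottling present
    ])
    logit = -6 + 0.06 * X[:, 0] + 2.0 * X[:, 1] + 0.5 * X[:, 2]
    y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)   # synthetic AMD labels

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    models = [("logistic regression (white box)", LogisticRegression(max_iter=1000)),
              ("random forest (black box)", RandomForestClassifier(n_estimators=200, random_state=0))]
    for name, model in models:
        model.fit(X_tr, y_tr)
        auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
        print(f"{name}: AUC = {auc:.2f}")
    ```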

  4. Unobtrusive measurement of indoor energy expenditure using an infrared sensor-based activity monitoring system.

    PubMed

    Hwang, Bosun; Han, Jonghee; Choi, Jong Min; Park, Kwang Suk

    2008-11-01

    The purpose of this study was to develop an unobtrusive energy expenditure (EE) measurement system using an infrared (IR) sensor-based activity monitoring system to measure indoor activities and to estimate individual quantitative EE. IR-sensor activation counts were measured with a Bluetooth-based monitoring system and the standard EE was calculated using an established regression equation. Ten male subjects participated in the experiment and three different EE measurement systems (gas analyzer, accelerometer, IR sensor) were used simultaneously in order to determine the regression equation and evaluate the performance. As a standard measurement, oxygen consumption was simultaneously measured by a portable metabolic system (Metamax 3X, Cortex, Germany). A single room experiment was performed to develop a regression model of the standard EE measurement from the proposed IR sensor-based measurement system. In addition, correlation and regression analyses were done to compare the performance of the IR system with that of the Actigraph system. We determined that our proposed IR-based EE measurement system shows a similar correlation to the Actigraph system with the standard measurement system.

  5. An intercomparison of approaches for improving operational seasonal streamflow forecasts

    NASA Astrophysics Data System (ADS)

    Mendoza, Pablo A.; Wood, Andrew W.; Clark, Elizabeth; Rothwell, Eric; Clark, Martyn P.; Nijssen, Bart; Brekke, Levi D.; Arnold, Jeffrey R.

    2017-07-01

    For much of the last century, forecasting centers around the world have offered seasonal streamflow predictions to support water management. Recent work suggests that the two major avenues to advance seasonal predictability are improvements in the estimation of initial hydrologic conditions (IHCs) and the incorporation of climate information. This study investigates the marginal benefits of a variety of methods using IHCs and/or climate information, focusing on seasonal water supply forecasts (WSFs) in five case study watersheds located in the US Pacific Northwest region. We specify two benchmark methods that mimic standard operational approaches - statistical regression against IHCs and model-based ensemble streamflow prediction (ESP) - and then systematically intercompare WSFs across a range of lead times. Additional methods include (i) statistical techniques using climate information either from standard indices or from climate reanalysis variables and (ii) several hybrid/hierarchical approaches harnessing both land surface and climate predictability. In basins where atmospheric teleconnection signals are strong, and when watershed predictability is low, climate information alone provides considerable improvements. For those basins showing weak teleconnections, custom predictors from reanalysis fields were more effective in forecast skill than standard climate indices. ESP predictions tended to have high correlation skill but greater bias compared to other methods, and climate predictors failed to substantially improve these deficiencies within a trace weighting framework. Lower complexity techniques were competitive with more complex methods, and the hierarchical expert regression approach introduced here (hierarchical ensemble streamflow prediction - HESP) provided a robust alternative for skillful and reliable water supply forecasts at all initialization times. Three key findings from this effort are (1) objective approaches supporting methodologically consistent hindcasts open the door to a broad range of beneficial forecasting strategies; (2) the use of climate predictors can add to the seasonal forecast skill available from IHCs; and (3) sample size limitations must be handled rigorously to avoid over-trained forecast solutions. Overall, the results suggest that despite a rich, long heritage of operational use, there remain a number of compelling opportunities to improve the skill and value of seasonal streamflow predictions.

  6. Effective techniques in healthy eating and physical activity interventions: a meta-regression.

    PubMed

    Michie, Susan; Abraham, Charles; Whittington, Craig; McAteer, John; Gupta, Sunjai

    2009-11-01

    Meta-analyses of behavior change (BC) interventions typically find large heterogeneity in effectiveness and small effects. This study aimed to assess the effectiveness of active BC interventions designed to promote physical activity and healthy eating and investigate whether theoretically specified BC techniques improve outcome. Interventions, evaluated in experimental or quasi-experimental studies, using behavioral and/or cognitive techniques to increase physical activity and healthy eating in adults, were systematically reviewed. Intervention content was reliably classified into 26 BC techniques and the effects of individual techniques, and of a theoretically derived combination of self-regulation techniques, were assessed using meta-regression. Outcomes were valid measures of physical activity and healthy eating. The 122 evaluations (N = 44,747) produced an overall pooled effect size of 0.31 (95% confidence interval = 0.26 to 0.36, I(2) = 69%). The technique, "self-monitoring," explained the greatest amount of among-study heterogeneity (13%). Interventions that combined self-monitoring with at least one other technique derived from control theory were significantly more effective than the other interventions (0.42 vs. 0.26). Classifying interventions according to component techniques and theoretically derived technique combinations and conducting meta-regression enabled identification of effective components of interventions designed to increase physical activity and healthy eating. PsycINFO Database Record (c) 2009 APA, all rights reserved.
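
    Meta-regression of this kind treats each study's effect size as the response and coded intervention features as predictors, weighting studies by the precision of their estimates. Below is a minimal fixed-effect sketch with entirely hypothetical effect sizes and a single hypothetical moderator (self-monitoring); it is not the authors' model.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n_studies = 122
    # Hypothetical coded moderator: 1 if the intervention used self-monitoring.
    self_monitoring = rng.integers(0, 2, size=n_studies)
    # Hypothetical effect sizes and their sampling variances.
    effect = 0.26 + 0.16 * self_monitoring + rng.normal(0, 0.15, size=n_studies)
    variance = rng.uniform(0.01, 0.05, size=n_studies)

    # Fixed-effect meta-regression: weight each study by inverse variance.
    X = sm.add_constant(self_monitoring)
    fit = sm.WLS(effect, X, weights=1.0 / variance).fit()
    print(fit.params)   # intercept ~ effect without the technique, slope ~ added benefit
    print(fit.bse)      # standard errors of the meta-regression coefficients
    ```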

  7. The repeatability of mean defect with size III and size V standard automated perimetry.

    PubMed

    Wall, Michael; Doyle, Carrie K; Zamba, K D; Artes, Paul; Johnson, Chris A

    2013-02-15

    The mean defect (MD) of the visual field is a global statistical index used to monitor overall visual field change over time. Our goal was to investigate the relationship of MD and its variability for two clinically used strategies (Swedish Interactive Threshold Algorithm [SITA] standard size III and full threshold size V) in glaucoma patients and controls. We tested one eye, at random, for 46 glaucoma patients and 28 ocularly healthy subjects with Humphrey program 24-2 SITA standard for size III and full threshold for size V each five times over a 5-week period. The standard deviation of MD was regressed against the MD for the five repeated tests, and quantile regression was used to show the relationship of variability and MD. A Wilcoxon test was used to compare the standard deviations of the two testing methods following quantile regression. Both types of regression analysis showed increasing variability with increasing visual field damage. Quantile regression showed modestly smaller MD confidence limits. There was a 15% decrease in SD with size V in glaucoma patients (P = 0.10) and a 12% decrease in ocularly healthy subjects (P = 0.08). The repeatability of size V MD appears to be slightly better than size III SITA testing. When using MD to determine visual field progression, a change of 1.5 to 4 decibels (dB) is needed to be outside the normal 95% confidence limits, depending on the size of the stimulus and the amount of visual field damage.
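
    Quantile regression as used here models how the spread of repeated MD measurements changes with the level of damage, rather than the mean alone. A minimal sketch with simulated MD and repeat-test SD values follows (statsmodels QuantReg; the clinical data are not reproduced).

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    # Hypothetical data: mean defect (MD, dB) and the SD of five repeated tests,
    # with variability increasing as fields become more damaged (more negative MD).
    md = rng.uniform(-25, 2, size=74)
    sd = 0.8 - 0.08 * md + rng.gamma(2.0, 0.2, size=74)

    X = sm.add_constant(md)
    ols_fit = sm.OLS(sd, X).fit()               # ordinary regression of SD on MD
    q90_fit = sm.QuantReg(sd, X).fit(q=0.9)     # 90th-percentile (quantile) regression

    print("OLS slope:      %.3f" % ols_fit.params[1])
    print("90th-pct slope: %.3f" % q90_fit.params[1])  # upper limit of variability vs damage
    ```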

  8. Evaluating a novel application of optical fibre evanescent field absorbance: rapid measurement of red colour in winegrape homogenates

    NASA Astrophysics Data System (ADS)

    Lye, Peter G.; Bradbury, Ronald; Lamb, David W.

    Silica optical fibres were used to measure colour (mg anthocyanin/g fresh berry weight) in samples of red wine grape homogenates via optical Fibre Evanescent Field Absorbance (FEFA). Colour measurements from 126 samples of grape homogenate were compared against the standard industry spectrophotometric reference method that involves chemical extraction and subsequent optical absorption measurements of clarified samples at 520 nm. FEFA absorbance on homogenates at 520 nm (FEFA520h) was correlated with the industry reference method measurements of colour (R2 = 0.46, n = 126). Using a simple regression equation, colour could be predicted with a standard error of cross-validation (SECV) of 0.21 mg/g, with a range of 0.6 to 2.2 mg anthocyanin/g and a standard deviation of 0.33 mg/g. With a Ratio of Performance Deviation (RPD) of 1.6, the technique, when utilizing only a single detection wavelength, is not robust enough to apply in a diagnostic sense; however, the results do demonstrate the potential of the FEFA method as a fast and low-cost assay of colour in homogenized samples.

  9. Quantitative Determination of Fluorine Content in Blends of Polylactide (PLA)–Talc Using Near Infrared Spectroscopy

    PubMed Central

    Tamburini, Elena; Tagliati, Chiara; Bonato, Tiziano; Costa, Stefania; Scapoli, Chiara; Pedrini, Paola

    2016-01-01

    Near-infrared spectroscopy (NIRS) has been widely used for quantitative and/or qualitative determination of a wide range of matrices. The objective of this study was to develop a NIRS method for the quantitative determination of fluorine content in polylactide (PLA)-talc blends. A blending profile was obtained by mixing different amounts of PLA granules and talc powder. The calibration model was built correlating wet chemical data (alkali digestion method) and NIR spectra. Using the FT (Fourier Transform)-NIR technique, a Partial Least Squares (PLS) regression model was set up, in a concentration interval of 0 ppm of pure PLA to 800 ppm of pure talc. Fluorine content prediction (R2cal = 0.9498; standard error of calibration, SEC = 34.77; standard error of cross-validation, SECV = 46.94) was then externally validated by means of a further 15 independent samples (R2EX.V = 0.8955; root mean standard error of prediction, RMSEP = 61.08). A positive relationship between an inorganic component such as fluorine and the NIR signal was evidenced and used to obtain quantitative analytical information from the spectra. PMID:27490548
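
    A PLS calibration of this type projects the high-dimensional spectra onto a few latent components before regression and is typically judged by cross-validation error. The sketch below uses synthetic spectra and a hypothetical fluorine concentration; the component count, preprocessing and sample sizes in the actual study differ.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(3)
    # Hypothetical NIR spectra (100 samples x 500 wavelengths) whose signal
    # depends linearly on a reference fluorine concentration (ppm).
    concentration = rng.uniform(0, 800, size=100)
    loadings = rng.normal(0, 1, size=500)
    spectra = np.outer(concentration, loadings) + rng.normal(0, 40, size=(100, 500))

    pls = PLSRegression(n_components=5)
    pred = cross_val_predict(pls, spectra, concentration, cv=10).ravel()

    secv = np.sqrt(np.mean((pred - concentration) ** 2))  # standard error of cross-validation
    print(f"SECV ~ {secv:.1f} ppm")
    ```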

  10. A potential gender bias in assessing quality of life - a standard gamble experiment among university students.

    PubMed

    Obaidi, Leath Al; Mahlich, Jörg

    2015-01-01

    There are several methodologies that can be used for evaluating patients' perception of their quality of life. Most commonly, utilities are directly elicited by means of either the time-trade-off or the standard-gamble method. In both methods, risk attitudes determine the quality of life values. Quality of life values among 31 Austrian undergraduate students were elicited by means of the standard gamble approach. The impact of several variables such as gender, side job, length of study, and living arrangements on the quality of life was identified using different types of regression techniques (ordinary least squares, generalized linear model, Betafit). Significant evidence was found that females are associated with a higher quality of life in all specifications of our estimations. The observed gender differences in quality of life can be attributed to a higher degree of risk aversion of women. A higher risk aversion leads to a higher valuation of given health states and a potential gender bias in health economic evaluations. This result could have implications for health policy planners when it comes to budget allocation decisions.

  11. The association between short interpregnancy interval and preterm birth in Louisiana: a comparison of methods.

    PubMed

    Howard, Elizabeth J; Harville, Emily; Kissinger, Patricia; Xiong, Xu

    2013-07-01

    There is growing interest in the application of propensity scores (PS) in epidemiologic studies, especially within the field of reproductive epidemiology. This retrospective cohort study assesses the impact of a short interpregnancy interval (IPI) on preterm birth and compares the results of the conventional logistic regression analysis with analyses utilizing a PS. The study included 96,378 singleton infants from Louisiana birth certificate data (1995-2007). Five regression models designed for methods comparison are presented. Ten percent (10.17 %) of all births were preterm; 26.83 % of births were from a short IPI. The PS-adjusted model produced a more conservative estimate of the exposure variable compared to the conventional logistic regression method (β-coefficient: 0.21 vs. 0.43), as well as a smaller standard error (0.024 vs. 0.028), odds ratio and 95 % confidence intervals [1.15 (1.09, 1.20) vs. 1.23 (1.17, 1.30)]. The inclusion of more covariate and interaction terms in the PS did not change the estimates of the exposure variable. This analysis indicates that PS-adjusted regression may be appropriate for validation of conventional methods in a large dataset with a fairly common outcome. PS's may be beneficial in producing more precise estimates, especially for models with many confounders and effect modifiers and where conventional adjustment with logistic regression is unsatisfactory. Short intervals between pregnancies are associated with preterm birth in this population, according to either technique. Birth spacing is an issue that women have some control over. Educational interventions, including birth control, should be applied during prenatal visits and following delivery.
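
    The propensity score workflow compared in this record can be summarized in two regressions: one logistic regression predicting the exposure from covariates to obtain the score, and one outcome model adjusted for that score. A simplified sketch on simulated data (not the Louisiana birth records) is given below.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 5000
    # Hypothetical confounders, exposure (short interpregnancy interval) and outcome (preterm birth).
    age = rng.normal(27, 6, size=n)
    parity = rng.integers(0, 4, size=n)
    logit_exposure = -1.2 - 0.03 * (age - 27) + 0.2 * parity
    short_ipi = rng.binomial(1, 1 / (1 + np.exp(-logit_exposure)))
    logit_outcome = -2.2 + 0.4 * short_ipi - 0.02 * (age - 27) + 0.15 * parity
    preterm = rng.binomial(1, 1 / (1 + np.exp(-logit_outcome)))

    # Step 1: propensity score = P(exposure | covariates) from logistic regression.
    ps_model = sm.Logit(short_ipi, sm.add_constant(np.column_stack([age, parity]))).fit(disp=0)
    ps = ps_model.predict()

    # Step 2: outcome model adjusted for the propensity score instead of all covariates.
    out_model = sm.Logit(preterm, sm.add_constant(np.column_stack([short_ipi, ps]))).fit(disp=0)
    print("PS-adjusted log-odds for short IPI: %.3f (SE %.3f)"
          % (out_model.params[1], out_model.bse[1]))
    ```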

  12. Above-bottom biomass retrieval of aquatic plants with regression models and SfM data acquired by a UAV platform - A case study in Wild Duck Lake Wetland, Beijing, China

    NASA Astrophysics Data System (ADS)

    Jing, Ran; Gong, Zhaoning; Zhao, Wenji; Pu, Ruiliang; Deng, Lei

    2017-12-01

    Above-bottom biomass (ABB) is considered an important parameter for measuring the growth status of aquatic plants, and is of great significance for assessing the health status of wetland ecosystems. In this study, the Structure from Motion (SfM) technique was used to rebuild the study area with highly overlapped images acquired by an unmanned aerial vehicle (UAV). We generated orthoimages and SfM dense point cloud data, from which vegetation indices (VIs) and SfM point cloud variables including average height (HAVG), standard deviation of height (HSD) and coefficient of variation of height (HCV) were extracted. These VIs and SfM point cloud variables could effectively characterize the growth status of aquatic plants, and thus they could be used to develop a simple linear regression model (SLR) and a stepwise linear regression model (SWL) with field measured ABB samples of aquatic plants. We also utilized a decision tree method to discriminate different types of aquatic plants. The experimental results indicated that (1) the SfM technique could effectively process highly overlapped UAV images and thus be suitable for the reconstruction of fine texture features of aquatic plant canopy structure; and (2) an SWL model using the point cloud variables HAVG, HSD, and HCV and the two VIs NGRDI and ExGR as independent variables produced the best prediction of ABB of aquatic plants in the study area, with a coefficient of determination of 0.84 and a relative root mean square error of 7.13%. In this analysis, a novel method for the quantitative inversion of a growth parameter (i.e., ABB) of aquatic plants in wetlands was demonstrated.

  13. Hierarchical Bayesian modelling of mobility metrics for hazard model input calibration

    NASA Astrophysics Data System (ADS)

    Calder, Eliza; Ogburn, Sarah; Spiller, Elaine; Rutarindwa, Regis; Berger, Jim

    2015-04-01

    In this work we present a method to constrain flow mobility input parameters for pyroclastic flow models using hierarchical Bayes modeling of standard mobility metrics such as H/L and flow volume. The advantage of hierarchical modeling is that it can leverage the information in a global dataset for a particular mobility metric in order to reduce the uncertainty in modeling of an individual volcano, which is especially important where individual volcanoes have only sparse datasets. We use compiled pyroclastic flow runout data from Colima, Merapi, Soufriere Hills, Unzen and Semeru volcanoes, presented in an open-source database FlowDat (https://vhub.org/groups/massflowdatabase). While the exact relationship between flow volume and friction varies somewhat between volcanoes, dome collapse flows originating from the same volcano exhibit similar mobility relationships. Instead of fitting separate regression models for each volcano dataset, we use a variation of the hierarchical linear model (Kass and Steffey, 1989). The model presents a hierarchical structure with two levels: all dome collapse flows, and dome collapse flows at specific volcanoes. The hierarchical model allows us to assume that the flows at specific volcanoes share a common distribution of regression slopes, then solves for that distribution. We present comparisons of the 95% confidence intervals on the individual regression lines for the data set from each volcano as well as those obtained from the hierarchical model. The results clearly demonstrate the advantage of considering global datasets using this technique. The technique developed is demonstrated here for mobility metrics, but can be applied to many other global datasets of volcanic parameters. In particular, such methods can provide a means to better constrain parameters for volcanoes for which we only have sparse data, a ubiquitous problem in volcanology.
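
    Full hierarchical Bayesian estimation is usually done with dedicated samplers; as a rough stand-in, the sketch below fits per-volcano regression slopes and then applies empirical-Bayes shrinkage toward a precision-weighted global slope, illustrating how sparse datasets borrow strength from the global pool. Volcano labels, sample sizes and coefficients are hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    # Hypothetical mobility data: log runout metric vs log flow volume for several volcanoes,
    # with per-volcano slopes drawn from a common distribution.
    volcanoes = {"A": 40, "B": 25, "C": 8, "D": 5}   # sample sizes (some sparse)
    true_slopes = {v: rng.normal(-0.15, 0.03) for v in volcanoes}

    slope_hat, slope_var = {}, {}
    for v, n in volcanoes.items():
        log_vol = rng.uniform(4, 8, size=n)
        log_hl = 0.8 + true_slopes[v] * log_vol + rng.normal(0, 0.05, size=n)
        A = np.column_stack([np.ones(n), log_vol])
        coef, res, *_ = np.linalg.lstsq(A, log_hl, rcond=None)
        sigma2 = res[0] / (n - 2)
        cov = sigma2 * np.linalg.inv(A.T @ A)
        slope_hat[v], slope_var[v] = coef[1], cov[1, 1]

    # Empirical-Bayes partial pooling: shrink each per-volcano slope toward the
    # precision-weighted global mean (a crude stand-in for full hierarchical Bayes).
    w = np.array([1 / slope_var[v] for v in volcanoes])
    mu = np.sum(w * np.array([slope_hat[v] for v in volcanoes])) / w.sum()
    tau2 = max(np.var(list(slope_hat.values())) - np.mean(list(slope_var.values())), 1e-6)
    for v in volcanoes:
        shrink = tau2 / (tau2 + slope_var[v])
        pooled = shrink * slope_hat[v] + (1 - shrink) * mu
        print(f"volcano {v}: per-site slope {slope_hat[v]:+.3f} -> pooled {pooled:+.3f}")
    ```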

  14. A tandem regression-outlier analysis of a ligand cellular system for key structural modifications around ligand binding.

    PubMed

    Lin, Ying-Ting

    2013-04-30

    A tandem technique built on hardware is often used for the chemical analysis of a single cell, first to isolate and then to detect the identities of interest. The first part is the separation of the wanted chemicals from the bulk of a cell; the second part is the actual detection of the important identities. To identify the key structural modifications around ligand binding, the present study aims to develop a counterpart tandem technique for cheminformatics. A statistical regression and its outliers act as a computational technique for separation. A PPARγ (peroxisome proliferator-activated receptor gamma) agonist cellular system was subjected to such an investigation. Results show that this tandem regression-outlier analysis, or the prioritization of the context equations tagged with features of the outliers, is an effective cheminformatics regression technique for detecting key structural modifications, as well as their tendency to impact ligand binding. The key structural modifications around ligand binding are effectively extracted or characterized out of cellular reactions. This is because molecular binding is the paramount factor in such a ligand cellular system and key structural modifications around ligand binding are expected to create outliers. Therefore, such outliers can be captured by this tandem regression-outlier analysis.
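
    The "separation" stage of such a regression-outlier analysis can be approximated by fitting an ordinary regression and flagging observations with large studentized residuals. The sketch below does this on synthetic ligand data in which a few points are deliberately shifted to mimic key structural modifications; it is not the author's prioritization scheme.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    # Hypothetical ligand series: activity predicted from a simple descriptor,
    # with a few analogues whose binding-site modification makes them outliers.
    x = rng.uniform(0, 10, size=60)
    y = 1.5 + 0.7 * x + rng.normal(0, 0.4, size=60)
    y[[5, 23, 41]] += rng.choice([-1, 1], size=3) * 4.0  # simulated "key modifications"

    # Ordinary least-squares fit, then flag outliers by internally studentized residuals.
    A = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    hat = np.einsum("ij,jk,ik->i", A, np.linalg.inv(A.T @ A), A)  # leverage h_ii
    s2 = resid @ resid / (len(x) - 2)
    studentized = resid / np.sqrt(s2 * (1 - hat))

    outliers = np.where(np.abs(studentized) > 3)[0]
    print("flagged outlier indices:", outliers)
    ```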

  15. A Comparison of Mean Phase Difference and Generalized Least Squares for Analyzing Single-Case Data

    ERIC Educational Resources Information Center

    Manolov, Rumen; Solanas, Antonio

    2013-01-01

    The present study focuses on single-case data analysis specifically on two procedures for quantifying differences between baseline and treatment measurements. The first technique tested is based on generalized least square regression analysis and is compared to a proposed non-regression technique, which allows obtaining similar information. The…

  16. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    ERIC Educational Resources Information Center

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  17. Two biased estimation techniques in linear regression: Application to aircraft

    NASA Technical Reports Server (NTRS)

    Klein, Vladislav

    1988-01-01

    Several ways for detection and assessment of collinearity in measured data are discussed. Because data collinearity usually results in poor least squares estimates, two estimation techniques which can limit a damaging effect of collinearity are presented. These two techniques, the principal components regression and mixed estimation, belong to a class of biased estimation techniques. Detection and assessment of data collinearity and the two biased estimation techniques are demonstrated in two examples using flight test data from longitudinal maneuvers of an experimental aircraft. The eigensystem analysis and parameter variance decomposition appeared to be promising tools for collinearity evaluation. The biased estimators had far better accuracy than the results from the ordinary least squares technique.
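
    Principal components regression, one of the two biased estimators discussed here, replaces the collinear regressors with a few leading principal components before fitting. A small sketch on synthetic collinear data follows (the flight-test variables of the study are not reproduced).

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(7)
    # Hypothetical collinear regressors (e.g., two nearly redundant flight variables).
    n = 150
    x1 = rng.normal(0, 1, size=n)
    x2 = x1 + rng.normal(0, 0.01, size=n)   # almost collinear with x1
    x3 = rng.normal(0, 1, size=n)
    X = np.column_stack([x1, x2, x3])
    y = 2.0 * x1 + 1.0 * x3 + rng.normal(0, 0.1, size=n)

    # Ordinary least squares: coefficients on x1/x2 are poorly determined.
    ols = LinearRegression().fit(X, y)
    print("OLS coefficients:", np.round(ols.coef_, 2))

    # Principal components regression: regress on the leading components only,
    # then map the coefficients back to the original variables.
    pca = PCA(n_components=2).fit(X)
    scores = pca.transform(X)
    pcr = LinearRegression().fit(scores, y)
    beta_original = pca.components_.T @ pcr.coef_
    print("PCR coefficients:", np.round(beta_original, 2))
    ```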

  18. Pancreatic Fat Is Associated With Metabolic Syndrome and Visceral Fat but Not Beta-Cell Function or Body Mass Index in Pediatric Obesity.

    PubMed

    Staaf, Johan; Labmayr, Viktor; Paulmichl, Katharina; Manell, Hannes; Cen, Jing; Ciba, Iris; Dahlbom, Marie; Roomp, Kirsten; Anderwald, Christian-Heinz; Meissnitzer, Matthias; Schneider, Reinhard; Forslund, Anders; Widhalm, Kurt; Bergquist, Jonas; Ahlström, Håkan; Bergsten, Peter; Weghuber, Daniel; Kullberg, Joel

    2017-03-01

    Adolescents with obesity have increased risk of type 2 diabetes and metabolic syndrome (MetS). Pancreatic fat has been related to these conditions; however, little is known about associations in pediatric obesity. The present study was designed to explore these associations further. We examined 116 subjects, 90 with obesity. Anthropometry, MetS, blood samples, and oral glucose tolerance tests were assessed using standard techniques. Pancreatic fat fraction (PFF) and other fat depots were quantified using magnetic resonance imaging. The PFF was elevated in subjects with obesity. No association between PFF and body mass index-standard deviation score (BMI-SDS) was found in the obesity subcohort. Pancreatic fat fraction correlated with Insulin Secretion Sensitivity Index-2 and Homeostatic Model Assessment of Insulin Resistance in simple regression; however, when using adjusted regression and correcting for BMI-SDS and other fat compartments, PFF correlated only with visceral adipose tissue and fasting glucose. The highest levels of PFF were found in subjects with obesity and MetS. In adolescents with obesity, PFF is elevated and associated with MetS, fasting glucose, and visceral adipose tissue but not with beta-cell function, glucose tolerance, or BMI-SDS. This study demonstrates that conclusions regarding PFF and its associations depend on the body mass features of the cohort.

  19. Eddy current technique for predicting burst pressure

    DOEpatents

    Petri, Mark C.; Kupperman, David S.; Morman, James A.; Reifman, Jaques; Wei, Thomas Y. C.

    2003-01-01

    A signal processing technique which correlates eddy current inspection data from a tube having a critical tubing defect with a range of predicted burst pressures for the tube is provided. The method can directly correlate the raw eddy current inspection data representing the critical tubing defect with the range of burst pressures using a regression technique, preferably an artificial neural network. Alternatively, the technique deconvolves the raw eddy current inspection data into a set of undistorted signals, each of which represents a separate defect of the tube. The undistorted defect signal which represents the critical tubing defect is related to a range of burst pressures utilizing a regression technique.

  20. Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States

    NASA Astrophysics Data System (ADS)

    Yang, J.; Astitha, M.; Schwartz, C. S.

    2017-12-01

    Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over the northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performance of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of the GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
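
    The core of a BLR post-processor is a regression of observations on raw forecasts whose coefficients carry a prior. A minimal conjugate-prior sketch on synthetic forecast-observation pairs is shown below; the prior variances and noise level are assumed values, and the gridding, ensemble handling, and optimization steps of GBLR are omitted.

    ```python
    import numpy as np

    rng = np.random.default_rng(8)
    # Hypothetical training pairs: raw model wind-speed forecasts vs station observations.
    raw_forecast = rng.uniform(2, 25, size=300)
    observed = 1.5 + 0.85 * raw_forecast + rng.normal(0, 1.2, size=300)

    # Bayesian linear regression with a zero-mean Gaussian prior on the coefficients
    # (conjugate closed form); sigma2 and tau2 are assumed, not estimated here.
    X = np.column_stack([np.ones_like(raw_forecast), raw_forecast])
    sigma2, tau2 = 1.5, 10.0                    # observation-noise and prior variances (assumed)
    precision = X.T @ X / sigma2 + np.eye(2) / tau2
    posterior_cov = np.linalg.inv(precision)
    posterior_mean = posterior_cov @ X.T @ observed / sigma2

    # Post-process a new raw forecast with the posterior-mean regression coefficients.
    new_raw = 18.0
    corrected = posterior_mean[0] + posterior_mean[1] * new_raw
    print(f"raw {new_raw:.1f} m/s -> corrected {corrected:.1f} m/s")
    ```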

  1. Evaluating uses of data mining techniques in propensity score estimation: a simulation study.

    PubMed

    Setoguchi, Soko; Schneeweiss, Sebastian; Brookhart, M Alan; Glynn, Robert J; Cook, E Francis

    2008-06-01

    In propensity score modeling, it is a standard practice to optimize the prediction of exposure status based on the covariate information. In a simulation study, we examined in what situations analyses based on various types of exposure propensity score (EPS) models using data mining techniques such as recursive partitioning (RP) and neural networks (NN) produce unbiased and/or efficient results. We simulated data for a hypothetical cohort study (n = 2000) with a binary exposure/outcome and 10 binary/continuous covariates with seven scenarios differing by non-linear and/or non-additive associations between exposure and covariates. EPS models used logistic regression (LR) (all possible main effects), RP1 (without pruning), RP2 (with pruning), and NN. We calculated c-statistics (C), standard errors (SE), and bias of exposure-effect estimates from outcome models for the PS-matched dataset. Data mining techniques yielded higher C than LR (mean: NN, 0.86; RP1, 0.79; RP2, 0.72; and LR, 0.76). SE tended to be greater in models with higher C. Overall bias was small for each strategy, although NN estimates tended to be the least biased. C was not correlated with the magnitude of bias (correlation coefficient [COR] = -0.3, p = 0.1) but was correlated with increased SE (COR = 0.7, p < 0.001). Effect estimates from EPS models by simple LR were generally robust. NN models generally provided the least numerically biased estimates. C was not associated with the magnitude of bias but was associated with increased SE.

  2. Neonatal Jaundice Detection System.

    PubMed

    Aydın, Mustafa; Hardalaç, Fırat; Ural, Berkan; Karap, Serhat

    2016-07-01

    Neonatal jaundice is a common condition that occurs in newborn infants in the first week of life. Today, detection techniques require blood samples and other clinical testing with special equipment. The aim of this study is to create a non-invasive system to monitor and detect jaundice periodically and to help doctors with early diagnosis. In this work, first, a patient group consisting of jaundiced babies and a control group consisting of healthy babies were prepared; then, between 24 and 48 h after birth, 40 jaundiced and 40 healthy newborns were chosen. Second, advanced image processing techniques were applied to images taken with a standard smartphone and a color calibration card. Segmentation, pixel similarity and white balancing were used as image processing techniques, and RGB values and other relevant pixel information were extracted. Third, during the feature extraction stage, colormap transformations and feature calculation were used to compare color change values in the RGB plane against a specially designed 8-color calibration card. Finally, in the bilirubin level estimation stage, kNN and SVR machine learning regressions were applied to the dataset obtained from feature extraction. At the end of the process, using the control group as the basis for comparison, jaundice was successfully detected for the 40 jaundiced infants with a success rate of 85 %. The obtained bilirubin estimates were consistent with bilirubin results obtained from the standard blood test, with an agreement rate of 85 %.

  3. Recent growth of conifer species of western North America: Assessing spatial patterns of radial growth trends

    USGS Publications Warehouse

    McKenzie, D.; Hessl, Amy E.; Peterson, D.L.

    2001-01-01

    We explored spatial patterns of low-frequency variability in radial tree growth among western North American conifer species and identified predictors of the variability in these patterns. Using 185 sites from the International Tree-Ring Data Bank, each of which contained 10–60 raw ring-width series, we rebuilt two chronologies for each site, using two conservative methods designed to retain any low-frequency variability associated with recent environmental change. We used factor analysis to identify regional low-frequency patterns in site chronologies and estimated the slope of the growth trend since 1850 at each site from a combination of linear regression and time-series techniques. This slope was the response variable in a regression-tree model to predict the effects of environmental gradients and species-level differences on growth trends. Growth patterns at 27 sites from the American Southwest were consistent with quasi-periodic patterns of drought. Either 12 or 32 of the 185 sites demonstrated patterns of increasing growth between 1850 and 1980 A.D., depending on the standardization technique used. Pronounced growth increases were associated with high-elevation sites (above 3000 m) and high-latitude sites in maritime climates. Future research focused on these high-elevation and high-latitude sites should address the precise mechanisms responsible for increased 20th century growth.

  4. Comparison of FTIR-ATR and Raman spectroscopy in determination of VLDL triglycerides in blood serum with PLS regression

    NASA Astrophysics Data System (ADS)

    Oleszko, Adam; Hartwich, Jadwiga; Wójtowicz, Anna; Gąsior-Głogowska, Marlena; Huras, Hubert; Komorowska, Małgorzata

    2017-08-01

    Hypertriglyceridemia, defined as plasma triglyceride (TG) above 1.7 mmol/L, is one of the cardiovascular risk factors. Very low density lipoproteins (VLDL) are the main TG carriers. Despite being time consuming and demanding well-qualified staff and expensive instrumentation, the ultracentrifugation technique still remains the gold standard for VLDL isolation. Therefore, a faster and simpler method of VLDL-TG determination is needed. Vibrational spectroscopy, including FT-IR and Raman, is a widely used technique in lipid and protein research. The aim of this study was to assess Raman and FT-IR spectroscopy for the determination of VLDL-TG directly in serum, with the isolation step omitted. TG concentrations in serum and in ultracentrifuged VLDL fractions from 32 patients were measured with a reference colorimetric method. FT-IR and Raman spectra of VLDL and serum samples were acquired. Partial least squares (PLS) regression was used for calibration and leave-one-out cross-validation. Our results confirmed the possibility of reagent-free determination of VLDL-TG directly in serum with both Raman and FT-IR spectroscopy. Quantitative VLDL testing by FT-IR and/or Raman spectroscopy applied directly to maternal serum seems to be a promising screening test for identifying women with increased risk of adverse pregnancy outcomes and a patient-friendly method of choice based on ease of performance, accuracy and efficiency.

  5. Sparse Regression as a Sparse Eigenvalue Problem

    NASA Technical Reports Server (NTRS)

    Moghaddam, Baback; Gruber, Amit; Weiss, Yair; Avidan, Shai

    2008-01-01

    We extend the l0-norm "subspectral" algorithms for sparse-LDA [5] and sparse-PCA [6] to general quadratic costs such as MSE in linear (kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem (e.g., binary sparse-LDA [7]). Specifically, for a general quadratic cost we use a highly-efficient technique for direct eigenvalue computation using partitioned matrix inverses which leads to dramatic x10^3 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) scaling behaviour that up to now has limited the previous algorithms' utility for high-dimensional learning problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix inverse techniques. Our Greedy Sparse Least Squares (GSLS) generalizes Natarajan's algorithm [9] also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward half of GSLS is exactly equivalent to ORMP but more efficient. By including the backward pass, which only doubles the computation, we can achieve lower MSE than ORMP. Experimental comparisons to the state-of-the-art LARS algorithm [3] show forward-GSLS is faster, more accurate and more flexible in terms of choice of regularization.
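
    The forward half of a greedy sparse least squares scheme can be sketched as plain forward selection: repeatedly add the column that most reduces the residual MSE. The toy implementation below (without the paper's partitioned-inverse speed-ups, backward pass, or branch-and-bound search) illustrates the idea.

    ```python
    import numpy as np

    def greedy_sparse_ls(X, y, k):
        """Forward greedy selection of k columns minimizing residual MSE
        (an ORMP-style sketch; no backward pass or branch-and-bound here)."""
        selected = []
        for _ in range(k):
            best_j, best_mse = None, np.inf
            for j in range(X.shape[1]):
                if j in selected:
                    continue
                cols = selected + [j]
                beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
                mse = np.mean((y - X[:, cols] @ beta) ** 2)
                if mse < best_mse:
                    best_j, best_mse = j, mse
            selected.append(best_j)
        beta, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        return selected, beta

    rng = np.random.default_rng(9)
    X = rng.normal(size=(100, 20))
    y = 3 * X[:, 2] - 2 * X[:, 7] + rng.normal(0, 0.1, size=100)
    cols, beta = greedy_sparse_ls(X, y, k=2)
    print("selected columns:", cols, "coefficients:", np.round(beta, 2))
    ```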

  6. Logistic regression applied to natural hazards: rare event logistic regression with replications

    NASA Astrophysics Data System (ADS)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was developed here for the analysis of landslide controlling factors, but the concept is widely applicable to statistical analyses of natural hazards.
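
    The replication idea can be sketched as repeatedly refitting a logistic regression on resampled event/non-event subsets and keeping only factors whose coefficients are stable across replications. The code below does this on simulated rare-event data; it illustrates the concept rather than the authors' exact procedure.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(10)
    n, p = 2000, 5
    X = rng.normal(size=(n, p))
    # Rare events: ~3% landslide occurrences driven by the first two factors.
    prob = 1 / (1 + np.exp(-(-4.0 + 1.2 * X[:, 0] + 0.8 * X[:, 1])))
    y = rng.binomial(1, prob)

    # Replications: repeatedly resample a balanced subset (all events plus an equal number
    # of non-events), refit the logistic regression, and track coefficient stability.
    events, non_events = np.where(y == 1)[0], np.where(y == 0)[0]
    coefs = []
    for _ in range(200):
        sample = np.concatenate([events, rng.choice(non_events, size=len(events), replace=False)])
        fit = LogisticRegression(max_iter=1000).fit(X[sample], y[sample])
        coefs.append(fit.coef_[0])
    coefs = np.array(coefs)

    # A factor is retained as "robust" if its coefficient keeps the same sign in most replications.
    sign_stability = np.mean(np.sign(coefs) == np.sign(np.median(coefs, axis=0)), axis=0)
    print("sign stability per factor:", np.round(sign_stability, 2))
    ```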

  7. Development of Multiple Regression Equations To Predict Fourth Graders' Achievement in Reading and Selected Content Areas.

    ERIC Educational Resources Information Center

    Hafner, Lawrence E.

    A study developed a multiple regression prediction equation for each of six selected achievement variables in a popular standardized test of achievement. Subjects, 42 fourth-grade pupils randomly selected across several classes in a large elementary school in a north Florida city, were administered several standardized tests to determine predictor…

  8. Analyzing Multilevel Data: An Empirical Comparison of Parameter Estimates of Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2011-01-01

    Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…

  9. Comparative study of some robust statistical methods: weighted, parametric, and nonparametric linear regression of HPLC convoluted peak responses using internal standard method in drug bioavailability studies.

    PubMed

    Korany, Mohamed A; Maher, Hadir M; Galal, Shereen M; Ragab, Marwa A A

    2013-05-01

    This manuscript discusses the application of and comparison between three statistical regression methods for handling data: parametric, nonparametric, and weighted regression (WR). These data were obtained from different chemometric methods applied to the high-performance liquid chromatography response data using the internal standard method. This was performed on the model drug Acyclovir, which was analyzed in human plasma with the use of ganciclovir as internal standard. An in vivo study was also performed. Derivative treatment of chromatographic response ratio data was followed by convolution of the resulting derivative curves using 8-point sin x_i polynomials (discrete Fourier functions). This work studies and compares the application of the WR method and Theil's method, a nonparametric regression (NPR) method, with the least squares parametric regression (LSPR) method, which is considered the de facto standard method used for regression. When the assumption of homoscedasticity is not met for analytical data, a simple and effective way to counteract the great influence of the high concentrations on the fitted regression line is to use the WR method. WR was found to be superior to the method of LSPR as the former assumes that the y-direction error in the calibration curve will increase as x increases. Theil's NPR method was also found to be superior to the method of LSPR as the former assumes that errors could occur in both x- and y-directions and that they might not be normally distributed. Most of the results showed a significant improvement in the precision and accuracy on applying the WR and NPR methods relative to LSPR.
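
    The three regression flavours compared in this record can be contrasted in a few lines: ordinary least squares, weighted least squares with weights inversely proportional to the error variance, and Theil's median-of-pairwise-slopes estimator. A sketch on synthetic heteroscedastic calibration data follows (the assumed variance model, proportional to x squared, is illustrative only).

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    # Hypothetical calibration data with heteroscedastic error (variance grows with x).
    x = np.linspace(1, 20, 40)
    y = 0.5 + 2.0 * x + rng.normal(0, 0.05 * x)   # noise proportional to concentration

    # Ordinary least squares (LSPR).
    slope_ols, intercept_ols = np.polyfit(x, y, 1)

    # Weighted regression (WR): np.polyfit expects w = 1/sigma, here sigma is proportional to x.
    slope_wr, intercept_wr = np.polyfit(x, y, 1, w=1.0 / x)

    # Theil's nonparametric regression (NPR): median of pairwise slopes.
    slope_np, intercept_np, _, _ = stats.theilslopes(y, x)

    print(f"OLS:   y = {intercept_ols:.3f} + {slope_ols:.3f} x")
    print(f"WR:    y = {intercept_wr:.3f} + {slope_wr:.3f} x")
    print(f"Theil: y = {intercept_np:.3f} + {slope_np:.3f} x")
    ```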

  10. Non-destructive analysis of sensory traits of dry-cured loins by MRI-computer vision techniques and data mining.

    PubMed

    Caballero, Daniel; Antequera, Teresa; Caro, Andrés; Ávila, María Del Mar; G Rodríguez, Pablo; Perez-Palacios, Trinidad

    2017-07-01

    Magnetic resonance imaging (MRI) combined with computer vision techniques has been proposed as an alternative or complementary technique to determine the quality parameters of food in a non-destructive way. The aim of this work was to analyze the sensory attributes of dry-cured loins using this technique. For that, different MRI acquisition sequences (spin echo, gradient echo and turbo 3D), algorithms for MRI analysis (GLCM, NGLDM, GLRLM and GLCM-NGLDM-GLRLM) and predictive data mining techniques (multiple linear regression and isotonic regression) were tested. The correlation coefficient (R) and mean absolute error (MAE) were used to validate the prediction results. The combination of spin echo, GLCM and isotonic regression produced the most accurate results. In addition, the MRI data from dry-cured loins seems to be more suitable than the data from fresh loins. The application of predictive data mining techniques on computational texture features from the MRI data of loins enables the determination of the sensory traits of dry-cured loins in a non-destructive way. © 2016 Society of Chemical Industry.

  11. Simple to complex modeling of breathing volume using a motion sensor.

    PubMed

    John, Dinesh; Staudenmayer, John; Freedson, Patty

    2013-06-01

    To compare simple and complex modeling techniques to estimate categories of low, medium, and high ventilation (VE) from ActiGraph™ activity counts. Vertical axis ActiGraph™ GT1M activity counts, oxygen consumption and VE were measured during treadmill walking and running, sports, household chores and labor-intensive employment activities. Categories of low (<19.3 l/min), medium (19.3 to 35.4 l/min) and high (>35.4 l/min) VEs were derived from activity intensity classifications (light <2.9 METs, moderate 3.0 to 5.9 METs and vigorous >6.0 METs). We examined the accuracy of two simple modeling techniques (multiple regression and activity count cut-point analyses) and one complex modeling technique (random forest) in predicting VE from activity counts. Prediction accuracy of the complex random forest technique was marginally better than that of the simple multiple regression method. Both techniques accurately predicted VE categories almost 80% of the time. The multiple regression and random forest techniques were more accurate (85 to 88%) in predicting medium VE. Both techniques predicted the high VE (70 to 73%) with greater accuracy than low VE (57 to 60%). Actigraph™ cut-points for light, medium and high VEs were <1381, 1381 to 3660 and >3660 cpm. There were minor differences in prediction accuracy between the multiple regression and the random forest technique. This study provides methods to objectively estimate VE categories using activity monitors that can easily be deployed in the field. Objective estimates of VE should provide a better understanding of the dose-response relationship between internal exposure to pollutants and disease. Copyright © 2013 Elsevier B.V. All rights reserved.

  12. Fetal MRI as a complementary technique after prenatal diagnosis of persistent vitelline artery in an otherwise normal fetus.

    PubMed

    Bravo, Coral; De León-Luis, Juan; Gámez, Francisco; Ruiz, Yolanda; Pintado, Pilar; Pérez, Ricardo; Ortiz-Quintana, Luis

    2013-10-01

    Prenatal ultrasound is the standard for the diagnosis of fetal anomalies. However, fetal MRI has emerged as a valuable diagnostic tool to complete the study of fetal malformations. Type II single umbilical artery results from the absence of both umbilical arteries and persistence of the vitelline artery. It has been described only in fetuses with sirenomelia or caudal regression syndrome. We report a favorable outcome in a normal fetus in which prenatal ultrasound and MRI showed a single umbilical artery arising from the aorta. The etiology of such a finding and its possible consequences are discussed. Copyright © 2013 Wiley Periodicals, Inc.

  13. Towards an Early Software Effort Estimation Based on Functional and Non-Functional Requirements

    NASA Astrophysics Data System (ADS)

    Kassab, Mohamed; Daneva, Maya; Ormandjieva, Olga

    The increased awareness of the non-functional requirements as a key to software project and product success makes explicit the need to include them in any software project effort estimation activity. However, the existing approaches to defining size-based effort relationships still pay insufficient attention to this need. This paper presents a flexible, yet systematic approach to the early requirements-based effort estimation, based on Non-Functional Requirements ontology. It complementarily uses one standard functional size measurement model and a linear regression technique. We report on a case study which illustrates the application of our solution approach in context and also helps evaluate our experiences in using it.

  14. Squeezing observational data for better causal inference: Methods and examples for prevention research.

    PubMed

    Garcia-Huidobro, Diego; Michael Oakes, J

    2017-04-01

    Randomised controlled trials (RCTs) are typically viewed as the gold standard for causal inference. This is because effects of interest can be identified with the fewest assumptions, especially regarding imbalance in background characteristics. Yet because conducting RCTs is expensive, time consuming and sometimes unethical, observational studies are frequently used to study causal associations. In these studies, imbalance, or confounding, is usually controlled with multiple regression, which entails strong assumptions. The purpose of this manuscript is to describe the strengths and weaknesses of several methods to control for confounding in observational studies, and to demonstrate their use in a cross-sectional dataset that uses patient registration data from the Juan Pablo II Primary Care Clinic in La Pintana, Chile. The dataset contains responses from 5855 families who provided complete information on family socio-demographics, family functioning and health problems among their family members. We employ regression adjustment, stratification, restriction, matching, propensity score matching, standardisation and inverse probability weighting to illustrate the approaches to better causal inference in non-experimental data and compare results. By applying study design and data analysis techniques that control for confounding in different ways than regression adjustment, researchers may strengthen the scientific relevance of observational studies. © 2016 International Union of Psychological Science.

  15. Refining cost-effectiveness analyses using the net benefit approach and econometric methods: an example from a trial of anti-depressant treatment.

    PubMed

    Sabes-Figuera, Ramon; McCrone, Paul; Kendricks, Antony

    2013-04-01

    Economic evaluation analyses can be enhanced by employing regression methods, allowing for the identification of important sub-groups, adjustment for imperfect randomisation in clinical trials, or analysis of non-randomised data. To explore the benefits of combining regression techniques and the standard Bayesian approach to refine cost-effectiveness analyses using data from randomised clinical trials. Data from a randomised trial of anti-depressant treatment were analysed and a regression model was used to explore the factors that have an impact on the net benefit (NB) statistic with the aim of using these findings to adjust the cost-effectiveness acceptability curves. Exploratory sub-sample analyses were carried out to explore possible differences in cost-effectiveness. The analysis found that having suffered a previous similar depression is strongly correlated with a lower NB, independent of the outcome measure or follow-up point. In patients with previous similar depression, adding a selective serotonin reuptake inhibitor (SSRI) to supportive care for mild-to-moderate depression is probably cost-effective at the level used by the English National Institute for Health and Clinical Excellence to make recommendations. This analysis highlights the need for incorporation of econometric methods into cost-effectiveness analyses using the NB approach.

  16. Simulation training and resident performance of singleton vaginal breech delivery.

    PubMed

    Deering, Shad; Brown, Jill; Hodor, Jonathon; Satin, Andrew J

    2006-01-01

    To determine whether simulation training improves resident competency in the management of a simulated vaginal breech delivery. Without advance notice or training, residents from 2 obstetrics and gynecology residency programs participated in a standardized simulation scenario of management of an imminent term vaginal breech delivery. The scenario used an obstetric birth simulator and human actors, with the encounters digitally recorded. Residents then received a training session with the simulator on the proper techniques for vaginal breech delivery. Two weeks later they were retested using a similar simulation scenario. A physician, blinded to training status, graded the residents' performance using a standardized evaluation sheet. Statistical analysis included the Wilcoxon signed rank test, McNemar chi2, regression analysis, and paired t test as appropriate with a P value of less than .05 considered significant. Twenty residents from 2 institutions completed all parts of the study protocol. Trained residents had significantly higher scores in 8 of 12 critical delivery components (P < .05). Overall performance of the delivery and safety in performing the delivery also improved significantly (P = .001 for both). Simulation training improved resident performance in the management of a simulated vaginal breech delivery. Performance of a term breech vaginal delivery is well suited for simulation training, because it is uncommon and inevitable, and improper technique may result in significant injury. II-2.

  17. Avoid lost discoveries, because of violations of standard assumptions, by using modern robust statistical methods.

    PubMed

    Wilcox, Rand; Carlson, Mike; Azen, Stan; Clark, Florence

    2013-03-01

    Recently, there have been major advances in statistical techniques for assessing central tendency and measures of association. The practical utility of modern methods has been documented extensively in the statistics literature, but they remain underused and relatively unknown in clinical trials. Our objective was to address this issue. STUDY DESIGN AND PURPOSE: The first purpose was to review common problems associated with standard methodologies (low power, lack of control over type I errors, and incorrect assessments of the strength of the association). The second purpose was to summarize some modern methods that can be used to circumvent such problems. The third purpose was to illustrate the practical utility of modern robust methods using data from the Well Elderly 2 randomized controlled trial. In multiple instances, robust methods uncovered differences among groups and associations among variables that were not detected by classic techniques. In particular, the results demonstrated that details of the nature and strength of the association were sometimes overlooked when using ordinary least squares regression and Pearson correlation. Modern robust methods can make a practical difference in detecting and describing differences between groups and associations between variables. Such procedures should be applied more frequently when analyzing trial-based data. Copyright © 2013 Elsevier Inc. All rights reserved.

  18. Detecting sea-level hazards: Simple regression-based methods for calculating the acceleration of sea level

    USGS Publications Warehouse

    Doran, Kara S.; Howd, Peter A.; Sallenger, Asbury H.

    2016-01-04

    Recent studies, and most of their predecessors, use tide gage data to quantify SL acceleration, ASL(t). In the current study, three techniques were used to calculate acceleration from tide gage data, and of those examined, it was determined that the two techniques based on sliding a regression window through the time series are more robust compared to the technique that fits a single quadratic form to the entire time series, particularly if there is temporal variation in the magnitude of the acceleration. The single-fit quadratic regression method has been the most commonly used technique in determining acceleration in tide gage data. The inability of the single-fit method to account for time-varying acceleration may explain some of the inconsistent findings between investigators. Properly quantifying ASL(t) from field measurements is of particular importance in evaluating numerical models of past, present, and future SLR resulting from anticipated climate change.
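
    The two window-based techniques and the single quadratic fit differ only in the span of data handed to the regression. The sketch below contrasts a single quadratic fit with a sliding-window quadratic fit on a synthetic sea-level series; the window length and sampling are arbitrary choices, not those of the study.

    ```python
    import numpy as np

    rng = np.random.default_rng(12)
    # Hypothetical monthly sea-level series (mm) with a quadratic (accelerating) trend.
    t = np.arange(0, 100, 1 / 12.0)                  # years
    sl = 3.0 * t + 0.02 * t**2 + rng.normal(0, 15, size=t.size)

    # Technique 1: single quadratic fit to the whole record; acceleration = 2 * c2.
    c2, c1, c0 = np.polyfit(t, sl, 2)
    print(f"single-fit acceleration: {2 * c2:.3f} mm/yr^2")

    # Technique 2: slide a fixed-length regression window through the series,
    # giving an acceleration estimate A(t) that can vary in time.
    window_years = 30
    half = int(window_years * 12 / 2)
    for center in range(half, t.size - half, 120):   # report every 10 years
        sel = slice(center - half, center + half)
        c2_w, _, _ = np.polyfit(t[sel], sl[sel], 2)
        print(f"t = {t[center]:5.1f} yr: windowed acceleration {2 * c2_w:.3f} mm/yr^2")
    ```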

  19. Liquid electrolyte informatics using an exhaustive search with linear regression.

    PubMed

    Sodeyama, Keitaro; Igarashi, Yasuhiko; Nakayama, Tomofumi; Tateyama, Yoshitaka; Okada, Masato

    2018-06-14

    Exploring new liquid electrolyte materials is a fundamental target for developing new high-performance lithium-ion batteries. In contrast to solid materials, disordered liquid solution properties have been less studied by data-driven information techniques. Here, we examined the estimation accuracy and efficiency of three information techniques, multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), and exhaustive search with linear regression (ES-LiR), by using coordination energy and melting point as test liquid properties. We then confirmed that ES-LiR gives the most accurate estimation among the techniques. We also found that ES-LiR can provide the relationship between the "prediction accuracy" and "calculation cost" of the properties via a weight diagram of descriptors. This technique makes it possible to choose the balance of "accuracy" and "cost" when searching a huge number of new materials.
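
    ES-LiR, as described, fits a linear regression for every subset of candidate descriptors and maps out the accuracy-cost trade-off. A brute-force sketch of the exhaustive-search part on a small hypothetical descriptor set is shown below (the weight-diagram construction is omitted).

    ```python
    import numpy as np
    from itertools import combinations
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(13)
    # Hypothetical dataset: a liquid property (e.g., melting point) and 8 candidate descriptors,
    # only two of which actually matter.
    n, p = 60, 8
    X = rng.normal(size=(n, p))
    y = 1.0 + 2.5 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(0, 0.3, size=n)

    # Exhaustive search: evaluate a linear regression for every descriptor subset
    # and score it by cross-validated error (feasible here because 2^8 is small).
    results = []
    for k in range(1, p + 1):
        for subset in combinations(range(p), k):
            mse = -cross_val_score(LinearRegression(), X[:, subset], y,
                                   scoring="neg_mean_squared_error", cv=5).mean()
            results.append((mse, subset))

    results.sort()
    for mse, subset in results[:3]:
        print(f"descriptors {subset}: CV MSE = {mse:.3f}")
    ```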

  20. Forest type mapping of the Interior West

    Treesearch

    Bonnie Ruefenacht; Gretchen G. Moisen; Jock A. Blackard

    2004-01-01

    This paper develops techniques for the mapping of forest types in Arizona, New Mexico, and Wyoming. The methods involve regression-tree modeling using a variety of remote sensing and GIS layers along with Forest Inventory Analysis (FIA) point data. Regression-tree modeling is a fast and efficient technique of estimating variables for large data sets with high accuracy...

  1. Volatility forecasting for low-volatility portfolio selection in the US and the Korean equity markets

    NASA Astrophysics Data System (ADS)

    Kim, Saejoon

    2018-01-01

    We consider the problem of low-volatility portfolio selection, which has been the subject of extensive research in the field of portfolio selection. To improve the currently existing techniques that rely purely on past information to select low-volatility portfolios, this paper investigates the use of time series regression techniques that make forecasts of future volatility to select the portfolios. In particular, for the first time, the utility of support vector regression and its enhancements as portfolio selection techniques is demonstrated. It is shown that our regression-based portfolio selection provides attractive outperformance compared to the benchmark index and to a portfolio defined by a well-known strategy on the datasets of the S&P 500 and the KOSPI 200.
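
    A volatility forecast of this kind regresses next-period volatility on recent lagged values with support vector regression. The sketch below uses a synthetic persistent volatility series and compares the SVR forecast with a naive last-value benchmark; the kernel and hyperparameters are illustrative, not those of the paper.

    ```python
    import numpy as np
    from sklearn.svm import SVR

    rng = np.random.default_rng(14)
    # Hypothetical monthly realized-volatility series with some persistence.
    n = 240
    vol = np.empty(n)
    vol[0] = 0.15
    for t in range(1, n):
        vol[t] = 0.02 + 0.85 * vol[t - 1] + rng.normal(0, 0.01)

    # Features: the previous three months of volatility; target: next month's volatility.
    lags = 3
    X = np.column_stack([vol[i:n - lags + i] for i in range(lags)])
    y = vol[lags:]

    train, test = slice(0, 200), slice(200, None)
    svr = SVR(kernel="rbf", C=10.0, epsilon=0.005).fit(X[train], y[train])
    forecast = svr.predict(X[test])

    naive = X[test][:, -1]          # "use last month's volatility" benchmark
    print("SVR   MAE:", np.round(np.mean(np.abs(forecast - y[test])), 4))
    print("Naive MAE:", np.round(np.mean(np.abs(naive - y[test])), 4))
    ```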

  2. Predicting birth weight with conditionally linear transformation models.

    PubMed

    Möst, Lisa; Schmid, Matthias; Faschingbauer, Florian; Hothorn, Torsten

    2016-12-01

    Low and high birth weight (BW) are important risk factors for neonatal morbidity and mortality. Gynecologists must therefore accurately predict BW before delivery. Most prediction formulas for BW are based on prenatal ultrasound measurements carried out within one week prior to birth. Although successfully used in clinical practice, these formulas focus on point predictions of BW but do not systematically quantify uncertainty of the predictions, i.e. they result in estimates of the conditional mean of BW but do not deliver prediction intervals. To overcome this problem, we introduce conditionally linear transformation models (CLTMs) to predict BW. Instead of focusing only on the conditional mean, CLTMs model the whole conditional distribution function of BW given prenatal ultrasound parameters. Consequently, the CLTM approach delivers both point predictions of BW and fetus-specific prediction intervals. Prediction intervals constitute an easy-to-interpret measure of prediction accuracy and allow identification of fetuses subject to high prediction uncertainty. Using a data set of 8712 deliveries at the Perinatal Centre at the University Clinic Erlangen (Germany), we analyzed variants of CLTMs and compared them to standard linear regression estimation techniques used in the past and to quantile regression approaches. The best-performing CLTM variant was competitive with quantile regression and linear regression approaches in terms of conditional coverage and average length of the prediction intervals. We propose that CLTMs be used because they are able to account for possible heteroscedasticity, kurtosis, and skewness of the distribution of BWs. © The Author(s) 2014.

  3. Comparison of Nine Statistical Model Based Warfarin Pharmacogenetic Dosing Algorithms Using the Racially Diverse International Warfarin Pharmacogenetic Consortium Cohort Database

    PubMed Central

    Liu, Rong; Li, Xi; Zhang, Wei; Zhou, Hong-Hao

    2015-01-01

    Objective Multiple linear regression (MLR) and machine learning techniques in pharmacogenetic algorithm-based warfarin dosing have been reported. However, the performances of these algorithms in racially diverse groups have never been objectively evaluated and compared. In this literature-based study, we compared the performances of eight machine learning techniques with those of MLR in a large, racially diverse cohort. Methods MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied in warfarin dose algorithms in a cohort from the International Warfarin Pharmacogenetics Consortium database. Covariates obtained by stepwise regression from 80% of randomly selected patients were used to develop algorithms. To compare the performances of these algorithms, the mean percentage of patients whose predicted dose fell within 20% of the actual dose (mean percentage within 20%) and the mean absolute error (MAE) were calculated in the remaining 20% of patients. The performances of these techniques in different races, as well as the dose ranges of therapeutic warfarin, were compared. Robust results were obtained after 100 rounds of resampling. Results BART, MARS and SVR were statistically indistinguishable and significantly outperformed all the other approaches in the whole cohort (MAE: 8.84–8.96 mg/week, mean percentage within 20%: 45.88%–46.35%). In the White population, MARS and BART showed higher mean percentage within 20% and lower mean MAE than those of MLR (all p values < 0.05). In the Asian population, SVR, BART, MARS and LAR performed the same as MLR. MLR and LAR performed optimally among the Black population. When patients were grouped in terms of warfarin dose range, all machine learning techniques except ANN and LAR showed significantly higher mean percentage within 20%, and lower MAE (all p values < 0.05), than MLR in the low- and high-dose ranges. Conclusion Overall, the machine learning-based techniques BART, MARS and SVR performed better than MLR in warfarin pharmacogenetic dosing. Differences in the algorithms' performances exist among races. Moreover, machine learning-based algorithms tended to perform better in the low- and high-dose ranges than MLR. PMID:26305568

  4. Normalization Ridge Regression in Practice I: Comparisons Between Ordinary Least Squares, Ridge Regression and Normalization Ridge Regression.

    ERIC Educational Resources Information Center

    Bulcock, J. W.

    The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…

  5. On statistical inference in time series analysis of the evolution of road safety.

    PubMed

    Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora

    2013-11-01

    Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include annual (or monthly) number of road traffic accidents, traffic fatalities or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations to the applicability of standard methods of statistical inference, which leads to an under or overestimation of the standard error and consequently may produce erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating the accident-occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether they are linear, generalized linear or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. Copyright © 2012 Elsevier Ltd. All rights reserved.

  6. Digression and Value Concatenation to Enable Privacy-Preserving Regression.

    PubMed

    Li, Xiao-Bai; Sarkar, Sumit

    2014-09-01

    Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals' sensitive data. This problem, which we call a "regression attack," has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression, which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis.

  7. New analysis methods to push the boundaries of diagnostic techniques in the environmental sciences

    NASA Astrophysics Data System (ADS)

    Lungaroni, M.; Murari, A.; Peluso, E.; Gelfusa, M.; Malizia, A.; Vega, J.; Talebzadeh, S.; Gaudio, P.

    2016-04-01

    In recent years, new and more sophisticated measurements have been at the basis of major progress in various disciplines related to the environment, such as remote sensing and thermonuclear fusion. To maximize the effectiveness of the measurements, new data analysis techniques are required. Initial data processing tasks, such as filtering and fitting, are of primary importance, since they can have a strong influence on the rest of the analysis. Although Support Vector Regression (SVR) was devised and refined at the end of the 1990s, a systematic comparison with more traditional non-parametric regression methods has never been reported. In this paper, a series of systematic tests is described, which indicates that SVR is a very competitive method of non-parametric regression that can usefully complement and often outperform more consolidated approaches. The performance of Support Vector Regression as a method of filtering is investigated first, comparing it with the most popular alternative techniques. Then Support Vector Regression is applied to the problem of non-parametric regression to analyse Lidar surveys for the environmental measurement of particulate matter due to wildfires. The proposed approach has given very positive results and provides new perspectives on the interpretation of the data.
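    A hedged sketch of SVR used as a filter: a synthetic, Lidar-like profile with additive noise is smoothed with an RBF-kernel SVR and compared against a simple moving average. The signal shape and hyperparameters are assumptions chosen for illustration, not the authors' data or settings.

```python
# Hedged sketch: SVR as a non-parametric filter on a noisy synthetic profile.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)
r = np.linspace(0, 10, 400)                                       # range gate (arbitrary units)
signal = np.exp(-0.3 * r) * (1 + 0.5 * np.exp(-((r - 6) ** 2)))   # smooth profile with a "plume"
noisy = signal + rng.normal(scale=0.05, size=r.size)

svr = SVR(kernel="rbf", C=10.0, epsilon=0.02, gamma=2.0).fit(r[:, None], noisy)
filtered = svr.predict(r[:, None])

moving_avg = np.convolve(noisy, np.ones(15) / 15, mode="same")    # simple alternative filter
print("RMSE, SVR filter:     ", np.sqrt(np.mean((filtered - signal) ** 2)))
print("RMSE, moving average: ", np.sqrt(np.mean((moving_avg - signal) ** 2)))
```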

  8. Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

    ERIC Educational Resources Information Center

    Kim, Rae Seon

    2011-01-01

    When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

  9. Partial Least Squares Regression Can Aid in Detecting Differential Abundance of Multiple Features in Sets of Metagenomic Samples

    PubMed Central

    Libiger, Ondrej; Schork, Nicholas J.

    2015-01-01

    It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained on a small to moderate number of samples. PMID:26734061

  10. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
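    A minimal sketch of the two core steps the chapter covers, using made-up microbiology-style data (incubation temperature versus growth rate); the variable names and values are illustrative only.

```python
# Minimal sketch: Pearson correlation plus simple linear regression with inference.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
temperature = rng.uniform(20, 40, 50)
growth_rate = 0.1 + 0.02 * temperature + rng.normal(scale=0.05, size=50)

r, p_corr = stats.pearsonr(temperature, growth_rate)
fit = stats.linregress(temperature, growth_rate)
print(f"Pearson r = {r:.2f} (p = {p_corr:.3g})")
print(f"slope = {fit.slope:.4f} +/- {fit.stderr:.4f}, intercept = {fit.intercept:.3f}")
```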

  11. A Weighted Least Squares Approach To Robustify Least Squares Estimates.

    ERIC Educational Resources Information Center

    Lin, Chowhong; Davenport, Ernest C., Jr.

    This study developed a robust linear regression technique based on the idea of weighted least squares. In this technique, a subsample of the full data of interest is drawn, based on a measure of distance, and an initial set of regression coefficients is calculated. The rest of the data points are then taken into the subsample, one after another,…

  12. Building a new predictor for multiple linear regression technique-based corrective maintenance turnaround time.

    PubMed

    Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa

    2008-01-01

    This research's main goals were to build a predictor for a turnaround time (TAT) indicator for estimating its values and to use a numerical clustering technique for finding possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. Multiple linear regression (for building the TAT predictor) and clustering techniques were used to improve corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used for building a predictive TAT value model. The variables contributing to the model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.
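    Applying the reported coefficients gives a simple linear predictor of TAT. The intercept and the measurement units are not stated in the abstract, so the sketch below treats them as unknowns and the example inputs are hypothetical.

```python
# Hedged sketch: the reported coefficients assembled into a predictor; intercept and
# units are placeholders, not the published model.
def predict_tat(ce_rt, stock_rt, priority, service_time, intercept=0.0):
    """Turnaround-time prediction from the coefficients reported in the abstract."""
    return (intercept
            + 0.415 * ce_rt         # clinical engineering department response time
            + 0.734 * stock_rt      # stock service response time
            + 0.21 * priority       # priority level
            + 0.06 * service_time)  # service time

# Example call with hypothetical inputs (units assumed consistent with the fitted model):
print(predict_tat(ce_rt=2.0, stock_rt=5.0, priority=1, service_time=3.0))
```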

  13. A diagnostic analysis of the VVP single-doppler retrieval technique

    NASA Technical Reports Server (NTRS)

    Boccippio, Dennis J.

    1995-01-01

    A diagnostic analysis of the VVP (volume velocity processing) retrieval method is presented, with emphasis on understanding the technique as a linear, multivariate regression. Similarities and differences to the velocity-azimuth display and extended velocity-azimuth display retrieval techniques are discussed, using this framework. Conventional regression diagnostics are then employed to quantitatively determine situations in which the VVP technique is likely to fail. An algorithm for preparation and analysis of a robust VVP retrieval is developed and applied to synthetic and actual datasets with high temporal and spatial resolution. A fundamental (but quantifiable) limitation to some forms of VVP analysis is inadequate sampling dispersion in the n space of the multivariate regression, manifest as a collinearity between the basis functions of some fitted parameters. Such collinearity may be present either in the definition of these basis functions or in their realization in a given sampling configuration. This nonorthogonality may cause numerical instability, variance inflation (decrease in robustness), and increased sensitivity to bias from neglected wind components. It is shown that these effects prevent the application of VVP to small azimuthal sectors of data. The behavior of the VVP regression is further diagnosed over a wide range of sampling constraints, and reasonable sector limits are established.
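    The collinearity point can be illustrated with a toy design matrix: two angular basis functions that are orthogonal over a full scan become nearly collinear when sampled over a narrow azimuthal sector, inflating the condition number and the variance inflation factor. The basis functions below are simplified stand-ins, not the actual VVP basis.

```python
# Hedged sketch: collinearity of angular regressors over a full scan vs. a narrow sector.
import numpy as np

def collinearity(az_deg):
    az = np.radians(az_deg)
    X = np.column_stack([np.sin(az), np.cos(az)])     # simplified stand-ins for basis functions
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    cond = np.linalg.cond(Xc)
    r = np.corrcoef(Xc[:, 0], Xc[:, 1])[0, 1]
    vif = 1.0 / (1.0 - r ** 2)                        # variance inflation factor, 2 regressors
    return cond, vif

for label, sector in [("full scan", np.linspace(0, 360, 360)),
                      ("30-degree sector", np.linspace(0, 30, 360))]:
    cond, vif = collinearity(sector)
    print(f"{label}: condition number = {cond:.1f}, VIF = {vif:.1f}")
```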

  14. A self-trained classification technique for producing 30 m percent-water maps from Landsat data

    USGS Publications Warehouse

    Rover, Jennifer R.; Wylie, Bruce K.; Ji, Lei

    2010-01-01

    Small bodies of water can be mapped with moderate-resolution satellite data using methods where water is mapped as subpixel fractions using field measurements or high-resolution images as training datasets. A new method, developed from a regression-tree technique, uses a 30 m Landsat image for training the regression tree that, in turn, is applied to the same image to map subpixel water. The self-trained method was evaluated by comparing the percent-water map with three other maps generated from established percent-water mapping methods: (1) a regression-tree model trained with a 5 m SPOT 5 image, (2) a regression-tree model based on endmembers and (3) a linear unmixing classification technique. The results suggest that subpixel water fractions can be accurately estimated when high-resolution satellite data or intensively interpreted training datasets are not available, which increases our ability to map small water bodies or small changes in lake size at a regional scale.

  15. Validation of Fluorescence Spectroscopy to Detect Adulteration of Edible Oil in Extra Virgin Olive Oil (EVOO) by Applying Chemometrics.

    PubMed

    Ali, Hina; Saleem, Muhammad; Anser, Muhammad Ramzan; Khan, Saranjam; Ullah, Rahat; Bilal, Muhammad

    2018-01-01

    Due to the high price and nutritional value of extra virgin olive oil (EVOO), it is vulnerable to adulteration internationally. Refined oil or other vegetable oils are commonly blended with EVOO, and to unmask such fraud a quick and reliable technique needs to be standardized and developed. Therefore, in this study, pure EVOO was adulterated with an edible oil (sunflower oil) and analyzed using fluorescence spectroscopy (excitation wavelength at 350 nm) in conjunction with principal component analysis (PCA) and partial least squares (PLS) regression. Fluorescence spectra contain fingerprints of chlorophyll and carotenoids that are characteristic of EVOO and differentiate it from sunflower oil. A broad intense hump corresponding to conjugated hydroperoxides is seen in sunflower oil in the range of 441-489 nm with the maximum at 469 nm, whereas pure EVOO has low-intensity doublet peaks in this region at 441 nm and 469 nm. Visible changes in spectra are observed in adulterated EVOO with increasing concentration of sunflower oil, with an increase in the doublet peak and a corresponding decrease in chlorophyll peak intensity. Principal component analysis showed a distinct clustering of adulterated samples of different concentrations. Subsequently, the PLS regression model was best fitted over the complete data set on the basis of the coefficient of determination (R2), standard error of calibration (SEC), and standard error of prediction (SEP), with values of 0.99, 0.617, and 0.623, respectively. In addition to the adulterant, test samples and imported commercial brands of EVOO were also used for prediction and validation of the models. Fluorescence spectroscopy combined with chemometrics showed its robustness to identify and quantify the specified adulterant in pure EVOO.
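    A hedged sketch of the PCA-plus-PLS workflow on synthetic two-band "spectra" that mimic a growing adulterant peak and a shrinking chlorophyll-like peak; band positions, concentrations and noise levels are assumptions, and the reported R2/SEC/SEP values are not expected to be reproduced.

```python
# Hedged sketch: PCA for exploration, PLS regression for quantifying the adulterant fraction.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
wavelengths = np.linspace(400, 700, 300)

def band(center, width):                               # simple Gaussian emission band
    return np.exp(-((wavelengths - center) ** 2) / (2 * width ** 2))

frac = rng.uniform(0, 0.5, 80)                         # adulterant (sunflower oil) fraction
spectra = (np.outer(frac, band(469, 15))               # adulterant-related band grows
           + np.outer(1 - frac, band(670, 10))         # chlorophyll-like band shrinks
           + rng.normal(scale=0.01, size=(80, wavelengths.size)))

pca = PCA(n_components=3).fit(spectra)                 # exploratory clustering step
print("PCA explained variance ratios:", pca.explained_variance_ratio_.round(3))

X_tr, X_te, y_tr, y_te = train_test_split(spectra, frac, random_state=0)
pls = PLSRegression(n_components=3).fit(X_tr, y_tr)
sec = np.sqrt(np.mean((pls.predict(X_tr).ravel() - y_tr) ** 2))   # calibration error
sep = np.sqrt(np.mean((pls.predict(X_te).ravel() - y_te) ** 2))   # prediction error
print(f"test R2 = {pls.score(X_te, y_te):.3f}, SEC = {sec:.4f}, SEP = {sep:.4f}")
```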

  16. Prediction by regression and intrarange data scatter in surface-process studies

    USGS Publications Warehouse

    Toy, T.J.; Osterkamp, W.R.; Renard, K.G.

    1993-01-01

    Modeling is a major component of contemporary earth science, and regression analysis occupies a central position in the parameterization, calibration, and validation of geomorphic and hydrologic models. Although this methodology can be used in many ways, we are primarily concerned with the prediction of values for one variable from another variable. Examination of the literature reveals considerable inconsistency in the presentation of the results of regression analysis and the occurrence of patterns in the scatter of data points about the regression line. Both circumstances confound utilization and evaluation of the models. Statisticians are well aware of various problems associated with the use of regression analysis and offer improved practices; often, however, their guidelines are not followed. After a review of the aforementioned circumstances and until standard criteria for model evaluation become established, we recommend, as a minimum, inclusion of scatter diagrams, the standard error of the estimate, and sample size in reporting the results of regression analyses for most surface-process studies. © 1993 Springer-Verlag.

  17. Selection of vegetation indices for mapping the sugarcane condition around the oil and gas field of North West Java Basin, Indonesia

    NASA Astrophysics Data System (ADS)

    Muji Susantoro, Tri; Wikantika, Ketut; Saepuloh, Asep; Handoyo Harsolumakso, Agus

    2018-05-01

    Selection of vegetation indices in plant mapping is needed to provide the best information on plant conditions. The methods used in this research are standard deviation analysis and linear regression. This research sought to determine the vegetation indices best suited for mapping sugarcane conditions around oil and gas fields. The data used in this study are Landsat 8 OLI/TIRS imagery. The standard deviation analysis of 23 vegetation indices with 27 samples yielded the six indices with the highest standard deviations, namely GRVI, SR, NLI, SIPI, GEMI and LAI. The standard deviation values are 0.47, 0.43, 0.30, 0.17, 0.16 and 0.13. Regression correlation analysis of the 23 vegetation indices with 280 samples yielded six indices, namely NDVI, ENDVI, GDVI, VARI, LAI and SIPI; this selection was based on the regression correlation, with the lowest R2 value exceeding 0.8. The combined analysis of standard deviation and regression correlation produced five vegetation indices, namely NDVI, ENDVI, GDVI, LAI and SIPI. The results of both methods show that combining the two is needed to produce a good analysis of sugarcane condition. This was verified through field surveys, which showed good results for the prediction of microseepages.

  18. Methods for estimating flood frequency in Montana based on data through water year 1998

    USGS Publications Warehouse

    Parrett, Charles; Johnson, Dave R.

    2004-01-01

    Annual peak discharges having recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years (T-year floods) were determined for 660 gaged sites in Montana and in adjacent areas of Idaho, Wyoming, and Canada, based on data through water year 1998. The updated flood-frequency information was subsequently used in regression analyses, either ordinary or generalized least squares, to develop equations relating T-year floods to various basin and climatic characteristics, equations relating T-year floods to active-channel width, and equations relating T-year floods to bankfull width. The equations can be used to estimate flood frequency at ungaged sites. Montana was divided into eight regions, within which flood characteristics were considered to be reasonably homogeneous, and the three sets of regression equations were developed for each region. A measure of the overall reliability of the regression equations is the average standard error of prediction. The average standard errors of prediction for the equations based on basin and climatic characteristics ranged from 37.4 percent to 134.1 percent. Average standard errors of prediction for the equations based on active-channel width ranged from 57.2 percent to 141.3 percent. Average standard errors of prediction for the equations based on bankfull width ranged from 63.1 percent to 155.5 percent. In most regions, the equations based on basin and climatic characteristics generally had smaller average standard errors of prediction than equations based on active-channel or bankfull width. An exception was the Southeast Plains Region, where all equations based on active-channel width had smaller average standard errors of prediction than equations based on basin and climatic characteristics or bankfull width. Methods for weighting estimates derived from the basin- and climatic-characteristic equations and the channel-width equations also were developed. The weights were based on the cross correlation of residuals from the different methods and the average standard errors of prediction. When all three methods were combined, the average standard errors of prediction ranged from 37.4 percent to 120.2 percent. Weighting of estimates reduced the standard errors of prediction for all T-year flood estimates in four regions, reduced the standard errors of prediction for some T-year flood estimates in two regions, and provided no reduction in average standard error of prediction in two regions. A computer program for solving the regression equations, weighting estimates, and determining reliability of individual estimates was developed and placed on the USGS Montana District World Wide Web page. A new regression method, termed Region of Influence regression, also was tested. Test results indicated that the Region of Influence method was not as reliable as the regional equations based on generalized least squares regression. Two additional methods for estimating flood frequency at ungaged sites located on the same streams as gaged sites also are described. The first method, based on a drainage-area-ratio adjustment, is intended for use on streams where the ungaged site of interest is located near a gaged site. The second method, based on interpolation between gaged sites, is intended for use on streams that have two or more streamflow-gaging stations.

  19. Linear regression analysis for comparing two measurers or methods of measurement: but which regression?

    PubMed

    Ludbrook, John

    2010-07-01

    1. There are two reasons for wanting to compare measurers or methods of measurement. One is to calibrate one method or measurer against another; the other is to detect bias. Fixed bias is present when one method gives higher (or lower) values across the whole range of measurement. Proportional bias is present when one method gives values that diverge progressively from those of the other. 2. Linear regression analysis is a popular method for comparing methods of measurement, but the familiar ordinary least squares (OLS) method is rarely acceptable. The OLS method requires that the x values are fixed by the design of the study, whereas it is usual that both y and x values are free to vary and are subject to error. In this case, special regression techniques must be used. 3. Clinical chemists favour techniques such as major axis regression ('Deming's method'), the Passing-Bablok method or the bivariate least median squares method. Other disciplines, such as allometry, astronomy, biology, econometrics, fisheries research, genetics, geology, physics and sports science, have their own preferences. 4. Many Monte Carlo simulations have been performed to try to decide which technique is best, but the results are almost uninterpretable. 5. I suggest that pharmacologists and physiologists should use ordinary least products regression analysis (geometric mean regression, reduced major axis regression): it is versatile, can be used for calibration or to detect bias and can be executed by hand-held calculator or by using the loss function in popular, general-purpose, statistical software.
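    A minimal sketch of the recommended ordinary least products (geometric mean, reduced major axis) regression: the slope is the sign of the correlation times the ratio of the standard deviations, and the line passes through the means. Simulated method-comparison data show how the OLS slope is attenuated when x carries measurement error.

```python
# Minimal sketch: ordinary least products regression vs. OLS for method comparison.
import numpy as np

def least_products(x, y):
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = np.mean(y) - slope * np.mean(x)
    return slope, intercept

rng = np.random.default_rng(6)
true = rng.uniform(10, 20, 60)                       # quantity being measured
method_x = true + rng.normal(scale=1.0, size=60)     # both methods measure with error
method_y = 1.1 * true + rng.normal(scale=1.0, size=60)

slope, intercept = least_products(method_x, method_y)
ols_slope = np.polyfit(method_x, method_y, 1)[0]
print(f"least-products slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"OLS slope = {ols_slope:.2f}  (attenuated because x carries error)")
```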

  20. Predictors of the number of under-five malnourished children in Bangladesh: application of the generalized poisson regression model

    PubMed Central

    2013-01-01

    Background Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. To our knowledge, most of the available studies that addressed the issue of malnutrition among under-five children considered categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study the malnutrition variable (i.e. outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative to other statistical methods and (ii) to find some predictors of this outcome variable. Methods The data are extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample based on a two-stage stratified sample of households. A total of 4,460 under-five children are analysed using various statistical techniques, namely the Chi-square test and the GPR model. Results The GPR model (as compared to the standard Poisson regression and negative binomial regression) is found to be justified for studying the above-mentioned outcome variable because of its under-dispersion (variance < mean) property. Our study also identifies several significant predictors of the outcome variable, namely mother's education, father's education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Conclusions The consistency of our findings with many other studies suggests that the GPR model is an ideal alternative to other statistical models for analysing the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh. PMID:23297699

  1. Automated Assessment of Child Vocalization Development Using LENA.

    PubMed

    Richards, Jeffrey A; Xu, Dongxin; Gilkerson, Jill; Yapanel, Umit; Gray, Sharmistha; Paul, Terrance

    2017-07-12

    To produce a novel, efficient measure of children's expressive vocal development on the basis of automatic vocalization assessment (AVA), child vocalizations were automatically identified and extracted from audio recordings using Language Environment Analysis (LENA) System technology. Assessment was based on full-day audio recordings collected in a child's unrestricted, natural language environment. AVA estimates were derived using automatic speech recognition modeling techniques to categorize and quantify the sounds in child vocalizations (e.g., protophones and phonemes). These were expressed as phone and biphone frequencies, reduced to principal components, and inputted to age-based multiple linear regression models to predict independently collected criterion-expressive language scores. From these models, we generated vocal development AVA estimates as age-standardized scores and development age estimates. AVA estimates demonstrated strong statistical reliability and validity when compared with standard criterion expressive language assessments. Automated analysis of child vocalizations extracted from full-day recordings in natural settings offers a novel and efficient means to assess children's expressive vocal development. More research remains to identify specific mechanisms of operation.

  2. The impact of social and organizational factors on workers' coping with musculoskeletal symptoms.

    PubMed

    Torp, S; Riise, T; Moen, B E

    2001-07-01

    Workers with musculoskeletal symptoms are often advised to cope with their symptoms by changing their working technique and by using lifting equipment. The main objective of this study was to test the hypothesis that negative social and organizational factors in the workplace may prevent workers from implementing these coping strategies. A total of 1,567 automobile garage workers (72%) returned a questionnaire concerning coping with musculoskeletal symptoms and social and organizational factors. When job demands, decision authority, social support, and management support related to health, environment, and safety (HES) were used as predictor variables in a multiple regression model, coping as the outcome variable was correlated with decision authority, social support, and HES-related management support (standardized beta = .079, .12, and .13, respectively). When an index for health-related support and control was added to the model, it correlated with coping (standardized beta = .36), whereas the other relationships disappeared. Decision authority and social support entail health-related support and control that, in turn, influences coping.

  3. Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.

    PubMed

    Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C

    2014-03-01

    In recent years the number of active controllable joints in electrically powered hand prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate, as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), the multilayer perceptron, and kernel ridge regression (KRR). They are investigated offline with electromyographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data, providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve performance similar to KRR at much lower computational cost. ME in particular, a physiologically inspired extension of linear regression, represents a promising candidate for the next generation of prosthetic devices.
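    A hedged sketch of the comparison, with synthetic multichannel features standing in for the EMG recordings and a two-column target standing in for the two DoF; the kernel and regularization settings are illustrative, not those of the study.

```python
# Hedged sketch: linear regression vs. RBF kernel ridge regression for a two-DoF output.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.normal(size=(500, 8))                               # e.g., 8 electrode channels
Y = np.column_stack([np.tanh(X[:, 0] + 0.5 * X[:, 1]),      # DoF 1 (nonlinear in features)
                     X[:, 2] - 0.3 * X[:, 3] ** 2])         # DoF 2
Y = Y + rng.normal(scale=0.05, size=Y.shape)

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)
for name, model in [("linear regression", LinearRegression()),
                    ("kernel ridge (RBF)", KernelRidge(kernel="rbf", alpha=0.1, gamma=0.1))]:
    r2 = model.fit(X_tr, Y_tr).score(X_te, Y_te)
    print(f"{name}: R2 = {r2:.2f}")
```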

  4. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  5. Newer classification and regression tree techniques: Bagging and Random Forests for ecological prediction

    Treesearch

    Anantha M. Prasad; Louis R. Iverson; Andy Liaw

    2006-01-01

    We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
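    A minimal sketch of the tree-based models on synthetic data (the vegetation/climate data are not available here), compared by cross-validated R2; hyperparameters are scikit-learn defaults apart from the ensemble size.

```python
# Minimal sketch: single regression tree vs. bagged trees vs. random forest.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
X = rng.uniform(-2, 2, size=(400, 5))                 # stand-ins for climate predictors
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.3, size=400)

models = {
    "single regression tree": DecisionTreeRegressor(random_state=0),
    "bagged trees": BaggingRegressor(DecisionTreeRegressor(), n_estimators=100, random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
for name, model in models.items():
    print(f"{name}: CV R2 = {cross_val_score(model, X, y, cv=5).mean():.2f}")
```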

  6. Techniques for estimating streamflow characteristics in the Eastern and Interior coal provinces of the United States

    USGS Publications Warehouse

    Wetzel, Kim L.; Bettandorff, J.M.

    1986-01-01

    Techniques are presented for estimating various streamflow characteristics, such as peak flows, mean monthly and annual flows, flow durations, and flow volumes, at ungaged sites on unregulated streams in the Eastern Coal region. Streamflow data and basin characteristics for 629 gaging stations were used to develop multiple-linear-regression equations. Separate equations were developed for the Eastern and Interior Coal Provinces. Drainage area is an independent variable common to all equations. Other variables needed, depending on the streamflow characteristic, are mean annual precipitation, mean basin elevation, main channel length, basin storage, main channel slope, and forest cover. A ratio of the observed 50- to 90-percent flow durations was used in the development of relations to estimate low-flow frequencies in the Eastern Coal Province. Relations to estimate low flows in the Interior Coal Province are not presented because the standard errors were greater than 0.7500 log units and were considered to be of poor reliability.

  7. Estimating Interaction Effects With Incomplete Predictor Variables

    PubMed Central

    Enders, Craig K.; Baraldi, Amanda N.; Cham, Heining

    2014-01-01

    The existing missing data literature does not provide a clear prescription for estimating interaction effects with missing data, particularly when the interaction involves a pair of continuous variables. In this article, we describe maximum likelihood and multiple imputation procedures for this common analysis problem. We outline 3 latent variable model specifications for interaction analyses with missing data. These models apply procedures from the latent variable interaction literature to analyses with a single indicator per construct (e.g., a regression analysis with scale scores). We also discuss multiple imputation for interaction effects, emphasizing an approach that applies standard imputation procedures to the product of 2 raw score predictors. We thoroughly describe the process of probing interaction effects with maximum likelihood and multiple imputation. For both missing data handling techniques, we outline centering and transformation strategies that researchers can implement in popular software packages, and we use a series of real data analyses to illustrate these methods. Finally, we use computer simulations to evaluate the performance of the proposed techniques. PMID:24707955

  8. Reduction of central line-associated bloodstream infections in a pediatric hematology/oncology population.

    PubMed

    Wilson, Matthew Z; Deeter, Deana; Rafferty, Colleen; Comito, Melanie M; Hollenbeak, Christopher S

    2014-01-01

    This study reports the results of an initiative to reduce central line-associated bloodstream infections (CLABSIs) among pediatric hematology/oncology patients, a population at increased risk for CLABSI. The study design was a pre-post comparison of a series of specific interventions over 40 months. Logistic regression was used to determine if the risk of developing CLABSI decreased in the postintervention period, after controlling for covariates. The overall CLABSI rate fell from 9 infections per 1000 line days at the beginning of the study to zero in a cohort of 291 patients encompassing 2107 admissions. Admissions during the intervention period had an 86% reduction in odds of developing a CLABSI, controlling for other factors. At the study team's institution, an initiative that standardized blood culturing techniques, lab draw times, line care techniques, and provided physician and nurse education was able to eliminate CLABSI among pediatric hematology/oncology patients. © 2013 by the American College of Medical Quality.

  9. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data

    PubMed Central

    Hepworth, Philip J.; Nefedov, Alexey V.; Muchnik, Ilya B.; Morgan, Kenton L.

    2012-01-01

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide. PMID:22319115

  10. Optimizing Hybrid Metrology: Rigorous Implementation of Bayesian and Combined Regression

    PubMed Central

    Henn, Mark-Alexander; Silver, Richard M.; Villarrubia, John S.; Zhang, Nien Fan; Zhou, Hui; Barnes, Bryan M.; Ming, Bin; Vladár, András E.

    2015-01-01

    Hybrid metrology, e.g., the combination of several measurement techniques to determine critical dimensions, is an increasingly important approach to meet the needs of the semiconductor industry. A proper use of hybrid metrology may yield not only more reliable estimates for the quantitative characterization of 3-D structures but also a more realistic estimation of the corresponding uncertainties. Recent developments at the National Institute of Standards and Technology (NIST) feature the combination of optical critical dimension (OCD) measurements and scanning electron microscope (SEM) results. The hybrid methodology offers the potential to make measurements of essential 3-D attributes that may not be otherwise feasible. However, combining techniques gives rise to essential challenges in error analysis and comparing results from different instrument models, especially the effect of systematic and highly correlated errors in the measurement on the χ2 function that is minimized. Both hypothetical examples and measurement data are used to illustrate solutions to these challenges. PMID:26681991
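    The error-analysis point can be made concrete with a tiny generalized least squares example: when errors are correlated across measurements from different instruments, the quantity minimized is r^T Σ^{-1} r with the full covariance matrix Σ, not a diagonal weighting. The numbers below are hypothetical.

```python
# Hedged sketch: chi-square minimization with a full (correlated) error covariance matrix.
import numpy as np

# Hypothetical combined measurements of two parameters (e.g., linewidth and height):
X = np.array([[1.0, 0.0],      # OCD-like observation sensitive to parameter 1
              [0.0, 1.0],      # SEM-like observation sensitive to parameter 2
              [1.0, 1.0]])     # a mixed observation
y = np.array([10.2, 4.9, 15.5])

Sigma = np.array([[0.04, 0.03, 0.00],     # correlated errors between the first two rows
                  [0.03, 0.04, 0.00],
                  [0.00, 0.00, 0.09]])

W = np.linalg.inv(Sigma)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)      # minimizes r^T Sigma^-1 r
cov_beta = np.linalg.inv(X.T @ W @ X)                 # parameter covariance
chi2 = (y - X @ beta) @ W @ (y - X @ beta)
print("estimates:", beta.round(3), " chi-square:", round(float(chi2), 3))
print("parameter uncertainties:", np.sqrt(np.diag(cov_beta)).round(3))
```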

  11. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    PubMed

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  12. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600 nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments on human skin of the hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
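    A hedged sketch of the central regression step: the absorbance at the six wavelengths is regressed on extinction-coefficient vectors for melanin, oxy- and deoxyhemoglobin, and the coefficients track the chromophore contributions. The extinction values and concentrations below are made up, and the published Monte Carlo conversion vectors are not reproduced.

```python
# Hedged sketch: multiple regression of an absorbance spectrum on extinction coefficients.
import numpy as np

wl = np.array([500, 520, 540, 560, 580, 600], dtype=float)    # the six wavelengths (nm)
# Illustrative (made-up) extinction-coefficient vectors at those wavelengths:
eps_melanin = np.array([1.00, 0.92, 0.85, 0.78, 0.72, 0.66])
eps_hbo2    = np.array([0.30, 0.45, 0.80, 0.55, 0.75, 0.20])
eps_hb      = np.array([0.40, 0.50, 0.70, 0.85, 0.60, 0.35])

true = np.array([0.8, 0.5, 0.3])                               # melanin, HbO2, Hb "concentrations"
absorbance = np.column_stack([eps_melanin, eps_hbo2, eps_hb]) @ true
absorbance = absorbance + np.random.default_rng(9).normal(scale=0.01, size=wl.size)

E = np.column_stack([eps_melanin, eps_hbo2, eps_hb, np.ones_like(wl)])   # intercept for baseline
coef, *_ = np.linalg.lstsq(E, absorbance, rcond=None)
melanin, hbo2, hb, _ = coef
print(f"melanin={melanin:.2f}, HbO2={hbo2:.2f}, Hb={hb:.2f}, oxygen saturation={hbo2/(hbo2+hb):.2f}")
```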

  13. Relationship between non-standard work arrangements and work-related accident absence in Belgium

    PubMed Central

    Alali, Hanan; Braeckman, Lutgart; Van Hecke, Tanja; De Clercq, Bart; Janssens, Heidi; Wahab, Magd Abdel

    2017-01-01

    Objectives: The main objective of this study is to examine the relationship between indicators of non-standard work arrangements, including precarious contract, long working hours, multiple jobs, and shift work, and work-related accident absence, using a representative Belgian sample and considering several socio-demographic and work characteristics. Methods: This study was based on data from the fifth European Working Conditions Survey (EWCS). For the analysis, the sample was restricted to 3343 respondents from Belgium, all of whom were employed workers. The associations between non-standard work arrangements and work-related accident absence were studied with multivariate logistic regression modeling techniques while adjusting for several confounders. Results: During the last 12 months, about 11.7% of workers were absent from work because of a work-related accident. A multivariate regression model showed an increased injury risk for those performing shift work (OR 1.546, 95% CI 1.074-2.224). The relationship between contract type and occupational injuries was not significant (OR 1.163, 95% CI 0.739-1.831). Furthermore, no statistically significant differences were observed for those working long hours (OR 1.217, 95% CI 0.638-2.321) or those holding multiple jobs (OR 1.361, 95% CI 0.827-2.240) in relation to work-related accident absence. Those who rated their health as bad, low-educated workers, workers from the construction sector, and those with biomechanical (BM) exposure were more frequent victims of work-related accident absence. No significant gender difference was observed. Conclusion: Indicators of non-standard work arrangements in this study, except shift work, were not significantly associated with work-related accident absence. To reduce the burden of occupational injuries, not only are risk reduction strategies and interventions needed, but policy efforts should also be undertaken to limit shift work. In general, preventive measures and more training on the job are needed to ensure the safety and well-being of all workers. PMID:28111414

  14. Relationship between non-standard work arrangements and work-related accident absence in Belgium.

    PubMed

    Alali, Hanan; Braeckman, Lutgart; Van Hecke, Tanja; De Clercq, Bart; Janssens, Heidi; Wahab, Magd Abdel

    2017-03-28

    The main objective of this study is to examine the relationship between indicators of non-standard work arrangements, including precarious contract, long working hours, multiple jobs, and shift work, and work-related accident absence, using a representative Belgian sample and considering several socio-demographic and work characteristics. This study was based on data from the fifth European Working Conditions Survey (EWCS). For the analysis, the sample was restricted to 3343 respondents from Belgium, all of whom were employed workers. The associations between non-standard work arrangements and work-related accident absence were studied with multivariate logistic regression modeling techniques while adjusting for several confounders. During the last 12 months, about 11.7% of workers were absent from work because of a work-related accident. A multivariate regression model showed an increased injury risk for those performing shift work (OR 1.546, 95% CI 1.074-2.224). The relationship between contract type and occupational injuries was not significant (OR 1.163, 95% CI 0.739-1.831). Furthermore, no statistically significant differences were observed for those working long hours (OR 1.217, 95% CI 0.638-2.321) or those holding multiple jobs (OR 1.361, 95% CI 0.827-2.240) in relation to work-related accident absence. Those who rated their health as bad, low-educated workers, workers from the construction sector, and those with biomechanical (BM) exposure were more frequent victims of work-related accident absence. No significant gender difference was observed. Indicators of non-standard work arrangements in this study, except shift work, were not significantly associated with work-related accident absence. To reduce the burden of occupational injuries, not only are risk reduction strategies and interventions needed, but policy efforts should also be undertaken to limit shift work. In general, preventive measures and more training on the job are needed to ensure the safety and well-being of all workers.

  15. A structured sparse regression method for estimating isoform expression level from multi-sample RNA-seq data.

    PubMed

    Zhang, L; Liu, X J

    2016-06-03

    With the rapid development of next-generation high-throughput sequencing technology, RNA-seq has become a standard and important technique for transcriptome analysis. For multi-sample RNA-seq data, the existing expression estimation methods usually deal with each single RNA-seq sample and ignore the fact that the read distributions are consistent across multiple samples. In the current study, we propose a structured sparse regression method, SSRSeq, to estimate isoform expression using multi-sample RNA-seq data. SSRSeq uses a non-parametric model to capture the general tendency of non-uniform read distribution for all genes across multiple samples. Additionally, our method adds a structured sparse regularization, which not only incorporates the sparse specificity between a gene and its corresponding isoform expression levels, but also reduces the effects of noisy reads, especially for lowly expressed genes and isoforms. Four real datasets were used to evaluate our method on isoform expression estimation. Compared with other popular methods, SSRSeq reduced the variance between multiple samples and produced more accurate isoform expression estimations, and thus more meaningful biological interpretations.

  16. Assessing the Impact of Drug Use on Hospital Costs

    PubMed Central

    Stuart, Bruce C; Doshi, Jalpa A; Terza, Joseph V

    2009-01-01

    Objective To assess whether outpatient prescription drug utilization produces offsets in the cost of hospitalization for Medicare beneficiaries. Data Sources/Study Setting The study analyzed a sample (N=3,101) of community-dwelling fee-for-service U.S. Medicare beneficiaries drawn from the 1999 and 2000 Medicare Current Beneficiary Surveys. Study Design Using a two-part model specification, we regressed any hospital admission (part 1: probit) and hospital spending by those with one or more admissions (part 2: nonlinear least squares regression) on drug use in a standard model with strong covariate controls and a residual inclusion instrumental variable (IV) model using an exogenous measure of drug coverage as the instrument. Principal Findings The covariate control model predicted that each additional prescription drug used (mean=30) raised hospital spending by $16 (p<.001). The residual inclusion IV model prediction was that each additional prescription fill reduced hospital spending by $104 (p<.001). Conclusions The findings indicate that drug use is associated with cost offsets in hospitalization among Medicare beneficiaries, once omitted variable bias is corrected using an IV technique appropriate for nonlinear applications. PMID:18783453
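    A hedged sketch of the residual-inclusion instrumental-variable idea (two-stage residual inclusion) on simulated data: drug use is first regressed on the exogenous coverage instrument, and the first-stage residual is then added to the outcome model to absorb the unobserved confounder. This is a linear simplification, not the paper's two-part probit/nonlinear specification.

```python
# Hedged sketch: naive OLS vs. two-stage residual inclusion (2SRI) with a coverage instrument.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 3000
coverage = rng.binomial(1, 0.5, n)                     # instrument: exogenous drug coverage
unobserved = rng.normal(size=n)                        # health frailty (unmeasured confounder)
drug_use = 20 + 10 * coverage + 5 * unobserved + rng.normal(scale=3, size=n)
hosp_cost = 5000 - 100 * drug_use + 2000 * unobserved + rng.normal(scale=500, size=n)

first = sm.OLS(drug_use, sm.add_constant(coverage)).fit()
resid = drug_use - first.fittedvalues

naive = sm.OLS(hosp_cost, sm.add_constant(drug_use)).fit()
X2 = sm.add_constant(np.column_stack([drug_use, resid]))
two_sri = sm.OLS(hosp_cost, X2).fit()
print("naive effect of one extra fill:", round(naive.params[1], 1))    # biased upward
print("2SRI  effect of one extra fill:", round(two_sri.params[1], 1))  # close to the true -100
```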

  17. Hybrid PSO-ASVR-based method for data fitting in the calibration of infrared radiometer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Sen; Li, Chengwei, E-mail: heikuanghit@163.com

    2016-06-15

    The present paper describes a hybrid particle swarm optimization-adaptive support vector regression (PSO-ASVR)-based method for data fitting in the calibration of an infrared radiometer. The proposed hybrid PSO-ASVR-based method is based on PSO in combination with Adaptive Processing and Support Vector Regression (SVR). The optimization technique involves setting parameters in the ASVR fitting procedure, which significantly improves the fitting accuracy. However, its use in the calibration of infrared radiometers has not yet been widely explored. Bearing this in mind, the PSO-ASVR-based method, which is based on statistical learning theory, is successfully used here to obtain the relationship between the radiation of a standard source and the response of an infrared radiometer. The main advantages of this method are the flexible adjustment mechanism in data processing and the optimization mechanism in the kernel parameter setting of SVR. Numerical examples and applications to the calibration of an infrared radiometer are performed to verify the performance of the PSO-ASVR-based method compared to conventional data fitting methods.

  18. Life stress and atherosclerosis: a pathway through unhealthy lifestyle.

    PubMed

    Mainous, Arch G; Everett, Charles J; Diaz, Vanessa A; Player, Marty S; Gebregziabher, Mulugeta; Smith, Daniel W

    2010-01-01

    To examine the relationship between a general measure of chronic life stress and atherosclerosis among middle-aged adults without clinical cardiovascular disease, via pathways through unhealthy lifestyle characteristics. We conducted an analysis of the Multi-Ethnic Study of Atherosclerosis (MESA). The MESA data collected in 2000 include 5,773 participants, aged 45-84. We used standard regression techniques to examine the relationship between life stress and atherosclerosis, as well as path analysis with hypothesized paths from stress to atherosclerosis through unhealthy lifestyle. Our outcome was sub-clinical atherosclerosis, measured as presence of coronary artery calcification (CAC). A logistic regression adjusted for potential confounding variables along with the unhealthy lifestyle characteristics of smoking, excessive alcohol use, high caloric intake, sedentary lifestyle, and obesity yielded no significant relationship between chronic life stress (OR 0.93, 95% CI 0.80-1.08) and CAC. However, significant indirect pathways between chronic life stress and CAC were found through smoking (p = .007), and through sedentary lifestyle (p = .03) and caloric intake (p = .002) via obesity. These results suggest that life stress is related to atherosclerosis once paths of unhealthy coping behaviors are considered.

  19. Using CART to Identify Thresholds and Hierarchies in the Determinants of Funding Decisions.

    PubMed

    Schilling, Chris; Mortimer, Duncan; Dalziel, Kim

    2017-02-01

    There is much interest in understanding decision-making processes that determine funding outcomes for health interventions. We use classification and regression trees (CART) to identify cost-effectiveness thresholds and hierarchies in the determinants of funding decisions. The hierarchical structure of CART is suited to analyzing complex conditional and nonlinear relationships. Our analysis uncovered hierarchies where interventions were grouped according to their type and objective. Cost-effectiveness thresholds varied markedly depending on which group the intervention belonged to: lifestyle-type interventions with a prevention objective had an incremental cost-effectiveness threshold of $2356, suggesting that such interventions need to be close to cost saving or dominant to be funded. For lifestyle-type interventions with a treatment objective, the threshold was much higher at $37,024. Lower down the tree, intervention attributes such as the level of patient contribution and the eligibility for government reimbursement influenced the likelihood of funding within groups of similar interventions. Comparison between our CART models and previously published results demonstrated concurrence with standard regression techniques while providing additional insights regarding the role of the funding environment and the structure of decision-maker preferences.

  20. Suppression Situations in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2006-01-01

    This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…

  1. Odds per Adjusted Standard Deviation: Comparing Strengths of Associations for Risk Factors Measured on Different Scales and Across Diseases and Populations

    PubMed Central

    Hopper, John L.

    2015-01-01

    How can the “strengths” of risk factors, in the sense of how well they discriminate cases from controls, be compared when they are measured on different scales such as continuous, binary, and integer? Given that risk estimates take into account other fitted and design-related factors (and that is how risk gradients are interpreted), so should the presentation of risk gradients. Therefore, for each risk factor X0, I propose using appropriate regression techniques to derive from appropriate population data the best fitting relationship between the mean of X0 and all the other covariates fitted in the model or adjusted for by design (X1, X2, … , Xn). The odds per adjusted standard deviation (OPERA) presents the risk association for X0 in terms of the change in risk per s = standard deviation of X0 adjusted for X1, X2, … , Xn, rather than the unadjusted standard deviation of X0 itself. If the increased risk is relative risk (RR)-fold over A adjusted standard deviations, then OPERA = exp[ln(RR)/A] = RR^(1/A). This unifying approach is illustrated by considering breast cancer and published risk estimates. OPERA estimates are by definition independent and can be used to compare the predictive strengths of risk factors across diseases and populations. PMID:26520360
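    A worked example of the conversion, reading the formula as OPERA = exp[ln(RR)/A], i.e. RR^(1/A): a 2-fold relative risk spread over 4 adjusted standard deviations corresponds to roughly 1.19 per adjusted standard deviation.

```python
# Worked example of the OPERA conversion (this reading of the formula is an assumption).
import math

def opera(rr, a):
    """Odds per adjusted standard deviation for an RR-fold risk over A adjusted SDs."""
    return math.exp(math.log(rr) / a)

print(opera(rr=2.0, a=4.0))   # 2-fold risk over 4 adjusted SDs -> about 1.19 per SD
```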

  2. Predicting tropical cyclone intensity using satellite measured equivalent blackbody temperatures of cloud tops. [regression analysis

    NASA Technical Reports Server (NTRS)

    Gentry, R. C.; Rodgers, E.; Steranka, J.; Shenk, W. E.

    1978-01-01

    A regression technique was developed to forecast 24-hour changes of the maximum winds for weak (maximum winds less than or equal to 65 kt) and strong (maximum winds greater than 65 kt) tropical cyclones by utilizing satellite-measured equivalent blackbody temperatures around the storm, alone and together with the changes in maximum winds during the preceding 24 hours and the current maximum winds. Independent testing of these regression equations shows that the mean errors made by the equations are lower than the errors in forecasts made by persistence techniques.

  3. Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach

    NASA Astrophysics Data System (ADS)

    Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew

    2017-05-01

    This paper develops the Clusterwise Linear Regression (CLR) technique for the prediction of monthly rainfall. CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem, and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia, using rainfall data with five input meteorological variables over the period 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with CLR based on the maximum likelihood framework via the expectation-maximization algorithm, multiple linear regression, artificial neural networks and support vector machines for regression. The results demonstrate that the proposed algorithm outperforms the other methods in most locations.
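    A hedged sketch of the clusterwise idea (an alternating assignment-and-refit scheme, not the paper's incremental algorithm): each observation is assigned to whichever local linear model fits it best, the models are refit, and the loop repeats until assignments stabilize. The data simulate two rainfall "regimes".

```python
# Hedged sketch: a simple alternating clusterwise linear regression.
import numpy as np

def clusterwise_regression(X, y, k=2, n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    Xb = np.column_stack([np.ones(len(y)), X])             # design matrix with intercept
    labels = rng.integers(k, size=len(y))                  # random initial assignment
    betas = []
    for _ in range(n_iter):
        betas = []
        for j in range(k):
            mask = labels == j
            if mask.sum() <= Xb.shape[1]:                  # guard against (near-)empty clusters
                mask = rng.random(len(y)) < 0.5
            betas.append(np.linalg.lstsq(Xb[mask], y[mask], rcond=None)[0])
        errors = np.column_stack([(y - Xb @ b) ** 2 for b in betas])
        new_labels = errors.argmin(axis=1)                 # reassign each point to its best model
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return betas, labels

rng = np.random.default_rng(11)
x = rng.uniform(0, 10, 300)
y = np.where(x > 5, 40 - 2 * x, 5 + 3 * x) + rng.normal(scale=1.0, size=300)
betas, labels = clusterwise_regression(x[:, None], y, k=2)
print([b.round(2) for b in betas])                         # roughly (5, 3) and (40, -2)
```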

  4. Optimization of Regression Models of Experimental Data Using Confirmation Points

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.

    2010-01-01

    A new search metric is discussed that may be used to better assess the predictive capability of different math term combinations during the optimization of a regression model of experimental data. The new search metric can be determined for each tested math term combination if the given experimental data set is split into two subsets. The first subset consists of data points that are only used to determine the coefficients of the regression model. The second subset consists of confirmation points that are exclusively used to test the regression model. The new search metric value is assigned after comparing two values that describe the quality of the fit of each subset. The first value is the standard deviation of the PRESS residuals of the data points. The second value is the standard deviation of the response residuals of the confirmation points. The greater of the two values is used as the new search metric value. This choice guarantees that both standard deviations are always less than or equal to the value that is used during the optimization. Experimental data from the calibration of a wind tunnel strain-gage balance are used to illustrate the application of the new search metric. The new search metric ultimately generates an optimized regression model that was already tested at regression-model-independent confirmation points before it is ever used to predict an unknown response from a set of regressors.
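    A minimal sketch of the metric for one candidate math-term combination on simulated data: the standard deviation of the PRESS (leave-one-out) residuals of the fit points is compared with the standard deviation of the response residuals at the confirmation points, and the larger value becomes the search metric.

```python
# Hedged sketch: the confirmation-point search metric for one candidate term combination.
import numpy as np

rng = np.random.default_rng(12)
x = rng.uniform(-1, 1, 60)
y = 1.0 + 2.0 * x + 0.5 * x ** 2 + rng.normal(scale=0.05, size=60)

fit_idx, conf_idx = np.arange(0, 45), np.arange(45, 60)     # fit points vs. confirmation points
X = np.column_stack([np.ones_like(x), x, x ** 2])           # one candidate term combination

Xf, yf = X[fit_idx], y[fit_idx]
beta, *_ = np.linalg.lstsq(Xf, yf, rcond=None)
H = Xf @ np.linalg.inv(Xf.T @ Xf) @ Xf.T                    # hat matrix of the fit subset
press_resid = (yf - Xf @ beta) / (1.0 - np.diag(H))         # leave-one-out (PRESS) residuals
conf_resid = y[conf_idx] - X[conf_idx] @ beta               # response residuals at confirmation points

metric = max(np.std(press_resid, ddof=1), np.std(conf_resid, ddof=1))
print("search metric for this term combination:", round(float(metric), 4))
```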

  5. A method for nonlinear exponential regression analysis

    NASA Technical Reports Server (NTRS)

    Junkin, B. G.

    1971-01-01

    A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
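    A hedged sketch of the described cycle for a single-exponential decay model: the model is linearized by a first-order Taylor expansion around the current estimates, a least-squares correction is computed, and the iteration repeats until the correction falls below a preset criterion. Starting values here are simply assumed; in the full technique they come from the linear curve-fitting step.

```python
# Hedged sketch: Gauss-Newton style iteration for y = a*exp(-b*t) on decay-type data.
import numpy as np

rng = np.random.default_rng(13)
t = np.linspace(0, 5, 40)
y = 3.0 * np.exp(-0.8 * t) + rng.normal(scale=0.02, size=t.size)   # simulated decay data

a, b = 2.0, 0.5                 # nominal initial estimates (assumed here, not from a linear fit)
for _ in range(50):
    model = a * np.exp(-b * t)
    J = np.column_stack([np.exp(-b * t),             # d(model)/da
                         -a * t * np.exp(-b * t)])   # d(model)/db
    delta, *_ = np.linalg.lstsq(J, y - model, rcond=None)
    a, b = a + delta[0], b + delta[1]
    if np.max(np.abs(delta)) < 1e-8:                 # predetermined convergence criterion
        break

print(f"a = {a:.3f}, b = {b:.3f}")                   # should recover roughly 3.0 and 0.8
```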

  6. Using Dual Regression to Investigate Network Shape and Amplitude in Functional Connectivity Analyses

    PubMed Central

    Nickerson, Lisa D.; Smith, Stephen M.; Öngür, Döst; Beckmann, Christian F.

    2017-01-01

    Independent Component Analysis (ICA) is one of the most popular techniques for the analysis of resting state FMRI data because it has several advantageous properties when compared with other techniques. Most notably, in contrast to a conventional seed-based correlation analysis, it is model-free and multivariate, thus switching the focus from evaluating the functional connectivity of single brain regions identified a priori to evaluating brain connectivity in terms of all brain resting state networks (RSNs) that simultaneously engage in oscillatory activity. Furthermore, typical seed-based analysis characterizes RSNs in terms of spatially distributed patterns of correlation (typically by means of simple Pearson's coefficients) and thereby confounds together amplitude information of oscillatory activity and noise. ICA and other regression techniques, on the other hand, retain magnitude information and therefore can be sensitive to both changes in the spatially distributed nature of correlations (differences in the spatial pattern or “shape”) as well as the amplitude of the network activity. Furthermore, motion can mimic amplitude effects so it is crucial to use a technique that retains such information to ensure that connectivity differences are accurately localized. In this work, we investigate the dual regression approach that is frequently applied with group ICA to assess group differences in resting state functional connectivity of brain networks. We show how ignoring amplitude effects and how excessive motion corrupts connectivity maps and results in spurious connectivity differences. We also show how to implement the dual regression to retain amplitude information and how to use dual regression outputs to identify potential motion effects. Two key findings are that using a technique that retains magnitude information, e.g., dual regression, and using strict motion criteria are crucial for controlling both network amplitude and motion-related amplitude effects, respectively, in resting state connectivity analyses. We illustrate these concepts using realistic simulated resting state FMRI data and in vivo data acquired in healthy subjects and patients with bipolar disorder and schizophrenia. PMID:28348512
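
    A bare-bones Python sketch of the two dual-regression stages, assuming the subject data are arranged as a (time x voxels) matrix and the group ICA maps as (components x voxels). Real pipelines add demeaning, confound regressors, and motion handling; the normalize_timecourses flag only illustrates the shape-versus-amplitude distinction discussed above.

    ```python
    import numpy as np

    def dual_regression(data, group_maps, normalize_timecourses=False):
        """data: (time, voxels); group_maps: (components, voxels).
        Stage 1: spatial regression -> subject time courses (time, components).
        Stage 2: temporal regression -> subject spatial maps (components, voxels)."""
        # Stage 1: regress the group maps against each time point of the data
        tc, *_ = np.linalg.lstsq(group_maps.T, data.T, rcond=None)   # (components, time)
        tc = tc.T
        if normalize_timecourses:
            # variance-normalizing discards amplitude information (shape-only analysis)
            tc = tc / tc.std(axis=0, ddof=1)
        # Stage 2: regress the time courses against each voxel's time series
        maps, *_ = np.linalg.lstsq(tc, data, rcond=None)             # (components, voxels)
        return tc, maps
    ```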

  7. General Nature of Multicollinearity in Multiple Regression Analysis.

    ERIC Educational Resources Information Center

    Liu, Richard

    1981-01-01

    Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)
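
    One common diagnostic for the multicollinearity problem described above is the variance inflation factor; a small Python sketch follows. The rule-of-thumb thresholds (VIF above roughly 5 to 10 indicating trouble) are conventions, not part of the cited article.

    ```python
    import numpy as np

    def variance_inflation_factors(X):
        """VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
        column j of X on the remaining columns (intercept included)."""
        X = np.asarray(X, dtype=float)
        n, p = X.shape
        vifs = []
        for j in range(p):
            others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
            beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
            resid = X[:, j] - others @ beta
            r2 = 1.0 - resid.var() / X[:, j].var()
            vifs.append(1.0 / (1.0 - r2))
        return np.array(vifs)
    ```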

  8. CATEGORICAL REGRESSION ANALYSIS OF ACUTE INHALATION TOXICITY DATA FOR HYDROGEN SULFIDE

    EPA Science Inventory

    Categorical regression is one of the tools offered by the U.S. EPA for derivation of acute reference exposures (AREs), which are dose-response assessments for acute exposures to inhaled chemicals. Categorical regression is used as a meta-analytical technique to calculate probabi...

  9. Cytogenetic status and oxidative DNA-damage induced by atorvastatin in human peripheral blood lymphocytes: standard and Fpg-modified comet assay.

    PubMed

    Gajski, Goran; Garaj-Vrhovac, Vera; Orescanin, Visnja

    2008-08-15

To investigate the genotoxic potential of atorvastatin on human lymphocytes in vitro, the standard comet assay was used to evaluate basal DNA damage, and the Fpg-modified version of the comet assay was also conducted to investigate possible oxidative DNA damage produced by reactive oxygen species (ROS). In addition to these techniques, the new criteria for scoring the micronucleus test were applied for more complete detection of baseline damage in binuclear lymphocytes exposed to atorvastatin 80 mg/day over different time periods, by measuring the frequency of micronuclei, nucleoplasmic bridges, and nuclear buds. All parameters obtained with the standard comet assay and the Fpg-modified comet assay were significantly higher in the treated than in the control lymphocytes. The Fpg-modified comet assay showed a significantly greater tail length, tail intensity, and tail moment in all treated lymphocytes than did the standard comet assay, which suggests that oxidative stress is likely to be responsible for the DNA damage. DNA damage detected by the standard comet assay indicates that some other mechanism is also involved. In addition to the comet assay results, the total numbers of micronuclei, nucleoplasmic bridges, and nuclear buds were significantly higher in the exposed than in the control lymphocytes. Regression analyses showed a positive correlation between the results obtained by the comet (Fpg-modified and standard) and micronucleus assays. Overall, the study demonstrated that atorvastatin at its highest dose is capable of producing damage at the level of the DNA molecule and the cell.

  10. Estimates of Median Flows for Streams on the 1999 Kansas Surface Water Register

    USGS Publications Warehouse

    Perry, Charles A.; Wolock, David M.; Artman, Joshua C.

    2004-01-01

The Kansas State Legislature, by enacting Kansas Statute KSA 82a-2001 et seq., mandated the criteria for determining which Kansas stream segments would be subject to classification by the State. One criterion for the selection as a classified stream segment is based on the statistic of median flow being equal to or greater than 1 cubic foot per second. As specified by KSA 82a-2001 et seq., median flows were determined from U.S. Geological Survey streamflow-gaging-station data by using the most-recent 10 years of gaged data (KSA) for each streamflow-gaging station. Median flows also were determined by using gaged data from the entire period of record (all-available hydrology, AAH). Least-squares multiple regression techniques were used, along with Tobit analyses, to develop equations for estimating median flows for uncontrolled stream segments. The drainage area of the gaging stations on uncontrolled stream segments used in the regression analyses ranged from 2.06 to 12,004 square miles. A logarithmic transformation of the data was needed to develop the best linear relation for computing median flows. In the regression analyses, the significant climatic and basin characteristics, in order of importance, were drainage area, mean annual precipitation, mean basin permeability, and mean basin slope. Tobit analyses of KSA data yielded a model standard error of prediction of 0.285 logarithmic units, and the best equations using Tobit analyses of AAH data had a model standard error of prediction of 0.250 logarithmic units. These regression equations and an interpolation procedure were used to compute median flows for the uncontrolled stream segments on the 1999 Kansas Surface Water Register. Measured median flows from gaging stations were incorporated into the regression-estimated median flows along the stream segments where available. The segments that were uncontrolled were interpolated using gaged data weighted according to the drainage area and the bias between the regression-estimated and gaged flow information. On controlled segments of Kansas streams, the median flow information was interpolated between gaging stations using only gaged data weighted by drainage area. Of the 2,232 total stream segments on the Kansas Surface Water Register, 34.5 percent of the segments had an estimated median streamflow of less than 1 cubic foot per second when the KSA analysis was used. When the AAH analysis was used, 36.2 percent of the segments had an estimated median streamflow of less than 1 cubic foot per second. This report supersedes U.S. Geological Survey Water-Resources Investigations Report 02-4292.

  11. Bias in logistic regression due to imperfect diagnostic test results and practical correction approaches.

    PubMed

    Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul

    2015-11-04

    Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
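
    The paper's correction models are Bayesian; as a rough frequentist analogue under stated assumptions, the Python sketch below maximizes a logistic-regression likelihood in which the observed outcome is an imperfect test with assumed known sensitivity and specificity. The function name, variable names, and default sensitivity/specificity values are hypothetical.

    ```python
    import numpy as np
    from scipy.optimize import minimize

    def fit_logistic_misclass(X, y, sensitivity=0.9, specificity=0.95):
        """Maximum-likelihood logistic regression when the observed outcome y
        is an imperfect test result with known sensitivity/specificity.
        P(test positive) = Se * p + (1 - Sp) * (1 - p), p = logistic(X @ beta)."""
        Xb = np.column_stack([np.ones(len(y)), X])

        def negloglik(beta):
            p = 1.0 / (1.0 + np.exp(-Xb @ beta))               # true-status probability
            q = sensitivity * p + (1 - specificity) * (1 - p)  # observed-positive probability
            q = np.clip(q, 1e-12, 1 - 1e-12)
            return -np.sum(y * np.log(q) + (1 - y) * np.log(1 - q))

        res = minimize(negloglik, x0=np.zeros(Xb.shape[1]), method="BFGS")
        return res.x
    ```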

  12. Robust estimation approach for blind denoising.

    PubMed

    Rabie, Tamer

    2005-11-01

    This work develops a new robust statistical framework for blind image denoising. Robust statistics addresses the problem of estimation when the idealized assumptions about a system are occasionally violated. The contaminating noise in an image is considered as a violation of the assumption of spatial coherence of the image intensities and is treated as an outlier random variable. A denoised image is estimated by fitting a spatially coherent stationary image model to the available noisy data using a robust estimator-based regression method within an optimal-size adaptive window. The robust formulation aims at eliminating the noise outliers while preserving the edge structures in the restored image. Several examples demonstrating the effectiveness of this robust denoising technique are reported and a comparison with other standard denoising filters is presented.

  13. New Concepts on Pathogenesis and Diagnosis of Liver Fibrosis; A Review Article

    PubMed Central

    Ebrahimi, Hedyeh; Naderian, Mohammadreza; Sohrabpour, Amir Ali

    2016-01-01

Liver fibrosis is a potentially reversible response to hepatic insults, triggered by different chronic diseases, most importantly viral hepatitis and alcoholic and nonalcoholic fatty liver disease. In the course of chronic liver disease, hepatic fibrogenesis may develop, which is attributed to various types of cells, molecules, and pathways. The activated hepatic stellate cell (HSC), the primary source of extracellular matrix (ECM), is fundamental in the pathophysiology of fibrogenesis and is thus the most attractive target for reversing liver fibrosis. Although liver biopsy has long been considered the gold standard for diagnosis and staging of hepatic fibrosis, assessing progression and regression by biopsy is hampered by its limitations. We provide recent views on noninvasive approaches including serum biomarkers and radiologic techniques. PMID:27698966

  14. Improving estimates of streamflow characteristics using LANDSAT-1 (ERTS-1) imagery. [Delmarva Peninsula

    NASA Technical Reports Server (NTRS)

    Hollyday, E. F. (Principal Investigator)

    1975-01-01

    The author has identified the following significant results. Streamflow characteristics in the Delmarva Peninsula derived from the records of daily discharge of 20 gaged basins are representative of the full range in flow conditions and include all of those commonly used for design or planning purposes. They include annual flood peaks with recurrence intervals of 2, 5, 10, 25, and 50 years, mean annual discharge, standard deviation of the mean annual discharge, mean monthly discharges, standard deviation of the mean monthly discharges, low-flow characteristics, flood volume characteristics, and the discharge equalled or exceeded 50 percent of the time. Streamflow and basin characteristics were related by a technique of multiple regression using a digital computer. A control group of equations was computed using basin characteristics derived from maps and climatological records. An experimental group of equations was computed using basin characteristics derived from LANDSAT imagery as well as from maps and climatological records. Based on a reduction in standard error of estimate equal to or greater than 10 percent, the equations for 12 stream flow characteristics were substantially improved by adding to the analyses basin characteristics derived from LANDSAT imagery.

  15. Regression: The Apple Does Not Fall Far From the Tree.

    PubMed

    Vetter, Thomas R; Schober, Patrick

    2018-05-15

    Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
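
    As a hedged illustration of three of the regression families mentioned in this tutorial, the Python snippet below fits linear, logistic, and Poisson models with statsmodels on simulated data; the data-generating values are arbitrary.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    X = sm.add_constant(x)

    y_lin = 2.0 + 1.5 * x + rng.normal(size=200)            # continuous outcome
    y_bin = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + x))))   # binary outcome
    y_cnt = rng.poisson(np.exp(0.2 + 0.7 * x))              # count outcome

    print(sm.OLS(y_lin, X).fit().params)                    # simple linear regression
    print(sm.Logit(y_bin, X).fit(disp=0).params)            # logistic regression
    print(sm.Poisson(y_cnt, X).fit(disp=0).params)          # Poisson regression
    ```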

  16. On the adequacy of identified Cole Cole models

    NASA Astrophysics Data System (ADS)

    Xiang, Jianping; Cheng, Daizhan; Schlindwein, F. S.; Jones, N. B.

    2003-06-01

    The Cole-Cole model has been widely used to interpret electrical geophysical data. Normally an iterative computer program is used to invert the frequency domain complex impedance data and simple error estimation is obtained from the squared difference of the measured (field) and calculated values over the full frequency range. Recently a new direct inversion algorithm was proposed for the 'optimal' estimation of the Cole-Cole parameters, which differs from existing inversion algorithms in that the estimated parameters are direct solutions of a set of equations without the need for an initial guess for initialisation. This paper first briefly investigates the advantages and disadvantages of the new algorithm compared to the standard Levenberg-Marquardt "ridge regression" algorithm. Then, and more importantly, we address the adequacy of the models resulting from both the "ridge regression" and the new algorithm, using two different statistical tests and we give objective statistical criteria for acceptance or rejection of the estimated models. The first is the standard χ2 technique. The second is a parameter-accuracy based test that uses a joint multi-normal distribution. Numerical results that illustrate the performance of both testing methods are given. The main goals of this paper are (i) to provide the source code for the new ''direct inversion'' algorithm in Matlab and (ii) to introduce and demonstrate two methods to determine the reliability of a set of data before data processing, i.e., to consider the adequacy of the resulting Cole-Cole model.
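
    A possible Python sketch of the conventional iterative approach discussed above: a Pelton-form Cole-Cole model is fitted by weighted nonlinear least squares and its adequacy is checked with a chi-square test on the weighted residuals. The parameterization, starting values, and bounds are assumptions; this is not the paper's direct-inversion Matlab code.

    ```python
    import numpy as np
    from scipy.optimize import least_squares
    from scipy.stats import chi2

    def cole_cole(omega, rho0, m, tau, c):
        """Pelton-form Cole-Cole complex resistivity model."""
        return rho0 * (1 - m * (1 - 1 / (1 + (1j * omega * tau) ** c)))

    def fit_cole_cole(omega, z_obs, sigma, x0=(100.0, 0.5, 0.01, 0.5)):
        def residuals(p):
            z = cole_cole(omega, *p)
            # stack real and imaginary residuals, weighted by the data uncertainty
            return np.concatenate([z.real - z_obs.real, z.imag - z_obs.imag]) / sigma

        fit = least_squares(residuals, x0, bounds=([0, 0, 1e-6, 0], [np.inf, 1, 1e3, 1]))
        # chi-square adequacy test on the weighted residual sum of squares
        chi2_stat = np.sum(fit.fun ** 2)
        dof = 2 * len(omega) - len(x0)
        p_value = chi2.sf(chi2_stat, dof)
        return fit.x, chi2_stat, p_value
    ```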

  17. Concurrent validation of a neurocognitive assessment protocol for clients with mental illness in job matching as shop sales in supported employment.

    PubMed

    Ng, S S W; Lak, D C C; Lee, S C K; Ng, P P K

    2015-03-01

    Occupational therapists play a major role in the assessment and referral of clients with severe mental illness for supported employment. Nonetheless, there is scarce literature about the content and predictive validity of the process. In addition, the criteria of successful job matching have not been analysed and job supervisors have relied on experience rather than objective standards in recruitment. This study aimed to explore the profile of successful clients working in 'shop sales' in a supportive environment using a neurocognitive assessment protocol, and to validate the protocol against 'internal standards' of the job supervisors. This was a concurrent validation study of criterion-related scales for a single job type. The subjective ratings from the supervisors were concurrently validated against the results of neurocognitive assessment of intellectual function and work-related cognitive behaviour. A regression model was established for clients who succeeded and failed in employment using supervisor's ratings and a cutoff value of 10.5 for the Performance Fitness Rating Scale (R(2) = 0.918, F[41] = 3.794, p = 0.003). Classification And Regression Tree was also plotted to identify the profile of cases, with an overall accuracy of 0.861 (relative error, 0.26). Use of both inference statistics and data mining techniques enables the decision tree of neurocognitive assessments to be more readily applied by therapists in vocational rehabilitation, and thus directly improve the efficiency and efficacy of the process.

  18. Techniques for virtual lung nodule insertion: volumetric and morphometric comparison of projection-based and image-based methods for quantitative CT

    NASA Astrophysics Data System (ADS)

    Robins, Marthony; Solomon, Justin; Sahbaee, Pooyan; Sedlmair, Martin; Choudhury, Kingshuk Roy; Pezeshk, Aria; Sahiner, Berkman; Samei, Ehsan

    2017-09-01

Virtual nodule insertion paves the way towards the development of standardized databases of hybrid CT images with known lesions. The purpose of this study was to assess three methods (an established and two newly developed techniques) for inserting virtual lung nodules into CT images. Assessment was done by comparing virtual nodule volume and shape to the CT-derived volume and shape of synthetic nodules. 24 synthetic nodules (three sizes, four morphologies, two repeats) were physically inserted into the lung cavity of an anthropomorphic chest phantom (KYOTO KAGAKU). The phantom was imaged with and without nodules on a commercial CT scanner (SOMATOM Definition Flash, Siemens) using a standard thoracic CT protocol at two dose levels (1.4 and 22 mGy CTDIvol). Raw projection data were saved and reconstructed with filtered back-projection and sinogram affirmed iterative reconstruction (SAFIRE, strength 5) at 0.6 mm slice thickness. Corresponding 3D idealized, virtual nodule models were co-registered with the CT images to determine each nodule’s location and orientation. Virtual nodules were voxelized, partial volume corrected, and inserted into nodule-free CT data (accounting for system imaging physics) using two methods: projection-based Technique A, and image-based Technique B. Also a third Technique C based on cropping a region of interest from the acquired image of the real nodule and blending it into the nodule-free image was tested. Nodule volumes were measured using a commercial segmentation tool (iNtuition, TeraRecon, Inc.) and deformation was assessed using the Hausdorff distance. Nodule volumes and deformations were compared between the idealized, CT-derived and virtual nodules using a linear mixed effects regression model which utilized the mean, standard deviation, and coefficient of variation (MeanRHD, STDRHD, and CVRHD) of the regional Hausdorff distance. Overall, there was a close concordance between the volumes of the CT-derived and virtual nodules. Percent differences between them were less than 3% for all insertion techniques and were not statistically significant in most cases. Correlation coefficient values were greater than 0.97. The deformation according to the Hausdorff distance was also similar between the CT-derived and virtual nodules with minimal statistical significance in the CVRHD for Techniques A, B, and C. This study shows that both projection-based and image-based nodule insertion techniques yield realistic nodule renderings with statistical similarity to the synthetic nodules with respect to nodule volume and deformation. These techniques could be used to create a database of hybrid CT images containing nodules of known size, location and morphology.

  19. Techniques for virtual lung nodule insertion: volumetric and morphometric comparison of projection-based and image-based methods for quantitative CT

    PubMed Central

    Robins, Marthony; Solomon, Justin; Sahbaee, Pooyan; Sedlmair, Martin; Choudhury, Kingshuk Roy; Pezeshk, Aria; Sahiner, Berkman; Samei, Ehsan

    2017-01-01

    Virtual nodule insertion paves the way towards the development of standardized databases of hybrid CT images with known lesions. The purpose of this study was to assess three methods (an established and two newly developed techniques) for inserting virtual lung nodules into CT images. Assessment was done by comparing virtual nodule volume and shape to the CT-derived volume and shape of synthetic nodules. 24 synthetic nodules (three sizes, four morphologies, two repeats) were physically inserted into the lung cavity of an anthropomorphic chest phantom (KYOTO KAGAKU). The phantom was imaged with and without nodules on a commercial CT scanner (SOMATOM Definition Flash, Siemens) using a standard thoracic CT protocol at two dose levels (1.4 and 22 mGy CTDIvol). Raw projection data were saved and reconstructed with filtered back-projection and sinogram affirmed iterative reconstruction (SAFIRE, strength 5) at 0.6 mm slice thickness. Corresponding 3D idealized, virtual nodule models were co-registered with the CT images to determine each nodule’s location and orientation. Virtual nodules were voxelized, partial volume corrected, and inserted into nodule-free CT data (accounting for system imaging physics) using two methods: projection-based Technique A, and image-based Technique B. Also a third Technique C based on cropping a region of interest from the acquired image of the real nodule and blending it into the nodule-free image was tested. Nodule volumes were measured using a commercial segmentation tool (iNtuition, TeraRecon, Inc.) and deformation was assessed using the Hausdorff distance. Nodule volumes and deformations were compared between the idealized, CT-derived and virtual nodules using a linear mixed effects regression model which utilized the mean, standard deviation, and coefficient of variation (MeanRHD, and STDRHD CVRHD) of the regional Hausdorff distance. Overall, there was a close concordance between the volumes of the CT-derived and virtual nodules. Percent differences between them were less than 3% for all insertion techniques and were not statistically significant in most cases. Correlation coefficient values were greater than 0.97. The deformation according to the Hausdorff distance was also similar between the CT-derived and virtual nodules with minimal statistical significance in the (CVRHD) for Techniques A, B, and C. This study shows that both projection-based and image-based nodule insertion techniques yield realistic nodule renderings with statistical similarity to the synthetic nodules with respect to nodule volume and deformation. These techniques could be used to create a database of hybrid CT images containing nodules of known size, location and morphology. PMID:28786399

  20. Age Regression in the Treatment of Anger in a Prison Setting.

    ERIC Educational Resources Information Center

    Eisel, Harry E.

    1988-01-01

    Incorporated hypnotherapy with age regression into cognitive therapeutic approach with prisoners having history of anger. Technique involved age regression to establish first significant event causing current anger, catharsis of feelings for original event, and reorientation of event while under hypnosis. Results indicated decrease in acting-out…

  1. Twist Model Development and Results from the Active Aeroelastic Wing F/A-18 Aircraft

    NASA Technical Reports Server (NTRS)

    Lizotte, Andrew M.; Allen, Michael J.

    2007-01-01

    Understanding the wing twist of the active aeroelastic wing (AAW) F/A-18 aircraft is a fundamental research objective for the program and offers numerous benefits. In order to clearly understand the wing flexibility characteristics, a model was created to predict real-time wing twist. A reliable twist model allows the prediction of twist for flight simulation, provides insight into aircraft performance uncertainties, and assists with computational fluid dynamic and aeroelastic issues. The left wing of the aircraft was heavily instrumented during the first phase of the active aeroelastic wing program allowing deflection data collection. Traditional data processing steps were taken to reduce flight data, and twist predictions were made using linear regression techniques. The model predictions determined a consistent linear relationship between the measured twist and aircraft parameters, such as surface positions and aircraft state variables. Error in the original model was reduced in some cases by using a dynamic pressure-based assumption. This technique produced excellent predictions for flight between the standard test points and accounted for nonlinearities in the data. This report discusses data processing techniques and twist prediction validation, and provides illustrative and quantitative results.

  2. 3D-liquid chromatography as a complex mixture characterization tool for knowledge-based downstream process development.

    PubMed

    Hanke, Alexander T; Tsintavi, Eleni; Ramirez Vazquez, Maria Del Pilar; van der Wielen, Luuk A M; Verhaert, Peter D E M; Eppink, Michel H M; van de Sandt, Emile J A X; Ottens, Marcel

    2016-09-01

Knowledge-based development of chromatographic separation processes requires efficient techniques to determine the physicochemical properties of the product and the impurities to be removed. These characterization techniques are usually divided into approaches that determine molecular properties, such as charge, hydrophobicity and size, or molecular interactions with auxiliary materials, commonly in the form of adsorption isotherms. In this study we demonstrate the application of a three-dimensional liquid chromatography approach to a clarified cell homogenate containing a therapeutic enzyme. Each separation dimension determines a molecular property relevant to the chromatographic behavior of each component. Matching of the peaks across the different separation dimensions and against a high-resolution reference chromatogram allows the determined parameters to be assigned to pseudo-components, making it possible to identify the most promising technique for the removal of each impurity. More detailed process design using mechanistic models requires isotherm parameters. For this purpose, the second dimension consists of multiple linear gradient separations on columns in a high-throughput screening compatible format, which allow regression of isotherm parameters with an average standard error of 8%. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1283-1291, 2016. © 2016 American Institute of Chemical Engineers.

  3. On representing the prognostic value of continuous gene expression biomarkers with the restricted mean survival curve.

    PubMed

    Eng, Kevin H; Schiller, Emily; Morrell, Kayla

    2015-11-03

    Researchers developing biomarkers for cancer prognosis from quantitative gene expression data are often faced with an odd methodological discrepancy: while Cox's proportional hazards model, the appropriate and popular technique, produces a continuous and relative risk score, it is hard to cast the estimate in clear clinical terms like median months of survival and percent of patients affected. To produce a familiar Kaplan-Meier plot, researchers commonly make the decision to dichotomize a continuous (often unimodal and symmetric) score. It is well known in the statistical literature that this procedure induces significant bias. We illustrate the liabilities of common techniques for categorizing a risk score and discuss alternative approaches. We promote the use of the restricted mean survival (RMS) and the corresponding RMS curve that may be thought of as an analog to the best fit line from simple linear regression. Continuous biomarker workflows should be modified to include the more rigorous statistical techniques and descriptive plots described in this article. All statistics discussed can be computed via standard functions in the Survival package of the R statistical programming language. Example R language code for the RMS curve is presented in the appendix.
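
    A minimal Python sketch of the restricted mean survival time promoted above: the area under the Kaplan-Meier curve up to a truncation time tau. It ignores tied event times and variance estimation, which a production implementation (for example, the R survival package the authors reference) would handle.

    ```python
    import numpy as np

    def restricted_mean_survival(times, events, tau):
        """Area under the Kaplan-Meier curve up to the truncation time tau.
        times: follow-up times; events: 1 = event observed, 0 = censored."""
        order = np.argsort(times)
        times, events = np.asarray(times)[order], np.asarray(events)[order]
        at_risk = len(times)
        surv, knots, s = 1.0, [0.0], [1.0]
        for t, d in zip(times, events):
            if t > tau:
                break
            if d:                                # event: Kaplan-Meier step down
                surv *= 1.0 - 1.0 / at_risk
                knots.append(t)
                s.append(surv)
            at_risk -= 1
        knots.append(tau)
        s.append(surv)
        # integrate the step function: interval width times survival on that interval
        return float(np.sum(np.diff(knots) * np.array(s)[:-1]))
    ```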

  4. Twist Model Development and Results From the Active Aeroelastic Wing F/A-18 Aircraft

    NASA Technical Reports Server (NTRS)

    Lizotte, Andrew; Allen, Michael J.

    2005-01-01

    Understanding the wing twist of the active aeroelastic wing F/A-18 aircraft is a fundamental research objective for the program and offers numerous benefits. In order to clearly understand the wing flexibility characteristics, a model was created to predict real-time wing twist. A reliable twist model allows the prediction of twist for flight simulation, provides insight into aircraft performance uncertainties, and assists with computational fluid dynamic and aeroelastic issues. The left wing of the aircraft was heavily instrumented during the first phase of the active aeroelastic wing program allowing deflection data collection. Traditional data processing steps were taken to reduce flight data, and twist predictions were made using linear regression techniques. The model predictions determined a consistent linear relationship between the measured twist and aircraft parameters, such as surface positions and aircraft state variables. Error in the original model was reduced in some cases by using a dynamic pressure-based assumption and by using neural networks. These techniques produced excellent predictions for flight between the standard test points and accounted for nonlinearities in the data. This report discusses data processing techniques and twist prediction validation, and provides illustrative and quantitative results.

  5. Differentiation of four Aspergillus species and one Zygosaccharomyces with two electronic tongues based on different measurement techniques.

    PubMed

    Söderström, C; Rudnitskaya, A; Legin, A; Krantz-Rülcker, C

    2005-09-29

Two electronic tongues based on different measurement techniques were applied to the discrimination of four molds and one yeast. The chosen microorganisms were different species of Aspergillus and the yeast species Zygosaccharomyces bailii, which are known as food contaminants. The electronic tongue developed at Linköping University was based on voltammetry. Four working electrodes made of noble metals were used in a standard three-electrode configuration in this case. The St. Petersburg electronic tongue consisted of 27 potentiometric chemical sensors with enhanced cross-sensitivity. Sensors with chalcogenide glass and plasticized PVC membranes were used. Two sets of samples were measured using both electronic tongues. Firstly, broths were measured in which either one of the molds or the yeast grew until the late logarithmic phase or the border of the stationary phase. Secondly, broths inoculated with either one of the molds or the yeast were measured at five different times during microorganism growth. Data were evaluated using principal component analysis (PCA), partial least squares regression (PLS) and linear discriminant analysis (LDA). It was found that both measurement techniques could differentiate between fungi species. Merged data from both electronic tongues improved differentiation of the samples in selected cases.

  6. Novel Method for Incorporating Model Uncertainties into Gravitational Wave Parameter Estimates

    NASA Astrophysics Data System (ADS)

    Moore, Christopher J.; Gair, Jonathan R.

    2014-12-01

Posterior distributions on parameters computed from experimental data using Bayesian techniques are only as accurate as the models used to construct them. In many applications, these models are incomplete, which both reduces the prospects of detection and leads to a systematic error in the parameter estimates. In the analysis of data from gravitational wave detectors, for example, accurate waveform templates can be computed using numerical methods, but the prohibitive cost of these simulations means this can only be done for a small handful of parameters. In this Letter, a novel method to fold model uncertainties into data analysis is proposed; the waveform uncertainty is analytically marginalized over using a prior distribution constructed by applying Gaussian process regression to interpolate the waveform difference from a small training set of accurate templates. The method is well motivated, easy to implement, and no more computationally expensive than standard techniques. The new method is shown to perform extremely well when applied to a toy problem. While we use the application to gravitational wave data analysis to motivate and illustrate the technique, it can be applied in any context where model uncertainties exist.
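
    A toy Python sketch of the interpolation step described above, using scikit-learn's Gaussian process regressor on a hypothetical one-dimensional parameter: the GP mean and standard deviation of the "waveform difference" at new parameter values would feed the prior that is marginalized over. The kernel choice, training values, and variable names are placeholders, not the Letter's implementation.

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    # Training set: parameter values where an accurate (expensive) waveform difference is known
    theta_train = np.linspace(0.0, 1.0, 8).reshape(-1, 1)     # hypothetical 1-D parameter
    delta_train = 0.1 * np.sin(6 * theta_train).ravel()       # stand-in "waveform difference"

    kernel = ConstantKernel(0.1) * RBF(length_scale=0.2)
    gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-8, normalize_y=True)
    gp.fit(theta_train, delta_train)

    # Interpolated mean and uncertainty of the model error at new parameter values;
    # these would define the prior on the waveform difference used in the analysis.
    theta_new = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
    mean, std = gp.predict(theta_new, return_std=True)
    ```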

  7. Novel 3D ultrasound image-based biomarkers based on a feature selection from a 2D standardized vessel wall thickness map: a tool for sensitive assessment of therapies for carotid atherosclerosis

    NASA Astrophysics Data System (ADS)

    Chiu, Bernard; Li, Bing; Chow, Tommy W. S.

    2013-09-01

    With the advent of new therapies and management strategies for carotid atherosclerosis, there is a parallel need for measurement tools or biomarkers to evaluate the efficacy of these new strategies. 3D ultrasound has been shown to provide reproducible measurements of plaque area/volume and vessel wall volume. However, since carotid atherosclerosis is a focal disease that predominantly occurs at bifurcations, biomarkers based on local plaque change may be more sensitive than global volumetric measurements in demonstrating efficacy of new therapies. The ultimate goal of this paper is to develop a biomarker that is based on the local distribution of vessel-wall-plus-plaque thickness change (VWT-Change) that has occurred during the course of a clinical study. To allow comparison between different treatment groups, the VWT-Change distribution of each subject must first be mapped to a standardized domain. In this study, we developed a technique to map the 3D VWT-Change distribution to a 2D standardized template. We then applied a feature selection technique to identify regions on the 2D standardized map on which subjects in different treatment groups exhibit greater difference in VWT-Change. The proposed algorithm was applied to analyse the VWT-Change of 20 subjects in a placebo-controlled study of the effect of atorvastatin (Lipitor). The average VWT-Change for each subject was computed (i) over all points in the 2D map and (ii) over feature points only. For the average computed over all points, 97 subjects per group would be required to detect an effect size of 25% that of atorvastatin in a six-month study. The sample size is reduced to 25 subjects if the average were computed over feature points only. The introduction of this sensitive quantification technique for carotid atherosclerosis progression/regression would allow many proof-of-principle studies to be performed before a more costly and longer study involving a larger population is held to confirm the treatment efficacy.

  8. Radiation Dose-Response Model for Locally Advanced Rectal Cancer After Preoperative Chemoradiation Therapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Appelt, Ane L., E-mail: ane.lindegaard.appelt@slb.regionsyddanmark.dk; University of Southern Denmark, Odense; Ploen, John

    2013-01-01

Purpose: Preoperative chemoradiation therapy (CRT) is part of the standard treatment of locally advanced rectal cancers. Tumor regression at the time of operation is desirable, but not much is known about the relationship between radiation dose and tumor regression. In the present study we estimated radiation dose-response curves for various grades of tumor regression after preoperative CRT. Methods and Materials: A total of 222 patients, treated with consistent chemotherapy and radiation therapy techniques, were considered for the analysis. Radiation therapy consisted of a combination of external-beam radiation therapy and brachytherapy. Response at the time of operation was evaluated from the histopathologic specimen and graded on a 5-point scale (TRG1-5). The probability of achieving complete, major, and partial response was analyzed by ordinal logistic regression, and the effect of including clinical parameters in the model was examined. The radiation dose-response relationship for a specific grade of histopathologic tumor regression was parameterized in terms of the dose required for 50% response, D50,i, and the normalized dose-response gradient, γ50,i. Results: A highly significant dose-response relationship was found (P=.002). For complete response (TRG1), the dose-response parameters were D50,TRG1 = 92.0 Gy (95% confidence interval [CI] 79.3-144.9 Gy), γ50,TRG1 = 0.982 (CI 0.533-1.429), and for major response (TRG1-2), D50,TRG1-2 = 72.1 Gy (CI 65.3-94.0 Gy), γ50,TRG1-2 = 0.770 (CI 0.338-1.201). Tumor size and N category both had a significant effect on the dose-response relationships. Conclusions: This study demonstrated a significant dose-response relationship for tumor regression after preoperative CRT for locally advanced rectal cancer for tumor dose levels in the range of 50.4-70 Gy, which is higher than the dose range usually considered.
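
    The dose-response parameterization above can be written as a logistic curve in D50 and γ50 (with slope γ50/D50 at the 50% point); the Python sketch below fits that curve to hypothetical grouped response proportions by least squares rather than the ordinal logistic regression used in the study. All numbers are illustrative and are not the paper's data.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def response_probability(dose, d50, gamma50):
        """Logistic tumor-regression dose-response curve parameterized by the
        dose giving 50% response (d50) and the normalized gradient at that dose."""
        return 1.0 / (1.0 + np.exp(4.0 * gamma50 * (1.0 - dose / d50)))

    # hypothetical grouped data: dose levels and observed response proportions
    dose = np.array([50.4, 55.0, 60.0, 65.0, 70.0])
    prop = np.array([0.10, 0.15, 0.22, 0.30, 0.38])

    (d50_hat, g50_hat), _ = curve_fit(response_probability, dose, prop, p0=[90.0, 1.0])
    ```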

  9. [Predicting the probability of development and progression of primary open angle glaucoma by regression modeling].

    PubMed

    Likhvantseva, V G; Sokolov, V A; Levanova, O N; Kovelenova, I V

    2018-01-01

    Prediction of the clinical course of primary open-angle glaucoma (POAG) is one of the main directions in solving the problem of vision loss prevention and stabilization of the pathological process. Simple statistical methods of correlation analysis show the extent of each risk factor's impact, but do not indicate the total impact of these factors in personalized combinations. The relationships between the risk factors is subject to correlation and regression analysis. The regression equation represents the dependence of the mathematical expectation of the resulting sign on the combination of factor signs. To develop a technique for predicting the probability of development and progression of primary open-angle glaucoma based on a personalized combination of risk factors by linear multivariate regression analysis. The study included 66 patients (23 female and 43 male; 132 eyes) with newly diagnosed primary open-angle glaucoma. The control group consisted of 14 patients (8 male and 6 female). Standard ophthalmic examination was supplemented with biochemical study of lacrimal fluid. Concentration of matrix metalloproteinase MMP-2 and MMP-9 in tear fluid in both eyes was determined using 'sandwich' enzyme-linked immunosorbent assay (ELISA) method. The study resulted in the development of regression equations and step-by-step multivariate logistic models that can help calculate the risk of development and progression of POAG. Those models are based on expert evaluation of clinical and instrumental indicators of hydrodynamic disturbances (coefficient of outflow ease - C, volume of intraocular fluid secretion - F, fluctuation of intraocular pressure), as well as personalized morphometric parameters of the retina (central retinal thickness in the macular area) and concentration of MMP-2 and MMP-9 in the tear film. The newly developed regression equations are highly informative and can be a reliable tool for studying of the influence vector and assessment of pathogenic potential of the independent risk factors in specific personalized combinations.

  10. Glaucoma progression detection with frequency doubling technology (FDT) compared to standard automated perimetry (SAP) in the Groningen Longitudinal Glaucoma Study.

    PubMed

    Wesselink, Christiaan; Jansonius, Nomdo M

    2017-09-01

To determine the usefulness of frequency doubling perimetry (FDT) for progression detection in glaucoma, compared to standard automated perimetry (SAP). Data were used from 150 eyes of 150 glaucoma patients from the Groningen Longitudinal Glaucoma Study. After baseline, SAP was performed approximately yearly; FDT every other year. First and last visit had to contain both tests. Using linear regression, progression velocities were calculated for SAP (Humphrey Field Analyzer) mean deviation (MD), for FDT MD, and for the number of test locations with a total deviation probability below 0.01 (TD). Progression velocity tertiles were determined and eyes were classified as slowly, intermediately, or fast progressing for both techniques. Comparisons between SAP and FDT classifications were made using a Mantel Haenszel chi-square test. Longitudinal signal-to-noise ratios (LSNRs) were calculated, per patient and per technique, defined as progression velocity divided by the standard deviation of the residuals. Mean (SD) follow-up was 6.4 (1.7) years; median (interquartile range [IQR]) baseline SAP MD -6.6 (-14.2 to -3.6) dB. On average 8.2 and 4.5 tests were performed for SAP and FDT, respectively. Median (IQR) MD slope was -0.16 (-0.46 to +0.02) dB/year for SAP and -0.05 (-0.39 to +0.17) dB/year for FDT. Mantel Haenszel chi-squares of SAP MD vs FDT MD and TD were 12.5 (p < 0.001) and 15.8 (p < 0.001), respectively. LSNRs for SAP MD (median -0.17 yr⁻¹) were better than those for FDT MD (-0.04 yr⁻¹; p = 0.010). FDT may be a useful technique for monitoring glaucoma progression in patients who cannot perform SAP reliably. © 2017 The Authors Ophthalmic & Physiological Optics © 2017 The College of Optometrists.
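
    A short Python sketch of the longitudinal signal-to-noise ratio defined above: the progression velocity from a linear fit divided by the standard deviation of the residuals around that fit. The ddof choice is an assumption.

    ```python
    import numpy as np

    def longitudinal_snr(years, md_values):
        """Progression velocity (dB/year) divided by the standard deviation
        of the residuals around the fitted trend line."""
        years = np.asarray(years, float)
        md_values = np.asarray(md_values, float)
        slope, intercept = np.polyfit(years, md_values, 1)
        residuals = md_values - (slope * years + intercept)
        return slope / np.std(residuals, ddof=2)   # ddof=2: two fitted parameters
    ```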

  11. Diagnostic tools for nearest neighbors techniques when used with satellite imagery

    Treesearch

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques are non-parametric approaches to multivariate prediction that are useful for predicting both continuous and categorical forest attribute variables. Although some assumptions underlying nearest neighbor techniques are common to other prediction techniques such as regression, other assumptions are unique to nearest neighbor techniques....

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang, Kunkun, E-mail: ktg@illinois.edu; Inria Bordeaux – Sud-Ouest, Team Cardamom, 200 avenue de la Vieille Tour, 33405 Talence; Congedo, Pietro M.

The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost to resolve repeatedly the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than the one of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.

  13. The crux of the method: assumptions in ordinary least squares and logistic regression.

    PubMed

    Long, Rebecca G

    2008-10-01

    Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.

  14. Modeling Group Differences in OLS and Orthogonal Regression: Implications for Differential Validity Studies

    ERIC Educational Resources Information Center

    Kane, Michael T.; Mroch, Andrew A.

    2010-01-01

    In evaluating the relationship between two measures across different groups (i.e., in evaluating "differential validity") it is necessary to examine differences in correlation coefficients and in regression lines. Ordinary least squares (OLS) regression is the standard method for fitting lines to data, but its criterion for optimal fit…
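
    To make the OLS-versus-orthogonal-regression contrast concrete, the Python sketch below computes both slopes for centered data, the orthogonal slope coming from the first right singular vector (total least squares). It assumes equal error variance in the two variables, which is the simplest orthogonal-regression setting and is an assumption of this sketch rather than of the article.

    ```python
    import numpy as np

    def ols_and_orthogonal_slopes(x, y):
        """Compare the OLS slope (vertical residuals) with the orthogonal
        regression slope (perpendicular residuals) for centered data."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        xc, yc = x - x.mean(), y - y.mean()
        ols_slope = np.dot(xc, yc) / np.dot(xc, xc)
        # orthogonal (total least squares) fit: direction of the first principal axis
        _, _, vt = np.linalg.svd(np.column_stack([xc, yc]), full_matrices=False)
        tls_slope = vt[0, 1] / vt[0, 0]
        return ols_slope, tls_slope
    ```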

  15. Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.

    ERIC Educational Resources Information Center

    Schafer, William D.; Wang, Yuh-Yin

    A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…

  16. The discovery of indicator variables for QSAR using inductive logic programming

    NASA Astrophysics Data System (ADS)

    King, Ross D.; Srinivasan, Ashwin

    1997-11-01

A central problem in forming accurate regression equations in QSAR studies is the selection of appropriate descriptors for the compounds under study. We describe a novel procedure for using inductive logic programming (ILP) to discover new indicator variables (attributes) for QSAR problems, and show that these improve the accuracy of the derived regression equations. ILP techniques have previously been shown to work well on drug design problems where there is a large structural component or where clear comprehensible rules are required. However, ILP techniques have had the disadvantage of only being able to make qualitative predictions (e.g. active, inactive) and not to predict real numbers (regression). We unify ILP and linear regression techniques to give a QSAR method that has the strength of ILP at describing steric structure, with the familiarity and power of linear regression. We evaluated the utility of this new QSAR technique by examining the prediction of biological activity with and without the addition of new structural indicator variables formed by ILP. In three out of five datasets examined the addition of ILP variables produced statistically better results (P < 0.01) over the original description. The new ILP variables did not increase the overall complexity of the derived QSAR equations and added insight into possible mechanisms of action. We conclude that ILP can aid in the process of drug design.

  17. Methods for estimating the magnitude and frequency of peak streamflows at ungaged sites in and near the Oklahoma Panhandle

    USGS Publications Warehouse

    Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.

    2015-09-28

    Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentage of annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.

  18. Single isotope evaluation of pulmonary capillary protein leak (ARDS model) using computerized gamma scintigraphy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tatum, J.L.; Strash, A.M.; Sugerman, H.J.

Using a canine oleic acid model, a computerized gamma scintigraphic technique was evaluated to determine 1) ability to detect pulmonary capillary protein leak in a model temporally consistent with clinical adult respiratory distress syndrome (ARDS), 2) the possibility of providing a quantitative index of leak, and 3) the feasibility of closely spaced repeat evaluations. Study animals received oleic acid (controls, n = 10; 0.05 ml/kg, n = 10; 0.10 ml/kg, n = 12; 0.15 ml/kg, n = 6) 3 hours prior to a tracer dose of technetium-99m (99mTc) HSA. One animal in each dose group also received two repeat tracer injections spaced a minimum of 45 minutes apart. Digital images were obtained with a conventional gamma camera interfaced to a dedicated medical computer. Lung:heart ratio versus time curves were generated, and a slope index was calculated for each curve. Slope index values for all doses were significantly greater than control values (P(t) less than 0.0001). Each incremental dose increase was also significantly greater than the previous dose level. Oleic acid dose versus slope index fitted a linear regression model with r = 0.94. Repeat dosing produced index values with standard deviations less than the group sample standard deviations. We feel this technique may have application in the clinical study of pulmonary permeability edema.

  19. Characterizing nonconstant instrumental variance in emerging miniaturized analytical techniques.

    PubMed

    Noblitt, Scott D; Berg, Kathleen E; Cate, David M; Henry, Charles S

    2016-04-07

    Measurement variance is a crucial aspect of quantitative chemical analysis. Variance directly affects important analytical figures of merit, including detection limit, quantitation limit, and confidence intervals. Most reported analyses for emerging analytical techniques implicitly assume constant variance (homoskedasticity) by using unweighted regression calibrations. Despite the assumption of constant variance, it is known that most instruments exhibit heteroskedasticity, where variance changes with signal intensity. Ignoring nonconstant variance results in suboptimal calibrations, invalid uncertainty estimates, and incorrect detection limits. Three techniques where homoskedasticity is often assumed were covered in this work to evaluate if heteroskedasticity had a significant quantitative impact-naked-eye, distance-based detection using paper-based analytical devices (PADs), cathodic stripping voltammetry (CSV) with disposable carbon-ink electrode devices, and microchip electrophoresis (MCE) with conductivity detection. Despite these techniques representing a wide range of chemistries and precision, heteroskedastic behavior was confirmed for each. The general variance forms were analyzed, and recommendations for accounting for nonconstant variance discussed. Monte Carlo simulations of instrument responses were performed to quantify the benefits of weighted regression, and the sensitivity to uncertainty in the variance function was tested. Results show that heteroskedasticity should be considered during development of new techniques; even moderate uncertainty (30%) in the variance function still results in weighted regression outperforming unweighted regressions. We recommend utilizing the power model of variance because it is easy to apply, requires little additional experimentation, and produces higher-precision results and more reliable uncertainty estimates than assuming homoskedasticity. Copyright © 2016 Elsevier B.V. All rights reserved.
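
    A minimal Python sketch of the recommended weighted-regression calibration under a power model of variance, assuming the observed signal is used as a proxy for its mean; in practice the exponent would be estimated from replicate measurements rather than fixed as it is here, and the function and parameter names are hypothetical.

    ```python
    import numpy as np
    import statsmodels.api as sm

    def weighted_calibration(signal, concentration, variance_power=2.0):
        """Weighted least-squares calibration assuming a power model of variance:
        Var(signal) proportional to signal**variance_power (heteroskedastic noise)."""
        X = sm.add_constant(concentration)
        # weights are inversely proportional to the assumed variance of each point
        weights = 1.0 / np.maximum(np.abs(signal), 1e-12) ** variance_power
        return sm.WLS(signal, X, weights=weights).fit()
    ```

    Compared with an unweighted fit, this down-weights the high-signal points whose noise is largest, which is the behavior the authors report as improving precision and uncertainty estimates.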

  20. Calibration of BK Virus Nucleic Acid Amplification Testing to the 1st WHO International Standard for BK Virus

    PubMed Central

    Tan, Susanna K.; Milligan, Stephen; Sahoo, Malaya K.; Taylor, Nathaniel

    2017-01-01

    ABSTRACT Significant interassay variability in the quantification of BK virus (BKV) DNA precludes establishing broadly applicable thresholds for the management of BKV infection in transplantation. The 1st WHO International Standard for BKV (primary standard) was introduced in 2016 as a common calibrator for improving the harmonization of BKV nucleic acid amplification testing (NAAT) and enabling comparisons of biological measurements worldwide. Here, we evaluated the Altona RealStar BKV assay (Altona) and calibrated the results to the international unit (IU) using the Exact Diagnostics BKV verification panel, a secondary standard traceable to the primary standard. The primary and secondary standards on Altona had nearly identical linear regression equations (primary standard, Y = 1.05X − 0.28, R2 = 0.99; secondary standard, Y = 1.04X − 0.26, R2 = 0.99) and conversion factors (primary standard, 1.11 IU/copy; secondary standard, 1.09 IU/copy). A comparison of Altona with a laboratory-developed BKV NAAT assay in IU/ml versus copies/ml using Passing-Bablok regression revealed similar regression lines, no proportional bias, and improvement in the systematic bias (95% confidence interval of intercepts: copies/ml, −0.52 to −1.01; IU/ml, 0.07 to −0.36). Additionally, Bland-Altman analyses revealed a clinically significant reduction of bias when results were reported in IU/ml (IU/ml, −0.10 log10; copies/ml, −0.70 log10). These results indicate that the use of a common calibrator improved the agreement between the two assays. As clinical laboratories worldwide use calibrators traceable to the primary standard to harmonize BKV NAAT results, we anticipate improved interassay comparisons with a potential for establishing broadly applicable quantitative BKV DNA load cutoffs for clinical practice. PMID:28053213

  1. Simultaneous Estimation of Regression Functions for Marine Corps Technical Training Specialties.

    ERIC Educational Resources Information Center

    Dunbar, Stephen B.; And Others

    This paper considers the application of Bayesian techniques for simultaneous estimation to the specification of regression weights for selection tests used in various technical training courses in the Marine Corps. Results of a method for m-group regression developed by Molenaar and Lewis (1979) suggest that common weights for training courses…

  2. Quantile Regression in the Study of Developmental Sciences

    ERIC Educational Resources Information Center

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…
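
    To make the contrast concrete, the sketch below fits an ordinary least-squares model and conditional-quantile models at the 10th, 50th, and 90th percentiles to the same simulated data; the variable names and simulated predictor are illustrative and not taken from the study.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(0)
      df = pd.DataFrame({"x": rng.uniform(0, 10, 500)})
      df["y"] = 2.0 + 0.5 * df["x"] + rng.normal(0, 0.3 + 0.2 * df["x"])  # spread grows with x

      ols = smf.ols("y ~ x", df).fit()              # average relation only
      print("OLS slope:", ols.params["x"])
      for q in (0.10, 0.50, 0.90):                  # relation at several points of the outcome distribution
          qr = smf.quantreg("y ~ x", df).fit(q=q)
          print(f"slope at quantile {q:.2f}:", qr.params["x"])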

  3. Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)

    DTIC Science & Technology

    1987-10-01

    [Table-of-contents fragment from the report: Repair FADAC Printed Circuit Board; Data Analysis Techniques (Multiple Linear Regression); Analysis/Discussion (Example of Regression Analysis; Regression Results for All Tasks); Table 9, Task Grouping for Analysis; Table 10, Remove/Replace H60A3 Power Pack.]

  4. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper presents the results of research on applying statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, and vascular endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. A classification table of predicted versus observed values is shown, and the overall percentage of correct recognition is determined. The regression coefficients are estimated and used to write the general regression equation. Based on the logistic regression results, a ROC analysis was performed: the sensitivity and specificity of the model are calculated and ROC curves are constructed. These techniques support diagnosis of children's health with a high quality of recognition. The results contribute to the development of evidence-based medicine and have practical importance in the author's professional work.
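
    A minimal sketch of the same analysis pipeline (binary logistic regression, classification table, ROC curve) using scikit-learn on placeholder data; the features are hypothetical stand-ins for the clinical variables described above, not the study's dataset.

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

      rng = np.random.default_rng(1)
      X = rng.normal(size=(200, 4))            # stand-ins for hemostatic / blood-test predictors
      y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

      model = LogisticRegression().fit(X, y)
      print("coefficients:", model.coef_, "intercept:", model.intercept_)

      pred = model.predict(X)
      print("classification table:\n", confusion_matrix(y, pred))
      print("overall % correct:", (pred == y).mean() * 100)

      prob = model.predict_proba(X)[:, 1]
      fpr, tpr, _ = roc_curve(y, prob)         # points of the ROC curve
      print("AUC:", roc_auc_score(y, prob))    # summary of the sensitivity/specificity trade-off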

  5. On the reliable and flexible solution of practical subset regression problems

    NASA Technical Reports Server (NTRS)

    Verhaegen, M. H.

    1987-01-01

    A new algorithm for solving subset regression problems is described. The algorithm performs a QR decomposition with a new column-pivoting strategy, which permits subset selection directly from the originally defined regression parameters. This, in combination with a number of extensions of the new technique, makes the method a very flexible tool for analyzing subset regression problems in which the parameters have a physical meaning.
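
    The core idea, column-pivoted QR decomposition as a subset-selection device, can be imitated with SciPy's pivoted QR: the pivot order ranks the original regression parameters, and the leading k columns define the selected subset. The sketch below is a rough illustration on synthetic data, not the author's algorithm or pivoting strategy.

      import numpy as np
      from scipy.linalg import qr, lstsq

      rng = np.random.default_rng(2)
      A = rng.normal(size=(100, 6))
      A[:, 5] = A[:, 0] + 0.01 * rng.normal(size=100)   # nearly redundant regressor
      y = A @ np.array([1.0, 0.0, 0.0, 2.0, 0.0, 1.0]) + 0.1 * rng.normal(size=100)

      # Pivoted QR orders the columns by how much new information each contributes
      Q, R, piv = qr(A, mode="economic", pivoting=True)
      k = 3                                             # size of the desired subset
      subset = piv[:k]                                  # indices of the selected original parameters
      coef, *_ = lstsq(A[:, subset], y)                 # regression on the selected subset only
      print("selected columns:", subset, "coefficients:", coef)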

  6. The mechanical properties of high speed GTAW weld and factors of nonlinear multiple regression model under external transverse magnetic field

    NASA Astrophysics Data System (ADS)

    Lu, Lin; Chang, Yunlong; Li, Yingmin; He, Youyou

    2013-05-01

    A transverse magnetic field was applied to the arc plasma during high-speed tungsten inert gas (TIG) welding of stainless steel tubes without filler wire, and its influence on welding quality was investigated. Nine parameter sets were designed by means of an orthogonal experiment. The tensile strength of the welded joint and the weld form factor were taken as the main measures of welding quality. A binary quadratic nonlinear regression equation was established with magnetic induction and Ar gas flow rate as the factors, and the residual standard deviation was calculated to assess the accuracy of the regression model. The results showed that the regression model was correct and effective in predicting the tensile strength and aspect ratio of the weld. Two 3D regression models were then constructed, and the influence of magnetic induction on welding quality was examined.
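
    As a generic illustration of the binary quadratic (full second-order, two-factor) regression the abstract describes, the sketch below fits such a surface by least squares and reports the residual standard deviation. The factor symbols B (magnetic induction) and Q (Ar flow rate) and all numbers are placeholders, not the study's data.

      import numpy as np

      # Placeholder 9-run orthogonal-style design: magnetic induction B (mT) and Ar flow Q (L/min)
      B = np.array([0, 0, 0, 10, 10, 10, 20, 20, 20], float)
      Q = np.array([8, 10, 12, 8, 10, 12, 8, 10, 12], float)
      strength = np.array([610, 640, 625, 655, 690, 670, 630, 660, 645], float)  # illustrative responses

      # Full second-order model: y = b0 + b1*B + b2*Q + b3*B^2 + b4*Q^2 + b5*B*Q
      X = np.column_stack([np.ones_like(B), B, Q, B**2, Q**2, B * Q])
      coef, *_ = np.linalg.lstsq(X, strength, rcond=None)
      resid = strength - X @ coef
      rsd = np.sqrt(resid @ resid / (len(strength) - X.shape[1]))  # residual standard deviation
      print("coefficients:", coef)
      print("residual standard deviation:", rsd)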

  7. Validation of Regression-Based Myogenic Correction Techniques for Scalp and Source-Localized EEG

    PubMed Central

    McMenamin, Brenton W.; Shackman, Alexander J.; Maxwell, Jeffrey S.; Greischar, Lawrence L.; Davidson, Richard J.

    2008-01-01

    EEG and EEG source-estimation are susceptible to electromyographic artifacts (EMG) generated by the cranial muscles. EMG can mask genuine effects or masquerade as a legitimate effect - even in low frequencies, such as alpha (8–13Hz). Although regression-based correction has been used previously, only cursory attempts at validation exist and the utility for source-localized data is unknown. To address this, EEG was recorded from 17 participants while neurogenic and myogenic activity were factorially varied. We assessed the sensitivity and specificity of four regression-based techniques: between-subjects, between-subjects using difference-scores, within-subjects condition-wise, and within-subject epoch-wise on the scalp and in data modeled using the LORETA algorithm. Although within-subject epoch-wise showed superior performance on the scalp, no technique succeeded in the source-space. Aside from validating the novel epoch-wise methods on the scalp, we highlight methods requiring further development. PMID:19298626

  8. Techniques for detecting effects of urban and rural land-use practices on stream-water chemistry in selected watersheds in Texas, Minnesota,and Illinois

    USGS Publications Warehouse

    Walker, J.F.

    1993-01-01

    Selected statistical techniques were applied to three urban watersheds in Texas and Minnesota and three rural watersheds in Illinois. For the urban watersheds, single- and paired-site data-collection strategies were considered. The paired-site strategy was much more effective than the single-site strategy for detecting changes. Analysis of storm load regression residuals demonstrated the potential utility of regressions for variability reduction. For the rural watersheds, none of the selected techniques were effective at identifying changes, primarily due to a small degree of management-practice implementation, potential errors introduced through the estimation of storm load, and small sample sizes. A Monte Carlo sensitivity analysis was used to determine the percent change in water chemistry that could be detected for each watershed. In most instances, the use of regressions improved the ability to detect changes.

  9. Applying Regression Analysis to Problems in Institutional Research.

    ERIC Educational Resources Information Center

    Bohannon, Tom R.

    1988-01-01

    Regression analysis is one of the most frequently used statistical techniques in institutional research. Principles of least squares, model building, residual analysis, influence statistics, and multi-collinearity are described and illustrated. (Author/MSE)

  10. Evaluation of interpolation techniques for the creation of gridded daily precipitation (1 × 1 km2); Cyprus, 1980-2010

    NASA Astrophysics Data System (ADS)

    Camera, Corrado; Bruggeman, Adriana; Hadjinicolaou, Panos; Pashiardis, Stelios; Lange, Manfred A.

    2014-01-01

    High-resolution gridded daily data sets are essential for natural resource management and the analyses of climate changes and their effects. This study aims to evaluate the performance of 15 simple or complex interpolation techniques in reproducing daily precipitation at a resolution of 1 km2 over topographically complex areas. Methods are tested considering two different sets of observation densities and different rainfall amounts. We used rainfall data that were recorded at 74 and 145 observational stations, respectively, spread over the 5760 km2 of the Republic of Cyprus, in the Eastern Mediterranean. Regression analyses utilizing geographical copredictors and neighboring interpolation techniques were evaluated both in isolation and combined. Linear multiple regression (LMR) and geographically weighted regression methods (GWR) were tested. These included a step-wise selection of covariables, as well as inverse distance weighting (IDW), kriging, and 3D-thin plate splines (TPS). The relative rank of the different techniques changes with different station density and rainfall amounts. Our results indicate that TPS performs well for low station density and large-scale events and also when coupled with regression models. It performs poorly for high station density. The opposite is observed when using IDW. Simple IDW performs best for local events, while a combination of step-wise GWR and IDW proves to be the best method for large-scale events and high station density. This study indicates that the use of step-wise regression with a variable set of geographic parameters can improve the interpolation of large-scale events because it facilitates the representation of local climate dynamics.
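
    The sketch below shows the simplest of the compared neighbor-based methods, inverse distance weighting, for a single day of rainfall at hypothetical station coordinates; it only makes the interpolation step concrete and does not reproduce the study's 15 methods or its data.

      import numpy as np

      def idw(xy_obs, z_obs, xy_grid, power=2.0):
          """Inverse-distance-weighted interpolation of point observations onto grid locations."""
          d = np.linalg.norm(xy_grid[:, None, :] - xy_obs[None, :, :], axis=2)
          d = np.maximum(d, 1e-12)                 # avoid division by zero at station locations
          w = 1.0 / d**power
          return (w * z_obs).sum(axis=1) / w.sum(axis=1)

      # Hypothetical stations (km coordinates) and daily rainfall (mm)
      stations = np.array([[0.0, 0.0], [10.0, 2.0], [4.0, 8.0], [12.0, 11.0]])
      rain = np.array([3.0, 12.0, 7.5, 0.0])
      grid = np.array([[2.0, 2.0], [8.0, 8.0]])    # two target cells of the 1 km grid
      print(idw(stations, rain, grid))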

  11. Selection of a Geostatistical Method to Interpolate Soil Properties of the State Crop Testing Fields using Attributes of a Digital Terrain Model

    NASA Astrophysics Data System (ADS)

    Sahabiev, I. A.; Ryazanov, S. S.; Kolcova, T. G.; Grigoryan, B. R.

    2018-03-01

    The three most common techniques to interpolate soil properties at a field scale—ordinary kriging (OK), regression kriging with a multiple linear regression drift model (RK + MLR), and regression kriging with a principal component regression drift model (RK + PCR)—were examined. The results of the study were compiled into an algorithm for choosing the most appropriate soil mapping technique. Relief attributes were used as the auxiliary variables. When spatial dependence of a target variable was strong, the OK method showed more accurate interpolation results, and the inclusion of the auxiliary data resulted in an insignificant improvement in prediction accuracy. According to the algorithm, the RK + PCR method effectively eliminates multicollinearity of explanatory variables. However, if the number of predictors is less than ten, the probability of multicollinearity is reduced, and application of PCR is no longer justified. In that case, multiple linear regression should be used instead.

  12. Determination of solid-propellant transient regression rates using a microwave Doppler shift technique

    NASA Technical Reports Server (NTRS)

    Strand, L. D.; Schultz, A. L.; Reedy, G. K.

    1972-01-01

    A microwave Doppler shift system, with increased resolution over earlier microwave techniques, was developed for the purpose of measuring the regression rates of solid propellants during rapid pressure transients. A continuous microwave beam is transmitted to the base of a burning propellant sample cast in a metal waveguide tube. A portion of the wave is reflected from the regressing propellant-flame zone interface. The phase angle difference between the incident and reflected signals and its time differential are continuously measured using a high resolution microwave network analyzer and related instrumentation. The apparent propellant regression rate is directly proportional to this latter differential measurement. Experiments were conducted to verify the (1) spatial and time resolution of the system, (2) effect of propellant surface irregularities and compressibility on the measurements, and (3) accuracy of the system for quasi-steady-state regression rate measurements. The microwave system was also used in two different transient combustion experiments: in a rapid depressurization bomb, and in the high-frequency acoustic pressure environment of a T-burner.

  13. Methods for estimating the magnitude and frequency of floods for urban and small, rural streams in Georgia, South Carolina, and North Carolina, 2011

    USGS Publications Warehouse

    Feaster, Toby D.; Gotvald, Anthony J.; Weaver, J. Curtis

    2014-01-01

    Reliable estimates of the magnitude and frequency of floods are essential for the design of transportation and water-conveyance structures, flood-insurance studies, and flood-plain management. Such estimates are particularly important in densely populated urban areas. In order to increase the number of streamflow-gaging stations (streamgages) available for analysis, expand the geographical coverage that would allow for application of regional regression equations across State boundaries, and build on a previous flood-frequency investigation of rural U.S Geological Survey streamgages in the Southeast United States, a multistate approach was used to update methods for determining the magnitude and frequency of floods in urban and small, rural streams that are not substantially affected by regulation or tidal fluctuations in Georgia, South Carolina, and North Carolina. The at-site flood-frequency analysis of annual peak-flow data for urban and small, rural streams (through September 30, 2011) included 116 urban streamgages and 32 small, rural streamgages, defined in this report as basins draining less than 1 square mile. The regional regression analysis included annual peak-flow data from an additional 338 rural streamgages previously included in U.S. Geological Survey flood-frequency reports and 2 additional rural streamgages in North Carolina that were not included in the previous Southeast rural flood-frequency investigation for a total of 488 streamgages included in the urban and small, rural regression analysis. The at-site flood-frequency analyses for the urban and small, rural streamgages included the expected moments algorithm, which is a modification of the Bulletin 17B log-Pearson type III method for fitting the statistical distribution to the logarithms of the annual peak flows. Where applicable, the flood-frequency analysis also included low-outlier and historic information. Additionally, the application of a generalized Grubbs-Becks test allowed for the detection of multiple potentially influential low outliers. Streamgage basin characteristics were determined using geographical information system techniques. Initial ordinary least squares regression simulations reduced the number of basin characteristics on the basis of such factors as statistical significance, coefficient of determination, Mallow’s Cp statistic, and ease of measurement of the explanatory variable. Application of generalized least squares regression techniques produced final predictive (regression) equations for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probability flows for urban and small, rural ungaged basins for three hydrologic regions (HR1, Piedmont–Ridge and Valley; HR3, Sand Hills; and HR4, Coastal Plain), which previously had been defined from exploratory regression analysis in the Southeast rural flood-frequency investigation. Because of the limited availability of urban streamgages in the Coastal Plain of Georgia, South Carolina, and North Carolina, additional urban streamgages in Florida and New Jersey were used in the regression analysis for this region. Including the urban streamgages in New Jersey allowed for the expansion of the applicability of the predictive equations in the Coastal Plain from 3.5 to 53.5 square miles. 
Average standard error of prediction for the predictive equations, which is a measure of the average accuracy of the regression equations when predicting flood estimates for ungaged sites, ranges from 25.0 percent for the 10-percent annual exceedance probability regression equation for the Piedmont–Ridge and Valley region to 73.3 percent for the 0.2-percent annual exceedance probability regression equation for the Sand Hills region.

  14. Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

    ERIC Educational Resources Information Center

    Williams, Ryan T.

    2012-01-01

    Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis; however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…

  15. Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

    NASA Astrophysics Data System (ADS)

    Kristoufek, Ladislav

    2015-02-01

    We propose a framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances and it is designed to estimate regression parameters at different scales and under potential nonstationarity and power-law correlations. The former feature allows for distinguishing between effects for a pair of variables from different temporal perspectives. The latter ones make the method a significant improvement over the standard least squares estimation. Theoretical claims are supported by Monte Carlo simulations. The method is then applied on selected examples from physics, finance, environmental science, and epidemiology. For most of the studied cases, the relationship between variables of interest varies strongly across scales.

  16. Aircraft noise annoyance in recreational areas after changes in noise exposure: comments on Krog and Engdahl (2004).

    PubMed

    Klaeboe, Ronny

    2005-09-01

    When Gardermoen replaced Fornebu as the main airport for Oslo, aircraft noise levels increased in recreational areas near Gardermoen and decreased in areas near Fornebu. Krog and Engdahl [J. Acoust. Soc. Am. 116, 323-333 (2004)] estimate that recreationists' annoyance from aircraft noise in these areas changed more than would be anticipated from the actual noise changes. However, the sizes of their estimated "situation" effects are not credible. One possible reason for the anomalous results is that standard regression assumptions become violated when motivational factors are inserted into the regression model. Standardized regression coefficients (beta values) should also not be utilized for comparisons across equations.

  17. Assessment of a high-SNR chemical-shift-encoded MRI with complex reconstruction for proton density fat fraction (PDFF) estimation overall and in the low-fat range.

    PubMed

    Park, Charlie C; Hooker, Catherine; Hooker, Jonathan C; Bass, Emily; Haufe, William; Schlein, Alexandra; Covarrubias, Yesenia; Heba, Elhamy; Bydder, Mark; Wolfson, Tanya; Gamst, Anthony; Loomba, Rohit; Schwimmer, Jeffrey; Hernando, Diego; Reeder, Scott B; Middleton, Michael; Sirlin, Claude B; Hamilton, Gavin

    2018-04-29

    Improving the signal-to-noise ratio (SNR) of chemical-shift-encoded MRI acquisition with complex reconstruction (MRI-C) may improve the accuracy and precision of noninvasive proton density fat fraction (PDFF) quantification in patients with hepatic steatosis. To assess the accuracy of high SNR (Hi-SNR) MRI-C versus standard MRI-C acquisition to estimate hepatic PDFF in adult and pediatric nonalcoholic fatty liver disease (NAFLD) using an MR spectroscopy (MRS) sequence as the reference standard. Prospective. In all, 231 adult and pediatric patients with known or suspected NAFLD. PDFF estimated at 3T by three MR techniques: standard MRI-C; a Hi-SNR MRI-C variant with increased slice thickness, decreased matrix size, and no parallel imaging; and MRS (reference standard). MRI-PDFF was measured by image analysts using a region of interest coregistered with the MRS-PDFF voxel. Linear regression analyses were used to assess accuracy and precision of MRI-estimated PDFF for MRS-PDFF as a function of MRI-PDFF using the standard and Hi-SNR MRI-C for all patients and for patients with MRS-PDFF <10%. In all, 271 exams from 231 patients were included (mean MRS-PDFF: 12.6% [SD: 10.4]; range: 0.9-41.9). High agreement between MRI-PDFF and MRS-PDFF was demonstrated across the overall range of PDFF, with a regression slope of 1.035 for the standard MRI-C and 1.008 for Hi-SNR MRI-C. Hi-SNR MRI-C, compared to standard MRI-C, provided small but statistically significant improvements in the slope (respectively, 1.008 vs. 1.035, P = 0.004) and mean bias (0.412 vs. 0.673, P < 0.0001) overall. In the low-fat patients only, Hi-SNR MRI-C provided improvements in the slope (1.058 vs. 1.190, P = 0.002), mean bias (0.168 vs. 0.368, P = 0.007), intercept (-0.153 vs. -0.796, P < 0.0001), and borderline improvement in the R 2 (0.888 vs. 0.813, P = 0.01). Compared to standard MRI-C, Hi-SNR MRI-C provides slightly higher MRI-PDFF estimation accuracy across the overall range of PDFF and improves both accuracy and precision in the low PDFF range. 1 Technical Efficacy: Stage 2 J. Magn. Reson. Imaging 2018. © 2018 International Society for Magnetic Resonance in Medicine.

  18. Low-dose computed tomography scans with automatic exposure control for patients of different ages undergoing cardiac PET/CT and SPECT/CT.

    PubMed

    Yang, Ching-Ching; Yang, Bang-Hung; Tu, Chun-Yuan; Wu, Tung-Hsin; Liu, Shu-Hsin

    2017-06-01

    This study aimed to evaluate the efficacy of automatic exposure control (AEC) in order to optimize low-dose computed tomography (CT) protocols for patients of different ages undergoing cardiac PET/CT and single-photon emission computed tomography/computed tomography (SPECT/CT). One PET/CT and one SPECT/CT were used to acquire CT images for four anthropomorphic phantoms representative of 1-year-old, 5-year-old and 10-year-old children and an adult. For the hybrid systems investigated in this study, the radiation dose and image quality of cardiac CT scans performed with AEC activated depend mainly on the selection of a predefined image quality index. Multiple linear regression methods were used to analyse image data from anthropomorphic phantom studies to investigate the effects of body size and predefined image quality index on CT radiation dose in cardiac PET/CT and SPECT/CT scans. The regression relationships have a coefficient of determination larger than 0.9, indicating a good fit to the data. According to the regression models, low-dose protocols using the AEC technique were optimized for patients of different ages. In comparison with the standard protocol with AEC activated for adult cardiac examinations used in our clinical routine practice, the optimized paediatric protocols in PET/CT allow 32.2, 63.7 and 79.2% CT dose reductions for anthropomorphic phantoms simulating 10-year-old, 5-year-old and 1-year-old children, respectively. The corresponding results for cardiac SPECT/CT are 8.4, 51.5 and 72.7%. AEC is a practical way to reduce CT radiation dose in cardiac PET/CT and SPECT/CT, but the AEC settings should be determined properly for optimal effect. Our results show that AEC does not eliminate the need for paediatric protocols and CT examinations using the AEC technique should be optimized for paediatric patients to reduce the radiation dose as low as reasonably achievable.

  19. Hyperspectral imaging for predicting the allicin and soluble solid content of garlic with variable selection algorithms and chemometric models.

    PubMed

    Rahman, Anisur; Faqeerzada, Mohammad A; Cho, Byoung-Kwan

    2018-03-14

    Allicin and soluble solid content (SSC) in garlic are responsible for its pungent flavor and odor. However, current conventional methods such as high-pressure liquid chromatography and refractometry have critical drawbacks in that they are time-consuming, labor-intensive and destructive. The present study aimed to predict allicin and SSC in garlic using hyperspectral imaging in combination with variable selection algorithms and calibration models. Hyperspectral images of 100 garlic cloves were acquired that covered two spectral ranges, from which the mean spectra of each clove were extracted. The calibration models included partial least squares (PLS) and least squares-support vector machine (LS-SVM) regression, as well as different spectral pre-processing techniques, from which the highest performing spectral preprocessing technique and spectral range were selected. Then, variable selection methods, such as regression coefficients, variable importance in projection (VIP) and the successive projections algorithm (SPA), were evaluated for the selection of effective wavelengths (EWs). Furthermore, PLS and LS-SVM regression methods were applied to quantitatively predict the quality attributes of garlic using the selected EWs. Of the established models, the SPA-LS-SVM model obtained an Rpred2 of 0.90 and standard error of prediction (SEP) of 1.01% for SSC prediction, whereas the VIP-LS-SVM model produced the best result with an Rpred2 of 0.83 and SEP of 0.19 mg g-1 for allicin prediction in the range 1000-1700 nm. Furthermore, chemical images of garlic were developed using the best predictive model to facilitate visualization of the spatial distributions of allicin and SSC. The present study clearly demonstrates that hyperspectral imaging combined with an appropriate chemometrics method can potentially be employed as a fast, non-invasive method to predict the allicin and SSC in garlic. © 2018 Society of Chemical Industry.
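
    A minimal sketch of one link in the chain described above, PLS regression from mean spectra to a reference quality attribute, using scikit-learn and synthetic spectra; the wavelength count, component number, and data are illustrative assumptions rather than the study's settings.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import cross_val_predict

      rng = np.random.default_rng(3)
      n_samples, n_wavelengths = 100, 200
      spectra = rng.normal(size=(n_samples, n_wavelengths))       # stand-in for mean NIR spectra per clove
      ssc = 30 + 2 * spectra[:, 50] + spectra[:, 120] + rng.normal(scale=0.5, size=n_samples)  # reference SSC

      pls = PLSRegression(n_components=5)
      pred = cross_val_predict(pls, spectra, ssc, cv=10).ravel()  # cross-validated predictions
      sep = np.std(ssc - pred, ddof=1)                            # standard error of prediction
      r2 = np.corrcoef(ssc, pred)[0, 1] ** 2
      print(f"R^2(pred) = {r2:.3f}, SEP = {sep:.3f}")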

  20. Improvement of near infrared spectroscopic (NIRS) analysis of caffeine in roasted Arabica coffee by variable selection method of stability competitive adaptive reweighted sampling (SCARS).

    PubMed

    Zhang, Xuan; Li, Wei; Yin, Bin; Chen, Weizhong; Kelly, Declan P; Wang, Xiaoxin; Zheng, Kaiyi; Du, Yiping

    2013-10-01

    Coffee is the most heavily consumed beverage in the world after water, and quality is a key consideration in its commercial trade. Caffeine content, which has a significant effect on the final quality of coffee products, therefore needs to be determined rapidly and reliably by new analytical techniques. The main purpose of this work was to establish a powerful and practical analytical method based on near infrared spectroscopy (NIRS) and chemometrics for quantitative determination of caffeine content in roasted Arabica coffees. Ground coffee samples covering a wide range of roasting levels were analyzed by NIR, and their caffeine contents were quantitatively determined by the most commonly used HPLC-UV method to provide reference values. Calibration models based on chemometric analyses of the NIR spectral data and reference concentrations of the coffee samples were then developed. Partial least squares (PLS) regression was used to construct the models. Furthermore, diverse spectral pretreatment and variable selection techniques were applied in order to obtain robust and reliable reduced-spectrum regression models. Comparing the respective quality of the different models constructed, the application of second derivative pretreatment and stability competitive adaptive reweighted sampling (SCARS) variable selection provided a notably improved regression model, with root mean square error of cross validation (RMSECV) of 0.375 mg/g and correlation coefficient (R) of 0.918 at a PLS factor of 7. An independent test set was used to assess the model, with root mean square error of prediction (RMSEP) of 0.378 mg/g, mean relative error of 1.976% and mean relative standard deviation (RSD) of 1.707%. Thus, the results provided by the high-quality calibration model revealed the feasibility of NIR spectroscopy for at-line prediction of the caffeine content of unknown roasted coffee samples, thanks to the short analysis time of a few seconds and the non-destructive nature of NIRS. Copyright © 2013 Elsevier B.V. All rights reserved.

  2. Statistical summary of selected physical, chemical, and toxicity characteristics and estimates of annual constituent loads in urban stormwater, Maricopa County, Arizona

    USGS Publications Warehouse

    Fossum, Kenneth D.; O'Day, Christie M.; Wilson, Barbara J.; Monical, Jim E.

    2001-01-01

    Stormwater and streamflow in Maricopa County were monitored to (1) describe the physical, chemical, and toxicity characteristics of stormwater from areas having different land uses, (2) describe the physical, chemical, and toxicity characteristics of streamflow from areas that receive urban stormwater, and (3) estimate constituent loads in stormwater. Urban stormwater and streamflow had similar ranges in most constituent concentrations. The mean concentration of dissolved solids in urban stormwater was lower than in streamflow from the Salt River and Indian Bend Wash. Urban stormwater, however, had a greater chemical oxygen demand and higher concentrations of most nutrients. Mean seasonal loads and mean annual loads of 11 constituents and volumes of runoff were estimated for municipalities in the metropolitan Phoenix area, Arizona, by adjusting regional regression equations of loads. This adjustment procedure uses the original regional regression equation and additional explanatory variables that were not included in the original equation. The adjusted equations had standard errors that ranged from 161 to 196 percent. The large standard errors of the prediction result from the large variability of the constituent concentration data used in the regression analysis. Adjustment procedures produced unsatisfactory results for nine of the regressions: suspended solids, dissolved solids, total phosphorus, dissolved phosphorus, total recoverable cadmium, total recoverable copper, total recoverable lead, total recoverable zinc, and storm runoff. These equations had no consistent direction of bias and no other additional explanatory variables correlated with the observed loads. A stepwise-multiple regression or a three-variable regression (total storm rainfall, drainage area, and impervious area) and local data were used to develop local regression equations for these nine constituents. These equations had standard errors from 15 to 183 percent.

  3. Multivariate decoding of brain images using ordinal regression.

    PubMed

    Doyle, O M; Ashburner, J; Zelaya, F O; Williams, S C R; Mehta, M A; Marquand, A F

    2013-11-01

    Neuroimaging data are increasingly being used to predict potential outcomes or groupings, such as clinical severity, drug dose response, and transitional illness states. In these examples, the variable (target) we want to predict is ordinal in nature. Conventional classification schemes assume that the targets are nominal and hence ignore their ranked nature, whereas parametric and/or non-parametric regression models enforce a metric notion of distance between classes. Here, we propose a novel, alternative multivariate approach that overcomes these limitations - whole brain probabilistic ordinal regression using a Gaussian process framework. We applied this technique to two data sets of pharmacological neuroimaging data from healthy volunteers. The first study was designed to investigate the effect of ketamine on brain activity and its subsequent modulation with two compounds - lamotrigine and risperidone. The second study investigates the effect of scopolamine on cerebral blood flow and its modulation using donepezil. We compared ordinal regression to multi-class classification schemes and metric regression. Considering the modulation of ketamine with lamotrigine, we found that ordinal regression significantly outperformed multi-class classification and metric regression in terms of accuracy and mean absolute error. However, for risperidone ordinal regression significantly outperformed metric regression but performed similarly to multi-class classification both in terms of accuracy and mean absolute error. For the scopolamine data set, ordinal regression was found to outperform both multi-class and metric regression techniques considering the regional cerebral blood flow in the anterior cingulate cortex. Ordinal regression was thus the only method that performed well in all cases. Our results indicate the potential of an ordinal regression approach for neuroimaging data while providing a fully probabilistic framework with elegant approaches for model selection. Copyright © 2013. Published by Elsevier Inc.
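
    For readers unfamiliar with ordinal regression, the sketch below fits a basic proportional-odds (ordered logit) model with statsmodels. It only illustrates the idea of respecting ranked targets; it is not the whole-brain Gaussian-process method described in the abstract, and the data and features are simulated stand-ins.

      import numpy as np
      import pandas as pd
      from statsmodels.miscmodels.ordinal_model import OrderedModel

      rng = np.random.default_rng(4)
      x = rng.normal(size=(300, 2))                     # stand-in predictors (e.g., summary image features)
      latent = x[:, 0] + 0.5 * x[:, 1] + rng.logistic(size=300)
      y = pd.Series(pd.cut(latent, bins=[-np.inf, -1, 1, np.inf],
                           labels=["low", "medium", "high"]))   # ordered outcome classes

      model = OrderedModel(y, x, distr="logit")         # ordered logit respects the ranking of the classes
      res = model.fit(method="bfgs", disp=False)
      print(res.summary())
      print(model.predict(res.params, exog=x[:5]))      # class probabilities for the first five rows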

  4. Standardized pivot shift test improves measurement accuracy.

    PubMed

    Hoshino, Yuichi; Araujo, Paulo; Ahlden, Mattias; Moore, Charity G; Kuroda, Ryosuke; Zaffagnini, Stefano; Karlsson, Jon; Fu, Freddie H; Musahl, Volker

    2012-04-01

    The variability of the pivot shift test techniques greatly interferes with achieving a quantitative and generally comparable measurement. The purpose of this study was to compare the variation of the quantitative pivot shift measurements with different surgeons' preferred techniques to a standardized technique. The hypothesis was that standardizing the pivot shift test would improve consistency in the quantitative evaluation when compared with surgeon-specific techniques. A whole lower body cadaveric specimen was prepared to have a low-grade pivot shift on one side and high-grade pivot shift on the other side. Twelve expert surgeons performed the pivot shift test using (1) their preferred technique and (2) a standardized technique. Electromagnetic tracking was utilized to measure anterior tibial translation and acceleration of the reduction during the pivot shift test. The variation of the measurement was compared between the surgeons' preferred technique and the standardized technique. The anterior tibial translation during pivot shift test was similar between using surgeons' preferred technique (left 24.0 ± 4.3 mm; right 15.5 ± 3.8 mm) and using standardized technique (left 25.1 ± 3.2 mm; right 15.6 ± 4.0 mm; n.s.). However, the variation in acceleration was significantly smaller with the standardized technique (left 3.0 ± 1.3 mm/s(2); right 2.5 ± 0.7 mm/s(2)) compared with the surgeons' preferred technique (left 4.3 ± 3.3 mm/s(2); right 3.4 ± 2.3 mm/s(2); both P < 0.01). Standardizing the pivot shift test maneuver provides a more consistent quantitative evaluation and may be helpful in designing future multicenter clinical outcome trials. Diagnostic study, Level I.

  5. Algorithm For Solution Of Subset-Regression Problems

    NASA Technical Reports Server (NTRS)

    Verhaegen, Michel

    1991-01-01

    Reliable and flexible algorithm for solution of subset-regression problem performs QR decomposition with new column-pivoting strategy, enables selection of subset directly from originally defined regression parameters. This feature, in combination with number of extensions, makes algorithm very flexible for use in analysis of subset-regression problems in which parameters have physical meanings. Also extended to enable joint processing of columns contaminated by noise with those free of noise, without using scaling techniques.

  6. CAHOST: An Excel Workbook for Facilitating the Johnson-Neyman Technique for Two-Way Interactions in Multiple Regression.

    PubMed

    Carden, Stephen W; Holtzman, Nicholas S; Strube, Michael J

    2017-01-01

    When using multiple regression, researchers frequently wish to explore how the relationship between two variables is moderated by another variable; this is termed an interaction. Historically, two approaches have been used to probe interactions: the pick-a-point approach and the Johnson-Neyman (JN) technique. The pick-a-point approach has limitations that can be avoided using the JN technique. Currently, the software available for implementing the JN technique and creating corresponding figures lacks several desirable features-most notably, ease of use and figure quality. To fill this gap in the literature, we offer a free Microsoft Excel 2013 workbook, CAHOST (a concatenation of the first two letters of the authors' last names), that allows the user to seamlessly create publication-ready figures of the results of the JN technique.
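
    Outside of the workbook, the JN boundaries can also be computed directly: for y = b0 + b1*x + b2*z + b3*x*z, the conditional slope of x at moderator value z is b1 + b3*z, and the JN points are the z values where its t statistic equals the critical value. The rough Python sketch below works on simulated data; all names are illustrative and it is not the CAHOST implementation.

      import numpy as np
      import statsmodels.api as sm
      from scipy import stats

      rng = np.random.default_rng(5)
      n = 200
      x, z = rng.normal(size=n), rng.normal(size=n)
      y = 1 + 0.4 * x + 0.3 * z + 0.5 * x * z + rng.normal(size=n)

      X = sm.add_constant(np.column_stack([x, z, x * z]))   # columns: const, x, z, x*z
      fit = sm.OLS(y, X).fit()
      b1, b3 = fit.params[1], fit.params[3]
      v = fit.cov_params()
      v11, v33, v13 = v[1, 1], v[3, 3], v[1, 3]
      tcrit = stats.t.ppf(0.975, fit.df_resid)

      # Solve (b1 + b3*z)^2 = tcrit^2 * (v11 + 2*z*v13 + z^2*v33) for z: a quadratic in z
      a = b3**2 - tcrit**2 * v33
      b = 2 * (b1 * b3 - tcrit**2 * v13)
      c = b1**2 - tcrit**2 * v11
      disc = b**2 - 4 * a * c
      if disc >= 0:
          roots = np.sort((-b + np.array([-1, 1]) * np.sqrt(disc)) / (2 * a))
          print("JN boundaries for the conditional effect of x:", roots)
      else:
          print("no real JN boundaries in this fit")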

  7. The Collinearity Free and Bias Reduced Regression Estimation Project: The Theory of Normalization Ridge Regression. Report No. 2.

    ERIC Educational Resources Information Center

    Bulcock, J. W.; And Others

    Multicollinearity refers to the presence of highly intercorrelated independent variables in structural equation models, that is, models estimated by using techniques such as least squares regression and maximum likelihood. There is a problem of multicollinearity in both the natural and social sciences where theory formulation and estimation is in…

  8. Use of Multiple Regression and Use-Availability Analyses in Determining Habitat Selection by Gray Squirrels (Sciurus Carolinensis)

    Treesearch

    John W. Edwards; Susan C. Loeb; David C. Guynn

    1994-01-01

    Multiple regression and use-availability analyses are two methods for examining habitat selection. Use-availability analysis is commonly used to evaluate macrohabitat selection whereas multiple regression analysis can be used to determine microhabitat selection. We compared these techniques using behavioral observations (n = 5534) and telemetry locations (n = 2089) of...

  9. Comparison of a new noncoplanar intensity-modulated radiation therapy technique for craniospinal irradiation with 3 coplanar techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hansen, Anders T., E-mail: andehans@rm.dk; Lukacova, Slavka; Lassen-Ramshad, Yasmin

    2015-01-01

    When standard conformal x-ray technique for craniospinal irradiation is used, it is a challenge to achieve satisfactory dose coverage of the target including the area of the cribriform plate, while sparing organs at risk. We present a new noncoplanar intensity-modulated radiation therapy (IMRT) technique for delivering irradiation to the cranial part and compare it with 3 other techniques and previously published results. A total of 13 patients who had previously received craniospinal irradiation with standard conformal x-ray technique were reviewed. New treatment plans were generated for each patient using the noncoplanar IMRT-based technique, a coplanar IMRT-based technique, and a coplanar volumetric-modulated arc therapy (VMAT) technique. Dosimetry data for all patients were compared with the corresponding data from the conventional treatment plans. The new noncoplanar IMRT technique substantially reduced the mean dose to organs at risk compared with the standard radiation technique. The 2 other coplanar techniques also reduced the mean dose to some of the critical organs. However, this reduction was not as substantial as the reduction obtained by the noncoplanar technique. Furthermore, compared with the standard technique, the IMRT techniques reduced the total calculated radiation dose that was delivered to the normal tissue, whereas the VMAT technique increased this dose. Additionally, the coverage of the target was significantly improved by the noncoplanar IMRT technique. Compared with the standard technique, the coplanar IMRT and the VMAT technique did not improve the coverage of the target significantly. All the new planning techniques increased the number of monitor units (MU) used (the noncoplanar IMRT technique by 99%, the coplanar IMRT technique by 122%, and the VMAT technique by 26%), causing concern for leak radiation. The noncoplanar IMRT technique covered the target better and decreased doses to organs at risk compared with the other techniques. All the new techniques increased the number of MU compared with the standard technique.

  10. Experimental investigation of paraffin-based fuels for hybrid rocket propulsion

    NASA Astrophysics Data System (ADS)

    Galfetti, L.; Merotto, L.; Boiocchi, M.; Maggi, F.; DeLuca, L. T.

    2013-03-01

    Solid fuels for hybrid rockets were characterized in the framework of a research project aimed to develop a new generation of solid fuels, combining at the same time good mechanical and ballistic properties. Original techniques were implemented in order to improve paraffin-based fuels. The first strengthening technique involves the use of a polyurethane foam (PUF); a second technique is based on thermoplastic polymers mixed at molecular level with the paraffin binder. A ballistic characterization of paraffin-based hybrid rocket solid fuels was performed, considering pure wax-based fuels and fuels doped with suitable metal additives. Nano-Al powders and metal hydrides (magnesium hydride (MgH2), lithium aluminum hydride (LiAlH4 )) were used as fillers in paraffin matrices. The results of this investigation show a strong correlation between the measured viscosity of the melted paraffin layer and the regression rate: a decrease of viscosity increases the regression rate. This trend is due to the increasing development of entrainment phenomena, which strongly increase the regression rate. Addition of LiAlH4 (mass fraction 10%) can further increase the regression rate up to 378% with respect to the pure HTPB regression rate, taken as baseline reference fuel. The highest regression rates were found for the Solid Wax (SW) composition, added with 5% MgH2 mass fraction; at 350 kg/(m2s) oxygen mass flux, the measured regression rate, averaged in space and time, was 2.5 mm/s, which is approximately five times higher than that of the pure HTPB composition. Compositions added with nanosized aluminum powders were compared with those added with MgH2, using gel or solid wax.

  11. Correcting for the influence of sampling conditions on biomarkers of exposure to phenols and phthalates: a 2-step standardization method based on regression residuals.

    PubMed

    Mortamais, Marion; Chevrier, Cécile; Philippat, Claire; Petit, Claire; Calafat, Antonia M; Ye, Xiaoyun; Silva, Manori J; Brambilla, Christian; Eijkemans, Marinus J C; Charles, Marie-Aline; Cordier, Sylvaine; Slama, Rémy

    2012-04-26

    Environmental epidemiology and biomonitoring studies typically rely on biological samples to assay the concentration of non-persistent exposure biomarkers. Between-participant variations in sampling conditions of these biological samples constitute a potential source of exposure misclassification. Few studies have attempted to correct biomarker levels for this error. We aimed to assess the influence of sampling conditions on concentrations of urinary biomarkers of select phenols and phthalates, two widely-produced families of chemicals, and to standardize biomarker concentrations on sampling conditions. Urine samples were collected between 2002 and 2006 among 287 pregnant women from the Eden and Pélagie cohorts, from which phthalate and phenol metabolite levels were assayed. We applied a 2-step standardization method based on regression residuals. First, the influence of sampling conditions (including sampling hour and duration of storage before freezing) and of creatinine levels on biomarker concentrations was characterized using adjusted linear regression models. In the second step, the model estimates were used to remove the variability in biomarker concentrations due to sampling conditions and to standardize concentrations as if all samples had been collected under the same conditions (e.g., same hour of urine collection). Sampling hour was associated with concentrations of several exposure biomarkers. After standardization for sampling conditions, median concentrations differed by -38% for 2,5-dichlorophenol to +80% for a metabolite of diisodecyl phthalate. However, at the individual level, standardized biomarker levels were strongly correlated (correlation coefficients above 0.80) with unstandardized measures. Sampling conditions, such as sampling hour, should be systematically collected in biomarker-based studies, in particular when the biomarker half-life is short. The 2-step standardization method based on regression residuals that we proposed in order to limit the impact of heterogeneity in sampling conditions could be further tested in studies describing levels of biomarkers or their influence on health.
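
    The two-step idea can be sketched in a few lines: regress log biomarker concentrations on sampling conditions (and creatinine), keep the residuals, and add back the fitted value at a chosen reference condition so every sample is expressed as if collected under the same conditions. The column names and coefficients below are hypothetical, not those of the cohorts.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      # Hypothetical data: log-transformed urinary concentrations plus sampling conditions
      rng = np.random.default_rng(6)
      df = pd.DataFrame({
          "sampling_hour": rng.uniform(7, 20, 300),
          "storage_days": rng.uniform(0, 3, 300),
          "log_creatinine": rng.normal(0, 0.4, 300),
      })
      df["log_conc"] = (1.0 - 0.05 * df["sampling_hour"] + 0.1 * df["storage_days"]
                        + 0.8 * df["log_creatinine"] + rng.normal(0, 0.5, 300))

      # Step 1: model the influence of sampling conditions and creatinine
      step1 = smf.ols("log_conc ~ sampling_hour + storage_days + log_creatinine", df).fit()

      # Step 2: residual + predicted value at reference conditions = standardized concentration
      reference = pd.DataFrame({"sampling_hour": [10.0], "storage_days": [1.0],
                                "log_creatinine": [df["log_creatinine"].mean()]})
      df["log_conc_std"] = step1.resid + float(step1.predict(reference).iloc[0])
      print(df[["log_conc", "log_conc_std"]].corr())   # standardized and raw values stay highly correlated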

  12. Comparison of Preloaded Bougie versus Standard Bougie Technique for Endotracheal Intubation in a Cadaveric Model.

    PubMed

    Baker, Jay B; Maskell, Kevin F; Matlock, Aaron G; Walsh, Ryan M; Skinner, Carl G

    2015-07-01

    We compared intubating with a preloaded bougie (PB) against standard bougie technique in terms of success rates, time to successful intubation and provider preference on a cadaveric airway model. In this prospective, crossover study, healthcare providers intubated a cadaver using the PB technique and the standard bougie technique. Participants were randomly assigned to start with either technique. Following standardized training and practice, procedural success and time for each technique was recorded for each participant. Subsequently, participants were asked to rate their perceived ease of intubation on a visual analogue scale of 1 to 10 (1=difficult and 10=easy) and to select which technique they preferred. 47 participants with variable experience intubating were enrolled at an emergency medicine intern airway course. The success rate of all groups for both techniques was equal (95.7%). The range of times to completion for the standard bougie technique was 16.0-70.2 seconds, with a mean time of 29.7 seconds. The range of times to completion for the PB technique was 15.7-110.9 seconds, with a mean time of 29.4 seconds. There was a non-significant difference of 0.3 seconds (95% confidence interval -2.8 to 3.4 seconds) between the two techniques. Participants rated the relative ease of intubation as 7.3/10 for the standard technique and 7.6/10 for the preloaded technique (p=0.53, 95% confidence interval of the difference -0.97 to 0.50). Thirty of 47 participants subjectively preferred the PB technique (p=0.039). There was no significant difference in success or time to intubation between standard bougie and PB techniques. The majority of participants in this study preferred the PB technique. Until a clear and clinically significant difference is found between these techniques, emergency airway operators should feel confident in using the technique with which they are most comfortable.

  13. A simple approach to power and sample size calculations in logistic regression and Cox regression models.

    PubMed

    Vaeth, Michael; Skovlund, Eva

    2004-06-15

    For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
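
    A back-of-the-envelope version of the recipe for logistic regression: choose the equivalent two-sample problem whose log-odds differ by the slope times twice the SD of the covariate, keep the overall event probability fixed, and apply a standard two-proportion power calculation. The sketch below assumes a normal covariate and uses statsmodels' two-proportion power routine; it is an approximation in the spirit of the abstract, not the authors' exact derivation, and all parameter values are illustrative.

      import numpy as np
      from scipy.optimize import brentq
      from scipy.special import expit, logit
      from statsmodels.stats.power import NormalIndPower
      from statsmodels.stats.proportion import proportion_effectsize

      beta, sd_x = 0.4, 1.0        # assumed slope (log-odds per unit of x) and SD of the covariate
      p_overall, n, alpha = 0.3, 200, 0.05
      delta = beta * 2 * sd_x      # log-odds difference between the two equivalent samples

      # Find the two response probabilities with the given logit difference and fixed average
      p2 = brentq(lambda p: (expit(logit(p) + delta) + p) / 2 - p_overall, 1e-6, 1 - 1e-6)
      p1 = expit(logit(p2) + delta)

      power = NormalIndPower().power(proportion_effectsize(p1, p2), nobs1=n / 2,
                                     alpha=alpha, ratio=1.0)
      print(f"p1={p1:.3f}, p2={p2:.3f}, approximate power={power:.3f}")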

  14. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

    PubMed

    Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to interpret. Our objective was to explore the use of conditional logistic regression to increase prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and a decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tended to achieve sensitivity more than 10% better than the standard classification models, which can be translated to correct labeling of an additional 400-500 readmissions for heart failure patients in the state of California over a year. Several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.
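
    As described, the "conditional" models are essentially separate logistic regressions fitted within strata suggested by a shallow decision tree. The sketch below mimics that flow on synthetic data; the features are placeholders, not HCUP variables, and the stratification rule is whatever the toy tree happens to pick.

      import numpy as np
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(7)
      X = rng.normal(size=(1000, 5))                   # stand-ins for chronic conditions, procedures, etc.
      y = (0.8 * X[:, 0] * (X[:, 1] > 0) - 0.5 * X[:, 2] + rng.normal(size=1000) > 0).astype(int)

      # Step 1: a shallow tree suggests a simple, interpretable stratification rule
      tree = DecisionTreeClassifier(max_depth=1).fit(X, y)
      feat, thr = tree.tree_.feature[0], tree.tree_.threshold[0]
      strata = X[:, feat] > thr

      # Step 2: a separate logistic regression per stratum (the "conditional" models)
      for label, mask in (("stratum 0", ~strata), ("stratum 1", strata)):
          clf = LogisticRegression().fit(X[mask], y[mask])
          acc = clf.score(X[mask], y[mask])
          print(f"{label}: n={mask.sum()}, in-sample accuracy={acc:.3f}")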

  15. The Impact of Adherence and Instillation Proficiency of Topical Glaucoma Medications on Intraocular Pressure.

    PubMed

    Atey, Tesfay Mehari; Shibeshi, Workineh; T Giorgis, Abeba; Asgedom, Solomon Weldegebreal

    2017-01-01

    The possible sequelae of poorly controlled intraocular pressure (IOP) include treatment failure, unnecessary medication use, and economic burden on patients with glaucoma. To assess the impact of adherence and instillation technique on IOP control, a cross-sectional study was conducted on 359 glaucoma patients in Menelik II Hospital from June 1 to July 31, 2015. After conducting a Q-Q analysis, multiple binary logistic analyses, linear regression analyses, and a two-tailed paired t-test were conducted to compare baseline versus current IOP measurements. Intraocular pressure was controlled in 59.6% of the patients and was relatively well controlled during the study period (mean M = 17.911 mmHg, standard deviation S = 0.323) compared to the baseline (M = 20.866 mmHg, S = 0.383; t(358) = -6.70, p < 0.0001). A unit increase in the administration technique score resulted in a 0.272 mmHg decrease in IOP (p = 0.03). Moreover, primary angle-closure glaucoma (adjusted odds ratio (AOR) = 0.347, 95% confidence interval (CI): 0.144-0.836) and use of two medications (AOR = 1.869, 95% CI: 1.259-9.379) were factors affecting IOP. Good instillation technique of the medications was correlated with a reduction in IOP. Consequently, regular assessment of the instillation technique and IOP should be done for better management of the disease.

  16. The impact of a standardized program on short and long-term outcomes in bariatric surgery.

    PubMed

    Aird, Lisa N F; Hong, Dennis; Gmora, Scott; Breau, Ruth; Anvari, Mehran

    2017-02-01

    The purpose of this study was to determine whether there has been an improvement in short- and long-term clinical outcomes since 2010, when the Ontario Bariatric Network led a province-wide initiative to establish a standardized system of care for bariatric patients. The system includes nine bariatric centers, a centralized referral system, and a research registry. Standardization of procedures has progressed yearly, including guidelines for preoperative assessment and perioperative care. Analysis of the OBN registry data was performed by fiscal year between April 2010 and March 2015. Three-month overall postoperative complication rates and 30 day postoperative mortality were calculated. The mean percentage of weight loss at 1, 2, and 3 years postoperative, and regression of obesity-related diseases were calculated. The analysis of continuous and nominal data was performed using ANOVA, Chi-square, and McNemar's testing. A multiple logistic regression analysis was performed for factors affecting postoperative complication rate. Eight thousand and forty-three patients were included in the bariatric registry between April 2010 and March 2015. Thirty-day mortality was rare (<0.075 %) and showed no significant difference between years. Three-month overall postoperative complication rates significantly decreased with standardization (p < 0.001), as did intra-operative complication rates (p < -0.001). Regression analysis demonstrated increasing standardization to be a predictor of 3 month complication rate OR of 0.59 (95 %CI 0.41-0.85, p = 0.00385). The mean percentage of weight loss at 1, 2, and 3 years postoperative showed stability at 33.2 % (9.0 SD), 34.1 % (10.1 SD), and 32.7 % (10.1 SD), respectively. Sustained regression in obesity-related comorbidities was demonstrated at 1, 2, and 3 years postoperative. Evidence indicates the implementation of a standardized system of bariatric care has contributed to improvements in complication rates and supported prolonged weight loss and regression of obesity-related diseases in patients undergoing bariatric surgery in Ontario.

  17. Determination of Flavonoids in Wine by High Performance Liquid Chromatography

    NASA Astrophysics Data System (ADS)

    da Queija, Celeste; Queirós, M. A.; Rodrigues, Ligia M.

    2001-02-01

    The experiment presented is an application of HPLC to the analysis of flavonoids in wines, designed for students of instrumental methods. It is done in two successive 4-hour laboratory sessions. While the hydrolysis of the wines is in progress, the students prepare the calibration curves with standard solutions of flavonoids and calculate the regression lines and correlation coefficients. During the second session they analyze the hydrolyzed wine samples and calculate the concentrations of the flavonoids using the calibration curves obtained earlier. This laboratory work is very attractive to students because they deal with a common daily product whose components are reported to have preventive and therapeutic effects. Furthermore, students can execute preparative work and apply a more elaborate technique that is nowadays an indispensable tool in instrumental analysis.
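
    For readers who want to reproduce the data treatment outside a spreadsheet, the short Python sketch below fits a calibration line to hypothetical peak-area data for one flavonoid standard and back-calculates a sample concentration; the concentrations and peak areas are invented for illustration and are not taken from the experiment.

    ```python
    import numpy as np

    # Hypothetical calibration data for one flavonoid standard (e.g. quercetin):
    # HPLC peak area versus concentration of the standard solutions.
    conc = np.array([2.0, 5.0, 10.0, 20.0, 40.0])            # mg/L (illustrative)
    peak_area = np.array([10.3, 25.9, 51.2, 103.8, 206.1])   # arbitrary units

    slope, intercept = np.polyfit(conc, peak_area, 1)         # regression line
    r = np.corrcoef(conc, peak_area)[0, 1]                    # correlation coefficient
    print(f"area = {slope:.2f} * conc + {intercept:.2f},  r = {r:.4f}")

    # Concentration of a flavonoid in a hydrolysed wine sample from its peak area
    sample_area = 66.4
    print(f"sample concentration = {(sample_area - intercept) / slope:.2f} mg/L")
    ```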

  18. Coronary artery wall imaging.

    PubMed

    Keegan, Jennifer

    2015-05-01

    Like X-ray contrast angiography, MR coronary angiograms show the vessel lumens rather than the vessels themselves. Consequently, outward remodeling of the vessel wall, which occurs in subclinical coronary disease before luminal narrowing, cannot be seen. The current gold standard for assessing the coronary vessel wall is intravascular ultrasound, and more recently, optical coherence tomography, both of which are invasive and involve exposure to ionizing radiation. A noninvasive, low-risk technique for assessing the vessel wall would be beneficial to cardiologists interested in the early detection of preclinical disease and for the safe monitoring of the progression or regression of disease in longitudinal studies. In this review article, the current state of the art in MR coronary vessel wall imaging is discussed, together with validation studies and recent developments. © 2014 Wiley Periodicals, Inc.

  19. The effect of biological movement variability on the performance of the golf swing in high- and low-handicapped players.

    PubMed

    Bradshaw, Elizabeth J; Keogh, Justin W L; Hume, Patria A; Maulder, Peter S; Nortje, Jacques; Marnewick, Michel

    2009-06-01

    The purpose of this study was to examine the role of neuromotor noise on golf swing performance in high- and low-handicap players. Selected two-dimensional kinematic measures of 20 male golfers (n=10 per high- or low-handicap group) performing 10 golf swings with a 5-iron club were obtained through video analysis. Neuromotor noise was calculated by subtracting the standard error of measurement from the coefficient of variation obtained from intra-individual analysis. Statistical methods included linear regression analysis and one-way analysis of variance using SPSS. Absolute invariance in the key technical positions (e.g., at the top of the backswing) of the golf swing appears to be a more favorable technique for skilled performance.

  20. A birth cohort analysis of dental contact among elderly Americans.

    PubMed Central

    Wolinsky, F D; Arnold, C L

    1989-01-01

    We applied standard cohort and multiple regression techniques to data on the dental utilization rates of 129,191 elderly individuals taken from the 1972, 1973, 1976, 1977, 1980, and 1981 Health Interview Surveys. The results indicate that the marked variation in dental contact rates is a reflection of cohort succession, and not a function of aging per se. Older cohorts having lower dental contact rates are being replaced by younger cohorts having higher dental contact rates. The dental contact rates of the individual birth cohorts themselves are quite stable over time. The results also indicate that economic barriers (especially liquid assets) have become more important than ever before, especially for the oldest-old. These findings have important implications for public policy about the oral health and health care of elderly Americans. PMID:2783297

  1. Cytogenetic status and oxidative DNA-damage induced by atorvastatin in human peripheral blood lymphocytes: Standard and Fpg-modified comet assay

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gajski, Goran; Garaj-Vrhovac, Vera; Orescanin, Visnja

    2008-08-15

    To investigate the genotoxic potential of atorvastatin on human lymphocytes in vitro, the standard comet assay was used to evaluate basal DNA damage, and the Fpg-modified version of the comet assay was conducted to investigate possible oxidative DNA damage produced by reactive oxygen species (ROS). In addition to these techniques, new criteria for scoring the micronucleus test were applied for a more complete detection of baseline damage in binuclear lymphocytes exposed to atorvastatin 80 mg/day for different time periods, by measuring the frequency of micronuclei, nucleoplasmic bridges and nuclear buds. All parameters obtained with the standard comet assay and the Fpg-modified comet assay were significantly higher in the treated than in control lymphocytes. The Fpg-modified comet assay showed a significantly greater tail length, tail intensity, and tail moment in all treated lymphocytes than did the standard comet assay, which suggests that oxidative stress is likely to be responsible for DNA damage. DNA damage detected by the standard comet assay indicates that some other mechanism is also involved. In addition to the comet assay, the total numbers of micronuclei, nucleoplasmic bridges and nuclear buds were significantly higher in the exposed than in control lymphocytes. Regression analyses showed a positive correlation between the results obtained by the comet (Fpg-modified and standard) and micronucleus assays. Overall, the study demonstrated that atorvastatin at its highest dose is capable of producing damage at the level of the DNA molecule and the cell.

  2. Research on the effects of urbanization on small stream flow quantity

    DOT National Transportation Integrated Search

    1978-12-01

    This study is a preliminary investigation into the feasibility of using simple techniques to evaluate the effects of urbanization on flood flows in small streams. A number of regression techniques and computer simulation techniques were evaluated, an...

  3. Wavelet regression model in forecasting crude oil price

    NASA Astrophysics Data System (ADS)

    Hamid, Mohd Helmie; Shabri, Ani

    2017-05-01

    This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and the multiple linear regression (MLR) model. The original time series was decomposed into sub-time series at different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series was used in this study to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), the Autoregressive Integrated Moving Average (ARIMA) model and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using the root mean square error (RMSE) and mean absolute error (MAE). Based on the experimental results, the WMLR model performs better than the other forecasting techniques tested in this study.
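
    As a rough sketch of the wavelet-plus-regression idea described above (not the authors' implementation), the following Python snippet decomposes a synthetic daily price series with PyWavelets and regresses the next day's price on the reconstructed approximation and detail components; the wavelet choice, single decomposition level, lag structure and data are all illustrative assumptions.

    ```python
    import numpy as np
    import pywt
    from sklearn.linear_model import LinearRegression

    # Hypothetical daily price series standing in for WTI crude oil prices.
    rng = np.random.default_rng(0)
    price = np.cumsum(rng.normal(0, 1, 512)) + 50.0

    # Single-level discrete wavelet transform: approximation (trend) and detail parts.
    cA, cD = pywt.dwt(price, "db4")
    approx = pywt.waverec([cA, np.zeros_like(cD)], "db4")[: len(price)]
    detail = pywt.waverec([np.zeros_like(cA), cD], "db4")[: len(price)]

    # Lagged inputs: today's approximation and detail predict tomorrow's price.
    X = np.column_stack([approx[:-1], detail[:-1]])
    y = price[1:]
    split = int(0.8 * len(y))

    wmlr = LinearRegression().fit(X[:split], y[:split])
    pred = wmlr.predict(X[split:])
    rmse = np.sqrt(np.mean((y[split:] - pred) ** 2))
    mae = np.mean(np.abs(y[split:] - pred))
    print(f"RMSE = {rmse:.3f}  MAE = {mae:.3f}")
    ```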

  4. Feasibility of quantification of the distribution of blood flow in the normal human fetal circulation using CMR: a cross-sectional study.

    PubMed

    Seed, Mike; van Amerom, Joshua F P; Yoo, Shi-Joon; Al Nafisi, Bahiyah; Grosse-Wortmann, Lars; Jaeggi, Edgar; Jansz, Michael S; Macgowan, Christopher K

    2012-11-26

    We present the first phase contrast (PC) cardiovascular magnetic resonance (CMR) measurements of the distribution of blood flow in twelve late gestation human fetuses. These were obtained using a retrospective gating technique known as metric optimised gating (MOG). A validation experiment was performed in five adult volunteers where conventional cardiac gating was compared with MOG. Linear regression and Bland Altman plots were used to compare MOG with the gold standard of conventional gating. Measurements using MOG were then made in twelve normal fetuses at a median gestational age of 37 weeks (range 30-39 weeks). Flow was measured in the major fetal vessels and indexed to the fetal weight. There was good correlation between the conventional gated and MOG measurements in the adult validation experiment (R=0.96). Mean flows in ml/min/kg with standard deviations in the major fetal vessels were as follows: combined ventricular output (CVO) 540 ± 101, main pulmonary artery (MPA) 327 ± 68, ascending aorta (AAo) 198 ± 38, superior vena cava (SVC) 147 ± 46, ductus arteriosus (DA) 220 ± 39, pulmonary blood flow (PBF) 106 ± 59, descending aorta (DAo) 273 ± 85, umbilical vein (UV) 160 ± 62, foramen ovale (FO) 107 ± 54. Results expressed as mean percentages of the CVO with standard deviations were as follows: MPA 60 ± 4, AAo 37 ± 4, SVC 28 ± 7, DA 41 ± 8, PBF 19 ± 10, DAo 50 ± 12, UV 30 ± 9, FO 21 ± 12. This study demonstrates how PC CMR with MOG is a feasible technique for measuring the distribution of the normal human fetal circulation in late pregnancy. Our preliminary results are in keeping with findings from previous experimental work in fetal lambs.

  5. Hypothesis Testing Using Factor Score Regression: A Comparison of Four Methods

    ERIC Educational Resources Information Center

    Devlieger, Ines; Mayer, Axel; Rosseel, Yves

    2016-01-01

    In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias avoiding method of Skrondal and Laake, and the bias correcting method of Croon. The bias correcting method is extended to include a reliable standard error. The four methods are compared with each other and…

  6. The Equivalence of Regression Models Using Difference Scores and Models Using Separate Scores for Each Informant: Implications for the Study of Informant Discrepancies

    ERIC Educational Resources Information Center

    Laird, Robert D.; Weems, Carl F.

    2011-01-01

    Research on informant discrepancies has increasingly utilized difference scores. This article demonstrates the statistical equivalence of regression models using difference scores (raw or standardized) and regression models using separate scores for each informant to show that interpretations should be consistent with both models. First,…

  7. SPSS macros to compare any two fitted values from a regression model.

    PubMed

    Weaver, Bruce; Dubois, Sacha

    2012-12-01

    In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests, particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
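
    The same matrix-algebra idea is straightforward to reproduce outside SPSS. The sketch below, using statsmodels on simulated data, contrasts two covariate profiles x1 and x2 and computes the difference in fitted values, its standard error from (x1 - x2)' Cov(b) (x1 - x2), and a 95% confidence interval; the model, the profiles and the data are hypothetical.

    ```python
    import numpy as np
    import statsmodels.api as sm

    # Simulated data with an interaction term, a case where no single coefficient
    # equals the fitted-value contrast of interest.
    rng = np.random.default_rng(1)
    n = 200
    age = rng.uniform(20, 70, n)
    treat = rng.integers(0, 2, n)
    y = 5 + 0.3 * age + 2 * treat + 0.05 * age * treat + rng.normal(0, 3, n)

    X = sm.add_constant(np.column_stack([age, treat, age * treat]))
    fit = sm.OLS(y, X).fit()

    # Two covariate profiles to compare: treated vs untreated, both at age 50.
    x1 = np.array([1.0, 50.0, 1.0, 50.0])   # [const, age, treat, age*treat]
    x2 = np.array([1.0, 50.0, 0.0, 0.0])
    contrast = x1 - x2

    diff = contrast @ fit.params                           # difference of fitted values
    se = np.sqrt(contrast @ fit.cov_params() @ contrast)   # its standard error
    lo, hi = diff - 1.96 * se, diff + 1.96 * se
    print(f"difference = {diff:.2f}, SE = {se:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
    ```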

  8. Comprehensive Assessment of Coronary Artery Disease by Using First-Pass Analysis Dynamic CT Perfusion: Validation in a Swine Model.

    PubMed

    Hubbard, Logan; Lipinski, Jerry; Ziemer, Benjamin; Malkasian, Shant; Sadeghi, Bahman; Javan, Hanna; Groves, Elliott M; Dertli, Brian; Molloi, Sabee

    2018-01-01

    Purpose To retrospectively validate a first-pass analysis (FPA) technique that combines computed tomographic (CT) angiography and dynamic CT perfusion measurement into one low-dose examination. Materials and Methods The study was approved by the animal care committee. The FPA technique was retrospectively validated in six swine (mean weight, 37.3 kg ± 7.5 [standard deviation]) between April 2015 and October 2016. Four to five intermediate-severity stenoses were generated in the left anterior descending artery (LAD), and 20 contrast material-enhanced volume scans were acquired per stenosis. All volume scans were used for maximum slope model (MSM) perfusion measurement, but only two volume scans were used for FPA perfusion measurement. Perfusion measurements in the LAD, left circumflex artery (LCx), right coronary artery, and all three coronary arteries combined were compared with microsphere perfusion measurements by using regression, root-mean-square error, root-mean-square deviation, Lin concordance correlation, and diagnostic outcomes analysis. The CT dose index and size-specific dose estimate per two-volume FPA perfusion measurement were also determined. Results FPA and MSM perfusion measurements (P_FPA and P_MSM) in all three coronary arteries combined were related to the reference standard microsphere perfusion measurements (P_MICRO) as follows: P_FPA(combined) = 1.02 P_MICRO(combined) + 0.11 (r = 0.96) and P_MSM(combined) = 0.28 P_MICRO(combined) + 0.23 (r = 0.89). The CT dose index and size-specific dose estimate per two-volume FPA perfusion measurement were 10.8 and 17.8 mGy, respectively. Conclusion The FPA technique was retrospectively validated in a swine model and has the potential to be used for accurate, low-dose vessel-specific morphologic and physiologic assessment of coronary artery disease. © RSNA, 2017.

  9. Quantitative analysis of binary polymorphs mixtures of fusidic acid by diffuse reflectance FTIR spectroscopy, diffuse reflectance FT-NIR spectroscopy, Raman spectroscopy and multivariate calibration.

    PubMed

    Guo, Canyong; Luo, Xuefang; Zhou, Xiaohua; Shi, Beijia; Wang, Juanjuan; Zhao, Jinqi; Zhang, Xiaoxia

    2017-06-05

    Vibrational spectroscopic techniques such as infrared, near-infrared and Raman spectroscopy have become popular for detecting and quantifying polymorphism of pharmaceutics because they are fast and non-destructive. This study assessed the ability of three vibrational spectroscopic techniques combined with multivariate analysis to quantify a low-content undesired polymorph within a binary polymorphic mixture. Partial least squares (PLS) regression and support vector machine (SVM) regression were employed to build quantitative models. Fusidic acid, a steroidal antibiotic, was used as the model compound. It was found that PLS regression performed slightly better than SVM regression for all three spectroscopic techniques. Root mean square errors of prediction (RMSEP) ranged from 0.48% to 1.17% for diffuse reflectance FTIR spectroscopy, from 1.60% to 1.93% for diffuse reflectance FT-NIR spectroscopy, and from 1.62% to 2.31% for Raman spectroscopy. The results indicate that diffuse reflectance FTIR spectroscopy offers significant advantages in providing accurate measurement of polymorphic content in the fusidic acid binary mixtures, while Raman spectroscopy is the least accurate technique for quantitative analysis of polymorphs. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Intercomparison of four different in-situ techniques for ambient formaldehyde measurements in urban air

    NASA Astrophysics Data System (ADS)

    Hak, C.; Pundt, I.; Trick, S.; Kern, C.; Platt, U.; Dommen, J.; Ordóñez, C.; Prévôt, A. S. H.; Junkermann, W.; Astorga-Lloréns, C.; Larsen, B. R.; Mellqvist, J.; Strandberg, A.; Yu, Y.; Galle, B.; Kleffmann, J.; Lörzer, J. C.; Braathen, G. O.; Volkamer, R.

    2005-11-01

    Results from an intercomparison of several currently used in-situ techniques for the measurement of atmospheric formaldehyde (CH2O) are presented. The measurements were carried out at Bresso, an urban site in the periphery of Milan (Italy) as part of the FORMAT-I field campaign. Eight instruments were employed by six independent research groups using four different techniques: Differential Optical Absorption Spectroscopy (DOAS), Fourier Transform Infra Red (FTIR) interferometry, the fluorimetric Hantzsch reaction technique (five instruments) and a chromatographic technique employing C18-DNPH-cartridges (2,4-dinitrophenylhydrazine). White type multi-reflection systems were employed for the optical techniques in order to avoid spatial CH2O gradients and ensure the sampling of nearly the same air mass by all instruments. Between 23 and 31 July 2002, up to 13 ppbv of CH2O were observed. The concentrations lay well above the detection limits of all instruments. The formaldehyde concentrations determined with DOAS, FTIR and the Hantzsch instruments were found to agree within ±11%, with the exception of one Hantzsch instrument, which gave systematically higher values. The two hour integrated samples by DNPH yielded up to 25% lower concentrations than the data of the continuously measuring instruments averaged over the same time period. The consistency between the DOAS and the Hantzsch method was better than during previous intercomparisons in ambient air with slopes of the regression line not significantly differing from one. The differences between the individual Hantzsch instruments could be attributed in part to the calibration standards used. Possible systematic errors of the methods are discussed.

  11. Intercomparison of four different in-situ techniques for ambient formaldehyde measurements in urban air

    NASA Astrophysics Data System (ADS)

    Hak, C.; Pundt, I.; Kern, C.; Platt, U.; Dommen, J.; Ordóñez, C.; Prévôt, A. S. H.; Junkermann, W.; Astorga-Lloréns, C.; Larsen, B. R.; Mellqvist, J.; Strandberg, A.; Yu, Y.; Galle, B.; Kleffmann, J.; Lörzer, J. C.; Braathen, G. O.; Volkamer, R.

    2005-05-01

    Results from an intercomparison of several currently used in-situ techniques for the measurement of atmospheric formaldehyde (CH2O) are presented. The measurements were carried out at Bresso, an urban site in the periphery of Milan (Italy) as part of the FORMAT-I field campaign. Eight instruments were employed by six independent research groups using four different techniques: Differential Optical Absorption Spectroscopy (DOAS), Fourier Transform Infra Red (FTIR) interferometry, the fluorimetric Hantzsch reaction technique (five instruments) and a chromatographic technique employing C18-DNPH-cartridges (2,4-dinitrophenylhydrazine). White type multi-reflection systems were employed for the optical techniques in order to avoid spatial CH2O gradients and ensure the sampling of nearly the same air mass by all instruments. Between 23 and 31 July 2002, up to 13 ppbv of CH2O were observed. The concentrations lay well above the detection limits of all instruments. The formaldehyde concentrations determined with DOAS, FTIR and the Hantzsch instruments were found to agree within ±11%, with the exception of one Hantzsch instrument, which gave systematically higher values. The two hour integrated samples by DNPH yielded up to 25% lower concentrations than the data of the continuously measuring instruments averaged over the same time period. The consistency between the DOAS and the Hantzsch method was better than during previous intercomparisons in ambient air with slopes of the regression line not significantly differing from one. The differences between the individual Hantzsch instruments could be attributed in part to the calibration standards used. Possible systematic errors of the methods are discussed.

  12. Combining fibre optic Raman spectroscopy and tactile resonance measurement for tissue characterization

    NASA Astrophysics Data System (ADS)

    Candefjord, Stefan; Nyberg, Morgan; Jalkanen, Ville; Ramser, Kerstin; Lindahl, Olof A.

    2010-12-01

    Tissue characterization is fundamental for the identification of pathological conditions. Raman spectroscopy (RS) and tactile resonance measurement (TRM) are two promising techniques that measure biochemical content and stiffness, respectively. They have potential to complement the gold standard, histological analysis. By combining RS and TRM, complementary information about tissue content can be obtained and specific drawbacks can be avoided. The aim of this study was to develop a multivariate approach to compare RS and TRM information. The approach was evaluated on measurements at the same points on porcine abdominal tissue. The measurement points were divided into five groups by multivariate analysis of the RS data. A regression analysis was performed and receiver operating characteristic (ROC) curves were used to compare the RS and TRM data. TRM identified one group efficiently (area under ROC curve 0.99). The RS data showed that the proportion of saturated fat was high in this group. The regression analysis showed that stiffness was mainly determined by the amount of fat and its composition. We concluded that RS provided additional, important information for tissue identification that was not provided by TRM alone. The results are promising for the development of a method combining RS and TRM for intraoperative tissue characterization.

  13. Prediction of anthropometric measurements from tooth length--A Dravidian study.

    PubMed

    Sunitha, J; Ananthalakshmi, R; Sathiya, Jeeva J; Nadeem, Jeddy; Dhanarathnam, Shanmugam

    2015-12-01

    Anthropometric measurement is essential for the identification of both victims and suspects. Often, these data are not readily available in a crime scene situation, and the availability of one data set should help in predicting the other. This study was based on the hypothesis of a correlation between tooth length and various body measurements. The aims were to correlate face, palm, foot and stature measurements with tooth length and to derive regression formulae to estimate these measurements from tooth length. The present study was conducted on Dravidian dental students aged 18-25 years, with a sample size of 372. All of the dental and physical parameters were measured using standard anthropometric equipment and techniques. The data were analysed using SPSS software, and the methods used for statistical analysis were linear regression analysis and Pearson correlation. The parameters incisor height (IH), face height (FH), palm length (PL), foot length (FL) and stature (S) showed nil to mild correlation with tooth length (R = 0.2-0.4), except for palm length (PL) and foot length (FL) (R > 0.6). It is concluded that odontometric data are not a reliable source for estimating face height (FH), palm length (PL), foot length (FL) and stature (S).

  14. Feasibility of using a miniature NIR spectrometer to measure volumic mass during alcoholic fermentation.

    PubMed

    Fernández-Novales, Juan; López, María-Isabel; González-Caballero, Virginia; Ramírez, Pilar; Sánchez, María-Teresa

    2011-06-01

    Volumic mass, a key component of must quality control tests during alcoholic fermentation, is of great interest to the winemaking industry. Transmittance near-infrared (NIR) spectra of 124 must samples over the range 200-1,100 nm were obtained using a miniature spectrometer. The ability of this instrument to predict volumic mass was evaluated using partial least squares (PLS) regression and multiple linear regression (MLR). For the equations developed to fit the reference volumic mass data to the spectral data, the validation statistics were a coefficient of determination (r(2)) of 0.98 with a standard error of prediction (SEP) of 5.85 g/dm(3) for PLS, and r(2) = 0.96 with SEP = 7.49 g/dm(3) for MLR (n = 31 in both cases). Comparison of the MLR and PLS results demonstrates that an MLR model with six significant wavelengths (P < 0.05) fit the volumic mass data to the transmittance (1/T) data slightly worse than the more sophisticated PLS model using the full scanning range. The results suggest that NIR spectroscopy is a suitable technique for predicting volumic mass during alcoholic fermentation, and that a low-cost NIR instrument can be used for this purpose.
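
    A rough scikit-learn analogue of the PLS-versus-MLR comparison on synthetic "spectra": PLS is fitted on the full wavelength range and MLR on six pre-selected wavelengths, and both are scored by a standard error of prediction on a held-out set of 31 samples; the wavelength indices, component count and data are invented.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for must spectra: 124 samples x 400 spectral variables.
    rng = np.random.default_rng(2)
    spectra = rng.normal(size=(124, 400))
    idx = [30, 80, 150, 220, 300, 370]                     # "significant" wavelengths
    volumic_mass = (spectra[:, idx] @ np.array([2, -1, 3, 1, -2, 1.5])
                    + 1090 + rng.normal(0, 5, 124))

    Xtr, Xte, ytr, yte = train_test_split(spectra, volumic_mass,
                                          test_size=31, random_state=0)

    # PLS on the full spectral range.
    pls = PLSRegression(n_components=8).fit(Xtr, ytr)
    sep_pls = np.sqrt(np.mean((yte - pls.predict(Xte).ravel()) ** 2))

    # MLR restricted to the six selected wavelengths.
    mlr = LinearRegression().fit(Xtr[:, idx], ytr)
    sep_mlr = np.sqrt(np.mean((yte - mlr.predict(Xte[:, idx])) ** 2))
    print(f"SEP  PLS = {sep_pls:.2f}  MLR = {sep_mlr:.2f} g/dm^3")
    ```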

  15. Sparse kernel methods for high-dimensional survival data.

    PubMed

    Evers, Ludger; Messow, Claudia-Martina

    2008-07-15

    Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques, however, are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be 'kernelized'. The kernelized proportional hazards model, however, yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, depending only on a small fraction of the training data. We propose two methods. One is based on a geometric idea, where, akin to support vector classification, the margin between the failed observation and the observations currently at risk is maximised. The other approach is based on obtaining a sparse model by adding observations one after another, akin to the Import Vector Machine (IVM). Data examples studied suggest that both methods can outperform competing approaches. Software is available under the GNU Public License as an R package and can be obtained from the first author's website http://www.maths.bris.ac.uk/~maxle/software.html.

  16. Evaluation of age determination techniques for gray wolves

    USGS Publications Warehouse

    Landon, D.B.; Waite, C.A.; Peterson, R.O.; Mech, L.D.

    1998-01-01

    We evaluated tooth wear, cranial suture fusion, closure of the canine pulp cavity, and cementum annuli as methods of age determination for known- and unknown-age gray wolves (Canis lupus) from Alaska, Minnesota, Ontario, and Isle Royale, Michigan. We developed age classes for cranial suture closure and tooth wear. We used measurement data obtained from known-age captive and wild wolves to generate a regression equation to predict age based on the degree of closure of the canine pulp cavity. Cementum annuli were studied in known- and unknown-age animals, and calcified, unstained thin sections were found to provide clear annulus patterns under polarized transmitted light. Annuli counts varied among observers, partly because of variation in the pattern of annuli in different regions of the cementum. This variation emphasizes the need for standardized models of cementum analysis. Cranial suture fusion is of limited utility in age determination, while tooth wear can be used to estimate age of adult wolves within 4 years. Wolves less than 7 years old could be aged to within 1-3 years with the regression equation for closure of the canine pulp cavity. Although inaccuracy remains a problem, cementum-annulus counts were the most promising means of estimating age for gray wolves.

  17. Near infrared spectroscopy for prediction of antioxidant compounds in the honey.

    PubMed

    Escuredo, Olga; Seijo, M Carmen; Salvador, Javier; González-Martín, M Inmaculada

    2013-12-15

    The selection of antioxidant variables in honey is considered for the first time using the near infrared (NIR) spectroscopic technique. A total of 60 honey samples were used to develop the calibration models using the modified partial least squares (MPLS) regression method, and 15 samples were used for external validation. Calibration models on the honey matrix for the estimation of phenols, flavonoids, vitamin C, antioxidant capacity (DPPH), oxidation index and copper using near infrared (NIR) spectroscopy were satisfactorily obtained. These models were optimised by cross-validation, and the best model was evaluated according to the multiple correlation coefficient (RSQ), standard error of cross-validation (SECV), ratio performance deviation (RPD) and root mean standard error (RMSE) in the prediction set. These statistics suggested that the equations developed could be used for rapid determination of antioxidant compounds in honey. This work shows that near infrared spectroscopy can be considered a rapid tool for the nondestructive measurement of antioxidant constituents such as phenols, flavonoids, vitamin C and copper, as well as the antioxidant capacity of honey. Copyright © 2013 Elsevier Ltd. All rights reserved.

  18. Comparison of the acetyl bromide spectrophotometric method with other analytical lignin methods for determining lignin concentration in forage samples.

    PubMed

    Fukushima, Romualdo S; Hatfield, Ronald D

    2004-06-16

    Present analytical methods to quantify lignin in herbaceous plants are not totally satisfactory. A spectrophotometric method, acetyl bromide soluble lignin (ABSL), has been employed to determine lignin concentration in a range of plant materials. In this work, lignin extracted with acidic dioxane was used to develop standard curves and to calculate the derived linear regression equation (slope equals absorptivity value or extinction coefficient) for determining the lignin concentration of respective cell wall samples. This procedure yielded lignin values that were different from those obtained with Klason lignin, acid detergent acid insoluble lignin, or permanganate lignin procedures. Correlations with in vitro dry matter or cell wall digestibility of samples were highest with data from the spectrophotometric technique. The ABSL method employing as standard lignin extracted with acidic dioxane has the potential to be employed as an analytical method to determine lignin concentration in a range of forage materials. It may be useful in developing a quick and easy method to predict in vitro digestibility on the basis of the total lignin content of a sample.
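
    A minimal sketch of the standard-curve arithmetic described above, with invented absorbance and concentration values: the slope of the regression line is the absorptivity (extinction coefficient), which converts the absorbance of a cell-wall digest into a lignin concentration.

    ```python
    import numpy as np
    from scipy import stats

    # Hypothetical standard curve: absorbance of acidic dioxane lignin standards
    # dissolved in the acetyl bromide reagent (values are illustrative only).
    conc = np.array([0.05, 0.10, 0.20, 0.30, 0.40])             # mg/mL
    absorbance = np.array([0.112, 0.221, 0.446, 0.668, 0.885])  # at 280 nm

    fit = stats.linregress(conc, absorbance)
    print(f"absorptivity (slope) = {fit.slope:.3f} mL/(mg*cm), r^2 = {fit.rvalue**2:.4f}")

    # Lignin concentration of an unknown cell-wall digest from its absorbance.
    a_unknown = 0.513
    lignin_conc = (a_unknown - fit.intercept) / fit.slope
    print(f"estimated lignin concentration = {lignin_conc:.3f} mg/mL")
    ```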

  19. Proposed standard-weight (W(s)) equations for kokanee, golden trout and bull trout

    USGS Publications Warehouse

    Hyatt, M.H.; Hubert, W.A.

    2000-01-01

    We developed standard-weight (W(s)) equations for kokanee (lacustrine Oncorhynchus nerka), golden trout (O. aguabonita), and bull trout (Salvelinus confluentus) using the regression-line-percentile technique. The W(s) equation for kokanee of 120-550 mm TL is log10 W(s) = -5.062 + 3.033 log10 TL, when W(s) is in grams and TL is total length in millimeters; the English-unit equivalent is log10 W(s) = -3.458 + 3.033 log10 TL, when W(s) is in pounds and TL is total length in inches. The W(s) equation for golden trout of 120-530 mm TL is log10 W(s) = -5.088 + 3.041 log10 TL, with the English-unit equivalent being log10 W(s) = -3.473 + 3.041 log10 TL. The W(s) equation for bull trout of 120-850 mm TL is log10 W(s) = -5.327 + 3.115 log10 TL, with the English-unit equivalent being log10 W(s) = -3.608 + 3.115 log10 TL.
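
    The published equations are straightforward to apply in code. The sketch below evaluates the metric-form equations from the abstract and, as a typical use, computes a relative weight (Wr = 100 * W / Ws) for a hypothetical fish; the example length and weight are invented.

    ```python
    import math

    # Standard-weight (Ws) equations from the abstract (metric form):
    # log10(Ws in g) = intercept + slope * log10(TL in mm)
    WS_PARAMS = {
        "kokanee":      (-5.062, 3.033),   # valid for 120-550 mm TL
        "golden trout": (-5.088, 3.041),   # valid for 120-530 mm TL
        "bull trout":   (-5.327, 3.115),   # valid for 120-850 mm TL
    }

    def standard_weight_g(species: str, total_length_mm: float) -> float:
        """Return the standard weight in grams for a given total length."""
        intercept, slope = WS_PARAMS[species]
        return 10 ** (intercept + slope * math.log10(total_length_mm))

    # Example: relative weight of a hypothetical 400-mm kokanee weighing 520 g.
    ws = standard_weight_g("kokanee", 400)
    print(f"Ws = {ws:.0f} g, Wr = {100 * 520 / ws:.0f}")
    ```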

  20. Using "Excel" for White's Test--An Important Technique for Evaluating the Equality of Variance Assumption and Model Specification in a Regression Analysis

    ERIC Educational Resources Information Center

    Berenson, Mark L.

    2013-01-01

    There is consensus in the statistical literature that severe departures from its assumptions invalidate the use of regression modeling for purposes of inference. The assumptions of regression modeling are usually evaluated subjectively through visual, graphic displays in a residual analysis but such an approach, taken alone, may be insufficient…

  1. Time series forecasting using ERNN and QR based on Bayesian model averaging

    NASA Astrophysics Data System (ADS)

    Pwasong, Augustine; Sathasivam, Saratha

    2017-08-01

    The Bayesian model averaging technique is a multi-model combination technique. It was employed here to amalgamate the Elman recurrent neural network (ERNN) technique with the quadratic regression (QR) technique, producing a hybrid technique known as the hybrid ERNN-QR technique. The forecasting potential of the hybrid technique is compared with the forecasting capabilities of the individual ERNN and QR techniques. The outcome revealed that the hybrid technique is superior to the individual techniques in the mean square error sense.

  2. Reference values for airway resistance in newborns, infants and preschoolers from a Latin American population.

    PubMed

    Gochicoa, Laura G; Thomé-Ortiz, Laura P; Furuya, María E Y; Canto, Raquel; Ruiz-García, Martha E; Zúñiga-Vázquez, Guillermo; Martínez-Ramírez, Filiberto; Vargas, Mario H

    2012-05-01

    Several studies have determined reference values for airway resistance measured by the interrupter technique (Rint) in paediatric populations, but only one has been done on Latin American children, and no studies have been performed on Mexican children. Moreover, these previous studies mostly included children aged 3 years and older; therefore, information regarding Rint reference values for newborns and infants is scarce. Rint measurements were performed on preschool children attending eight kindergartens (Group 1) and also on sedated newborns, infants and preschool children admitted to a tertiary-level paediatric hospital due to non-cardiopulmonary disorders (Group 2). In both groups, Rint values were inversely associated with age, weight and height, but the strongest association was with height. The linear regression equation for Group 1 (n = 209, height 86-129 cm) was Rint = 2.153 - 0.012 × height (cm) (standard deviation of residuals 0.181 kPa/L/s). The linear regression equation for Group 2 (n = 55, height 52-113 cm) was Rint = 4.575 - 0.035 × height (cm) (standard deviation of residuals 0.567 kPa/L/s). Girls tended to have slightly higher Rint values than boys, a difference that diminished with increasing height. In this study, Rint reference values applicable to Mexican children were determined, and these values are probably also applicable to other paediatric populations with similar Spanish-Amerindian ancestries. There was an inverse relationship between Rint and height, with relatively large between-subject variability. © 2012 The Authors. Respirology © 2012 Asian Pacific Society of Respirology.
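
    A minimal sketch of how the reported reference equations might be applied, returning the predicted Rint and the residual standard deviation for a given height and expressing a measured value as a z-score; the example height and measurement are invented, and using the residual SD this way is an assumption rather than part of the study.

    ```python
    def rint_predicted(height_cm: float, sedated: bool = False) -> tuple[float, float]:
        """Predicted airway resistance (kPa/L/s) and residual SD, from the
        height-based linear regression equations reported in the abstract."""
        if sedated:
            # Group 2: sedated newborns, infants and preschoolers, height 52-113 cm.
            return 4.575 - 0.035 * height_cm, 0.567
        # Group 1: unsedated preschoolers, height 86-129 cm.
        return 2.153 - 0.012 * height_cm, 0.181

    # Example: z-score for a measured Rint of 1.35 kPa/L/s in a 100-cm preschooler.
    pred, sd = rint_predicted(100.0)
    z = (1.35 - pred) / sd
    print(f"predicted = {pred:.3f} kPa/L/s, z-score = {z:.2f}")
    ```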

  3. Relative Motion of the WDS 05110+3203 STF 648 System, With a Protocol for Calculating Relative Motion

    NASA Astrophysics Data System (ADS)

    Wiley, E. O.

    2010-07-01

    The relative motion of visual double stars can be investigated using least squares regression techniques and readily accessible programs such as Microsoft Excel and a calculator. Optical pairs differ from physical pairs under most geometries in both their simple scatter plots and their regression models. A step-by-step protocol for estimating the rectilinear elements of an optical pair is presented. The characteristics of physical pairs under these techniques are also discussed.

  4. Near-infrared diffuse reflection systems for chlorophyll content of tomato leaves measurement

    NASA Astrophysics Data System (ADS)

    Jiang, Huanyu; Ying, Yibin; Lu, Huishan

    2006-10-01

    In this study, two measuring systems for the chlorophyll content of tomato leaves were developed based on near-infrared spectral techniques. The systems mainly consist of an FT-IR spectrum analyzer, fiber-optic diffuse reflection accessories and a data card. Diffuse reflectance of intact tomato leaves was measured with a fiber-optic diffuse reflection accessory and a smart diffuse reflection accessory. Calibration models were developed from spectral and constituent measurements; 90 samples served as the calibration set and 30 samples served as the validation set. Partial least squares (PLS) and principal component regression (PCR) techniques were used to develop the prediction models with different data preprocessing. The best model for chlorophyll content had a high correlation coefficient of 0.9348 and a low RMSEP of 4.79 when the full range (12,500-4,000 cm-1) and MSC path-length correction of the log(1/R) spectra were selected. The results of this study suggest that the FT-NIR method is feasible for detecting the chlorophyll content of tomato leaves rapidly and nondestructively.

  5. Rapid evaluation technique to differentiate mushroom disease-related moulds by detecting microbial volatile organic compounds using HS-SPME-GC-MS.

    PubMed

    Radványi, Dalma; Gere, Attila; Jókai, Zsuzsa; Fodor, Péter

    2015-01-01

    Headspace solid-phase microextraction (HS-SPME) coupled with gas chromatography-mass spectrometry (GC-MS) was used to analyse microbial volatile organic compounds (MVOCs) of mushroom disease-related microorganisms. Mycogone perniciosa, Lecanicillum fungicola var. fungicola, and Trichoderma aggressivum f. europaeum species, which are typically harmful in mushroom cultivation, were examined, and Agaricus bisporus (bisporic button mushroom) was also examined as a control. For internal standard, a mixture of alkanes was used; these were introduced as the memory effect of primed septa in the vial seal. Several different marker compounds were found in each sample, which enabled us to distinguish the different moulds and the mushroom mycelium from each other. Monitoring of marker compounds enabled us to investigate the behaviour of moulds. The records of the temporal pattern changes were used to produce partial least squares regression (PLS-R) models that enabled determination of the exact time of contamination (the infection time of the media). Using these evaluation techniques, the presence of mushroom disease-related fungi can be easily detected and monitored via their emitted MVOCs.

  6. Technique for simulating peak-flow hydrographs in Maryland

    USGS Publications Warehouse

    Dillow, Jonathan J.A.

    1998-01-01

    The efficient design and management of many bridges, culverts, embankments, and flood-protection structures may require the estimation of time-of-inundation and (or) storage of floodwater relating to such structures. These estimates can be made on the basis of information derived from the peak-flow hydrograph. Average peak-flow hydrographs corresponding to a peak discharge of specific recurrence interval can be simulated for drainage basins having drainage areas less than 500 square miles in Maryland, using a direct technique of known accuracy. The technique uses dimensionless hydrographs in conjunction with estimates of basin lagtime and instantaneous peak flow. Ordinary least-squares regression analysis was used to develop an equation for estimating basin lagtime in Maryland. Drainage area, main channel slope, forest cover, and impervious area were determined to be the significant explanatory variables necessary to estimate average basin lagtime at the 95-percent confidence interval. Qualitative variables included in the equation adequately correct for geographic bias across the State. The average standard error of prediction associated with the equation is approximated as plus or minus (+/-) 37.6 percent. Volume correction factors may be applied to the basin lagtime on the basis of a comparison between actual and estimated hydrograph volumes prior to hydrograph simulation. Three dimensionless hydrographs were developed and tested using data collected during 278 significant rainfall-runoff events at 81 stream-gaging stations distributed throughout Maryland and Delaware. The data represent a range of drainage area sizes and basin conditions. The technique was verified by applying it to the simulation of 20 peak-flow events and comparing actual and simulated hydrograph widths at 50 and 75 percent of the observed peak-flow levels. The events chosen are considered extreme in that the average recurrence interval of the selected peak flows is 130 years. The average standard errors of prediction were +/- 61 and +/- 56 percent at the 50 and 75 percent of peak-flow hydrograph widths, respectively.

  7. Age-related factors in the relationship between foot measurements and living stature and body weight.

    PubMed

    Atamturk, Derya; Duyar, Izzet

    2008-11-01

    The measurements of feet and footprints are especially important in forensic identification, as they have been used to predict the body height and weight of victims or suspects. It can be observed that the subjects of forensic-oriented studies are generally young adults. That is to say, researchers rarely take into consideration the body's proportional changes with age. Hence, the aim of this study is to generate equations which take age and sex into consideration, when stature and body weight are estimated from foot and footprints dimensions. With this aim in mind, we measured the stature, body weight, foot length and breadth, heel breadth, footprint length and breadth, and footprint heel breadth of 516 volunteers (253 males and 263 females) aged between 17.6 and 82.9 years using standard measurement techniques. The sample population was divided randomly into two groups. Group 1, the study group, consisted of 80% of the sample (n = 406); the remaining 20% were assigned to the cross-validation group or Group 2 (n = 110). In the first stage of the study, we produced equations for estimating stature and weight using a stepwise regression technique. Then, their reliability was tested on Group 2 members. Statistical analyses showed that the ratios of foot dimensions to stature and body weight change considerably with age and sex. Consequently, the regression equations which include these variables yielded more reliable results. Our results indicated that age and sex should be taken into consideration when predicting human body height and weight for forensic purposes.

  8. A Novel Continuous Blood Pressure Estimation Approach Based on Data Mining Techniques.

    PubMed

    Miao, Fen; Fu, Nan; Zhang, Yuan-Ting; Ding, Xiao-Rong; Hong, Xi; He, Qingyun; Li, Ye

    2017-11-01

    Continuous blood pressure (BP) estimation using pulse transit time (PTT) is a promising method for unobtrusive BP measurement. However, the accuracy of this approach must be improved for it to be viable for a wide range of applications. This study proposes a novel continuous BP estimation approach that combines data mining techniques with a traditional mechanism-driven model. First, 14 features derived from simultaneous electrocardiogram and photoplethysmogram signals were extracted for beat-to-beat BP estimation. A genetic algorithm-based feature selection method was then used to select BP indicators for each subject. Multivariate linear regression and support vector regression were employed to develop the BP model. The accuracy and robustness of the proposed approach were validated for static, dynamic, and follow-up performance. Experimental results based on 73 subjects showed that the proposed approach exhibited excellent accuracy in static BP estimation, with a correlation coefficient and mean error of 0.852 and -0.001 ± 3.102 mmHg for systolic BP, and 0.790 and -0.004 ± 2.199 mmHg for diastolic BP. Similar performance was observed for dynamic BP estimation. The robustness results indicated that the estimation accuracy was lower by a certain degree one day after model construction but was relatively stable from one day to six months after construction. The proposed approach is superior to the state-of-the-art PTT-based model, with an approximately 2-mmHg reduction in the standard deviation at different time intervals, thus providing potentially novel insights for cuffless BP estimation.

  9. Probabilistic Forecasting of Surface Ozone with a Novel Statistical Approach

    NASA Technical Reports Server (NTRS)

    Balashov, Nikolay V.; Thompson, Anne M.; Young, George S.

    2017-01-01

    The recent change in the Environmental Protection Agency's surface ozone regulation, lowering the surface ozone daily maximum 8-h average (MDA8) exceedance threshold from 75 to 70 ppbv, poses significant challenges to U.S. air quality (AQ) forecasters responsible for ozone MDA8 forecasts. The forecasters, supplied with only a few AQ model products, end up relying heavily on self-developed tools. To help U.S. AQ forecasters, this study explores a surface ozone MDA8 forecasting tool that is based solely on statistical methods and standard meteorological variables from the numerical weather prediction (NWP) models. The model combines the self-organizing map (SOM), which is a clustering technique, with a stepwise weighted quadratic regression using meteorological variables as predictors for ozone MDA8. The SOM method identifies different weather regimes, to distinguish between various modes of ozone variability, and groups them according to similarity. In this way, when a regression is developed for a specific regime, data from the other regimes are also used, with weights that are based on their similarity to this specific regime. This approach, regression in SOM (REGiS), yields a distinct model for each regime, taking into account both the training cases for that regime and other similar training cases. To produce probabilistic MDA8 ozone forecasts, REGiS weights and combines all of the developed regression models on the basis of the weather patterns predicted by an NWP model. REGiS is evaluated over the San Joaquin Valley in California and the northeastern plains of Colorado. The results suggest that the model performs best when trained and adjusted separately for an individual AQ station and its corresponding meteorological site.
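
    A toy sketch of the regression-in-clusters idea, with KMeans standing in for the SOM and a plain weighted quadratic fit standing in for the stepwise weighted regression; the data, regime count and weighting scheme are illustrative assumptions, not the REGiS implementation.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(3)
    met = rng.normal(size=(1000, 3))            # synthetic meteorological predictors
    mda8 = 40 + 10 * met[:, 0] + 5 * met[:, 1] ** 2 + rng.normal(0, 5, 1000)

    # "Weather regimes" via clustering (KMeans here; the paper uses a SOM).
    regimes = KMeans(n_clusters=4, n_init=10, random_state=0).fit(met)
    poly = PolynomialFeatures(degree=2, include_bias=False)
    X = poly.fit_transform(met)

    models = []
    for centre in regimes.cluster_centers_:
        # Weight every training day by its similarity to this regime's centre.
        dist = np.linalg.norm(met - centre, axis=1)
        weights = np.exp(-dist ** 2)
        models.append(LinearRegression().fit(X, mda8, sample_weight=weights))

    # Forecast: pick the regime predicted for a new day and apply its regression.
    new_day = rng.normal(size=(1, 3))           # stand-in for NWP-forecast fields
    k = regimes.predict(new_day)[0]
    forecast = models[k].predict(poly.transform(new_day))[0]
    print(f"regime {k}: MDA8 forecast = {forecast:.1f} ppbv")
    ```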

  10. Adaptive surrogate modeling by ANOVA and sparse polynomial dimensional decomposition for global sensitivity analysis in fluid simulation

    NASA Astrophysics Data System (ADS)

    Tang, Kunkun; Congedo, Pietro M.; Abgrall, Rémi

    2016-06-01

    The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost to resolve repeatedly the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than the one of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.

  11. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  12. Comparison of Two Surface Contamination Sampling Techniques Conducted for the Characterization of Two Pajarito Site Manhattan Project National Historic Park Properties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lopez, Tammy Ann

    Technical Area-18 (TA-18), also known as Pajarito Site, is located on Los Alamos National Laboratory property and has historic buildings that will be included in the Manhattan Project National Historic Park. Characterization studies of metal contamination were needed in two of the four buildings that are on the historic registry in this area, a "battleship" bunker building (TA-18-0002) and the Pond cabin (TA-18-0029). However, these two buildings have been exposed to the elements, are decades old, and have porous and rough surfaces (wood and concrete). Due to these conditions, it was questioned whether standard wipe sampling would be adequate to detect surface dust metal contamination in these buildings. Thus, micro-vacuum and surface wet wipe sampling techniques were performed side-by-side at both buildings and results were compared statistically. A two-tail paired t-test revealed that the micro-vacuum and wet wipe techniques were statistically different for both buildings. Further mathematical analysis revealed that the wet wipe technique picked up more metals from the surface than the micro-vacuum technique. Wet wipes revealed concentrations of beryllium and lead above internal housekeeping limits; however, using an yttrium normalization method with linear regression analysis between beryllium and yttrium revealed a correlation indicating that the beryllium levels were likely due to background and not operational contamination. PPE and administrative controls were implemented for National Park Service (NPS) and Department of Energy (DOE) tours as a result of this study. Overall, this study indicates that the micro-vacuum technique may not be an efficient technique to sample for metal dust contamination.

  13. Parameter estimation in Cox models with missing failure indicators and the OPPERA study.

    PubMed

    Brownstein, Naomi C; Cai, Jianwen; Slade, Gary D; Bair, Eric

    2015-12-30

    In a prospective cohort study, examining all participants for incidence of the condition of interest may be prohibitively expensive. For example, the "gold standard" for diagnosing temporomandibular disorder (TMD) is a physical examination by a trained clinician. In large studies, examining all participants in this manner is infeasible. Instead, it is common to use questionnaires to screen for incidence of TMD and perform the "gold standard" examination only on participants who screen positively. Unfortunately, some participants may leave the study before receiving the "gold standard" examination. Within the framework of survival analysis, this results in missing failure indicators. Motivated by the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study, a large cohort study of TMD, we propose a method for parameter estimation in survival models with missing failure indicators. We estimate the probability of being an incident case for those lacking a "gold standard" examination using logistic regression. These estimated probabilities are used to generate multiple imputations of case status for each missing examination that are combined with observed data in appropriate regression models. The variance introduced by the procedure is estimated using multiple imputation. The method can be used to estimate both regression coefficients in Cox proportional hazard models as well as incidence rates using Poisson regression. We simulate data with missing failure indicators and show that our method performs as well as or better than competing methods. Finally, we apply the proposed method to data from the OPPERA study. Copyright © 2015 John Wiley & Sons, Ltd.
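
    A minimal sketch of the imputation idea on simulated data, shown here with Poisson regression (one of the two model types mentioned) rather than the Cox model: case probabilities are estimated by logistic regression among examined participants, case status is imputed repeatedly for the rest, and the fits are pooled with Rubin's rules; all variable names and parameter values are hypothetical.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n, m_imp = 2000, 20
    x = rng.normal(size=n)                           # a risk factor
    time = rng.exponential(2.0, n)                   # person-years at risk
    case = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 0.5 * x))))
    examined = rng.binomial(1, 0.7, n).astype(bool)  # 30% lack the gold-standard exam

    # Step 1: logistic model for case status among the examined participants.
    logit = sm.Logit(case[examined], sm.add_constant(x[examined])).fit(disp=0)
    p_miss = logit.predict(sm.add_constant(x[~examined]))

    # Step 2: multiple imputation + Poisson regression with a log person-time offset.
    coefs, variances = [], []
    for _ in range(m_imp):
        y = case.copy()
        y[~examined] = rng.binomial(1, p_miss)
        pois = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson(),
                      offset=np.log(time)).fit()
        coefs.append(pois.params[1])
        variances.append(pois.bse[1] ** 2)

    # Step 3: Rubin's rules to pool the estimates and their uncertainty.
    qbar = np.mean(coefs)
    total_var = np.mean(variances) + (1 + 1 / m_imp) * np.var(coefs, ddof=1)
    print(f"pooled log rate ratio = {qbar:.3f} (SE = {np.sqrt(total_var):.3f})")
    ```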

  14. Introduction to the use of regression models in epidemiology.

    PubMed

    Bender, Ralf

    2009-01-01

    Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
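
    As a small worked illustration of the "adjusted effect estimate" idea (not an example taken from the chapter), the snippet below simulates a confounded exposure-disease relationship and contrasts the crude and confounder-adjusted odds ratios obtained from logistic regression.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n = 5000
    smoking = rng.binomial(1, 0.3, n)                       # confounder
    exposure = rng.binomial(1, 0.2 + 0.3 * smoking)         # exposure depends on it
    logit_p = -2 + 0.4 * exposure + 1.0 * smoking
    disease = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

    X_crude = sm.add_constant(exposure)
    X_adj = sm.add_constant(np.column_stack([exposure, smoking]))
    or_crude = np.exp(sm.Logit(disease, X_crude).fit(disp=0).params[1])
    or_adj = np.exp(sm.Logit(disease, X_adj).fit(disp=0).params[1])
    print(f"crude OR = {or_crude:.2f}, adjusted OR = {or_adj:.2f}, "
          f"true OR = {np.exp(0.4):.2f}")
    ```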

  15. The use of the temporal scan statistic to detect methicillin-resistant Staphylococcus aureus clusters in a community hospital.

    PubMed

    Faires, Meredith C; Pearl, David L; Ciccotelli, William A; Berke, Olaf; Reid-Smith, Richard J; Weese, J Scott

    2014-07-08

    In healthcare facilities, conventional surveillance techniques using rule-based guidelines may result in under- or over-reporting of methicillin-resistant Staphylococcus aureus (MRSA) outbreaks, as these guidelines are generally unvalidated. The objectives of this study were to investigate the utility of the temporal scan statistic for detecting MRSA clusters, validate clusters using molecular techniques and hospital records, and determine significant differences in the rate of MRSA cases using regression models. Patients admitted to a community hospital between August 2006 and February 2011, and identified with MRSA>48 hours following hospital admission, were included in this study. Between March 2010 and February 2011, MRSA specimens were obtained for spa typing. MRSA clusters were investigated using a retrospective temporal scan statistic. Tests were conducted on a monthly scale and significant clusters were compared to MRSA outbreaks identified by hospital personnel. Associations between the rate of MRSA cases and the variables year, month, and season were investigated using a negative binomial regression model. During the study period, 735 MRSA cases were identified and 167 MRSA isolates were spa typed. Nine different spa types were identified with spa type 2/t002 (88.6%) the most prevalent. The temporal scan statistic identified significant MRSA clusters at the hospital (n=2), service (n=16), and ward (n=10) levels (P ≤ 0.05). Seven clusters were concordant with nine MRSA outbreaks identified by hospital staff. For the remaining clusters, seven events may have been equivalent to true outbreaks and six clusters demonstrated possible transmission events. The regression analysis indicated years 2009-2011, compared to 2006, and months March and April, compared to January, were associated with an increase in the rate of MRSA cases (P ≤ 0.05). The application of the temporal scan statistic identified several MRSA clusters that were not detected by hospital personnel. The identification of specific years and months with increased MRSA rates may be attributable to several hospital level factors including the presence of other pathogens. Within hospitals, the incorporation of the temporal scan statistic to standard surveillance techniques is a valuable tool for healthcare workers to evaluate surveillance strategies and aid in the identification of MRSA clusters.
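
    The rate-comparison part of the analysis is straightforward to sketch. The snippet below fits a negative binomial regression of monthly case counts on year and month, with a log patient-days offset, to simulated data; the variable names, offset choice and dispersion parameter are assumptions, and the temporal scan statistic itself is not shown.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Simulated monthly MRSA counts for the study window (August 2006 - February 2011).
    rng = np.random.default_rng(6)
    months = pd.date_range("2006-08-01", "2011-02-01", freq="MS")
    df = pd.DataFrame({
        "year": months.year.astype(str),
        "month": months.month.astype(str),
        "patient_days": rng.integers(4000, 6000, len(months)),
        "cases": rng.negative_binomial(5, 0.4, len(months)),
    })

    model = smf.glm("cases ~ C(year) + C(month)", data=df,
                    family=sm.families.NegativeBinomial(alpha=1.0),
                    offset=np.log(df["patient_days"])).fit()

    # Rate ratios for each year relative to the reference year (2006).
    print(np.exp(model.params.filter(like="year")).round(2))
    ```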

  16. Study on for soluble solids contents measurement of grape juice beverage based on Vis/NIRS and chemomtrics

    NASA Astrophysics Data System (ADS)

    Wu, Di; He, Yong

    2007-11-01

    The aim of this study was to investigate the potential of the visible and near infrared spectroscopy (Vis/NIRS) technique for non-destructive measurement of soluble solids content (SSC) in grape juice beverage. 380 samples were studied. Savitzky-Golay smoothing and the standard normal variate transformation were applied for pre-processing of the spectral data. Least-squares support vector machines (LS-SVM) with an RBF kernel function were applied to develop the SSC prediction model based on the Vis/NIRS absorbance data. The coefficient of determination for prediction (Rp2) of the results predicted by the LS-SVM model was 0.962 and the root mean square error of prediction (RMSEP) was 0.434137. It is concluded that the Vis/NIRS technique can quantify the SSC of grape juice beverage rapidly and non-destructively. At the same time, the LS-SVM model was compared with PLS and back propagation neural network (BP-NN) methods. The results showed that LS-SVM was superior to the conventional linear and non-linear methods in predicting the SSC of grape juice beverage. In this study, the generalization ability of the LS-SVM, PLS and BP-NN models was also investigated. It is concluded that the LS-SVM regression method is a promising technique for chemometrics in quantitative prediction.

  17. Chroma intra prediction based on inter-channel correlation for HEVC.

    PubMed

    Zhang, Xingyu; Gisquet, Christophe; François, Edouard; Zou, Feng; Au, Oscar C

    2014-01-01

    In this paper, we investigate a new inter-channel coding mode called LM mode proposed for the next generation video coding standard called high efficiency video coding. This mode exploits inter-channel correlation using reconstructed luma to predict chroma linearly with parameters derived from neighboring reconstructed luma and chroma pixels at both encoder and decoder to avoid overhead signaling. In this paper, we analyze the LM mode and prove that the LM parameters for predicting original chroma and reconstructed chroma are statistically the same. We also analyze the error sensitivity of the LM parameters. We identify some LM mode problematic situations and propose three novel LM-like modes called LMA, LML, and LMO to address the situations. To limit the increase in complexity due to the LM-like modes, we propose some fast algorithms with the help of some new cost functions. We further identify some potentially-problematic conditions in the parameter estimation (including regression dilution problem) and introduce a novel model correction technique to detect and correct those conditions. Simulation results suggest that considerable BD-rate reduction can be achieved by the proposed LM-like modes and model correction technique. In addition, the performance gain of the two techniques appears to be essentially additive when combined.

  18. Applicability of Monte Carlo cross validation technique for model development and validation using generalised least squares regression

    NASA Astrophysics Data System (ADS)

    Haddad, Khaled; Rahman, Ataur; A Zaman, Mohammad; Shrestha, Surendra

    2013-03-01

    In regional hydrologic regression analysis, model selection and validation are regarded as important steps. Here, the model selection is usually based on some measure of goodness-of-fit between the model prediction and observed data. In Regional Flood Frequency Analysis (RFFA), leave-one-out (LOO) validation or a fixed-percentage leave-out validation (e.g., 10%) is commonly adopted to assess the predictive ability of regression-based prediction equations. This paper develops a Monte Carlo Cross Validation (MCCV) technique (which has been widely adopted in chemometrics and econometrics) in RFFA using Generalised Least Squares Regression (GLSR) and compares it with the most commonly adopted LOO validation approach. The study uses simulated and regional flood data from the state of New South Wales in Australia. It is found that when developing hydrologic regression models, application of the MCCV is likely to result in a more parsimonious model than the LOO. It has also been found that the MCCV can provide a more realistic estimate of a model's predictive ability when compared with the LOO.
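
    A minimal sketch of the contrast between Monte Carlo cross-validation (many random train/validation splits) and leave-one-out, here using ordinary least squares in place of the paper's GLS estimator. The data, split fraction and number of repeats are illustrative assumptions.

```python
# Sketch contrasting Monte Carlo cross-validation (repeated random splits)
# with leave-one-out validation, using OLS instead of GLS for simplicity.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ShuffleSplit, LeaveOneOut, cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 3))                       # e.g. log-transformed catchment descriptors
y = X @ np.array([0.8, -0.4, 0.2]) + rng.normal(scale=0.5, size=60)

model = LinearRegression()

# MCCV: many random 80/20 train/validation splits.
mccv = ShuffleSplit(n_splits=500, test_size=0.2, random_state=2)
mccv_mse = -cross_val_score(model, X, y, cv=mccv,
                            scoring="neg_mean_squared_error").mean()

# LOO: each site is left out once.
loo_mse = -cross_val_score(model, X, y, cv=LeaveOneOut(),
                           scoring="neg_mean_squared_error").mean()

print(f"MCCV mean MSE: {mccv_mse:.3f}   LOO mean MSE: {loo_mse:.3f}")
```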

  19. Strategies of experiment standardization and response optimization in a rat model of hemorrhagic shock and chronic hypertension.

    PubMed

    Reynolds, Penny S; Tamariz, Francisco J; Barbee, Robert Wayne

    2010-04-01

    Exploratory pilot studies are crucial to best practice in research but are frequently conducted without a systematic method for maximizing the amount and quality of information obtained. We describe the use of response surface regression models and simultaneous optimization methods to develop a rat model of hemorrhagic shock in the context of chronic hypertension, a clinically relevant comorbidity. A response surface regression model was applied to determine optimal levels of two inputs, dietary NaCl concentration (0.49%, 4%, and 8%) and time on the diet (4, 6, and 8 weeks), to achieve clinically realistic and stable target measures of systolic blood pressure while simultaneously maximizing critical oxygen delivery (a measure of vulnerability to hemorrhagic shock) and body mass M. Simultaneous optimization of the three response variables was performed through a dimensionality reduction strategy involving calculation of a single aggregate measure, the "desirability" function. Optimal conditions for inducing systolic blood pressure of 208 mmHg, critical oxygen delivery of 4.03 mL/min, and M of 290 g were determined to be 4% [NaCl] for 5 weeks. Rats on the 8% diet did not survive past 7 weeks. Response surface regression and simultaneous optimization techniques are commonly used in process engineering but have found little application to date in animal pilot studies. These methods will ensure both the scientific and ethical integrity of experimental trials involving animals and provide powerful tools for the development of novel models of comorbidities that interact clinically with shock.
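
    The simultaneous-optimization step can be illustrated with Derringer-type desirability functions: each predicted response is mapped to [0, 1] and the geometric mean is maximized over the design grid. The targets, ranges and predicted responses below are placeholders, not the study's fitted values.

```python
# Sketch of desirability-function aggregation over a small factorial design grid.
# Targets and responses are placeholders; the point is to collapse several
# responses into one score and pick the factor setting that maximizes it.
import numpy as np

def desirability_target(y, low, target, high):
    """Two-sided desirability: 1 at the target, 0 outside [low, high]."""
    y = np.asarray(y, dtype=float)
    d = np.where(y <= target, (y - low) / (target - low),
                              (high - y) / (high - target))
    return np.clip(d, 0.0, 1.0)

# Hypothetical predicted responses on a grid of (NaCl %, weeks on diet).
grid = [(nacl, weeks) for nacl in (0.49, 4.0, 8.0) for weeks in (4, 6, 8)]
sbp  = np.array([150, 160, 170, 190, 205, 215, 210, 230, 245])  # mmHg
do2  = np.array([3.2, 3.5, 3.4, 3.9, 4.1, 3.8, 3.6, 3.0, 2.5])  # mL/min
mass = np.array([310, 320, 330, 295, 290, 285, 270, 250, 230])  # g

# Overall desirability = geometric mean of the three individual desirabilities.
D = (desirability_target(sbp, 160, 208, 240)
     * desirability_target(do2, 3.0, 4.0, 4.5)
     * desirability_target(mass, 250, 290, 340)) ** (1 / 3)

best = int(np.argmax(D))
print("Best setting (NaCl %, weeks):", grid[best], "overall desirability:", round(D[best], 3))
```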

  20. Predicting the potential distribution of invasive exotic species using GIS and information-theoretic approaches: A case of ragweed (Ambrosia artemisiifolia L.) distribution in China

    USGS Publications Warehouse

    Hao, Chen; LiJun, Chen; Albright, Thomas P.

    2007-01-01

    Invasive exotic species pose a growing threat to the economy, public health, and ecological integrity of nations worldwide. Explaining and predicting the spatial distribution of invasive exotic species is of great importance to prevention and early warning efforts. We are investigating the potential distribution of invasive exotic species, the environmental factors that influence these distributions, and the ability to predict them using statistical and information-theoretic approaches. For some species, detailed presence/absence occurrence data are available, allowing the use of a variety of standard statistical techniques. However, for most species, absence data are not available. Presented with the challenge of developing a model based on presence-only information, we developed an improved logistic regression approach using information theory and frequency statistics to produce a relative suitability map. In this paper we generated predicted distributions of ragweed (Ambrosia artemisiifolia L.) from logistic regression models applied to herbarium specimen location data and a suite of GIS layers including climatic, topographic, and land cover information. Our logistic regression model was selected using Akaike's Information Criterion (AIC) from a suite of ecologically reasonable predictor variables. Based on the results, we proposed a new frequency-statistical method to compartmentalize habitat suitability in the native range. Finally, we used the model and the compartmentalized criterion developed in the native range to "project" a potential distribution onto the exotic range to build habitat-suitability maps. © Science in China Press 2007.
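
    A hedged sketch of the information-theoretic step: candidate logistic regression models are fitted and ranked by AIC. The predictor names and the simulated presence data are illustrative, not the ragweed dataset or its GIS layers.

```python
# Sketch of AIC-based selection among candidate logistic regression models,
# in the spirit of the information-theoretic approach described above.
# Presence data and predictors are simulated, not the ragweed records.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 400
df = pd.DataFrame({
    "temp": rng.normal(15, 5, n),          # mean annual temperature (hypothetical)
    "precip": rng.normal(800, 200, n),     # annual precipitation (hypothetical)
    "elev": rng.normal(500, 300, n),       # elevation (hypothetical)
})
logit_p = 0.15 * (df["temp"] - 15) + 0.002 * (df["precip"] - 800)
df["presence"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

candidates = ["presence ~ temp",
              "presence ~ temp + precip",
              "presence ~ temp + precip + elev"]

# Fit each candidate and list them from lowest (best) to highest AIC.
fits = {f: smf.logit(f, data=df).fit(disp=0) for f in candidates}
for formula, fit in sorted(fits.items(), key=lambda kv: kv[1].aic):
    print(f"AIC = {fit.aic:8.1f}   {formula}")
```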

  1. Automatic reconstruction of surge deposit thicknesses. Applications to some Italian volcanoes

    NASA Astrophysics Data System (ADS)

    Armienti, P.; Pareschi, M. T.

    1987-04-01

    The energy cone concept has been adopted to describe some kinds of surge deposits. The energy cone parameters (height and slope) are evaluated through a regression technique which utilizes deposit thicknesses and the corresponding ground elevations and heights of the energy cone. The regression also allows evaluation of a coefficient of proportionality linking the deposit thickness to the distance between the topographic surface and the energy line for a given eruption. Moreover, if an accurate topography is available (in this case a reconstruction of a digitized topography of the Phlegraean Fields and of Vesuvius), the energy cone parameters, obtained by this back-fitting technique, can be used to evaluate the order of magnitude of the deposit volumes. The hazard map for a surge localized at the Solfatara (Phlegraean Fields, Naples) has been computed. The values of the energy cone parameters and the volume have been assumed to be equal to those estimated with the regression technique applied to a past surge eruption in the same area.

  2. Comparison of a Full Food-Frequency Questionnaire with the Three-Day Unweighted Food Records in Young Polish Adult Women: Implications for Dietary Assessment

    PubMed Central

    Kowalkowska, Joanna; Slowinska, Malgorzata A.; Slowinski, Dariusz; Dlugosz, Anna; Niedzwiedzka, Ewa; Wadolowska, Lidia

    2013-01-01

    The food frequency questionnaire (FFQ) and the food record (FR) are among the most common methods used in dietary research. It is important to know whether it is possible to use both methods simultaneously in dietary assessment and to prepare a single, comprehensive interpretation. The aim of this study was to compare the energy and nutritional value of diets, determined by the FFQ and by the three-day food records of young women. The study involved 84 female students aged 21–26 years (mean of 22.2 ± 0.8 years). Completing the FFQ was preceded by obtaining unweighted food records covering three consecutive days. Energy and nutritional value of diets was assessed for both methods (FFQ-crude, FR-crude). Data obtained for FFQ-crude were adjusted with a beta-coefficient of 0.5915 (FFQ-adjusted) and by regression analysis (FFQ-regressive). The FFQ-adjusted values were calculated using the FR-crude/FFQ-crude ratio of mean daily energy intake. FFQ-regressive was calculated for energy and each nutrient separately using a regression equation including FFQ-crude and FR-crude as covariates. For FR-crude and FFQ-crude the energy value of diets was standardized to 2000 kcal (FR-standardized, FFQ-standardized). Methods of statistical comparison included a dependent samples t-test, a chi-square test, and the Bland-Altman method. The mean energy intake in FFQ-crude was significantly higher than in FR-crude (2740.5 kcal vs. 1621.0 kcal, respectively). For FR-standardized and FFQ-standardized, significant differences were found in the mean intake of 18 out of 31 nutrients, for FR-crude and FFQ-adjusted in 13 out of 31 nutrients, and for FR-crude and FFQ-regressive in 11 out of 31 nutrients. The Bland-Altman method showed an overestimation of energy and nutrient intake by FFQ-crude in comparison to FR-crude, e.g., total protein was overestimated by 34.7 g/day (95% Confidence Interval, CI: −29.6, 99.0 g/day) and fat by 48.6 g/day (95% CI: −36.4, 133.6 g/day). After regressive transformation of the FFQ, the absolute difference between FFQ-regressive and FR-crude equaled 0.0 g/day and the 95% CIs were much narrower (e.g., for total protein 95% CI: −32.7, 32.7 g/day, for fat 95% CI: −49.6, 49.6 g/day). In conclusion, differences in nutritional value of diets resulted from overestimating energy intake by the FFQ in comparison to the three-day unweighted food records. Adjustment of energy and nutrient intake applied to the FFQ using various methods, particularly regression equations, significantly improved the agreement between results obtained by both methods of dietary assessment. To obtain the most accurate results in future studies using this FFQ, energy and nutrient intake should be adjusted by the regression equations presented in this paper. PMID:23877089
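
    The Bland-Altman comparison used above can be sketched as follows: compute the mean difference (bias) between the two methods and the 95% limits of agreement. The intakes below are simulated, not the study data.

```python
# Sketch of a Bland-Altman comparison of two dietary assessment methods.
# Intakes are simulated; the output is the bias and 95% limits of agreement.
import numpy as np

rng = np.random.default_rng(4)
fr_protein  = rng.normal(70, 15, 84)               # food-record intake, g/day (fake)
ffq_protein = fr_protein + rng.normal(35, 33, 84)  # FFQ overestimating on average (fake)

diff = ffq_protein - fr_protein
bias = diff.mean()
sd = diff.std(ddof=1)
loa_low, loa_high = bias - 1.96 * sd, bias + 1.96 * sd
print(f"bias = {bias:.1f} g/day, 95% limits of agreement: ({loa_low:.1f}, {loa_high:.1f})")
```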

  3. June and August median streamflows estimated for ungaged streams in southern Maine

    USGS Publications Warehouse

    Lombard, Pamela J.

    2010-01-01

    Methods for estimating June and August median streamflows were developed for ungaged, unregulated streams in southern Maine. The methods apply to streams with drainage areas ranging in size from 0.4 to 74 square miles, with percentage of basin underlain by a sand and gravel aquifer ranging from 0 to 84 percent, and with distance from the centroid of the basin to a Gulf of Maine line paralleling the coast ranging from 14 to 94 miles. Equations were developed with data from 4 long-term continuous-record streamgage stations and 27 partial-record streamgage stations. Estimates of median streamflows at the continuous-record and partial-record stations are presented. A mathematical technique for estimating standard low-flow statistics, such as June and August median streamflows, at partial-record streamgage stations was applied by relating base-flow measurements at these stations to concurrent daily streamflows at nearby long-term (at least 10 years of record) continuous-record streamgage stations (index stations). Weighted least-squares regression analysis (WLS) was used to relate estimates of June and August median streamflows at streamgage stations to basin characteristics at these same stations to develop equations that can be used to estimate June and August median streamflows on ungaged streams. WLS accounts for different periods of record at the gaging stations. Three basin characteristics-drainage area, percentage of basin underlain by a sand and gravel aquifer, and distance from the centroid of the basin to a Gulf of Maine line paralleling the coast-are used in the final regression equation to estimate June and August median streamflows for ungaged streams. The three-variable equation to estimate June median streamflow has an average standard error of prediction from -35 to 54 percent. The three-variable equation to estimate August median streamflow has an average standard error of prediction from -45 to 83 percent. Simpler one-variable equations that use only drainage area to estimate June and August median streamflows were developed for use when less accuracy is acceptable. These equations have average standard errors of prediction from -46 to 87 percent and from -57 to 133 percent, respectively.

  4. Statistical methods for efficient design of community surveys of response to noise: Random coefficients regression models

    NASA Technical Reports Server (NTRS)

    Tomberlin, T. J.

    1985-01-01

    Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed provide a basis for those sample design decisions. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.

  5. Development of Super-Ensemble techniques for ocean analyses: the Mediterranean Sea case

    NASA Astrophysics Data System (ADS)

    Pistoia, Jenny; Pinardi, Nadia; Oddo, Paolo; Collins, Matthew; Korres, Gerasimos; Drillet, Yann

    2017-04-01

    Short-term ocean analyses for sea surface temperature (SST) in the Mediterranean Sea can be improved by a statistical post-processing technique called super-ensemble. This technique consists of a multiple linear regression algorithm applied to a Multi-Physics Multi-Model Super-Ensemble (MMSE) dataset, a collection of different operational forecasting analyses together with ad-hoc simulations produced by modifying selected numerical model parameterizations. A new linear regression algorithm based on Empirical Orthogonal Function filtering techniques is capable of preventing overfitting problems, although the best performance is achieved when we add correlation to the super-ensemble structure using a simple spatial filter applied after the linear regression. Our results show that super-ensemble performance depends on the selection of an unbiased operator and the length of the learning period, but the quality of the generating MMSE dataset has the largest impact on the MMSE analysis Root Mean Square Error (RMSE) evaluated with respect to observed satellite SST. Lower RMSE analysis estimates result from the following choices: a 15-day training period, an overconfident MMSE dataset (a subset with the higher-quality ensemble members), and a posteriori spatial filtering of the least-squares output.
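
    A rough sketch of the super-ensemble idea: stack several model analyses as predictors of the observed field and compress them with a PCA (EOF-like) step before the linear regression to limit overfitting. The field sizes, member count and training length below are placeholders, not the Mediterranean configuration.

```python
# Rough sketch of a multi-model super-ensemble: linear regression of observed
# SST on several model analyses, with PCA (EOF-like) compression of the
# ensemble to reduce overfitting. All fields here are synthetic placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
n_days, n_points, n_members = 15, 500, 6           # e.g. a 15-day training period
truth = rng.normal(20, 2, size=(n_days, n_points))
members = truth[..., None] + rng.normal(0, 1, size=(n_days, n_points, n_members))

X = members.reshape(-1, n_members)                 # rows: (day, grid point) pairs
y = truth.reshape(-1)

# EOF-like filtering: keep only the leading components of the ensemble spread.
pca = PCA(n_components=3).fit(X)
X_filtered = pca.transform(X)

sse = LinearRegression().fit(X_filtered, y)        # super-ensemble weights
rmse = np.sqrt(np.mean((sse.predict(X_filtered) - y) ** 2))
print("training RMSE of the super-ensemble analysis:", round(rmse, 3))
```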

  6. Functional Data Analysis for Dynamical System Identification of Behavioral Processes

    PubMed Central

    Trail, Jessica B.; Collins, Linda M.; Rivera, Daniel E.; Li, Runze; Piper, Megan E.; Baker, Timothy B.

    2014-01-01

    Efficient new technology has made it straightforward for behavioral scientists to collect anywhere from several dozen to several thousand dense, repeated measurements on one or more time-varying variables. These intensive longitudinal data (ILD) are ideal for examining complex change over time, but present new challenges that illustrate the need for more advanced analytic methods. For example, in ILD the temporal spacing of observations may be irregular, and individuals may be sampled at different times. Also, it is important to assess both how the outcome changes over time and the variation between participants' time-varying processes to make inferences about a particular intervention's effectiveness within the population of interest. The methods presented in this article integrate two innovative ILD analytic techniques: functional data analysis and dynamical systems modeling. An empirical application is presented using data from a smoking cessation clinical trial. Study participants provided 42 daily assessments of pre-quit and post-quit withdrawal symptoms. Regression splines were used to approximate smooth functions of craving and negative affect and to estimate the variables' derivatives for each participant. We then modeled the dynamics of nicotine craving using standard input-output dynamical systems models. These models provide a more detailed characterization of the post-quit craving process than do traditional longitudinal models, including information regarding the type, magnitude, and speed of the response to an input. The results, in conjunction with standard engineering control theory techniques, could potentially be used by tobacco researchers to develop a more effective smoking intervention. PMID:24079929

  7. Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

    NASA Astrophysics Data System (ADS)

    Kaskhedikar, Apoorva Prakash

    According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United States' energy consumption, of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy benchmarking offers an initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, significant correlations were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs, and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers, namely ENERGY STAR's Portfolio Manager. This tool relies on standard linear regression methods, which are only able to handle continuous variables. The proposed model uses data-mining techniques and was found to perform slightly better than the Portfolio Manager. The broader impact of the proposed benchmarking methodology is that it allows for identifying important categorical variables and then incorporating them in a local, as against a global, model framework for EUI pertinent to the building type. The ability to identify and rank the important variables is of great importance in practical implementation of the benchmarking tools, which rely on query-based building and HVAC variable filters specified by the user.

  8. Reducing Blood Culture Contamination in the Emergency Department: An Interrupted Time Series Quality Improvement Study

    PubMed Central

    Self, Wesley H.; Speroff, Theodore; Grijalva, Carlos G.; McNaughton, Candace D.; Ashburn, Jacki; Liu, Dandan; Arbogast, Patrick G.; Russ, Stephan; Storrow, Alan B.; Talbot, Thomas R.

    2012-01-01

    Objectives Blood culture contamination is a common problem in the emergency department (ED) that leads to unnecessary patient morbidity and health care costs. The study objective was to develop and evaluate the effectiveness of a quality improvement (QI) intervention for reducing blood culture contamination in an ED. Methods The authors developed a QI intervention to reduce blood culture contamination in the ED and then evaluated its effectiveness in a prospective interrupted time series study. The QI intervention involved changing the technique of blood culture specimen collection from the traditional clean procedure to a new sterile procedure, with standardized use of sterile gloves and a new materials kit containing a 2% chlorhexidine skin antisepsis device, a sterile fenestrated drape, a sterile needle, and a procedural checklist. The intervention was implemented in a university-affiliated ED and its effect on blood culture contamination evaluated by comparing the biweekly percentages of blood cultures contaminated during a 48-week baseline period (clean technique) and a 48-week intervention period (sterile technique), using segmented regression analysis with adjustment for secular trends and first-order autocorrelation. The goal was to achieve and maintain a contamination rate below 3%. Results During the baseline period, 321 out of 7,389 (4.3%) cultures were contaminated, compared to 111 of 6,590 (1.7%) during the intervention period (p < 0.001). In the segmented regression model, the intervention was associated with an immediate 2.9% (95% CI = 2.2% to 3.2%) absolute reduction in contamination. The contamination rate was maintained below 3% during each biweekly interval throughout the intervention period. Conclusions A QI assessment of ED blood culture contamination led to development of a targeted intervention to convert the process of blood culture collection from a clean to a fully sterile procedure. Implementation of this intervention led to an immediate and sustained reduction of contamination in an ED with a high baseline contamination rate. PMID:23570482
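
    A minimal sketch of the segmented-regression model for an interrupted time series: biweekly contamination percentages regressed on time, an intervention indicator (level change), and time since the intervention (slope change). The values are simulated and the autocorrelation adjustment used in the study is omitted here for brevity.

```python
# Sketch of segmented regression for an interrupted time series: level and
# slope change at the intervention, fitted by OLS on simulated biweekly
# contamination percentages (not the study's data; no autocorrelation term).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n_periods = 48                                   # 24 baseline + 24 intervention biweeks
t = np.arange(n_periods)
post = (t >= 24).astype(int)                     # 1 after the intervention starts
t_since = np.where(post == 1, t - 24, 0)         # time elapsed since the intervention

pct = 4.3 - 0.01 * t - 2.9 * post + rng.normal(0, 0.4, n_periods)
df = pd.DataFrame({"pct": pct, "t": t, "post": post, "t_since": t_since})

fit = smf.ols("pct ~ t + post + t_since", data=df).fit()
print(fit.params)   # 'post' estimates the immediate level change, 't_since' the slope change
```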

  9. Use of geostationary meteorological satellite images in convective rain estimation for flash-flood forecasting

    NASA Astrophysics Data System (ADS)

    Wardah, T.; Abu Bakar, S. H.; Bardossy, A.; Maznorizan, M.

    2008-07-01

    Frequent flash-floods causing immense devastation in the Klang River Basin of Malaysia necessitate an improvement in the real-time forecasting systems being used. The use of meteorological satellite images in estimating rainfall has become an attractive option for improving the performance of flood forecasting-and-warning systems. In this study, a rainfall estimation algorithm using the infrared (IR) information from the Geostationary Meteorological Satellite-5 (GMS-5) is developed for potential input in a flood forecasting system. Data from the records of GMS-5 IR images have been retrieved for selected convective cells to be trained with the radar rain rate in a back-propagation neural network. The inputs to the neural network are five parameters having a significant correlation with the radar rain rate, namely the cloud-top brightness-temperature of the pixel of interest, the mean and the standard deviation of the temperatures of the surrounding five by five pixels, the rate of temperature change, and the Sobel operator that indicates the temperature gradient. In addition, three numerical weather prediction (NWP) products, namely the precipitable water content, relative humidity, and vertical wind, are also included as inputs. The algorithm is applied for the areal rainfall estimation in the upper Klang River Basin and compared with another technique that uses power-law regression between the cloud-top brightness-temperature and radar rain rate. Results from both techniques are validated against previously recorded Thiessen areal-averaged rainfall values with correlation coefficient values of 0.77 and 0.91 for the power-law regression and the artificial neural network (ANN) technique, respectively. An extra lead time of around 2 h is gained when the satellite-based ANN rainfall estimation is coupled with a rainfall-runoff model to forecast a flash-flood event in the upper Klang River Basin.

  10. A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods.

    PubMed

    Torija, Antonio J; Ruiz, Diego P

    2015-02-01

    The prediction of environmental noise in urban environments requires the solution of a complex and non-linear problem, since there are complex relationships among the multitude of variables involved in the characterization and modelling of environmental noise and environmental-noise magnitudes. Moreover, the inclusion of the great spatial heterogeneity characteristic of urban environments seems to be essential in order to achieve an accurate environmental-noise prediction in cities. This problem is addressed in this paper, where a procedure based on feature-selection techniques and machine-learning regression methods is proposed and applied to this environmental problem. Three machine-learning regression methods, which are considered very robust in solving non-linear problems, are used to estimate the energy-equivalent sound-pressure level descriptor (LAeq). These three methods are: (i) multilayer perceptron (MLP), (ii) sequential minimal optimisation (SMO), and (iii) Gaussian processes for regression (GPR). In addition, because of the high number of input variables involved in environmental-noise modelling and estimation in urban environments, which make LAeq prediction models quite complex and costly in terms of time and resources for application to real situations, three different techniques are used to approach feature selection or data reduction. The feature-selection techniques used are: (i) correlation-based feature-subset selection (CFS), (ii) wrapper for feature-subset selection (WFS), and the data reduction technique is principal-component analysis (PCA). The subsequent analysis leads to a proposal of different schemes, depending on the needs regarding data collection and accuracy. The use of WFS as the feature-selection technique with the implementation of SMO or GPR as regression algorithm provides the best LAeq estimation (R(2)=0.94 and mean absolute error (MAE)=1.14-1.16 dB(A)). Copyright © 2014 Elsevier B.V. All rights reserved.
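
    A hedged sketch of the wrapper-feature-selection plus Gaussian-process-regression combination the study found best, using scikit-learn's sequential forward selection as a generic wrapper. The feature count, data and scoring choices are synthetic assumptions, not the urban noise dataset.

```python
# Sketch of wrapper-style feature selection followed by Gaussian process
# regression, loosely mirroring the WFS + GPR scheme above. Data are synthetic.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 12))                     # 12 candidate urban descriptors (fake)
laeq = 60 + 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(0, 1, 200)   # fake LAeq, dB(A)

# Cheap wrapper: forward selection with a linear surrogate to pick 4 features.
selector = SequentialFeatureSelector(LinearRegression(), n_features_to_select=4,
                                     direction="forward", cv=5).fit(X, laeq)
X_sel = selector.transform(X)

gpr = GaussianProcessRegressor(normalize_y=True)
mae = -cross_val_score(gpr, X_sel, laeq, cv=5,
                       scoring="neg_mean_absolute_error").mean()
print("selected feature indices:", np.flatnonzero(selector.get_support()))
print("cross-validated MAE (dB):", round(mae, 2))
```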

  11. KEY COMPARISON: Final report on international key comparison CCQM-K53: Oxygen in nitrogen

    NASA Astrophysics Data System (ADS)

    Lee, Jeongsoon; Bok Lee, Jin; Moon, Dong Min; Seog Kim, Jin; van der Veen, Adriaan M. H.; Besley, Laurie; Heine, Hans-Joachim; Martin, Belén; Konopelko, L. A.; Kato, Kenji; Shimosaka, Takuya; Perez Castorena, Alejandro; Macé, Tatiana; Milton, Martin J. T.; Kelley, Mike; Guenther, Franklin; Botha, Angelique

    2010-01-01

    Gravimetry is used as the primary method for the preparation of primary standard gas mixtures in most national metrology institutes, and it requires the combined abilities of purity assessment, weighing technique and analytical skills. At the CCQM GAWG meeting in October 2005, it was agreed that KRISS should coordinate a key comparison, CCQM-K53, on the gravimetric preparation of gas, at a level of 100 µmol/mol of oxygen in nitrogen. KRISS compared the gravimetric value of each cylinder with an analytical instrument. Preparation of an oxygen gas standard mixture requires particular care to be accurate, because oxygen is a major component of the atmosphere. Key issues for this comparison are related to (1) the gravimetric technique, which needs at least two steps for dilution, (2) oxygen impurity in nitrogen, and (3) argon impurity in nitrogen. The key comparison reference value is obtained from the linear regression line (through the origin) of a selected set of participants. All members of the KCRV subset except one agree with each other. The standard deviation of the x-residuals of this group (which consists of NMIJ, VSL, NIST, NPL, BAM, KRISS and CENAM) is 0.056 µmol/mol and consistent with the uncertainties given to their standard mixtures. The standard deviation of the residuals of all participating laboratories is 0.182 µmol/mol. With respect to impurity analysis, overall argon amounts of the cylinders are in the region of about 3 µmol/mol; however, four cylinders showed an argon amount fraction over 10 µmol/mol. Two of these are inconsistent with the KCRV subset. The explicit separation between the two peaks of oxygen and argon in the GC chromatogram is essential to maintain analytical capability. Additionally, oxygen impurity analysis in nitrogen is indispensable to ensure the preparative capability.

  12. Comparison of anchor-based and distributional approaches in estimating important difference in common cold.

    PubMed

    Barrett, Bruce; Brown, Roger; Mundt, Marlon

    2008-02-01

    Evaluative health-related quality-of-life instruments used in clinical trials should be able to detect small but important changes in health status. Several approaches to minimal important difference (MID) and responsiveness have been developed. To compare anchor-based and distributional approaches to important difference and responsiveness for the Wisconsin Upper Respiratory Symptom Survey (WURSS), an illness-specific quality of life outcomes instrument. Participants with community-acquired colds self-reported daily using the WURSS-44. Distribution-based methods calculated standardized effect size (ES) and standard error of measurement (SEM). Anchor-based methods compared daily interval changes to global ratings of change, using: (1) standard MID methods based on correspondence to ratings of "a little better" or "somewhat better," and (2) two-level multivariate regression models. About 150 adults were monitored throughout their colds (1,681 sick days.): 88% were white, 69% were women, and 50% had completed college. The mean age was 35.5 years (SD = 14.7). WURSS scores increased 2.2 points from the first to second day, and then dropped by an average of 8.2 points per day from days 2 to 7. The SEM averaged 9.1 during these 7 days. Standard methods yielded a between day MID of 22 points. Regression models of MID projected 11.3-point daily changes. Dividing these estimates of small-but-important-difference by pooled SDs yielded coefficients of .425 for standard MID, .218 for regression model, .177 for SEM, and .157 for ES. These imply per-group sample sizes of 870 using ES, 616 for SEM, 302 for regression model, and 89 for standard MID, assuming alpha = .05, beta = .20 (80% power), and two-tailed testing. Distribution and anchor-based approaches provide somewhat different estimates of small but important difference, which in turn can have substantial impact on trial design.

  13. Radiographic failure and rates of re-operation after acromioclavicular joint reconstruction: a comparison of surgical techniques.

    PubMed

    Spencer, H T; Hsu, L; Sodl, J; Arianjam, A; Yian, E H

    2016-04-01

    To compare radiographic failure and re-operation rates of anatomical coracoclavicular (CC) ligament reconstruction techniques with non-anatomical techniques after chronic high grade acromioclavicular (AC) joint injuries. We reviewed chronic AC joint reconstructions within a region-wide healthcare system to identify surgical technique, complications, radiographic failure and re-operations. Procedures fell into four categories: (1) modified Weaver-Dunn, (2) allograft fixed through coracoid and clavicular tunnels, (3) allograft loop coracoclavicular fixation, and (4) combined allograft loop and synthetic cortical button fixation. Among 167 patients (mean age 38.1 years, standard deviation (SD) 14.7) treated at an interval of at least four weeks after injury, 154 had post-operative radiographs available for analysis. Radiographic failure occurred in 33/154 cases (21.4%), with the lowest rate in Technique 4 (2/42, 4.8%; p = 0.001). Half the failures occurred by six weeks, and the Kaplan-Meier survivorship at 24 months was 94.4% (95% confidence interval (CI) 79.6 to 98.6) for Technique 4 and 69.9% (95% CI 59.4 to 78.3) for the other techniques when combined. In multivariable survival analysis, Technique 4 had better survival than the other techniques (Hazard Ratio 0.162, 95% CI 0.039 to 0.068, p = 0.013). Among 155 patients with a minimum of six months post-operative insurance coverage, re-operation occurred in 9.7% (15 patients). However, in multivariable logistic regression, Technique 4 did not reach a statistically significant lower risk for re-operation (odds ratio 0.254, 95% CI 0.05 to 1.3, p = 0.11). In this retrospective series, anatomical CC ligament reconstruction using combined synthetic cortical button and allograft loop fixation had the lowest rate of radiographic failure. ©2016 The British Editorial Society of Bone & Joint Surgery.

  14. Self-Concept and Participation in School Activities Reanalyzed.

    ERIC Educational Resources Information Center

    Winne, Philip H.; Walsh, John

    1980-01-01

    Yarworth and Gauthier (EJ 189 606) examined whether self-concept variables enhanced predictions about students' participation in school activities, using unstructured stepwise regression techniques. A reanalysis of their data using hierarchical regression models tested their hypothesis more appropriately, and uncovered multicollinearity and…

  15. Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

    PubMed

    Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

    2006-11-01

    We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
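
    A sketch of the nested ordinal-logistic-regression comparison used for DIF detection on a single item: the item score is modelled on ability, then ability plus group, then with an ability-by-group interaction; the interaction tests nonuniform DIF and the change in the ability coefficient flags uniform DIF. The data are simulated, and statsmodels' OrderedModel stands in for the PARSCALE/STATA tooling used in the study.

```python
# Sketch of nested ordinal logistic regressions for DIF detection on one item.
# Simulated data; OrderedModel stands in for the tools used in the paper.
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(8)
n = 1000
ability = rng.normal(size=n)
group = rng.integers(0, 2, n)                      # e.g. language of administration
latent = 1.2 * ability + 0.5 * group + rng.logistic(size=n)   # uniform DIF built in
item = np.digitize(latent, [-1.0, 1.0])            # three ordered response levels

df = pd.DataFrame({"ability": ability, "group": group})
df["abil_x_group"] = df["ability"] * df["group"]
endog = pd.Series(pd.Categorical(item, categories=[0, 1, 2], ordered=True))

def fit(cols):
    return OrderedModel(endog, df[cols], distr="logit").fit(method="bfgs", disp=0)

m1 = fit(["ability"])                              # ability only
m2 = fit(["ability", "group"])                     # + group term (uniform DIF)
m3 = fit(["ability", "group", "abil_x_group"])     # + interaction (nonuniform DIF)

print("ability coef without/with group term:",
      round(m1.params["ability"], 3), round(m2.params["ability"], 3))
print("interaction p-value (nonuniform DIF):", round(m3.pvalues["abil_x_group"], 4))
```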

  16. The analytical representation of viscoelastic material properties using optimization techniques

    NASA Technical Reports Server (NTRS)

    Hill, S. A.

    1993-01-01

    This report presents a technique to model viscoelastic material properties with a function of the form of the Prony series. Generally, the method employed to determine the function constants requires assuming values for the exponential constants of the function and then resolving the remaining constants through linear least-squares techniques. The technique presented here allows all the constants to be analytically determined through optimization techniques. This technique is employed in a computer program named PRONY and makes use of a commercially available optimization tool developed by VMA Engineering, Inc. The PRONY program was utilized to compare the technique against previously determined models for solid rocket motor TP-H1148 propellant and V747-75 Viton fluoroelastomer. In both cases, the optimization technique generated functions that modeled the test data with at least an order of magnitude better correlation. This technique has demonstrated the capability to use small or large data sets and to use data sets that have uniformly or nonuniformly spaced data pairs. The reduction of experimental data to accurate mathematical models is a vital part of most scientific and engineering research. This technique of regression through optimization can be applied to other mathematical models that are difficult to fit to experimental data through traditional regression techniques.
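
    A sketch of the general approach described: fit all Prony-series constants, including the exponential time constants, by nonlinear optimization rather than fixing the exponents and solving a linear least-squares problem. The relaxation data here are synthetic and scipy's least_squares plays the role of the commercial optimizer.

```python
# Sketch: fit a Prony series E(t) = E_inf + sum_i E_i * exp(-t / tau_i) with
# ALL constants (including the time constants tau_i) free, using
# scipy.optimize.least_squares in place of the commercial optimization tool.
import numpy as np
from scipy.optimize import least_squares

t = np.logspace(-2, 3, 60)                                   # synthetic time points
true = 2.0 + 5.0 * np.exp(-t / 0.1) + 3.0 * np.exp(-t / 50.0)
data = true * (1 + 0.02 * np.random.default_rng(9).normal(size=t.size))

def prony(params, t):
    """Prony series with params = [E_inf, E_1, tau_1, E_2, tau_2, ...]."""
    e_inf, rest = params[0], params[1:]
    e, tau = rest[0::2], rest[1::2]
    return e_inf + np.sum(e[:, None] * np.exp(-t[None, :] / tau[:, None]), axis=0)

def residuals(params):
    return prony(params, t) - data

x0 = np.array([1.0, 1.0, 1.0, 1.0, 10.0])          # crude starting guess
fit = least_squares(residuals, x0, bounds=(1e-6, np.inf))
print("fitted constants [E_inf, E_1, tau_1, E_2, tau_2]:", np.round(fit.x, 3))
```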

  17. Regression Simulation Model. Appendix X. Users Manual,

    DTIC Science & Technology

    1981-03-01

    change as the prediction equations become refined. Whereas no notice will be provided when the changes are made, the programs will be modified such that ... Regression Simulation Model, Appendix X, Users Manual, submitted to The Great River ... regression analysis and to establish a prediction equation (model). The prediction equation contains the partial regression coefficients (B-weights) which

  18. Inspiratory capacity at inflation hold in ventilated newborns: a surrogate measure for static compliance of the respiratory system.

    PubMed

    Hentschel, Roland; Semar, Nicole; Guttmann, Josef

    2012-09-01

    To study appropriateness of respiratory system compliance calculation using an inflation hold and compare it with ventilator readouts of pressure and tidal volume as well as with measurement of compliance of the respiratory system with the single-breath-single-occlusion technique gained with a standard lung function measurement. Prospective clinical trial. Level III neonatal unit of a university hospital. Sixty-seven newborns, born prematurely or at term, ventilated for a variety of pathologic conditions. A standardized sigh maneuver with a predefined peak inspiratory pressure of 30 cm H2O, termed inspiratory capacity at inflation hold, was applied. Using tidal volume, exhaled from inspiratory pause down to ambient pressure, as displayed by the ventilator, and predefined peak inspiratory pressure, compliance at inspiratory capacity at inflation hold conditions could be calculated as well as ratio of tidal volume and ventilator pressure using tidal volume and differential pressure at baseline ventilator settings: peak inspiratory pressure minus positive end-expiratory pressure. For the whole cohort, the equation for the regression between tidal volume at inspiratory capacity at inflation hold and compliance of the respiratory system was: compliance of the respiratory system = 0.052 * tidal volume at inspiratory capacity at inflation hold - 0.113, and compliance at inspiratory capacity at inflation hold conditions was closely related to the standard lung function measurement method of compliance of the respiratory system (R = 0.958). In contrast, ratio of tidal volume and ventilator pressure per kilogram calculated from the ventilator readouts and displayed against compliance of the respiratory system per kilogram yielded a broad scatter throughout the whole range of compliance; both were only weakly correlated (R = 0.309) and also the regression line was significantly different from the line of identity (p < .05). Peak inspiratory pressure at study entry did not affect the correlation between compliance at inspiratory capacity at inflation hold conditions and compliance of the respiratory system. After a standard sigh maneuver, inspiratory capacity at inflation hold and the derived quantity compliance at inspiratory capacity at inflation hold conditions can be regarded as a valid, accurate, and reliable surrogate measure for standard compliance of the respiratory system in contrast to ratio of tidal volume and ventilator pressure calculated from the ventilator readouts during ongoing mechanical ventilation at respective ventilator settings.

  19. Nationwide summary of US Geological Survey regional regression equations for estimating magnitude and frequency of floods for ungaged sites, 1993

    USGS Publications Warehouse

    Jennings, M.E.; Thomas, W.O.; Riggs, H.C.

    1994-01-01

    For many years, the U.S. Geological Survey (USGS) has been involved in the development of regional regression equations for estimating flood magnitude and frequency at ungaged sites. These regression equations are used to transfer flood characteristics from gaged to ungaged sites through the use of watershed and climatic characteristics as explanatory or predictor variables. Generally, these equations have been developed on a statewide or metropolitan area basis as part of cooperative study programs with specific State Departments of Transportation or specific cities. The USGS, in cooperation with the Federal Highway Administration and the Federal Emergency Management Agency, has compiled all the current (as of September 1993) statewide and metropolitan area regression equations into a micro-computer program titled the National Flood Frequency Program. This program includes regression equations for estimating flood-peak discharges and techniques for estimating a typical flood hydrograph for a given recurrence interval peak discharge for unregulated rural and urban watersheds. These techniques should be useful to engineers and hydrologists for planning and design applications. This report summarizes the statewide regression equations for rural watersheds in each State, summarizes the applicable metropolitan area or statewide regression equations for urban watersheds, describes the National Flood Frequency Program for making these computations, and provides much of the reference information on the extrapolation variables needed to run the program.

  20. Adjusting for overdispersion in piecewise exponential regression models to estimate excess mortality rate in population-based research.

    PubMed

    Luque-Fernandez, Miguel Angel; Belot, Aurélien; Quaresma, Manuela; Maringe, Camille; Coleman, Michel P; Rachet, Bernard

    2016-10-01

    In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer using the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the set of covariates x_i are equal is strong and may fail to account for overdispersion given the variability of the rate parameter (the variance exceeds the mean). Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion, including quasi-likelihood, robust standard error estimation, negative binomial regression and flexible piecewise modelling. All piecewise exponential regression models showed the presence of significant inherent overdispersion (p-value <0.001). However, the flexible piecewise exponential model showed the smallest overdispersion parameter (3.2, versus 21.3 for the non-flexible piecewise exponential models). We showed that there were no major differences between methods. However, using flexible piecewise regression modelling, with either quasi-likelihood or robust standard errors, was the best approach as it deals with both overdispersion due to model misspecification and true or inherent overdispersion.
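
    A sketch of the kind of overdispersion check and correction discussed above: fit a Poisson GLM with an exposure offset, inspect the Pearson dispersion, and refit with a negative binomial family or with robust (quasi-likelihood-style) standard errors. The counts and exposure below are simulated, not the relative-survival data.

```python
# Sketch of detecting and correcting overdispersion in a count/rate model:
# Poisson GLM -> Pearson dispersion -> negative binomial family or robust SEs.
# The counts and exposure are simulated for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 300
x = rng.normal(size=n)
exposure = rng.uniform(50, 500, n)                 # person-time at risk (fake)
mu = exposure * np.exp(-4 + 0.3 * x)
y = rng.negative_binomial(2, 2 / (2 + mu))         # overdispersed counts with mean mu

X = sm.add_constant(x)
pois = sm.GLM(y, X, family=sm.families.Poisson(), exposure=exposure).fit()
dispersion = pois.pearson_chi2 / pois.df_resid
print("Pearson dispersion (about 1 if no overdispersion):", round(dispersion, 2))

# Two possible corrections: a negative binomial family, or robust (sandwich) SEs.
nb = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=1.0),
            exposure=exposure).fit()
robust = sm.GLM(y, X, family=sm.families.Poisson(),
                exposure=exposure).fit(cov_type="HC0")
print("Poisson SE:", np.round(pois.bse, 3))
print("Robust SE :", np.round(robust.bse, 3))
print("NB SE     :", np.round(nb.bse, 3))
```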

  1. Standardization of a traditional polyherbo-mineral formulation - Brahmi vati.

    PubMed

    Mishra, Amrita; Mishra, Arun K; Ghosh, Ashoke K; Jha, Shivesh

    2013-01-01

    The present study deals with standardization of an in-house standard preparation and three marketed samples of Brahmi vati, which is a traditional medicine known to be effective in mental disorders, convulsions, weak memory, high fever and hysteria. Preparation and standardization have been done by following modern scientific quality control procedures for raw material and the finished products. The scanning electron microscopic (SEM) analysis showed the reduction of metals and minerals (particle size range 2-5 µm) which indicates the proper preparation of bhasmas, the important ingredient of Brahmi vati. Findings of EDX analysis of all samples of Brahmi vati suggested the absence of Gold, an important constituent of Brahmi vati in two marketed samples. All the samples of Brahmi vati were subjected to quantitative estimation of Bacoside A (marker compound) by HPTLC technique. Extraction of the samples was done in methanol and the chromatograms were developed in Butanol: Glacial acetic acid: water (4.5:0.5:5 v/v) and detected at 225nm. The regression analysis of calibration plots of Bacoside A exhibited linear relationship in the concentration range of 50-300 ng, while the % recovery was found to be 96.06% w/w, thus proving the accuracy and precision of the analysis. The Bacoside A content in the in-house preparation was found to be higher than that of the commercial samples. The proposed HPTLC method was found to be rapid, simple and accurate for quantitative estimation of Bacoside A in different formulations. The results of this study could be used as a model data in the standardization of Brahmi vati.

  2. Odds per adjusted standard deviation: comparing strengths of associations for risk factors measured on different scales and across diseases and populations.

    PubMed

    Hopper, John L

    2015-11-15

    How can the "strengths" of risk factors, in the sense of how well they discriminate cases from controls, be compared when they are measured on different scales such as continuous, binary, and integer? Given that risk estimates take into account other fitted and design-related factors-and that is how risk gradients are interpreted-so should the presentation of risk gradients. Therefore, for each risk factor X0, I propose using appropriate regression techniques to derive from appropriate population data the best fitting relationship between the mean of X0 and all the other covariates fitted in the model or adjusted for by design (X1, X2, … , Xn). The odds per adjusted standard deviation (OPERA) presents the risk association for X0 in terms of the change in risk per s = standard deviation of X0 adjusted for X1, X2, … , Xn, rather than the unadjusted standard deviation of X0 itself. If the increased risk is relative risk (RR)-fold over A adjusted standard deviations, then OPERA = exp[ln(RR)/A] = RR(s). This unifying approach is illustrated by considering breast cancer and published risk estimates. OPERA estimates are by definition independent and can be used to compare the predictive strengths of risk factors across diseases and populations. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  3. Does the perception of fairness and standard of care in the health system depend on the field of study? Results of an empirical analysis

    PubMed Central

    2014-01-01

    Background The main challenge in the context of health care reforms and priority setting is the establishment and/or maintenance of fairness and standard of care. For the political process and interdisciplinary discussion, the subjective perception of the health care system might even be as important as potential objective criteria. Of special interest are the perceptions of academic disciplines whose representatives act as decision makers in the health care sector. The aim of this study is to explore and compare the subjective perception of fairness and standard of care in the German health care system among students of medicine, law, economics, philosophy, and religion. Methods Between October 2011 and January 2012, we asked freshmen and advanced students of the fields mentioned above to participate in a paper and pencil survey. Prior to this, we formulated hypotheses. The data were analysed by microeconometric regression techniques. Results Data from 1,088 students were included in the study. Medical students, both freshmen and advanced students, perceive the standard of care as significantly better than non-medical students do. Differences in the perception of fairness are not significant between the freshmen of the academic disciplines; however, they increase with the number of study terms. Besides the field of study, further variables such as gender and health status have a significant impact on perceptions. Conclusions Our results show that there are differences in the perception of fairness and standard of care between academic disciplines, which might influence the interdisciplinary discussion on health care reforms and priority setting. PMID:24725356

  4. Forest biomass, canopy structure, and species composition relationships with multipolarization L-band synthetic aperture radar data

    NASA Technical Reports Server (NTRS)

    Sader, Steven A.

    1987-01-01

    The effect of forest biomass, canopy structure, and species composition on L-band synthetic aperture radar data at 44 southern Mississippi bottomland hardwood and pine-hardwood forest sites was investigated. Cross-polarization mean digital values for pine forests were significantly correlated with green weight biomass and stand structure. Multiple linear regression with five forest structure variables provided a better integrated measure of canopy roughness and produced highly significant correlation coefficients for hardwood forests using the HV/VV ratio only. Differences in biomass levels and canopy structure, including branching patterns and vertical canopy stratification, were important sources of volume scatter affecting multipolarization radar data. Standardized correction techniques and calibration of aircraft data, in addition to development of canopy models, are recommended for future investigations of forest biomass and structure using synthetic aperture radar.

  5. Comparative study between derivative spectrophotometry and multivariate calibration as analytical tools applied for the simultaneous quantitation of Amlodipine, Valsartan and Hydrochlorothiazide

    NASA Astrophysics Data System (ADS)

    Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.

    2013-09-01

    Four simple, accurate and specific methods were developed and validated for the simultaneous estimation of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in commercial tablets. The derivative spectrophotometric methods include Derivative Ratio Zero Crossing (DRZC) and Double Divisor Ratio Spectra-Derivative Spectrophotometry (DDRS-DS) methods, while the multivariate calibrations used are Principal Component Regression (PCR) and Partial Least Squares (PLS). The proposed methods were applied successfully in the determination of the drugs in laboratory-prepared mixtures and in commercial pharmaceutical preparations. The validity of the proposed methods was assessed using the standard addition technique. The linearity of the proposed methods was investigated in the range of 2-32, 4-44 and 2-20 μg/mL for AML, VAL and HCT, respectively.

  6. Determination of formaldehyde in Romanian cosmetic products using coupled GC/MS system after SPME extraction

    NASA Astrophysics Data System (ADS)

    Feher, I.; Schmutzer, G.; Voica, C.; Moldovan, Z.

    2013-11-01

    In this study we surveyed some Romanian cosmetic products (shampoo, conditioner, face wash) in order to determine their content of formaldehyde as well as of other substances called "formaldehyde releasers". The analysis was based on solid-phase microextraction (SPME) followed by gas chromatography/mass spectrometry. Prior to SPME extraction, formaldehyde was derivatized with pentafluorophenyl hydrazine. The obtained product was adsorbed on SPME devices, then injected and desorbed into the GC/MS injection port. The concentration of formaldehyde (as the derivatized compound) was calculated using a calibration curve with a regression coefficient of 0.9938. The performance parameters of the method were calculated using samples of standard concentration. The method proved to be sensitive, having a quantification limit (LOQ) of 0.15 μg/g.

  7. SCI model structure determination program (OSR) user's guide. [optimal subset regression

    NASA Technical Reports Server (NTRS)

    1979-01-01

    The computer program OSR (Optimal Subset Regression), which estimates models for rotorcraft body and rotor force and moment coefficients, is described. The technique used is based on the subset regression algorithm. Given time histories of aerodynamic coefficients, aerodynamic variables, and control inputs, the program computes correlations between the various time histories. The model structure determination is based on these correlations. Inputs and outputs of the program are given.

  8. Robust regression on noisy data for fusion scaling laws

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verdoolaege, Geert, E-mail: geert.verdoolaege@ugent.be; Laboratoire de Physique des Plasmas de l'ERM - Laboratorium voor Plasmafysica van de KMS

    2014-11-15

    We introduce the method of geodesic least squares (GLS) regression for estimating fusion scaling laws. Based on straightforward principles, the method is easily implemented, yet it clearly outperforms established regression techniques, particularly in cases of significant uncertainty on both the response and predictor variables. We apply GLS for estimating the scaling of the L-H power threshold, resulting in estimates for ITER that are somewhat higher than predicted earlier.

  9. An Investigation of the Fit of Linear Regression Models to Data from an SAT[R] Validity Study. Research Report 2011-3

    ERIC Educational Resources Information Center

    Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael

    2011-01-01

    This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…

  10. Estimating selected low-flow frequency statistics and harmonic-mean flows for ungaged, unregulated streams in Indiana

    USGS Publications Warehouse

    Martin, Gary R.; Fowler, Kathleen K.; Arihood, Leslie D.

    2016-09-06

    Information on low-flow characteristics of streams is essential for the management of water resources. This report provides equations for estimating the 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years and the harmonic-mean flow at ungaged, unregulated stream sites in Indiana. These equations were developed using the low-flow statistics and basin characteristics for 108 continuous-record streamgages in Indiana with at least 10 years of daily mean streamflow data through the 2011 climate year (April 1 through March 31). The equations were developed in cooperation with the Indiana Department of Environmental Management. Regression techniques were used to develop the equations for estimating low-flow frequency statistics and the harmonic-mean flows on the basis of drainage-basin characteristics. A geographic information system was used to measure basin characteristics for selected streamgages. A final set of 25 basin characteristics measured at all the streamgages was evaluated to choose the best predictors of the low-flow statistics. Logistic-regression equations applicable statewide are presented for estimating the probability that selected low-flow frequency statistics equal zero. These equations use the explanatory variables total drainage area, average transmissivity of the full thickness of the unconsolidated deposits within 1,000 feet of the stream network, and latitude of the basin outlet. The percentage of the streamgage low-flow statistics correctly classified as zero or nonzero using the logistic-regression equations ranged from 86.1 to 88.9 percent. Generalized-least-squares regression equations applicable statewide for estimating nonzero low-flow frequency statistics use total drainage area, the average hydraulic conductivity of the top 70 feet of unconsolidated deposits, the slope of the basin, and the index of permeability and thickness of the Quaternary surficial sediments as explanatory variables. The average standard error of prediction of these regression equations ranges from 55.7 to 61.5 percent. Regional weighted-least-squares regression equations were developed for estimating the harmonic-mean flows by dividing the State into three low-flow regions. The Northern region uses total drainage area and the average transmissivity of the entire thickness of unconsolidated deposits as explanatory variables. The Central region uses total drainage area, the average hydraulic conductivity of the entire thickness of unconsolidated deposits, and the index of permeability and thickness of the Quaternary surficial sediments. The Southern region uses total drainage area and the percent of the basin covered by forest. The average standard error of prediction for these equations ranges from 39.3 to 66.7 percent. The regional regression equations are applicable only to stream sites with low flows unaffected by regulation and to stream sites with drainage-basin characteristic values within specified limits. Caution is advised when applying the equations for basins with characteristics near the applicable limits, for basins with karst drainage features, and for urbanized basins. Extrapolations near and beyond the applicable basin characteristic limits will have unknown errors that may be large. Equations are presented for use in estimating the 90-percent prediction interval of the low-flow statistics estimated by use of the regression equations at a given stream site. The regression equations are to be incorporated into the U.S. Geological Survey StreamStats Web-based application for Indiana. StreamStats allows users to select a stream site on a map and automatically measure the needed basin characteristics and compute the estimated low-flow statistics and associated prediction intervals.
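
    The two-stage estimation described above (a logistic-regression screen for zero flows followed by a regression on the nonzero statistics) can be illustrated with a brief sketch. The data, variable names, and the use of ordinary least squares in place of generalized least squares are simplifying assumptions, not the report's actual equations or coefficients.

```python
# Hypothetical sketch of the two-stage approach described above:
# (1) logistic regression estimates the probability that a low-flow statistic is zero;
# (2) a log-log least-squares regression (a stand-in for the report's
#     generalized-least-squares fit) estimates the nonzero magnitudes.
# All data and variable names are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 108                                        # number of streamgages (illustrative)
drainage_area = rng.lognormal(4.0, 1.0, n)     # square miles
transmissivity = rng.lognormal(6.0, 0.5, n)    # feet squared per day
latitude = rng.uniform(37.8, 41.8, n)          # basin-outlet latitude

X = np.column_stack([np.log(drainage_area), np.log(transmissivity), latitude])
is_zero = rng.random(n) < 0.2                  # stand-in for observed zero 7Q10 flows
q7_10 = np.where(is_zero, 0.0, 0.01 * drainage_area * rng.lognormal(0.0, 0.3, n))

# Stage 1: probability that the low-flow statistic equals zero.
clf = LogisticRegression().fit(X, is_zero)
print("fraction classified correctly:", clf.score(X, is_zero))

# Stage 2: magnitude of the nonzero statistics (log-log model).
nonzero = ~is_zero
reg = LinearRegression().fit(X[nonzero], np.log(q7_10[nonzero]))
print("predicted 7Q10 at one site:", float(np.exp(reg.predict(X[nonzero][:1]))[0]))
```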

  11. Palus Somni - Anomalies in the correlation of Al/Si X-ray fluorescence intensity ratios and broad-spectrum visible albedos. [lunar surface mineralogy]

    NASA Technical Reports Server (NTRS)

    Clark, P. E.; Andre, C. G.; Adler, I.; Weidner, J.; Podwysocki, M.

    1976-01-01

    The positive correlation between Al/Si X-ray fluorescence intensity ratios determined during the Apollo 15 lunar mission and a broad-spectrum visible albedo of the moon is quantitatively established. Linear regression analysis performed on 246 one-degree geographic cells of X-ray fluorescence intensity and visible albedo data produced a statistically significant correlation coefficient of 0.78. Three distinct distributions of data were identified as (1) within one standard deviation of the regression line, (2) greater than one standard deviation below the line, and (3) greater than one standard deviation above the line. The latter two distributions of data were found to occupy distinct geographic areas in the Palus Somni region.
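
    The residual-based grouping used here can be sketched as follows: fit a line to the Al/Si ratio versus albedo data and split cells by whether their residual lies within, below, or above one standard deviation of the fit. The values below are synthetic stand-ins for the 246 one-degree cells, not the Apollo 15 data.

```python
# Illustrative sketch of the grouping described above, using synthetic data.
import numpy as np

rng = np.random.default_rng(1)
al_si = rng.uniform(0.3, 0.8, 246)                        # hypothetical cell values
albedo = 0.10 + 0.15 * al_si + rng.normal(0, 0.01, 246)   # synthetic albedo

slope, intercept = np.polyfit(al_si, albedo, 1)           # least-squares line
residuals = albedo - (slope * al_si + intercept)
sd = residuals.std(ddof=1)

groups = ["below line (> 1 s.d.)" if r < -sd
          else "above line (> 1 s.d.)" if r > sd
          else "within 1 s.d." for r in residuals]

print("correlation coefficient:", round(np.corrcoef(al_si, albedo)[0, 1], 2))
print({g: groups.count(g) for g in set(groups)})
```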

  12. Application of nonlinear least-squares regression to ground-water flow modeling, west-central Florida

    USGS Publications Warehouse

    Yobbi, D.K.

    2000-01-01

    A nonlinear least-squares regression technique for estimation of ground-water flow model parameters was applied to an existing model of the regional aquifer system underlying west-central Florida. The regression technique minimizes the differences between measured and simulated water levels. Regression statistics, including parameter sensitivities and correlations, were calculated for reported parameter values in the existing model. Optimal parameter values for selected hydrologic variables of interest were estimated by nonlinear regression. Optimal estimates of parameter values ranged from about 140 times greater than to about 0.01 times the reported values. Independently estimating all parameters by nonlinear regression was impossible, given the existing zonation structure and number of observations, because of parameter insensitivity and correlation. Although the model yields parameter values similar to those estimated by other methods and reproduces the measured water levels reasonably accurately, a simpler parameter structure should be considered. Some possible ways of improving model calibration are to: (1) modify the defined parameter-zonation structure by omitting and/or combining parameters to be estimated; (2) carefully eliminate observation data based on evidence that they are likely to be biased; (3) collect additional water-level data; (4) assign values to insensitive parameters; and (5) estimate the most sensitive parameters first, then, using the optimized values for these parameters, estimate the entire data set.
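
    The calibration idea (adjust model parameters until simulated water levels match measured ones in a least-squares sense) can be sketched with scipy's least_squares. The toy head function, parameter names, and data below are assumptions for illustration; they are not the west-central Florida model.

```python
# Minimal sketch of model calibration by nonlinear least squares: minimize the
# residuals between measured and simulated water levels.  The "model" is a toy
# stand-in for a ground-water flow simulation, not the actual regional model.
import numpy as np
from scipy.optimize import least_squares

def simulate_heads(params, x):
    """Toy head profile: a regional level minus a cone of depression."""
    depth, extent = params
    return 100.0 - depth * np.exp(-x / extent)

x_obs = np.linspace(0.0, 1.0, 25)
measured = simulate_heads((5.0, 0.3), x_obs) + np.random.default_rng(2).normal(0, 0.05, 25)

def residuals(params):
    return measured - simulate_heads(params, x_obs)

fit = least_squares(residuals, x0=[1.0, 1.0], bounds=([0.0, 1e-3], [np.inf, np.inf]))
print("estimated parameters:", fit.x)
```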

  13. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    EPA Science Inventory

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  14. Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

    NASA Astrophysics Data System (ADS)

    Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

    2009-08-01

    In this study two techniques for modeling electricity consumption of the Jordanian industrial sector are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as a function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have a significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, a comparison based on the square root of the average squared error suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
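
    The model comparison described above can be sketched as follows: fit a multivariate linear regression and a second model to the same data and compare their root-mean-square errors on held-out observations. A gradient-boosting regressor stands in for the neuro-fuzzy model, which is not reproduced here, and all data are synthetic.

```python
# Hedged sketch of the comparison above: multivariate linear regression versus
# a second model (gradient boosting as a stand-in for the neuro-fuzzy model),
# scored by root-mean-square error on held-out synthetic data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 200
production = rng.uniform(50, 150, n)     # production-output index
capacity = rng.uniform(0.5, 1.0, n)      # capacity utilization
tariff = rng.uniform(0.05, 0.15, n)      # electricity tariff
X = np.column_stack([production, capacity, tariff])
consumption = 2.0 * production + 300 * capacity - 500 * tariff + rng.normal(0, 10, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, consumption, random_state=0)
for model in (LinearRegression(), GradientBoostingRegressor(random_state=0)):
    model.fit(X_tr, y_tr)
    rmse = np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2))
    print(f"{type(model).__name__}: RMSE = {rmse:.1f}")
```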

  15. Longitudinal Monitoring of Antibody Responses against Tumor Cells Using Magneto-nanosensors with a Nanoliter of Blood.

    PubMed

    Lee, Jung-Rok; Chan, Carmel T; Ruderman, Daniel; Chuang, Hui-Yen; Gaster, Richard S; Atallah, Michelle; Mallick, Parag; Lowe, Scott W; Gambhir, Sanjiv S; Wang, Shan X

    2017-11-08

    Each immunoglobulin isotype has unique immune effector functions. The contribution of these functions in the elimination of pathogens and tumors can be determined by monitoring quantitative temporal changes in isotype levels. Here, we developed a novel technique using magneto-nanosensors based on the effect of giant magnetoresistance (GMR) for longitudinal monitoring of total and antigen-specific isotype levels with high precision, using as little as 1 nL of serum. Combining in vitro serologic measurements with in vivo imaging techniques, we investigated the role of the antibody response in the regression of firefly luciferase (FL)-labeled lymphoma cells in spleen, kidney, and lymph nodes in a syngeneic Burkitt's lymphoma mouse model. Regression status was determined by whole body bioluminescent imaging (BLI). The magneto-nanosensors revealed that anti-FL IgG2a and total IgG2a were elevated and sustained in regression mice compared to non-regression mice (p < 0.05). This platform shows promise for monitoring immunotherapy, vaccination, and autoimmunity.

  16. Regression equations for estimating flood flows for the 2-, 10-, 25-, 50-, 100-, and 500-Year recurrence intervals in Connecticut

    USGS Publications Warehouse

    Ahearn, Elizabeth A.

    2004-01-01

    Multiple linear-regression equations were developed to estimate the magnitudes of floods in Connecticut for recurrence intervals ranging from 2 to 500 years. The equations can be used for nonurban, unregulated stream sites in Connecticut with drainage areas ranging from about 2 to 715 square miles. Flood-frequency data and hydrologic characteristics from 70 streamflow-gaging stations and the upstream drainage basins were used to develop the equations. The hydrologic characteristics (drainage area, mean basin elevation, and 24-hour rainfall) are used in the equations to estimate the magnitude of floods. Average standard errors of prediction for the equations are 31.8, 32.7, 34.4, 35.9, 37.6, and 45.0 percent for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals, respectively. Simplified equations using only one hydrologic characteristic, drainage area, also were developed. The regression analysis is based on generalized least-squares regression techniques. Observed flows (log-Pearson Type III analysis of the annual maximum flows) from five streamflow-gaging stations in urban basins in Connecticut were compared to flows estimated from national three-parameter and seven-parameter urban regression equations. The comparison shows that the three- and seven-parameter equations used in conjunction with the new statewide equations generally provide reasonable estimates of flood flows for urban sites in Connecticut, although a national urban flood-frequency study indicated that the three-parameter equations significantly underestimated flood flows in many regions of the country. Verification of the accuracy of the three-parameter or seven-parameter national regression equations using new data from Connecticut stations was beyond the scope of this study. A technique for calculating flood flows at streamflow-gaging stations using a weighted average also is described. Two estimates of flood flows, one based on the log-Pearson Type III analyses of the annual maximum flows at the gaging station and the other from the regression equation, are weighted together based on the years of record at the gaging station and the equivalent years of record value determined from the regression. Weighted averages of flood flows for the 2-, 10-, 25-, 50-, 100-, and 500-year recurrence intervals are tabulated for the 70 streamflow-gaging stations used in the regression analysis. Generally, weighted averages give the most accurate estimate of flood flows at gaging stations. An evaluation of Connecticut's streamflow-gaging network was performed to determine whether the spatial coverage and range of geographic and hydrologic conditions are adequately represented for transferring flood characteristics from gaged to ungaged sites. Fifty-one of 54 stations in the current (2004) network support one or more flood needs of federal, state, and local agencies. Twenty-five of 54 stations in the current network are considered high-priority stations by the U.S. Geological Survey because of their contribution to the long-term understanding of floods and their application for regional flood analysis. Enhancements to the network to improve overall effectiveness for regionalization can be made by increasing the spatial coverage of gaging stations, establishing stations in regions of the state that are not well represented, and adding stations in basins with drainage area sizes not represented. Additionally, the usefulness of the network for characterizing floods can be maintained and improved by continuing operation at the current stations because flood flows can be more accurately estimated at stations with continuous, long-term records.
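
    The record-length weighting described above can be sketched as a log-space average of the station (log-Pearson Type III) estimate and the regression estimate, with weights proportional to the years of record and the equivalent years of record. This follows the common USGS convention; the exact weighting formula and the values below are assumptions, not taken from the report.

```python
# Sketch of combining a station flood estimate with a regression estimate,
# weighting by years of record and equivalent years of record (illustrative
# convention; the report's exact formula is not reproduced here).
import math

def weighted_flood_estimate(q_station, years_station, q_regression, years_equivalent):
    """Log-space weighted average of two flood-flow estimates."""
    total = years_station + years_equivalent
    log_q = (years_station * math.log10(q_station)
             + years_equivalent * math.log10(q_regression)) / total
    return 10 ** log_q

# Hypothetical 100-year peak-flow estimates at one gaging station (cubic feet per second):
print(weighted_flood_estimate(q_station=5200.0, years_station=45,
                              q_regression=4700.0, years_equivalent=10))
```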

  17. Methods for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma

    USGS Publications Warehouse

    Esralew, Rachel A.; Smith, S. Jerrod

    2010-01-01

    Flow statistics can be used to provide decision makers with surface-water information needed for activities such as water-supply permitting, flow regulation, and other water rights issues. Flow statistics could be needed at any location along a stream. Most often, streamflow statistics are needed at ungaged sites, where no flow data are available to compute the statistics. Methods are presented in this report for estimating flow-duration and annual mean-flow statistics for ungaged streams in Oklahoma. Flow statistics included the (1) annual (period of record), (2) seasonal (summer-autumn and winter-spring), and (3) 12 monthly duration statistics, including the 20th, 50th, 80th, 90th, and 95th percentile flow exceedances, and the annual mean-flow (mean of daily flows for the period of record). Flow statistics were calculated from daily streamflow information collected from 235 streamflow-gaging stations throughout Oklahoma and areas in adjacent states. A drainage-area ratio method is the preferred method for estimating flow statistics at an ungaged location that is on a stream near a gage. The method generally is reliable only if the drainage-area ratio of the two sites is between 0.5 and 1.5. Regression equations that relate flow statistics to drainage-basin characteristics were developed for the purpose of estimating selected flow-duration and annual mean-flow statistics for ungaged streams that are not near gaging stations on the same stream. Regression equations were developed from flow statistics and drainage-basin characteristics for 113 unregulated gaging stations. Separate regression equations were developed by using U.S. Geological Survey streamflow-gaging stations in regions with similar drainage-basin characteristics. These regional equations can increase the accuracy of estimates of flow-duration and annual mean-flow statistics at ungaged stream locations in Oklahoma. Streamflow-gaging stations were grouped by selected drainage-basin characteristics by using a k-means cluster analysis. Three regions were identified for Oklahoma on the basis of the clustering of gaging stations and a manual delineation of distinguishable hydrologic and geologic boundaries: Region 1 (western Oklahoma excluding the Oklahoma and Texas Panhandles), Region 2 (north- and south-central Oklahoma), and Region 3 (eastern and central Oklahoma). A total of 228 regression equations (225 flow-duration regressions and three annual mean-flow regressions) were developed using ordinary least-squares and left-censored (Tobit) multiple-regression techniques. These equations can be used to estimate 75 flow-duration statistics and annual mean-flow for ungaged streams in the three regions. Drainage-basin characteristics that were statistically significant independent variables in the regression analyses were (1) contributing drainage area; (2) station elevation; (3) mean drainage-basin elevation; (4) channel slope; (5) percentage of forested canopy; (6) mean drainage-basin hillslope; (7) soil permeability; and (8) mean annual, seasonal, and monthly precipitation. The accuracy of flow-duration regression equations generally decreased from high-flow exceedance (low-exceedance probability) to low-flow exceedance (high-exceedance probability). This decrease may occur because greater uncertainty exists for low-flow estimates and low flow is largely affected by localized geology that was not quantified by the drainage-basin characteristics selected. 
The standard errors of estimate of regression equations for Region 1 (western Oklahoma) were substantially larger than those for the other regions, especially for low-flow exceedances. These errors may be a result of greater variability in low flow because of increased irrigation activities in this region. Regression equations may not be reliable for sites where the drainage-basin characteristics are outside the range of values of the independent variables.
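
    The drainage-area ratio method mentioned above, including the 0.5 to 1.5 applicability check, can be sketched in a few lines. The function name and example values are illustrative assumptions.

```python
# Minimal sketch of the drainage-area ratio method: scale a flow statistic from
# a nearby gage on the same stream by the ratio of drainage areas, and flag the
# estimate when the ratio falls outside the 0.5-1.5 range noted above.
def drainage_area_ratio_estimate(q_gage, area_gage, area_ungaged):
    ratio = area_ungaged / area_gage
    if not 0.5 <= ratio <= 1.5:
        raise ValueError(f"area ratio {ratio:.2f} is outside 0.5-1.5; "
                         "use the regional regression equations instead")
    return q_gage * ratio

# Hypothetical median (50-percent exceedance) daily flow transferred to an ungaged site:
print(drainage_area_ratio_estimate(q_gage=120.0, area_gage=450.0, area_ungaged=380.0))
```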

  18. 48 CFR 9904.401-50 - Techniques for application.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    .... 9904.401-50 Section 9904.401-50 Federal Acquisition Regulations System COST ACCOUNTING STANDARDS BOARD, OFFICE OF FEDERAL PROCUREMENT POLICY, OFFICE OF MANAGEMENT AND BUDGET PROCUREMENT PRACTICES AND COST ACCOUNTING STANDARDS COST ACCOUNTING STANDARDS 9904.401-50 Techniques for application. (a) The standard...

  19. Therapeutic endorsement enhances compliance with national glaucoma guidelines in Australian and New Zealand optometrists.

    PubMed

    Zangerl, Barbara; Hayen, Andrew; Mitchell, Paul; Jamous, Khalid F; Stapleton, Fiona; Kalloniatis, Michael

    2015-03-01

    Previous studies confirmed that optometrists have access to and confidence in applying clinical tests recommended for glaucoma assessment. Less is known about the factors that best predict compliance with national clinical guidelines and thus, by inference, the provision of suitable care by primary care ophthalmic practitioners. We utilised the unique two-tiered profession (therapeutic and non-therapeutic scope of practice) in Australia and New Zealand to assess the prospective adherence to glaucoma guidelines dependent on the clinician's background. Australian and New Zealand optometrists were surveyed on ophthalmic techniques for glaucoma assessment, criteria for the evaluation of the optic nerve head, glaucoma risk categories and review times while also recording background, training, and experience. Parameters identifying progression/conversion and patients' risk levels were analysed against ophthalmologists' opinions. Linear regression analysis identified variables significantly improving the likelihood of concordance with guidelines. Reported application of techniques complied well with glaucoma guidelines, although gonioscopy and pachymetry, pupil dilation for optic nerve head examination, and acquisition of permanent records were less frequently employed. The main predictors for entry-level diagnostic standards were therapeutic endorsement together with the associated knowledge of relevant guidance and procedural confidence. Other findings suggested a potential underestimation of the value of optic disc size and intraocular pressure for the prediction of glaucoma risk, while optometrists more frequently relied on the outcomes of non-standardised automated perimetry and auxiliary imaging. Optometrists in Australia and New Zealand may not always exercise optimal clinical acumen regarding techniques/criteria for glaucoma diagnosis. Therapeutic endorsement has been adopted gradually in different jurisdictions in various forms since 1999 and has been mandatory for registration since late 2014. The results from the two-tiered optometric cohorts suggest that inclusion of therapeutic training as part of the core curriculum is likely a key factor in enhanced compliance with glaucoma guidelines. Improved adherence to current clinical standards should have a positive impact on appropriate glaucoma diagnosis and management. Obligatory knowledge and possibly accreditation of available guidelines might ensure a uniform standard in glaucoma testing protocols in concordance with compulsory entry-level skills. © 2015 The Authors Ophthalmic & Physiological Optics © 2015 The College of Optometrists.

  20. Running Technique is an Important Component of Running Economy and Performance

    PubMed Central

    FOLLAND, JONATHAN P.; ALLEN, SAM J.; BLACK, MATTHEW I.; HANDSAKER, JOSEPH C.; FORRESTER, STEPHANIE E.

    2017-01-01

    ABSTRACT Despite an intuitive relationship between technique and both running economy (RE) and performance, and the diverse techniques used by runners to achieve forward locomotion, the objective importance of overall technique and the key components therein remain to be elucidated. Purpose This study aimed to determine the relationship between individual and combined kinematic measures of technique with both RE and performance. Methods Ninety-seven endurance runners (47 females) of diverse competitive standards performed a discontinuous protocol of incremental treadmill running (4-min stages, 1-km·h−1 increments). Measurements included three-dimensional full-body kinematics, respiratory gases to determine energy cost, and velocity of lactate turn point. Five categories of kinematic measures (vertical oscillation, braking, posture, stride parameters, and lower limb angles) and locomotory energy cost (LEc) were averaged across 10–12 km·h−1 (the highest common velocity < velocity of lactate turn point). Performance was measured as season's best (SB) time converted to a sex-specific z-score. Results Numerous kinematic variables were correlated with RE and performance (LEc, 19 variables; SB time, 11 variables). Regression analysis found three variables (pelvis vertical oscillation during ground contact normalized to height, minimum knee joint angle during ground contact, and minimum horizontal pelvis velocity) explained 39% of LEc variability. In addition, four variables (minimum horizontal pelvis velocity, shank touchdown angle, duty factor, and trunk forward lean) combined to explain 31% of the variability in performance (SB time). Conclusions This study provides novel and robust evidence that technique explains a substantial proportion of the variance in RE and performance. We recommend that runners and coaches are attentive to specific aspects of stride parameters and lower limb angles in part to optimize pelvis movement, and ultimately enhance performance. PMID:28263283

  1. When continuous observations just won't do: developing accurate and efficient sampling strategies for the laying hen.

    PubMed

    Daigle, Courtney L; Siegford, Janice M

    2014-03-01

    Continuous observation is the most accurate way to determine animals' actual time budget and can provide a 'gold standard' representation of resource use, behavior frequency, and duration. Continuous observation is useful for capturing behaviors that are of short duration or occur infrequently. However, collecting continuous data is labor intensive and time consuming, making multiple individual or long-term data collection difficult. Six non-cage laying hens were video recorded for 15 h and behavioral data collected every 2 s were compared with data collected using scan sampling intervals of 5, 10, 15, 30, and 60 min and subsamples of 2 second observations performed for 10 min every 30 min, 15 min every 1 h, 30 min every 1.5 h, and 15 min every 2 h. Three statistical approaches were used to provide a comprehensive analysis to examine the quality of the data obtained via different sampling methods. General linear mixed models identified how the time budget from the sampling techniques differed from continuous observation. Correlation analysis identified how strongly results from the sampling techniques were associated with those from continuous observation. Regression analysis identified how well the results from the sampling techniques were associated with those from continuous observation, changes in magnitude, and whether a sampling technique had bias. Static behaviors were well represented with scan and time sampling techniques, while dynamic behaviors were best represented with time sampling techniques. Methods for identifying an appropriate sampling strategy based upon the type of behavior of interest are outlined and results for non-caged laying hens are presented. Copyright © 2013 Elsevier B.V. All rights reserved.

  2. Methods for estimating selected low-flow frequency statistics for unregulated streams in Kentucky

    USGS Publications Warehouse

    Martin, Gary R.; Arihood, Leslie D.

    2010-01-01

    This report provides estimates of, and presents methods for estimating, selected low-flow frequency statistics for unregulated streams in Kentucky, including the 30-day mean low flows for recurrence intervals of 2 and 5 years (30Q2 and 30Q5) and the 7-day mean low flows for recurrence intervals of 2, 10, and 20 years (7Q2, 7Q10, and 7Q20). Estimates of these statistics are provided for 121 U.S. Geological Survey streamflow-gaging stations with data through the 2006 climate year, which is the 12-month period ending March 31 of each year. Data were screened to identify the periods of homogeneous, unregulated flows for use in the analyses. Logistic-regression equations are presented for estimating the annual probability of the selected low-flow frequency statistics being equal to zero. Weighted-least-squares regression equations were developed for estimating the magnitude of the nonzero 30Q2, 30Q5, 7Q2, 7Q10, and 7Q20 low flows. Three low-flow regions were defined for estimating the 7-day low-flow frequency statistics. The explicit explanatory variables in the regression equations include total drainage area and the mapped streamflow-variability index measured from a revised statewide coverage of this characteristic. The percentage of the station low-flow statistics correctly classified as zero or nonzero by use of the logistic-regression equations ranged from 87.5 to 93.8 percent. The average standard errors of prediction of the weighted-least-squares regression equations ranged from 108 to 226 percent. The 30Q2 regression equations have the smallest standard errors of prediction, and the 7Q20 regression equations have the largest standard errors of prediction. The regression equations are applicable only to stream sites with low flows unaffected by regulation from reservoirs and local diversions of flow and to drainage basins in specified ranges of basin characteristics. Caution is advised when applying the equations for basins with characteristics near the applicable limits and for basins with karst drainage features.

  3. Income or living standard and health in Germany: different ways of measurement of relative poverty with regard to self-rated health.

    PubMed

    Pfoertner, Timo-Kolja; Andress, Hans-Juergen; Janssen, Christian

    2011-08-01

    This study introduces the living standard concept as an alternative approach to measuring poverty and compares its explanatory power to an income-based poverty measure with regard to subjective health status of the German population. Analyses are based on the German Socio-Economic Panel (2001, 2003 and 2005) and refer to binary logistic regressions of poor subjective health status with regard to each poverty condition, its duration, and its causal influence from a previous time point. To assess the discriminative power of both poverty indicators, the indicators were initially considered separately in regression models and subsequently both were included simultaneously. The analyses reveal a stronger poverty-health relationship for the living standard indicator. An inadequate living standard in 2005, longer spells of an inadequate living standard between 2001, 2003 and 2005, and an inadequate living standard at a previous time point are more strongly associated with poor subjective health than income poverty is. Our results challenge conventional measurements of the relationship between poverty and health, which has probably been underestimated by income measures so far.

  4. Modeling and managing risk early in software development

    NASA Technical Reports Server (NTRS)

    Briand, Lionel C.; Thomas, William M.; Hetmanski, Christopher J.

    1993-01-01

    In order to improve the quality of the software development process, we need to be able to build empirical multivariate models based on data collectable early in the software process. These models need to be both useful for prediction and easy to interpret, so that remedial actions may be taken in order to control and optimize the development process. We present an automated modeling technique which can be used as an alternative to regression techniques. We show how it can be used to facilitate the identification and aid the interpretation of the significant trends which characterize 'high risk' components in several Ada systems. Finally, we evaluate the effectiveness of our technique based on a comparison with logistic regression based models.

  5. Stochastic subset selection for learning with kernel machines.

    PubMed

    Rhinelander, Jason; Liu, Xiaoping P

    2012-06-01

    Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales superlinearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm uses a stochastic indexing technique to select a subset of the SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.
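
    The core idea (evaluate the kernel expansion over a randomly selected subset of the stored support vectors so that prediction cost scales with the subset size) can be sketched as below. This is a schematic of the general concept, not the authors' algorithm; the rescaling of the truncated sum is an illustrative choice.

```python
# Schematic sketch: evaluate f(x) = sum_i alpha_i * k(x, sv_i) + b over a random
# subset of support vectors instead of all of them.  Illustration only.
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def subset_kernel_expansion(x, support_vectors, alphas, bias, subset_size, rng):
    idx = rng.choice(len(support_vectors), size=subset_size, replace=False)
    # Rescale so the truncated sum stays comparable to the full expansion.
    scale = len(support_vectors) / subset_size
    return scale * sum(alphas[i] * rbf_kernel(x, support_vectors[i]) for i in idx) + bias

rng = np.random.default_rng(4)
support_vectors = rng.normal(size=(200, 3))   # stored SVs
alphas = rng.normal(size=200)                 # their coefficients
x_new = rng.normal(size=3)
print(subset_kernel_expansion(x_new, support_vectors, alphas, bias=0.1,
                              subset_size=40, rng=rng))
```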

  6. Sampling techniques for thrips (Thysanoptera: Thripidae) in preflowering tomato.

    PubMed

    Joost, P Houston; Riley, David G

    2004-08-01

    Sampling techniques for thrips (Thysanoptera: Thripidae) were compared in preflowering tomato plants at the Coastal Plain Experiment Station in Tifton, GA, in 2000 and 2003, to identify the most effective method for determining the abundance of thrips on tomato foliage early in the growing season. Three relative sampling techniques, including a standard insect aspirator, a 946-ml beat cup, and an insect vacuum device, were compared with an absolute method for accuracy and with one another for precision and efficiency of sampling thrips. Thrips counts of all relative sampling methods were highly correlated (R > 0.92) with the absolute method. The aspirator method was the most accurate compared with the absolute sample according to regression analysis in 2000. In 2003, all sampling methods were considered accurate according to Dunnett's test, but thrips numbers were lower and sample variation was greater than in 2000. In 2000, the beat cup method had the lowest relative variation (RV), or best precision, at 1 and 8 d after transplant (DAT). Only the beat cup method had RV values <25 for all sampling dates. In 2003, the beat cup method had the lowest RV value at 15 and 21 DAT. The beat cup method also was the most efficient method for all sample dates in both years. Frankliniella fusca (Pergande) was the most abundant thrips species on the foliage of preflowering tomato in both years of study at this location. Overall, the best thrips sampling technique tested was the beat cup method in terms of precision and sampling efficiency.

  7. Rapid estimation of nutritional elements on citrus leaves by near infrared reflectance spectroscopy.

    PubMed

    Galvez-Sola, Luis; García-Sánchez, Francisco; Pérez-Pérez, Juan G; Gimeno, Vicente; Navarro, Josefa M; Moral, Raul; Martínez-Nicolás, Juan J; Nieves, Manuel

    2015-01-01

    Sufficient nutrient application is one of the most important factors in producing quality citrus fruits. One of the main guides in planning citrus fertilizer programs is directly monitoring the plant nutrient content. However, this requires analysis of a large number of leaf samples using expensive and time-consuming chemical techniques. Over the last 5 years, it has been demonstrated that it is possible to quantitatively estimate certain nutritional elements in citrus leaves from spectral reflectance values obtained by near infrared reflectance spectroscopy (NIRS). This technique is rapid, non-destructive, cost-effective and environmentally friendly. Therefore, the estimation of macro- and micronutrients in citrus leaves by this method would be beneficial in identifying the mineral status of the trees. However, to be used effectively, NIRS must be evaluated against the standard techniques across different cultivars. In this study, NIRS spectral analysis, and subsequent nutrient estimations for N, K, Ca, Mg, B, Fe, Cu, Mn, and Zn concentrations, were performed using 217 leaf samples from different citrus tree species. Partial least squares regression and different pre-processing signal treatments were used to generate the best estimation against the current best-practice techniques. A high proficiency was verified in the estimation of N (Rv = 0.99) and Ca (Rv = 0.98), and acceptable estimations were achieved for K, Mg, Fe, and Zn. However, no successful calibrations were obtained for the estimation of B, Cu, and Mn.
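
    A calibration of this kind can be sketched with partial least squares regression on preprocessed spectra. The synthetic spectra, the standard-normal-variate preprocessing, the number of components, and the leaf-nitrogen target below are illustrative assumptions, not the study's data or settings.

```python
# Hedged sketch of a NIRS calibration: partial least squares regression relating
# preprocessed reflectance spectra to a synthetic leaf-nitrogen concentration.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_samples, n_wavelengths = 217, 350
spectra = rng.normal(1.0, 0.05, (n_samples, n_wavelengths)).cumsum(axis=1)

# Standard normal variate: center and scale each spectrum individually.
snv = (spectra - spectra.mean(axis=1, keepdims=True)) / spectra.std(axis=1, keepdims=True)

# Synthetic nitrogen values tied to one preprocessed band (illustration only).
nitrogen = 2.5 + 0.8 * snv[:, 120] + rng.normal(0, 0.1, n_samples)

pls = PLSRegression(n_components=8)
r2_scores = cross_val_score(pls, snv, nitrogen, cv=5, scoring="r2")
print("cross-validated R^2:", round(r2_scores.mean(), 2))
```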

  8. Rapid estimation of nutritional elements on citrus leaves by near infrared reflectance spectroscopy

    PubMed Central

    Galvez-Sola, Luis; García-Sánchez, Francisco; Pérez-Pérez, Juan G.; Gimeno, Vicente; Navarro, Josefa M.; Moral, Raul; Martínez-Nicolás, Juan J.; Nieves, Manuel

    2015-01-01

    Sufficient nutrient application is one of the most important factors in producing quality citrus fruits. One of the main guides in planning citrus fertilizer programs is directly monitoring the plant nutrient content. However, this requires analysis of a large number of leaf samples using expensive and time-consuming chemical techniques. Over the last 5 years, it has been demonstrated that it is possible to quantitatively estimate certain nutritional elements in citrus leaves from spectral reflectance values obtained by near infrared reflectance spectroscopy (NIRS). This technique is rapid, non-destructive, cost-effective and environmentally friendly. Therefore, the estimation of macro- and micronutrients in citrus leaves by this method would be beneficial in identifying the mineral status of the trees. However, to be used effectively, NIRS must be evaluated against the standard techniques across different cultivars. In this study, NIRS spectral analysis, and subsequent nutrient estimations for N, K, Ca, Mg, B, Fe, Cu, Mn, and Zn concentrations, were performed using 217 leaf samples from different citrus tree species. Partial least squares regression and different pre-processing signal treatments were used to generate the best estimation against the current best-practice techniques. A high proficiency was verified in the estimation of N (Rv = 0.99) and Ca (Rv = 0.98), and acceptable estimations were achieved for K, Mg, Fe, and Zn. However, no successful calibrations were obtained for the estimation of B, Cu, and Mn. PMID:26257767

  9. The Impact of Adherence and Instillation Proficiency of Topical Glaucoma Medications on Intraocular Pressure

    PubMed Central

    Shibeshi, Workineh; T. Giorgis, Abeba; Asgedom, Solomon Weldegebreal

    2017-01-01

    Background The possible sequelae of poorly controlled intraocular pressure (IOP) include treatment failure, unnecessary medication use, and economic burden on patients with glaucoma. Objective To assess the impact of adherence and instillation technique on IOP control. Methods A cross-sectional study was conducted on 359 glaucoma patients in Menelik II Hospital from June 1 to July 31, 2015. After a Q-Q analysis, multiple binary logistic regression analyses, linear regression analyses, and a two-tailed paired t-test were conducted to compare IOP at baseline versus the current measurements. Results Intraocular pressure was controlled in 59.6% of the patients and was relatively well controlled during the study period (mean (M) = 17.911 mmHg, standard deviation (S) = 0.323) compared to the baseline (M = 20.866 mmHg, S = 0.383, t (358) = −6.70, p < 0.0001). A unit increase in the administration technique score resulted in a 0.272 mmHg decrease in IOP (p = 0.03). Moreover, primary angle-closure glaucoma (adjusted odds ratio (AOR) = 0.347, 95% confidence interval (CI): 0.144–0.836) and use of two medications (AOR = 1.869, 95% CI: 1.259–9.379) were factors affecting IOP. Conclusion Good instillation technique of the medications was correlated with a reduction in IOP. Consequently, regular assessment of the instillation technique and IOP should be done for better management of the disease. PMID:29104803

  10. 48 CFR 9904.413-50 - Techniques for application.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    .... 9904.413-50 Section 9904.413-50 Federal Acquisition Regulations System COST ACCOUNTING STANDARDS BOARD... ACCOUNTING STANDARDS COST ACCOUNTING STANDARDS 9904.413-50 Techniques for application. (a) Assignment of actuarial gains and losses. (1) In accordance with the provisions of Cost Accounting Standard 9904.412...

  11. Effect of Contact Damage on the Strength of Ceramic Materials.

    DTIC Science & Technology

    1982-10-01

    variables that are important to erosion, and a multivariate linear regression analysis is used to fit the data to the dimensional analysis. The exponents of Equations 7 and 8 were obtained by a multivariable regression analysis of the room-temperature data, tabulated with the regression standard error and computed coefficient for each exponent.

  12. Regional Regression Equations to Estimate Flow-Duration Statistics at Ungaged Stream Sites in Connecticut

    USGS Publications Warehouse

    Ahearn, Elizabeth A.

    2010-01-01

    Multiple linear regression equations for determining flow-duration statistics were developed to estimate select flow exceedances ranging from 25- to 99-percent for six 'bioperiods' in Connecticut: Salmonid Spawning (November), Overwinter (December-February), Habitat Forming (March-April), Clupeid Spawning (May), Resident Spawning (June), and Rearing and Growth (July-October). Regression equations also were developed to estimate the 25- and 99-percent flow exceedances without reference to a bioperiod. In total, 32 equations were developed. The predictive equations were based on regression analyses relating flow statistics from streamgages to GIS-determined basin and climatic characteristics for the drainage areas of those streamgages. Thirty-nine streamgages (and an additional 6 short-term streamgages and 28 partial-record sites for the non-bioperiod 99-percent exceedance) in Connecticut and adjacent areas of neighboring States were used in the regression analysis. Weighted least squares regression analysis was used to determine the predictive equations; weights were assigned based on record length. The basin characteristics used as explanatory variables in the equations are drainage area, percentage of area with coarse-grained stratified deposits, percentage of area with wetlands, mean monthly precipitation (November), mean seasonal precipitation (December, January, and February), and mean basin elevation. Standard errors of estimate of the 32 equations ranged from 10.7 to 156 percent, with medians of 19.2 and 55.4 percent to predict the 25- and 99-percent exceedances, respectively. Regression equations to estimate high and median flows (25- to 75-percent exceedances) are better predictors (smaller variability of the residual values around the regression line) than the equations to estimate low flows (greater than 75-percent exceedance). The Habitat Forming (March-April) bioperiod had the smallest standard errors of estimate, ranging from 10.7 to 20.9 percent. In contrast, the Rearing and Growth (July-October) bioperiod had the largest standard errors, ranging from 30.9 to 156 percent. The adjusted coefficient of determination of the equations ranged from 77.5 to 99.4 percent, with medians of 98.5 and 90.6 percent to predict the 25- and 99-percent exceedances, respectively. Descriptive information on the streamgages used in the regression, measured basin and climatic characteristics, and estimated flow-duration statistics are provided in this report. Flow-duration statistics and the 32 regression equations for estimating flow-duration statistics in Connecticut are stored on the U.S. Geological Survey World Wide Web application "StreamStats" (http://water.usgs.gov/osw/streamstats/index.html). The regression equations developed in this report can be used to produce unbiased estimates of select flow exceedances statewide.
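
    The weighted least squares step described above (weights assigned by record length) can be sketched with statsmodels. The basin characteristics, weights, and synthetic values below are illustrative; they are not the report's final equations.

```python
# Hedged sketch of weighted least squares with weights proportional to each
# streamgage's record length; all values and the variable set are synthetic.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n_gages = 39
drainage_area = rng.lognormal(3.5, 1.0, n_gages)     # square miles
pct_stratified = rng.uniform(0, 60, n_gages)         # % coarse-grained stratified deposits
record_years = rng.integers(10, 80, n_gages)         # weights

# Synthetic log of a 25-percent-exceedance flow for illustration.
log_q25 = 0.9 * np.log(drainage_area) + 0.01 * pct_stratified + rng.normal(0, 0.2, n_gages)

X = sm.add_constant(np.column_stack([np.log(drainage_area), pct_stratified]))
wls_fit = sm.WLS(log_q25, X, weights=record_years).fit()
print(wls_fit.params)
```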

  13. Mean annual runoff and peak flow estimates based on channel geometry of streams in northeastern and western Montana

    USGS Publications Warehouse

    Parrett, Charles; Omang, R.J.; Hull, J.A.

    1983-01-01

    Equations for estimating mean annual runoff and peak discharge from measurements of channel geometry were developed for western and northeastern Montana. The study area was divided into two regions for the mean annual runoff analysis, and separate multiple-regression equations were developed for each region. The active-channel width was determined to be the most important independent variable in each region. The standard error of estimate for the estimating equation using active-channel width was 61 percent in the Northeast Region and 38 percent in the West Region. The study area was divided into six regions for the peak discharge analysis, and multiple-regression equations relating channel geometry and basin characteristics to peak discharges having recurrence intervals of 2, 5, 10, 25, 50, and 100 years were developed for each region. The standard errors of estimate for the regression equations using only channel width as an independent variable ranged from 35 to 105 percent. The standard errors improved in four regions as basin characteristics were added to the estimating equations. (USGS)

  14. Modelling daily water temperature from air temperature for the Missouri River.

    PubMed

    Zhu, Senlin; Nyarko, Emmanuel Karlo; Hadzima-Nyarko, Marijana

    2018-01-01

    The bio-chemical and physical characteristics of a river are directly affected by water temperature, which thereby affects the overall health of aquatic ecosystems. Accurately estimating water temperature is a complex problem. Modelling of river water temperature is usually based on a suitable mathematical model and field measurements of various atmospheric factors. In this article, the air-water temperature relationship of the Missouri River is investigated by developing three different machine learning models (Artificial Neural Network (ANN), Gaussian Process Regression (GPR), and Bootstrap Aggregated Decision Trees (BA-DT)). Standard models (linear regression, non-linear regression, and stochastic models) are also developed and compared to the machine learning models. Among the three standard models, the stochastic model clearly outperforms the standard linear model and the nonlinear model. All three machine learning models have comparable results and outperform the stochastic model, with GPR having slightly better results for stations No. 2 and 3, while BA-DT has slightly better results for station No. 1. The machine learning models are very effective tools which can be used for the prediction of daily river temperature.
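
    The comparison of a standard linear air-water regression with a machine learning model can be sketched as follows. The synthetic daily temperatures and scikit-learn's GaussianProcessRegressor are stand-ins for the Missouri River data and the study's GPR implementation.

```python
# Illustrative sketch: standard linear air-water temperature regression versus a
# Gaussian process regression, scored by RMSE on held-out synthetic days.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
day = np.arange(365)
air = 12 + 14 * np.sin(2 * np.pi * (day - 100) / 365) + rng.normal(0, 2, 365)
water = np.clip(0.8 * air + 4 + rng.normal(0, 1, 365), 0, None)   # degrees Celsius

X = air.reshape(-1, 1)
X_tr, X_te, y_tr, y_te = train_test_split(X, water, random_state=0)

models = {
    "linear regression": LinearRegression(),
    "Gaussian process": GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    rmse = np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2))
    print(f"{name}: RMSE = {rmse:.2f} degC")
```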

  15. MULGRES: a computer program for stepwise multiple regression analysis

    Treesearch

    A. Jeff Martin

    1971-01-01

    MULGRES is a computer program source deck that is designed for multiple regression analysis employing the technique of stepwise deletion in the search for the most significant variables. The features of the program, along with inputs and outputs, are briefly described, with a note on machine compatibility.
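
    The stepwise-deletion idea can be sketched as a generic backward elimination: fit an ordinary least squares model on all candidate variables and repeatedly drop the least significant one until every remaining p-value clears a threshold. This illustrates the technique; it is not the MULGRES program itself, and the data and the 0.05 threshold are assumptions.

```python
# Generic backward-elimination sketch of stepwise deletion using statsmodels OLS.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
X = pd.DataFrame(rng.normal(size=(100, 5)), columns=list("abcde"))
y = 2.0 * X["a"] - 1.5 * X["c"] + rng.normal(size=100)   # only 'a' and 'c' matter

def backward_eliminate(X, y, alpha=0.05):
    cols = list(X.columns)
    while cols:
        model = sm.OLS(y, sm.add_constant(X[cols])).fit()
        pvalues = model.pvalues.drop("const")
        worst = pvalues.idxmax()
        if pvalues[worst] < alpha:       # everything left is significant
            return model
        cols.remove(worst)               # delete the least significant variable
    return None

final_model = backward_eliminate(X, y)
print(final_model.params.round(2))
```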

  16. Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data.

    PubMed

    Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2017-04-01

    To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field in the elderly. When refractive error from both eyes was analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI -0.03 to 0.32 D, p = 0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with a narrower 95% CI (0.01 to 0.28 D, p = 0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller p-values, while analysis of the worse eye provided larger p-values than mixed effects models and marginal models. In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision.
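
    One way to account for inter-eye correlation, sketched below, is a linear mixed effects model with a random intercept for each patient. The original analyses used SAS; this Python translation with statsmodels and the synthetic refraction data are assumptions for illustration.

```python
# Sketch of a mixed effects model with a random intercept per patient, so the
# two eyes of one person share a patient-level effect.  Data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
n_patients = 150
patient = np.repeat(np.arange(n_patients), 2)             # two eyes per person
cnv_eye = np.tile([1, 0], n_patients)                      # 1 = eye with CNV
person_effect = rng.normal(0, 1.0, n_patients)[patient]    # shared between fellow eyes
refraction = 0.15 * cnv_eye + person_effect + rng.normal(0, 0.5, 2 * n_patients)

df = pd.DataFrame({"patient": patient, "cnv_eye": cnv_eye, "refraction": refraction})

# Random intercept per patient models the correlation between fellow eyes.
fit = smf.mixedlm("refraction ~ cnv_eye", df, groups=df["patient"]).fit()
print(fit.summary())
```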

  17. Teaching Students Not to Dismiss the Outermost Observations in Regressions

    ERIC Educational Resources Information Center

    Kasprowicz, Tomasz; Musumeci, Jim

    2015-01-01

    One econometric rule of thumb is that greater dispersion in observations of the independent variable improves estimates of regression coefficients and therefore produces better results, i.e., lower standard errors of the estimates. Nevertheless, students often seem to mistrust precisely the observations that contribute the most to this greater…

  18. From Equal to Equivalent Pay: Salary Discrimination in Academia

    ERIC Educational Resources Information Center

    Greenfield, Ester

    1977-01-01

    Examines the federal statutes barring sex discrimination in employment and argues that the work of any two professors is comparable but not equal. Suggests using regression analysis to prove salary discrimination and discusses the legal justification for adopting regression analysis and the standard of comparable pay for comparable work.…

  19. Tracking the Gender Pay Gap: A Case Study

    ERIC Educational Resources Information Center

    Travis, Cheryl B.; Gross, Louis J.; Johnson, Bruce A.

    2009-01-01

    This article provides a short introduction to standard considerations in the formal study of wages and illustrates the use of multiple regression and resampling simulation approaches in a case study of faculty salaries at one university. Multiple regression is especially beneficial where it provides information on strength of association, specific…

  20. A Constrained Linear Estimator for Multiple Regression

    ERIC Educational Resources Information Center

    Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.

    2010-01-01

    "Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…
