Sample records for linear regression software

  1. U.S. Army Armament Research, Development and Engineering Center Grain Evaluation Software to Numerically Predict Linear Burn Regression for Solid Propellant Grain Geometries

    DTIC Science & Technology

    2017-10-01

    ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID PROPELLANT GRAIN GEOMETRIES Brian...author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation...U.S. ARMY ARMAMENT RESEARCH, DEVELOPMENT AND ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID

  2. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    PubMed

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  3. The microcomputer scientific software series 2: general linear model--regression.

    Treesearch

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  4. Radio Propagation Prediction Software for Complex Mixed Path Physical Channels

    DTIC Science & Technology

    2006-08-14

    63 4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz 69 4.4.7. Projected Scaling to...4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz In order to construct a comprehensive numerical algorithm capable of

  5. Fitting program for linear regressions according to Mahon (1996)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trappitsch, Reto G.

    2018-01-09

    This program takes the user's input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple interface.

  6. A simplified competition data analysis for radioligand specific activity determination.

    PubMed

    Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A

    1990-01-01

    Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superposable. The non-linear regression method is considered to be the most adequate procedure to calculate the specific activity, but if its software is not available, the other described methods are also suitable.

  7. Automating approximate Bayesian computation by local linear regression.

    PubMed

    Thornton, Kevin R

    2009-07-07

    In several biological contexts, parameter inference often relies on computationally intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone and fully documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local-linear regression.
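
    The regression step described in this abstract can be illustrated with a minimal sketch. It is not ABCreg's code: it assumes a toy one-parameter simulator, an unweighted least-squares fit, and a hypothetical function name (abc_regression_adjust), and it leaves out any distance weighting and the parameter transformations the abstract mentions.

```python
import numpy as np


def abc_regression_adjust(theta, stats, s_obs, accept_frac=0.1):
    """Rejection step followed by an unweighted linear regression adjustment.

    Hypothetical helper for illustration only; not part of ABCreg.
    """
    theta = np.asarray(theta, dtype=float)
    stats = np.asarray(stats, dtype=float).reshape(len(theta), -1)
    s_obs = np.atleast_1d(np.asarray(s_obs, dtype=float))

    # Rejection: keep the simulations whose summaries are closest to the data.
    dist = np.linalg.norm(stats - s_obs, axis=1)
    n_keep = max(2, int(accept_frac * len(theta)))
    keep = np.argsort(dist)[:n_keep]
    th, ds = theta[keep], stats[keep] - s_obs

    # Regress the accepted parameters on the centred summaries: th ~ a + ds @ b.
    design = np.column_stack([np.ones(n_keep), ds])
    coef, *_ = np.linalg.lstsq(design, th, rcond=None)

    # Adjustment: shift each accepted draw to the observed summary values.
    adjusted = th - ds @ coef[1:]
    return th, adjusted


# Toy problem (assumed): uniform prior, summary = parameter + Gaussian noise.
rng = np.random.default_rng(0)
prior_draws = rng.uniform(-5.0, 5.0, size=20_000)
summaries = prior_draws + rng.normal(0.0, 0.5, size=20_000)
accepted, adjusted = abc_regression_adjust(prior_draws, summaries, s_obs=1.0)
print(accepted.std(), adjusted.std())
```

    In this toy setting the adjusted sample is narrower than the plain rejection sample, because the regression removes the spread caused by accepting summaries that only approximately match the observed value.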

  8. Plateletpheresis efficiency and mathematical correction of software-derived platelet yield prediction: A linear regression and ROC modeling approach.

    PubMed

    Jaime-Pérez, José Carlos; Jiménez-Castillo, Raúl Alberto; Vázquez-Hernández, Karina Elizabeth; Salazar-Riojas, Rosario; Méndez-Ramírez, Nereida; Gómez-Almaguer, David

    2017-10-01

    Advances in automated cell separators have improved the efficiency of plateletpheresis and the possibility of obtaining double products (DP). We assessed cell processor accuracy of predicted platelet (PLT) yields with the goal of a better prediction of DP collections. This retrospective proof-of-concept study included 302 plateletpheresis procedures performed on a Trima Accel v6.0 at the apheresis unit of a hematology department. Donor variables, software predicted yield and actual PLT yield were statistically evaluated. Software prediction was optimized by linear regression analysis and its optimal cut-off to obtain a DP assessed by receiver operating characteristic curve (ROC) modeling. Three hundred and two plateletpheresis procedures were performed; in 271 (89.7%) occasions, donors were men and in 31 (10.3%) women. Pre-donation PLT count had the best direct correlation with actual PLT yield (r = 0.486, P < .001). Means of software machine-derived values differed significantly from actual PLT yield, 4.72 × 10^11 vs. 6.12 × 10^11, respectively (P < .001). The following equation was developed to adjust these values: actual PLT yield = 0.221 + (1.254 × theoretical platelet yield). ROC curve model showed an optimal apheresis device software prediction cut-off of 4.65 × 10^11 to obtain a DP, with a sensitivity of 82.2%, specificity of 93.3%, and an area under the curve (AUC) of 0.909. Trima Accel v6.0 software consistently underestimated PLT yields. Simple correction derived from linear regression analysis accurately corrected this underestimation and ROC analysis identified a precise cut-off to reliably predict a DP. © 2016 Wiley Periodicals, Inc.
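
    The correction equation and cut-off reported in this abstract can be written as a short sketch. The function and constant names are hypothetical, yields are taken to be in units of 10^11 platelets, and treating the cut-off as "greater than or equal" is an assumption, since the abstract does not say how the boundary case is handled.

```python
def corrected_plt_yield(predicted_yield):
    """Linear correction from the abstract; yields in units of 10^11 platelets."""
    return 0.221 + 1.254 * predicted_yield


# Software-predicted yield cut-off (x 10^11) reported for obtaining a double product.
DP_CUTOFF = 4.65


def likely_double_product(predicted_yield):
    """Assumption: a prediction at or above the cut-off counts as a likely DP."""
    return predicted_yield >= DP_CUTOFF


mean_predicted = 4.72  # mean software-derived yield reported in the abstract
print(round(corrected_plt_yield(mean_predicted), 2))  # 6.14
print(likely_double_product(mean_predicted))          # True
```

    Applied to the reported mean predicted yield of 4.72, the correction gives about 6.14, close to the reported mean actual yield of 6.12.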

  9. Laval University and Lakehead University Experiments at TREC 2015 Contextual Suggestion Track

    DTIC Science & Technology

    2015-11-20

    Department of Computer Science and Software Engineering, Laval University 2 Department of Software Engineering, Lakehead University Abstract: In this...Linear Regression and Lambda Mart perform poorly in this case, because the size of the training data per user is small (less than 50 samples). On the

  10. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  11. [Ultrasonic measurements of fetal thalamus, caudate nucleus and lenticular nucleus in prenatal diagnosis].

    PubMed

    Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei

    2015-05-19

    The aim was to establish reference values for thalamus, caudate nucleus and lenticular nucleus diameters through the fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014, and the transverse and length diameters of the thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curve of fetal diameter changes against gestational week. P < 0.05 was considered statistically significant. The linear regression equation of fetal thalamic length diameter and gestational week was: Y = 0.051X+0.201, R = 0.876, linear regression equation of thalamic transverse diameter and fetal gestational week was: Y = 0.031X+0.229, R = 0.817, linear regression equation of fetal head of caudate nucleus length diameter and gestational age was: Y = 0.033X+0.101, R = 0.722, linear regression equation of fetal head of caudate nucleus transverse diameter and gestational week was: R = 0.025 - 0.046, R = 0.711, linear regression equation of fetal lentiform nucleus length diameter and gestational week was: Y = 0.046+0.229, R = 0.765, linear regression equation of fetal lentiform nucleus diameter and gestational week was: Y = 0.025 - 0.05, R = 0.772. Ultrasonic measurement of the diameters of the fetal thalamus, caudate nucleus and lenticular nucleus through the thalamic transverse section is simple and convenient. The measurements increase with gestational week, and the relationship between them is linear.

  12. Minimizing bias in biomass allometry: Model selection and log transformation of data

    Treesearch

    Joseph Mascaro; Flint Hughes; Amanda Uowolo; Stefan A. Schnitzer

    2011-01-01

    Nonlinear regression is increasingly used to develop allometric equations for forest biomass estimation (i.e., as opposed to the traditional approach of log-transformation followed by linear regression). Most statistical software packages, however, assume additive errors by default, violating a key assumption of allometric theory and possibly producing spurious models....

  13. Evaluation of pharyngeal space and its correlation with mandible and hyoid bone in patients with different skeletal classes and facial types.

    PubMed

    Nejaim, Yuri; Aps, Johan K M; Groppo, Francisco Carlos; Haiter Neto, Francisco

    2018-06-01

    The purpose of this article was to evaluate the pharyngeal space volume, and the size and shape of the mandible and the hyoid bone, as well as their relationships, in patients with different facial types and skeletal classes. Furthermore, we estimated the volume of the pharyngeal space with a formula using only linear measurements. A total of 161 i-CAT Next Generation (Imaging Sciences International, Hatfield, Pa) cone-beam computed tomography images (80 men, 81 women; ages, 21-58 years; mean age, 27 years) were retrospectively studied. Skeletal class and facial type were determined for each patient from multiplanar reconstructions using the NemoCeph software (Nemotec, Madrid, Spain). Linear and angular measurements were performed using 3D imaging software (version 3.4.3; Carestream Health, Rochester, NY), and volumetric analysis of the pharyngeal space was carried out with ITK-SNAP (version 2.4.0; Cognitica, Philadelphia, Pa) segmentation software. For the statistics, analysis of variance and the Tukey test with a significance level of 0.05, Pearson correlation, and linear regression were used. The pharyngeal space volume, when correlated with mandible and hyoid bone linear and angular measurements, showed significant correlations with skeletal class or facial type. The linear regression performed to estimate the volume of the pharyngeal space showed an R of 0.92 and an adjusted R^2 of 0.8362. There were significant correlations between pharyngeal space volume, and the mandible and hyoid bone measurements, suggesting that the stomatognathic system should be evaluated in an integral and nonindividualized way. Furthermore, it was possible to develop a linear regression model, resulting in a useful formula for estimating the volume of the pharyngeal space. Copyright © 2018 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.

  14. A method for fitting regression splines with varying polynomial order in the linear mixed model.

    PubMed

    Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W

    2006-02-15

    The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.

  15. Cactus: An Introduction to Regression

    ERIC Educational Resources Information Center

    Hyde, Hartley

    2008-01-01

    When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…

  16. Understanding software faults and their role in software reliability modeling

    NASA Technical Reports Server (NTRS)

    Munson, John C.

    1994-01-01

    This study is a direct result of an on-going project to model the reliability of a large real-time control avionics system. In previous modeling efforts with this system, hardware reliability models were applied in modeling the reliability behavior of this system. In an attempt to enhance the performance of the adapted reliability models, certain software attributes were introduced in these models to control for differences between programs and also sequential executions of the same program. As the basic nature of the software attributes that affect software reliability become better understood in the modeling process, this information begins to have important implications on the software development process. A significant problem arises when raw attribute measures are to be used in statistical models as predictors, for example, of measures of software quality. This is because many of the metrics are highly correlated. Consider the two attributes: lines of code, LOC, and number of program statements, Stmts. In this case, it is quite obvious that a program with a high value of LOC probably will also have a relatively high value of Stmts. In the case of low level languages, such as assembly language programs, there might be a one-to-one relationship between the statement count and the lines of code. When there is a complete absence of linear relationship among the metrics, they are said to be orthogonal or uncorrelated. Usually the lack of orthogonality is not serious enough to affect a statistical analysis. However, for the purposes of some statistical analysis such as multiple regression, the software metrics are so strongly interrelated that the regression results may be ambiguous and possibly even misleading. Typically, it is difficult to estimate the unique effects of individual software metrics in the regression equation. 
The estimated values of the coefficients are very sensitive to slight changes in the data and to the addition or deletion of variables in the regression equation. Since most of the existing metrics have common elements and are linear combinations of these common elements, it seems reasonable to investigate the structure of the underlying common factors or components that make up the raw metrics. The technique we have chosen to use to explore this structure is a procedure called principal components analysis. Principal components analysis is a decomposition technique that may be used to detect and analyze collinearity in software metrics. When confronted with a large number of metrics measuring a single construct, it may be desirable to represent the set by some smaller number of variables that convey all, or most, of the information in the original set. Principal components are linear transformations of a set of random variables that summarize the information contained in the variables. The transformations are chosen so that the first component accounts for the maximal amount of variation of the measures of any possible linear transform; the second component accounts for the maximal amount of residual variation; and so on. The principal components are constructed so that they represent transformed scores on dimensions that are orthogonal. Through the use of principal components analysis, it is possible to have a set of highly related software attributes mapped into a small number of uncorrelated attribute domains. This definitively solves the problem of multi-collinearity in subsequent regression analysis. There are many software metrics in the literature, but principal component analysis reveals that there are few distinct sources of variation, i.e. dimensions, in this set of metrics. 
It would appear perfectly reasonable to characterize the measurable attributes of a program with a simple function of a small number of orthogonal metrics each of which represents a distinct software attribute domain.
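
    The collinearity problem and the principal components remedy described in this abstract can be illustrated with a small sketch. The data are synthetic and the metric names (loc, stmts, branches) are assumptions chosen to echo the LOC/Stmts example; nothing here reproduces the study's data or software.

```python
import numpy as np

# Synthetic software metrics (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = 200
loc = rng.normal(500, 150, n)               # lines of code
stmts = 0.8 * loc + rng.normal(0, 10, n)    # statements, nearly proportional to LOC
branches = rng.normal(40, 12, n)            # a third, unrelated metric
metrics = np.column_stack([loc, stmts, branches])

# Standardise so that each metric contributes on the same scale.
z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0, ddof=1)

# Eigen-decomposition of the correlation matrix gives the principal axes.
corr = np.corrcoef(z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]           # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = z @ eigvecs                        # component scores, mutually uncorrelated
explained = eigvals / eigvals.sum()
score_corr = np.corrcoef(scores, rowvar=False)

print(np.round(corr, 3))        # raw metrics: LOC and Stmts are almost collinear
print(np.round(explained, 3))   # share of total variance per component
print(np.round(score_corr, 6))  # component scores: identity matrix
```

    The component scores can then stand in for the raw metrics as regression predictors; because they are uncorrelated, each coefficient can be estimated without the instability that the near-duplicate LOC and Stmts columns would cause.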

  17. Logistic regression for circular data

    NASA Astrophysics Data System (ADS)

    Al-Daffaie, Kadhem; Khan, Shahjahan

    2017-05-01

    This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
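
    A minimal sketch of the estimation described in this abstract follows. It assumes the common linear-circular form in which the angle enters through its cosine and sine, logit P(y = 1) = b0 + b1 cos(a) + b2 sin(a); the abstract does not spell out the model, so this parameterisation, the function name and the simulated data are all assumptions. The paper's computations were done in R; Python is used here only for illustration.

```python
import numpy as np


def fit_circular_logistic(angle, y, n_iter=25, tol=1e-10):
    """Newton-Raphson maximum likelihood for
    logit P(y=1) = b0 + b1*cos(angle) + b2*sin(angle)  (assumed model form)."""
    X = np.column_stack([np.ones_like(angle), np.cos(angle), np.sin(angle)])
    beta = np.zeros(3)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)                       # gradient of the log-likelihood
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # negative Hessian
        step = np.linalg.solve(info, score)
        beta = beta + step                           # Newton-Raphson update
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated data with made-up coefficients; angles are in radians.
rng = np.random.default_rng(2)
angle = rng.uniform(0.0, 2.0 * np.pi, size=5000)
true_beta = np.array([-0.5, 1.0, 0.8])
eta = true_beta[0] + true_beta[1] * np.cos(angle) + true_beta[2] * np.sin(angle)
y = (rng.uniform(size=5000) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

beta_hat = fit_circular_logistic(angle, y)
print(np.round(beta_hat, 2))
```

    Using cosine and sine terms keeps the fitted probability periodic, so angles of 0 and 2π give the same prediction, which a raw angle used as a linear predictor would not.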

  18. Analyses of Field Test Data at the Atucha-1 Spent Fuel Pools

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sitaraman, S.

    A field test was conducted at the Atucha-1 spent nuclear fuel pools to validate a software package for gross defect detection that is used in conjunction with the inspection tool, Spent Fuel Neutron Counter (SFNC). A set of measurements was taken with the SFNC and the software predictions were compared with these data and analyzed. The data spanned a wide range of cooling times and a set of burnup levels leading to count rates from the several hundreds to around twenty per second. The current calibration in the software using linear fitting required the use of multiple calibration factors to cover the entire range of count rates recorded. The solution to this was to use power regression data fitting to normalize the predicted response and derive one calibration factor that can be applied to the entire set of data. The resulting comparisons between the predicted and measured responses were generally good and provided a quantitative method of detecting missing fuel in virtually all situations. Since the current version of the software uses the linear calibration method, it would need to be updated with the new power regression method to make it more user-friendly for real time verification and fieldable for the range of responses that will be encountered.
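
    As a generic illustration of power regression, the sketch below fits y = a * x^b by least squares on the logarithms. The abstract does not describe how the fit is set up inside the SFNC software, so the function name, the choice of variables and the numbers are assumptions made up for the example.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y = a * x**b by ordinary least squares on log(y) versus log(x)."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return np.exp(intercept), slope


# Made-up responses spanning roughly twenty to several hundred counts per second.
predicted = np.array([20.0, 50.0, 120.0, 300.0, 700.0])
measured = 1.8 * predicted ** 0.9      # assumed noise-free relationship
a, b = fit_power_law(predicted, measured)
print(round(a, 3), round(b, 3))
```

    A single pair (a, b) covers the whole range here, whereas a straight-line calibration through the same points would need different factors at the low and high ends.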

  19. Distance correction system for localization based on linear regression and smoothing in ambient intelligence display.

    PubMed

    Kim, Dae-Hee; Choi, Jae-Hun; Lim, Myung-Eun; Park, Soo-Jun

    2008-01-01

    This paper suggests a method of correcting the distance between an ambient intelligence display and a user based on linear regression and smoothing, by which distance information of a user who approaches the display can be accurately output even in an unanticipated condition using a passive infrared (PIR) sensor and an ultrasonic device. The developed system consists of an ambient intelligence display, an ultrasonic transmitter, and a sensor gateway. The modules communicate with each other through RF (radio frequency) communication. The ambient intelligence display includes an ultrasonic receiver and a PIR sensor for motion detection. In particular, this system dynamically selects and applies algorithms such as smoothing or linear regression to the current input data through a judgment process that uses the previous reliable data stored in a queue. In addition, we implemented GUI software in Java for real-time location tracking and an ambient intelligence display.

  20. Kendall-Theil Robust Line (KTRLine--version 1.0)-A Visual Basic Program for Calculating and Graphing Robust Nonparametric Estimates of Linear-Regression Coefficients Between Two Continuous Variables

    USGS Publications Warehouse

    Granato, Gregory E.

    2006-01-01

    The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. 
The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and data in subsequent rows. The user may choose the columns that contain the independent (X) and dependent (Y) variable. A third column, if present, may contain metadata such as the sample-collection location and date. The program screens the input files and plots the data. The KTRLine software is a graphical tool that facilitates development of regression models by use of graphs of the regression line with data, the regression residuals (with X or Y), and percentile plots of the cumulative frequency of the X variable, Y variable, and the regression residuals. The user may individually transform the independent and dependent variables to reduce heteroscedasticity and to linearize data. The program plots the data and the regression line. The program also prints model specifications and regression statistics to the screen. The user may save and print the regression results. The program can accept data sets that contain up to about 15,000 XY data points, but because the program must sort the array of all pairwise slopes, the program may be perceptibly slow with data sets that contain more than about 1,000 points.
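
    The slope and intercept rules stated in this abstract can be sketched in a few lines. This is not the KTRLine program: the function name is hypothetical, pairs with equal X values are simply skipped (an assumption), and "through the median of input data" is read as the point (median X, median Y).

```python
from itertools import combinations

import numpy as np


def kendall_theil_line(x, y):
    """Median of all pairwise slopes; intercept through (median x, median y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]          # assumption: ignore pairs with tied x values
    ]
    slope = float(np.median(slopes))
    intercept = float(np.median(y) - slope * np.median(x))
    return slope, intercept


# Made-up data: an exact line y = 2x + 1 with one gross outlier.
x = np.arange(10.0)
y = 2.0 * x + 1.0
y[9] = 100.0

slope, intercept = kendall_theil_line(x, y)
ols_slope, ols_intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)      # 2.0 1.0
print(round(ols_slope, 2))   # least-squares slope is pulled upward by the outlier
```

    The number of pairwise slopes grows as n(n-1)/2, which is consistent with the abstract's remark that the program becomes perceptibly slow beyond about 1,000 points.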

  1. Ergonomics study on mobile phones for thumb physiology discomfort

    NASA Astrophysics Data System (ADS)

    Bendero, J. M. S.; Doon, M. E. R.; Quiogue, K. C. A.; Soneja, L. C.; Ong, N. R.; Sauli, Z.; Vairavan, R.

    2017-09-01

    The study was conducted on Filipino undergraduate college students and aimed to identify the significant factors associated with mobile phone usage and its effect on thumb pain. A correlation-prediction analysis and Multiple Linear Regression were adopted as the main tools for determining the significant factors and developing predictive models of thumb-related pain. Using the Statistical Package for the Social Sciences (SPSS) software to conduct the linear regression, 2 significant factors in thumb-related pain (percentage of time using portrait as screen orientation when text messaging, amount of time playing games using one hand in a day) were found.

  2. Bioinactivation: Software for modelling dynamic microbial inactivation.

    PubMed

    Garre, Alberto; Fernández, Pablo S; Lindqvist, Roland; Egea, Jose A

    2017-03-01

    This contribution presents the bioinactivation software, which implements functions for the modelling of isothermal and non-isothermal microbial inactivation. This software offers features such as user-friendliness, modelling of dynamic conditions, possibility to choose the fitting algorithm and generation of prediction intervals. The software is offered in two different formats: Bioinactivation core and Bioinactivation SE. Bioinactivation core is a package for the R programming language, which includes features for the generation of predictions and for the fitting of models to inactivation experiments using non-linear regression or a Markov Chain Monte Carlo algorithm (MCMC). The calculations are based on inactivation models common in academia and industry (Bigelow, Peleg, Mafart and Geeraerd). Bioinactivation SE supplies a user-friendly interface to selected functions of Bioinactivation core, namely the model fitting of non-isothermal experiments and the generation of prediction intervals. The capabilities of bioinactivation are presented in this paper through a case study, modelling the non-isothermal inactivation of Bacillus sporothermodurans. This study has provided a full characterization of the response of the bacteria to dynamic temperature conditions, including confidence intervals for the model parameters and a prediction interval of the survivor curve. We conclude that the MCMC algorithm produces a better characterization of the biological uncertainty and variability than non-linear regression. The bioinactivation software can be relevant to the food and pharmaceutical industry, as well as to regulatory agencies, as part of a (quantitative) microbial risk assessment. Copyright © 2017 Elsevier Ltd. All rights reserved.

  3. Advanced statistical methods for improved data analysis of NASA astrophysics missions

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.

    1992-01-01

    The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.

  4. TG study of the Li0.4Fe2.4Zn0.2O4 ferrite synthesis

    NASA Astrophysics Data System (ADS)

    Lysenko, E. N.; Nikolaev, E. V.; Surzhikov, A. P.

    2016-02-01

    In this paper, the kinetic analysis of Li-Zn ferrite synthesis was studied using thermogravimetry (TG) method through the simultaneous application of non-linear regression to several measurements run at different heating rates (multivariate non-linear regression). Using TG-curves obtained for the four heating rates and Netzsch Thermokinetics software package, the kinetic models with minimal adjustable parameters were selected to quantitatively describe the reaction of Li-Zn ferrite synthesis. It was shown that the experimental TG-curves clearly suggest a two-step process for the ferrite synthesis and therefore a model-fitting kinetic analysis based on multivariate non-linear regressions was conducted. The complex reaction was described by a two-step reaction scheme consisting of sequential reaction steps. It is established that the best results were obtained using the Yander three-dimensional diffusion model at the first stage and Ginstling-Bronstein model at the second step. The kinetic parameters for lithium-zinc ferrite synthesis reaction were found and discussed.

  5. Chicken barn climate and hazardous volatile compounds control using simple linear regression and PID

    NASA Astrophysics Data System (ADS)

    Abdullah, A. H.; Bakar, M. A. A.; Shukor, S. A. A.; Saad, F. S. A.; Kamis, M. S.; Mustafa, M. H.; Khalid, N. S.

    2016-07-01

    Hazardous volatile compounds from chicken manure in a chicken barn are a potential health threat to farm animals and workers. The ammonia (NH3) and hydrogen sulphide (H2S) produced in a chicken barn are influenced by climate changes. An electronic nose (e-nose) is used to sample the barn's air, temperature and humidity data. Simple linear regression is used to identify the correlation between temperature and humidity, humidity and ammonia, and ammonia and hydrogen sulphide. MATLAB Simulink software was used to analyze the sample data with a PID controller. The results show that a PID controller tuned with the Ziegler-Nichols technique can improve control of the climate in the chicken barn.

  6. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. 
    This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.

  7. The Digital Shoreline Analysis System (DSAS) Version 4.0 - An ArcGIS extension for calculating shoreline change

    USGS Publications Warehouse

    Thieler, E. Robert; Himmelstoss, Emily A.; Zichichi, Jessica L.; Ergul, Ayhan

    2009-01-01

    The Digital Shoreline Analysis System (DSAS) version 4.0 is a software extension to ESRI ArcGIS v.9.2 and above that enables a user to calculate shoreline rate-of-change statistics from multiple historic shoreline positions. A user-friendly interface of simple buttons and menus guides the user through the major steps of shoreline change analysis. Components of the extension and user guide include (1) instruction on the proper way to define a reference baseline for measurements, (2) automated and manual generation of measurement transects and metadata based on user-specified parameters, and (3) output of calculated rates of shoreline change and other statistical information. DSAS computes shoreline rates of change using four different methods: (1) endpoint rate, (2) simple linear regression, (3) weighted linear regression, and (4) least median of squares. The standard error, correlation coefficient, and confidence interval are also computed for the simple and weighted linear-regression methods. The results of all rate calculations are output to a table that can be linked to the transect file by a common attribute field. DSAS is intended to facilitate the shoreline change-calculation process and to provide rate-of-change information and the statistical data necessary to establish the reliability of the calculated results. The software is also suitable for any generic application that calculates positional change over time, such as assessing rates of change of glacier limits in sequential aerial photos, river edge boundaries, land-cover changes, and so on.
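
    As a rough illustration of three of the rate methods named in the abstract above, the sketch below computes an endpoint rate and simple and weighted linear-regression rates for one transect. It is not DSAS code: the function names, the survey years, the shoreline distances, and the choice of weights as 1/uncertainty² are all assumptions made for the example.

```python
def endpoint_rate(years, positions):
    """Endpoint rate: change between the oldest and youngest shoreline
    divided by the time elapsed between them."""
    i_old = min(range(len(years)), key=lambda i: years[i])
    i_new = max(range(len(years)), key=lambda i: years[i])
    return (positions[i_new] - positions[i_old]) / (years[i_new] - years[i_old])


def regression_rate(years, positions, weights=None):
    """Slope of the (weighted) least-squares line through all shorelines.
    With weights=None every shoreline counts equally (simple regression)."""
    if weights is None:
        weights = [1.0] * len(years)
    sw = sum(weights)
    xbar = sum(w * x for w, x in zip(weights, years)) / sw
    ybar = sum(w * y for w, y in zip(weights, positions)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar)
              for w, x, y in zip(weights, years, positions))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(weights, years))
    return sxy / sxx


# Hypothetical transect: survey year and shoreline distance from baseline (m).
years = [1950, 1970, 1990, 2010]
positions = [100.0, 90.0, 85.0, 70.0]
# Hypothetical positional uncertainty (m) of each shoreline; the weight
# 1/uncertainty**2 is an assumption of this sketch.
uncertainty = [10.0, 5.0, 5.0, 1.0]
weights = [1.0 / u ** 2 for u in uncertainty]

epr = endpoint_rate(years, positions)
lrr = regression_rate(years, positions)
wlr = regression_rate(years, positions, weights)
print(f"endpoint rate:            {epr:.3f} m/yr")
print(f"simple regression rate:   {lrr:.3f} m/yr")
print(f"weighted regression rate: {wlr:.3f} m/yr")
```

    The endpoint rate uses only two shorelines, while the regression rates use all of them, which is why the two kinds of estimate can differ on the same transect.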

  8. Estimation of octanol/water partition coefficients using LSER parameters

    USGS Publications Warehouse

    Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.

    1998-01-01

    The logarithms of octanol/water partition coefficients, logKow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for logKow was 0.49. The regression equation was then used to estimate logKow for a test set of 146 chemicals which included pesticides and other diverse polyfunctional compounds. Thus the octanol/water partition coefficient may be estimated from LSER parameters without elaborate software, but only moderate accuracy should be expected.

  9. Using Commercial-Off-The-Shelf Speech Recognition Software for Conning U.S. Warships

    DTIC Science & Technology

    2003-06-01

    Linear Regression , 2nd Edition, (John Wiley & Sons, St. Paul, Minnesota, 1985), pp. 267-269. 44 Current Projects About the Sigmoid Curve, Sigmoid Curve...Disabilities Conference, Conference Proceedings, [www.csun.edu/cod/conf/1998/proceedings/csun98_052.htm], as of June 2, 2003. 43 Weisberg, S., Applied

  10. Characterizing the scientific potential of satellite sensors. [San Francisco, California]

    NASA Technical Reports Server (NTRS)

    1984-01-01

    Eleven thematic mapper (TM) radiometric calibration programs were tested and evaluated in support of the task to characterize the potential of LANDSAT TM digital imagery for scientific investigations in the Earth sciences and terrestrial physics. Three software errors related to integer overflow, divide by zero, and a nonexistent file group were found and solved. Raw, calibrated, and corrected image groups that were created and stored on the Barker2 disk are enumerated. Black and white pixel print files were created for various subscenes of a San Francisco scene (ID 40392-18152). The development of linear regression software is discussed. The output of the software and its function are described. Future work in TM radiometric calibration, image processing, and software development is outlined.

  11. PMICALC: an R code-based software for estimating post-mortem interval (PMI) compatible with Windows, Mac and Linux operating systems.

    PubMed

    Muñoz-Barús, José I; Rodríguez-Calvo, María Sol; Suárez-Peñaranda, José M; Vieira, Duarte N; Cadarso-Suárez, Carmen; Febrero-Bande, Manuel

    2010-01-30

    In legal medicine the correct determination of the time of death is of utmost importance. Recent advances in estimating post-mortem interval (PMI) have made use of vitreous humour chemistry in conjunction with Linear Regression, but the results are questionable. In this paper we present PMICALC, an R code-based freeware package which estimates PMI in cadavers of recent death by measuring the concentrations of potassium ([K+]), hypoxanthine ([Hx]) and urea ([U]) in the vitreous humour using two different regression models: Additive Models (AM) and Support Vector Machine (SVM), which offer more flexibility than the previously used Linear Regression. The results from both models are better than those published to date and can give numerical expression of PMI with confidence intervals and graphic support within 20 min. The program also takes into account the cause of death. © 2009 Elsevier Ireland Ltd. All rights reserved.

  12. Linear and nonlinear models for predicting fish bioconcentration factors for pesticides.

    PubMed

    Yuan, Jintao; Xie, Chun; Zhang, Ting; Sun, Jinfang; Yuan, Xuejie; Yu, Shuling; Zhang, Yingbiao; Cao, Yunyuan; Yu, Xingchen; Yang, Xuan; Yao, Wu

    2016-08-01

    This work is devoted to the applications of the multiple linear regression (MLR), multilayer perceptron neural network (MLP NN) and projection pursuit regression (PPR) to quantitative structure-property relationship analysis of bioconcentration factors (BCFs) of pesticides tested on Bluegill (Lepomis macrochirus). Molecular descriptors of a total of 107 pesticides were calculated with the DRAGON Software and selected by inverse enhanced replacement method. Based on the selected DRAGON descriptors, a linear model was built by MLR, nonlinear models were developed using MLP NN and PPR. The robustness of the obtained models was assessed by cross-validation and external validation using test set. Outliers were also examined and deleted to improve predictive power. Comparative results revealed that PPR achieved the most accurate predictions. This study offers useful models and information for BCF prediction, risk assessment, and pesticide formulation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Does Parental Control Work With Smartphone Addiction?: A Cross-Sectional Study of Children in South Korea.

    PubMed

    Lee, Eun Jee; Ogbolu, Yolanda

    The purposes of this study were to (a) examine the relationships of personal characteristics (age, gender), psychological factors (depression), and physical factors (sleep time) with smartphone addiction in children and (b) determine whether parental control is associated with a lower incidence of smartphone addiction. Data were collected from children aged 10-12 years (N = 208) by a self-report questionnaire in two elementary schools and were analyzed using t test, one-way analysis of variance, correlation, and multiple linear regression. Most of the participants (73.3%) owned a smartphone, and the percentage of risky smartphone users was 12%. The multiple linear regression model explained 25.4% (adjusted R² = .239) of the variance in the smartphone addiction score (SAS). Three variables were significantly associated with the SAS (age, depression, and parental control), and three variables were excluded (gender, geographic region, and parental control software). Teens, aged 10-12 years, with higher depression scores had higher SASs. The more parental control perceived by the student, the higher the SAS. There was no significant relationship between parental control software and smartphone addiction. This is one of the first studies to examine smartphone addiction in teens. Control-oriented managing by parents of children's smartphone use is not very effective and may exacerbate smartphone addiction. Future research should identify additional strategies, beyond parental control software, that have the potential to prevent, reduce, and eliminate smartphone addiction.

  14. Towards an Early Software Effort Estimation Based on Functional and Non-Functional Requirements

    NASA Astrophysics Data System (ADS)

    Kassab, Mohamed; Daneva, Maya; Ormandjieva, Olga

    The increased awareness of the non-functional requirements as a key to software project and product success makes explicit the need to include them in any software project effort estimation activity. However, the existing approaches to defining size-based effort relationships still pay insufficient attention to this need. This paper presents a flexible, yet systematic approach to the early requirements-based effort estimation, based on Non-Functional Requirements ontology. It complementarily uses one standard functional size measurement model and a linear regression technique. We report on a case study which illustrates the application of our solution approach in context and also helps evaluate our experiences in using it.

  15. Methods for scalar-on-function regression.

    PubMed

    Reiss, Philip T; Goldsmith, Jeff; Shang, Han Lin; Ogden, R Todd

    2017-08-01

    Recent years have seen an explosion of activity in the field of functional data analysis (FDA), in which curves, spectra, images, etc. are considered as basic functional data units. A central problem in FDA is how to fit regression models with scalar responses and functional data points as predictors. We review some of the main approaches to this problem, categorizing the basic model types as linear, nonlinear and nonparametric. We discuss publicly available software packages, and illustrate some of the procedures by application to a functional magnetic resonance imaging dataset.

  16. Penalized nonparametric scalar-on-function regression via principal coordinates

    PubMed Central

    Reiss, Philip T.; Miller, David L.; Wu, Pei-Shien; Hua, Wen-Yu

    2016-01-01

    A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. PMID:29217963

  17. Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method.

    PubMed

    Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza

    2015-11-18

    Birthweight is one of the most important predictors of health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most industrial and developed countries. This indicator is used to assess the growth and health status of infants. The aim of this study was to assess the birthweight of neonates by using quantile regression in Zanjan province. This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and quantile regression methods and SAS 9.2 statistical software. Of 8456 newborn babies, 4146 (49%) were female. The mean age of the mothers was 27.1 ± 5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05) and weight and educational level of the mothers (p<0.05) showed a significant linear relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when the response variable is asymmetric or the data contain outliers.

  18. Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method

    PubMed Central

    Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza

    2016-01-01

    Introduction: Birthweight is one of the most important predictors of health status in adulthood. Having a balanced birthweight is one of the priorities of the health system in most industrial and developed countries. This indicator is used to assess the growth and health status of infants. The aim of this study was to assess the birthweight of neonates by using quantile regression in Zanjan province. Methods: This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and quantile regression methods and SAS 9.2 statistical software. Results: Of 8456 newborn babies, 4146 (49%) were female. The mean age of the mothers was 27.1 ± 5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of neonates (p<0.05) and weight and educational level of the mothers (p<0.05) showed a significant linear relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). Conclusion: This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when the response variable is asymmetric or the data contain outliers. PMID:26925889
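
    The closing recommendation above (prefer quantile regression when outliers are present) can be illustrated with a small sketch. It is not the authors' SAS analysis: the data are synthetic, the function names are made up for the example, and the quantile fit uses an exhaustive search over lines through pairs of points, which works for one predictor because a minimizer of the check loss always exists among such lines.

```python
from itertools import combinations


def ols_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def check_loss(x, y, intercept, slope, tau):
    """Quantile (pinball) loss of a line at quantile level tau."""
    total = 0.0
    for a, b in zip(x, y):
        r = b - (intercept + slope * a)
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total


def quantile_line(x, y, tau=0.5):
    """Quantile-regression line found by trying every line through two
    data points with distinct x and keeping the one with smallest loss."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = check_loss(x, y, intercept, slope, tau)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


# Synthetic data: y = 1 + 2x exactly, except for one gross outlier at x = 6.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 100.0]

ols_intercept, ols_slope = ols_line(x, y)
med_intercept, med_slope = quantile_line(x, y, tau=0.5)
print("OLS slope:   ", round(ols_slope, 3))
print("median slope:", round(med_slope, 3))
```

    In this toy data set the single outlier pulls the least-squares slope far from 2, while the median (tau = 0.5) fit stays on the line followed by the other six points. The pairwise search costs O(n³) and is only meant for small illustrations.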

  19. A non-linear regression analysis program for describing electrophysiological data with multiple functions using Microsoft Excel.

    PubMed

    Brown, Angus M

    2006-04-01

    The objective of the present study was to demonstrate a method for fitting complex electrophysiological data with multiple functions using the SOLVER add-in of the ubiquitous spreadsheet Microsoft Excel. SOLVER minimizes the sum of the squares of the differences between the data to be fit and the function(s) describing the data using an iterative generalized reduced gradient method. While it is a straightforward procedure to fit data with linear functions, and we have previously demonstrated a method of non-linear regression analysis of experimental data based upon a single function, it is more complex to fit data with multiple functions, usually requiring specialized expensive computer software. In this paper we describe an easily understood program for fitting experimentally acquired data, in this case the stimulus-evoked compound action potential from the mouse optic nerve, with multiple Gaussian functions. The program is flexible and can be applied to describe data with a wide variety of user-input functions.

  20. SigrafW: An easy-to-use program for fitting enzyme kinetic data.

    PubMed

    Leone, Francisco Assis; Baranauskas, José Augusto; Furriel, Rosa Prazeres Melo; Borin, Ivana Aparecida

    2005-11-01

    SigrafW is Windows-compatible software developed using the Microsoft® Visual Basic Studio program that uses the simplified Hill equation for fitting kinetic data from allosteric and Michaelian enzymes. SigrafW uses a modified Fibonacci search to calculate maximal velocity (V), the Hill coefficient (n), and the enzyme-substrate apparent dissociation constant (K). The estimation of V, K, and the sum of the squares of residuals is performed using a Wilkinson nonlinear regression at any Hill coefficient (n). In contrast to many currently available kinetic analysis programs, SigrafW shows several advantages for the determination of kinetic parameters of both hyperbolic and nonhyperbolic saturation curves. No initial estimates of the kinetic parameters are required, a measure of the goodness-of-the-fit for each calculation performed is provided, the nonlinear regression used for calculations eliminates the statistical bias inherent in linear transformations, and the software can be used for enzyme kinetic simulations either for educational or research purposes. Persons interested in receiving a free copy of the software should contact Dr. F. A. Leone. Copyright © 2005 International Union of Biochemistry and Molecular Biology, Inc.

  1. Statistical analysis of water-quality data containing multiple detection limits: S-language software for regression on order statistics

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.

  2. Time-resolved perfusion imaging at the angiography suite: preclinical comparison of a new flat-detector application to computed tomography perfusion.

    PubMed

    Jürgens, Julian H W; Schulz, Nadine; Wybranski, Christian; Seidensticker, Max; Streit, Sebastian; Brauner, Jan; Wohlgemuth, Walter A; Deuerling-Zheng, Yu; Ricke, Jens; Dudeck, Oliver

    2015-02-01

    The objective of this study was to compare the parameter maps of a new flat-panel detector application for time-resolved perfusion imaging in the angiography room (FD-CTP) with computed tomography perfusion (CTP) in an experimental tumor model. Twenty-four VX2 tumors were implanted into the hind legs of 12 rabbits. Three weeks later, FD-CTP (Artis zeego; Siemens) and CTP (SOMATOM Definition AS +; Siemens) were performed. The parameter maps for the FD-CTP were calculated using a prototype software, and those for the CTP were calculated with VPCT-body software on a dedicated syngo MultiModality Workplace. The parameters were compared using Pearson product-moment correlation coefficient and linear regression analysis. The Pearson product-moment correlation coefficient showed good correlation values for both the intratumoral blood volume of 0.848 (P < 0.01) and the blood flow of 0.698 (P < 0.01). The linear regression analysis of the perfusion between FD-CTP and CTP showed for the blood volume a regression equation y = 4.44x + 36.72 (P < 0.01) and for the blood flow y = 0.75x + 14.61 (P < 0.01). This preclinical study provides evidence that FD-CTP allows a time-resolved (dynamic) perfusion imaging of tumors similar to CTP, which provides the basis for clinical applications such as the assessment of tumor response to locoregional therapies directly in the angiography suite.

  3. Neural Network and Regression Approximations in High Speed Civil Transport Aircraft Design Optimization

    NASA Technical Reports Server (NTRS)

    Patniak, Surya N.; Guptill, James D.; Hopkins, Dale A.; Lavelle, Thomas M.

    1998-01-01

    Nonlinear mathematical-programming-based design optimization can be an elegant method. However, the calculations required to generate the merit function, constraints, and their gradients, which are frequently required, can make the process computationally intensive. The computational burden can be greatly reduced by using approximating analyzers derived from an original analyzer utilizing neural networks and linear regression methods. The experience gained from using both of these approximation methods in the design optimization of a high speed civil transport aircraft is the subject of this paper. The Langley Research Center's Flight Optimization System was selected for the aircraft analysis. This software was exercised to generate a set of training data with which a neural network and a regression method were trained, thereby producing the two approximating analyzers. The derived analyzers were coupled to the Lewis Research Center's CometBoards test bed to provide the optimization capability. With the combined software, both approximation methods were examined for use in aircraft design optimization, and both performed satisfactorily. The CPU time for solution of the problem, which had been measured in hours, was reduced to minutes with the neural network approximation and to seconds with the regression method. Instability encountered in the aircraft analysis software at certain design points was also eliminated. On the other hand, there were costs and difficulties associated with training the approximating analyzers. The CPU time required to generate the input-output pairs and to train the approximating analyzers was seven times that required for solution of the problem.

  4. Development of quantitative screen for 1550 chemicals with GC-MS.

    PubMed

    Bergmann, Alan J; Points, Gary L; Scott, Richard P; Wilson, Glenn; Anderson, Kim A

    2018-05-01

    With hundreds of thousands of chemicals in the environment, effective monitoring requires high-throughput analytical techniques. This paper presents a quantitative screening method for 1550 chemicals based on statistical modeling of responses with identification and integration performed using deconvolution reporting software. The method was evaluated with representative environmental samples. We tested biological extracts, low-density polyethylene, and silicone passive sampling devices spiked with known concentrations of 196 representative chemicals. A multiple linear regression (R² = 0.80) was developed with molecular weight, logP, polar surface area, and fractional ion abundance to predict chemical responses within a factor of 2.5. Linearity beyond the calibration had R² > 0.97 for three orders of magnitude. Median limits of quantitation were estimated to be 201 pg/μL (1.9× standard deviation). The number of detected chemicals and the accuracy of quantitation were similar for environmental samples and standard solutions. To our knowledge, this is the most precise method for the largest number of semi-volatile organic chemicals lacking authentic standards. Accessible instrumentation and software make this method cost effective in quantifying a large, customizable list of chemicals. When paired with silicone wristband passive samplers, this quantitative screen will be very useful for epidemiology where binning of concentrations is common. Graphical abstract: A multiple linear regression of chemical responses measured with GC-MS allowed quantitation of 1550 chemicals in samples such as silicone wristbands.
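
    For readers unfamiliar with how a multiple linear regression and its R² are obtained, a minimal sketch follows. It fits coefficients by solving the normal equations and then computes R². It is not the authors' model: the two predictors and the response values are synthetic, and all function names are assumptions of the example.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(predictors, y):
    """Least-squares coefficients [intercept, b1, b2, ...] obtained from
    the normal equations (X'X) beta = X'y."""
    rows = [[1.0] + list(r) for r in predictors]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def r_squared(predictors, y, beta):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    pred = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in predictors]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data with two predictors; the response follows
# y = 1 + 2*x1 - 3*x2 exactly, so the fit should recover those values.
predictors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [1.0, 3.0, -2.0, 0.0, 2.0, -3.0]

beta = fit_mlr(predictors, y)
r2 = r_squared(predictors, y, beta)
print("coefficients:", [round(b, 6) for b in beta])
print("R squared:", round(r2, 6))
```

    Solving the normal equations directly is fine for a small, well-conditioned example like this one; statistical packages generally use more numerically stable decompositions.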

  5. A comparison of radiometric correction techniques in the evaluation of the relationship between LST and NDVI in Landsat imagery.

    PubMed

    Tan, Kok Chooi; Lim, Hwee San; Matjafri, Mohd Zubir; Abdullah, Khiruddin

    2012-06-01

    Atmospheric corrections for multi-temporal optical satellite images are necessary, especially in change detection analyses, such as normalized difference vegetation index (NDVI) rationing. Abrupt change detection analysis using remote-sensing techniques requires radiometric congruity and atmospheric correction to monitor terrestrial surfaces over time. Two atmospheric correction methods were used for this study: relative radiometric normalization and the simplified method for atmospheric correction (SMAC) in the solar spectrum. A multi-temporal data set consisting of two sets of Landsat images from the period between 1991 and 2002 of Penang Island, Malaysia, was used to compare NDVI maps, which were generated using the proposed atmospheric correction methods. Land surface temperature (LST) was retrieved using ATCOR3_T in PCI Geomatica 10.1 image processing software. Linear regression analysis was utilized to analyze the relationship between NDVI and LST. This study reveals that both of the proposed atmospheric correction methods yielded high accuracy through examination of the linear correlation coefficients. To check for the accuracy of the equation obtained through linear regression analysis for every single satellite image, 20 points were randomly chosen. The results showed that the SMAC method yielded a constant value (in terms of error) to predict the NDVI value from linear regression analysis-derived equation. The errors (average) from both proposed atmospheric correction methods were less than 10%.

  6. Optimizing the Performance of Radionuclide Identification Software in the Hunt for Nuclear Security Threats

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fotion, Katherine A.

    2016-08-18

    The Radionuclide Analysis Kit (RNAK), my team’s most recent nuclide identification software, is entering the testing phase. A question arises: will removing rare nuclides from the software’s library improve its overall performance? An affirmative response indicates fundamental errors in the software’s framework, while a negative response confirms the effectiveness of the software’s key machine learning algorithms. After thorough testing, I found that the performance of RNAK cannot be improved with the library choice effect, thus verifying the effectiveness of RNAK’s algorithms—multiple linear regression, Bayesian network using the Viterbi algorithm, and branch and bound search.

  7. Stress Regression Analysis of Asphalt Concrete Deck Pavement Based on Orthogonal Experimental Design and Interlayer Contact

    NASA Astrophysics Data System (ADS)

    Wang, Xuntao; Feng, Jianhu; Wang, Hu; Hong, Shidi; Zheng, Supei

    2018-03-01

    A three-dimensional finite element model of a box girder bridge and its asphalt concrete deck pavement was established with ANSYS software, and the interlayer bonding of the asphalt concrete deck pavement was modeled as a contact bonding condition. An orthogonal experimental design was used to arrange the test plans for the material parameters, and the effect of the different material parameters on the mechanical response of the asphalt concrete surface layer was evaluated with a multiple linear regression model fitted to the results of the finite element analysis. The results indicate that the stress regression equations predict the stress of the asphalt concrete surface layer well, and that the elastic modulus of the waterproof layer has a significant influence on the stress values of the asphalt concrete surface layer.

  8. Graphical Tools for Linear Structural Equation Modeling

    DTIC Science & Technology

    2014-06-01

    others. 4 Kenny and Milan (2011) write, “Identification is perhaps the most difficult concept for SEM researchers to understand. We have seen SEM...model to using typical SEM software to determine model identifiability. Kenny and Milan (2011) list the following drawbacks: (i) If poor starting...the well known recursive and null rules (Bollen, 1989) and the regression rule (Kenny and Milan, 2011). A Simple Criterion for Identifying Individual

  9. Regression dilution bias: tools for correction methods and sample size calculation.

    PubMed

    Berglund, Lars

    2012-08-01

    Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
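
    The slope correction described above can be sketched in a few lines. This is a minimal illustration of the common textbook form of the correction (observed slope divided by a reliability ratio), assuming classical, independent measurement error in the risk factor; it is not the software supplied with the article, and the function names are hypothetical.

```python
from statistics import mean


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def reliability_ratio(first, second):
    """Estimate var(true) / var(observed) from duplicate measurements.

    Assumption: measurement errors are independent of each other and of
    the true value, so the covariance between the two replicates
    estimates the variance of the true values.
    """
    m1, m2 = mean(first), mean(second)
    n = len(first)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second)) / (n - 1)
    var1 = sum((a - m1) ** 2 for a in first) / (n - 1)
    var2 = sum((b - m2) ** 2 for b in second) / (n - 1)
    return cov / ((var1 + var2) / 2)


def corrected_slope(observed_slope, ratio):
    """Undo the attenuation by dividing by the reliability ratio."""
    if ratio <= 0:
        raise ValueError("reliability ratio must be positive")
    return observed_slope / ratio
```

    Under these assumptions, an observed slope of 0.5 combined with a reliability ratio of 0.8 gives a corrected slope of 0.625, which shows how a noisy risk factor pulls the estimate toward zero.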

  10. Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data.

    PubMed

    Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2017-04-01

    To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field in the elderly. When refractive error from both eyes were analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI -0.03 to 0.32D, p = 0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, p = 0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller p-values, while analysis of the worse eye provided larger p-values than mixed effects models and marginal models. In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision.

  11. NASA Software Cost Estimation Model: An Analogy Based Estimation Model

    NASA Technical Reports Server (NTRS)

    Hihn, Jairus; Juster, Leora; Menzies, Tim; Mathew, George; Johnson, James

    2015-01-01

    The cost estimation of software development activities is increasingly critical for large scale integrated projects such as those at DOD and NASA, especially as the software systems become larger and more complex. As an example, MSL (Mars Scientific Laboratory), developed at the Jet Propulsion Laboratory, launched with over 2 million lines of code, making it the largest robotic spacecraft ever flown (based on the size of the software). Software development activities are also notorious for their cost growth, with NASA flight software averaging over 50% cost growth. All across the agency, estimators and analysts are increasingly being tasked to develop reliable cost estimates in support of program planning and execution. While there has been extensive work on improving parametric methods, there is very little focus on the use of models based on analogy and clustering algorithms. In this paper we summarize our findings on effort/cost model estimation and model development based on ten years of software effort estimation research using data mining and machine learning methods to develop estimation models based on analogy and clustering. The NASA Software Cost Model performance is evaluated by comparing it to COCOMO II, linear regression, and K-nearest neighbor prediction model performance on the same data set.
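
    As a rough illustration of one of the baselines named above, a K-nearest neighbor effort estimate can be sketched as the mean effort of the k most similar past projects. This is a generic sketch, not the NASA Software Cost Model; the function name, the feature vectors, and the use of unscaled Euclidean distance are all assumptions made for the example.

```python
from math import dist


def knn_effort_estimate(history, query, k=3):
    """Estimate effort as the mean effort of the k nearest past projects.

    history: list of (feature_vector, effort) pairs (hypothetical data).
    query:   feature vector of the project to estimate.
    Assumption: features are already on comparable scales, because
    Euclidean distance is dominated by large-valued features otherwise.
    """
    if not 1 <= k <= len(history):
        raise ValueError("k must be between 1 and the number of projects")
    nearest = sorted(history, key=lambda rec: dist(rec[0], query))[:k]
    return sum(effort for _, effort in nearest) / k
```

    In practice the features would usually be normalized first, and k would be chosen by cross-validation on the historical data.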

  12. Interaction Models for Functional Regression.

    PubMed

    Usset, Joseph; Staicu, Ana-Maria; Maity, Arnab

    2016-02-01

    A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described. The proposed method can be easily implemented through existing software. Numerical studies show that fitting an additive model in the presence of interaction leads to both poor estimation performance and lost prediction power, while fitting an interaction model where there is in fact no interaction leads to negligible losses. The methodology is illustrated on the AneuRisk65 study data.

  13. Linear regression analysis for comparing two measurers or methods of measurement: but which regression?

    PubMed

    Ludbrook, John

    2010-07-01

    1. There are two reasons for wanting to compare measurers or methods of measurement. One is to calibrate one method or measurer against another; the other is to detect bias. Fixed bias is present when one method gives higher (or lower) values across the whole range of measurement. Proportional bias is present when one method gives values that diverge progressively from those of the other. 2. Linear regression analysis is a popular method for comparing methods of measurement, but the familiar ordinary least squares (OLS) method is rarely acceptable. The OLS method requires that the x values are fixed by the design of the study, whereas it is usual that both y and x values are free to vary and are subject to error. In this case, special regression techniques must be used. 3. Clinical chemists favour techniques such as major axis regression ('Deming's method'), the Passing-Bablok method or the bivariate least median squares method. Other disciplines, such as allometry, astronomy, biology, econometrics, fisheries research, genetics, geology, physics and sports science, have their own preferences. 4. Many Monte Carlo simulations have been performed to try to decide which technique is best, but the results are almost uninterpretable. 5. I suggest that pharmacologists and physiologists should use ordinary least products regression analysis (geometric mean regression, reduced major axis regression): it is versatile, can be used for calibration or to detect bias and can be executed by hand-held calculator or by using the loss function in popular, general-purpose, statistical software.
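
    The ordinary least products (geometric mean) fit recommended above is simple enough to sketch directly. The version below uses the standard closed form, in which the slope is the ratio of the standard deviations of y and x with the sign of their covariance; the function name is hypothetical, and the sketch omits the confidence intervals that would be needed to judge bias formally.

```python
from math import sqrt
from statistics import mean


def olp_regression(x, y):
    """Ordinary least products (geometric mean) regression of y on x.

    Returns (intercept, slope). The slope is sign(Sxy) * sqrt(Syy / Sxx),
    which treats x and y symmetrically, unlike ordinary least squares.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0 or sxy == 0:
        raise ValueError("slope is undefined for constant or uncorrelated data")
    slope = sqrt(syy / sxx)
    if sxy < 0:
        slope = -slope
    intercept = my - slope * mx
    return intercept, slope
```

    Read against the abstract's terminology, an intercept that differs from 0 would point to fixed bias and a slope that differs from 1 to proportional bias.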

  14. Functional mixture regression.

    PubMed

    Yao, Fang; Fu, Yuejiao; Lee, Thomas C M

    2011-04-01

    In functional linear models (FLMs), the relationship between the scalar response and the functional predictor process is often assumed to be identical for all subjects. Motivated by both practical and methodological considerations, we relax this assumption and propose a new class of functional regression models that allow the regression structure to vary for different groups of subjects. By projecting the predictor process onto its eigenspace, the new functional regression model is simplified to a framework that is similar to classical mixture regression models. This leads to the proposed approach named as functional mixture regression (FMR). The estimation of FMR can be readily carried out using existing software implemented for functional principal component analysis and mixture regression. The practical necessity and performance of FMR are illustrated through applications to a longevity analysis of female medflies and a human growth study. Theoretical investigations concerning the consistent estimation and prediction properties of FMR along with simulation experiments illustrating its empirical properties are presented in the supplementary material available at Biostatistics online. Corresponding results demonstrate that the proposed approach could potentially achieve substantial gains over traditional FLMs.

  15. Caries risk assessment in schoolchildren - a form based on Cariogram® software

    PubMed Central

    CABRAL, Renata Nunes; HILGERT, Leandro Augusto; FABER, Jorge; LEAL, Soraya Coelho

    2014-01-01

    Identifying caries risk factors is an important measure which contributes to a better understanding of the cariogenic profile of the patient. The Cariogram® software provides this analysis, and protocols simplifying the method were suggested. Objectives: The aim of this study was to determine whether a newly developed Caries Risk Assessment (CRA) form based on the Cariogram® software could classify schoolchildren according to their caries risk and to evaluate relationships between caries risk and the variables in the form. Material and Methods: 150 schoolchildren aged 5 to 7 years old were included in this survey. Caries prevalence was obtained according to the International Caries Detection and Assessment System (ICDAS) II. Information for filling in the form based on Cariogram® was collected clinically and from questionnaires sent to parents. Linear regression and a forward stepwise multiple regression model were applied to correlate the variables included in the form with the caries risk. Results: Caries prevalence in primary dentition was 98.6% when enamel and dentine carious lesions were included, and 77.3% when only dentine lesions were considered. Eighty-six percent of the children were classified as at moderate caries risk. The forward stepwise multiple regression model result was significant (R² = 0.904; p < 0.00001), showing that the most significant factors influencing caries risk were caries experience, oral hygiene, frequency of food consumption, sugar consumption and fluoride sources. Conclusion: The use of the form based on the Cariogram® software enabled classification of the schoolchildren at low, moderate and high caries risk. Caries experience, oral hygiene, frequency of food consumption, sugar consumption and fluoride sources are the variables that were shown to be highly correlated with caries risk. PMID:25466473

  16. A practical data processing workflow for multi-OMICS projects.

    PubMed

    Kohl, Michael; Megger, Dominik A; Trippler, Martin; Meckel, Hagen; Ahrens, Maike; Bracht, Thilo; Weber, Frank; Hoffmann, Andreas-Claudius; Baba, Hideo A; Sitek, Barbara; Schlaak, Jörg F; Meyer, Helmut E; Stephan, Christian; Eisenacher, Martin

    2014-01-01

    Multi-OMICS approaches aim at the integration of quantitative data obtained for different biological molecules in order to understand their interrelation and the functioning of larger systems. This paper deals with several data integration and data processing issues that frequently occur within this context. To this end, the data processing workflow within the PROFILE project is presented, a multi-OMICS project that aims at the identification of novel biomarkers and the development of new therapeutic targets for seven important liver diseases. Furthermore, a software tool called CrossPlatformCommander is sketched, which facilitates several steps of the proposed workflow in a semi-automatic manner. Application of the software is presented for the detection of novel biomarkers, their ranking and annotation with existing knowledge using the example of corresponding Transcriptomics and Proteomics data sets obtained from patients suffering from hepatocellular carcinoma. Additionally, a linear regression analysis of Transcriptomics vs. Proteomics data is presented and its performance assessed. It was shown that, for capturing profound relations between Transcriptomics and Proteomics data, a simple linear regression analysis is not sufficient and implementation and evaluation of alternative statistical approaches are needed. Additionally, the integration of multivariate variable selection and classification approaches is intended for further development of the software. Although this paper focuses only on the combination of data obtained from quantitative Proteomics and Transcriptomics experiments, several approaches and data integration steps are also applicable for other OMICS technologies. Keeping specific restrictions in mind, the suggested workflow (or at least parts of it) may be used as a template for similar projects that make use of different high throughput techniques.
    This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. Copyright © 2013 Elsevier B.V. All rights reserved.

  17. Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data

    PubMed Central

    Ying, Gui-shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2017-01-01

    Purpose: To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. Methods: We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field data in the elderly. Results: When refractive error from both eyes was analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI −0.03 to 0.32D, P=0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, P=0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller P-values, while analysis of the worse eye provided larger P-values than mixed effects models and marginal models. Conclusion: In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision. PMID:28102741

  18. Development of parallel line analysis criteria for recombinant adenovirus potency assay and definition of a unit of potency.

    PubMed

    Ogawa, Yasushi; Fawaz, Farah; Reyes, Candice; Lai, Julie; Pungor, Erno

    2007-01-01

    Parameter settings of a parallel line analysis procedure were defined by applying statistical analysis procedures to the absorbance data from a cell-based potency bioassay for a recombinant adenovirus, Adenovirus 5 Fibroblast Growth Factor-4 (Ad5FGF-4). The parallel line analysis was performed with commercially available software, PLA 1.2. The software performs a Dixon outlier test on replicates of the absorbance data, performs linear regression analysis to define the linear region of the absorbance data, and tests parallelism between the linear regions of standard and sample. The width of the fiducial limit, expressed as a percent of the measured potency, was developed as a criterion for rejection of the assay data and to significantly improve the reliability of the assay results. With the linear range-finding criteria of the software set to a minimum of 5 consecutive dilutions and best statistical outcome, and in combination with the fiducial limit width acceptance criterion of <135%, 13% of the assay results were rejected. With these criteria applied, the assay was found to be linear over the range of 0.25 to 4 relative potency units, defined as the potency of the sample normalized to the potency of Ad5FGF-4 standard containing 6 × 10⁶ adenovirus particles/mL. The overall precision of the assay was estimated to be 52%. Without the application of the fiducial limit width criterion, the assay results were not linear over the range, and an overall precision of 76% was calculated from the data. An absolute unit of potency for the assay was defined by using the parallel line analysis procedure as the amount of Ad5FGF-4 that results in an absorbance value that is 121% of the average absorbance readings of the wells containing cells not infected with the adenovirus.

  19. [The analysis of threshold effect using Empower Stats software].

    PubMed

    Lin, Lin; Chen, Chang-zhong; Yu, Xiao-dan

    2013-11-01

    In many biomedical studies, a factor has no influence on the outcome variable, or has a positive effect, only within a certain range; beyond a certain threshold value, the size and/or direction of the effect changes. This is called a threshold effect. Whether a factor (x) has a threshold effect on the outcome variable (y) can be examined by fitting a smooth curve to see whether there is a piecewise linear relationship, and then analyzing the threshold effect with a segmented regression model, a likelihood ratio test (LRT), and bootstrap resampling. The Empower Stats software, developed by the American company X & Y Solutions Inc., has a threshold effect analysis module. The user can supply a threshold value at which the data are segmented for fitting; alternatively, if no threshold is supplied, the software determines the optimal threshold from the data automatically and calculates its confidence interval.
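
    The idea of searching for an optimal threshold can be illustrated with a simplified sketch. The code below is not the Empower Stats algorithm: it assumes distinct x values, fits two independent straight lines on either side of each candidate split (so the fitted pieces need not join), and picks the split with the smallest total squared error. The likelihood ratio test and bootstrap confidence interval mentioned above are omitted, and all names are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least squares line; returns (intercept, slope, sum of squared errors)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sse


def best_threshold(x, y, min_points=3):
    """Grid-search the split that minimizes the two-segment squared error.

    Returns (threshold, slope_below, slope_above). Assumes distinct x
    values and at least 2 * min_points observations.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(min_points, len(pairs) - min_points + 1):
        left, right = pairs[:i], pairs[i:]
        _, slope_lo, sse_lo = fit_line(*zip(*left))
        _, slope_hi, sse_hi = fit_line(*zip(*right))
        total = sse_lo + sse_hi
        if best is None or total < best[0]:
            # Report the midpoint between the two neighbouring x values.
            threshold = (left[-1][0] + right[0][0]) / 2
            best = (total, threshold, slope_lo, slope_hi)
    if best is None:
        raise ValueError("not enough observations for two segments")
    return best[1], best[2], best[3]
```

    A clear difference between the two returned slopes is only suggestive; a formal comparison against a single-line model would still be needed before claiming a threshold effect.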

  20. Factors associated with parasite dominance in fishes from Brazil.

    PubMed

    Amarante, Cristina Fernandes do; Tassinari, Wagner de Souza; Luque, Jose Luis; Pereira, Maria Julia Salim

    2016-06-14

    The present study used regression models to evaluate the existence of factors that may influence the numerical parasite dominance with an epidemiological approximation. A database including 3,746 fish specimens and their respective parasites were used to evaluate the relationship between parasite dominance and biotic characteristics inherent to the studied hosts and the parasite taxa. Multivariate, classical, and mixed effects linear regression models were fitted. The calculations were performed using R software (95% CI). In the fitting of the classical multiple linear regression model, freshwater and planktivorous fish species and body length, as well as the species of the taxa Trematoda, Monogenea, and Hirudinea, were associated with parasite dominance. However, the fitting of the mixed effects model showed that the body length of the host and the species of the taxa Nematoda, Trematoda, Monogenea, Hirudinea, and Crustacea were significantly associated with parasite dominance. Studies that consider specific biological aspects of the hosts and parasites should expand the knowledge regarding factors that influence the numerical dominance of fish in Brazil. The use of a mixed model shows, once again, the importance of the appropriate use of a model correlated with the characteristics of the data to obtain consistent results.

  1. Female married illiteracy as the most important continual determinant of total fertility rate among districts of Empowered Action Group States of India: Evidence from Annual Health Survey 2011-12.

    PubMed

    Kumar, Rajesh; Dogra, Vishal; Rani, Khushbu; Sahu, Kanti

    2017-01-01

    District level determinants of total fertility rate in Empowered Action Group states of India can help in ongoing population stabilization programs in India. Present study intends to assess the role of district level determinants in predicting total fertility rate among districts of the Empowered Action Group states of India. Data from Annual Health Survey (2011-12) was analysed using STATA and R software packages. Multiple linear regression models were built and evaluated using Akaike Information Criterion. For further understanding, recursive partitioning was used to prepare a regression tree. Female married illiteracy positively associated with total fertility rate and explained more than half (53%) of variance. Under multiple linear regression model, married illiteracy, infant mortality rate, Ante natal care registration, household size, median age of live birth and sex ratio explained 70% of total variance in total fertility rate. In regression tree, female married illiteracy was the root node and splits at 42% determined TFR <= 2.7. The next left side branch was again married illiteracy with splits at 23% to determine TFR <= 2.1. We conclude that female married illiteracy is one of the most important determinants explaining total fertility rate among the districts of an Empowered Action Group states. Focus on female literacy is required to stabilize the population growth in long run.

  2. Introduction to methodology of dose-response meta-analysis for binary outcome: With application on software.

    PubMed

    Zhang, Chao; Jia, Pengli; Yu, Liu; Xu, Chang

    2018-05-01

    Dose-response meta-analysis (DRMA) is widely applied to investigate the dose-specific relationship between independent and dependent variables. Such methods have been in use for over 30 years and are increasingly employed in healthcare and clinical decision-making. In this article, we give an overview of the methodology used in DRMA. We summarize the commonly used regression model and the pooled method in DRMA. We also use an example to illustrate how to employ a DRMA by these methods. Five regression models, linear regression, piecewise regression, natural polynomial regression, fractional polynomial regression, and restricted cubic spline regression, were illustrated in this article to fit the dose-response relationship. And two types of pooling approaches, that is, one-stage approach and two-stage approach are illustrated to pool the dose-response relationship across studies. The example showed similar results among these models. Several dose-response meta-analysis methods can be used for investigating the relationship between exposure level and the risk of an outcome. However the methodology of DRMA still needs to be improved. © 2018 Chinese Cochrane Center, West China Hospital of Sichuan University and John Wiley & Sons Australia, Ltd.

  3. The correlation between preoperative volumetry and real graft weight: comparison of two volumetry programs.

    PubMed

    Mussin, Nadiar; Sumo, Marco; Lee, Kwang-Woong; Choi, YoungRok; Choi, Jin Yong; Ahn, Sung-Woo; Yoon, Kyung Chul; Kim, Hyo-Sin; Hong, Suk Kyun; Yi, Nam-Joon; Suh, Kyung-Suk

    2017-04-01

    Liver volumetry is a vital component in living donor liver transplantation to determine an adequate graft volume that meets the metabolic demands of the recipient and at the same time ensures donor safety. Most institutions use preoperative contrast-enhanced CT image-based software programs to estimate graft volume. The objective of this study was to evaluate the accuracy of 2 liver volumetry programs (Rapidia vs. Dr. Liver) in preoperative right liver graft estimation compared with real graft weight. Data from 215 consecutive right lobe living donors between October 2013 and August 2015 were retrospectively reviewed. One hundred seven patients were enrolled in Rapidia group and 108 patients were included in the Dr. Liver group. Estimated graft volumes generated by both software programs were compared with real graft weight measured during surgery, and further classified into minimal difference (≤15%) and big difference (>15%). Correlation coefficients and degree of difference were determined. Linear regressions were calculated and results depicted as scatterplots. Minimal difference was observed in 69.4% of cases from Dr. Liver group and big difference was seen in 44.9% of cases from Rapidia group (P = 0.035). Linear regression analysis showed positive correlation in both groups (P < 0.01). However, the correlation coefficient was better for the Dr. Liver group (R² = 0.719) than for the Rapidia group (R² = 0.688). Dr. Liver can accurately predict right liver graft size better and faster than Rapidia, and can facilitate preoperative planning in living donor liver transplantation.

  4. [Cytocompatibility of Co-Cr ceramic alloys after recasting].

    PubMed

    Hu, Yu-Feng; Jin, Wen-Zhong

    2017-06-01

    To study the correlation between apical foramen area and accuracy of the PropexII electronic apex locator when the apical constriction is destroyed. Forty extracted teeth with a single straight root canal were ground down 1 mm at the root tip and placed in 2% liquid agar gel injected into a Castro model. The length of the root canal was measured by the PropexII electronic apex locator. The difference (L) between the electronic length (LP) and actual length was calculated. Imaging of the apical foramen was recorded under a microscope and the apical foramen area (S) was measured by the image processing software Photoshop CS. The SPSS 22.0 software package was used to analyze the linear correlation and regression. With ±0.5 mm as the allowable range, all values of L were positive. The accuracy rate of PropexII was 52.5% when the apical constriction was destroyed. There was a linear relationship between S and L (S = 0.04 + 0.11 × L, R = 0.903). The accuracy decreases when the apical constriction is destroyed. The accuracy is worse when the apical foramen area is larger.

  5. CEval: All-in-one software for data processing and statistical evaluations in affinity capillary electrophoresis.

    PubMed

    Dubský, Pavel; Ördögová, Magda; Malý, Michal; Riesová, Martina

    2016-05-06

    We introduce CEval software (downloadable for free at echmet.natur.cuni.cz) that was developed for quicker and easier electrophoregram evaluation and further data processing in (affinity) capillary electrophoresis. This software allows for automatic peak detection and evaluation of common peak parameters, such as its migration time, area, width etc. Additionally, the software includes a nonlinear regression engine that performs peak fitting with the Haarhoff-van der Linde (HVL) function, including automated initial guess of the HVL function parameters. HVL is a fundamental peak-shape function in electrophoresis, based on which the correct effective mobility of the analyte represented by the peak is evaluated. Effective mobilities of an analyte at various concentrations of a selector can be further stored and plotted in an affinity CE mode. Consequently, the mobility of the free analyte, μA, mobility of the analyte-selector complex, μAS, and the apparent complexation constant, K′, are first guessed automatically from the linearized data plots and subsequently estimated by the means of nonlinear regression. An option that allows two complexation dependencies to be fitted at once is especially convenient for enantioseparations. Statistical processing of these data is also included, which allowed us to: i) express the 95% confidence intervals for the μA, μAS and K′ least-squares estimates, ii) do hypothesis testing on the estimated parameters for the first time. We demonstrate the benefits of the CEval software by inspecting complexation of tryptophan methyl ester with two cyclodextrins, neutral heptakis(2,6-di-O-methyl)-β-CD and charged heptakis(6-O-sulfo)-β-CD. Copyright © 2016 Elsevier B.V. All rights reserved.

  6. SCOPA and META-SCOPA: software for the analysis and aggregation of genome-wide association studies of multiple correlated phenotypes.

    PubMed

    Mägi, Reedik; Suleimanov, Yury V; Clarke, Geraldine M; Kaakinen, Marika; Fischer, Krista; Prokopenko, Inga; Morris, Andrew P

    2017-01-11

    Genome-wide association studies (GWAS) of single nucleotide polymorphisms (SNPs) have been successful in identifying loci contributing genetic effects to a wide range of complex human diseases and quantitative traits. The traditional approach to GWAS analysis is to consider each phenotype separately, despite the fact that many diseases and quantitative traits are correlated with each other, and often measured in the same sample of individuals. Multivariate analyses of correlated phenotypes have been demonstrated, by simulation, to increase power to detect association with SNPs, and thus may enable improved detection of novel loci contributing to diseases and quantitative traits. We have developed the SCOPA software to enable GWAS analysis of multiple correlated phenotypes. The software implements "reverse regression" methodology, which treats the genotype of an individual at a SNP as the outcome and the phenotypes as predictors in a general linear model. SCOPA can be applied to quantitative traits and categorical phenotypes, and can accommodate imputed genotypes under a dosage model. The accompanying META-SCOPA software enables meta-analysis of association summary statistics from SCOPA across GWAS. Application of SCOPA to two GWAS of high- and low-density lipoprotein cholesterol, triglycerides and body mass index, and subsequent meta-analysis with META-SCOPA, highlighted stronger association signals than univariate phenotype analysis at established lipid and obesity loci. The META-SCOPA meta-analysis also revealed a novel signal of association at genome-wide significance for triglycerides mapping to GPC5 (lead SNP rs71427535, p = 1.1 × 10⁻⁸), which has not been reported in previous large-scale GWAS of lipid traits. The SCOPA and META-SCOPA software enable discovery and dissection of multiple phenotype association signals through implementation of a powerful reverse regression approach.

  7. Clinically Practical Approach for Screening of Low Muscularity Using Electronic Linear Measures on Computed Tomography Images in Critically Ill Patients.

    PubMed

    Avrutin, Egor; Moisey, Lesley L; Zhang, Roselyn; Khattab, Jenna; Todd, Emma; Premji, Tahira; Kozar, Rosemary; Heyland, Daren K; Mourtzakis, Marina

    2017-12-06

    Computed tomography (CT) scans performed during routine hospital care offer the opportunity to quantify skeletal muscle and predict mortality and morbidity in intensive care unit (ICU) patients. Existing methods of muscle cross-sectional area (CSA) quantification require specialized software, training, and time commitment that may not be feasible in a clinical setting. In this article, we explore a new screening method to identify patients with low muscle mass. We analyzed 145 scans of elderly ICU patients (≥65 years old) using a combination of measures obtained with a digital ruler, commonly found on hospital radiological software. The psoas and paraspinal muscle groups at the level of the third lumbar vertebra (L3) were evaluated by using 2 linear measures each and compared with an established method of CT image analysis of total muscle CSA in the L3 region. There was a strong association between linear measures of psoas and paraspinal muscle groups and total L3 muscle CSA (R² = 0.745, P < 0.001). Linear measures, age, and sex were included as covariates in a multiple logistic regression to predict those with low muscle mass; receiver operating characteristic (ROC) area under the curve (AUC) of the combined psoas and paraspinal linear index model was 0.920. Intraclass correlation coefficients (ICCs) were used to evaluate intrarater and interrater reliability, resulting in scores of 0.979 (95% CI: 0.940-0.992) and 0.937 (95% CI: 0.828-0.978), respectively. A digital ruler can reliably predict L3 muscle CSA, and these linear measures may be used to identify critically ill patients with low muscularity who are at risk for worse clinical outcomes. © 2017 American Society for Parenteral and Enteral Nutrition.

  8. Software requirements specification for the GIS-T/ISTEA pooled fund study phase C linear referencing engine

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Amai, W.; Espinoza, J. Jr.; Fletcher, D.R.

    1997-06-01

    This Software Requirements Specification (SRS) describes the features to be provided by the software for the GIS-T/ISTEA Pooled Fund Study Phase C Linear Referencing Engine project. This document conforms to the recommendations of IEEE Standard 830-1984, IEEE Guide to Software Requirements Specification (Institute of Electrical and Electronics Engineers, Inc., 1984). The software specified in this SRS is a proof-of-concept implementation of the Linear Referencing Engine as described in the GIS-T/ISTEA Pooled Fund Study Phase B Summary, specifically Sheet 13 of the Phase B object model. The software allows an operator to convert between two linear referencing methods and a datum network.

  9. What automated age estimation of hand and wrist MRI data tells us about skeletal maturation in male adolescents.

    PubMed

    Urschler, Martin; Grassegger, Sabine; Štern, Darko

    2015-01-01

    Age estimation of individuals is important in human biology and has various medical and forensic applications. Recent interest in MR-based methods aims to investigate alternatives for established methods involving ionising radiation. Automatic, software-based methods additionally promise improved estimation objectivity. To investigate how informative automatically selected image features are regarding their ability to discriminate age, by exploring a recently proposed software-based age estimation method for MR images of the left hand and wrist. One hundred and two MR datasets of left hand images are used to evaluate age estimation performance, consisting of bone and epiphyseal gap volume localisation, computation of one age regression model per bone mapping image features to age and fusion of individual bone age predictions to a final age estimate. Quantitative results of the software-based method show an age estimation performance with a mean absolute difference of 0.85 years (SD = 0.58 years) to chronological age, as determined by a cross-validation experiment. Qualitatively, it is demonstrated how feature selection works and which image features of skeletal maturation are automatically chosen to model the non-linear regression function. Feasibility of automatic age estimation based on MRI data is shown and selected image features are found to be informative for describing anatomical changes during physical maturation in male adolescents.

  10. Methods for cost estimation in software project management

    NASA Astrophysics Data System (ADS)

    Briciu, C. V.; Filip, I.; Indries, I. I.

    2016-02-01

    The speed at which the processes used in the software development field have changed makes forecasting the overall cost of a software project very difficult. Many researchers have considered this task unachievable, but a group of scientists holds that it can be solved using well-known mathematical methods (e.g., multiple linear regression) and newer techniques such as genetic programming and neural networks. The paper presents a solution for building a cost estimation model for software project management using genetic algorithms, starting from the PROMISE datasets related to the COCOMO 81 model. The first part of the paper summarizes the major achievements in research on models for estimating overall project costs and describes the existing software development process models. The last part proposes a basic mathematical model based on genetic programming, including a description of the chosen fitness function and chromosome representation. The perspective of the model described is linked with the current reality of software development, taking as a basis the software product life cycle and the current challenges and innovations in the software development area. Based on the authors' experience and the analysis of the existing models and product life cycle, it was concluded that estimation models should be adapted to new technologies and emerging systems and that they depend largely on the chosen software development method.

  11. Advanced statistics: linear regression, part I: simple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
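
As a minimal sketch of the method of least squares mentioned in this record: the slope is the ratio of the summed cross-deviations to the summed squared deviations of the predictor, and the intercept follows from the two means. The function name `fit_simple_linear_regression` and the toy data are hypothetical and are not taken from the article.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of the predictor.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    # Sum of cross-deviations between predictor and outcome.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
predictor = [0.0, 1.0, 2.0, 3.0]
outcome = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_simple_linear_regression(predictor, outcome)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

With real data the points would scatter around the line, and the residuals would need to be checked against the assumptions the article reviews.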

  12. An Instructional Note on Linear Programming--A Pedagogically Sound Approach.

    ERIC Educational Resources Information Center

    Mitchell, Richard

    1998-01-01

    Discusses the place of linear programming in college curricula and the advantages of using linear-programming software. Lists important characteristics of computer software used in linear programming for more effective teaching and learning. (ASK)

  13. Image analysis software for following progression of peripheral neuropathy

    NASA Astrophysics Data System (ADS)

    Epplin-Zapf, Thomas; Miller, Clayton; Larkin, Sean; Hermesmeyer, Eduardo; Macy, Jenny; Pellegrini, Marco; Luccarelli, Saverio; Staurenghi, Giovanni; Holmes, Timothy

    2009-02-01

    A relationship has been reported by several research groups [1 - 4] between the density and shapes of nerve fibers in the cornea and the existence and severity of peripheral neuropathy. Peripheral neuropathy is a complication of several prevalent diseases or conditions, which include diabetes, HIV, prolonged alcohol overconsumption and aging. A common clinical technique for confirming the condition is intramuscular electromyography (EMG), which is invasive, so a noninvasive technique like the one proposed here carries important potential advantages for the physician and patient. A software program that automatically detects the nerve fibers, counts them and measures their shapes is being developed and tested. Tests were carried out with a database of subjects with levels of severity of diabetic neuropathy as determined by EMG testing. Results from this testing, which include a linear regression analysis, are shown.

  14. Female married illiteracy as the most important continual determinant of total fertility rate among districts of Empowered Action Group States of India: Evidence from Annual Health Survey 2011–12

    PubMed Central

    Kumar, Rajesh; Dogra, Vishal; Rani, Khushbu; Sahu, Kanti

    2017-01-01

    Background: District level determinants of total fertility rate in Empowered Action Group states of India can help in ongoing population stabilization programs in India. Objective: The present study intends to assess the role of district level determinants in predicting total fertility rate among districts of the Empowered Action Group states of India. Material and Methods: Data from the Annual Health Survey (2011-12) were analysed using STATA and R software packages. Multiple linear regression models were built and evaluated using Akaike Information Criterion. For further understanding, recursive partitioning was used to prepare a regression tree. Results: Female married illiteracy was positively associated with total fertility rate and explained more than half (53%) of the variance. Under the multiple linear regression model, married illiteracy, infant mortality rate, antenatal care registration, household size, median age of live birth and sex ratio explained 70% of total variance in total fertility rate. In the regression tree, female married illiteracy was the root node and splits at 42% determined TFR <= 2.7. The next left side branch was again married illiteracy with splits at 23% to determine TFR <= 2.1. Conclusion: We conclude that female married illiteracy is one of the most important determinants explaining total fertility rate among the districts of the Empowered Action Group states. Focus on female literacy is required to stabilize population growth in the long run. PMID:29416999

  15. AGSuite: Software to conduct feature analysis of artificial grammar learning performance.

    PubMed

    Cook, Matthew T; Chubala, Chrissy M; Jamieson, Randall K

    2017-10-01

    To simplify the problem of studying how people learn natural language, researchers use the artificial grammar learning (AGL) task. In this task, participants study letter strings constructed according to the rules of an artificial grammar and subsequently attempt to discriminate grammatical from ungrammatical test strings. Although the data from these experiments are usually analyzed by comparing the mean discrimination performance between experimental conditions, this practice discards information about the individual items and participants that could otherwise help uncover the particular features of strings associated with grammaticality judgments. However, feature analysis is tedious to compute, often complicated, and ill-defined in the literature. Moreover, the data violate the assumption of independence underlying standard linear regression models, leading to Type I error inflation. To solve these problems, we present AGSuite, a free Shiny application for researchers studying AGL. The suite's intuitive Web-based user interface allows researchers to generate strings from a database of published grammars, compute feature measures (e.g., Levenshtein distance) for each letter string, and conduct a feature analysis on the strings using linear mixed effects (LME) analyses. The LME analysis solves the inflation of Type I errors that afflicts more common methods of repeated measures regression analysis. Finally, the software can generate a number of graphical representations of the data to support an accurate interpretation of results. We hope the ease and availability of these tools will encourage researchers to take full advantage of item-level variance in their datasets in the study of AGL. We moreover discuss the broader applicability of the tools for researchers looking to conduct feature analysis in any field.

  16. The correlation between preoperative volumetry and real graft weight: comparison of two volumetry programs

    PubMed Central

    Mussin, Nadiar; Sumo, Marco; Choi, YoungRok; Choi, Jin Yong; Ahn, Sung-Woo; Yoon, Kyung Chul; Kim, Hyo-Sin; Hong, Suk Kyun; Yi, Nam-Joon; Suh, Kyung-Suk

    2017-01-01

    Purpose: Liver volumetry is a vital component in living donor liver transplantation to determine an adequate graft volume that meets the metabolic demands of the recipient and at the same time ensures donor safety. Most institutions use preoperative contrast-enhanced CT image-based software programs to estimate graft volume. The objective of this study was to evaluate the accuracy of 2 liver volumetry programs (Rapidia vs. Dr. Liver) in preoperative right liver graft estimation compared with real graft weight. Methods: Data from 215 consecutive right lobe living donors between October 2013 and August 2015 were retrospectively reviewed. One hundred seven patients were enrolled in Rapidia group and 108 patients were included in the Dr. Liver group. Estimated graft volumes generated by both software programs were compared with real graft weight measured during surgery, and further classified into minimal difference (≤15%) and big difference (>15%). Correlation coefficients and degree of difference were determined. Linear regressions were calculated and results depicted as scatterplots. Results: Minimal difference was observed in 69.4% of cases from Dr. Liver group and big difference was seen in 44.9% of cases from Rapidia group (P = 0.035). Linear regression analysis showed positive correlation in both groups (P < 0.01). However, the correlation coefficient was better for the Dr. Liver group (R² = 0.719) than for the Rapidia group (R² = 0.688). Conclusion: Dr. Liver can accurately predict right liver graft size better and faster than Rapidia, and can facilitate preoperative planning in living donor liver transplantation. PMID:28382294

  17. Method and Excel VBA Algorithm for Modeling Master Recession Curve Using Trigonometry Approach.

    PubMed

    Posavec, Kristijan; Giacopetti, Marco; Materazzi, Marco; Birk, Steffen

    2017-11-01

    A new method was developed and implemented into an Excel Visual Basic for Applications (VBA) algorithm utilizing trigonometry laws in an innovative way to overlap recession segments of time series and create master recession curves (MRCs). Based on a trigonometry approach, the algorithm horizontally translates succeeding recession segments of time series, placing their vertex, that is, the highest recorded value of each recession segment, directly onto the appropriate connection line defined by measurement points of a preceding recession segment. The new method and algorithm continue the development of methods and algorithms for the generation of MRC, where the first published method was based on a multiple linear/nonlinear regression model approach (Posavec et al. 2006). The newly developed trigonometry-based method was tested on real case study examples and compared with the previously published multiple linear/nonlinear regression model-based method. The results show that in some cases, that is, for some time series, the trigonometry-based method creates narrower overlaps of the recession segments, resulting in higher coefficients of determination R², while in other cases the multiple linear/nonlinear regression model-based method remains superior. The Excel VBA algorithm for modeling MRC using the trigonometry approach is implemented into a spreadsheet tool (MRCTools v3.0 written by and available from Kristijan Posavec, Zagreb, Croatia) containing the previously published VBA algorithms for MRC generation and separation. All algorithms within the MRCTools v3.0 are open access and available free of charge, supporting the idea of running science on available, open, and free of charge software. © 2017, National Ground Water Association.

  18. Using LiDAR to Estimate Total Aboveground Biomass of Redwood Stands in the Jackson Demonstration State Forest, Mendocino, California

    NASA Astrophysics Data System (ADS)

    Rao, M.; Vuong, H.

    2013-12-01

    The overall objective of this study is to develop a method for estimating total aboveground biomass of redwood stands in Jackson Demonstration State Forest, Mendocino, California using airborne LiDAR data. LiDAR data, owing to their vertical and horizontal accuracy, are increasingly being used to characterize landscape features including ground surface elevation and canopy height. These LiDAR-derived metrics involving structural signatures at higher precision and accuracy can help better understand ecological processes at various spatial scales. Our study is focused on two major species of the forest: redwood (Sequoia sempervirens [D.Don] Engl.) and Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco). Specifically, the objectives included linear regression models fitting tree diameter at breast height (dbh) to LiDAR-derived height for each species. From 23 random points on the study area, field measurements (dbh and tree coordinates) were collected for more than 500 redwood and Douglas-fir trees over 0.2 ha plots. The USFS-FUSION application software along with its LiDAR Data Viewer (LDV) was used to extract a Canopy Height Model (CHM) from which tree heights would be derived. Based on the LiDAR-derived height and ground-based dbh, a linear regression model was developed to predict dbh. The predicted dbh was used to estimate the biomass at the single tree level using Jenkins' formula (Jenkins et al. 2003). The linear regression models were able to explain 65% of the variability associated with redwood dbh and 80% of that associated with Douglas-fir dbh.

  19. Prediction of accommodative optical response in prepresbyopic patients using ultrasound biomicroscopy

    PubMed Central

    Ramasubramanian, Viswanathan; Glasser, Adrian

    2015-01-01

    PURPOSE: To determine whether relatively low-resolution ultrasound biomicroscopy (UBM) can predict the accommodative optical response in prepresbyopic eyes as well as in a previous study of young phakic subjects, despite lower accommodative amplitudes. SETTING: College of Optometry, University of Houston, Houston, USA. DESIGN: Observational cross-sectional study. METHODS: Static accommodative optical response was measured with infrared photorefraction and an autorefractor (WR-5100K) in subjects aged 36 to 46 years. A 35 MHz UBM device (Vumax, Sonomed Escalon) was used to image the left eye, while the right eye viewed accommodative stimuli. Custom-developed Matlab image-analysis software was used to perform automated analysis of UBM images to measure the ocular biometry parameters. The accommodative optical response was predicted from biometry parameters using linear regression, 95% confidence intervals (CIs), and 95% prediction intervals. RESULTS: The study evaluated 25 subjects. Per-diopter (D) accommodative changes in anterior chamber depth (ACD), lens thickness, anterior and posterior lens radii of curvature, and anterior segment length were similar to previous values from young subjects. The standard deviations (SDs) of accommodative optical response predicted from linear regressions for UBM-measured biometry parameters were ACD, 0.15 D; lens thickness, 0.25 D; anterior lens radii of curvature, 0.09 D; posterior lens radii of curvature, 0.37 D; and anterior segment length, 0.42 D. CONCLUSIONS: Ultrasound biomicroscopy parameters can, on average, predict accommodative optical response with SDs of less than 0.55 D using linear regressions and 95% CIs. Ultrasound biomicroscopy can be used to visualize and quantify accommodative biometric changes and predict accommodative optical response in prepresbyopic eyes. PMID:26049831

  20. Case-mix groups for VA hospital-based home care.

    PubMed

    Smith, M E; Baker, C R; Branch, L G; Walls, R C; Grimes, R M; Karklins, J M; Kashner, M; Burrage, R; Parks, A; Rogers, P

    1992-01-01

    The purpose of this study is to group hospital-based home care (HBHC) patients homogeneously by their characteristics with respect to cost of care to develop alternative case mix methods for management and reimbursement (allocation) purposes. Six Veterans Affairs (VA) HBHC programs in Fiscal Year (FY) 1986 that maximized patient, program, and regional variation were selected, all of which agreed to participate. All HBHC patients active in each program on October 1, 1987, in addition to all new admissions through September 30, 1988 (FY88), comprised the sample of 874 unique patients. Statistical methods include the use of classification and regression trees (CART software: Statistical Software; Lafayette, CA), analysis of variance, and multiple linear regression techniques. The resulting algorithm is a three-factor model that explains 20% of the cost variance (R² = 20%, with a cross-validation R² of 12%). Similar classifications such as the RUG-II, which is utilized for VA nursing home and intermediate care, the VA outpatient resource allocation model, and the RUG-HHC, utilized in some states for reimbursing home health care in the private sector, explained less of the cost variance and, therefore, are less adequate for VA home care resource allocation.

  1. Genomic similarity and kernel methods I: advancements by building on mathematical and statistical foundations.

    PubMed

    Schaid, Daniel J

    2010-01-01

    Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
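
The positive semidefinite requirement stated in this record can be illustrated with a small sketch. It builds a kernel matrix over all pairs of subjects and tests it by attempting a Cholesky factorization with a tiny diagonal shift. The Gaussian kernel, the bandwidth, the tolerance, and the genotype vectors coded 0/1/2 are all assumptions chosen for illustration; the review does not prescribe them.

```python
import math


def gaussian_kernel(u, v, bandwidth=1.0):
    """Similarity kernel: larger values mean the two vectors are more similar."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_matrix(subjects, kernel):
    """Apply the kernel to every pair of subjects."""
    return [[kernel(u, v) for v in subjects] for u in subjects]


def is_positive_semidefinite(matrix, tol=1e-10):
    """Check a symmetric matrix by attempting a Cholesky factorization of matrix + tol * I.

    The factorization succeeds only if no eigenvalue lies below roughly -tol.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] + tol - partial
                if pivot <= 0.0:
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


# Hypothetical genotypes: minor-allele counts at three markers for four subjects.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
similarity = kernel_matrix(genotypes, gaussian_kernel)
print(is_positive_semidefinite(similarity))
```

A symmetric matrix such as `[[1, 2], [2, 1]]`, which has a negative eigenvalue, fails the same check, so an arbitrary pairwise similarity score is not automatically a valid kernel.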

  2. A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age.

    PubMed

    Wilke, Marko

    2018-02-01

    This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.

  3. Categorical Regression and Benchmark Dose Software 3.0

    EPA Science Inventory

    The objective of this full-day course is to provide participants with interactive training on the use of the U.S. Environmental Protection Agency’s (EPA) Benchmark Dose software (BMDS, version 3.0, released fall 2018) and Categorical Regression software (CatReg, version 3.1...

  4. Estimating the effects of wages on obesity.

    PubMed

    Kim, DaeHwan; Leigh, John Paul

    2010-05-01

    To estimate the effects of wages on obesity and body mass. Data on household heads, aged 20 to 65 years, with full-time jobs, were drawn from the Panel Study of Income Dynamics for 2003 to 2007. The Panel Study of Income Dynamics is a nationally representative sample. Instrumental variables (IV) for wages were created using knowledge of computer software and state legal minimum wages. Least squares (linear regression) with corrected standard errors were used to estimate the equations. Statistical tests revealed both instruments were strong and tests for over-identifying restrictions were favorable. Wages were found to be predictive (P < 0.05) of obesity and body mass in regressions both before and after applying IVs. Coefficient estimates suggested stronger effects in the IV models. Results are consistent with the hypothesis that low wages increase obesity prevalence and body mass.
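
The reason IV estimates can differ from ordinary least squares, as this record reports, can be shown with a simplified sketch. It assumes a single instrument and no other covariates, whereas the study used two instruments, so this is the basic ratio estimator rather than the authors' specification. All variable names and numbers are hypothetical and are constructed so that an unobserved confounder biases the ordinary slope.

```python
from statistics import mean


def cross_deviation_sum(a, b):
    """Sum of products of deviations from the mean (unnormalized covariance)."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    return cross_deviation_sum(x, y) / cross_deviation_sum(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV slope: cov(z, y) / cov(z, x)."""
    return cross_deviation_sum(z, y) / cross_deviation_sum(z, x)


# Hypothetical data: the instrument is uncorrelated with the confounder.
instrument = [-1.0, -1.0, 1.0, 1.0]
confounder = [-1.0, 1.0, -1.0, 1.0]          # unobserved in practice
exposure = [z + u for z, u in zip(instrument, confounder)]
# True causal slope of exposure on outcome is 2; the confounder adds 3 per unit.
outcome = [2.0 * x + 3.0 * u for x, u in zip(exposure, confounder)]

print("OLS slope:", ols_slope(exposure, outcome))
print("IV slope:", iv_slope(instrument, exposure, outcome))
```

In this constructed example the ordinary slope absorbs the confounder's effect, while the ratio estimator recovers the slope used to generate the data. That only holds because the instrument was built to be uncorrelated with the confounder.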

  5. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis

    PubMed Central

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
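
The comparison indicators named in this record (sensitivity, specificity, and area under the ROC curve) can be computed from predicted scores as in the sketch below. The labels, scores, and 0.5 threshold are hypothetical and are not the study's data. The AUC is computed as the proportion of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as positive; assumes both classes are present."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for label, s in pairs if label == 1 and s >= threshold)
    fn = sum(1 for label, s in pairs if label == 1 and s < threshold)
    tn = sum(1 for label, s in pairs if label == 0 and s < threshold)
    fp = sum(1 for label, s in pairs if label == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities from any of the three classifiers.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
sensitivity, specificity = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC={auc:.3f}, sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```

Because the AUC depends only on the ranking of scores, it allows models with different score scales, such as logistic, probit, and discriminant models, to be compared on equal terms.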

  6. Background stratified Poisson regression analysis of cohort data.

    PubMed

    Richardson, David B; Langholz, Bryan

    2012-03-01

    Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.

  7. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    PubMed

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.

  8. Competing regression models for longitudinal data.

    PubMed

    Alencar, Airlane P; Singer, Julio M; Rocha, Francisco Marcelo M

    2012-03-01

    The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest-posttest longitudinal data. In particular, we consider log-normal linear mixed models (LNLMM), generalized linear mixed models (GLMM), and models based on generalized estimating equations (GEE). We show how some special features of the data, like a nonconstant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE-based models may be preferable when the goal is to compare the marginal expected responses. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.

  10. Relationship Between Ktrans and K1 with Simultaneous Versus Separate MR/PET in Rabbits with VX2 Tumors.

    PubMed

    Lee, Kyung Hee; Kang, Seung Kwan; Goo, Jin Mo; Lee, Jae Sung; Cheon, Gi Jeong; Seo, Seongho; Hwang, Eui Jin

    2017-03-01

    To compare the relationship between Ktrans from DCE-MRI and K1 from dynamic 13N-NH3-PET, with simultaneous and separate MR/PET in the VX-2 rabbit carcinoma model. MR/PET was performed simultaneously and separately, 14 and 15 days after VX-2 tumor implantation at the paravertebral muscle. The Ktrans and K1 values were estimated using an in-house software program. The relationships between Ktrans and K1 were analyzed using Pearson's correlation coefficients and linear/non-linear regression functions. Assuming a linear relationship, Ktrans and K1 exhibited moderate positive correlations with both simultaneous (r=0.54-0.57) and separate (r=0.53-0.69) imaging. However, while the Ktrans and K1 from separate imaging were linearly correlated, those from simultaneous imaging exhibited a non-linear relationship. The amount of change in K1 associated with a unit increase in Ktrans varied depending on Ktrans values. The relationship between Ktrans and K1 may be mis-interpreted with separate MR and PET acquisition. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  11. Software-assisted small bowel motility analysis using free-breathing MRI: feasibility study.

    PubMed

    Bickelhaupt, Sebastian; Froehlich, Johannes M; Cattin, Roger; Raible, Stephan; Bouquet, Hanspeter; Bill, Urs; Patak, Michael A

    2014-01-01

    To validate a software prototype allowing for small bowel motility analysis in free breathing by comparing it to manual measurements. In all, 25 patients (15 male, 10 female; mean age 39 years) were included in this Institutional Review Board-approved, retrospective study. Magnetic resonance imaging (MRI) was performed on a 1.5T system after standardized preparation acquiring motility sequences in free breathing over 69-84 seconds. Small bowel motility was analyzed manually and with the software. Functional parameters, measurement time, and reproducibility were compared using the coefficient of variation and paired Student's t-test. Correlation was analyzed using Pearson's correlation coefficient and linear regression. The 25 segments were analyzed twice both by hand and using the software with automatic breathing correction. All assessed parameters significantly correlated between the methods (P < 0.01), but the scattering of repeated measurements was significantly (P < 0.01) lower using the software (3.90%, standard deviation [SD] ± 5.69) than manual examinations (9.77%, SD ± 11.08). The time needed was significantly less (P < 0.001) with the software (4.52 minutes, SD ± 1.58) than with manual measurement (17.48 minutes, SD ± 1.75). The software provides reliable and faster small bowel motility measurements in free-breathing MRI compared to manual analyses. The new technique allows for analyses of prolonged sequences acquired in free breathing, improving the informative value of the examinations by amplifying the evaluable data. Copyright © 2013 Wiley Periodicals, Inc.
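
    The two comparison tools named in this abstract can be sketched in a few lines of standard-library Python. This is an illustration only: the function names and the timing values are hypothetical, and the study's own computations are not described beyond the abstract. The coefficient of variation is taken here as the sample standard deviation expressed as a percentage of the mean.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical measurement times (minutes) for the same four segments.
manual_minutes = [17.0, 18.5, 16.0, 19.0]
software_minutes = [4.0, 5.5, 4.0, 5.0]

t_value, dof = paired_t_statistic(manual_minutes, software_minutes)
print(coefficient_of_variation(manual_minutes), t_value, dof)
```

    The t statistic would still have to be compared with a Student's t distribution on the returned degrees of freedom to obtain a P value; that step is left out to keep the sketch free of third-party dependencies.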

  12. Integrating remote sensing with species distribution models; Mapping tamarisk invasions using the Software for Assisted Habitat Modeling (SAHM)

    USGS Publications Warehouse

    West, Amanda M.; Evangelista, Paul H.; Jarnevich, Catherine S.; Young, Nicholas E.; Stohlgren, Thomas J.; Talbert, Colin; Talbert, Marian; Morisette, Jeffrey; Anderson, Ryan

    2016-01-01

    Early detection of invasive plant species is vital for the management of natural resources and protection of ecosystem processes. The use of satellite remote sensing for mapping the distribution of invasive plants is becoming more common, however conventional imaging software and classification methods have been shown to be unreliable. In this study, we test and evaluate the use of five species distribution model techniques fit with satellite remote sensing data to map invasive tamarisk (Tamarix spp.) along the Arkansas River in Southeastern Colorado. The models tested included boosted regression trees (BRT), Random Forest (RF), multivariate adaptive regression splines (MARS), generalized linear model (GLM), and Maxent. These analyses were conducted using a newly developed software package called the Software for Assisted Habitat Modeling (SAHM). All models were trained with 499 presence points, 10,000 pseudo-absence points, and predictor variables acquired from the Landsat 5 Thematic Mapper (TM) sensor over an eight-month period to distinguish tamarisk from native riparian vegetation using detection of phenological differences. From the Landsat scenes, we used individual bands and calculated Normalized Difference Vegetation Index (NDVI), Soil-Adjusted Vegetation Index (SAVI), and tasseled capped transformations. All five models identified current tamarisk distribution on the landscape successfully based on threshold independent and threshold dependent evaluation metrics with independent location data. To account for model specific differences, we produced an ensemble of all five models with map output highlighting areas of agreement and areas of uncertainty. Our results demonstrate the usefulness of species distribution models in analyzing remotely sensed data and the utility of ensemble mapping, and showcase the capability of SAHM in pre-processing and executing multiple complex models.

  13. Integrating Remote Sensing with Species Distribution Models; Mapping Tamarisk Invasions Using the Software for Assisted Habitat Modeling (SAHM).

    PubMed

    West, Amanda M; Evangelista, Paul H; Jarnevich, Catherine S; Young, Nicholas E; Stohlgren, Thomas J; Talbert, Colin; Talbert, Marian; Morisette, Jeffrey; Anderson, Ryan

    2016-10-11

    Early detection of invasive plant species is vital for the management of natural resources and protection of ecosystem processes. The use of satellite remote sensing for mapping the distribution of invasive plants is becoming more common, however conventional imaging software and classification methods have been shown to be unreliable. In this study, we test and evaluate the use of five species distribution model techniques fit with satellite remote sensing data to map invasive tamarisk (Tamarix spp.) along the Arkansas River in Southeastern Colorado. The models tested included boosted regression trees (BRT), Random Forest (RF), multivariate adaptive regression splines (MARS), generalized linear model (GLM), and Maxent. These analyses were conducted using a newly developed software package called the Software for Assisted Habitat Modeling (SAHM). All models were trained with 499 presence points, 10,000 pseudo-absence points, and predictor variables acquired from the Landsat 5 Thematic Mapper (TM) sensor over an eight-month period to distinguish tamarisk from native riparian vegetation using detection of phenological differences. From the Landsat scenes, we used individual bands and calculated Normalized Difference Vegetation Index (NDVI), Soil-Adjusted Vegetation Index (SAVI), and tasseled capped transformations. All five models identified current tamarisk distribution on the landscape successfully based on threshold independent and threshold dependent evaluation metrics with independent location data. To account for model specific differences, we produced an ensemble of all five models with map output highlighting areas of agreement and areas of uncertainty. Our results demonstrate the usefulness of species distribution models in analyzing remotely sensed data and the utility of ensemble mapping, and showcase the capability of SAHM in pre-processing and executing multiple complex models.

  14. Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

    PubMed Central

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-01-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882

  15. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    PubMed

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  16. Analyzing industrial energy use through ordinary least squares regression models

    NASA Astrophysics Data System (ADS)

    Golden, Allyson Katherine

    Extensive research has been performed using regression analysis and calibrated simulations to create baseline energy consumption models for residential buildings and commercial institutions. However, few attempts have been made to discuss the applicability of these methodologies to establish baseline energy consumption models for industrial manufacturing facilities. In the few studies of industrial facilities, the presented linear change-point and degree-day regression analyses illustrate ideal cases. It follows that there is a need in the established literature to discuss the methodologies and to determine their applicability for establishing baseline energy consumption models of industrial manufacturing facilities. The thesis determines the effectiveness of simple inverse linear statistical regression models when establishing baseline energy consumption models for industrial manufacturing facilities. Ordinary least squares change-point and degree-day regression methods are used to create baseline energy consumption models for nine different case studies of industrial manufacturing facilities located in the southeastern United States. The influence of ambient dry-bulb temperature and production on total facility energy consumption is observed. The energy consumption behavior of industrial manufacturing facilities is only sometimes sufficiently explained by temperature, production, or a combination of the two variables. This thesis also provides methods for generating baseline energy models that are straightforward and accessible to anyone in the industrial manufacturing community. The methods outlined in this thesis may be easily replicated by anyone that possesses basic spreadsheet software and general knowledge of the relationship between energy consumption and weather, production, or other influential variables. 
    With the help of simple inverse linear regression models, industrial manufacturing facilities may better understand their energy consumption and production behavior, and identify opportunities for energy and cost savings. This thesis study also utilizes change-point and degree-day baseline energy models to disaggregate facility annual energy consumption into separate industrial end-user categories. The baseline energy model provides a suitable and economical alternative to sub-metering individual manufacturing equipment. One case study describes the conjoined use of baseline energy models and facility information gathered during a one-day onsite visit to perform an end-point energy analysis of an injection molding facility conducted by the Alabama Industrial Assessment Center. Applying baseline regression model results to the end-point energy analysis allowed the AIAC to better approximate the annual energy consumption of the facility's HVAC system.

  17. Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.

    PubMed

    Chatzis, Sotirios P; Andreou, Andreas S

    2015-11-01

    Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows better handling of uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using publicly available benchmark data sets.

  18. A comparison of model-based imputation methods for handling missing predictor values in a linear regression model: A simulation study

    NASA Astrophysics Data System (ADS)

    Hasan, Haliza; Ahmad, Sanizah; Osman, Balkish Mohd; Sapri, Shamsiah; Othman, Nadirah

    2017-08-01

    In regression analysis, missing covariate data has been a common problem. Many researchers use ad hoc methods to overcome this problem due to the ease of implementation. However, these methods require assumptions about the data that rarely hold in practice. Model-based methods such as Maximum Likelihood (ML) using the expectation maximization (EM) algorithm and Multiple Imputation (MI) are more promising when dealing with difficulties caused by missing data. Then again, inappropriate methods of missing value imputation can lead to serious bias that severely affects the parameter estimates. The main objective of this study is to provide a better understanding regarding missing data concept that can assist the researcher to select the appropriate missing data imputation methods. A simulation study was performed to assess the effects of different missing data techniques on the performance of a regression model. The covariate data were generated using an underlying multivariate normal distribution and the dependent variable was generated as a combination of explanatory variables. Missing values in covariate were simulated using a mechanism called missing at random (MAR). Four levels of missingness (10%, 20%, 30% and 40%) were imposed. ML and MI techniques available within SAS software were investigated. A linear regression analysis was fitted and the model performance measures; MSE, and R-Squared were obtained. Results of the analysis showed that MI is superior in handling missing data with highest R-Squared and lowest MSE when percent of missingness is less than 30%. Both methods are unable to handle larger than 30% level of missingness.

  19. Mapping Soil pH Buffering Capacity of Selected Fields

    NASA Technical Reports Server (NTRS)

    Weaver, A. R.; Kissel, D. E.; Chen, F.; West, L. T.; Adkins, W.; Rickman, D.; Luvall, J. C.

    2003-01-01

    Soil pH buffering capacity, since it varies spatially within crop production fields, may be used to define sampling zones to assess lime requirement, or for modeling changes in soil pH when acid forming fertilizers or manures are added to a field. Our objective was to develop a procedure to map this soil property. One hundred thirty-six soil samples (0 to 15 cm depth) from three Georgia Coastal Plain fields were titrated with calcium hydroxide to characterize differences in pH buffering capacity of the soils. Since the relationship between soil pH and added calcium hydroxide was approximately linear for all samples up to pH 6.5, the slope values of these linear relationships for all soils were regressed on the organic C and clay contents of the 136 soil samples using multiple linear regression. The equation that fit the data best was b (slope of pH vs. lime added) = 0.00029 - 0.00003 * % clay + 0.00135 * % organic C, r^2 = 0.68. This equation was applied within geographic information system (GIS) software to create maps of soil pH buffering capacity for the three fields. When the mapped values of the pH buffering capacity were compared with measured values for a total of 18 locations in the three fields, there was good general agreement. A regression of directly measured pH buffering capacities on mapped pH buffering capacities at the field locations for these samples gave an r^2 of 0.88 with a slope of 1.04 for a group of soils that varied approximately tenfold in their pH buffering capacities.
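
    The fitted equation quoted in this abstract can be applied cell by cell to gridded soil data, which is roughly what a GIS raster calculation does. The sketch below is only illustrative: the function name and the two small grids are hypothetical, and the second predictor is taken to be percent organic carbon, as the abstract's description of the regression suggests. The abstract does not state the units of b, so none are assumed.

```python
def ph_slope(clay_pct, organic_c_pct):
    """Slope of pH versus lime added, from the regression in the abstract."""
    return 0.00029 - 0.00003 * clay_pct + 0.00135 * organic_c_pct


# Hypothetical 2 x 2 rasters of percent clay and percent organic carbon.
clay_grid = [[5.0, 10.0],
             [15.0, 20.0]]
organic_c_grid = [[0.5, 1.0],
                  [1.5, 2.0]]

# Evaluate the equation for every cell to build the mapped surface.
slope_map = [
    [ph_slope(clay, carbon) for clay, carbon in zip(clay_row, carbon_row)]
    for clay_row, carbon_row in zip(clay_grid, organic_c_grid)
]

for row in slope_map:
    print(row)
```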

  20. The Seismic Tool-Kit (STK): an open source software for seismology and signal processing.

    NASA Astrophysics Data System (ADS)

    Reymond, Dominique

    2016-04-01

    We present an open source software project (GNU public license), named STK: Seismic ToolKit, that is dedicated mainly to seismology and signal processing. The STK project, which started in 2007, is hosted by SourceForge.net and counts more than 19,500 downloads at the date of writing. The STK project is composed of two main branches. First, a graphical interface dedicated to signal processing (in the SAC format, SAC_ASCII and SAC_BIN), where the signal can be plotted, zoomed, filtered, integrated, differentiated, etc. (a large variety of IIR and FIR filters is proposed). The estimation of the spectral density of the signal is performed via the Fourier transform, with visualization of the Power Spectral Density (PSD) in linear or log scale, and also the evolutive time-frequency representation (or sonagram). Three-component signals can also be processed to estimate their polarization properties, either for a given window or for sliding windows over time. This polarization analysis is useful for extracting polarized noise and for differentiating P waves, Rayleigh waves, Love waves, etc. Secondly, a panel of utility programs is proposed for working in terminal mode, with basic programs for computing azimuth and distance in spherical geometry, inter/auto-correlation, spectral density, time-frequency for an entire directory of signals, focal planes and main component axes, radiation pattern of P waves, polarization analysis of different waves (including noise), under/over-sampling of the signals, cubic-spline smoothing, and linear/non-linear regression analysis of data sets. A MINimum library of Linear AlGebra (MIN-LINAG) is also provided for computing the main matrix processes, such as QR/QL decomposition, Cholesky solution of linear systems, finding eigenvalues/eigenvectors, and QR-solve/Eigen-solve of linear equation systems. STK is developed in C/C++, mainly under Linux OS, and it has also been partially implemented under MS-Windows.
    Useful links: http://sourceforge.net/projects/seismic-toolkit/ http://sourceforge.net/p/seismic-toolkit/wiki/browse_pages/

  1. Pseudo-second order models for the adsorption of safranin onto activated carbon: comparison of linear and non-linear regression methods.

    PubMed

    Kumar, K Vasanth

    2007-04-02

    Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to pseudo-second order model of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of experimental data in Ho's pseudo-second order expression by linear and non-linear regression method showed that Ho pseudo-second order model was a better kinetic expression when compared to other pseudo-second order kinetic expressions.
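
    To make the linear versus non-linear comparison concrete, the sketch below fits Ho's pseudo-second-order expression, q(t) = k*qe^2*t / (1 + k*qe*t), in two ways: by ordinary least squares on the linearized form t/q = 1/(k*qe^2) + t/qe, and by a damped Gauss-Newton search on the untransformed data. The equations are the commonly cited form of Ho's model rather than anything quoted in the abstract, the Gauss-Newton routine is a stand-in for whatever non-linear solver the author used, and the data are synthetic.

```python
def ho_model(t, qe, k):
    """Ho's pseudo-second-order uptake at time t."""
    return k * qe * qe * t / (1.0 + k * qe * t)


def sse(t, q, qe, k):
    """Sum of squared errors on the original (untransformed) scale."""
    return sum((qi - ho_model(ti, qe, k)) ** 2 for ti, qi in zip(t, q))


def fit_linearized(t, q):
    """Regress t/q on t: slope = 1/qe, intercept = 1/(k*qe^2)."""
    y = [ti / qi for ti, qi in zip(t, q)]
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    slope = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
             / sum((ti - mean_t) ** 2 for ti in t))
    intercept = mean_y - slope * mean_t
    return 1.0 / slope, slope * slope / intercept  # (qe, k)


def fit_nonlinear(t, q, qe, k, max_iter=100):
    """Damped Gauss-Newton; only accepts steps that lower the SSE."""
    best = sse(t, q, qe, k)
    for _ in range(max_iter):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for ti, qi in zip(t, q):
            d = 1.0 + k * qe * ti
            resid = qi - ho_model(ti, qe, k)
            j_qe = k * qe * ti * (2.0 + k * qe * ti) / (d * d)  # dq/dqe
            j_k = qe * qe * ti / (d * d)                        # dq/dk
            a11 += j_qe * j_qe
            a12 += j_qe * j_k
            a22 += j_k * j_k
            b1 += j_qe * resid
            b2 += j_k * resid
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        d_qe = (b1 * a22 - b2 * a12) / det
        d_k = (a11 * b2 - a12 * b1) / det
        step, improved = 1.0, False
        while step > 1e-10:  # halve the step until the SSE decreases
            new_qe, new_k = qe + step * d_qe, k + step * d_k
            if new_qe > 0 and new_k > 0:
                trial = sse(t, q, new_qe, new_k)
                if trial < best:
                    qe, k, best, improved = new_qe, new_k, trial, True
                    break
            step /= 2.0
        if not improved:
            break
    return qe, k


# Synthetic data: true qe = 50, k = 0.002, with small multiplicative errors.
time_min = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
noise = [1.03, 0.98, 1.02, 0.99, 1.01, 1.00]
uptake = [ho_model(ti, 50.0, 0.002) * f for ti, f in zip(time_min, noise)]

qe_lin, k_lin = fit_linearized(time_min, uptake)
qe_nl, k_nl = fit_nonlinear(time_min, uptake, qe_lin, k_lin)
print(sse(time_min, uptake, qe_lin, k_lin), sse(time_min, uptake, qe_nl, k_nl))
```

    Because the transformation t/q reweights the errors, the linearized estimates generally do not minimize the error on the original scale; the non-linear fit starts from them and can only lower that error.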

  2. Software LS-MIDA for efficient mass isotopomer distribution analysis in metabolic modelling.

    PubMed

    Ahmed, Zeeshan; Zeeshan, Saman; Huber, Claudia; Hensel, Michael; Schomburg, Dietmar; Münch, Richard; Eisenreich, Wolfgang; Dandekar, Thomas

    2013-07-09

    The knowledge of metabolic pathways and fluxes is important to understand the adaptation of organisms to their biotic and abiotic environment. The specific distribution of stable isotope labelled precursors into metabolic products can be taken as fingerprints of the metabolic events and dynamics through the metabolic networks. Open-source software is required that easily and rapidly calculates global isotope excess and isotopomer distribution from mass spectra of labelled metabolites, derivatives and their fragments. The open-source software "Least Square Mass Isotopomer Analyzer" (LS-MIDA) is presented that processes experimental mass spectrometry (MS) data on the basis of metabolite information such as the number of atoms in the compound, mass to charge ratio (m/e or m/z) values of the compounds and fragments under study, and the experimental relative MS intensities reflecting the enrichments of isotopomers in 13C- or 15N-labelled compounds, in comparison to the natural abundances in the unlabelled molecules. The software uses Brauman's least square method of linear regression. As a result, global isotope enrichments of the metabolite or fragment under study and the molar abundances of each isotopomer are obtained and displayed. The new software provides an open-source platform that easily and rapidly converts experimental MS patterns of labelled metabolites into isotopomer enrichments that are the basis for subsequent observation-driven analysis of pathways and fluxes, as well as for model-driven metabolic flux calculations.

  3. Robust Bayesian linear regression with application to an analysis of the CODATA values for the Planck constant

    NASA Astrophysics Data System (ADS)

    Wübbeler, Gerd; Bodnar, Olha; Elster, Clemens

    2018-02-01

    Weighted least-squares estimation is commonly applied in metrology to fit models to measurements that are accompanied with quoted uncertainties. The weights are chosen in dependence on the quoted uncertainties. However, when data and model are inconsistent in view of the quoted uncertainties, this procedure does not yield adequate results. When it can be assumed that all uncertainties ought to be rescaled by a common factor, weighted least-squares estimation may still be used, provided that a simple correction of the uncertainty obtained for the estimated model is applied. We show that these uncertainties and credible intervals are robust, as they do not rely on the assumption of a Gaussian distribution of the data. Hence, common software for weighted least-squares estimation may still safely be employed in such a case, followed by a simple modification of the uncertainties obtained by that software. We also provide means of checking the assumptions of such an approach. The Bayesian regression procedure is applied to analyze the CODATA values for the Planck constant published over the past decades in terms of three different models: a constant model, a straight line model and a spline model. Our results indicate that the CODATA values may not have yet stabilized.
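
    For the simplest of the three models mentioned (a constant), weighted least squares reduces to a weighted mean, and a common-factor rescaling of the quoted uncertainties can be sketched as follows. The rescaling used here is the classical Birge-ratio style factor sqrt(chi2 / (n - 1)); that choice is an assumption made for illustration, since the abstract does not state the exact correction derived in the paper. The function name and input values are hypothetical.

```python
import math


def weighted_constant_fit(values, uncertainties):
    """Weighted mean with a common-factor rescaling of its uncertainty.

    Returns (estimate, u_wls, scale, u_rescaled).
    """
    weights = [1.0 / (u * u) for u in uncertainties]
    total_weight = sum(weights)
    estimate = sum(w * v for w, v in zip(weights, values)) / total_weight

    # Uncertainty reported by plain weighted least squares.
    u_wls = 1.0 / math.sqrt(total_weight)

    # Consistency check: chi-square of the data about the fitted constant.
    chi2 = sum(w * (v - estimate) ** 2 for w, v in zip(weights, values))
    dof = len(values) - 1

    # Assumed correction: multiply by sqrt(chi2 / dof) (Birge-ratio style).
    scale = math.sqrt(chi2 / dof)
    return estimate, u_wls, scale, scale * u_wls


# Hypothetical measurements with quoted standard uncertainties.
measured = [10.0, 12.0, 11.5, 9.0]
quoted_u = [0.5, 0.5, 1.0, 1.0]
print(weighted_constant_fit(measured, quoted_u))
```

    A scale factor well above 1 signals that the data scatter more than the quoted uncertainties allow, which is the inconsistent situation the abstract describes.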

  4. Analysis of relativistic nucleus-nucleus interactions in emulsion chambers

    NASA Technical Reports Server (NTRS)

    Mcguire, Stephen C.

    1987-01-01

    The development of a computer-assisted method is reported for the determination of the angular distribution data for secondary particles produced in relativistic nucleus-nucleus collisions in emulsions. The method is applied to emulsion detectors that were placed in a constant, uniform magnetic field and exposed to beams of 60 and 200 GeV/nucleon O-16 ions at the Super Proton Synchrotron (SPS) of the European Center for Nuclear Research (CERN). Linear regression analysis is used to determine the azimuthal and polar emission angles from measured track coordinate data. The software, written in BASIC, is designed to be machine independent, and adaptable to an automated system for acquiring the track coordinates. The fitting algorithm is deterministic, and takes into account the experimental uncertainty in the measured points. Further, a procedure for using the track data to estimate the linear momenta of the charged particles observed in the detectors is included.

  5. QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps.

    PubMed

    García-Jacas, César R; Marrero-Ponce, Yovani; Acevedo-Martínez, Liesner; Barigye, Stephen J; Valdés-Martiní, José R; Contreras-Torres, Ernesto

    2014-07-05

    The present report introduces the QuBiLS-MIDAS software belonging to the ToMoCoMD-CARDD suite for the calculation of three-dimensional molecular descriptors (MDs) based on the two-linear (bilinear), three-linear, and four-linear (multilinear or N-linear) algebraic forms. Thus, it is unique software that computes these tensor-based indices. These descriptors establish relations for two, three, and four atoms by using several (dis-)similarity metrics or multimetrics, matrix transformations, cutoffs, local calculations and aggregation operators. The theoretical background of these N-linear indices is also presented. The QuBiLS-MIDAS software was developed in the Java programming language and employs the Chemistry Development Kit library for the manipulation of the chemical structures and the calculation of the atomic properties. This software is composed of a user-friendly desktop interface and an Abstract Programming Interface library. The former was created to simplify the configuration of the different options of the MDs, whereas the library was designed to allow its easy integration into other software for chemoinformatics applications. This program provides functionalities for data cleaning tasks and for batch processing of the molecular indices. In addition, it offers parallel calculation of the MDs through the use of all available processors in current computers. The studies of complexity of the main algorithms demonstrate that these were efficiently implemented with respect to their trivial implementation. Lastly, the performance tests reveal that this software has a suitable behavior when the number of processors is increased. Therefore, the QuBiLS-MIDAS software constitutes a useful application for the computation of the molecular indices based on N-linear algebraic maps and it can be used freely to perform chemoinformatics studies. Copyright © 2014 Wiley Periodicals, Inc.

  6. A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

    NASA Astrophysics Data System (ADS)

    Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

    2018-04-01

    In this paper, we propose a hybrid model which is a combination of multiple linear regression model and fuzzy c-means method. This research involved a relationship between 20 variates of the top soil that are analyzed prior to planting of paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granary in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using multiple linear regression model and a combination of multiple linear regression model and fuzzy c-means method. Analysis of normality and multicollinearity indicates that the data are normally distributed without multicollinearity among independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of multiple linear regression model and fuzzy c-means method outperforms the multiple linear regression model, with a lower mean square error.

  7. Adapting iterative algorithms for solving large sparse linear systems for efficient use on the CDC CYBER 205

    NASA Technical Reports Server (NTRS)

    Kincaid, D. R.; Young, D. M.

    1984-01-01

    Adapting and designing mathematical software to achieve optimum performance on the CYBER 205 is discussed. Comments and observations are made in light of recent work done on modifying the ITPACK software package and on writing new software for vector supercomputers. The goal was to develop very efficient vector algorithms and software for solving large sparse linear systems using iterative methods.

  8. Linear regression crash prediction models : issues and proposed solutions.

    DOT National Transportation Integrated Search

    2010-05-01

    The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novice data aggregation to satisfy linear regression assumptions; namely error structure normality ...

  9. Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment

    ERIC Educational Resources Information Center

    Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos

    2013-01-01

    In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…

  10. The association of longitudinal trend of fasting plasma glucose with retinal microvasculature in people without established diabetes.

    PubMed

    Hu, Yin; Niu, Yong; Wang, Dandan; Wang, Ying; Holden, Brien A; He, Mingguang

    2015-01-22

    Structural changes of retinal vasculature, such as altered retinal vascular calibers, are considered as early signs of systemic vascular damage. We examined the associations of 5-year mean level, longitudinal trend, and fluctuation in fasting plasma glucose (FPG) with retinal vascular caliber in people without established diabetes. A prospective study was conducted in a cohort of Chinese people age ≥40 years in Guangzhou, southern China. The FPG was measured at baseline in 2008 and annually until 2012. In 2012, retinal vascular caliber was assessed using standard fundus photographs and validated software. A total of 3645 baseline nondiabetic participants with baseline and follow-up data on FPG for 3 or more visits was included for statistical analysis. The associations of retinal vascular caliber with 5-year mean FPG level, longitudinal FPG trend (slope of linear regression-FPG), and fluctuation (standard deviation and root mean square error of FPG) were analyzed using multivariable linear regression analyses. Multivariate regression models adjusted for baseline FPG and other potential confounders showed that a 10% annual increase in FPG was associated independently with a 2.65-μm narrowing in retinal arterioles (P = 0.008) and a 3.47-μm widening in venules (P = 0.004). Associations with mean FPG level and fluctuation were not statistically significant. Annual rising trend in FPG, but not its mean level or fluctuation, is associated with altered retinal vasculature in nondiabetic people. Copyright 2015 The Association for Research in Vision and Ophthalmology, Inc.
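
    The per-participant summaries named in this abstract (mean level, regression slope as trend, and standard deviation and root mean square error as fluctuation) can be sketched as follows. This is only an illustration: the function name `fpg_summary`, the sample readings, and the choice to divide the squared residuals by n are assumptions, not details taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def fpg_summary(years, fpg):
    """Summarise one participant's FPG series (hypothetical helper).

    Returns the mean level, the ordinary least-squares slope of FPG on
    year (the "trend"), the sample standard deviation, and the RMSE of
    the residuals about the fitted line.
    """
    x_bar, y_bar = mean(years), mean(fpg)
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, fpg))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(years, fpg)]
    # Assumption: RMSE divides by n; the study may have used n - 2.
    rmse = sqrt(sum(r * r for r in residuals) / len(residuals))
    return {"mean": y_bar, "slope": slope, "sd": stdev(fpg), "rmse": rmse}


# Made-up annual readings in mmol/L, not data from the study.
summary = fpg_summary([2008, 2009, 2010, 2011, 2012], [5.0, 5.1, 5.2, 5.3, 5.4])
```

    With these made-up readings the series rises by 0.1 mmol/L per year and lies on a straight line, so the slope is about 0.1 and the RMSE is essentially zero.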

  11. New machine-learning algorithms for prediction of Parkinson's disease

    NASA Astrophysics Data System (ADS)

    Mandal, Indrajit; Sairam, N.

    2014-03-01

    This article presents an enhanced prediction accuracy of diagnosis of Parkinson's disease (PD) to prevent the delay and misdiagnosis of patients using the proposed robust inference system. New machine-learning methods are proposed and performance comparisons are based on specificity, sensitivity, accuracy and other measurable parameters. The robust methods of treating Parkinson's disease (PD) include sparse multinomial logistic regression, rotation forest ensemble with support vector machines and principal components analysis, artificial neural networks, and boosting methods. A new ensemble method comprising the Bayesian network optimised by Tabu search algorithm as classifier and Haar wavelets as projection filter is used for relevant feature selection and ranking. The highest accuracy obtained by linear logistic regression and sparse multinomial logistic regression is 100%, with sensitivity and specificity of 0.983 and 0.996, respectively. All the experiments are conducted at 95% and 99% confidence levels and the results are established with corrected t-tests. This work shows a high degree of advancement in software reliability and quality of the computer-aided diagnosis system and experimentally shows best results with supportive statistical inference.

  12. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  13. Transmission of linear regression patterns between time series: From relationship in time series to complex networks

    NASA Astrophysics Data System (ADS)

    Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

    2014-07-01

    The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
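
    A much-simplified sketch of the scheme described above: fit a regression in each sliding window, turn the fitted slope into a discrete pattern, and count transitions between consecutive patterns as weighted directed edges. The function names, the binning of slopes by a fixed `bin_width`, and the sample series are assumptions for illustration; the published algorithm also uses the significance-test result in defining a pattern, which is omitted here.

```python
import math
from collections import Counter


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (x must not be constant)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxy / sxx


def pattern_network(x, y, window, bin_width):
    """Return the pattern sequence and the weighted directed edges.

    A pattern is the index of the slope interval
    [i * bin_width, (i + 1) * bin_width) that the window's slope falls in.
    An edge (p, q) with weight w means pattern p was followed by q w times.
    """
    patterns = []
    for start in range(len(x) - window + 1):
        slope = ols_slope(x[start:start + window], y[start:start + window])
        patterns.append(math.floor(slope / bin_width))
    edges = Counter(zip(patterns, patterns[1:]))
    return patterns, edges


# Two short made-up series; not data from the paper.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 5, 4, 3]
patterns, edges = pattern_network(x, y, window=3, bin_width=1.0)
```

    Here the four windows have slopes 2, 0.5, -1 and -1, so the pattern sequence is `[2, 0, -1, -1]` and each of the edges `(2, 0)`, `(0, -1)` and `(-1, -1)` has weight 1.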

  14. Transmission of linear regression patterns between time series: from relationship in time series to complex networks.

    PubMed

    Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

    2014-07-01

    The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.

  15. SU-F-T-130: [18F]-FDG Uptake Dose Response in Lung Correlates Linearly with Proton Therapy Dose

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, D; Titt, U; Mirkovic, D

    2016-06-15

    Purpose: Analysis of clinical outcomes in lung cancer patients treated with protons using 18F-FDG uptake in lung as a measure of dose response. Methods: A test case lung cancer patient was selected in an unbiased way. The test patient’s treatment planning and post treatment positron emission tomography (PET) were collected from picture archiving and communication system at the UT M.D. Anderson Cancer Center. Average computerized tomography scan was registered with post PET/CT through both rigid and deformable registrations for selected region of interest (ROI) via VelocityAI imaging informatics software. For the voxels in the ROI, a system that extracts the Standard Uptake Value (SUV) from PET was developed, and the corresponding relative biological effectiveness (RBE) weighted (both variable and constant) dose was computed using the Monte Carlo (MC) methods. The treatment planning system (TPS) dose was also obtained. Using histogram analysis, the voxel average normalized SUV vs. 3 different doses was obtained and linear regression fit was performed. Results: From the registration process, there were some regions that showed significant artifacts near the diaphragm and heart region, which yielded poor r-squared values when the linear regression fit was performed on normalized SUV vs. dose. Excluding these values, TPS fit yielded mean r-squared value of 0.79 (range 0.61–0.95), constant RBE fit yielded 0.79 (range 0.52–0.94), and variable RBE fit yielded 0.80 (range 0.52–0.94). Conclusion: A system that extracts SUV from PET to correlate between normalized SUV and various dose calculations was developed. A linear relation between normalized SUV and all three different doses was found.

  16. Computing Linear Mathematical Models Of Aircraft

    NASA Technical Reports Server (NTRS)

    Duke, Eugene L.; Antoniewicz, Robert F.; Krambeer, Keith D.

    1991-01-01

    Derivation and Definition of Linear Aircraft Model (LINEAR) computer program provides user with powerful, and flexible, standard, documented, and verified software tool for linearization of mathematical models of aerodynamics of aircraft. Intended for use in software tool to drive linear analysis of stability and design of control laws for aircraft. Capable of both extracting such linearized engine effects as net thrust, torque, and gyroscopic effects, and including these effects in linear model of system. Designed to provide easy selection of state, control, and observation variables used in particular model. Also provides flexibility of allowing alternate formulations of both state and observation equations. Written in FORTRAN.

  17. Prediction system of hydroponic plant growth and development using algorithm Fuzzy Mamdani method

    NASA Astrophysics Data System (ADS)

    Sudana, I. Made; Purnawirawan, Okta; Arief, Ulfa Mediaty

    2017-03-01

    Hydroponics is a method of farming without soil. One of the Hydroponic plants is Watercress (Nasturtium Officinale). The development and growth process of hydroponic Watercress was influenced by levels of nutrients, acidity and temperature. The independent variables can be used as input variable system to predict the value level of plants growth and development. The prediction system is using Fuzzy Algorithm Mamdani method. This system was built to implement the function of Fuzzy Inference System (Fuzzy Inference System/FIS) as a part of the Fuzzy Logic Toolbox (FLT) by using MATLAB R2007b. FIS is a computing system that works on the principle of fuzzy reasoning which is similar to humans' reasoning. Basically FIS consists of four units which are fuzzification unit, fuzzy logic reasoning unit, base knowledge unit and defuzzification unit. In addition to know the effect of independent variables on the plants growth and development that can be visualized with the function diagram of FIS output surface that is shaped three-dimensional, and statistical tests based on the data from the prediction system using multiple linear regression method, which includes multiple linear regression analysis, T test, F test, the coefficient of determination and donations predictor that are calculated using SPSS (Statistical Product and Service Solutions) software applications.

  18. Association between the Type of Workplace and Lung Function in Copper Miners

    PubMed Central

    Gruszczyński, Leszek; Wojakowska, Anna; Ścieszka, Marek; Turczyn, Barbara; Schmidt, Edward

    2016-01-01

    The aim of the analysis was to retrospectively assess changes in lung function in copper miners depending on the type of workplace. In the groups of 225 operators, 188 welders, and 475 representatives of other jobs, spirometry was performed at the start of employment and subsequently after 10, 20, and 25 years of work. Spirometry Longitudinal Data Analysis software was used to estimate changes in group means for FEV1 and FVC. Multiple linear regression analysis was used to assess an association between workplace and lung function. Lung function assessed on the basis of calculation of longitudinal FEV1 (FVC) decline was similar in all studied groups. However, multiple linear regression model used in cross-sectional analysis revealed an association between workplace and lung function. In the group of welders, FEF75 was lower in comparison to operators and other miners as early as after 10 years of work. Simultaneously, in smoking welders, the FEV1/FVC ratio was lower than in nonsmokers (p < 0.05). The interactions between type of workplace and smoking (p < 0.05) in their effect on FVC, FEV1, PEF, and FEF50 were shown. Among underground working copper miners, the group of smoking welders is especially threatened by impairment of lung ventilatory function. PMID:27274987

  19. Decisional balance and self-efficacy of physical activity among the elderly in Rasht in 2013 based on the transtheoretical model

    PubMed Central

    Abbaspour, Seddigheh; Farmanbar, Rabiollah; Njafi, Fateme; Ghiasvand, Arezoo Mohamadkhani; Dehghankar, Leila

    2017-01-01

    Background: Regular physical activity is considered health-promoting, and identifying the psycho-social variables that affect physical activity has proven to be essential. Objective: To identify the relationship between decisional balance and self-efficacy in physical activities using the transtheoretical model in the members of a retirement center in Rasht, Guillen. Methods: A descriptive cross-sectional study was conducted in 2013 using convenience sampling on 262 elderly people who were members of retirement centers in Rasht. Data were collected using the Stages of change, Decisional balance, and Self-efficacy scales and the Physical Activity Scale for the Elderly (PASE). Data were analyzed using SPSS-16 software with descriptive and analytic statistics (Pearson correlation, Spearman, ANOVA, HSD Tukey, linear and ordinal regression). Results: The majority of participants were in the maintenance stage. The mean and standard deviation of physical activity for the elderly was 119.35±51.50. Stages of change and physical activities were significantly associated with decisional balance and self-efficacy (p<0.0001); however, cons had a significant inverse association. According to linear and ordinal regression, the only predictor variable of physical activity behavior was self-efficacy. Conclusion: Increasing the perceived pros and self-efficacy of physical activity can be of benefit when designing appropriate intervention programs. PMID:28713520

  20. Predicting Retention Times of Naturally Occurring Phenolic Compounds in Reversed-Phase Liquid Chromatography: A Quantitative Structure-Retention Relationship (QSRR) Approach

    PubMed Central

    Akbar, Jamshed; Iqbal, Shahid; Batool, Fozia; Karim, Abdul; Chan, Kim Wei

    2012-01-01

    Quantitative structure-retention relationships (QSRRs) have successfully been developed for naturally occurring phenolic compounds in a reversed-phase liquid chromatographic (RPLC) system. A total of 1519 descriptors were calculated from the optimized structures of the molecules using MOPAC2009 and DRAGON software. The data set of 39 molecules was divided into training and external validation sets. For feature selection and mapping we used step-wise multiple linear regression (SMLR), unsupervised forward selection followed by step-wise multiple linear regression (UFS-SMLR) and artificial neural networks (ANN). Stable and robust models with significant predictive abilities in terms of validation statistics were obtained with negation of any chance correlation. ANN models were found to be better than the remaining two approaches. HNar, IDM, Mp, GATS2v, DISP and 3D-MoRSE (signals 22, 28 and 32) descriptors based on van der Waals volume, electronegativity, mass and polarizability, at atomic level, were found to have significant effects on the retention times. The possible implications of these descriptors in RPLC have been discussed. All the models are proven to be quite able to predict the retention times of phenolic compounds and have shown remarkable validation, robustness, stability and predictive performance. PMID:23203132

  1. NIPTmer: rapid k-mer-based software package for detection of fetal aneuploidies.

    PubMed

    Sauk, Martin; Žilina, Olga; Kurg, Ants; Ustav, Eva-Liina; Peters, Maire; Paluoja, Priit; Roost, Anne Mari; Teder, Hindrek; Palta, Priit; Brison, Nathalie; Vermeesch, Joris R; Krjutškov, Kaarel; Salumets, Andres; Kaplinski, Lauris

    2018-04-04

    Non-invasive prenatal testing (NIPT) is a recent and rapidly evolving method for detecting genetic lesions, such as aneuploidies, of a fetus. However, there is a need for faster and cheaper laboratory and analysis methods to make NIPT more widely accessible. We have developed a novel software package for detection of fetal aneuploidies from next-generation low-coverage whole genome sequencing data. Our tool - NIPTmer - is based on counting pre-defined per-chromosome sets of unique k-mers from raw sequencing data, and applying linear regression model on the counts. Additionally, the filtering process used for k-mer list creation allows one to take into account the genetic variance in a specific sample, thus reducing the source of uncertainty. The processing time of one sample is less than 10 CPU-minutes on a high-end workstation. NIPTmer was validated on a cohort of 583 NIPT samples and it correctly predicted 37 non-mosaic fetal aneuploidies. NIPTmer has the potential to reduce significantly the time and complexity of NIPT post-sequencing analysis compared to mapping-based methods. For non-commercial users the software package is freely available at http://bioinfo.ut.ee/NIPTMer/ .
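
    The counting step described above (tallying hits against pre-defined per-chromosome sets of unique k-mers in raw reads) can be sketched as below. This is not NIPTmer's code: the function name, the tiny k-mer sets and the reads are hypothetical, reverse complements are ignored, and the regression applied to the counts is left out because the abstract does not specify it.

```python
from collections import Counter


def count_chromosome_kmers(reads, kmer_sets, k):
    """Count, per chromosome, how many k-mers in the reads belong to that
    chromosome's pre-defined set of unique k-mers (hypothetical helper)."""
    # Each k-mer is assumed unique to one chromosome, so a flat lookup works.
    kmer_to_chrom = {
        kmer: chrom for chrom, kmers in kmer_sets.items() for kmer in kmers
    }
    counts = Counter({chrom: 0 for chrom in kmer_sets})
    for read in reads:
        for i in range(len(read) - k + 1):
            chrom = kmer_to_chrom.get(read[i:i + k])
            if chrom is not None:
                counts[chrom] += 1
    return counts


# Toy inputs for illustration only.
kmer_sets = {"chr21": {"ACGT"}, "chr1": {"TTTT", "GGCA"}}
reads = ["ACGTACGT", "TTTTT", "GGGG"]
counts = count_chromosome_kmers(reads, kmer_sets, k=4)
```

    In this toy input, `ACGT` occurs twice in the first read and `TTTT` twice in the second, so both chromosomes receive a count of 2.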

  2. Open-source Software for Demand Forecasting of Clinical Laboratory Test Volumes Using Time-series Analysis.

    PubMed

    Mohammed, Emad A; Naugler, Christopher

    2017-01-01

    Demand forecasting is the area of predictive analytics devoted to predicting future volumes of services or consumables. Fair understanding and estimation of how demand will vary facilitates the optimal utilization of resources. In a medical laboratory, accurate forecasting of future demand, that is, test volumes, can increase efficiency and facilitate long-term laboratory planning. Importantly, in an era of utilization management initiatives, accurately predicted volumes compared to the realized test volumes can form a precise way to evaluate utilization management initiatives. Laboratory test volumes are often highly amenable to forecasting by time-series models; however, the statistical software needed to do this is generally either expensive or highly technical. In this paper, we describe an open-source web-based software tool for time-series forecasting and explain how to use it as a demand forecasting tool in clinical laboratories to estimate test volumes. This tool has three different models, that is, Holt-Winters multiplicative, Holt-Winters additive, and simple linear regression. Moreover, these models are ranked and the best one is highlighted. This tool will allow anyone with historic test volume data to model future demand.

  3. Open-source Software for Demand Forecasting of Clinical Laboratory Test Volumes Using Time-series Analysis

    PubMed Central

    Mohammed, Emad A.; Naugler, Christopher

    2017-01-01

    Background: Demand forecasting is the area of predictive analytics devoted to predicting future volumes of services or consumables. Fair understanding and estimation of how demand will vary facilitates the optimal utilization of resources. In a medical laboratory, accurate forecasting of future demand, that is, test volumes, can increase efficiency and facilitate long-term laboratory planning. Importantly, in an era of utilization management initiatives, accurately predicted volumes compared to the realized test volumes can form a precise way to evaluate utilization management initiatives. Laboratory test volumes are often highly amenable to forecasting by time-series models; however, the statistical software needed to do this is generally either expensive or highly technical. Method: In this paper, we describe an open-source web-based software tool for time-series forecasting and explain how to use it as a demand forecasting tool in clinical laboratories to estimate test volumes. Results: This tool has three different models, that is, Holt-Winters multiplicative, Holt-Winters additive, and simple linear regression. Moreover, these models are ranked and the best one is highlighted. Conclusion: This tool will allow anyone with historic test volume data to model future demand. PMID:28400996

  4. Least median of squares and iteratively re-weighted least squares as robust linear regression methods for fluorimetric determination of α-lipoic acid in capsules in ideal and non-ideal cases of linearity.

    PubMed

    Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F

    2018-06-01

    This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
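
    As an illustration of the IRLS idea mentioned above, the sketch below refits a weighted straight line repeatedly, down-weighting points with large residuals. The Huber weight function, the MAD-based scale, the tuning constant 1.345 and the calibration-style data are all assumptions chosen for the example; the abstract does not say which weighting scheme the authors used.

```python
from statistics import median


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def irls_line(x, y, k=1.345, iterations=50):
    """Iteratively re-weighted least squares with Huber weights."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        slope, intercept = weighted_line(x, y, w)
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust scale from the median absolute residual; guard against 0.
        scale = median(abs(r) for r in resid) / 0.6745 or 1e-12
        threshold = k * scale
        w = [1.0 if abs(r) <= threshold else threshold / abs(r) for r in resid]
    return slope, intercept


# Made-up calibration points on y = 2x + 1 with one gross outlier at x = 5.
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 5, 7, 9, 30]
ols_slope, ols_intercept = weighted_line(x, y, [1.0] * len(x))
robust_slope, robust_intercept = irls_line(x, y)
```

    The unweighted fit is pulled toward the outlier (slope about 4.7), while the re-weighted fit should sit closer to the slope of 2 followed by the other five points.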

  5. Composite Linear Models | Division of Cancer Prevention

    Cancer.gov

    By Stuart G. Baker The composite linear models software is a matrix approach to compute maximum likelihood estimates and asymptotic standard errors for models for incomplete multinomial data. It implements the method described in Baker SG. Composite linear models for incomplete multinomial data. Statistics in Medicine 1994;13:609-622. The software includes a library of thirty

  6. Estimating population diversity with CatchAll

    PubMed Central

    Bunge, John; Woodard, Linda; Böhning, Dankmar; Foster, James A.; Connolly, Sean; Allen, Heather K.

    2012-01-01

    Motivation: The massive data produced by next-generation sequencing require advanced statistical tools. We address estimating the total diversity or species richness in a population. To date, only relatively simple methods have been implemented in available software. There is a need for software employing modern, computationally intensive statistical analyses including error, goodness-of-fit and robustness assessments. Results: We present CatchAll, a fast, easy-to-use, platform-independent program that computes maximum likelihood estimates for finite-mixture models, weighted linear regression-based analyses and coverage-based non-parametric methods, along with outlier diagnostics. Given sample ‘frequency count’ data, CatchAll computes 12 different diversity estimates and applies a model-selection algorithm. CatchAll also derives discounted diversity estimates to adjust for possibly uncertain low-frequency counts. It is accompanied by an Excel-based graphics program. Availability: Free executable downloads for Linux, Windows and Mac OS, with manual and source code, at www.northeastern.edu/catchall. Contact: jab18@cornell.edu PMID:22333246

  7. New database for improving virtual system “body-dress”

    NASA Astrophysics Data System (ADS)

    Yan, J. Q.; Zhang, S. C.; Kuzmichev, V. E.; Adolphe, D. C.

    2017-10-01

    The aim of this study is to develop a new database of solid algorithms and relations between dress fit, fabric mechanical properties, and pattern block construction in order to improve the realism of the virtual “body-dress” system. In virtual simulation, the “body-clothing” system sometimes shows results that differ from reality, especially when important changes in pattern block and fabrics are involved. In this research, to enhance the simulation process, diverse fit parameters were proposed: bottom height of dress, angle of front center contours, and air volume and its distribution between dress and dummy. Measurements were made and optimized using a ruler, a camera, 3D body scanner image processing software and 3D modeling software. In the meantime, pattern block indexes were measured and fabric properties were tested by KES. Finally, the correlation and linear regression equations between indexes of fabric properties, pattern blocks and fit parameters were investigated. In this manner, the new database could be extended into programming modules of virtual design for more realistic results.

  8. QSAR models for predicting octanol/water and organic carbon/water partition coefficients of polychlorinated biphenyls.

    PubMed

    Yu, S; Gao, S; Gan, Y; Zhang, Y; Ruan, X; Wang, Y; Yang, L; Shi, J

    2016-04-01

    Quantitative structure-property relationship modelling can be a valuable alternative method to replace or reduce experimental testing. In particular, some endpoints such as octanol-water (KOW) and organic carbon-water (KOC) partition coefficients of polychlorinated biphenyls (PCBs) are easier to predict and various models have been already developed. In this paper, two different methods, which are multiple linear regression based on the descriptors generated using Dragon software and hologram quantitative structure-activity relationships, were employed to predict suspended particulate matter (SPM) derived log KOC and generator column, shake flask and slow stirring method derived log KOW values of 209 PCBs. The predictive ability of the derived models was validated using a test set. The performances of all these models were compared with EPI Suite™ software. The results indicated that the proposed models were robust and satisfactory, and could provide feasible and promising tools for the rapid assessment of the SPM derived log KOC and generator column, shake flask and slow stirring method derived log KOW values of PCBs.

  9. Digital Image Restoration Under a Regression Model - The Unconstrained, Linear Equality and Inequality Constrained Approaches

    DTIC Science & Technology

    1974-01-01

    REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January 1974 Nelson Delfino d’Avila Mascarenhas Image...Report 520 DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January...a two-dimensional form adequately describes the linear model. A discretization is performed by using quadrature methods. By trans

  10. Element enrichment factor calculation using grain-size distribution and functional data regression.

    PubMed

    Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R

    2015-01-01

    In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Who Will Win?: Predicting the Presidential Election Using Linear Regression

    ERIC Educational Resources Information Center

    Lamb, John H.

    2007-01-01

    This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
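
    The paper-and-pencil route mentioned above, solving the normal equations for a least-squares line, can be written out as a small sketch. The function name and the data points are made up for illustration and are not taken from the article's election activity.

```python
def normal_equations_line(x, y):
    """Solve the 2x2 normal equations for y = b0 + b1 * x:

        n * b0      + sum(x) * b1    = sum(y)
        sum(x) * b0 + sum(x^2) * b1  = sum(x * y)

    using Cramer's rule. Requires at least two distinct x values.
    """
    n = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    b0 = (sy * sxx - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Made-up points lying exactly on y = 3 + 2x.
b0, b1 = normal_equations_line([0, 1, 2, 3], [3, 5, 7, 9])
```

    For these points the sums are n = 4, Σx = 6, Σx² = 14, Σy = 24 and Σxy = 46, giving an intercept of 3 and a slope of 2.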

  12. CatReg Software for Categorical Regression Analysis (May 2016)

    EPA Science Inventory

    CatReg 3.0 is a Microsoft Windows enhanced version of the Agency’s categorical regression analysis (CatReg) program. CatReg complements EPA’s existing Benchmark Dose Software (BMDS) by greatly enhancing a risk assessor’s ability to determine whether data from separate toxicologic...

  13. Modeling Relationships Between Flight Crew Demographics and Perceptions of Interval Management

    NASA Technical Reports Server (NTRS)

    Remy, Benjamin; Wilson, Sara R.

    2016-01-01

    The Interval Management Alternative Clearances (IMAC) human-in-the-loop simulation experiment was conducted to assess interval management system performance and participants' acceptability and workload while performing three interval management clearance types. Twenty-four subject pilots and eight subject controllers flew ten high-density arrival scenarios into Denver International Airport during two weeks of data collection. This analysis examined the possible relationships between subject pilot demographics on reported perceptions of interval management in IMAC. Multiple linear regression models were created with a new software tool to predict subject pilot questionnaire item responses from demographic information. General patterns were noted across models that may indicate flight crew demographics influence perceptions of interval management.

  14. POWERLIB: SAS/IML Software for Computing Power in Multivariate Linear Models

    PubMed Central

    Johnson, Jacqueline L.; Muller, Keith E.; Slaughter, James C.; Gurka, Matthew J.; Gribbin, Matthew J.; Simpson, Sean L.

    2014-01-01

    The POWERLIB SAS/IML software provides convenient power calculations for a wide range of multivariate linear models with Gaussian errors. The software includes the Box, Geisser-Greenhouse, Huynh-Feldt, and uncorrected tests in the “univariate” approach to repeated measures (UNIREP), the Hotelling Lawley Trace, Pillai-Bartlett Trace, and Wilks Lambda tests in “multivariate” approach (MULTIREP), as well as a limited but useful range of mixed models. The familiar univariate linear model with Gaussian errors is an important special case. For estimated covariance, the software provides confidence limits for the resulting estimated power. All power and confidence limits values can be output to a SAS dataset, which can be used to easily produce plots and tables for manuscripts. PMID:25400516

  15. Computer simulation of Cerebral Arteriovenous Malformation-validation analysis of hemodynamics parameters.

    PubMed

    Kumar, Y Kiran; Mehta, Shashi Bhushan; Ramachandra, Manjunath

    2017-01-01

    The purpose of this work is to provide validation methods for evaluating the hemodynamic assessment of cerebral arteriovenous malformation (CAVM). This article emphasizes the importance of validating noninvasive measurements for CAVM patients, which are designed using lumped models for complex vessel structure. The validation of the hemodynamic assessment is based on invasive clinical measurements and cross-validation techniques with the Philips proprietary validated software Qflow and 2D Perfusion. The modeling results are validated for 30 CAVM patients at 150 vessel locations. Mean flow, diameter, and pressure were compared between modeling results and clinical/cross-validation measurements using an independent two-tailed Student t test. Exponential regression analysis was used to assess the relationship between blood flow, vessel diameter, and pressure. Univariate analyses of the relationships between vessel diameter, vessel cross-sectional area, AVM volume, AVM pressure, and AVM flow were performed with linear or exponential regression. Modeling results were compared with clinical measurements from vessel locations of cerebral regions. The model is also cross-validated with the Philips proprietary validated software Qflow and 2D Perfusion. Our results show that the modeling results and clinical results nearly match, with a small deviation. In this article, we have validated our modeling results with clinical measurements. A new approach for cross-validation is proposed by demonstrating the accuracy of our results with a validated product in a clinical environment.

  16. Comparative Performance Evaluation of Rainfall-runoff Models, Six of Black-box Type and One of Conceptual Type, From The Galway Flow Forecasting System (gffs) Package, Applied On Two Irish Catchments

    NASA Astrophysics Data System (ADS)

    Goswami, M.; O'Connor, K. M.; Shamseldin, A. Y.

    The "Galway Real-Time River Flow Forecasting System" (GFFS) is a software package developed at the Department of Engineering Hydrology, of the National University of Ireland, Galway, Ireland. It is based on a selection of lumped black-box and conceptual rainfall-runoff models, all developed in Galway, consisting primarily of both the non-parametric (NP) and parametric (P) forms of two black-box-type rainfall-runoff models, namely, the Simple Linear Model (SLM-NP and SLM-P) and the seasonally-based Linear Perturbation Model (LPM-NP and LPM-P), together with the non-parametric wetness-index-based Linearly Varying Gain Factor Model (LVGFM), the black-box Artificial Neural Network (ANN) Model, and the conceptual Soil Moisture Accounting and Routing (SMAR) Model. Comprised of the above suite of models, the system enables the user to calibrate each model individually, initially without updating, and it is capable also of producing combined (i.e. consensus) forecasts using the Simple Average Method (SAM), the Weighted Average Method (WAM), or the Artificial Neural Network Method (NNM). The updating of each model output is achieved using one of four different techniques, namely, simple Auto-Regressive (AR) updating, Linear Transfer Function (LTF) updating, Artificial Neural Network updating (NNU), and updating by the Non-linear Auto-Regressive Exogenous-input method (NARXM). The models exhibit a considerable range of variation in degree of complexity of structure, with corresponding degrees of complication in objective function evaluation. Operating in continuous river-flow simulation and updating modes, these models and techniques have been applied to two Irish catchments, namely, the Fergus and the Brosna. A number of performance evaluation criteria have been used to comparatively assess the model discharge forecast efficiency.

  17. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    PubMed

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on actual population, the CAT test had much stronger statistical power than linear regression analysis for both overall incidence trend and age-specific incidence trend (Cochran-Armitage trend P value

  18. Local Linear Regression for Data with AR Errors.

    PubMed

    Li, Runze; Li, Yan

    2009-07-01

    In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for nonparametric regression using the local linear regression method and profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of the auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of a real data set.
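
    As a point of reference for the record above, the following is a minimal sketch of the naive local linear estimator with a working-independent error structure, i.e. the baseline the abstract compares against. It is not the authors' profile least squares procedure. The Gaussian kernel, the function name `local_linear_fit`, and the `bandwidth` parameter are assumptions made for illustration.

```python
import math


def local_linear_fit(xs, ys, x0, bandwidth):
    """Estimate the regression function at x0 with a kernel-weighted line fit.

    Assumption: Gaussian kernel and independent errors (the "naive" estimator).
    """
    # Gaussian kernel weights: observations close to x0 count the most.
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    # Weighted means of x and y.
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted least-squares slope of the local line.
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    # The estimate is the local line evaluated at x0.
    return my + slope * (x0 - mx)


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [3.0 + 2.0 * x for x in xs]  # hypothetical, exactly linear data
    print(local_linear_fit(xs, ys, x0=4.5, bandwidth=1.0))
```

    On exactly linear data the local fit reproduces the line, which is a convenient sanity check before trying noisy or autocorrelated data.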

  19. Correlation of Vitamin D status and orthodontic-induced external apical root resorption.

    PubMed

    Tehranchi, Azita; Sadighnia, Azin; Younessian, Farnaz; Abdi, Amir H; Shirvani, Armin

    2017-01-01

    Adequate Vitamin D is essential for dental and skeletal health in children and adults. The purpose of this study was to assess the correlation of serum Vitamin D level with orthodontic-induced external apical root resorption (EARR) following fixed orthodontic treatment. In this cross-sectional study, the prevalence of Vitamin D deficiency (defined by 25-hydroxyvitamin-D) was determined in 34 patients (23.5% male; age range 12-23 years; mean age 16.63 ± 2.84) treated with fixed orthodontic treatment. Root resorption of four maxillary incisors was measured using before and after periapical radiographs (136 measured teeth) by means of a design-to-purpose software to optimize data collection. Teeth with a maximum percentage of root resorption (%EARR) were indicated as representative root resorption for each patient. A multiple linear regression model and Pearson correlation coefficient were used to assess the association of Vitamin D status and observed EARR. P < 0.05 was considered statistically significant. The Pearson coefficient between these two variables was about 0.15 (P = 0.38). Regression analysis revealed that Vitamin D status of the patients demonstrated no significant statistical correlation with EARR, after adjustment of confounding variables using a linear regression model (P > 0.05). This study suggests that Vitamin D level is not among the clinical variables that are potential contributors for EARR. The prevalence of Vitamin D deficiency does not differ in patients with higher EARR. These data suggest the possibility that Vitamin D insufficiency may not contribute to the development of more apical root resorption although this remains to be confirmed by further longitudinal cohort studies.

  20. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
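
    The two fits named in the record above can be sketched with their standard closed-form slopes. This is an illustrative sketch, not code from the article; the function names and the sample data are assumptions. The major axis minimizes the sum of squared perpendicular distances, and the reduced major axis slope is the ratio of the standard deviations carrying the sign of the covariance.

```python
import math


def _centered_sums(xs, ys):
    """Return the means and the centered sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def major_axis(xs, ys):
    """Orthogonal regression line (slope, intercept); requires sxy != 0."""
    mx, my, sxx, syy, sxy = _centered_sums(xs, ys)
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx


def reduced_major_axis(xs, ys):
    """Reduced major axis line (slope, intercept); requires sxx > 0."""
    mx, my, sxx, syy, sxy = _centered_sums(xs, ys)
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope, my - slope * mx


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.1, 2.9, 5.2, 6.8]  # hypothetical noisy points near y = 2x + 1
    print(major_axis(xs, ys))
    print(reduced_major_axis(xs, ys))
```

    Both lines pass through the centroid of the data; they differ from ordinary least squares in that neither treats x as error-free.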

  1. Estimating multilevel logistic regression models when the number of clusters is low: a comparison of different statistical software procedures.

    PubMed

    Austin, Peter C

    2010-04-22

    Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.

  2. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, standard error of the error, R2, residuals … In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
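
    The quantities listed in the record above (coefficient estimates, residual standard error, R2, residuals) can be computed by hand for a simple linear regression. The sketch below uses plain Python rather than R's lm, and the five data points are made up for illustration; they are not Galton's data.

```python
import math


def simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    rss = sum(r * r for r in residuals)    # residual sum of squares
    tss = sum((y - my) ** 2 for y in ys)   # total sum of squares
    return {
        "slope": slope,
        "intercept": intercept,
        "residuals": residuals,
        # Residual standard error uses n - 2 degrees of freedom.
        "sigma": math.sqrt(rss / (n - 2)),
        "r_squared": 1.0 - rss / tss,
    }


if __name__ == "__main__":
    fit = simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
    print(fit["slope"], fit["intercept"], fit["sigma"], fit["r_squared"])
```

    Predicted values follow directly as intercept + slope * x for any new x.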

  3. [Correlation research on contents of podophyllotoxin and total lignans in Sinopodophyllum hexandrum and ecological factors].

    PubMed

    Li, Min; Zhong, Guo-yue; Wu, Ao-lin; Zhang, Shou-wen; Jiang, Wei; Liang, Jian

    2015-05-01

    To explore the correlation between the ecological factors and the contents of podophyllotoxin and total lignans in root and rhizome of Sinopodophyllum hexandrum, podophyllotoxin in 87 samples (from 5 provinces) was determined by HPLC and total lignans by UV. A correlation and regression analysis was performed with SPSS 16.0 in combination with ecological factors (terrain, soil and climate). The content determination results showed a great difference between podophyllotoxin and total lignans, attaining 1.001%-6.230% and 5.350%-16.34%, respectively. The correlation and regression analysis by SPSS showed a positive linear correlation between their contents, strong positive correlation between their contents, latitude and annual average rainfall within the sampling area, weak negative correlation with pH value and organic material in soil, weaker and stronger positive correlations with soil potassium, weak negative correlation with slope and annual average temperature and weaker positive correlation between the podophyllotoxin content and soil potassium.

  4. [Quantitative relationship between gas chromatographic retention time and structural parameters of alkylphenols].

    PubMed

    Ruan, Xiaofang; Zhang, Ruisheng; Yao, Xiaojun; Liu, Mancang; Fan, Botao

    2007-03-01

    Alkylphenols are a group of persistent pollutants in the environment and could adversely disturb the human endocrine system. It is therefore important to effectively separate and measure the alkylphenols. To guide the chromatographic analysis of these compounds in practice, the development of a quantitative relationship between the molecular structure and the retention time of alkylphenols becomes necessary. In this study, topological, constitutional, geometrical, electrostatic and quantum-chemical descriptors of 44 alkylphenols were calculated using the CODESSA software, and these descriptors were pre-selected using the heuristic method. As a result, a three-descriptor linear model (LM) was developed to describe the relationship between the molecular structure and the retention time of alkylphenols. Meanwhile, a non-linear regression model was also developed based on a support vector machine (SVM) using the same three descriptors. The correlation coefficient (R²) for the LM and SVM was 0.98 and 0.92, and the corresponding root-mean-square error was 0.99 and 2.77, respectively. By comparing the stability and prediction ability of the two models, it was found that the linear model was a better method for describing the quantitative relationship between the retention time of alkylphenols and the molecular structure. The results obtained suggested that the linear model could be applied for the chromatographic analysis of alkylphenols with known molecular structural parameters.

  5. Quadratic Blind Linear Unmixing: A Graphical User Interface for Tissue Characterization

    PubMed Central

    Gutierrez-Navarro, O.; Campos-Delgado, D.U.; Arce-Santana, E. R.; Jo, Javier A.

    2016-01-01

    Spectral unmixing is the process of breaking down data from a sample into its basic components and their abundances. Previous work has been focused on blind unmixing of multi-spectral fluorescence lifetime imaging microscopy (m-FLIM) datasets under a linear mixture model and quadratic approximations. This method provides a fast linear decomposition and can work without a limitation in the maximum number of components or end-members. Hence this work presents an interactive software which implements our blind end-member and abundance extraction (BEAE) and quadratic blind linear unmixing (QBLU) algorithms in Matlab. The options and capabilities of our proposed software are described in detail. When the number of components is known, our software can estimate the constitutive end-members and their abundances. When no prior knowledge is available, the software can provide a completely blind solution to estimate the number of components, the end-members and their abundances. The characterization of three case studies validates the performance of the new software: ex-vivo human coronary arteries, human breast cancer cell samples, and in-vivo hamster oral mucosa. The software is freely available in a hosted webpage by one of the developing institutions, and allows the user a quick, easy-to-use and efficient tool for multi/hyper-spectral data decomposition. PMID:26589467

  6. Quadratic blind linear unmixing: A graphical user interface for tissue characterization.

    PubMed

    Gutierrez-Navarro, O; Campos-Delgado, D U; Arce-Santana, E R; Jo, Javier A

    2016-02-01

    Spectral unmixing is the process of breaking down data from a sample into its basic components and their abundances. Previous work has been focused on blind unmixing of multi-spectral fluorescence lifetime imaging microscopy (m-FLIM) datasets under a linear mixture model and quadratic approximations. This method provides a fast linear decomposition and can work without a limitation in the maximum number of components or end-members. Hence this work presents an interactive software which implements our blind end-member and abundance extraction (BEAE) and quadratic blind linear unmixing (QBLU) algorithms in Matlab. The options and capabilities of our proposed software are described in detail. When the number of components is known, our software can estimate the constitutive end-members and their abundances. When no prior knowledge is available, the software can provide a completely blind solution to estimate the number of components, the end-members and their abundances. The characterization of three case studies validates the performance of the new software: ex-vivo human coronary arteries, human breast cancer cell samples, and in-vivo hamster oral mucosa. The software is freely available in a hosted webpage by one of the developing institutions, and allows the user a quick, easy-to-use and efficient tool for multi/hyper-spectral data decomposition. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  7. Can Functional Cardiac Age be Predicted from ECG in a Normal Healthy Population

    NASA Technical Reports Server (NTRS)

    Schlegel, Todd; Starc, Vito; Leban, Manja; Sinigoj, Petra; Vrhovec, Milos

    2011-01-01

    In a normal healthy population, we desired to determine the most age-dependent conventional and advanced ECG parameters. We hypothesized that changes in several ECG parameters might correlate with age and together reliably characterize the functional age of the heart. Methods: An initial study population of 313 apparently healthy subjects was ultimately reduced to 148 subjects (74 men, 84 women, in the range from 10 to 75 years of age) after exclusion criteria. In all subjects, ECG recordings (resting 5-minute 12-lead high frequency ECG) were evaluated via custom software programs to calculate up to 85 different conventional and advanced ECG parameters including beat-to-beat QT and RR variability, waveform complexity, and signal-averaged, high-frequency and spatial/spatiotemporal ECG parameters. The prediction of functional age was evaluated by multiple linear regression analysis using the best 5 univariate predictors. Results: Ignoring what were ultimately small differences between males and females, the functional age was found to be predicted (R2= 0.69, P < 0.001) from a linear combination of 5 independent variables: QRS elevation in the frontal plane (p<0.001), a new repolarization parameter QTcorr (p<0.001), mean high frequency QRS amplitude (p=0.009), the variability parameter % VLF of RRV (p=0.021) and the P-wave width (p=0.10). Here, QTcorr represents the correlation between the calculated QT and the measured QT signal. Conclusions: In apparently healthy subjects with normal conventional ECGs, functional cardiac age can be estimated by multiple linear regression analysis of mostly advanced ECG results. Because some parameters in the regression formula, such as QTcorr, high frequency QRS amplitude and P-wave width also change with disease in the same direction as with increased age, increased functional age of the heart may reflect subtle age-related pathologies in cardiac electrical function that are usually hidden on conventional ECG.

  8. Morse Code, Scrabble, and the Alphabet

    ERIC Educational Resources Information Center

    Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss

    2004-01-01

    In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…

  9. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  10. Reversed inverse regression for the univariate linear calibration and its statistical properties derived using a new methodology

    NASA Astrophysics Data System (ADS)

    Kang, Pilsang; Koo, Changhoi; Roh, Hokyu

    2017-11-01

    Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.

  11. A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.

    PubMed

    Ferrari, Alberto; Comelli, Mario

    2016-12-01

    In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods are available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; in addition, we describe results from the application of these methods on data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Predicting the aquatic toxicity mode of action using logistic regression and linear discriminant analysis.

    PubMed

    Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X

    2016-09-01

    The paper highlights the use of the logistic regression (LR) method in the construction of acceptable statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics was done through the influence plot which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.

  13. Automated Algorithms for Quantum-Level Accuracy in Atomistic Simulations: LDRD Final Report.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thompson, Aidan Patrick; Schultz, Peter Andrew; Crozier, Paul

    2014-09-01

    This report summarizes the result of LDRD project 12-0395, titled "Automated Algorithms for Quantum-level Accuracy in Atomistic Simulations." During the course of this LDRD, we have developed an interatomic potential for solids and liquids called Spectral Neighbor Analysis Potential (SNAP). The SNAP potential has a very general form and uses machine-learning techniques to reproduce the energies, forces, and stress tensors of a large set of small configurations of atoms, which are obtained using high-accuracy quantum electronic structure (QM) calculations. The local environment of each atom is characterized by a set of bispectrum components of the local neighbor density projected on to a basis of hyperspherical harmonics in four dimensions. The SNAP coefficients are determined using weighted least-squares linear regression against the full QM training set. This allows the SNAP potential to be fit in a robust, automated manner to large QM data sets using many bispectrum components. The calculation of the bispectrum components and the SNAP potential are implemented in the LAMMPS parallel molecular dynamics code. Global optimization methods in the DAKOTA software package are used to seek out good choices of hyperparameters that define the overall structure of the SNAP potential. FitSnap.py, a Python-based software package interfacing to both LAMMPS and DAKOTA is used to formulate the linear regression problem, solve it, and analyze the accuracy of the resultant SNAP potential. We describe a SNAP potential for tantalum that accurately reproduces a variety of solid and liquid properties. Most significantly, in contrast to existing tantalum potentials, SNAP correctly predicts the Peierls barrier for screw dislocation motion. We also present results from SNAP potentials generated for indium phosphide (InP) and silica (SiO2). We describe efficient algorithms for calculating SNAP forces and energies in molecular dynamics simulations using massively parallel computers and advanced processor architectures. Finally, we briefly describe the MSM method for efficient calculation of electrostatic interactions on massively parallel computers.

  14. Separation in Logistic Regression: Causes, Consequences, and Control.

    PubMed

    Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg

    2018-04-01

    Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
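
    The claim in the record above that separation produces infinite estimates can be seen numerically. The sketch below assumes a hypothetical no-intercept logistic model with one covariate and four made-up, perfectly separated observations. The quadratic penalty in `penalized_log_likelihood` is one simple choice (equivalent to a normal prior with standard deviation `prior_sd`) picked for illustration; it is not necessarily one of the penalties the article recommends.

```python
import math


def log_likelihood(beta, xs, ys):
    """Log-likelihood of the model P(y = 1 | x) = 1 / (1 + exp(-beta * x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x if y == 1 else -beta * x
        # log(sigmoid(z)) written as -log1p(exp(-z)).
        total -= math.log1p(math.exp(-z))
    return total


def penalized_log_likelihood(beta, xs, ys, prior_sd=1.0):
    """Log-likelihood minus a quadratic penalty (assumed normal prior on beta)."""
    return log_likelihood(beta, xs, ys) - beta ** 2 / (2.0 * prior_sd ** 2)


# Hypothetical data: x < 0 always gives y = 0 and x > 0 always gives y = 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

if __name__ == "__main__":
    # The unpenalized log-likelihood keeps rising toward 0 as beta grows,
    # so no finite maximum likelihood estimate exists.
    for beta in (1.0, 5.0, 10.0, 20.0):
        print(beta, log_likelihood(beta, xs, ys))
    # With the penalty, a coarse grid search finds a finite maximizer.
    grid = [i / 10.0 for i in range(0, 201)]
    print(max(grid, key=lambda b: penalized_log_likelihood(b, xs, ys)))
```

    Software that stops iterating at a large but finite coefficient is reporting an arbitrary point on this ever-increasing curve, which is why separation can go unnoticed.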

  15. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life studies have an important role in health care, especially in chronic diseases, in clinical judgment and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results with linear regression. A cross-sectional study was conducted on 119 breast cancer patients who were admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92 ± 11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in quantile regression analysis and dyspnea was only significant for the first quartile. Also, emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about predictors of breast cancer patients' quality of life.
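
    Quantile regression, as used in the record above, replaces the squared-error criterion with the check (pinball) loss. The sketch below illustrates only the intercept-only case, where the minimizer is a sample quantile; the scores are made up and are not the study's QLQ-C30 data, and the function names are assumptions.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0 and (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(c, ys, tau):
    """Summed check loss when every observation is predicted by the constant c."""
    return sum(pinball_loss(y - c, tau) for y in ys)


def best_constant(ys, tau):
    """Minimize the summed check loss over the observed values.

    The loss is convex and piecewise linear in c, so a minimizer can always
    be found among the data points themselves.
    """
    return min(sorted(ys), key=lambda c: total_loss(c, ys, tau))


if __name__ == "__main__":
    scores = [40, 55, 60, 62, 65, 70, 72, 80, 95]  # hypothetical scores
    for tau in (0.25, 0.5, 0.75):
        print(tau, best_constant(scores, tau))
```

    Fitting at tau = 0.25, 0.5 and 0.75 separately is what allows a predictor to matter in one quartile and not in another, as the abstract reports for dyspnea.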

  16. Interpretation of commonly used statistical regression models.

    PubMed

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.

  17. Use of probabilistic weights to enhance linear regression myoelectric control

    NASA Astrophysics Data System (ADS)

    Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.

    2015-12-01

    Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.

  18. Simplified large African carnivore density estimators from track indices.

    PubMed

    Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J

    2016-01-01

    The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not intercept at zero, but may fit the data better than a linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. We did simple linear regression with intercept analysis and simple linear regression through the origin and used the confidence interval for β in the linear model y = αx + β, Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for β and the null hypothesis that β = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km² or higher. To improve the current models, we need independent data to validate the models and data to test for non-linear relationship between track indices and true density at low densities.
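
    The comparison described in this abstract can be sketched in a few lines. This is only an illustration, not the authors' analysis: the function names are hypothetical, the sample values are made up, and the only number taken from the abstract is the slope 3.26.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least-squares slope for the no-intercept model y = slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sum(x * x))

def fit_with_intercept(x, y):
    """Least-squares (slope, intercept) for y = slope * x + intercept."""
    slope, intercept = np.polyfit(np.asarray(x, dtype=float),
                                  np.asarray(y, dtype=float), 1)
    return float(slope), float(intercept)

def carnivore_density(track_density, slope=3.26):
    """Invert 'track density = slope * carnivore density' (slope from the abstract)."""
    return track_density / slope

# Made-up illustrative values, not data from the study.
density = np.array([0.5, 1.0, 2.0, 4.0])
tracks = 3.26 * density

slope_origin = fit_through_origin(density, tracks)
slope_int, intercept = fit_with_intercept(density, tracks)
```

    With real survey data the two fits would differ, and the abstract's criteria (confidence interval for the intercept, standard error of estimate, AIC) would be used to choose between them.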

  19. [From clinical judgment to linear regression model].

    PubMed

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
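
    As a minimal sketch of the quantities named in this abstract, the intercept "a", the slope "b" and the coefficient of determination can be computed from the standard least-squares formulas. The function name and the sample values are illustrative assumptions, not taken from the article.

```python
import numpy as np

def simple_linear_regression(x, y):
    """Return (a, b, r2) for the least-squares line Y = a + b*X."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of X and Y divided by the variance of X.
    b = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    a = y_mean - b * x_mean
    # R^2: share of the variance in Y explained by the fitted line.
    residuals = y - (a + b * x)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y_mean) ** 2)
    return float(a), float(b), float(r2)

# Made-up points lying exactly on Y = 1 + 2X.
a, b, r2 = simple_linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
```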

  20. Fourier transform infrared reflectance spectra of latent fingerprints: a biometric gauge for the age of an individual.

    PubMed

    Hemmila, April; McGill, Jim; Ritter, David

    2008-03-01

    To determine if changes in fingerprint infrared spectra linear with age can be found, partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.

  1. Linearity versus Nonlinearity of Offspring-Parent Regression: An Experimental Study of Drosophila Melanogaster

    PubMed Central

    Gimelfarb, A.; Willis, J. H.

    1994-01-01

    An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818

  2. Novel semi-automated kidney volume measurements in autosomal dominant polycystic kidney disease.

    PubMed

    Muto, Satoru; Kawano, Haruna; Isotani, Shuji; Ide, Hisamitsu; Horie, Shigeo

    2018-06-01

    We assessed the effectiveness and convenience of a novel semi-automatic kidney volume (KV) measuring high-speed 3D-image analysis system SYNAPSE VINCENT® (Fuji Medical Systems, Tokyo, Japan) for autosomal dominant polycystic kidney disease (ADPKD) patients. We developed a novel semi-automated KV measurement software for patients with ADPKD to be included in the imaging analysis software SYNAPSE VINCENT®. The software extracts renal regions using image recognition software and measures KV (VINCENT KV). The algorithm was designed to work with the manual designation of a long axis of a kidney including cysts. After using the software to assess the predictive accuracy of the VINCENT method, we performed an external validation study and compared accurate KV and ellipsoid KV based on geometric modeling by linear regression analysis and Bland-Altman analysis. Median eGFR was 46.9 ml/min/1.73 m². Median accurate KV, Vincent KV and ellipsoid KV were 627.7, 619.4 ml (IQR 431.5-947.0) and 694.0 ml (IQR 488.1-1107.4), respectively. Compared with ellipsoid KV (r = 0.9504), Vincent KV correlated strongly with accurate KV (r = 0.9968), without systematic underestimation or overestimation (ellipsoid KV; 14.2 ± 22.0%, Vincent KV; -0.6 ± 6.0%). There were no significant slice thickness-specific differences (p = 0.2980). The VINCENT method is an accurate and convenient semi-automatic method to measure KV in patients with ADPKD compared with the conventional ellipsoid method.

  3. An Ada Linear-Algebra Software Package Modeled After HAL/S

    NASA Technical Reports Server (NTRS)

    Klumpp, Allan R.; Lawson, Charles L.

    1990-01-01

    New avionics software written more easily. Software package extends Ada programming language to include linear-algebra capabilities similar to those of HAL/S programming language. Designed for such avionics applications as Space Station flight software. In addition to built-in functions of HAL/S, package incorporates quaternion functions used in Space Shuttle and Galileo projects and routines from LINPACK solving systems of equations involving general square matrices. Contains two generic programs: one for floating-point computations and one for integer computations. Written on IBM/AT personal computer running under PC DOS, v.3.1.

  4. Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.

    PubMed

    Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C

    2014-03-01

    In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electro-myographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve similar performance as KRR at much lower computational costs. Especially ME, a physiologically inspired extension of linear regression represents a promising candidate for the next generation of prosthetic devices.

  5. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  6. An Expert System for the Evaluation of Cost Models

    DTIC Science & Technology

    1990-09-01

    contrast to the condition of equal error variance, called homoscedasticity. (Reference: Applied Linear Regression Models by John Neter - page 423...normal. (Reference: Applied Linear Regression Models by John Neter - page 125) Click Here to continue -> Autocorrelation Click Here for the index - Index...over time. Error terms correlated over time are said to be autocorrelated or serially correlated. (REFERENCE: Applied Linear Regression Models by John

  7. Compound Identification Using Penalized Linear Regression on Metabolomics

    PubMed Central

    Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho

    2014-01-01

    Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data become high-dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson’s correlation along with the penalized linear regression are proposed in this study. PMID:27212894
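
    The singularity problem mentioned in this abstract can be illustrated with a small sketch. It assumes a random toy design matrix with more columns than rows (not mass spectra) and uses the standard closed-form ridge solution; the names `ridge_fit` and `lam` are hypothetical, and this is not the authors' two-step procedure.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Toy data: 5 observations, 20 predictors, so X^T X (20 x 20) is singular.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))
y = rng.normal(size=5)

rank = np.linalg.matrix_rank(X.T @ X)  # cannot exceed the 5 rows of X
beta = ridge_fit(X, y, lam=1.0)        # the penalty makes the system solvable
```

    Ordinary least squares has no unique solution here because the rank is below 20. Adding the penalty term to the diagonal gives a positive definite matrix and hence a unique estimate.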

  8. Control Variate Selection for Multiresponse Simulation.

    DTIC Science & Technology

    1987-05-01

    M. H. Kutner, Applied Linear Regression Models, Richard D. Irwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon...1982. Neter, J., W. Wasserman, and M. H. Kutner, Applied Linear Regression Models, Richard D. Irwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F...Aspects of Multivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982. Neter, J., W. Wasserman, and M. H. Kutner, Applied Linear Regression Models

  9. An Investigation of the Fit of Linear Regression Models to Data from an SAT[R] Validity Study. Research Report 2011-3

    ERIC Educational Resources Information Center

    Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael

    2011-01-01

    This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…

  10. High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.

    PubMed

    Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D

    2018-05-30

    NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Quantile Regression in the Study of Developmental Sciences

    PubMed Central

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
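
    The contrast drawn in this abstract between the average relation and other points of the outcome's distribution rests on the loss function being minimized. The sketch below uses the standard check (pinball) loss and searches over constant predictions only. The function name and the numbers are illustrative assumptions, not material from the article.

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Mean check loss for quantile level tau (0 < tau < 1)."""
    r = np.asarray(y, dtype=float) - pred
    return float(np.mean(np.maximum(tau * r, (tau - 1.0) * r)))

# Made-up outcome with one extreme value.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Grid search for the constant that minimizes the loss at tau = 0.5.
candidates = np.linspace(0.0, 100.0, 10001)
losses = [pinball_loss(y, c, 0.5) for c in candidates]
best_constant = float(candidates[int(np.argmin(losses))])

sample_mean = float(y.mean())  # what a squared-error fit would target instead
```

    At tau = 0.5 the minimizer is the sample median, which the extreme value barely affects, whereas the mean is pulled far upward. Quantile regression applies the same loss to a linear predictor rather than to a constant.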

  12. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
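
    The two techniques named in this abstract can be sketched with their textbook formulas. The function names, the 1.96 multiplier for the limits of agreement, the default error-variance ratio `delta=1.0` and the sample values are all assumptions for illustration; none of them comes from the article's data or code.

```python
import numpy as np

def bland_altman(reference, test):
    """Return (bias, lower limit, upper limit) of agreement for test - reference."""
    diff = np.asarray(test, dtype=float) - np.asarray(reference, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return float(bias), float(bias - 1.96 * sd), float(bias + 1.96 * sd)

def deming(x, y, delta=1.0):
    """Deming regression; delta is the ratio of y-error to x-error variance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    slope = (syy - delta * sxx
             + np.sqrt((syy - delta * sxx) ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    intercept = y.mean() - slope * x.mean()
    return float(slope), float(intercept)

# Made-up assay values with a constant error of 0.2 and a proportional error of 10%.
reference = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
test_assay = 1.1 * reference + 0.2

bias, lower, upper = bland_altman(reference, test_assay)
slope, intercept = deming(reference, test_assay)
```

    In this toy case the Deming intercept recovers the constant error and the slope recovers the proportional error, which a high R² alone would not reveal.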

  13. NCCS Regression Test Harness

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tharrington, Arnold N.

    2015-09-09

    The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.

  14. Adding Processing Functionality to the Sensor Web

    NASA Astrophysics Data System (ADS)

    Stasch, Christoph; Pross, Benjamin; Jirka, Simon; Gräler, Benedikt

    2017-04-01

    The Sensor Web allows discovering, accessing and tasking different kinds of environmental sensors in the Web, ranging from simple in-situ sensors to remote sensing systems. However, (geo-)processing functionality needs to be applied to integrate data from different sensor sources and to generate higher level information products. Yet, a common standardized approach for processing sensor data in the Sensor Web is still missing and the integration differs from application to application. Standardizing not only the provision of sensor data, but also the processing facilitates sharing and re-use of processing modules, enables reproducibility of processing results, and provides a common way to integrate external scalable processing facilities or legacy software. In this presentation, we provide an overview on on-going research projects that develop concepts for coupling standardized geoprocessing technologies with Sensor Web technologies. At first, different architectures for coupling sensor data services with geoprocessing services are presented. Afterwards, profiles for linear regression and spatio-temporal interpolation of the OGC Web Processing Services that allow consuming sensor data coming from and uploading predictions to Sensor Observation Services are introduced. The profiles are implemented in processing services for the hydrological domain. Finally, we illustrate how the R software can be coupled with existing OGC Sensor Web and Geoprocessing Services and present an example, how a Web app can be built that allows exploring the results of environmental models in an interactive way using the R Shiny framework. All of the software presented is available as Open Source Software.

  15. Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction.

    PubMed

    He, Dan; Kuhn, David; Parida, Laxmi

    2016-06-15

    Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict the quantitative trait values by simultaneously modeling all marker effects. Genetic trait prediction is usually represented as linear regression models. In many cases, for the same set of samples and markers, multiple traits are observed. Some of these traits might be correlated with each other. Therefore, modeling all the multiple traits together may improve the prediction accuracy. In this work, we view the multitrait prediction problem from a machine learning angle: as either a multitask learning problem or a multiple output regression problem, depending on whether different traits share the same genotype matrix or not. We then adapted multitask learning algorithms and multiple output regression algorithms to solve the multitrait prediction problem. We proposed a few strategies to improve the least square error of the prediction from these algorithms. Our experiments show that modeling multiple traits together could improve the prediction accuracy for correlated traits. The programs we used are either public or directly from the referred authors, such as MALSAR (http://www.public.asu.edu/~jye02/Software/MALSAR/) package. The Avocado data set has not been published yet and is available upon request. Contact: dhe@us.ibm.com. © The Author 2016. Published by Oxford University Press.

  16. Age estimation by pulp-to-tooth area ratio using cone-beam computed tomography: A preliminary analysis.

    PubMed

    Rai, Arpita; Acharya, Ashith B; Naikmasur, Venkatesh G

    2016-01-01

    Age estimation of living or deceased individuals is an important aspect of forensic sciences. Conventionally, pulp-to-tooth area ratio (PTR) measured from periapical radiographs has been utilized as a nondestructive method of age estimation. Cone-beam computed tomography (CBCT) is a new method to acquire three-dimensional images of the teeth in living individuals. The present study investigated age estimation based on PTR of the maxillary canines measured in three planes obtained from CBCT image data. Sixty subjects aged 20-85 years were included in the study. For each tooth, mid-sagittal, mid-coronal, and three axial sections (cementoenamel junction (CEJ), one-fourth root level from CEJ, and mid-root) were assessed. PTR was calculated using AutoCAD software after outlining the pulp and tooth. All statistical analyses were performed using an SPSS 17.0 software program. Linear regression analysis showed that only PTR in the axial plane at CEJ had significant age correlation (r = 0.32; P < 0.05). This is probably because of clearer demarcation of pulp and tooth outline at this level.

  17. Software for Storage and Management of Microclimatic Data for Preventive Conservation of Cultural Heritage

    PubMed Central

    Fernández-Navajas, Ángel; Merello, Paloma; Beltrán, Pedro; García-Diego, Fernando-Juan

    2013-01-01

    Cultural Heritage preventive conservation requires the monitoring of the parameters involved in the process of deterioration of artworks. Thus, both long-term monitoring of the environmental parameters as well as further analysis of the recorded data are necessary. The long-term monitoring at frequencies higher than 1 data point/day generates large volumes of data that are difficult to store, manage and analyze. This paper presents software which uses a free open source database engine that allows managing and interacting with huge amounts of data from environmental monitoring of cultural heritage sites. It is of simple operation and offers multiple capabilities, such as detection of anomalous data, inquiries, graph plotting and mean trajectories. It is also possible to export the data to a spreadsheet for analyses with more advanced statistical methods (principal component analysis, ANOVA, linear regression, etc.). This paper also deals with a practical application developed for the Renaissance frescoes of the Cathedral of Valencia. The results suggest infiltration of rainwater in the vault and weekly relative humidity changes related with the religious service schedules. PMID:23447005

  18. Estimating the Dead Space Volume Between a Headform and N95 Filtering Facepiece Respirator Using Microsoft Kinect.

    PubMed

    Xu, Ming; Lei, Zhipeng; Yang, James

    2015-01-01

    N95 filtering facepiece respirator (FFR) dead space is an important factor for respirator design. The dead space refers to the cavity between the internal surface of the FFR and the wearer's facial surface. This article presents a novel method to estimate the dead space volume of FFRs and experimental validation. In this study, six FFRs and five headforms (small, medium, large, long/narrow, and short/wide) are used for various FFR and headform combinations. Microsoft Kinect Sensors (Microsoft Corporation, Redmond, WA) are used to scan the headforms without respirators and then scan the headforms with the FFRs donned. The FFR dead space is formed through geometric modeling software, and finally the volume is obtained through LS-DYNA (Livermore Software Technology Corporation, Livermore, CA). In the experimental validation, water is used to measure the dead space. The simulation and experimental dead space volumes are 107.5-167.5 mL and 98.4-165.7 mL, respectively. Linear regression analysis is conducted to correlate the results from Kinect and water, and R² = 0.85.

  19. Statistical tools for analysis and modeling of cosmic populations and astronomical time series: CUDAHM and TSE

    NASA Astrophysics Data System (ADS)

    Loredo, Thomas; Budavari, Tamas; Scargle, Jeffrey D.

    2018-01-01

    This presentation provides an overview of open-source software packages addressing two challenging classes of astrostatistics problems. (1) CUDAHM is a C++ framework for hierarchical Bayesian modeling of cosmic populations, leveraging graphics processing units (GPUs) to enable applying this computationally challenging paradigm to large datasets. CUDAHM is motivated by measurement error problems in astronomy, where density estimation and linear and nonlinear regression must be addressed for populations of thousands to millions of objects whose features are measured with possibly complex uncertainties, potentially including selection effects. An example calculation demonstrates accurate GPU-accelerated luminosity function estimation for simulated populations of $10^6$ objects in about two hours using a single NVIDIA Tesla K40c GPU. (2) Time Series Explorer (TSE) is a collection of software in Python and MATLAB for exploratory analysis and statistical modeling of astronomical time series. It comprises a library of stand-alone functions and classes, as well as an application environment for interactive exploration of time series data. The presentation will summarize key capabilities of this emerging project, including new algorithms for analysis of irregularly-sampled time series.

  20. A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

    EPA Science Inventory

    We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...

  1. Simulated Analysis of Linear Reversible Enzyme Inhibition with SCILAB

    ERIC Educational Resources Information Center

    Antuch, Manuel; Ramos, Yaquelin; Álvarez, Rubén

    2014-01-01

    SCILAB is a lesser-known program (than MATLAB) for numeric simulations and has the advantage of being free software. A challenging software-based activity to analyze the most common linear reversible inhibition types with SCILAB is described. Students establish typical values for the concentration of enzyme, substrate, and inhibitor to simulate…

  2. Using Cognitive Tutor Software in Learning Linear Algebra Word Concept

    ERIC Educational Resources Information Center

    Yang, Kai-Ju

    2015-01-01

    This paper reports on a study of twelve 10th grade students using Cognitive Tutor, a math software program, to learn linear algebra word concept. The study's purpose was to examine whether students' mathematics performance as it is related to using Cognitive Tutor provided evidence to support Koedinger's (2002) four instructional principles used…

  3. A dose-response curve for biodosimetry from a 6 MV electron linear accelerator

    PubMed Central

    Lemos-Pinto, M.M.P.; Cadena, M.; Santos, N.; Fernandes, T.S.; Borges, E.; Amaral, A.

    2015-01-01

    Biological dosimetry (biodosimetry) is based on the investigation of radiation-induced biological effects (biomarkers), mainly dicentric chromosomes, in order to correlate them with radiation dose. To interpret the dicentric score in terms of absorbed dose, a calibration curve is needed. Each curve should be constructed with respect to basic physical parameters, such as the type of ionizing radiation characterized by low or high linear energy transfer (LET) and dose rate. This study was designed to obtain dose calibration curves by scoring of dicentric chromosomes in peripheral blood lymphocytes irradiated in vitro with a 6 MV electron linear accelerator (Mevatron M, Siemens, USA). Two software programs, CABAS (Chromosomal Aberration Calculation Software) and Dose Estimate, were used to generate the curve. The two software programs are discussed; the results obtained were compared with each other and with other published low LET radiation curves. Both software programs resulted in identical linear and quadratic terms for the curve presented here, which was in good agreement with published curves for similar radiation quality and dose rates. PMID:26445334

  4. Analyzing longitudinal data with the linear mixed models procedure in SPSS.

    PubMed

    West, Brady T

    2009-09-01

    Many applied researchers analyzing longitudinal data share a common misconception: that specialized statistical software is necessary to fit hierarchical linear models (also known as linear mixed models [LMMs], or multilevel models) to longitudinal data sets. Although several specialized statistical software programs of high quality are available that allow researchers to fit these models to longitudinal data sets (e.g., HLM), rapid advances in general purpose statistical software packages have recently enabled analysts to fit these same models when using preferred packages that also enable other more common analyses. One of these general purpose statistical packages is SPSS, which includes a very flexible and powerful procedure for fitting LMMs to longitudinal data sets with continuous outcomes. This article aims to present readers with a practical discussion of how to analyze longitudinal data using the LMMs procedure in the SPSS statistical software package.

  5. Correlation of Vitamin D status and orthodontic-induced external apical root resorption

    PubMed Central

    Tehranchi, Azita; Sadighnia, Azin; Younessian, Farnaz; Abdi, Amir H.; Shirvani, Armin

    2017-01-01

    Background: Adequate Vitamin D is essential for dental and skeletal health in children and adults. The purpose of this study was to assess the correlation of serum Vitamin D level with orthodontic-induced external apical root resorption (EARR) following fixed orthodontic treatment. Materials and Methods: In this cross-sectional study, the prevalence of Vitamin D deficiency (defined by 25-hydroxyvitamin-D) was determined in 34 patients (23.5% male; age range 12–23 years; mean age 16.63 ± 2.84) who underwent fixed orthodontic treatment. Root resorption of four maxillary incisors was measured on pre- and post-treatment periapical radiographs (136 measured teeth) by means of purpose-designed software to optimize data collection. The tooth with the maximum percentage of root resorption (%EARR) was taken as the representative root resorption for each patient. A multiple linear regression model and the Pearson correlation coefficient were used to assess the association of Vitamin D status and observed EARR. P < 0.05 was considered statistically significant. Results: The Pearson coefficient between these two variables was approximately 0.15 (P = 0.38). Regression analysis revealed no statistically significant correlation between the patients' Vitamin D status and EARR after adjustment for confounding variables using the linear regression model (P > 0.05). Conclusion: This study suggests that Vitamin D level is not among the clinical variables that are potential contributors to EARR. The prevalence of Vitamin D deficiency does not differ in patients with higher EARR. These data suggest the possibility that Vitamin D insufficiency may not contribute to the development of more apical root resorption, although this remains to be confirmed by further longitudinal cohort studies. PMID:29238379

  6. Pseudo second order kinetics and pseudo isotherms for malachite green onto activated carbon: comparison of linear and non-linear regression methods.

    PubMed

    Kumar, K Vasanth; Sivanesan, S

    2006-08-25

    Pseudo second order kinetic expressions of Ho, Sobkowsk and Czerwinski, Blanachard et al. and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear method. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo second order model were the same. Non-linear regression analysis showed that both Blanachard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of experimental data in Ho's pseudo second order expression by linear and non-linear regression method showed that Ho pseudo second order model was a better kinetic expression when compared to other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, q(e), was predicted from Ho pseudo second order expression and were fitted to the Langmuir, Freundlich and Redlich Peterson expressions by both linear and non-linear method to obtain the pseudo isotherms. The best fitting pseudo isotherm was found to be the Langmuir and Redlich Peterson isotherm. Redlich Peterson is a special case of Langmuir when the constant g equals unity.
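
    The linear route compared in this abstract can be sketched as follows. The abstract does not print the rate expression, so the sketch assumes the commonly cited form of Ho's pseudo second order model, q_t = k q_e^2 t / (1 + k q_e t), and its linearization t/q_t = 1/(k q_e^2) + t/q_e. The function names and the synthetic, noise-free data are hypothetical and are not taken from the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pseudo_second_order(t, q_e, k):
    """Assumed form of Ho's model: uptake q_t at time t."""
    return k * q_e ** 2 * t / (1.0 + k * q_e * t)


def fit_linearized(times, uptakes):
    """Regress t/q_t on t; slope = 1/q_e and intercept = 1/(k q_e^2)."""
    slope, intercept = fit_line(times, [t / q for t, q in zip(times, uptakes)])
    q_e = 1.0 / slope
    k = slope ** 2 / intercept
    return q_e, k


# Hypothetical parameters and sampling times, for illustration only.
TRUE_Q_E, TRUE_K = 50.0, 0.002
times = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
uptakes = [pseudo_second_order(t, TRUE_Q_E, TRUE_K) for t in times]

q_e_hat, k_hat = fit_linearized(times, uptakes)
print(q_e_hat, k_hat)
```

    With noise-free data the linearized fit recovers the parameters it was generated from. With noisy data the transformation t/q_t distorts the error structure, which is one reason a non-linear fit that minimizes residuals in q_t directly can give different parameters.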

  7. Improving mass-univariate analysis of neuroimaging data by modelling important unknown covariates: Application to Epigenome-Wide Association Studies.

    PubMed

    Guillaume, Bryan; Wang, Changqing; Poh, Joann; Shen, Mo Jun; Ong, Mei Lyn; Tan, Pei Fang; Karnani, Neerja; Meaney, Michael; Qiu, Anqi

    2018-06-01

    Statistical inference on neuroimaging data is often conducted using a mass-univariate model, equivalent to fitting a linear model at every voxel with a known set of covariates. Due to the large number of linear models, it is challenging to check if the selection of covariates is appropriate and to modify this selection adequately. The use of standard diagnostics, such as residual plotting, is clearly not practical for neuroimaging data. However, the selection of covariates is crucial for linear regression to ensure valid statistical inference. In particular, the mean model of regression needs to be reasonably well specified. Unfortunately, this issue is often overlooked in the field of neuroimaging. This study aims to adopt the existing Confounder Adjusted Testing and Estimation (CATE) approach and to extend it for use with neuroimaging data. We propose a modification of CATE that can yield valid statistical inferences using Principal Component Analysis (PCA) estimators instead of Maximum Likelihood (ML) estimators. We then propose a non-parametric hypothesis testing procedure that can improve upon parametric testing. Monte Carlo simulations show that the modification of CATE allows for more accurate modelling of neuroimaging data and can in turn yield a better control of False Positive Rate (FPR) and Family-Wise Error Rate (FWER). We demonstrate its application to an Epigenome-Wide Association Study (EWAS) on neonatal brain imaging and umbilical cord DNA methylation data obtained as part of a longitudinal cohort study. Software for this CATE study is freely available at http://www.bioeng.nus.edu.sg/cfa/Imaging_Genetics2.html. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.

  8. Machine learning techniques for energy optimization in mobile embedded systems

    NASA Astrophysics Data System (ADS)

    Donohoo, Brad Kyoshi

    Mobile smartphones and other portable battery operated embedded systems (PDAs, tablets) are pervasive computing devices that have emerged in recent years as essential instruments for communication, business, and social interactions. While performance, capabilities, and design are all important considerations when purchasing a mobile device, a long battery lifetime is one of the most desirable attributes. Battery technology and capacity has improved over the years, but it still cannot keep pace with the power consumption demands of today's mobile devices. This key limiter has led to a strong research emphasis on extending battery lifetime by minimizing energy consumption, primarily using software optimizations. This thesis presents two strategies that attempt to optimize mobile device energy consumption with negligible impact on user perception and quality of service (QoS). The first strategy proposes an application and user interaction aware middleware framework that takes advantage of user idle time between interaction events of the foreground application to optimize CPU and screen backlight energy consumption. The framework dynamically classifies mobile device applications based on their received interaction patterns, then invokes a number of different power management algorithms to adjust processor frequency and screen backlight levels accordingly. The second strategy proposes the usage of machine learning techniques to learn a user's mobile device usage pattern pertaining to spatiotemporal and device contexts, and then predict energy-optimal data and location interface configurations. By learning where and when a mobile device user uses certain power-hungry interfaces (3G, WiFi, and GPS), the techniques, which include variants of linear discriminant analysis, linear logistic regression, non-linear logistic regression, and k-nearest neighbor, are able to dynamically turn off unnecessary interfaces at runtime in order to save energy.

  9. Wind tunnel test of Teledyne Geotech model 1564B cup anemometer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parker, M.J.; Addis, R.P.

    1991-04-04

    The Department of Energy (DOE) Environment, Safety and Health Compliance Assessment (Tiger Team) of the Savannah River Site (SRS) questioned the method by which wind speed sensors (cup anemometers) are calibrated by the Environmental Technology Section (ETS). The Tiger Team member was concerned that calibration data was generated by running the wind tunnel to only 26 miles per hour (mph) when speeds exceeding 50 mph are readily obtainable. A wind tunnel experiment was conducted and confirmed the validity of the practice. Wind speeds common to SRS (6 mph) were predicted more accurately by 0-25 mph regression equations than 0-50 mph regression equations. Higher wind speeds were slightly overpredicted by the 0-25 mph regression equations when compared to 0-50 mph regression equations. However, the benefit of more accurate lower wind speed predictions outweighs the benefit of slightly better high (extreme) wind speed predictions. Therefore, it is concluded that 0-25 mph regression equations should continue to be utilized by ETS at SRS. During the Department of Energy Tiger Team audit, concerns were raised about the calibration of SRS cup anemometers. Wind speed is measured by ETS with Teledyne Geotech model 1564B cup anemometers, which are calibrated in the ETS wind tunnel. Linear regression lines are fitted to data points of tunnel speed versus anemometer output voltages up to 25 mph. The regression coefficients are then implemented into the data acquisition computer software when an instrument is installed in the field. The concern raised was that since the wind tunnel at SRS is able to generate a maximum wind speed higher than 25 mph, errors may be introduced in not using the full range of the wind tunnel.

  10. Wind tunnel test of Teledyne Geotech model 1564B cup anemometer

    NASA Astrophysics Data System (ADS)

    Parker, M. J.; Addis, R. P.

    1991-04-01

    The Department of Energy (DOE) Environment, Safety, and Health Compliance Assessment (Tiger Team) of the Savannah River Site (SRS) questioned the method by which wind speed sensors (cup anemometers) are calibrated by the Environmental Technology Section (ETS). The Tiger Team member was concerned that calibration data was generated by running the wind tunnel to only 26 miles per hour (mph) when speeds exceeding 50 mph are readily obtainable. A wind tunnel experiment was conducted and confirmed the validity of the practice. Wind speeds common to SRS (6 mph) were predicted more accurately by 0-25 mph regression equations than 0-50 mph regression equations. Higher wind speeds were slightly overpredicted by the 0-25 mph regression equations when compared to 0-50 mph regression equations. However, the benefit of more accurate lower wind speed predictions outweighs the benefit of slightly better high (extreme) wind speed predictions. Therefore, it is concluded that 0-25 mph regression equations should continue to be utilized by ETS at SRS. During the Department of Energy Tiger Team audit, concerns were raised about the calibration of SRS cup anemometers. Wind speed is measured by ETS with Teledyne Geotech model 1564B cup anemometers, which are calibrated in the ETS wind tunnel. Linear regression lines are fitted to data points of tunnel speed versus anemometer output voltages up to 25 mph. The regression coefficients are then implemented into the data acquisition computer software when an instrument is installed in the field. The concern raised was that since the wind tunnel at SRS is able to generate a maximum wind speed higher than 25 mph, errors may be introduced in not using the full range of the wind tunnel.
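
    The trade-off described in the two records above (a calibration line fitted over 0-25 mph versus one fitted over 0-50 mph) can be illustrated with a minimal sketch. The voltage and speed values below are hypothetical: they assume a slightly curved sensor response invented for illustration and are not the SRS calibration data.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(slope, intercept, x, y):
    """Root-mean-square error of the line over the given points."""
    squared = [(intercept + slope * xi - yi) ** 2 for xi, yi in zip(x, y)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical tunnel runs: output voltage (V) and tunnel speed (mph).
voltages = [0.5 * i for i in range(11)]
speeds = [8.0 * v + 0.4 * v ** 2 for v in voltages]  # assumed mild curvature

# Restrict the calibration set to tunnel speeds of at most 25 mph.
low_v = [v for v, s in zip(voltages, speeds) if s <= 25.0]
low_s = [s for s in speeds if s <= 25.0]

low_fit = fit_line(low_v, low_s)         # "0-25 mph" calibration line
full_fit = fit_line(voltages, speeds)    # "0-50 mph" calibration line

print("error over 0-25 mph:", rmse(*low_fit, low_v, low_s), rmse(*full_fit, low_v, low_s))
print("error over 0-50 mph:", rmse(*low_fit, voltages, speeds), rmse(*full_fit, voltages, speeds))
```

    Whenever the response is not perfectly linear, each line is by construction the best straight line over its own fitting range, so the restricted fit wins at low speeds and the full-range fit wins overall. Which matters more depends on the speeds the instrument usually sees.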

  11. Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors

    DTIC Science & Technology

    2015-07-15

    Long-term effects on cancer survivors’ quality of life of physical training versus physical training combined with cognitive-behavioral therapy ...COMPARISON OF NEURAL NETWORK AND LINEAR REGRESSION MODELS IN STATISTICALLY PREDICTING MENTAL AND PHYSICAL HEALTH STATUS OF BREAST...Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors

  12. Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage

    NASA Astrophysics Data System (ADS)

    Cepowski, Tomasz

    2017-06-01

    The paper presents mathematical relationships that allow us to forecast the estimated main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow us to estimate the engine power based on the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimation of container ship engine power needed in preliminary parametric design of the ship. It follows from the above that the use of multiple linear regression to predict the main engine power of a container ship brings more accurate solutions than simple linear regression.

  13. Screening-level models to estimate partition ratios of organic chemicals between polymeric materials, air and water.

    PubMed

    Reppas-Chrysovitsinos, Efstathios; Sobek, Anna; MacLeod, Matthew

    2016-06-15

    Polymeric materials flowing through the technosphere are repositories of organic chemicals throughout their life cycle. Equilibrium partition ratios of organic chemicals between these materials and air (KMA) or water (KMW) are required for models of fate and transport, high-throughput exposure assessment and passive sampling. KMA and KMW have been measured for a growing number of chemical/material combinations, but significant data gaps still exist. We assembled a database of 363 KMA and 910 KMW measurements for 446 individual compounds and nearly 40 individual polymers and biopolymers, collected from 29 studies. We used the EPI Suite and ABSOLV software packages to estimate physicochemical properties of the compounds and we employed an empirical correlation based on Trouton's rule to adjust the measured KMA and KMW values to a standard reference temperature of 298 K. Then, we used a thermodynamic triangle with Henry's law constant to calculate a complete set of 1273 KMA and KMW values. Using simple linear regression, we developed a suite of single parameter linear free energy relationship (spLFER) models to estimate KMA from the EPI Suite-estimated octanol-air partition ratio (KOA) and KMW from the EPI Suite-estimated octanol-water (KOW) partition ratio. Similarly, using multiple linear regression, we developed a set of polyparameter linear free energy relationship (ppLFER) models to estimate KMA and KMW from ABSOLV-estimated Abraham solvation parameters. We explored the two LFER approaches to investigate (1) their performance in estimating partition ratios, and (2) uncertainties associated with treating all different polymers as a single "bulk" polymeric material compartment. The models we have developed are suitable for screening assessments of the tendency for organic chemicals to be emitted from materials, and for use in multimedia models of the fate of organic chemicals in the indoor environment. 
In screening applications we recommend that KMA and KMW be modeled as 0.06 × KOA and 0.06 × KOW respectively, with an uncertainty range of a factor of 15.
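
    The screening recommendation above amounts to a one-line calculation. The sketch below is an illustration only: the function name is hypothetical, and it assumes that "a factor of 15" means dividing and multiplying the central estimate by 15, which the abstract does not spell out.

```python
def screening_partition_ratio(k_octanol, scale=0.06, uncertainty_factor=15.0):
    """Screening estimate of KMA (from KOA) or KMW (from KOW).

    Returns (lower, central, upper), where the bounds are the central
    estimate divided and multiplied by the uncertainty factor (assumed
    interpretation of "a factor of 15").
    """
    central = scale * k_octanol
    return central / uncertainty_factor, central, central * uncertainty_factor


# Hypothetical compound with an octanol-air partition ratio KOA of 1e8.
lower, central, upper = screening_partition_ratio(1.0e8)
print(lower, central, upper)
```

    For the hypothetical KOA of 1e8 the central estimate is about 6e6, with bounds of about 4e5 and 9e7.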

  14. Estimation of Standard Error of Regression Effects in Latent Regression Models Using Binder's Linearization. Research Report. ETS RR-07-09

    ERIC Educational Resources Information Center

    Li, Deping; Oranje, Andreas

    2007-01-01

    Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…

  15. Regression assumptions in clinical psychology research practice-a systematic review of common misconceptions.

    PubMed

    Ernst, Anja F; Albers, Casper J

    2017-01-01

    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.

  16. Regression assumptions in clinical psychology research practice—a systematic review of common misconceptions

    PubMed Central

    Ernst, Anja F.

    2017-01-01

    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking. PMID:28533971

  17. Estimating linear temporal trends from aggregated environmental monitoring data

    USGS Publications Warehouse

    Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.

    2017-01-01

    Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs is often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variations to be separated. We used simulated time-series to compare linear trend estimations from three state-space models, a simple linear regression model, and an auto-regressive model. We also compared the performance of these five models to estimate trends from a long term monitoring program. We specifically estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than more complex autoregression and state-space models when used to analyze aggregated environmental monitoring data.

  18. Comparing The Effectiveness of a90/95 Calculations (Preprint)

    DTIC Science & Technology

    2006-09-01

    Nachtsheim, John Neter, William Li, Applied Linear Statistical Models, 5th ed., McGraw-Hill/Irwin, 2005 5. Mood, Graybill and Boes, Introduction...curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model There...linear. For example is a linear model; is not. 2. Uniform variance (homoscedasticity

  19. Correlation and simple linear regression.

    PubMed

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
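
    The difference between the two coefficients reviewed in this tutorial can be seen in a small sketch. The data are hypothetical (a monotone but non-linear relationship), and the helper names are assumptions made for illustration. Spearman's rho is computed here as the Pearson coefficient of the ranks, with tied values given their average rank.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rho as the Pearson coefficient of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y increases with x, but not along a straight line.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [xi ** 3 for xi in x]

r = pearson(x, y)
rho = spearman(x, y)
print(r, rho)
```

    Because y always increases with x, the ranks agree perfectly and rho equals 1, while the Pearson coefficient stays below 1 because the points do not lie on a straight line.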

  20. The Effects of Multiple Linked Representations on Students' Learning of Linear Relationships

    ERIC Educational Resources Information Center

    Ozgun-Koca, S. Asli

    2004-01-01

    The focus of this study was on comparing three groups of Algebra I 9th-year students: one group using linked representation software, the second group using similar software but with semi-linked representations, and the control group in order to examine the effects on students' understanding of linear relationships. Data collection methods…

  1. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    PubMed

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

    A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
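
    As a rough illustration of the two techniques named in this abstract, the sketch below computes Bland-Altman bias with 95% limits of agreement and a Deming regression line. It is not the authors' implementation: the function names, the default error-variance ratio of 1, and the paired values (constructed with a constant error of 2 and a proportional error of 10%) are assumptions made for illustration.

```python
import math
import statistics


def bland_altman(reference, candidate):
    """Mean difference (bias) and 95% limits of agreement."""
    diffs = [c - r for r, c in zip(reference, candidate)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


def deming(x, y, delta=1.0):
    """Deming regression of y on x.

    delta is the assumed ratio of the error variance in y to the error
    variance in x; delta = 1 gives orthogonal regression.
    """
    n = len(x)
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    gap = syy - delta * sxx
    slope = (gap + math.sqrt(gap ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return slope, mean_y - slope * mean_x


# Hypothetical paired measurements from a reference and a candidate assay.
reference = [10.0, 20.0, 30.0, 40.0, 50.0]
candidate = [2.0 + 1.1 * r for r in reference]

bias, lower, upper = bland_altman(reference, candidate)
slope, intercept = deming(reference, candidate)
print(bias, lower, upper)
print(slope, intercept)
```

    In this constructed case the Deming intercept recovers the constant error and the slope recovers the proportional error, while the Bland-Altman bias summarizes their combined effect over the measured range.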

  2. Mathcad in the Chemistry Curriculum Symbolic Software in the Chemistry Curriculum

    NASA Astrophysics Data System (ADS)

    Zielinski, Theresa Julia

    2000-05-01

    Physical chemistry is such a broad discipline that the topics we expect average students to complete in two semesters usually exceed their ability for meaningful learning. Consequently, the number and kind of topics and the efficiency with which students can learn them are important concerns. What topics are essential and what can we do to provide efficient and effective access to those topics? How do we accommodate the fact that students come to upper-division chemistry courses with a variety of nonuniformly distributed skills, a bit of calculus, and some physics studied one or more years before physical chemistry? The critical balance between depth and breadth of learning in courses and curricula may be achieved through appropriate use of technology and especially through the use of symbolic mathematics software. Software programs such as Mathcad, Mathematica, and Maple, however, have learning curves that diminish their effectiveness for novices. There are several ways to address the learning curve conundrum. First, basic instruction in the software provided during laboratory sessions should be followed by requiring laboratory reports that use the software. Second, one should assign weekly homework that requires the software and builds student skills within the discipline and with the software. Third, a complementary method, supported by this column, is to provide students with Mathcad worksheets or templates that focus on one set of related concepts and incorporate a variety of features of the software that they are to use to learn chemistry. In this column we focus on two significant topics for young chemists. The first is curve-fitting and the statistical analysis of the fitting parameters. The second is the analysis of the rotation/vibration spectrum of a diatomic molecule, HCl. A broad spectrum of Mathcad documents exists for teaching chemistry. One collection of 50 documents can be found at http://www.monmouth.edu/~tzielins/mathcad/Lists/index.htm. 
Another collection of peer-reviewed documents is developing through this column at the JCE Internet Web site, http://jchemed.chem.wisc.edu/JCEWWW/Features/McadInChem/index.html. With this column we add three peer-reviewed and tested Mathcad documents to the JCE site. In Linear Least-Squares Regression, Sidney H. Young and Andrzej Wierzbicki demonstrate various implicit and explicit methods for determining the slope and intercept of the regression line for experimental data. The document shows how to determine the standard deviation for the slope, the intercept, and the standard deviation of the overall fit. Students are next given the opportunity to examine the confidence level for the fit through the Student's t-test. Examination of the residuals of the fit leads students to explore the possibility of rejecting points in a set of data. The document concludes with a discussion of and practice with adding a quadratic term to create a polynomial fit to a set of data and how to determine if the quadratic term is statistically significant. There is full documentation of the various steps used throughout the exposition of the statistical concepts. Although the statistical methods presented in this worksheet are generally accessible to average physical chemistry students, an instructor would be needed to explain the finer points of the matrix methods used in some sections of the worksheet. The worksheet is accompanied by a set of data for students to use to practice the techniques presented. It would be worthwhile for students to spend one or two laboratory periods learning to use the concepts presented and then to apply them to experimental data they have collected for themselves. Any linear or linearizable data set would be appropriate for use with this Mathcad worksheet. Alternatively, instructors may select sections of the document suited to the skill level of their students and the laboratory tasks at hand. 
In a second Mathcad document, Non-Linear Least-Squares Regression, Young and Wierzbicki introduce the basic concepts of nonlinear curve-fitting and develop the techniques needed to fit a variety of mathematical functions to experimental data. This approach is especially important when mathematical models for chemical processes cannot be linearized. In Mathcad the Levenberg-Marquardt algorithm is used to determine the best fitting parameters for a particular mathematical model. As in linear least-squares, the goal of the fitting process is to find the values for the fitting parameters that minimize the sum of the squares of the deviations between the data and the mathematical model. Students are asked to determine the fitting parameters, use the Hessian matrix to compute the standard deviation of the fitting parameters, test for the significance of the parameters using Student's t-test, use residual analysis to test for data points to remove, and repeat the calculations for another set of data. The nonlinear least-squares procedure follows closely on the pattern set up for linear least-squares by the same authors (see above). If students master the linear least-squares worksheet content they will be able to master the nonlinear least-squares technique (see also refs 1, 2). In the third document, The Analysis of the Vibrational Spectrum of a Linear Molecule by Richard Schwenz, William Polik, and Sidney Young, the authors build on the concepts presented in the curve fitting worksheets described above. This vibrational analysis document, which supports a classic experiment performed in the physical chemistry laboratory, shows how a Mathcad worksheet can increase the efficiency by which a set of complicated manipulations for data reduction can be made more accessible for students. 
The increase in efficiency frees up time for students to develop a fuller understanding of the physical chemistry concepts important to the interpretation of spectra and understanding of bond vibrations in general. The analysis of the vibration/rotation spectrum for a linear molecule worksheet builds on the rich literature for this topic (3). Before analyzing their own spectral data, students practice and learn the concepts and methods of the HCl spectral analysis by using the fundamental and first harmonic vibrational frequencies provided by the authors. This approach has a fundamental pedagogical advantage. Most explanations in laboratory texts are very concise and lack mathematical details required by average students. This Mathcad worksheet acts as a tutor; it guides students through the essential concepts for data reduction and lets them focus on learning important spectroscopic concepts. The Mathcad worksheet is amply annotated. Students who have moderate skill with the software and have learned about regression analysis from the curve-fitting worksheets described in this column will be able to complete and understand their analysis of the IR spectrum of HCl. The three Mathcad worksheets described here stretch the physical chemistry curriculum by presenting important topics in forms that students can use with only moderate Mathcad skills. The documents facilitate learning by giving students opportunities to interact with the material in meaningful ways in addition to using the documents as sources of techniques for building their own data-reduction worksheets. However, working through these Mathcad worksheets is not a trivial task for the average student. Support needs to be provided by the instructor to ease students through more advanced mathematical and Mathcad processes. These worksheets raise the question of how much we can ask diligent students to do in one course and how much time they need to spend to master the essential concepts of that course. 
The Mathcad documents and associated PDF versions are available at the JCE Internet WWW site. The Mathcad documents require Mathcad version 6.0 or higher and the PDF files require Adobe Acrobat. Every effort has been made to make the documents fully compatible across the various Mathcad versions. Users may need to refer to Mathcad manuals for functions that vary with the Mathcad version number. Literature Cited 1. Bevington, P. R. Data Reduction and Error Analysis for the Physical Sciences; McGraw-Hill: New York, 1969. 2. Zielinski, T. J.; Allendoerfer, R. D. J. Chem. Educ. 1997, 74, 1001. 3. Schwenz, R. W.; Polik, W. F. J. Chem. Educ. 1999, 76, 1302.
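
The fitting procedure described in this record (minimize the sum of squared deviations with a Levenberg-Marquardt iteration, then estimate parameter standard deviations from the curvature matrix) can be sketched outside Mathcad. The following is a minimal Python illustration, not the authors' worksheet: the exponential model, the function names, the damping schedule, and the use of the Gauss-Newton approximation J^T J in place of the full Hessian are all assumptions made for the example.

```python
import math

def model(x, a, b):
    """Example model (an assumption): exponential decay y = a * exp(-b * x)."""
    return a * math.exp(-b * x)

def sse(xs, ys, a, b):
    """Sum of squared deviations between the data and the model."""
    return sum((y - model(x, a, b)) ** 2 for x, y in zip(xs, ys))

def normal_terms(xs, ys, a, b):
    """Return J^T J as (h11, h12, h22) and J^T r as (g1, g2)."""
    h11 = h12 = h22 = g1 = g2 = 0.0
    for x, y in zip(xs, ys):
        e = math.exp(-b * x)
        r = y - a * e                 # residual
        ja, jb = e, -a * x * e        # d(model)/da, d(model)/db
        h11 += ja * ja
        h12 += ja * jb
        h22 += jb * jb
        g1 += ja * r
        g2 += jb * r
    return h11, h12, h22, g1, g2

def fit_lm(xs, ys, a, b, lam=1e-3, max_iter=200, tol=1e-12):
    """Levenberg-Marquardt for the two-parameter model above."""
    for _ in range(max_iter):
        h11, h12, h22, g1, g2 = normal_terms(xs, ys, a, b)
        # Damped normal equations: (J^T J + lam * diag(J^T J)) d = J^T r
        m11, m22 = h11 * (1 + lam), h22 * (1 + lam)
        det = m11 * m22 - h12 * h12
        da = (g1 * m22 - h12 * g2) / det
        db = (m11 * g2 - h12 * g1) / det
        if max(abs(da), abs(db)) < tol:
            break
        if sse(xs, ys, a + da, b + db) < sse(xs, ys, a, b):
            a, b, lam = a + da, b + db, lam / 10   # accept step, relax damping
        else:
            lam *= 10                              # reject step, increase damping
    # Standard errors from s^2 * (J^T J)^-1 at the solution
    h11, h12, h22, _, _ = normal_terms(xs, ys, a, b)
    det = h11 * h22 - h12 * h12
    s2 = sse(xs, ys, a, b) / (len(xs) - 2)
    return a, b, math.sqrt(s2 * h22 / det), math.sqrt(s2 * h11 / det)

# Synthetic data: true a = 5, b = 0.7, plus a small alternating perturbation.
xs = [0.5 * i for i in range(10)]
ys = [5.0 * math.exp(-0.7 * x) + 0.01 * (-1) ** i for i, x in enumerate(xs)]
a_fit, b_fit, se_a, se_b = fit_lm(xs, ys, a=3.0, b=0.3)
print(a_fit, b_fit, se_a, se_b)
```

Dividing each fitted parameter by its standard error gives the t statistic that the worksheet asks students to test for significance.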

  3. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
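
    As an illustration of class (1), an unweighted regression line with bootstrap resampling, the sketch below fits an ordinary least-squares line and estimates the slope uncertainty by resampling (x, y) pairs with replacement. It is a generic textbook version, not the authors' code; the function names, the number of resamples, and the fixed seed are assumptions.

```python
import random

def ols(xs, ys):
    """Unweighted least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def bootstrap_slope_se(xs, ys, n_boot=1000, seed=0):
    """Standard error of the slope from resampling data pairs with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: the slope is undefined
        by = [ys[i] for i in idx]
        slopes.append(ols(bx, by)[0])
    mean = sum(slopes) / len(slopes)
    var = sum((s - mean) ** 2 for s in slopes) / (len(slopes) - 1)
    return var ** 0.5

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 2.9, 5.1, 7.2, 8.8, 11.1]
print(ols(xs, ys), bootstrap_slope_se(xs, ys))
```

    The jackknife variant mentioned in the abstract would replace the random resamples with the n leave-one-out subsets.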

  4. LHCb Build and Deployment Infrastructure for run 2

    NASA Astrophysics Data System (ADS)

    Clemencic, M.; Couturier, B.

    2015-12-01

    After the successful run 1 of the LHC, the LHCb Core software team has taken advantage of the long shutdown to consolidate and improve its build and deployment infrastructure. Several of the related projects have already been presented, such as the build system using Jenkins and the LHCb Performance and Regression testing infrastructure. Some components are completely new, like the Software Configuration Database (using the Graph DB Neo4j), or the new packaging installation using RPM packages. Furthermore, all those parts are integrated to allow easier and quicker releases of the LHCb Software stack, thereby reducing the risk of operational errors. Integration and Regression tests are also now easier to implement, allowing further improvement of the software checks.

  5. A Constrained Linear Estimator for Multiple Regression

    ERIC Educational Resources Information Center

    Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.

    2010-01-01

    "Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…

  6. A class of non-linear exposure-response models suitable for health impact assessment applicable to large cohort studies of ambient air pollution.

    PubMed

    Nasari, Masoud M; Szyszkowicz, Mieczysław; Chen, Hong; Crouse, Daniel; Turner, Michelle C; Jerrett, Michael; Pope, C Arden; Hubbell, Bryan; Fann, Neal; Cohen, Aaron; Gapstur, Susan M; Diver, W Ryan; Stieb, David; Forouzanfar, Mohammad H; Kim, Sun-Young; Olives, Casey; Krewski, Daniel; Burnett, Richard T

    2016-01-01

    The effectiveness of regulatory actions designed to improve air quality is often assessed by predicting changes in public health resulting from their implementation. Risk of premature mortality from long-term exposure to ambient air pollution is the single most important contributor to such assessments and is estimated from observational studies generally assuming a log-linear, no-threshold association between ambient concentrations and death. There has been only limited assessment of this assumption in part because of a lack of methods to estimate the shape of the exposure-response function in very large study populations. In this paper, we propose a new class of variable coefficient risk functions capable of capturing a variety of potentially non-linear associations which are suitable for health impact assessment. We construct the class by defining transformations of concentration as the product of either a linear or log-linear function of concentration multiplied by a logistic weighting function. These risk functions can be estimated using hazard regression survival models with currently available computer software and can accommodate large population-based cohorts which are increasingly being used for this purpose. We illustrate our modeling approach with two large cohort studies of long-term concentrations of ambient air pollution and mortality: the American Cancer Society Cancer Prevention Study II (CPS II) cohort and the Canadian Census Health and Environment Cohort (CanCHEC). We then estimate the number of deaths attributable to changes in fine particulate matter concentrations over the 2000 to 2010 time period in both Canada and the USA using both linear and non-linear hazard function models.
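
    The construction described above, a linear or log-linear function of concentration multiplied by a logistic weighting function, can be written down in a few lines. The sketch below is only an illustration of that functional form: the parameter names `mu` and `tau`, the use of log(z + 1) for the log-linear case, and the hazard ratio taken relative to zero concentration are assumptions, not details given in the abstract.

```python
import math

def logistic_weight(z, mu, tau):
    """Logistic weighting function; mu (location) and tau (scale) are assumed names."""
    return 1.0 / (1.0 + math.exp(-(z - mu) / tau))

def transformed_concentration(z, mu, tau, log_form=False):
    """Linear (z) or log-linear (log(z + 1), an assumption) term times the logistic weight."""
    base = math.log(z + 1.0) if log_form else z
    return base * logistic_weight(z, mu, tau)

def hazard_ratio(z, beta, mu, tau, log_form=False):
    """Hazard ratio relative to zero concentration for a fitted coefficient beta."""
    return math.exp(beta * transformed_concentration(z, mu, tau, log_form))

# Hypothetical parameter values, chosen only to show the shape of the curve.
for z in (0.0, 5.0, 10.0, 20.0):
    print(z, round(hazard_ratio(z, beta=0.01, mu=10.0, tau=2.0), 4))
```

    In a hazard regression, the transformed concentration would enter as the covariate, with `beta` estimated by the survival model and `mu`, `tau` chosen or searched over separately.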

  7. Implementation of software-based sensor linearization algorithms on low-cost microcontrollers.

    PubMed

    Erdem, Hamit

    2010-10-01

    Nonlinear sensors and microcontrollers are used in many embedded system designs. As the input-output characteristic of most sensors is nonlinear in nature, obtaining data from a nonlinear sensor by using an integer microcontroller has always been a design challenge. This paper discusses the implementation of six software-based sensor linearization algorithms for low-cost microcontrollers. The comparative study of the linearization algorithms is performed by using a nonlinear optical distance-measuring sensor. The performance of the algorithms is examined with respect to memory space usage, linearization accuracy and algorithm execution time. The implementation and comparison results can be used for selection of a linearization algorithm based on the sensor transfer function, expected linearization accuracy and microcontroller capacity. Copyright © 2010 ISA. Published by Elsevier Ltd. All rights reserved.
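
    The abstract does not name the six algorithms. One widely used software linearization technique on integer microcontrollers is a calibration lookup table with piecewise-linear interpolation, sketched below in Python with integer-only arithmetic. Whether this is one of the six methods compared in the paper is not stated; the table values and names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical calibration table: (ADC count, distance in mm), sorted by ADC count.
TABLE = [(100, 800), (200, 400), (400, 200), (800, 100)]
ADC_POINTS = [adc for adc, _ in TABLE]

def linearize(adc):
    """Map a raw ADC reading to distance using integer piecewise-linear interpolation."""
    if adc <= ADC_POINTS[0]:
        return TABLE[0][1]           # clamp below the calibrated range
    if adc >= ADC_POINTS[-1]:
        return TABLE[-1][1]          # clamp above the calibrated range
    i = bisect_right(ADC_POINTS, adc) - 1
    x0, y0 = TABLE[i]
    x1, y1 = TABLE[i + 1]
    # Integer-only arithmetic, as on a microcontroller without floating point.
    return y0 + (y1 - y0) * (adc - x0) // (x1 - x0)

print(linearize(150), linearize(300))
```

    More table entries improve accuracy at the cost of memory, which is the kind of trade-off the paper evaluates alongside execution time.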

  8. Lumbar subcutaneous edema and degenerative spinal disease in patients with low back pain: a retrospective MRI study.

    PubMed

    Quattrocchi, C C; Giona, A; Di Martino, A; Gaudino, F; Mallio, C A; Errante, Y; Occhicone, F; Vitali, M A; Zobel, B B; Denaro, V

    2015-08-01

    This study was designed to determine the association between LSE, spondylolisthesis, facet arthropathy, lumbar canal stenosis, BMI, radiculopathy and bone marrow edema at conventional lumbar spine MR imaging. This is a retrospective radiological study; 441 consecutive patients with low back pain (224 men and 217 women; mean age 57.3 years; mean BMI 26) underwent conventional lumbar MRI using a 1.5-T magnet (Avanto, Siemens). Lumbar MR images were reviewed by consensus for the presence of LSE, spondylolisthesis, facet arthropathy, lumbar canal stenosis, radiculopathy and bone marrow edema. Descriptive statistics and association studies were conducted using STATA software 11.0. Association studies have been performed using linear univariate regression analysis and multivariate regression analysis, considering LSE as response variable. The overall prevalence of LSE was 40%; spondylolisthesis (p = 0.01), facet arthropathy (p < 0.001), BMI (p = 0.008) and lumbar canal stenosis (p < 0.001) were included in the multivariate regression model, whereas bone marrow edema, radiculopathy and age were not. LSE is highly associated with spondylolisthesis, facet arthropathy and BMI, suggesting underestimation of its clinical impact as an integral component in chronic lumbar back pain. Longitudinal simultaneous X-ray/MRI studies should be conducted to test the relationship of LSE with lumbar spinal instability and low back pain.

  9. On the design of classifiers for crop inventories

    NASA Technical Reports Server (NTRS)

    Heydorn, R. P.; Takacs, H. C.

    1986-01-01

    Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.

  10. LIMAO: Cross-platform software for simulating laser-induced alignment and orientation dynamics of linear-, symmetric- and asymmetric tops

    NASA Astrophysics Data System (ADS)

    Szidarovszky, Tamás; Jono, Maho; Yamanouchi, Kaoru

    2018-07-01

    A user-friendly and cross-platform software called Laser-Induced Molecular Alignment and Orientation simulator (LIMAO) has been developed. The program can be used to simulate within the rigid rotor approximation the rotational dynamics of gas phase molecules induced by linearly polarized intense laser fields at a given temperature. The software is implemented in the Java and Mathematica programming languages. The primary aim of LIMAO is to aid experimental scientists in predicting and analyzing experimental data representing laser-induced spatial alignment and orientation of molecules.

  11. Density conversion factor determined using a cone-beam computed tomography unit NewTom QR-DVT 9000.

    PubMed

    Lagravère, M O; Fang, Y; Carey, J; Toogood, R W; Packota, G V; Major, P W

    2006-11-01

    The purpose of this study was to determine a conversion coefficient for Hounsfield Units (HU) to material density (g cm^-3) obtained from cone-beam computed tomography (CBCT-NewTom QR-DVT 9000) data. Six cylindrical models of materials with different densities were made and scanned using the NewTom QR-DVT 9000 Volume Scanner. The raw data were converted into DICOM format and analysed using Merge eFilm and AMIRA to determine the HU of different areas of the models. There was no significant difference (P = 0.846) between the HU given by each piece of software. A linear regression was performed using the density, rho (g cm^-3), as the dependent variable in terms of the HU (H). The regression equation obtained was rho = 0.002H - 0.381 with an R^2 value of 0.986. The standard error of the estimation is 27.104 HU in the case of the Hounsfield Units and 0.064 g cm^-3 in the case of density. CBCT provides an effective option for determination of material density expressed as Hounsfield Units.
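
    The reported regression equation can be applied directly as a conversion function. The short sketch below uses the coefficients exactly as printed in the abstract (they are rounded and specific to the scanner studied); the function names are made up for the example.

```python
SLOPE = 0.002       # g cm^-3 per HU, as reported in the abstract
INTERCEPT = -0.381  # g cm^-3, as reported in the abstract

def hu_to_density(hu):
    """Material density in g cm^-3 from a Hounsfield Unit value."""
    return SLOPE * hu + INTERCEPT

def density_to_hu(rho):
    """Inverse mapping: Hounsfield Unit value for a given density."""
    return (rho - INTERCEPT) / SLOPE

print(hu_to_density(1000))
```

    With a reported standard error of 0.064 g cm^-3, converted densities should be read as estimates with roughly that uncertainty.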

  12. Vertical bone measurements from cone beam computed tomography images using different software packages.

    PubMed

    Vasconcelos, Taruska Ventorini; Neves, Frederico Sampaio; Moraes, Lívia Almeida Bueno; Freitas, Deborah Queiroz

    2015-01-01

    This article aimed at comparing the accuracy of linear measurement tools of different commercial software packages. Eight fully edentulous dry mandibles were selected for this study. Incisor, canine, premolar, first molar and second molar regions were selected. Cone beam computed tomography (CBCT) images were obtained with i-CAT Next Generation. Linear bone measurements were performed by one observer on the cross-sectional images using three different software packages: XoranCat®, OnDemand3D® and KDIS3D®, all able to assess DICOM images. In addition, 25% of the sample was reevaluated for the purpose of reproducibility. The mandibles were sectioned to obtain the gold standard for each region. Intraclass coefficients (ICC) were calculated to examine the agreement between the two periods of evaluation; the one-way analysis of variance performed with the post-hoc Dunnett test was used to compare each of the software-derived measurements with the gold standard. The ICC values were excellent for all software packages. The least difference between the software-derived measurements and the gold standard was obtained with the OnDemand3D and KDIS3D (-0.11 and -0.14 mm, respectively), and the greatest, with the XoranCAT (+0.25 mm). However, there was no statistically significant difference between the measurements obtained with the different software packages and the gold standard (p > 0.05). In conclusion, linear bone measurements were not influenced by the software package used to reconstruct the image from CBCT DICOM data.

  13. Controls/CFD Interdisciplinary Research Software Generates Low-Order Linear Models for Control Design From Steady-State CFD Results

    NASA Technical Reports Server (NTRS)

    Melcher, Kevin J.

    1997-01-01

    The NASA Lewis Research Center is developing analytical methods and software tools to create a bridge between the controls and computational fluid dynamics (CFD) disciplines. Traditionally, control design engineers have used coarse nonlinear simulations to generate information for the design of new propulsion system controls. However, such traditional methods are not adequate for modeling the propulsion systems of complex, high-speed vehicles like the High Speed Civil Transport. To properly model the relevant flow physics of high-speed propulsion systems, one must use simulations based on CFD methods. Such CFD simulations have become useful tools for engineers that are designing propulsion system components. The analysis techniques and software being developed as part of this effort are an attempt to evolve CFD into a useful tool for control design as well. One major aspect of this research is the generation of linear models from steady-state CFD results. CFD simulations, often used during the design of high-speed inlets, yield high resolution operating point data. Under a NASA grant, the University of Akron has developed analytical techniques and software tools that use these data to generate linear models for control design. The resulting linear models have the same number of states as the original CFD simulation, so they are still very large and computationally cumbersome. Model reduction techniques have been successfully applied to reduce these large linear models by several orders of magnitude without significantly changing the dynamic response. The result is an accurate, easy to use, low-order linear model that takes less time to generate than those generated by traditional means. The development of methods for generating low-order linear models from steady-state CFD is most complete at the one-dimensional level, where software is available to generate models with different kinds of input and output variables. 
One-dimensional methods have been extended somewhat so that linear models can also be generated from two- and three-dimensional steady-state results. Standard techniques are adequate for reducing the order of one-dimensional CFD-based linear models. However, reduction of linear models based on two- and three-dimensional CFD results is complicated by very sparse, ill-conditioned matrices. Some novel approaches are being investigated to solve this problem.

  14. Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

    PubMed Central

    Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

    2012-01-01

    Purpose: To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods: Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R^2. Hypothesis testing of coefficients over the patient population was performed. Results: Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R^2 from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions: Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

  15. Summary of Documentation for DYNA3D-ParaDyn's Software Quality Assurance Regression Test Problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zywicz, Edward

    The Software Quality Assurance (SQA) regression test suite for DYNA3D (Zywicz and Lin, 2015) and ParaDyn (DeGroot, et al., 2015) currently contains approximately 600 problems divided into 21 suites, and is a required component of ParaDyn’s SQA plan (Ferencz and Oliver, 2013). The regression suite allows developers to ensure that software modifications do not unintentionally alter the code response. The entire regression suite is run prior to permanently incorporating any software modification or addition. When code modifications alter test problem results, the specific cause must be determined and fully understood before the software changes and revised test answers can be incorporated. The regression suite is executed on LLNL platforms using a Python script and an associated data file. The user specifies the DYNA3D or ParaDyn executable, number of processors to use, test problems to run, and other options to the script. The data file details how each problem and its answer extraction scripts are executed. For each problem in the regression suite there exists an input deck, an eight-processor partition file, an answer file, and various extraction scripts. These scripts assemble a temporary answer file in a specific format from the simulation results. The temporary and stored answer files are compared to a specific level of numerical precision, and when differences are detected the test problem is flagged as failed. Presently, numerical results are stored and compared to 16 digits. At this accuracy level different processor types, compilers, number of partitions, etc. impact the results to various degrees. Thus, for consistency purposes the regression suite is run with ParaDyn using 8 processors on machines with a specific processor type (currently the Intel Xeon E5530 processor). For non-parallel regression problems, i.e., the two XFEM problems, DYNA3D is used instead. 
When environments or platforms change, executables using the current source code and the new resource are created and the regression suite is run. If differences in answers arise, the new answers are retained provided that the differences are inconsequential. This bootstrap approach allows the test suite answers to evolve in a controlled manner with a high level of confidence. Developers also run the entire regression suite with (serial) DYNA3D. While these results normally differ from the stored (parallel) answers, abnormal termination or wildly different values are strong indicators of potential issues.
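
The comparison step described in this record, checking temporary answers against stored answers to a fixed number of digits and flagging a problem as failed on any difference, can be sketched as follows. This is not the LLNL script, whose file formats and options are not given in the abstract; the function names, the dictionary layout, and the problem names are hypothetical.

```python
def same_to_digits(a, b, digits=16):
    """True if a and b agree when written with `digits` significant digits."""
    fmt = f".{digits - 1}e"   # scientific notation: 1 leading digit + (digits - 1) decimals
    return format(a, fmt) == format(b, fmt)

def failed_problems(stored, new, digits=16):
    """Return the names of test problems whose new answers differ from the stored ones."""
    failed = []
    for name, values in stored.items():
        candidate = new.get(name)
        if (candidate is None
                or len(candidate) != len(values)
                or not all(same_to_digits(s, c, digits)
                           for s, c in zip(values, candidate))):
            failed.append(name)
    return failed

# Hypothetical answers for two problems.
stored = {"bar_impact": [1.0, 2.5], "plate": [3.14159]}
new = {"bar_impact": [1.0, 2.5], "plate": [3.14160]}
print(failed_problems(stored, new, digits=6))
print(failed_problems(stored, new, digits=4))
```

Comparing at 16 digits is close to the limit of double precision, which is consistent with the abstract's remark that processor type, compiler, and partition count all affect whether answers match.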

  16. ESS++: a C++ objected-oriented algorithm for Bayesian stochastic search model exploration

    PubMed Central

    Bottolo, Leonardo; Langley, Sarah R.; Petretto, Enrico; Tiret, Laurence; Tregouet, David; Richardson, Sylvia

    2011-01-01

    Summary: ESS++ is a C++ implementation of a fully Bayesian variable selection approach for single and multiple response linear regression. ESS++ works well both when the number of observations is larger than the number of predictors and in the ‘large p, small n’ case. In the current version, ESS++ can handle several hundred observations, thousands of predictors and a few responses simultaneously. The core engine of ESS++ for the selection of relevant predictors is based on Evolutionary Monte Carlo. Our implementation is open source, allowing community-based alterations and improvements. Availability: C++ source code and documentation including compilation instructions are available under GNU licence at http://bgx.org.uk/software/ESS.html. Contact: l.bottolo@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21233165

  17. Linear regression analysis of survival data with missing censoring indicators.

    PubMed

    Wang, Qihua; Dinse, Gregg E

    2011-04-01

    Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.

  18. An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.

    DTIC Science & Technology

    1983-09-01

    books Applied Linear Regression [Ref. 39], and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event...Indexes, Master’s Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Sanford, Applied Linear Regression, Wiley, 1980. 40

  19. Testing hypotheses for differences between linear regression lines

    Treesearch

    Stanley J. Zarnoch

    2009-01-01

    Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
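
    The full-versus-reduced-model idea can be illustrated for one common hypothesis, that two simple regression lines coincide. The sketch below is a generic textbook F statistic, not the paper's contrast approach or its specific five hypotheses: the full model fits a separate line to each group (4 parameters), the reduced model fits one common line (2 parameters). Function names and data are made up.

```python
def fit_sse(xs, ys):
    """Residual sum of squares of the least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))

def f_coincident_lines(x1, y1, x2, y2):
    """F statistic (2 and n1 + n2 - 4 degrees of freedom) for H0: one common line."""
    sse_full = fit_sse(x1, y1) + fit_sse(x2, y2)                     # separate lines
    sse_reduced = fit_sse(list(x1) + list(x2), list(y1) + list(y2))  # common line
    df_full = len(x1) + len(x2) - 4
    return ((sse_reduced - sse_full) / 2) / (sse_full / df_full)

x = [0.0, 1.0, 2.0, 3.0, 4.0]
group_a = [1.1, 2.9, 5.2, 6.8, 9.1]
group_b = [0.1, 5.2, 9.9, 15.1, 19.8]
print(f_coincident_lines(x, group_a, x, group_b))
```

    A large F relative to the F(2, n1 + n2 - 4) distribution is evidence against a single common line; testing equal slopes or equal intercepts alone would use a different reduced model.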

  20. Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.

    ERIC Educational Resources Information Center

    Schafer, William D.; Wang, Yuh-Yin

    A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…

  1. Teaching the Concept of Breakdown Point in Simple Linear Regression.

    ERIC Educational Resources Information Center

    Chan, Wai-Sum

    2001-01-01

    Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…
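
    A small numerical experiment makes the influence of a single extreme point concrete. In the sketch below, one corrupted observation moves the ordinary least-squares slope far from its true value, while a median-of-pairwise-slopes estimator (Theil-Sen, chosen here only as a familiar high-breakdown contrast; the abstract does not say which estimators the article uses) is unaffected.

```python
from itertools import combinations
from statistics import median

def ols_slope(xs, ys):
    """Ordinary least-squares slope."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def theil_sen_slope(xs, ys):
    """Median of the slopes through all pairs of points with distinct x."""
    pts = list(zip(xs, ys))
    return median((y2 - y1) / (x2 - x1)
                  for (x1, y1), (x2, y2) in combinations(pts, 2) if x1 != x2)

xs = list(range(10))
clean = [2 * x + 1 for x in xs]      # exact line y = 2x + 1
corrupted = clean[:-1] + [1000]      # a single wild observation at x = 9

print(ols_slope(xs, clean), ols_slope(xs, corrupted))
print(theil_sen_slope(xs, clean), theil_sen_slope(xs, corrupted))
```

    Because the corrupted value can be made arbitrarily large, a single point out of n can move the least-squares slope arbitrarily far, which is the sense in which its breakdown point is 1/n.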

  2. Solving large mixed linear models using preconditioned conjugate gradient iteration.

    PubMed

    Strandén, I; Lidauer, M

    1999-12-01

    Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were reordered to three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20% and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
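
    For readers unfamiliar with the solver, the sketch below shows a generic preconditioned conjugate gradient iteration with a diagonal (Jacobi) preconditioner. The coefficient matrix is accessed only through a matrix-vector product function, which is the hook where an iteration-on-data scheme would compute the product from records instead of a stored matrix. It does not reproduce the paper's three-step technique; the names and the tiny test system are made up.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(matvec, b, diag, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    matvec(v) returns A v; diag holds the diagonal of A (Jacobi preconditioner).
    """
    x = [0.0] * len(b)
    r = list(b)                                   # residual b - A x for x = 0
    z = [ri / di for ri, di in zip(r, diag)]      # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break                                 # relative residual small enough
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

# Tiny symmetric positive definite system standing in for mixed model equations.
A = [[4.0, 1.0], [1.0, 3.0]]
solution = pcg(lambda v: [dot(row, v) for row in A], [1.0, 2.0],
               [A[i][i] for i in range(2)])
print(solution)
```

    Each iteration needs one matrix-vector product, so the cost of that product dominates, which is why the paper's reordering of the multiplication affects time per iteration so strongly.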

  3. Work related stress and blood glucose levels.

    PubMed

    Sancini, A; Ricci, S; Tomei, F; Sacco, C; Pacchiarotti, A; Nardone, N; Ricci, P; Suppi, A; De Cesare, D P; Anzelmo, V; Giubilati, R; Pimpinella, B; Rosati, M V; Tomei, G

    2017-01-01

    The aim of the study is to evaluate work-related subjective stress in a group of workers at a major Italian company in the healthcare field through the administration of a valid questionnaire-based indicator tool (HSE Indicator Tool), and to analyze any correlation between the stress levels derived from questionnaire scores and blood glucose values. We studied a final sample of 241 subjects with different tasks. The HSE questionnaire, made up of 35 items (divided into 7 organizational dimensions) with 5 possible answers each, was distributed to all subjects during the health surveillance examinations required by law. The questionnaire was then analyzed using its dedicated software to process the results for the 7 dimensions. These results were compared with the blood glucose values obtained from each subject using the Pearson correlation and multiple linear regression. The analysis identified the following areas as critical, that is, linked to an intermediate (yellow area) or high (red area) level of stress: support from managers, support from colleagues, quality of relationships and professional changes. The Pearson correlation coefficient showed a significant positive correlation (p < 0.05) between the mean values of all critical areas and blood glucose concentrations. Multiple linear regression confirmed these findings, showing that the critical dimensions resulting from the questionnaire were the significant variables that can increase blood glucose levels. The preliminary results indicate that perceived work stress can be statistically associated with increased levels of blood glucose.

  4. Impact of Dental Disorders and its Influence on Self Esteem Levels among Adolescents.

    PubMed

    Kaur, Puneet; Singh, Simarpreet; Mathur, Anmol; Makkar, Diljot Kaur; Aggarwal, Vikram Pal; Batra, Manu; Sharma, Anshika; Goyal, Nikita

    2017-04-01

    Self esteem is largely a psychological concept; therefore, even common dental disorders such as dental trauma, tooth loss and untreated carious lesions may affect self esteem and thus influence quality of life. This study aims to assess the impact of dental disorders on the self esteem level of adolescents. The present cross-sectional study was conducted among adolescents aged 10 to 17 years. In order to obtain a representative sample, a multistage sampling technique was used and the sample was selected based on Probability Proportional to Enrolment size (PPE). Oral health assessment was carried out using WHO type III examination and self esteem was estimated using the Rosenberg Self Esteem Scale score (RSES). The descriptive and inferential analysis of the data was done using IBM SPSS software. Logistic and linear regression analyses were executed to test the individual association of different independent clinical variables with self esteem. A total sample of 1140 adolescents with a mean age of 14.95 ±2.08 years and RSES of 27.09 ±3.12 was considered. Stepwise multiple linear regression analysis was applied, and the best predictors in relation to RSES, in descending order, were Dental Health Component (DHC), Aesthetic Component (AC), dental decay {(aesthetic zone), (masticatory zone)}, tooth loss {(aesthetic zone), (masticatory zone)} and anterior fracture of tooth. It was found that various dental disorders such as malocclusion, anterior traumatic tooth, tooth loss and untreated decay have a profound impact on the aesthetics and psychosocial behaviour of adolescents, thus affecting their self esteem.

  5. The Association Between Racial and Gender Discrimination and Body Mass Index Among Residents Living in Lower-income Housing

    PubMed Central

    Shelton, Rachel C.; Puleo, Elaine; Bennett, Gary G.; McNeill, Lorna H.; Sorensen, Glorian; Emmons, Karen M.

    2010-01-01

    Background: Research on the association between self-reported racial or gender discrimination and body mass index (BMI) has been limited and inconclusive to date, particularly among lower-income populations. Objectives: The aim of the current study was to examine the association between self-reported racial and gender discrimination and BMI among a sample of adult residents living in 12 urban lower-income housing sites in Boston, Massachusetts (USA). Methods: Baseline survey data were collected among 1,307 (weighted N=1907) study participants. For analyses, linear regression models with a cluster design were conducted using SUDAAN and SAS statistical software. Results: Our sample was predominately Black (weighted n=956) and Hispanic (weighted n=857), and female (weighted n=1420), with a mean age of 49.3 (SE: .40) and mean BMI of 30.2 kg m^-2 (SE: .19). Nearly 47% of participants reported ever experiencing racial discrimination, and 24.8% reported ever experiencing gender discrimination. In bivariate and multivariable linear regression models, no main effect association was found between either racial or gender discrimination and BMI. Conclusions: While our findings suggest that self-reported discrimination is not a key determinant of BMI among lower-income housing residents, these results should be considered in light of study limitations. Future researchers may want to investigate this association among other relevant samples, and other social contextual and cultural factors should be explored to understand how they contribute to disparities. PMID:19769005

  6. The association between racial and gender discrimination and body mass index among residents living in lower-income housing.

    PubMed

    Shelton, Rachel C; Puleo, Elaine; Bennett, Gary G; McNeill, Lorna H; Sorensen, Glorian; Emmons, Karen M

    2009-01-01

    Research on the association between self-reported racial or gender discrimination and body mass index (BMI) has been limited and inconclusive to date, particularly among lower-income populations. The aim of the current study was to examine the association between self-reported racial and gender discrimination and BMI among a sample of adult residents living in 12 urban lower-income housing sites in Boston, Massachusetts (USA). Baseline survey data were collected among 1,307 (weighted N = 1907) study participants. For analyses, linear regression models with a cluster design were conducted using SUDAAN and SAS statistical software. Our sample was predominately Black (weighted n = 956) and Hispanic (weighted n = 857), and female (weighted n = 1420), with a mean age of 49.3 (SE: .40) and mean BMI of 30.2 kg m(-2) (SE: .19). Nearly 47% of participants reported ever experiencing racial discrimination, and 24.8% reported ever experiencing gender discrimination. In bivariate and multivariable linear regression models, no main effect association was found between either racial or gender discrimination and BMI. While our findings suggest that self-reported discrimination is not a key determinant of BMI among lower-income housing residents, these results should be considered in light of study limitations. Future researchers may want to investigate this association among other relevant samples, and other social contextual and cultural factors should be explored to understand how they contribute to disparities.

  7. Estimating monotonic rates from biological data using local linear regression.

    PubMed

    Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R

    2017-03-01

    Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
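
    The idea of fitting a line only to a locally linear stretch of a noisy time series can be sketched as follows. This is a minimal illustration, not the LoLinR algorithm: the selection rule (take the widest contiguous window whose R² reaches a threshold), the function names, and the sample numbers are all assumptions made for this example.

```python
def ols_fit(t, y):
    """Return (slope, intercept, r_squared) of an ordinary least-squares line."""
    n = len(t)
    mt = sum(t) / n
    my = sum(y) / n
    stt = sum((a - mt) ** 2 for a in t)
    sty = sum((a - mt) * (b - my) for a, b in zip(t, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sty / stt
    intercept = my - slope * mt
    # A window with constant y has no variance to explain; treat R^2 as 0.
    r_squared = (sty * sty) / (stt * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def widest_linear_window(t, y, min_width=4, r2_threshold=0.99):
    """Scan contiguous windows from widest to narrowest.

    Returns (start, stop, slope, intercept) for the first window whose
    R^2 reaches the threshold (stop is exclusive), or None if none does.
    The threshold rule is a hypothetical stand-in for a real ranking metric.
    """
    n = len(t)
    for width in range(n, min_width - 1, -1):
        for start in range(0, n - width + 1):
            stop = start + width
            slope, intercept, r2 = ols_fit(t[start:stop], y[start:stop])
            if r2 >= r2_threshold:
                return start, stop, slope, intercept
    return None


# Made-up series: a steady decline followed by a noisy plateau.
times = list(range(10))
signal = [10, 8, 6, 4, 2, 0, 0.5, 0.1, 0.4, 0.2]
result = widest_linear_window(times, signal)
print(result)
```

    Choosing the window by rule rather than by eye is what makes the estimate reproducible; the specific rule used here is only one simple option.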

  8. Locally linear regression for pose-invariant face recognition.

    PubMed

    Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen

    2007-07-01

    The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show distinct advantage of the proposed method over Eigen light-field method.

  9. Application of Scion image software to the simultaneous determination of curcuminoids in turmeric (Curcuma longa).

    PubMed

    Sotanaphun, Uthai; Phattanawasin, Panadda; Sriphong, Lawan

    2009-01-01

    Curcumin, desmethoxycurcumin and bisdesmethoxycurcumin are bioactive constituents of turmeric (Curcuma longa). Owing to their different potency, quality control of turmeric based on the content of each curcuminoid is more reliable than that based on total curcuminoids. However, to perform such an assay, high-cost instrument is needed. To develop a simple and low-cost method for the simultaneous quantification of three curcuminoids in turmeric using TLC and the public-domain software Scion Image. The image of a TLC chromatogram of turmeric extract was recorded using a digital scanner. The density of the TLC spot of each curcuminoid was analysed by the Scion Image software. The density value was transformed to concentration by comparison with the calibration curve of standard curcuminoids developed on the same TLC plate. The polynomial regression data for all curcuminoids showed good linear relationship with R(2) > 0.99 in the concentration range of 0.375-6 microg/spot. The limits of detection and quantitation were 43-73 and 143-242 ng/spot, respectively. The method gave adequate precision, accuracy and recovery. The contents of each curcuminoid determined using this method were not significantly different from those determined using the TLC densitometric method. TLC image analysis using Scion Image is shown to be a reliable method for the simultaneous analysis of the content of each curcuminoid in turmeric.

  10. Preliminary Survey on TRY Forest Traits and Growth Index Relations - New Challenges

    NASA Astrophysics Data System (ADS)

    Lyubenova, Mariyana; Kattge, Jens; van Bodegom, Peter; Chikalanov, Alexandre; Popova, Silvia; Zlateva, Plamena; Peteva, Simona

    2016-04-01

    Forest ecosystems provide critical ecosystem goods and services, including food, fodder, water, shelter, nutrient cycling, and cultural and recreational value. Forests also store carbon, provide habitat for a wide range of species and help alleviate land degradation and desertification. Thus they have a potentially significant role to play in climate change adaptation planning through maintaining ecosystem services and providing livelihood options. The study of forest traits is therefore an important issue not just for individual countries but for the planet as a whole. We need to know exactly which functional relations between forest traits the TRY data base can express and how significant they will be for global modeling and IPBES. The study of the biodiversity characteristics at all levels and the functional links between them is extremely important for the selection of key indicators for assessing biodiversity and ecosystem services for sustainable natural capital control. By comparing the available information in tree data bases (TRY, ITR (International Tree Ring) and SP-PAM), 42 tree species are selected for the trait analyses. The dependence between location characteristics (latitude, longitude, altitude, annual precipitation, annual temperature and soil type) and forest traits (specific leaf area, leaf weight ratio, wood density and growth index) is studied by multiple regression analyses (RDA) using the statistical software package Canoco 4.5. The Pearson correlation coefficient (measure of linear correlation), Kendall rank correlation coefficient (non parametric measure of statistical dependence) and Spearman correlation coefficient (monotonic function relationship between two variables) are calculated for each pair of variables (indexes) and species. After analysis of the above-mentioned correlation coefficients, the dimensional linear regression models, multidimensional linear and nonlinear regression models and multidimensional neural networks models are built. The strongest dependence between It and WD was obtained. The research will support the work on: Strategic Plan for Biodiversity 2011-2020, modelling and implementation of ecosystem-based approaches to climate change adaptation and disaster risk reduction. Key words: Specific leaf area (SLA), Leaf weight ratio (LWR), Wood density (WD), Growth index (It)
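
    The three correlation coefficients named in this abstract measure different things, which a small sketch can make concrete. The code below is an illustration in plain Python with made-up numbers, not data or code from the study; the function names are assumptions, and the Kendall variant shown is the simple tau-a form without tie correction.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation (strength of linear association)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs, no tie correction."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                score += 1
            elif s < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


# Made-up monotonic but non-linear relationship (y = x squared).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson(x, y), spearman(x, y), kendall_tau_a(x, y))
```

    For a relationship that is monotonic but curved, the two rank-based coefficients reach 1 while the Pearson coefficient stays below 1, which is why reporting all three per variable pair can be informative.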

  11. [Visual field progression in glaucoma: cluster analysis].

    PubMed

    Bresson-Dumont, H; Hatton, J; Foucher, J; Fonteneau, M

    2012-11-01

    Visual field progression analysis is one of the key points in glaucoma monitoring, but distinction between true progression and random fluctuation is sometimes difficult. There are several different algorithms but no real consensus for detecting visual field progression. The trend analysis of global indices (MD, sLV) may miss localized deficits or be affected by media opacities. Conversely, point-by-point analysis makes progression difficult to differentiate from physiological variability, particularly when the sensitivity of a point is already low. The goal of our study was to analyse visual field progression with the EyeSuite™ Octopus Perimetry Clusters algorithm in patients with no significant changes in global indices or worsening of the analysis of pointwise linear regression. We analyzed the visual fields of 162 eyes (100 patients - 58 women, 42 men, average age 66.8 ± 10.91) with ocular hypertension or glaucoma. For inclusion, at least six reliable visual fields per eye were required, and the trend analysis (EyeSuite™ Perimetry) of visual field global indices (MD and SLV), could show no significant progression. The analysis of changes in cluster mode was then performed. In a second step, eyes with statistically significant worsening of at least one of their clusters were analyzed point-by-point with the Octopus Field Analysis (OFA). Fifty four eyes (33.33%) had a significant worsening in some clusters, while their global indices remained stable over time. In this group of patients, more advanced glaucoma was present than in stable group (MD 6.41 dB vs. 2.87); 64.82% (35/54) of those eyes in which the clusters progressed, however, had no statistically significant change in the trend analysis by pointwise linear regression. Most software algorithms for analyzing visual field progression are essentially trend analyses of global indices, or point-by-point linear regression. This study shows the potential role of analysis by clusters trend. However, for best results, it is preferable to compare the analyses of several tests in combination with morphologic exam. Copyright © 2012 Elsevier Masson SAS. All rights reserved.

  12. Bayesian models for comparative analysis integrating phylogenetic uncertainty.

    PubMed

    de Villemereuil, Pierre; Wells, Jessie A; Edwards, Robert D; Blomberg, Simon P

    2012-06-28

    Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language.

  13. Bayesian models for comparative analysis integrating phylogenetic uncertainty

    PubMed Central

    2012-01-01

    Background: Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods: We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results: We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions: Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language. PMID:22741602

  14. Effect of Malmquist bias on correlation studies with IRAS data base

    NASA Technical Reports Server (NTRS)

    Verter, Frances

    1993-01-01

    The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B) as well as ratios of these quantities. The linear correlations for Malmquist bias are corrected using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method invented here in which each galaxy observation is weighted by its sampling volume. The results of correlation and regressions among the sample are significantly changed in the anticipated sense that the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.
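
    Weighting each observation in a straight-line fit, as this abstract describes, amounts to weighted least squares. The sketch below shows the generic weighted fit only; the function name and numbers are assumptions for illustration, and how the weights should be derived from each galaxy's sampling volume is left to the cited method rather than reproduced here.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares line y = intercept + slope * x.

    Each observation i contributes with weight w[i]; a weight of zero
    removes the observation from the fit entirely.
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical log-log data with one aberrant point that is given zero weight.
log_x = [0.0, 1.0, 2.0, 3.0]
log_y = [1.0, 3.0, 5.0, 100.0]
weights = [1.0, 1.0, 1.0, 0.0]
slope, intercept = weighted_linear_fit(log_x, log_y, weights)
print(slope, intercept)
```

    With equal weights the function reduces to an ordinary least-squares fit, so comparing the weighted and unweighted slopes shows directly how much the weighting changes the result.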

  15. A primer for biomedical scientists on how to execute model II linear regression analysis.

    PubMed

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
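
    The point estimates for ordinary least products (OLP) regression, also known as geometric mean or reduced major axis regression, are simple enough to compute by hand: the slope is the ratio of the standard deviations of y and x, carrying the sign of their covariance. The sketch below uses assumed function names and made-up data, and covers only slope and intercept; it does not attempt the confidence intervals, which the abstract notes require more care (for example bootstrapping).

```python
import math
from statistics import mean, stdev


def olp_regression(x, y):
    """Ordinary least products (geometric mean) regression.

    Returns (slope, intercept) where slope = sign(cov(x, y)) * sd(y) / sd(x)
    and the line passes through the point of means.
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length samples of at least 3 points")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("covariance is zero; the OLP slope sign is undefined")
    slope = math.copysign(stdev(y) / stdev(x), sxy)
    intercept = my - slope * mx
    return slope, intercept


# Made-up paired measurements where both variables carry error.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope_yx, intercept_yx = olp_regression(x, y)
slope_xy, intercept_xy = olp_regression(y, x)
print(slope_yx, intercept_yx)
```

    Unlike ordinary least squares, the OLP line is symmetric: regressing x on y gives the reciprocal slope, which is the property that makes it suitable when neither variable is fixed by the experimenter.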

  16. Analyzing Multilevel Data: Comparing Findings from Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2013-01-01

    This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…

  17. Application of Experimental and Quasi-Experimental Research Designs to Educational Software Evaluation.

    ERIC Educational Resources Information Center

    Muller, Eugene W.

    1985-01-01

    Develops generalizations for empirical evaluation of software based upon suitability of several research designs--pretest posttest control group, single-group pretest posttest, nonequivalent control group, time series, and regression discontinuity--to type of software being evaluated, and on circumstances under which evaluation is conducted. (MBR)

  18. Analyzing Multilevel Data: An Empirical Comparison of Parameter Estimates of Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2011-01-01

    Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…

  19. Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis

    ERIC Educational Resources Information Center

    Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

    2006-01-01

    Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…

  20. Classical Testing in Functional Linear Models.

    PubMed

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.

  1. Classical Testing in Functional Linear Models

    PubMed Central

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155

  2. Age estimation by dentin translucency measurement using digital method: An institutional study

    PubMed Central

    Gupta, Shalini; Chandra, Akhilesh; Agnihotri, Archana; Gupta, Om Prakash; Maurya, Niharika

    2017-01-01

    Aims: The aims of the present study were to measure translucency on sectioned teeth using available computer hardware and software, to correlate dimensions of root dentin translucency with age, and to assess whether translucency is reliable for age estimation. Materials and Methods: A pilot study was done on 62 freshly extracted single-rooted permanent teeth from 62 different individuals (35 males and 27 females) and their 250 μm thick sections were prepared by micromotor, carborundum disks, and Arkansas stone. Each tooth section was scanned and the images were opened in the Adobe Photoshop software. Measurement of root dentin translucency (TD length) was done on the scanned image by placing two guides (A and B) along the x-axis of ABFO NO. 2 scale. Unpaired t-test, regression analysis, and Pearson correlation coefficient were used as statistical tools. Results: A linear relationship was observed between TD length and age in the regression analysis. The Pearson correlation analysis showed that there was a positive correlation (r = 0.52, P = 0.0001) between TD length and age. However, no significant (P > 0.05) difference was observed in the TD length between male (8.44 ± 2.92 mm) and female (7.80 ± 2.79 mm) samples. Conclusion: Translucency of the root dentin increases with age and it can be used as a reliable parameter for age estimation. The method used here to digitally select and measure translucent root dentin is more refined, better correlated to age, and produces superior age estimation. PMID:28584476

  3. Comparison of two-concentration with multi-concentration linear regressions: Retrospective data analysis of multiple regulated LC-MS bioanalytical projects.

    PubMed

    Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi

    2013-09-01

    Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). Furthermore, examples are given as to how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
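
    A two-concentration calibration reduces to the line through the lowest and highest standards, which is then inverted to back-calculate sample concentrations. The sketch below illustrates that arithmetic only; the function names, the concentration and response values, and the QC level are all made up for this example and are not taken from the study.

```python
def two_point_calibration(low, high):
    """Line through two (concentration, response) calibration standards.

    Returns (slope, intercept) of response = intercept + slope * concentration.
    """
    (c_lo, r_lo), (c_hi, r_hi) = low, high
    if c_hi == c_lo:
        raise ValueError("the two standards must have different concentrations")
    slope = (r_hi - r_lo) / (c_hi - c_lo)
    intercept = r_lo - slope * c_lo
    return slope, intercept


def back_calculate(response, slope, intercept):
    """Invert the calibration line to obtain a concentration from a response."""
    return (response - intercept) / slope


def percent_bias(measured, nominal):
    """Relative deviation of a measured concentration from its nominal value."""
    return 100.0 * (measured - nominal) / nominal


# Hypothetical standards: lowest and highest calibration levels.
slope, intercept = two_point_calibration(low=(1.0, 0.12), high=(100.0, 10.02))

# Hypothetical QC sample with a nominal concentration of 50.
qc_concentration = back_calculate(5.02, slope, intercept)
qc_bias = percent_bias(qc_concentration, nominal=50.0)
print(slope, intercept, qc_concentration, qc_bias)
```

    Because a straight line through two points is exact, any regression weighting scheme yields the same calibration line here; linearity across the range then has to be checked separately, for instance through the bias of mid-range QC samples as computed above.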

  4. A Linear Regression and Markov Chain Model for the Arabian Horse Registry

    DTIC Science & Technology

    1993-04-01

    as a tax deduction? Yes No T-4367 68 26. Regardless of previous equine tax deductions, do you consider your current horse activities to be... (Mark one...T-4367 A Linear Regression and Markov Chain Model For the Arabian Horse Registry...the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses. A linear regression model was utilized to

  5. Development of a User Interface for a Regression Analysis Software Tool

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface's overall design approach.

  6. Validation of a single-platform method for hematopoietic CD34+ stem cells enumeration according to accreditation procedure.

    PubMed

    Massin, Frédéric; Huili, Cai; Decot, Véronique; Stoltz, Jean-François; Bensoussan, Danièle; Latger-Cannard, Véronique

    2015-01-01

    Stem cells for autologous and allogenic transplantation are obtained from several sources including bone marrow, peripheral blood or cord blood. Accurate enumeration of viable CD34+ hematopoietic stem cells (HSC) is routinely used in clinical settings, especially to monitor progenitor cell mobilization and apheresis. The number of viable CD34+ HSC has also been shown to be the most critical factor in haematopoietic engraftment. The International Society for Cellular Therapy currently recommends the use of a single-platform flow cytometry system using 7-AAD as a viability dye. In order to move routine analysis from a BD FACSCalibur™ instrument to a BD FACSCanto™ II, according to ISO 15189 standard guidelines, we define laboratory performance data of the BD™ Stem Cell Enumeration (SCE) kit on a CE-IVD system including a BD FACSCanto II flow cytometer and the BD FACSCanto™ Clinical Software. InterQC™ software, a real-time internet laboratory QC management system developed by Vitro™ and distributed by Becton Dickinson™, was also tested to monitor daily QC data, to define the internal laboratory statistics and to compare them to external laboratories. Precision was evaluated with BD™ Stem Cell Control (high and low) results and the InterQC software. The latter drew Levey-Jennings curves and generated numerical statistical parameters allowing detection of potential changes in the system performance as well as interlaboratory comparisons. Repeatability, linearity and lower limits of detection were obtained with routine samples from different origins. Agreement between the BD FACSCanto II system and the BD FACSCalibur system was tested on fresh peripheral blood, freeze-thawed apheresis, fresh bone marrow and fresh cord blood samples. The instrument's measurement and staining repeatability showed acceptable variability on the different samples tested. Intra- and inter-laboratory CV in CD34+ cell absolute count are consistent and reproducible. Linearity analysis, established between 2 and 329 cells/μl, showed a linear relation between expected counts and measured counts (R² = 0.97). Linear regression and Bland-Altman representations showed an excellent correlation on samples from different sources between the two systems and allowed the transfer of routine analysis from BD FACSCalibur to BD FACSCanto II. The BD SCE kit provides an accurate measure of the CD34 HSC, and can be used in daily routine to optimize the enumeration of hematopoietic CD34+ stem cells by flow cytometry. Moreover, the InterQC system seems to be a very useful tool for laboratory daily quality monitoring and thus for accreditation.
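
    The Bland-Altman comparison mentioned in this abstract can be illustrated with a minimal sketch. The paired counts are made-up values, not data from the study, and `bland_altman` is a hypothetical helper name; the 1.96 multiplier assumes roughly normal differences.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) of agreement for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)                # mean difference between the two methods
    spread = 1.96 * stdev(diffs)      # 95% limits, assuming near-normal differences
    return bias, bias - spread, bias + spread

# Hypothetical CD34+ counts (cells/µl) from two instruments; illustrative only.
canto = [12.0, 48.0, 95.0, 150.0, 310.0]
calibur = [11.0, 50.0, 92.0, 155.0, 305.0]
bias, lower, upper = bland_altman(canto, calibur)
```

    A bias close to zero with narrow limits would suggest that the two instruments can be used interchangeably for the measured range.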

  7. An improved multiple linear regression and data analysis computer program package

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  8. Analysis of linear measurements on 3D surface models using CBCT data segmentation obtained by automatic standard pre-set thresholds in two segmentation software programs: an in vitro study.

    PubMed

    Poleti, Marcelo Lupion; Fernandes, Thais Maria Freire; Pagin, Otávio; Moretti, Marcela Rodrigues; Rubira-Bullen, Izabel Regina Fischer

    2016-01-01

    The aim of this in vitro study was to evaluate the reliability and accuracy of linear measurements on three-dimensional (3D) surface models obtained by standard pre-set thresholds in two segmentation software programs. Ten mandibles with 17 silica markers were scanned for 0.3-mm voxels in the i-CAT Classic (Imaging Sciences International, Hatfield, PA, USA). Twenty linear measurements were carried out by two observers two times on the 3D surface models: the Dolphin Imaging 11.5 (Dolphin Imaging & Management Solutions, Chatsworth, CA, USA), using two filters (Translucent and Solid-1), and in the InVesalius 3.0.0 (Centre for Information Technology Renato Archer, Campinas, SP, Brazil). The physical measurements were made by another observer two times using a digital caliper on the dry mandibles. Excellent intra- and inter-observer reliability for the markers, physical measurements, and 3D surface models were found (intra-class correlation coefficient (ICC) and Pearson's r ≥ 0.91). The linear measurements on 3D surface models by Dolphin and InVesalius software programs were accurate (Dolphin Solid-1 > InVesalius > Dolphin Translucent). The highest absolute and percentage errors were obtained for the variable R1-R1 (1.37 mm) and MF-AC (2.53 %) in the Dolphin Translucent and InVesalius software, respectively. Linear measurements on 3D surface models obtained by standard pre-set thresholds in the Dolphin and InVesalius software programs are reliable and accurate compared with physical measurements. Studies that evaluate the reliability and accuracy of the 3D models are necessary to ensure error predictability and to establish diagnosis, treatment plan, and prognosis in a more realistic way.

  9. Teacher-Designed Software for Interactive Linear Equations: Concepts, Interpretive Skills, Applications & Word-Problem Solving.

    ERIC Educational Resources Information Center

    Lawrence, Virginia

    No longer just a user of commercial software, the 21st century teacher is a designer of interactive software based on theories of learning. This software, a comprehensive study of straightline equations, enhances conceptual understanding, sketching, graphic interpretive and word problem solving skills as well as making connections to real-life and…

  10. Log-normal frailty models fitted as Poisson generalized linear mixed models.

    PubMed

    Hirsch, Katharina; Wienke, Andreas; Kuss, Oliver

    2016-12-01

    The equivalence of a survival model with a piecewise constant baseline hazard function and a Poisson regression model has been known for decades. As shown in recent studies, this equivalence carries over to clustered survival data: A frailty model with a log-normal frailty term can be interpreted and estimated as a generalized linear mixed model with a binary response, a Poisson likelihood, and a specific offset. Proceeding this way, statistical theory and software for generalized linear mixed models are readily available for fitting frailty models. This gain in flexibility comes at the small price of (1) having to fix the number of pieces for the baseline hazard in advance and (2) having to "explode" the data set by the number of pieces. In this paper we extend the simulations of former studies by using a more realistic baseline hazard (Gompertz) and by comparing the model under consideration with competing models. Furthermore, the SAS macro %PCFrailty is introduced to apply the Poisson generalized linear mixed approach to frailty models. The simulations show good results for the shared frailty model. Our new %PCFrailty macro provides proper estimates, especially in case of 4 events per piece. The suggested Poisson generalized linear mixed approach for log-normal frailty models based on the %PCFrailty macro provides several advantages in the analysis of clustered survival data with respect to more flexible modelling of fixed and random effects, exact (in the sense of non-approximate) maximum likelihood estimation, and standard errors and different types of confidence intervals for all variance parameters. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
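
    The "explode" step described in this abstract can be sketched as follows. This is one common way to set it up, not the %PCFrailty macro itself: the function name `explode`, the cut points and the example subject are all hypothetical, and the offset is assumed to be the log of the time spent at risk in each piece.

```python
import math

def explode(time, event, cut_points):
    """Split one subject's follow-up into pieces of a piecewise constant hazard.

    Returns one row per piece in which the subject was at risk:
    (piece index, event indicator, offset), with offset = log(exposure time).
    """
    rows = []
    starts = [0.0] + list(cut_points)
    ends = list(cut_points) + [math.inf]
    for piece, (start, end) in enumerate(zip(starts, ends)):
        if time <= start:
            break                                  # no longer at risk
        exposure = min(time, end) - start          # time spent in this piece
        died_here = int(event and time <= end)     # event counted in the last piece only
        rows.append((piece, died_here, math.log(exposure)))
    return rows

# Hypothetical subject: event at t = 2.5, baseline hazard pieces cut at 1 and 2.
rows = explode(time=2.5, event=1, cut_points=[1.0, 2.0])
```

    Each row would then enter a Poisson model with the event indicator as the response, the piece as a factor, and a random intercept per cluster; fitting that model is not shown here.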

  11. CO2 flux determination by closed-chamber methods can be seriously biased by inappropriate application of linear regression

    NASA Astrophysics Data System (ADS)

    Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.

    2007-07-01

    Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. 
    CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
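
    The underestimation described in this abstract can be illustrated on synthetic data. The sketch below assumes a saturating curve of the form c(t) = p1 + p2·exp(p3·t) with made-up parameters; it does not use the authors' data or reproduce their fitted model, and it only compares the slope of a straight-line fit with the curve's slope at closure time.

```python
import math

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical parameters for c(t) = p1 + p2 * exp(p3 * t); illustrative only.
p1, p2, p3 = 500.0, -120.0, -0.01                     # ppm, ppm, 1/s
times = [float(t) for t in range(0, 121, 5)]          # a two-minute closure
conc = [p1 + p2 * math.exp(p3 * t) for t in times]    # noise-free synthetic data

initial_slope = p2 * p3                 # dc/dt at t = 0 from the exponential form
linear_slope = ols_slope(times, conc)   # slope reported by a straight-line fit
ratio = linear_slope / initial_slope    # below 1 when the curve flattens over time
```

    Because the synthetic curve flattens during the closure, the straight-line slope is smaller than the initial slope, which is the direction of bias the abstract reports.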

  12. Scilab software as an alternative low-cost computing in solving the linear equations problem

    NASA Astrophysics Data System (ADS)

    Agus, Fahrul; Haviluddin

    2017-02-01

    Numerical computation packages are widely used in both teaching and research. These packages include licensed (proprietary) and open-source (non-proprietary) software. One reason to use such a package is the complexity of the mathematical functions involved (e.g., linear problems). In addition, the number of variables in linear and non-linear functions has increased. The aim of this paper was to reflect on key aspects related to the method, didactics and creative praxis in the teaching of linear equations in higher education. If implemented, it could contribute to better learning in mathematics (e.g., solving simultaneous linear equations), which is essential for future engineers. The focus of this study was to introduce Scilab as an additional numerical computation package and a low-cost computing alternative. In this paper, Scilab was used for several activities related to mathematical models. In the experiment, four numerical methods were implemented: Gaussian elimination, Gauss-Jordan, inverse matrix, and lower-upper (LU) decomposition. The results showed that routines for these numerical methods were created and explored using Scilab procedures. These routines could then be used as teaching material for a course.
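
    As an illustration of the first method named in this abstract, here is a minimal Gaussian elimination sketch. It is written in Python rather than Scilab, is not the authors' routine, and uses a hypothetical function name and example system; partial pivoting is added as a common safeguard.

```python
def solve_gauss(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

# Example system with exact solution x = 2, y = 3, z = -1.
solution = solve_gauss([[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]],
                       [8.0, -11.0, -3.0])
```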

  13. Descriptions of Free and Freeware Software in the Mathematics Teaching

    NASA Astrophysics Data System (ADS)

    Antunes de Macedo, Josue; Neves de Almeida, Samara; Voelzke, Marcos Rincon

    2016-05-01

    This paper presents the analysis and cataloging of free and freeware mathematical software available on the internet, with a brief explanation of each and of the types of licenses for use in teaching and learning. The methodology is based on qualitative research. Among the different types of software found, Winmat stands out in algebra; it works with linear algebra, matrices and linear systems. In geometry, GeoGebra can be used in the study of functions, plane and spatial geometry, algebra and calculus. For graphing, Graph and Graphequation can be cited. With the Graphmatica software, it is possible to build various graphs of mathematical equations on the same screen, representing Cartesian equations, inequalities, and parametric functions, among others. Winplot allows the user to build two- and three-dimensional graphs of functions and mathematical equations. Thus, this work aims to present teachers with some free math software that can be used in the classroom.

  14. Biostatistics Series Module 6: Correlation and Linear Regression.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
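
    The quantities named in this abstract (Pearson's r, the coefficient of determination, and the least-squares line y = a + bx) can be computed in a few lines. The data points and the function name below are made up for illustration.

```python
from statistics import mean

def pearson_and_line(x, y):
    """Return Pearson's r plus intercept a and slope b of y = a + bx."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    r = sxy / (sxx * syy) ** 0.5   # strength of the linear relationship
    b = sxy / sxx                  # least-squares slope
    a = my - b * mx                # least-squares intercept
    return r, a, b

# Hypothetical paired observations; illustrative only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r, a, b = pearson_and_line(x, y)
r_squared = r ** 2                 # coefficient of determination
```

    As the abstract stresses, a scatter plot should be inspected first; these numbers are only meaningful when the relationship is in fact linear.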

  15. Biostatistics Series Module 6: Correlation and Linear Regression

    PubMed Central

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175

  16. Cost-aware request routing in multi-geography cloud data centres using software-defined networking

    NASA Astrophysics Data System (ADS)

    Yuan, Haitao; Bi, Jing; Li, Bo Hu; Tan, Wei

    2017-03-01

    Current geographically distributed cloud data centres (CDCs) require gigantic energy and bandwidth costs to provide multiple cloud applications to users around the world. Previous studies only focus on energy cost minimisation in distributed CDCs. However, a CDC provider needs to deliver gigantic data between users and distributed CDCs through internet service providers (ISPs). Geographical diversity of bandwidth and energy costs brings a highly challenging problem of how to minimise the total cost of a CDC provider. With the recently emerging software-defined networking, we study the total cost minimisation problem for a CDC provider by exploiting geographical diversity of energy and bandwidth costs. We formulate the total cost minimisation problem as a mixed integer non-linear programming (MINLP). Then, we develop heuristic algorithms to solve the problem and to provide a cost-aware request routing for joint optimisation of the selection of ISPs and the number of servers in distributed CDCs. Besides, to tackle the dynamic workload in distributed CDCs, this article proposes a regression-based workload prediction method to obtain future incoming workload. Finally, this work evaluates the cost-aware request routing by trace-driven simulation and compares it with the existing approaches to demonstrate its effectiveness.

  17. Age estimation by pulp-to-tooth area ratio using cone-beam computed tomography: A preliminary analysis

    PubMed Central

    Rai, Arpita; Acharya, Ashith B.; Naikmasur, Venkatesh G.

    2016-01-01

    Background: Age estimation of living or deceased individuals is an important aspect of forensic sciences. Conventionally, pulp-to-tooth area ratio (PTR) measured from periapical radiographs have been utilized as a nondestructive method of age estimation. Cone-beam computed tomography (CBCT) is a new method to acquire three-dimensional images of the teeth in living individuals. Aims: The present study investigated age estimation based on PTR of the maxillary canines measured in three planes obtained from CBCT image data. Settings and Design: Sixty subjects aged 20–85 years were included in the study. Materials and Methods: For each tooth, mid-sagittal, mid-coronal, and three axial sections—cementoenamel junction (CEJ), one-fourth root level from CEJ, and mid-root—were assessed. PTR was calculated using AutoCAD software after outlining the pulp and tooth. Statistical Analysis Used: All statistical analyses were performed using an SPSS 17.0 software program. Results and Conclusions: Linear regression analysis showed that only PTR in axial plane at CEJ had significant age correlation (r = 0.32; P < 0.05). This is probably because of clearer demarcation of pulp and tooth outline at this level. PMID:28123269

  18. The Comparison of Dietary Behaviors among Rural Controlled and Uncontrolled Hypertensive Patients.

    PubMed

    Kamran, Aziz; Shekarchi, Ali Akbar; Sharifian, Elham; Heydari, Heshmatolah

    2016-01-01

    Nutrition is a dominant peripheral factor in increasing blood pressure; however, little information is available about the nutritional status of hypertensive patients in Iran. This study aimed to compare nutritional behaviors of the rural controlled and uncontrolled hypertensive patients and to determine the predictive power of nutritional behaviors from blood pressure. This cross-sectional study was conducted on 671 rural hypertensive patients, using multistage random sampling method in Ardabil city in 2013. Data were collected by a 3-day food record questionnaire. Nutritional data were extracted by Nutritionist 4 software and analyzed by the SPSS 18 software using Pearson correlation, multiple linear regression, ANOVA, and independent t-test. A significant difference was observed in the means of fat intake, cholesterol, saturated fat, sodium, energy, calcium, vitamin C, fiber, and nutritional knowledge between controlled and uncontrolled groups. In the controlled group, sodium, saturated fats, vitamin C, calcium, and energy intake explained 30.6% of the variations in blood pressure and, in the uncontrolled group, sodium, carbohydrate, fiber intake, and nutritional knowledge explained 83% of the variations in blood pressure. There was a significant difference in the nutritional behavior between the two groups and changes in blood pressure could be explained significantly by nutritional behaviors.

  19. Novel non-contact retina camera for the rat and its application to dynamic retinal vessel analysis

    PubMed Central

    Link, Dietmar; Strohmaier, Clemens; Seifert, Bernd U.; Riemer, Thomas; Reitsamer, Herbert A.; Haueisen, Jens; Vilser, Walthard

    2011-01-01

    We present a novel non-invasive and non-contact system for reflex-free retinal imaging and dynamic retinal vessel analysis in the rat. Theoretical analysis was performed prior to development of the new optical design, taking into account the optical properties of the rat eye and its specific illumination and imaging requirements. A novel optical model of the rat eye was developed for use with standard optical design software, facilitating both sequential and non-sequential modes. A retinal camera for the rat was constructed using standard optical and mechanical components. The addition of a customized illumination unit and existing standard software enabled dynamic vessel analysis. Seven-minute in-vivo vessel diameter recordings performed on 9 Brown-Norway rats showed stable readings. On average, the coefficient of variation was (1.1 ± 0.19) % for the arteries and (0.6 ± 0.08) % for the veins. The slope of the linear regression analysis was (0.56 ± 0.26) % for the arteries and (0.15 ± 0.27) % for the veins. In conclusion, the device can be used in basic studies of retinal vessel behavior. PMID:22076270

  20. GIS-Based Spatial Statistical Analysis of College Graduates Employment

    NASA Astrophysics Data System (ADS)

    Tang, R.

    2012-07-01

    Awareness of the distribution and employment status of college graduates is urgently needed for proper allocation of human resources and overall arrangement of strategic industry. This study provides empirical evidence regarding the use of geocoding and spatial analysis in the distribution and employment status of college graduates, based on 2004-2008 data from the Wuhan Municipal Human Resources and Social Security Bureau, China. The spatio-temporal distribution of employment units was analyzed with geocoding using ArcGIS software, and stepwise multiple linear regression in SPSS software was used to predict employment and to identify spatially associated enterprise and professional demand in the future. The results show that the number of enterprises in the Wuhan East Lake High and New Technology Development Zone increased dramatically from 2004 to 2008 and tended to be distributed southeastward. Furthermore, the models built by statistical analysis suggest that the graduates' major has an important impact on the number employed and on the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis, which helps to simulate the spatial distribution of employment status, is a potential tool for human resource development research.

  1. Using the Coefficient of Determination R² to Test the Significance of Multiple Linear Regression

    ERIC Educational Resources Information Center

    Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

    2013-01-01

    This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
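
    A standard result is that, under the null hypothesis of no linear relationship and with normally distributed errors, R² from a model with k predictors and n observations follows a Beta(k/2, (n − k − 1)/2) distribution. The sketch below approximates a p-value by drawing from that distribution; whether this matches the article's exact procedure is an assumption, and the function name, sample sizes and R² values are hypothetical.

```python
import random

def r2_p_value(r2, n, k, draws=20000, seed=0):
    """Monte Carlo p-value for H0: all k slopes are zero, given the observed R^2."""
    rng = random.Random(seed)
    alpha, beta = k / 2, (n - k - 1) / 2    # null distribution parameters
    exceed = sum(rng.betavariate(alpha, beta) >= r2 for _ in range(draws))
    return exceed / draws

# Hypothetical fits with n = 20 observations and k = 2 predictors.
p_strong = r2_p_value(0.90, n=20, k=2)   # a large R^2 is rarely exceeded under H0
p_weak = r2_p_value(0.05, n=20, k=2)     # a small R^2 is often exceeded under H0
```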

  2. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

    PubMed

    Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

    2009-11-01

    G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.

  3. Guidelines and Procedures for Computing Time-Series Suspended-Sediment Concentrations and Loads from In-Stream Turbidity-Sensor and Streamflow Data

    USGS Publications Warehouse

    Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.

    2009-01-01

    In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
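
    The model-selection steps described in this abstract can be written as a small decision function. This is only a sketch of the logic as summarized here: the function and argument names are hypothetical, the criterion value is left to the caller because the abstract does not state it, and both the direction of the comparison and the fallback case are assumptions.

```python
def choose_model(mspe_simple, criterion, streamflow_significant, multiple_improves):
    """Pick the regression model following the decision steps in the abstract."""
    # Assumption: "meets the criterion" means the percentage error is at or below it.
    if mspe_simple <= criterion:
        return "simple (turbidity only)"
    # Otherwise the turbidity-streamflow model is developed and compared.
    if streamflow_significant and multiple_improves:
        return "multiple (turbidity + streamflow)"
    # Assumption: the abstract names no fallback; the simple model is kept.
    return "simple (turbidity only)"

# Hypothetical inputs; the numbers carry no meaning beyond the example.
choice = choose_model(mspe_simple=35.0, criterion=25.0,
                      streamflow_significant=True, multiple_improves=True)
```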

  4. CO2 flux determination by closed-chamber methods can be seriously biased by inappropriate application of linear regression

    NASA Astrophysics Data System (ADS)

    Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.

    2007-11-01

    Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatlands sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model which is considered to be caused by violations of the underlying model assumptions. 
    Especially the effects of turbulence and pressure disturbances by the chamber deployment are suspected to have caused unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.

  5. Computation of nonlinear least squares estimator and maximum likelihood using principles in matrix calculus

    NASA Astrophysics Data System (ADS)

    Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.

    2017-11-01

    This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However, the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to get a linear pseudo model for the nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for the nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et al. used empirical process techniques to study the asymptotics of the LSE (least-squares estimation) for the fitting of nonlinear regression functions in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation

  6. Support Vector Machine algorithm for regression and classification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Chenggang; Zavaljevski, Nela

    2001-08-01

    The software is an implementation of the Support Vector Machine (SVM) algorithm that was invented and developed by Vladimir Vapnik and his co-workers at AT&T Bell Laboratories. The specific implementation reported here is an Active Set method for solving a quadratic optimization problem that forms the major part of any SVM program. The implementation is tuned to specific constraints generated in the SVM learning. Thus, it is more efficient than general-purpose quadratic optimization programs. A decomposition method has been implemented in the software that enables processing large data sets. The size of the learning data is virtually unlimited by the capacity of the computer physical memory. The software is flexible and extensible. Two upper bounds are implemented to regulate the SVM learning for classification, which allow users to adjust the false positive and false negative rates. The software can be used either as a standalone, general-purpose SVM regression or classification program, or be embedded into a larger software system.

  7. Isobio software: biological dose distribution and biological dose volume histogram from physical dose conversion using linear-quadratic-linear model.

    PubMed

    Jaikuna, Tanwiwat; Khadsiri, Phatchareewan; Chawapun, Nisa; Saekho, Suwit; Tharavichitkul, Ekkasit

    2017-02-01

    To develop an in-house software program that is able to calculate and generate the biological dose distribution and biological dose volume histogram by physical dose conversion using the linear-quadratic-linear (LQL) model. The Isobio software was developed using MATLAB version 2014b to calculate and generate the biological dose distribution and biological dose volume histograms. The physical dose from each voxel in treatment planning was extracted through Computational Environment for Radiotherapy Research (CERR), and the accuracy was verified by the differentiation between the dose volume histogram from CERR and the treatment planning system. An equivalent dose in 2 Gy fraction (EQD2) was calculated using biological effective dose (BED) based on the LQL model. The software calculation and the manual calculation were compared for EQD2 verification with paired t-test statistical analysis using IBM SPSS Statistics version 22 (64-bit). Two and three-dimensional biological dose distribution and biological dose volume histogram were displayed correctly by the Isobio software. Different physical doses were found between CERR and treatment planning system (TPS) in Oncentra, with 3.33% in high-risk clinical target volume (HR-CTV) determined by D90%, 0.56% in the bladder, 1.74% in the rectum when determined by D2cc, and less than 1% in Pinnacle. The difference in the EQD2 between the software calculation and the manual calculation was not significantly different with 0.00% at p-values 0.820, 0.095, and 0.593 for external beam radiation therapy (EBRT) and 0.240, 0.320, and 0.849 for brachytherapy (BT) in HR-CTV, bladder, and rectum, respectively. The Isobio software is a feasible tool to generate the biological dose distribution and biological dose volume histogram for treatment plan evaluation in both EBRT and BT.
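
    For orientation, the BED-to-EQD2 conversion described in this record can be sketched with the plain linear-quadratic (LQ) formulas. This is a simplification and not the Isobio implementation: the LQL model adds a linear high-dose term whose parameters the abstract does not give, and the function names and the example alpha/beta ratio of 10 Gy are assumptions chosen only for illustration.

```python
def bed_lq(n_fractions, dose_per_fraction_gy, alpha_beta_gy):
    """Biological effective dose under the plain LQ model (assumed form).

    BED = n * d * (1 + d / (alpha/beta))
    """
    total_dose = n_fractions * dose_per_fraction_gy
    return total_dose * (1.0 + dose_per_fraction_gy / alpha_beta_gy)


def eqd2_from_bed(bed_gy, alpha_beta_gy):
    """Equivalent dose in 2 Gy fractions: EQD2 = BED / (1 + 2 / (alpha/beta))."""
    return bed_gy / (1.0 + 2.0 / alpha_beta_gy)


# Hypothetical schedule: 25 fractions of 2 Gy, alpha/beta = 10 Gy (illustrative).
bed = bed_lq(25, 2.0, 10.0)
eqd2 = eqd2_from_bed(bed, 10.0)
print(f"BED = {bed:.2f} Gy, EQD2 = {eqd2:.2f} Gy")
```

    In this form EQD2 equals the physical dose whenever the dose per fraction is exactly 2 Gy, which gives a convenient sanity check for any implementation.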

  8. GIS Tools to Estimate Average Annual Daily Traffic

    DOT National Transportation Integrated Search

    2012-06-01

    This project presents five tools that were created for a geographical information system to estimate Annual Average Daily Traffic using linear regression. Three of the tools can be used to prepare spatial data for linear regression. One tool can be...

  9. Estimating extent of mortality associated with the Douglas-fir beetle in the Central and Northern Rockies

    Treesearch

    Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini

    1999-01-01

    Data collected from Douglas-fir stands infected by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...

  10. [Prediction model of health workforce and beds in county hospitals of Hunan by multiple linear regression].

    PubMed

    Ling, Ru; Liu, Jiawang

    2011-12-01

    To construct a prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan with stratified random sampling according to uniform questionnaires, and multiple linear regression analysis with 20 quotas selected by literature review was done. Independent variables in the multiple linear regression model on medical personnel in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the population aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory power and fit, and may be used for short- and mid-term forecasting.

  11. Structural Analysis Using NX Nastran 9.0

    NASA Technical Reports Server (NTRS)

    Rolewicz, Benjamin M.

    2014-01-01

    NX Nastran is a powerful Finite Element Analysis (FEA) software package used to solve linear and non-linear models for structural and thermal systems. The software, which consists of both a solver and user interface, breaks down analysis into four files, each of which are important to the end results of the analysis. The software offers capabilities for a variety of types of analysis, and also contains a respectable modeling program. Over the course of ten weeks, I was trained to effectively implement NX Nastran into structural analysis and refinement for parts of two missions at NASA's Kennedy Space Center, the Restore mission and the Orion mission.

  12. A new approach to correct the QT interval for changes in heart rate using a nonparametric regression model in beagle dogs.

    PubMed

    Watanabe, Hiroyuki; Miyazaki, Hiroyasu

    2006-01-01

    Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correlation models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
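
    As a point of reference for the parametric corrections compared in this record, the two classical formulas can be written down directly. The sketch below uses the standard textbook forms of Bazett's and Fridericia's corrections; the function names and the sample values are illustrative assumptions, and the Loess-based correction studied in the paper is not reproduced here.

```python
def qtc_bazett(qt_ms, rr_s):
    """Bazett's correction: QTc = QT / RR^(1/2), with RR in seconds."""
    return qt_ms / rr_s ** 0.5


def qtc_fridericia(qt_ms, rr_s):
    """Fridericia's correction: QTc = QT / RR^(1/3), with RR in seconds."""
    return qt_ms / rr_s ** (1.0 / 3.0)


# Hypothetical observation: QT = 200 ms at RR = 0.5 s (heart rate 120 bpm).
qt_ms, rr_s = 200.0, 0.5
print(f"Bazett:     {qtc_bazett(qt_ms, rr_s):.1f} ms")
print(f"Fridericia: {qtc_fridericia(qt_ms, rr_s):.1f} ms")
```

    Both formulas leave QT unchanged at RR = 1 s and diverge from each other as the heart rate moves away from 60 bpm, which is one reason a single fixed exponent can over- or under-correct at fast and slow rates.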

  13. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    PubMed

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.

  14. Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines.

    PubMed

    Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William

    2016-01-01

    Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. 
The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biological meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.

  15. Mixed-Integer Conic Linear Programming: Challenges and Perspectives

    DTIC Science & Technology

    2013-10-01

    The novel DCCs for MISOCO may be used in branch-and-cut algorithms when solving MISOCO problems. The experimental software CICLO was developed to...perform limited, but rigorous computational experiments. The CICLO solver utilizes continuous SOCO solvers, MOSEK, CPLEX or SeDuMi, builds on the open...submitted Fall 2013. Software: 1. CICLO: Integer conic linear optimization package. Authors: J.C. Góez, T.K. Ralphs, Y. Fu, and T. Terlaky

  16. Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach

    NASA Astrophysics Data System (ADS)

    Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew

    2017-05-01

    This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.

  17. Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.

  18. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
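
    The idea of training a classifier on the ground and then evaluating it as a trigger can be sketched in a few lines. Everything below is a toy under stated assumptions: the data are synthetic, the two noisy features stand in for whatever measurements a real system would use, and the names fit_logistic and should_trigger are hypothetical rather than taken from the paper or from any flight software.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(features, labels, learning_rate=0.5, n_iterations=3000):
    """Fit logistic regression weights by batch gradient descent."""
    design = np.column_stack([np.ones(len(features)), features])  # add intercept
    weights = np.zeros(design.shape[1])
    for _ in range(n_iterations):
        predicted = sigmoid(design @ weights)
        gradient = design.T @ (predicted - labels) / len(labels)
        weights -= learning_rate * gradient
    return weights


def should_trigger(weights, measurement, threshold=0.5):
    """Evaluate the trained classifier on one measurement vector."""
    score = weights[0] + np.dot(weights[1:], measurement)
    return bool(sigmoid(score) >= threshold)


# Synthetic "ground processing" data: a hidden normalized state, and two
# noisy measurements of it. The trigger condition is state < 0 (assumed).
rng = np.random.default_rng(0)
true_state = rng.uniform(-1.0, 1.0, size=500)
labels = (true_state < 0.0).astype(float)
features = np.column_stack([
    true_state + rng.normal(0.0, 0.3, size=500),
    true_state + rng.normal(0.0, 0.3, size=500),
])

weights = fit_logistic(features, labels)
design = np.column_stack([np.ones(len(features)), features])
accuracy = float(np.mean((sigmoid(design @ weights) >= 0.5) == (labels == 1.0)))
print(f"training accuracy: {accuracy:.3f}")
```

    Once the weights are fixed, the run-time check reduces to one dot product and one comparison per measurement, which suits an approach where the fitting is done before flight and only the classifier runs in real time.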

  19. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles

  20. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter. In order to increase overall robustness, the vehicle also has an alternate method of triggering the drogue parachute deployment based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this velocity-based trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers excellent performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  1. Stress Induced in Periodontal Ligament under Orthodontic Loading (Part II): A Comparison of Linear Versus Non-Linear Fem Study.

    PubMed

    Hemanth, M; Deoli, Shilpi; Raghuveer, H P; Rani, M S; Hegde, Chatura; Vedavathi, B

    2015-09-01

    Simulation of periodontal ligament (PDL) using non-linear finite element method (FEM) analysis gives better insight into understanding of the biology of tooth movement. The stresses in the PDL were evaluated for intrusion and lingual root torque using non-linear properties. A three-dimensional (3D) FEM model of the maxillary incisors was generated using Solidworks modeling software. Stresses in the PDL were evaluated for intrusive and lingual root torque movements by 3D FEM using ANSYS software. These stresses were compared between linear and non-linear analyses. With linear properties, for intrusive and lingual root torque movements, the distribution of stress over the PDL was within the range of optimal stress value as proposed by Lee, but exceeded the force system given by Proffit as optimum forces for orthodontic tooth movement. When the same force load was applied in non-linear analysis, stresses were higher than in linear analysis and were beyond the optimal stress range as proposed by Lee for both intrusive and lingual root torque. To get the same stress as in linear analysis, iterations were done using non-linear properties and the force level was reduced. This shows that the force level required in non-linear analysis is lower than that in linear analysis.

  2. Deriving the Cost of Software Maintenance for Software Intensive Systems

    DTIC Science & Technology

    2011-08-29

    more of software maintenance). Figure 4. SEER-SEM Maintenance Effort by Year Report (Reifer, Allen, Fersch, Hitchings, Judy, & Rosa, 2010...understand the linear relationship between two variables. The formula for the simple Pearson product-moment correlation is represented in Equation 5...standardization is required across the software maintenance community in order to ensure that the data being recorded can be employed beyond the agency or

  3. Scoring and staging systems using cox linear regression modeling and recursive partitioning.

    PubMed

    Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H

    2006-01-01

    Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.

  4. What are the predictor variables of social well-being among the medical science students?

    PubMed

    Javadi-Pashaki, Nazila; Darvishpour, Azar

    2018-01-01

    Individuals with social well-being can cope more successfully with major problems of social roles. Due to the social nature of human life, the social aspect of health cannot be ignored. The purpose of this study was to identify variables that predict the social well-being of medical students. A descriptive-analytical study was conducted on 489 medical science students of Gilan Province, in the north of Iran, from May to September 2016. The samples were selected using the quota sampling method. The research instrument was a questionnaire consisting of two parts: a demographic section and the Keyes social well-being questionnaire. Data analysis was done using SPSS software version 19 with descriptive and inferential statistics (t-test, ANOVA, and linear regression). The results showed that the majority of the students had average social well-being. Furthermore, a significant relationship was seen between social well-being and academic degree (P = 0.009), major (P = 0.0001), and the interest and field's satisfaction (P = 0.0001). The results of the linear regression model showed that four variables (academic degree, major, group membership, and the interest and field's satisfaction) were significantly associated with social well-being (P < 0.05). The findings demonstrate that demographic factors have different effects on social well-being and that these factors need further consideration. Thus, health and education authorities are advised to pay attention to students' academic degree, major, group membership, and the interest and field's satisfaction to upgrade and maintain the level of their social well-being.

  5. Comparison of Linear and Non-linear Regression Analysis to Determine Pulmonary Pressure in Hyperthyroidism.

    PubMed

    Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan

    2017-01-01

    This study aimed at assessing the incidence of pulmonary hypertension (PH) in newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension, the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by a linear or non-linear function. By applying the linear regression method described by a first-degree equation, the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second-degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of the AIC criterion (criterion 3) and use of the F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known factors (level of free thyroxin, pulmonary vascular resistance, cardiac output) and new factors identified in this study (pretreatment period, age, systolic blood pressure). According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola-type) is better than the linear one.
The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.
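
    The comparison of a first-degree and a second-degree fit by determination coefficient, AIC and F-test can be sketched as follows. This is a minimal illustration on synthetic data, not a reconstruction of the study: the data, the coefficient values, and the helper names are assumptions, and the AIC is the Gaussian least-squares form up to an additive constant that cancels when the two models are compared.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns R^2, AIC and residual sum of squares."""
    coefficients = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coefficients, x)
    rss = float(np.sum(residuals ** 2))
    tss = float(np.sum((y - np.mean(y)) ** 2))
    n, k = len(y), degree + 1  # k counts the fitted coefficients
    r_squared = 1.0 - rss / tss
    aic = n * np.log(rss / n) + 2 * k  # constant terms omitted
    return r_squared, aic, rss


def nested_f_statistic(rss_small, rss_large, n, k_small, k_large):
    """F statistic for a smaller model nested inside a larger one."""
    numerator = (rss_small - rss_large) / (k_large - k_small)
    denominator = rss_large / (n - k_large)
    return numerator / denominator


# Synthetic data with genuine curvature (hypothetical values).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * x + 0.3 * x ** 2 + rng.normal(0.0, 1.0, size=x.size)

r2_lin, aic_lin, rss_lin = fit_polynomial(x, y, 1)
r2_quad, aic_quad, rss_quad = fit_polynomial(x, y, 2)
f_stat = nested_f_statistic(rss_lin, rss_quad, len(y), 2, 3)

print(f"linear:    R^2 = {r2_lin:.3f}, AIC = {aic_lin:.1f}")
print(f"quadratic: R^2 = {r2_quad:.3f}, AIC = {aic_quad:.1f}")
print(f"F statistic for the added quadratic term: {f_stat:.1f}")
```

    Because the linear model is nested in the quadratic one, R^2 can never decrease when the squared term is added; the AIC penalty and the F-test are what indicate whether the improvement is larger than expected from the extra coefficient alone.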

  6. Agile Acceptance Test–Driven Development of Clinical Decision Support Advisories: Feasibility of Using Open Source Software

    PubMed Central

    Baldwin, Krystal L; Kannan, Vaishnavi; Flahaven, Emily L; Parks, Cassandra J; Ott, Jason M; Willett, Duwayne L

    2018-01-01

    Background Moving to electronic health records (EHRs) confers substantial benefits but risks unintended consequences. Modern EHRs consist of complex software code with extensive local configurability options, which can introduce defects. Defects in clinical decision support (CDS) tools are surprisingly common. Feasible approaches to prevent and detect defects in EHR configuration, including CDS tools, are needed. In complex software systems, use of test–driven development and automated regression testing promotes reliability. Test–driven development encourages modular, testable design and expanding regression test coverage. Automated regression test suites improve software quality, providing a “safety net” for future software modifications. Each automated acceptance test serves multiple purposes, as requirements (prior to build), acceptance testing (on completion of build), regression testing (once live), and “living” design documentation. Rapid-cycle development or “agile” methods are being successfully applied to CDS development. The agile practice of automated test–driven development is not widely adopted, perhaps because most EHR software code is vendor-developed. However, key CDS advisory configuration design decisions and rules stored in the EHR may prove amenable to automated testing as “executable requirements.” Objective We aimed to establish feasibility of acceptance test–driven development of clinical decision support advisories in a commonly used EHR, using an open source automated acceptance testing framework (FitNesse). Methods Acceptance tests were initially constructed as spreadsheet tables to facilitate clinical review. Each table specified one aspect of the CDS advisory’s expected behavior. Table contents were then imported into a test suite in FitNesse, which queried the EHR database to automate testing. 
Tests and corresponding CDS configuration were migrated together from the development environment to production, with tests becoming part of the production regression test suite. Results We used test–driven development to construct a new CDS tool advising Emergency Department nurses to perform a swallowing assessment prior to administering oral medication to a patient with suspected stroke. Test tables specified desired behavior for (1) applicable clinical settings, (2) triggering action, (3) rule logic, (4) user interface, and (5) system actions in response to user input. Automated test suite results for the “executable requirements” are shown prior to building the CDS alert, during build, and after successful build. Conclusions Automated acceptance test–driven development and continuous regression testing of CDS configuration in a commercial EHR proves feasible with open source software. Automated test–driven development offers one potential contribution to achieving high-reliability EHR configuration. Vetting acceptance tests with clinicians elicits their input on crucial configuration details early during initial CDS design and iteratively during rapid-cycle optimization. PMID:29653922

  7. Agile Acceptance Test-Driven Development of Clinical Decision Support Advisories: Feasibility of Using Open Source Software.

    PubMed

    Basit, Mujeeb A; Baldwin, Krystal L; Kannan, Vaishnavi; Flahaven, Emily L; Parks, Cassandra J; Ott, Jason M; Willett, Duwayne L

    2018-04-13

    Moving to electronic health records (EHRs) confers substantial benefits but risks unintended consequences. Modern EHRs consist of complex software code with extensive local configurability options, which can introduce defects. Defects in clinical decision support (CDS) tools are surprisingly common. Feasible approaches to prevent and detect defects in EHR configuration, including CDS tools, are needed. In complex software systems, use of test-driven development and automated regression testing promotes reliability. Test-driven development encourages modular, testable design and expanding regression test coverage. Automated regression test suites improve software quality, providing a "safety net" for future software modifications. Each automated acceptance test serves multiple purposes, as requirements (prior to build), acceptance testing (on completion of build), regression testing (once live), and "living" design documentation. Rapid-cycle development or "agile" methods are being successfully applied to CDS development. The agile practice of automated test-driven development is not widely adopted, perhaps because most EHR software code is vendor-developed. However, key CDS advisory configuration design decisions and rules stored in the EHR may prove amenable to automated testing as "executable requirements." We aimed to establish feasibility of acceptance test-driven development of clinical decision support advisories in a commonly used EHR, using an open source automated acceptance testing framework (FitNesse). Acceptance tests were initially constructed as spreadsheet tables to facilitate clinical review. Each table specified one aspect of the CDS advisory's expected behavior. Table contents were then imported into a test suite in FitNesse, which queried the EHR database to automate testing. 
Tests and corresponding CDS configuration were migrated together from the development environment to production, with tests becoming part of the production regression test suite. We used test-driven development to construct a new CDS tool advising Emergency Department nurses to perform a swallowing assessment prior to administering oral medication to a patient with suspected stroke. Test tables specified desired behavior for (1) applicable clinical settings, (2) triggering action, (3) rule logic, (4) user interface, and (5) system actions in response to user input. Automated test suite results for the "executable requirements" are shown prior to building the CDS alert, during build, and after successful build. Automated acceptance test-driven development and continuous regression testing of CDS configuration in a commercial EHR proves feasible with open source software. Automated test-driven development offers one potential contribution to achieving high-reliability EHR configuration. Vetting acceptance tests with clinicians elicits their input on crucial configuration details early during initial CDS design and iteratively during rapid-cycle optimization. ©Mujeeb A Basit, Krystal L Baldwin, Vaishnavi Kannan, Emily L Flahaven, Cassandra J Parks, Jason M Ott, Duwayne L Willett. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 13.04.2018.
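
    The "executable requirements" idea in this abstract can be illustrated with a table-driven check. The sketch below is plain Python, not FitNesse, and does not query an EHR; the `Encounter` fields, the `advisory_fires` rule, and every table row are hypothetical stand-ins for the rule-logic table the authors describe, including the assumption that a completed swallow screen suppresses the advisory.

```python
from dataclasses import dataclass


@dataclass
class Encounter:
    # Hypothetical fields; a real test would read these from the EHR database.
    department: str
    suspected_stroke: bool
    swallow_screen_done: bool
    medication_route: str


def advisory_fires(e: Encounter) -> bool:
    """Assumed rule: advise a swallow screen before oral medication in the ED."""
    return (
        e.department == "ED"
        and e.suspected_stroke
        and not e.swallow_screen_done
        and e.medication_route == "oral"
    )


# Each row is one acceptance test: inputs followed by the expected outcome.
RULE_TABLE = [
    # department, suspected_stroke, swallow_screen_done, route, expected
    ("ED", True, False, "oral", True),
    ("ED", True, True, "oral", False),
    ("ED", False, False, "oral", False),
    ("ED", True, False, "IV", False),
    ("ICU", True, False, "oral", False),
]


def run_table(table):
    """Return the rows whose actual result differs from the expected result."""
    failures = []
    for dept, stroke, screened, route, expected in table:
        actual = advisory_fires(Encounter(dept, stroke, screened, route))
        if actual != expected:
            failures.append((dept, stroke, screened, route, expected, actual))
    return failures


failures = run_table(RULE_TABLE)
```

    Keeping the table separate from the rule mirrors the workflow in the abstract: clinicians can review the rows as a spreadsheet, and the same rows can be rerun as a regression suite after each configuration change.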

  8. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    EPA Science Inventory

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  9. Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions

    PubMed Central

    Fernandes, Bruno J. T.; Roque, Alexandre

    2018-01-01

    Height and weight are measurements used to track nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, they can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all the cases, the predictions are insensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366

  10. Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

    NASA Astrophysics Data System (ADS)

    Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

    2009-08-01

    In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.

  11. Improving Prediction Accuracy for WSN Data Reduction by Applying Multivariate Spatio-Temporal Correlation

    PubMed Central

    Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman

    2011-01-01

    This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626

  12. AZTEC. Parallel Iterative method Software for Solving Linear Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hutchinson, S.; Shadid, J.; Tuminaro, R.

    1995-07-01

    AZTEC is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax=b where A is a user supplied n X n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. AZTEC is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparse unstructured matrices for parallel solutions.
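
    To make the problem statement concrete, the sketch below solves Ax=b iteratively for a small sparse matrix. It is a serial, pure-Python conjugate gradient loop chosen for illustration only; it is not AZTEC's API, it does no parallel work, and the dictionary-of-rows storage and the 5 x 5 test matrix are assumptions made here.

```python
def matvec(rows, x):
    """Multiply a sparse matrix, stored as {row: {col: value}}, by a dense vector."""
    return [sum(v * x[j] for j, v in rows[i].items()) for i in range(len(x))]


def conjugate_gradient(rows, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite sparse matrix A."""
    x = [0.0] * len(b)
    r = b[:]          # residual b - A x for the initial guess x = 0
    p = r[:]          # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break     # residual norm is small enough
        ap = matvec(rows, p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical test system: tridiagonal matrix with 2 on the diagonal, -1 beside it.
n = 5
A = {i: {} for i in range(n)}
for i in range(n):
    A[i][i] = 2.0
    if i > 0:
        A[i][i - 1] = -1.0
    if i < n - 1:
        A[i][i + 1] = -1.0

b = matvec(A, [1.0] * n)   # right-hand side whose exact solution is all ones
x = conjugate_gradient(A, b)
```

    Only the nonzero entries are stored and touched in each matrix-vector product, which is the property that makes iterative methods attractive for the large sparse systems the abstract targets.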

  13. Age estimation by canines' pulp/tooth ratio in an Iranian population using digital panoramic radiography.

    PubMed

    Dehghani, Mahdieh; Shadkam, Elaheh; Ahrari, Farzaneh; Dehghani, Mahboobe

    2018-04-01

    Age estimation in adults is an important issue in forensic science. This study aimed to estimate the chronological age of Iranians by means of pulp/tooth area ratio (AR) of canines in digital panoramic radiographs. The sample consisted of panoramic radiographs of 271 male and female subjects aged 16-64 years. The pulp/tooth area ratio (AR) of upper and lower canines was calculated by AutoCAD software. Data were subjected to correlation and regression analysis. There was a significant and inverse correlation between age and pulp/tooth area ratio of upper and lower canines (r=-0.794 for upper canine and r=-0.282 for lower canine; p-value<0.001). Linear regression equations were derived separately for upper, lower and both canines. The mean difference between actual and estimated age using upper canine was 6.07±1.7. The results showed that the pulp/tooth area ratios of canines are a reliable method for age estimation in Iranians. The pulp/tooth area ratio of upper canine was better correlated with chronological age than that of lower canine. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. Prediction of anthropometric measurements from tooth length--A Dravidian study.

    PubMed

    Sunitha, J; Ananthalakshmi, R; Sathiya, Jeeva J; Nadeem, Jeddy; Dhanarathnam, Shanmugam

    2015-12-01

    Anthropometric measurement is essential for identification of both victims and suspects. Often, this data is not readily available in a crime scene situation. The availability of one data set should help in predicting the other. This study was hypothesised on the basis of a correlation and geometry between the tooth length and various body measurements. The aims were to correlate face, palm, foot and stature measurements with tooth length, and to derive a regression formula to estimate the various measurements from tooth length. The present study was conducted on Dravidian dental students in the age group 18 - 25 with a sample size of 372. All of the dental and physical parameters were measured using standard anthropometric equipment and techniques. The data was analysed using SPSS software and the methods used for statistical analysis were linear regression analysis and Pearson correlation. The parameters (incisor height (IH), face height (FH), palm length (PL), foot length (FL) and stature (S)) showed nil to mild correlation (R = 0.2 ≤ 0.4), except for palm length (PL) and foot length (FL) (R > 0.6). It is concluded that odontometric data is not a reliable source for estimating the face height (FH), palm length (PL), foot length (FL) and stature (S).

  15. Regression tree modeling of forest NPP using site conditions and climate variables across eastern USA

    NASA Astrophysics Data System (ADS)

    Kwon, Y.

    2013-12-01

    As evidence of global warming continues to increase, being able to predict forest response to climate changes, such as the expected rise of temperature and precipitation, will be vital for maintaining the sustainability and productivity of forests. Mapping forest species redistribution under climate change scenarios has been successful; however, most species redistribution maps lack the mechanistic understanding to explain why trees grow under the novel conditions of a changing climate. A distributional map can only predict under the equilibrium assumption that the communities would exist following a prolonged period under the new climate. In this context, forest NPP as a surrogate for growth rate, the most important facet that determines stand dynamics, can lead to valid prediction on the transition stage to a new vegetation-climate equilibrium, as it represents changes in the structure of forest reflecting site conditions and climate factors. The objective of this study is to develop a forest growth map using regression tree analysis by extracting large-scale non-linear structures from both field-based FIA and remotely sensed MODIS data sets. The major issue addressed in this approach is non-linear spatial patterns of forest attributes. Forest inventory data showed complex spatial patterns that reflect environmental states and processes that originate at different spatial scales. At broad scales, non-linear spatial trends in forest attributes and a mixture of continuous and discrete types of environmental variables make traditional statistical (multivariate regression) and geostatistical (kriging) models inefficient. This calls into question some traditional underlying assumptions of spatial trends that are uncritically accepted in forest data. To solve the controversy surrounding the suitability of forest data, regression tree analyses are performed using the software See5 and Cubist. 
Four publicly available data sets were obtained: First, the field-based Forest Inventory and Analysis (USDA, Forest Service) data set for the 31 easternmost United States. Second, 8-day composites of MODIS Land Cover, FPAR, LAI and GPP/NPP data were obtained from Jan 2001 to Dec 2004 (182 composites in total), and each product was filtered by pixel-level quality assurance data to select best quality pixels. Third, 30-year averaged climate data were collected from the National Oceanic and Atmospheric Administration (NOAA) and five climatic variables were obtained: monthly temperature, precipitation, annual heating and cooling days, and annual frost-free days. Fourth, topographic data were obtained from a digital elevation model (1km by 1km). This research will provide a better understanding of large-scale forest responses to environmental factors that will be beneficial for the development of important forest management applications.

  16. Accuracy of Estimation of Graft Size for Living-Related Liver Transplantation: First Results of a Semi-Automated Interactive Software for CT-Volumetry

    PubMed Central

    Mokry, Theresa; Bellemann, Nadine; Müller, Dirk; Lorenzo Bermejo, Justo; Klauß, Miriam; Stampfl, Ulrike; Radeleff, Boris; Schemmer, Peter; Kauczor, Hans-Ulrich; Sommer, Christof-Matthias

    2014-01-01

    Objectives To evaluate accuracy of estimated graft size for living-related liver transplantation using a semi-automated interactive software for CT-volumetry. Materials and Methods Sixteen donors for living-related liver transplantation (11 male; mean age: 38.2±9.6 years) underwent contrast-enhanced CT prior to graft removal. CT-volumetry was performed using a semi-automated interactive software (P), and compared with a manual commercial software (TR). For P, liver volumes were provided either with or without vessels. For TR, liver volumes were provided always with vessels. Intraoperative weight served as reference standard. Major study goals included analyses of volumes using absolute numbers, linear regression analyses and inter-observer agreements. Minor study goals included the description of the software workflow: degree of manual correction, speed for completion, and overall intuitiveness using five-point Likert scales: 1–markedly lower/faster/higher for P compared with TR, 2–slightly lower/faster/higher for P compared with TR, 3–identical for P and TR, 4–slightly lower/faster/higher for TR compared with P, and 5–markedly lower/faster/higher for TR compared with P. Results Liver segments II/III, II–IV and V–VIII served in 6, 3, and 7 donors as transplanted liver segments. Volumes were 642.9±368.8 ml for TR with vessels, 623.8±349.1 ml for P with vessels, and 605.2±345.8 ml for P without vessels (P<0.01). Regression equations between intraoperative weights and volumes were y = 0.94x+30.1 (R2 = 0.92; P<0.001) for TR with vessels, y = 1.00x+12.0 (R2 = 0.92; P<0.001) for P with vessels, and y = 1.01x+28.0 (R2 = 0.92; P<0.001) for P without vessels. Inter-observer agreement showed a bias of 1.8 ml for TR with vessels, 5.4 ml for P with vessels, and 4.6 ml for P without vessels. For the degree of manual correction, speed for completion and overall intuitiveness, scale values were 2.6±0.8, 2.4±0.5 and 2. 
Conclusions CT-volumetry performed with P can predict accurately graft size for living-related liver transplantation while improving workflow compared with TR. PMID:25330198

  17. Optimal experimental designs for estimating Henry's law constants via the method of phase ratio variation.

    PubMed

    Kapelner, Adam; Krieger, Abba; Blanford, William J

    2016-10-14

    When measuring Henry's law constants (k_H) using the phase ratio variation (PRV) method via headspace gas chromatography (GC), the value of k_H of the compound under investigation is calculated from the ratio of the slope to the intercept of a linear regression of the inverse GC response versus the ratio of gas to liquid volumes of a series of vials drawn from the same parent solution. Thus, an experimenter collects measurements consisting of the independent variable (the gas/liquid volume ratio) and dependent variable (the GC^-1 peak area). A review of the literature found that the common design is a simple uniform spacing of liquid volumes. We present an optimal experimental design which estimates k_H with minimum error and provides multiple means for building confidence intervals for such estimates. We illustrate performance improvements of our design with an example measuring the k_H for Naphthalene in aqueous solution as well as simulations on previous studies. Our designs are most applicable after a trial run defines the linear GC response and the linear phase ratio to the GC^-1 region (where the PRV method is suitable) after which a practitioner can collect measurements in bulk. The designs can be easily computed using our open source software optDesignSlopeInt, an R package on CRAN. Copyright © 2016 Elsevier B.V. All rights reserved.
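
    The slope-to-intercept calculation described in this abstract can be sketched in a few lines. The example below is written in Python rather than with the authors' R package optDesignSlopeInt; the function names and the noise-free data are hypothetical, and the data are generated from a known line so the recovered ratio can be checked.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def henry_constant_prv(volume_ratios, peak_areas):
    """Estimate k_H as slope / intercept of inverse peak area vs. gas/liquid ratio."""
    inverse_response = [1.0 / a for a in peak_areas]
    slope, intercept = fit_line(volume_ratios, inverse_response)
    return slope / intercept


# Hypothetical vials: inverse peak area follows 0.5 + 0.1 * ratio exactly.
ratios = [0.5, 1.0, 2.0, 4.0, 8.0]
areas = [1.0 / (0.5 + 0.1 * r) for r in ratios]

k_h = henry_constant_prv(ratios, areas)   # slope / intercept = 0.1 / 0.5
```

    Because k_H is a ratio of two fitted quantities, its error depends on where the volume ratios are placed, which is the design question the paper addresses; this sketch does not attempt the optimal design itself.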

  18. Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients

    NASA Astrophysics Data System (ADS)

    Gorgees, HazimMansoor; Mahdi, FatimahAssim

    2018-05-01

    This article is concerned with comparing the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ data obtained from the tagi gas filling company during the period 2008-2010. The main result is that the method based on the condition number performs better than the other methods, since it has a smaller mean square error (MSE).
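
    For readers unfamiliar with the estimator being compared, the ordinary ridge estimate is the solution of (X'X + kI) b = X'y. The sketch below computes it in pure Python for a nearly collinear design; the data and the value k = 0.1 are hypothetical, and the paper's condition-number rule for choosing k is not reproduced here.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge(X, y, k):
    """Ridge estimate without intercept: solve (X'X + k I) b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for i in range(p):
        xtx[i][i] += k
    return solve(xtx, xty)


# Hypothetical data: the two predictors are almost identical, and y was built
# to equal -8*x1 + 10*x2 exactly, so least squares recovers roughly (-8, 10).
X = [[1.0, 1.01], [2.0, 1.99], [3.0, 3.02], [4.0, 3.98], [5.0, 5.01]]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

beta_ols = ridge(X, y, 0.0)     # k = 0 reduces to ordinary least squares
beta_ridge = ridge(X, y, 0.1)   # a small penalty shrinks the coefficients
```

    With k = 0 the large, opposite-signed coefficients are a symptom of the near-exact linear relationship between the predictors; the penalty trades a little bias for a much shorter coefficient vector.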

  19. Computing and software

    USGS Publications Warehouse

    White, Gary C.; Hines, J.E.

    2004-01-01

    The reality is that the statistical methods used for analysis of data depend upon the availability of software. Analysis of marked animal data is no different than the rest of the statistical field. The methods used for analysis are those that are available in reliable software packages. Thus, the critical importance of having reliable, up–to–date software available to biologists is obvious. Statisticians have continued to develop more robust models, ever expanding the suite of potential analysis methods available. But without software to implement these newer methods, they will languish in the abstract, and not be applied to the problems deserving them. In the Computers and Software Session, two new software packages are described, a comparison of implementation of methods for the estimation of nest survival is provided, and a more speculative paper about how the next generation of software might be structured is presented. Rotella et al. (2004) compare nest survival estimation with different software packages: SAS logistic regression, SAS non–linear mixed models, and Program MARK. Nests are assumed to be visited at various, possibly infrequent, intervals. All of the approaches described compute nest survival with the same likelihood, and require that the age of the nest is known to account for nests that eventually hatch. However, each approach offers advantages and disadvantages, explored by Rotella et al. (2004). Efford et al. (2004) present a new software package called DENSITY. The package computes population abundance and density from trapping arrays and other detection methods with a new and unique approach. DENSITY represents the first major addition to the analysis of trapping arrays in 20 years. Barker & White (2004) discuss how existing software such as Program MARK requires that each new model’s likelihood must be programmed specifically for that model. 
They wishfully think that future software might allow the user to combine pieces of likelihood functions together to generate estimates. The idea is interesting, and maybe some bright young statistician can work out the specifics to implement the procedure. Choquet et al. (2004) describe MSURGE, a software package that implements the multistate capture–recapture models. The unique feature of MSURGE is that the design matrix is constructed with an interpreted language called GEMACO. Because MSURGE is limited to just multistate models, the special requirements of these likelihoods can be provided. The software and methods presented in these papers give biologists and wildlife managers an expanding range of possibilities for data analysis. Although ease–of–use is generally getting better, it does not replace the need for understanding of the requirements and structure of the models being computed. The internet provides access to many free software packages as well as user–discussion groups to share knowledge and ideas. (A starting point for wildlife–related applications is http://www.phidot.org.)

  20. Microcomputer Scheduling of Reference Desk Staff.

    ERIC Educational Resources Information Center

    Cornick, Donna; Owen, Willy

    1988-01-01

    Presents a model that can accommodate staff preferences when determining a reference desk schedule using a microcomputer, the Lotus 1-2-3 spreadsheet software, and the linear programing software LP83. (eight references) (MES)

  1. Comparative study of age estimation using dentinal translucency by digital and conventional methods.

    PubMed

    Bommannavar, Sushma; Kulkarni, Meena

    2015-01-01

    Estimating age using the dentition plays a significant role in identification of the individual in forensic cases. Teeth are one of the most durable and strongest structures in the human body. The morphology and arrangement of teeth vary from person-to-person and is unique to an individual as are the fingerprints. Therefore, the use of dentition is the method of choice in the identification of the unknown. Root dentin translucency is considered to be one of the best parameters for dental age estimation. Traditionally, root dentin translucency was measured using calipers. Recently, the use of custom built software programs have been proposed for the same. The present study describes a method to measure root dentin translucency on sectioned teeth using a custom built software program Adobe Photoshop 7.0 version (Adobe system Inc, Mountain View California). A total of 50 single rooted teeth were sectioned longitudinally to derive a 0.25 mm uniform thickness and the root dentin translucency was measured using digital and caliper methods and compared. The Gustafson's morphohistologic approach is used in this study. Correlation coefficients of translucency measurements to age were statistically significant for both the methods (P < 0.125) and linear regression equations derived from both methods revealed better ability of the digital method to assess age. The custom built software program used in the present study is commercially available and widely used image editing software. Furthermore, this method is easy to use and less time consuming. The measurements obtained using this method are more precise and thus help in more accurate age estimation. Considering these benefits, the present study recommends the use of digital method to assess translucency for age estimation.

  2. Comparative study of age estimation using dentinal translucency by digital and conventional methods

    PubMed Central

    Bommannavar, Sushma; Kulkarni, Meena

    2015-01-01

    Introduction: Estimating age using the dentition plays a significant role in identification of the individual in forensic cases. Teeth are one of the most durable and strongest structures in the human body. The morphology and arrangement of teeth vary from person-to-person and is unique to an individual as are the fingerprints. Therefore, the use of dentition is the method of choice in the identification of the unknown. Root dentin translucency is considered to be one of the best parameters for dental age estimation. Traditionally, root dentin translucency was measured using calipers. Recently, the use of custom built software programs have been proposed for the same. Objectives: The present study describes a method to measure root dentin translucency on sectioned teeth using a custom built software program Adobe Photoshop 7.0 version (Adobe system Inc, Mountain View California). Materials and Methods: A total of 50 single rooted teeth were sectioned longitudinally to derive a 0.25 mm uniform thickness and the root dentin translucency was measured using digital and caliper methods and compared. The Gustafson's morphohistologic approach is used in this study. Results: Correlation coefficients of translucency measurements to age were statistically significant for both the methods (P < 0.125) and linear regression equations derived from both methods revealed better ability of the digital method to assess age. Conclusion: The custom built software program used in the present study is commercially available and widely used image editing software. Furthermore, this method is easy to use and less time consuming. The measurements obtained using this method are more precise and thus help in more accurate age estimation. Considering these benefits, the present study recommends the use of digital method to assess translucency for age estimation. PMID:25709325

  3. Java Web Start based software for automated quantitative nuclear analysis of prostate cancer and benign prostate hyperplasia.

    PubMed

    Singh, Swaroop S; Kim, Desok; Mohler, James L

    2005-05-11

    Androgen acts via androgen receptor (AR) and accurate measurement of the levels of AR protein expression is critical for prostate research. The expression of AR in paired specimens of benign prostate and prostate cancer from 20 African and 20 Caucasian Americans was compared to demonstrate an application of this system. A set of 200 immunopositive and 200 immunonegative nuclei were collected from the images using a macro developed in Image Pro Plus. Linear Discriminant and Logistic Regression analyses were performed on the data to generate classification coefficients. Classification coefficients render the automated image analysis software independent of the type of immunostaining or image acquisition system used. The image analysis software performs local segmentation and uses nuclear shape and size to detect prostatic epithelial nuclei. AR expression is described by (a) percentage of immunopositive nuclei; (b) percentage of immunopositive nuclear area; and (c) intensity of AR expression among immunopositive nuclei or areas. The percent positive nuclei and percent nuclear area were similar by race in both benign prostate hyperplasia and prostate cancer. In prostate cancer epithelial nuclei, African Americans exhibited 38% higher levels of AR immunostaining than Caucasian Americans (two sided Student's t-tests; P < 0.05). Intensity of AR immunostaining was similar between races in benign prostate. The differences measured in the intensity of AR expression in prostate cancer were consistent with previous studies. Classification coefficients are required due to non-standardized immunostaining and image collection methods across medical institutions and research laboratories and helps customize the software for the specimen under study. The availability of a free, automated system creates new opportunities for testing, evaluation and use of this image analysis system by many research groups who study nuclear protein expression.

  4. libdrdc: software standards library

    NASA Astrophysics Data System (ADS)

    Erickson, David; Peng, Tie

    2008-04-01

    This paper presents the libdrdc software standards library including internal nomenclature, definitions, units of measure, coordinate reference frames, and representations for use in autonomous systems research. This library is a configurable, portable C-function wrapped C++ / Object Oriented C library developed to be independent of software middleware, system architecture, processor, or operating system. It is designed to use the automatically-tuned linear algebra suite (ATLAS) and Basic Linear Algebra Suite (BLAS) and port to firmware and software. The library goal is to unify data collection and representation for various microcontrollers and Central Processing Unit (CPU) cores and to provide a common Application Binary Interface (ABI) for research projects at all scales. The library supports multi-platform development and currently works on Windows, Unix, GNU/Linux, and Real-Time Executive for Multiprocessor Systems (RTEMS). This library is made available under LGPL version 2.1 license.

  5. PB-AM: An open-source, fully analytical linear poisson-boltzmann solver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Felberg, Lisa E.; Brookes, David H.; Yap, Eng-Hui

    2016-11-02

    We present the open source distributed software package Poisson-Boltzmann Analytical Method (PB-AM), a fully analytical solution to the linearized Poisson Boltzmann equation. The PB-AM software package includes the generation of output files appropriate for visualization using VMD, a Brownian dynamics scheme that uses periodic boundary conditions to simulate dynamics, the ability to specify docking criteria, and offers two different kinetics schemes to evaluate biomolecular association rate constants. Given that PB-AM defines mutual polarization completely and accurately, it can be refactored as a many-body expansion to explore 2- and 3-body polarization. Additionally, the software has been integrated into the Adaptive Poisson-Boltzmann Solver (APBS) software package to make it more accessible to a larger group of scientists, educators and students that are more familiar with the APBS framework.

  6. Modeling and managing risk early in software development

    NASA Technical Reports Server (NTRS)

    Briand, Lionel C.; Thomas, William M.; Hetmanski, Christopher J.

    1993-01-01

    In order to improve the quality of the software development process, we need to be able to build empirical multivariate models based on data collectable early in the software process. These models need to be both useful for prediction and easy to interpret, so that remedial actions may be taken in order to control and optimize the development process. We present an automated modeling technique which can be used as an alternative to regression techniques. We show how it can be used to facilitate the identification and aid the interpretation of the significant trends which characterize 'high risk' components in several Ada systems. Finally, we evaluate the effectiveness of our technique based on a comparison with logistic regression based models.

  7. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    PubMed

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
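
    The cross-validated R² reported in this record can be illustrated with a minimal sketch. It covers only the ordinary least squares side of the comparison (KRLS is not implemented here), uses synthetic data, and pools out-of-fold predictions into a single R². The function name `cv_r2`, the fold count, and the predictors are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def cv_r2(X, y, k=5, seed=0):
    """Pooled out-of-fold R^2 for an ordinary least squares fit (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    Xd = np.column_stack([np.ones(len(y)), X])   # add an intercept column
    pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(idx, test)          # all rows not in this fold
        beta, *_ = np.linalg.lstsq(Xd[train], y[train], rcond=None)
        pred[test] = Xd[test] @ beta             # predict the held-out rows
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for road-segment data (not the Montreal data set).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 1.0 + X_demo @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.5, size=200)
print(f"5-fold cross-validated R^2: {cv_r2(X_demo, y_demo):.2f}")
```

    Because each prediction comes from a model that never saw that observation, this figure is usually lower than the in-sample R², which matches the gap between the 62% and 0.60 values quoted above.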

  8. Development and Application of Nonlinear Land-Use Regression Models

    NASA Astrophysics Data System (ADS)

    Champendal, Alexandre; Kanevski, Mikhail; Huguenot, Pierre-Emmanuel

    2014-05-01

    The problem of air pollution modelling in urban zones is of great importance both from scientific and applied points of view. At present there are several fundamental approaches either based on science-based modelling (air pollution dispersion) or on the application of space-time geostatistical methods (e.g. family of kriging models or conditional stochastic simulations). Recently, there were important developments in so-called Land Use Regression (LUR) models. These models take into account geospatial information (e.g. traffic network, sources of pollution, average traffic, population census, land use, etc.) at different scales, for example, using buffering operations. Usually the dimension of the input space (number of independent variables) is within the range of (10-100). It was shown that LUR models have some potential to model complex and highly variable patterns of air pollution in urban zones. Most LUR models currently used are linear models. In the present research, nonlinear LUR models are developed and applied to the city of Geneva. Two main nonlinear data-driven models were developed: multilayer perceptron and random forest. An important part of the research deals also with a comprehensive exploratory data analysis using statistical, geostatistical and time series tools. Unsupervised self-organizing maps were applied to better understand space-time patterns of the pollution. The real data case study deals with spatial-temporal air pollution data of Geneva (2002-2011). The study focuses on nitrogen dioxide (NO2). It has effects on human health and on plants; NO2 contributes to the phenomenon of acid rain. The negative effects of nitrogen dioxide on plants are reduced growth, production and pesticide resistance. Finally, nitrogen dioxide affects materials by increasing corrosion. The data used for this study consist of a set of 106 NO2 passive sensors. 
80 were used to build the models and the remaining 36 have constituted the testing set. Missing data have been completed using multiple linear regression and annual average values of pollutant concentrations were computed. All sensors are dispersed homogeneously over the central urban area of Geneva. The main result of the study is that the nonlinear LUR models developed have demonstrated their efficiency in modelling complex phenomena of air pollution in urban zones and significantly reduced the testing error in comparison with linear models. Further research deals with the development and application of other non-linear data-driven models (Kanevski et al. 2009). References Kanevski M., Pozdnoukhov A. and Timonin V. (2009). Machine Learning for Spatial Environmental Data. Theory, Applications and Software. EPFL Press, Lausanne.

  9. Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.

    PubMed

    Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong

    2017-01-01

    This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from a 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is used as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  10. Biostatistics Series Module 10: Brief Overview of Multivariate Methods.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2017-01-01

    Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation-intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with the wider availability and increasing sophistication of statistical software, and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.

  11. A simple bias correction in linear regression for quantitative trait association under two-tail extreme selection.

    PubMed

    Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C

    2011-09-01

    Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias.

  12. Monte Carlo simulation of parameter confidence intervals for non-linear regression analysis of biological data using Microsoft Excel.

    PubMed

    Lambert, Ronald J W; Mytilinaios, Ioannis; Maitland, Luke; Brown, Angus M

    2012-08-01

    This study describes a method to obtain parameter confidence intervals from the fitting of non-linear functions to experimental data, using the SOLVER and Analysis ToolPaK Add-In of the Microsoft Excel spreadsheet. Previously we have shown that Excel can fit complex multiple functions to biological data, obtaining values equivalent to those returned by more specialized statistical or mathematical software. However, a disadvantage of using the Excel method was the inability to return confidence intervals for the computed parameters or the correlations between them. Using a simple Monte-Carlo procedure within the Excel spreadsheet (without recourse to programming), SOLVER can provide parameter estimates (up to 200 at a time) for multiple 'virtual' data sets, from which the required confidence intervals and correlation coefficients can be obtained. The general utility of the method is exemplified by applying it to the analysis of the growth of Listeria monocytogenes, the growth inhibition of Pseudomonas aeruginosa by chlorhexidine and the further analysis of the electrophysiological data from the compound action potential of the rodent optic nerve. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
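
    The Monte Carlo procedure summarized in this record can be sketched outside Excel as follows. This is not the authors' spreadsheet: a small Gauss-Newton routine stands in for SOLVER, and the exponential growth model, the noise level, and all function names are assumptions chosen for illustration. The idea is to fit the data once, build 'virtual' data sets by adding noise of the observed residual size to the fitted curve, refit each one, and read confidence limits and parameter correlations off the collection of refits.

```python
import numpy as np

def model(x, a, b):
    """Hypothetical two-parameter growth curve y = a * exp(b * x)."""
    return a * np.exp(b * x)

def fit_exponential(x, y, n_iter=50):
    """Least-squares fit by Gauss-Newton, started from a log-linear fit (needs y > 0)."""
    slope, intercept = np.polyfit(x, np.log(y), 1)
    a, b = np.exp(intercept), slope
    for _ in range(n_iter):
        resid = y - model(x, a, b)
        # Jacobian columns: d(model)/da and d(model)/db
        J = np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
        step, *_ = np.linalg.lstsq(J, resid, rcond=None)
        a, b = a + step[0], b + step[1]
        if np.max(np.abs(step)) < 1e-12:
            break
    return np.array([a, b])

def monte_carlo_ci(x, y, n_sets=200, level=0.95, seed=0):
    """Percentile confidence limits and parameter correlations from virtual data sets."""
    rng = np.random.default_rng(seed)
    est = fit_exponential(x, y)
    fitted = model(x, *est)
    resid_sd = np.sqrt(np.sum((y - fitted) ** 2) / (len(x) - len(est)))
    draws = np.array([
        fit_exponential(x, fitted + rng.normal(0.0, resid_sd, len(x)))
        for _ in range(n_sets)
    ])
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(draws, [alpha, 1.0 - alpha], axis=0)
    corr = np.corrcoef(draws.T)   # correlation between the refitted parameters
    return est, lo, hi, corr

# Synthetic example data (not from the study).
rng = np.random.default_rng(42)
x = np.linspace(0.0, 2.0, 20)
y = model(x, 2.0, 0.8) + rng.normal(0.0, 0.05, len(x))
est, lo, hi, corr = monte_carlo_ci(x, y)
for name, e, l, h in zip(("a", "b"), est, lo, hi):
    print(f"{name} = {e:.3f}  (95% interval {l:.3f} to {h:.3f})")
```

    The 200 virtual data sets mirror the "up to 200 at a time" figure in the abstract. Percentile limits are one simple choice; the study may derive its intervals differently.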

  13. Robust biological parametric mapping: an improved technique for multimodal brain image analysis

    NASA Astrophysics Data System (ADS)

    Yang, Xue; Beason-Held, Lori; Resnick, Susan M.; Landman, Bennett A.

    2011-03-01

    Mapping the quantitative relationship between structure and function in the human brain is an important and challenging problem. Numerous volumetric, surface, region of interest and voxelwise image processing techniques have been developed to statistically assess potential correlations between imaging and non-imaging metrics. Recently, biological parametric mapping has extended the widely popular statistical parametric approach to enable application of the general linear model to multiple image modalities (both for regressors and regressands) along with scalar valued observations. This approach offers great promise for direct, voxelwise assessment of structural and functional relationships with multiple imaging modalities. However, as presented, the biological parametric mapping approach is not robust to outliers and may lead to invalid inferences (e.g., artifactual low p-values) due to slight mis-registration or variation in anatomy between subjects. To enable widespread application of this approach, we introduce robust regression and robust inference in the neuroimaging context of application of the general linear model. Through simulation and empirical studies, we demonstrate that our robust approach reduces sensitivity to outliers without substantial degradation in power. The robust approach and associated software package provide a reliable way to quantitatively assess voxelwise correlations between structural and functional neuroimaging modalities.

  14. Comparison of standard maximum likelihood classification and polytomous logistic regression used in remote sensing

    Treesearch

    John Hogland; Nedret Billor; Nathaniel Anderson

    2013-01-01

    Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...

  15. A Common Mechanism for Resistance to Oxime Reactivation of Acetylcholinesterase Inhibited by Organophosphorus Compounds

    DTIC Science & Technology

    2013-01-01

    application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal

  16. Software Cost Estimating,

    DTIC Science & Technology

    1982-05-13

    Size Of The Software. A favourite measure for software system size is lines of operational code, or deliverable code (operational code plus...regression models, these conversions are either derived from productivity measures using the "cost per instruction" type of equation or they are...appropriate to different development organisations, different project types, different sets of units for measuring e and s, and different items

  17. Cognition of and Demand for Education and Teaching in Medical Statistics in China: A Systematic Review and Meta-Analysis

    PubMed Central

    Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong

    2015-01-01

    Background: Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives: This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods: We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results: There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion: The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. 
Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent. PMID:26053876

  18. Cognition of and Demand for Education and Teaching in Medical Statistics in China: A Systematic Review and Meta-Analysis.

    PubMed

    Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong

    2015-01-01

    Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.

  19. Efficient Craig Interpolation for Linear Diophantine (Dis)Equations and Linear Modular Equations

    DTIC Science & Technology

    2008-02-01

    Craig interpolants has enabled the development of powerful hardware and software model checking techniques. Efficient algorithms are known for computing...interpolants in rational and real linear arithmetic. We focus on subsets of integer linear arithmetic. Our main results are polynomial time algorithms ...congruences), and linear diophantine disequations. We show the utility of the proposed interpolation algorithms for discovering modular/divisibility predicates

  20. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
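
    The modelling step described in this record (a binary outcome, with the probability of occurrence modelled from basin characteristics) can be sketched as below. This is an illustration on synthetic data, not the study's model: the single predictor `slope_pct`, the coefficients used to simulate outcomes, and the function names are all hypothetical, and the fit uses a plain Newton-Raphson iteration rather than any particular statistics package.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, tol=1e-10):
    """Maximum-likelihood logistic regression by Newton-Raphson (intercept added here)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        grad = Xd.T @ (y - p)                       # score vector
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def predict_probability(beta, X):
    """Modelled probability that the event occurs for each row of X."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-(Xd @ beta)))

# Synthetic basins: one hypothetical predictor, percent of basin with steep slope.
rng = np.random.default_rng(0)
slope_pct = rng.uniform(0.0, 100.0, 400)
true_p = 1.0 / (1.0 + np.exp(-(-3.0 + 0.06 * slope_pct)))   # simulation assumption
debris_flow = (rng.random(400) < true_p).astype(float)      # 1 = occurred, 0 = did not

beta = fit_logistic(slope_pct, debris_flow)
p10, p80 = predict_probability(beta, np.array([10.0, 80.0]))
print(f"P(debris flow | 10% steep) = {p10:.2f}, P(debris flow | 80% steep) = {p80:.2f}")
```

    With several predictors, `X` would simply have one column per variable. Mapping the fitted probabilities back onto basins is the GIS step described in item (4) of the abstract and is outside this sketch.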

  1. Software Reviews.

    ERIC Educational Resources Information Center

    Mathematics and Computer Education, 1987

    1987-01-01

    Presented are reviews of several microcomputer software programs. Included are reviews of: (1) Microstat (Zenith); (2) MathCAD (MathSoft); (3) Discrete Mathematics (True Basic); (4) CALCULUS (True Basic); (5) Linear-Kit (John Wiley); and (6) Geometry Sensei (Broderbund). (RH)

  2. Specialization Agreements in the Council for Mutual Economic Assistance

    DTIC Science & Technology

    1988-02-01

    proportions to stabilize variance (S. Weisberg, Applied Linear Regression, 2nd ed., John Wiley & Sons, New York, 1985, p. 134). If the dependent...27, 1986, p. 3. Weisberg, S., Applied Linear Regression, 2nd ed., John Wiley & Sons, New York, 1985, p. 134. Wiles, P. J., Communist International

  3. INTRODUCTION TO A COMBINED MULTIPLE LINEAR REGRESSION AND ARMA MODELING APPROACH FOR BEACH BACTERIA PREDICTION

    EPA Science Inventory

    Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...

  4. Data Transformations for Inference with Linear Regression: Clarifications and Recommendations

    ERIC Educational Resources Information Center

    Pek, Jolynn; Wong, Octavia; Wong, C. M.

    2017-01-01

    Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…

  5. USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES

    EPA Science Inventory

    The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...

  6. Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Camilleri, Liberato; Cefai, Carmel

    2013-01-01

    Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…

  7. Simple and multiple linear regression: sample size considerations.

    PubMed

    Hanley, James A

    2016-11-01

    The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
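
    As an illustration of how a closed-form variance formula turns into a sample size consideration, the sketch below uses the standard textbook expression for the variance of one regression coefficient, Var(b_j) = sigma^2 / ((n - 1) * s_j^2 * (1 - R_j^2)), where s_j is the standard deviation of predictor j and R_j^2 is its squared multiple correlation with the other predictors. Solving for n gives the sample size needed to reach a target standard error. The function name and the example numbers are assumptions, and the article's own rearrangement may differ.

```python
import math

def n_for_slope_se(sigma, sd_x, target_se, r2_with_others=0.0):
    """Smallest n whose coefficient standard error is at most target_se.

    sigma          residual standard deviation of Y
    sd_x           standard deviation of the predictor of interest
    target_se      desired standard error of its coefficient
    r2_with_others R^2 of that predictor regressed on the other predictors
                   (0.0 for simple linear regression)
    """
    if not 0.0 <= r2_with_others < 1.0:
        raise ValueError("r2_with_others must be in [0, 1)")
    n = 1.0 + sigma**2 / (target_se**2 * sd_x**2 * (1.0 - r2_with_others))
    return math.ceil(n)

# Simple regression: sigma = 2, sd_x = 1, target standard error 0.5.
print(n_for_slope_se(2.0, 1.0, 0.5))
# Same target when half the predictor's variance is shared with confounders.
print(n_for_slope_se(2.0, 1.0, 0.5, r2_with_others=0.5))
```

    The second call shows the effect of adjusting for correlated confounders: the variance of the coefficient is inflated by 1 / (1 - R_j^2), so roughly twice as many subjects are needed in that example.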

  8. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression

    PubMed Central

    Jiang, Feng; Han, Ji-zhong

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods. PMID:29623088

  9. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression.

    PubMed

    Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.
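
    The final step named in this record, Locally Weighted Linear Regression, can be sketched in a few lines. This shows only the generic LWLR predictor with a Gaussian kernel, not the FCLWLR feature construction or the cross-domain weighting. The bandwidth `tau`, the function name, and the toy data are assumptions for illustration.

```python
import numpy as np

def lwlr_predict(x0, X, y, tau=1.0):
    """Predict y at query point x0 with locally weighted linear regression.

    A separate weighted least-squares line is fitted for each query, with
    training points weighted by a Gaussian kernel of bandwidth tau.
    """
    Xd = np.column_stack([np.ones(len(y)), X])            # add intercept column
    x0d = np.concatenate([[1.0], np.atleast_1d(x0)])
    dist2 = np.sum((Xd[:, 1:] - x0d[1:]) ** 2, axis=1)    # squared distance to x0
    w = np.exp(-dist2 / (2.0 * tau**2))                   # kernel weights
    A = Xd.T @ (Xd * w[:, None])
    b = Xd.T @ (w * y)
    theta = np.linalg.solve(A, b)                         # weighted normal equations
    return float(x0d @ theta)

# Toy one-dimensional example: a curve that a single global line would underfit.
X_toy = np.linspace(0.0, 10.0, 50)
y_toy = np.sin(X_toy)
print(f"LWLR at x = 3.0: {lwlr_predict(3.0, X_toy, y_toy, tau=0.5):.3f}")
print(f"sin(3.0):        {np.sin(3.0):.3f}")
```

    Because the coefficients are re-estimated around every query point, no single global functional form is imposed, which is the nonparametric property the abstract refers to. The bandwidth controls the trade-off: a very large `tau` approaches ordinary least squares, while a very small one can make the local system ill-conditioned.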

  10. Performance Assessment and Translation of Physiologically Based Pharmacokinetic Models From acslX to Berkeley Madonna, MATLAB, and R Language: Oxytetracycline and Gold Nanoparticles As Case Examples.

    PubMed

    Lin, Zhoumeng; Jaberi-Douraki, Majid; He, Chunla; Jin, Shiqiang; Yang, Raymond S H; Fisher, Jeffrey W; Riviere, Jim E

    2017-07-01

    Many physiologically based pharmacokinetic (PBPK) models for environmental chemicals, drugs, and nanomaterials have been developed to aid risk and safety assessments using acslX. However, acslX has been sunset since November 2015. Alternative modeling tools and tutorials are needed for future PBPK applications. This forum article aimed to: (1) demonstrate the performance of 4 PBPK modeling software packages (acslX, Berkeley Madonna, MATLAB, and R language) tested using 2 existing models (oxytetracycline and gold nanoparticles); (2) provide a tutorial of PBPK model code conversion from acslX to Berkeley Madonna, MATLAB, and R language; (3) discuss the advantages and disadvantages of each software package in the implementation of PBPK models in toxicology, and (4) share our perspective about future direction in this field. Simulation results of plasma/tissue concentrations/amounts of oxytetracycline and gold from different models were compared visually and statistically with linear regression analyses. Simulation results from the original models correlated well with results from the recoded models, with time-concentration/amount curves nearly superimposable and determination coefficients of 0.86-1.00. Step-by-step explanations of the recoding of the models in different software programs are provided in the Supplementary Data. In summary, this article presents a tutorial of PBPK model code conversion for a small molecule and a nanoparticle among 4 software packages, and a performance comparison of these software packages in PBPK model implementation. This tutorial helps beginners learn PBPK modeling, provides suggestions for selecting a suitable tool for future projects, and may lead to the transition from acslX to alternative modeling tools. © The Author 2017. Published by Oxford University Press on behalf of the Society of Toxicology. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  11. Validity and reproducibility of cephalometric measurements obtained from digital photographs of analogue headfilms.

    PubMed

    Grybauskas, Simonas; Balciuniene, Irena; Vetra, Janis

    2007-01-01

    The emerging market of digital cephalographs and computerized cephalometry is overwhelming the need to examine the advantages and drawbacks of manual cephalometry; meanwhile, small offices continue to benefit from the economic efficacy and ease of use of analogue cephalograms. The use of modern cephalometric software requires import of digital cephalograms or digital capture of analogue data: scanning and digital photography. The validity of digital photographs of analogue headfilms rather than original headfilms in clinical practice has not been well established. Digital photography could be a fast and inexpensive method of digital capture of analogue cephalograms for use in digital cephalometry. The objective of this study was to determine the validity and reproducibility of measurements obtained from digital photographs of analogue headfilms in lateral cephalometry. Analogue cephalometric radiographs were performed on 15 human dry skulls. Each of them was traced on acetate paper and photographed three times independently. Acetate tracings and digital photographs were digitized and analyzed in cephalometric software. A linear regression model, paired t-test intergroup analysis and the coefficient of repeatability were used to assess validity and reproducibility for 63 angular, linear and derivative measurements. 54 out of 63 measurements were determined to have clinically acceptable reproducibility in the acetate tracing group as well as 46 out of 63 in the digital photography group. The worst reproducibility was determined for measurements dependent on landmarks of incisors and poorly defined outlines, the majority of them being angular measurements. Validity was acceptable for all measurements, and although statistically significant differences between methods existed for as many as 15 parameters, they appeared to be clinically insignificant being smaller than 1 unit of measurement. 
Validity was acceptable for 59 of 63 measurements obtained from digital photographs, substantiating the use of digital photography for headfilm capture and computer-aided cephalometric analysis.

  12. The overlooked potential of Generalized Linear Models in astronomy-II: Gamma regression and photometric redshifts

    NASA Astrophysics Data System (ADS)

    Elliott, J.; de Souza, R. S.; Krone-Martins, A.; Cameron, E.; Ishida, E. E. O.; Hilbe, J.; COIN Collaboration

    2015-04-01

    Machine learning techniques offer a precious tool box for use within astronomy to solve problems involving so-called big data. They provide a means to make accurate predictions about a particular system without prior knowledge of the underlying physical processes of the data. In this article, and the companion papers of this series, we present the set of Generalized Linear Models (GLMs) as a fast alternative method for tackling general astronomical problems, including the ones related to the machine learning paradigm. To demonstrate the applicability of GLMs to inherently positive and continuous physical observables, we explore their use in estimating the photometric redshifts of galaxies from their multi-wavelength photometry. Using the gamma family with a log link function we predict redshifts from the PHoto-z Accuracy Testing simulated catalogue and a subset of the Sloan Digital Sky Survey from Data Release 10. We obtain fits that result in catastrophic outlier rates as low as ∼1% for simulated and ∼2% for real data. Moreover, we can easily obtain such levels of precision within a matter of seconds on a normal desktop computer and with training sets that contain merely thousands of galaxies. Our software is made publicly available as a user-friendly package developed in Python, R and via an interactive web application. This software allows users to apply a set of GLMs to their own photometric catalogues and generates publication quality plots with minimum effort. By facilitating their ease of use to the astronomical community, this paper series aims to make GLMs widely known and to encourage their implementation in future large-scale projects, such as the Large Synoptic Survey Telescope.
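    As a rough illustration of the gamma-family, log-link fit mentioned in this abstract, the following is a minimal sketch and not the authors' package. It assumes a single predictor, and the function name `fit_gamma_log_glm` and the example data are hypothetical. For the gamma family with a log link the iteratively reweighted least squares (IRLS) weights are constant, so each step reduces to an ordinary least-squares fit to a working response.

```python
import math


def fit_gamma_log_glm(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x for strictly positive y using IRLS.

    Gamma family, log link: the IRLS weight (dmu/deta)^2 / V(mu) equals
    mu^2 / mu^2 = 1, so every iteration is an unweighted least-squares
    fit of the working response z on x.
    """
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    eta = [math.log(yi) for yi in y]  # start from the log of the data
    b0 = b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(e) for e in eta]
        # Working response: z = eta + (y - mu) / mu
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        z_mean = sum(z) / n
        b1 = sum((xi - x_mean) * (zi - z_mean) for xi, zi in zip(x, z)) / sxx
        b0 = z_mean - b1 * x_mean
        eta = [b0 + b1 * xi for xi in x]
    return b0, b1


# Hypothetical, noise-free data generated from log(E[y]) = 0.5 + 1.2 * x
x_demo = [0.1 * i for i in range(20)]
y_demo = [math.exp(0.5 + 1.2 * xi) for xi in x_demo]
b0_demo, b1_demo = fit_gamma_log_glm(x_demo, y_demo)
```

    A real photometric-redshift application would use several photometric predictors and a tested GLM implementation; the single-predictor form is used here only to keep the sketch short.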

  13. MOFA Software for the COBRA Toolbox

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Griesemer, Marc; Navid, Ali

    MOFA-COBRA is a software code for Matlab that performs Multi-Objective Flux Analysis (MOFA), which involves solving linear programming problems. The leading software package for conducting different types of analyses using constraint-based models is the COBRA Toolbox for Matlab. MOFA-COBRA is an added tool for COBRA that solves multi-objective problems using a novel algorithm.

  14. Analysis of Binary Adherence Data in the Setting of Polypharmacy: A Comparison of Different Approaches

    PubMed Central

    Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.

    2009-01-01

    Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358

  15. Genetic Programming Transforms in Linear Regression Situations

    NASA Astrophysics Data System (ADS)

    Castillo, Flor; Kordon, Arthur; Villa, Carlos

    The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transforms selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.

  16. Naval Research Logistics Quarterly. Volume 28. Number 3,

    DTIC Science & Technology

    1981-09-01

    denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, NAVAL RESEARCH LOGISTICS QUARTERLY VOL. 28...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions

  17. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    NASA Astrophysics Data System (ADS)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  18. PB-AM: An open-source, fully analytical linear poisson-boltzmann solver.

    PubMed

    Felberg, Lisa E; Brookes, David H; Yap, Eng-Hui; Jurrus, Elizabeth; Baker, Nathan A; Head-Gordon, Teresa

    2017-06-05

    We present the open source distributed software package Poisson-Boltzmann Analytical Method (PB-AM), a fully analytical solution to the linearized PB equation, for molecules represented as non-overlapping spherical cavities. The PB-AM software package includes the generation of output files appropriate for visualization using visual molecular dynamics, a Brownian dynamics scheme that uses periodic boundary conditions to simulate dynamics, the ability to specify docking criteria, and offers two different kinetics schemes to evaluate biomolecular association rate constants. Given that PB-AM defines mutual polarization completely and accurately, it can be refactored as a many-body expansion to explore 2- and 3-body polarization. Additionally, the software has been integrated into the Adaptive Poisson-Boltzmann Solver (APBS) software package to make it more accessible to a larger group of scientists, educators, and students that are more familiar with the APBS framework. © 2016 Wiley Periodicals, Inc.

  19. Libraries for Software Use on Peregrine | High-Performance Computing | NREL

    Science.gov Websites

    -specific libraries. Libraries List Name Description BLAS Basic Linear Algebra Subroutines, libraries only managing hierarchically structured data. LAPACK Standard Netlib offering for computational linear algebra

  20. Non-invasive Self-Care Anemia Detection during Pregnancy Using a Smartphone Camera

    NASA Astrophysics Data System (ADS)

    Anggraeni, M. D.; Fatoni, A.

    2017-02-01

    The Indonesian maternal mortality rate is the highest in South East Asia. Postpartum hemorrhage is the major cause of maternal mortality in Indonesia. Anemia during pregnancy contributes significantly to postpartum hemorrhage. Early detection of anemia during pregnancy may save mothers from maternal death. This research aims to develop a non-invasive self-care anemia detection method based on palpebral color observation using a smartphone camera. The color intensity (Red, Green, and Blue) was measured using the Colorgrab software (Loomatix) and compared with the hemoglobin concentration of the samples, measured using a standard spectrophotometer method. The result showed that the red color intensity had a high correlation (R²=0.814) with a linear regression of y=14.486x + 50.228. This preliminary study may be used for early detection of anemia, which is more objective than the visual assessment usually performed.

  1. Commuting to work: RN travel time to employment in rural and urban areas.

    PubMed

    Rosenberg, Marie-Claire; Corcoran, Sean P; Kovner, Christine; Brewer, Carol

    2011-02-01

    To investigate the variation in average daily travel time to work among registered nurses (RNs) living in urban, suburban, and rural areas. We examine how travel time varies across RN characteristics, job setting, and availability of local employment opportunities. Descriptive statistics and linear regression using a 5% sample from the 2000 Census and a longitudinal survey of newly licensed RNs (NLRN). Travel time for NLRN respondents was estimated using geographic information systems (GIS) software. In the NLRN, rural nurses and those living in small towns had significantly longer average commute times. Young married RNs and RNs with children also tended to have longer commute times, as did RNs employed by hospitals. The findings indicate that travel time to work varies significantly across locale types. Further research is needed to understand whether and to what extent lengthy commute times impact RN workforce needs in rural and urban areas.

  2. Treatment of dyeing wastewater by TiO2/H2O2/UV process: experimental design approach for evaluating total organic carbon (TOC) removal efficiency.

    PubMed

    Lee, Seung-Mok; Kim, Young-Gyu; Cho, Il-Hyoung

    2005-01-01

    Optimal operating conditions for treating dyeing wastewater were investigated using factorial design and response surface methodology (RSM). The experiment was statistically designed and carried out according to a 2² full factorial design with four factorial points, three center points, and four axial points. Linear and nonlinear regression was then applied to the data using the SAS software package. The independent variables were TiO2 dosage and H2O2 concentration, and total organic carbon (TOC) removal efficiency of dyeing wastewater was the dependent variable. From the factorial design and response surface methodology (RSM), the maximum removal efficiency (85%) of dyeing wastewater was obtained at a TiO2 dosage of 1.82 g L⁻¹ and an H2O2 concentration of 980 mg L⁻¹ for a 20 min oxidation reaction.
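    The run layout described in this abstract (four factorial points, three center points, four axial points for two factors) matches a central composite design. The sketch below generates such a layout in coded units. The function name `central_composite_design` is hypothetical, and the axial distance is an assumption: the abstract does not state it, so the rotatable choice (number of factorial points raised to the power 1/4) is used as the default.

```python
import itertools


def central_composite_design(n_factors=2, n_center=3, alpha=None):
    """Return the runs of a central composite design in coded units.

    alpha is the axial distance; when omitted, the rotatable value
    (2 ** n_factors) ** 0.25 is assumed (not stated in the abstract).
    """
    if alpha is None:
        alpha = (2 ** n_factors) ** 0.25
    # Factorial (corner) points: every combination of -1 and +1
    factorial = [
        tuple(float(v) for v in point)
        for point in itertools.product((-1, 1), repeat=n_factors)
    ]
    # Axial (star) points: one factor at +/- alpha, the others at 0
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(tuple(point))
    # Replicated center points
    center = [(0.0,) * n_factors for _ in range(n_center)]
    return factorial + axial + center


design = central_composite_design()  # 4 + 4 + 3 = 11 runs for two factors
```

    Coded levels would still have to be mapped to actual TiO2 and H2O2 settings, which the abstract does not list.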

  3. [Health for All-Italia: an indicator system on health].

    PubMed

    Burgio, Alessandra; Crialesi, Roberta; Loghi, Marzia

    2003-01-01

    The Health for All - Italia information system collects health data from several sources. It is intended to be a cornerstone for the achievement of an overview about health in Italy. Health is analyzed at different levels, ranging from health services, health needs, lifestyles, demographic, social, economic and environmental contexts. The database associated software allows to pin down statistical data into graphs and tables, and to carry out simple statistical analysis. It is therefore possible to view the indicators' time series, make simple projections and compare the various indicators over the years for each territorial unit. This is possible by means of tables, graphs (histograms, line graphs, frequencies, linear regression with calculation of correlation coefficients, etc) and maps. These charts can be exported to other programs (i.e. Word, Excel, Power Point), or they can be directly printed in color or black and white.

  4. Relative value of diverse brain MRI and blood-based biomarkers for predicting cognitive decline in the elderly

    NASA Astrophysics Data System (ADS)

    Madsen, Sarah K.; Ver Steeg, Greg; Daianu, Madelaine; Mezher, Adam; Jahanshad, Neda; Nir, Talia M.; Hua, Xue; Gutman, Boris A.; Galstyan, Aram; Thompson, Paul M.

    2016-03-01

    Cognitive decline accompanies many debilitating illnesses, including Alzheimer's disease (AD). In old age, brain tissue loss also occurs along with cognitive decline. Although blood tests are easier to perform than brain MRI, few studies compare brain scans to standard blood tests to see which kinds of information best predict future decline. In 504 older adults from the Alzheimer's Disease Neuroimaging Initiative (ADNI), we first used linear regression to assess the relative value of different types of data to predict cognitive decline, including 196 blood panel biomarkers, 249 MRI biomarkers obtained from the FreeSurfer software, demographics, and the AD-risk gene APOE. A subset of MRI biomarkers was the strongest predictor. Although no specific blood marker increased predictive accuracy on its own, we found that a novel unsupervised learning method, CorEx, captured weak correlations among blood markers, and the resulting clusters offered unique predictive power.

  5. The effect of occupational health and safety, work environment and discipline on employee performance in a consumer goods company

    NASA Astrophysics Data System (ADS)

    Putri, D. O.; Triatmanto, B.; Setiyadi, S.

    2018-04-01

    Employee performance can be a supporting factor of company performance. However, employee performance can be affected by several factors. Employees can perform optimally if they feel safe, have a good working environment and have discipline. The purposes of this research are to analyze the effect of occupational health and safety, work environment and discipline on employee performance in the PPIC Thermo section of a consumer goods company and to find the dominant variable that primarily affects employee performance. This research was conducted using data from 47 respondents. The data were collected using a questionnaire. The data analysis technique is multiple linear regression with SPSS software. The result shows that occupational health and safety, work environment and discipline simultaneously have a significant effect on employee performance. Discipline is the dominant factor affecting employee performance.

  6. Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.

    PubMed

    Haoliang Yuan; Yuan Yan Tang

    2017-04-01

    Classification of the pixels in a hyperspectral image (HSI) is an important task and has been widely applied in many practical applications. Its major challenge is the high-dimensional small-sized problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by the original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed methods outperform many SL methods.

  7. Simple linear and multivariate regression models.

    PubMed

    Rodríguez del Águila, M M; Benítez-Parejo, N

    2011-01-01

    In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
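    The article's own examples use R. As a language-neutral illustration of how a simple linear regression is calculated, here is a minimal Python sketch using only the closed-form least-squares formulas. The function name `fit_line` and the four data points are made up for illustration.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Coefficient of determination: 1 - residual SS / total SS
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Illustrative data only
slope, intercept, r_squared = fit_line([0, 1, 2, 3], [0, 1, 1, 2])
```

    Checking the applicability assumptions the article discusses (linearity, constant residual variance, normality of residuals) would start from the residuals `y - (intercept + slope * x)`.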

  8. Optimization of isotherm models for pesticide sorption on biopolymer-nanoclay composite by error analysis.

    PubMed

    Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M

    2017-04-01

    A carboxy methyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt % dimethyl dialkyl (C14-C18) amine (DMDA)) composite was prepared by solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-Ray diffraction spectroscopy (XRD) and scanning electron microscopy (SEM). The composite was utilized for its pesticide sorption efficiency for atrazine, imidacloprid and thiamethoxam. The sorption data was fitted into Langmuir and Freundlich isotherms using linear and non linear methods. The linear regression method suggested best fitting of sorption data into Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by non linear regression method. The non linear error analysis suggested that the sorption data fitted well into Langmuir model rather than in Freundlich model. The maximum sorption capacity, Q0 (μg/g) was given by imidacloprid (2000) followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the degree of determination of linear regression alone cannot be used for comparing the best fitting of Langmuir and Freundlich models and non-linear error analysis needs to be done to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
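    For readers unfamiliar with the linearized fits mentioned here, the sketch below fits the form commonly labeled the Type II Langmuir linearization, 1/qe = (1/(K_L·Q0))·(1/Ce) + 1/Q0, by ordinary least squares. It is an assumption that the paper uses this same labeling. The function name `langmuir_type2_fit`, the parameter values and the concentrations are hypothetical and are not data from the study.

```python
def langmuir_type2_fit(ce, qe):
    """Estimate Langmuir parameters from the Type II linear form.

    Regresses 1/qe on 1/ce; the intercept is 1/Q0 and the slope is
    1/(K_L * Q0). Returns (q0, k_l).
    """
    u = [1.0 / c for c in ce]  # transformed predictor, 1/Ce
    v = [1.0 / q for q in qe]  # transformed response, 1/qe
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    slope = sum((ui - u_mean) * (vi - v_mean) for ui, vi in zip(u, v)) / sum(
        (ui - u_mean) ** 2 for ui in u
    )
    intercept = v_mean - slope * u_mean
    q0 = 1.0 / intercept
    k_l = intercept / slope
    return q0, k_l


# Hypothetical, noise-free isotherm: qe = Q0 * K_L * Ce / (1 + K_L * Ce)
true_q0, true_kl = 50.0, 0.2
ce_demo = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
qe_demo = [true_q0 * true_kl * c / (1.0 + true_kl * c) for c in ce_demo]
q0_est, kl_est = langmuir_type2_fit(ce_demo, qe_demo)
```

    On noisy data the reciprocal transform distorts the error structure, which is the linearization bias the abstract refers to; fitting the untransformed isotherm by nonlinear least squares avoids it.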

  9. Ada Linear-Algebra Program

    NASA Technical Reports Server (NTRS)

    Klumpp, A. R.; Lawson, C. L.

    1988-01-01

    Routines provided for common scalar, vector, matrix, and quaternion operations. Computer program extends Ada programming language to include linear-algebra capabilities similar to HAL/S programming language. Designed for such avionics applications as software for Space Station.

  10. London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure

    PubMed Central

    Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith

    2017-01-01

    Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. 
For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
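    The recommendation above relies on heteroscedasticity-robust standard errors. The following minimal sketch compares the classical slope standard error with the HC0 "sandwich" estimate for a one-predictor model. The paper does not say which robust variant was used, so HC0 is an assumption, and the function name and data are illustrative only.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, classical_se, hc0_se) for y = a + b * x.

    classical_se assumes constant residual variance; hc0_se is the
    heteroscedasticity-consistent (White, HC0) estimate.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical: residual variance (n - 2 degrees of freedom) divided by Sxx
    classical_se = math.sqrt(sum(e ** 2 for e in resid) / (n - 2) / sxx)
    # HC0: sum of (x_i - x_mean)^2 * e_i^2, divided by Sxx^2
    hc0_var = sum(((xi - x_mean) * e) ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    return slope, classical_se, math.sqrt(hc0_var)


# Illustrative data only
b, se_classical, se_robust = slope_standard_errors([0, 1, 2, 3], [0, 1, 1, 2])
```

    With an outcome such as the LMUP score, the point estimate is unchanged; only the standard error, and therefore the confidence interval and p-value, differs between the two calculations.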

  11. Selection of software for mechanical engineering undergraduates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cheah, C. T.; Yin, C. S.; Halim, T.

    A major problem with the undergraduate mechanical course is the limited exposure of students to software packages coupled with the long learning curve on the existing software packages. This work proposes the use of appropriate software packages for the entire mechanical engineering curriculum to ensure students get sufficient exposure to real-life design problems. A variety of software packages are highlighted as being suitable for undergraduate work in mechanical engineering, e.g. simultaneous non-linear equations; uncertainty analysis; 3-D modeling software with the FEA; analysis tools for the solution of problems in thermodynamics, fluid mechanics, mechanical system design, and solid mechanics.

  12. Using Parametric Cost Models to Estimate Engineering and Installation Costs of Selected Electronic Communications Systems

    DTIC Science & Technology

    1994-09-01

    Institute of Technology, Wright- Patterson AFB OH, January 1994. 4. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 5...Technology, Wright-Patterson AFB OH 5 April 1994. 29. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 30. Office of

  13. An Evaluation of the Automated Cost Estimating Integrated Tools (ACEIT) System

    DTIC Science & Technology

    1989-09-01

    residual and it is described as the residual divided by its standard deviation (13:App A,17). Neter, Wasserman, and Kutner, in Applied Linear Regression Models...others. Applied Linear Regression Models. Homewood IL: Irwin, 1983. 19. Raduchel, William J. "A Professional’s Perspective on User-Friendliness," Byte

  14. A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants

    ERIC Educational Resources Information Center

    Cooper, Paul D.

    2010-01-01

    A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

  15. Conjoint Analysis: A Study of the Effects of Using Person Variables.

    ERIC Educational Resources Information Center

    Fraas, John W.; Newman, Isadore

    Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…

  16. How Robust Is Linear Regression with Dummy Variables?

    ERIC Educational Resources Information Center

    Blankmeyer, Eric

    2006-01-01

    Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…

  17. Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method

    ERIC Educational Resources Information Center

    Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Sülzle, Detlev

    2018-01-01

    The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…

  18. An Introduction to Graphical and Mathematical Methods for Detecting Heteroscedasticity in Linear Regression.

    ERIC Educational Resources Information Center

    Thompson, Russel L.

    Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of heteroscedasticity and types of heteroscedasticity are discussed, and methods for correction are…

  19. On the null distribution of Bayes factors in linear regression

    USDA-ARS's Scientific Manuscript database

    We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...

  20. Common pitfalls in statistical analysis: Linear regression analysis

    PubMed Central

    Aggarwal, Rakesh; Ranganathan, Priya

    2017-01-01

    In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022

  1. AN ADA LINEAR ALGEBRA PACKAGE MODELED AFTER HAL/S

    NASA Technical Reports Server (NTRS)

    Klumpp, A. R.

    1994-01-01

    This package extends the Ada programming language to include linear algebra capabilities similar to those of the HAL/S programming language. The package is designed for avionics applications such as Space Station flight software. In addition to the HAL/S built-in functions, the package incorporates the quaternion functions used in the Shuttle and Galileo projects, and routines from LINPACK that solve systems of equations involving general square matrices. Language conventions in this package follow those of HAL/S to the maximum extent practical and minimize the effort required for writing new avionics software and translating existent software into Ada. Valid numeric types in this package include scalar, vector, matrix, and quaternion declarations. (Quaternions are four-component vectors used in representing motion between two coordinate frames). Single precision and double precision floating point arithmetic is available in addition to the standard double precision integer manipulation. Infix operators are used instead of function calls to define dot products, cross products, quaternion products, and mixed scalar-vector, scalar-matrix, and vector-matrix products. The package contains two generic programs: one for floating point, and one for integer. The actual component type is passed as a formal parameter to the generic linear algebra package. The procedures for solving systems of linear equations defined by general matrices include GEFA, GECO, GESL, and GEDI. The HAL/S functions include ABVAL, UNIT, TRACE, DET, INVERSE, TRANSPOSE, GET, PUT, FETCH, PLACE, and IDENTITY. This package is written in Ada (Version 1.2) for batch execution and is machine independent. The linear algebra software depends on nothing outside the Ada language except for a call to a square root function for floating point scalars (such as SQRT in the DEC VAX MATHLIB library). This program was developed in 1989, and is a copyrighted work with all copyright vested in NASA.
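    To make the quaternion product mentioned in this abstract concrete, here is a small Python sketch of the Hamilton product. It is not the Ada package's code: the scalar-first component order (w, x, y, z) is an assumption, since the abstract does not state the package's convention, and the function name is hypothetical.

```python
def quaternion_product(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples.

    Scalar-first ordering is assumed here for illustration.
    """
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )


# Unit quaternions i and j; the product is not commutative
i_quat = (0, 1, 0, 0)
j_quat = (0, 0, 1, 0)
ij = quaternion_product(i_quat, j_quat)  # equals k
ji = quaternion_product(j_quat, i_quat)  # equals -k
```

    In a language with operator overloading, as in the Ada package described, the same arithmetic would sit behind an infix `*` operator rather than a named function.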

  2. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    PubMed

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.

  3. On Fitting Generalized Linear Mixed-effects Models for Binary Responses using Different Statistical Packages

    PubMed Central

    Zhang, Hui; Lu, Naiji; Feng, Changyong; Thurston, Sally W.; Xia, Yinglin; Tu, Xin M.

    2011-01-01

    Summary The generalized linear mixed-effects model (GLMM) is a popular paradigm to extend models for cross-sectional data to a longitudinal setting. When applied to modeling binary responses, different software packages and even different procedures within a package may give quite different results. In this report, we describe the statistical approaches that underlie these different procedures and discuss their strengths and weaknesses when applied to fit correlated binary responses. We then illustrate these considerations by applying these procedures implemented in some popular software packages to simulated and real study data. Our simulation results indicate a lack of reliability for most of the procedures considered, which carries significant implications for applying such popular software packages in practice. PMID:21671252

  4. A Mathematics Software Database Update.

    ERIC Educational Resources Information Center

    Cunningham, R. S.; Smith, David A.

    1987-01-01

    Contains an update of an earlier listing of software for mathematics instruction at the college level. Topics are: advanced mathematics, algebra, calculus, differential equations, discrete mathematics, equation solving, general mathematics, geometry, linear and matrix algebra, logic, statistics and probability, and trigonometry. (PK)

  5. Generic Kalman Filter Software

    NASA Technical Reports Server (NTRS)

    Lisano, Michael E., II; Crues, Edwin Z.

    2005-01-01

    The Generic Kalman Filter (GKF) software provides a standard basis for the development of application-specific Kalman-filter programs. Historically, Kalman filters have been implemented by customized programs that must be written, coded, and debugged anew for each unique application, then tested and tuned with simulated or actual measurement data. Total development times for typical Kalman-filter application programs have ranged from months to weeks. The GKF software can simplify the development process and reduce the development time by eliminating the need to re-create the fundamental implementation of the Kalman filter for each new application. The GKF software is written in the ANSI C programming language. It contains a generic Kalman-filter-development directory that, in turn, contains a code for a generic Kalman filter function; more specifically, it contains a generically designed and generically coded implementation of linear, linearized, and extended Kalman filtering algorithms, including algorithms for state- and covariance-update and -propagation functions. The mathematical theory that underlies the algorithms is well known and has been reported extensively in the open technical literature. Also contained in the directory are a header file that defines generic Kalman-filter data structures and prototype functions and template versions of application-specific subfunction and calling navigation/estimation routine code and headers. Once the user has provided a calling routine and the required application-specific subfunctions, the application-specific Kalman-filter software can be compiled and executed immediately. During execution, the generic Kalman-filter function is called from a higher-level navigation or estimation routine that preprocesses measurement data and post-processes output data. 
The generic Kalman-filter function uses the aforementioned data structures and five implementation-specific subfunctions, which have been developed by the user on the basis of the aforementioned templates. The GKF software can be used to develop many different types of unfactorized Kalman filters. A developer can choose to implement either a linearized or an extended Kalman filter algorithm, without having to modify the GKF software. Control dynamics can be taken into account or neglected in the filter-dynamics model. Filter programs developed by use of the GKF software can be made to propagate equations of motion for linear or nonlinear dynamical systems that are deterministic or stochastic. In addition, filter programs can be made to operate in user-selectable "covariance analysis" and "propagation-only" modes that are useful in design and development stages.
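
    The state- and covariance-propagation and -update steps described in this record can be illustrated with a minimal scalar linear Kalman filter. This is only a sketch in Python, not the GKF code (which is written in ANSI C); the class name ScalarKalman, the function run_filter, and all parameter names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """Hypothetical one-dimensional linear Kalman filter (illustration only)."""
    x: float        # current state estimate
    p: float        # variance of the state estimate
    q: float        # process-noise variance
    r: float        # measurement-noise variance
    a: float = 1.0  # state-transition coefficient
    h: float = 1.0  # measurement coefficient

    def propagate(self) -> None:
        # Propagate the state and its variance one step forward in time.
        self.x = self.a * self.x
        self.p = self.a * self.p * self.a + self.q

    def update(self, z: float) -> None:
        # Blend the prediction with measurement z using the Kalman gain.
        s = self.h * self.p * self.h + self.r   # innovation variance
        k = self.p * self.h / s                 # Kalman gain
        self.x += k * (z - self.h * self.x)
        self.p = (1.0 - k * self.h) * self.p


def run_filter(kf: ScalarKalman, measurements):
    """Calling routine: propagate, then update, once per measurement."""
    estimates = []
    for z in measurements:
        kf.propagate()
        kf.update(z)
        estimates.append(kf.x)
    return estimates
```

    Calling propagate() without update() would correspond loosely to a propagation-only run, in which the estimate variance grows by q at each step.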

  6. Evaluation of linear regression techniques for atmospheric applications: the importance of appropriate weighting

    NASA Astrophysics Data System (ADS)

    Wu, Cheng; Zhen Yu, Jian

    2018-03-01

    Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low-R² XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
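
    To make the contrast between OLS and Deming regression concrete, the following Python sketch fits both to the same data. It is not the authors' Igor Pro program. The function names are hypothetical, and the weighting parameter is written here as delta, assumed to be the ratio of the Y-error variance to the X-error variance; the paper's λ may follow a different (for example reciprocal) convention.

```python
import math


def _centered_sums(x, y):
    # Means and centered sums of squares and cross-products.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, syy, sxy


def ols_fit(x, y):
    """Ordinary least squares: all error is assumed to be in y."""
    mx, my, sxx, _, sxy = _centered_sums(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


def deming_fit(x, y, delta=1.0):
    """Deming regression; delta = var(y error) / var(x error) (assumed convention).

    delta = 1.0 gives orthogonal (total least squares) regression.
    """
    mx, my, sxx, syy, sxy = _centered_sums(x, y)
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx
```

    On noise-free data both fits agree; when X carries measurement error, the OLS slope is pulled toward zero, while a Deming fit with a suitable delta compensates for it.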

  7. A novel simple QSAR model for the prediction of anti-HIV activity using multiple linear regression analysis.

    PubMed

    Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga

    2006-08-01

    A quantitative structure-activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R²(CV) = 0.8160; S(PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.

  8. Wavelet regression model in forecasting crude oil price

    NASA Astrophysics Data System (ADS)

    Hamid, Mohd Helmie; Shabri, Ani

    2017-05-01

    This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and the multiple linear regression (MLR) model. The original time series was decomposed into sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series was used in this study to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), Autoregressive Integrated Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting techniques tested in this study.
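
    The decomposition step described above can be sketched with a Haar discrete wavelet transform, which splits a series into a coarse approximation and detail sub-series at each scale. The record does not say which wavelet the authors used, so the Haar choice is an assumption, and the function names below are hypothetical. In a WMLR-style workflow the resulting sub-series would then serve as candidate regression inputs.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level, recovering the original samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def decompose(signal, levels):
    """Repeatedly split the approximation to obtain sub-series at several scales."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    return approx, details
```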

  9. Partitioning sources of variation in vertebrate species richness

    USGS Publications Warehouse

    Boone, R.B.; Krohn, W.B.

    2000-01-01

    Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km²) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%). 
Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.

  10. RBF kernel based support vector regression to estimate the blood volume and heart rate responses during hemodialysis.

    PubMed

    Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H

    2009-01-01

    This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial basis function (RBF) kernels, non-parametric models of relative blood volume (RBV) change with time as well as percentage change in HR with respect to RBV were obtained. The ε-insensitive loss function was used for SVR modeling. Selection of the design parameters, which include capacity (C), insensitivity region (ε) and the RBF kernel parameter (sigma), was made based on a grid search approach, and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training as well as testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).

  11. A Tale of Two Cultures: Cross Cultural Comparison in Learning the Prezi Presentation Software Tool in the US and Norway

    ERIC Educational Resources Information Center

    Brock, Sabra; Brodahl, Cornelia

    2013-01-01

    Presentation software is an important tool for both student and professorial communicators. PowerPoint has been the standard since it was introduced in 1990. However, new "improved" software platforms are emerging. Prezi is one of these, claiming to remedy the linear thinking that underlies PowerPoint by creating one canvas and…

  12. Artificial intelligence and expert systems in-flight software testing

    NASA Technical Reports Server (NTRS)

    Demasie, M. P.; Muratore, J. F.

    1991-01-01

    The authors discuss the introduction of advanced information systems technologies such as artificial intelligence, expert systems, and advanced human-computer interfaces directly into Space Shuttle software engineering. The reconfiguration automation project (RAP) was initiated to coordinate this move towards 1990s software technology. The idea behind RAP is to automate several phases of the flight software testing procedure and to introduce AI and ES into space shuttle flight software testing. In the first phase of RAP, conventional tools to automate regression testing have already been developed or acquired. There are currently three tools in use.

  13. Menu-Driven Solver Of Linear-Programming Problems

    NASA Technical Reports Server (NTRS)

    Viterna, L. A.; Ferencz, D.

    1992-01-01

    Program assists inexperienced user in formulating linear-programming problems. A Linear Program Solver (ALPS) computer program is full-featured LP analysis program. Solves plain linear-programming problems as well as more-complicated mixed-integer and pure-integer programs. Also contains efficient technique for solution of purely binary linear-programming problems. Written entirely in IBM's APL2/PC software, Version 1.01. Packed program contains licensed material, property of IBM (copyright 1988, all rights reserved).

  14. Advanced Mathematical Tools in Metrology III

    NASA Astrophysics Data System (ADS)

    Ciarlini, P.

    The Table of Contents for the book is as follows: * Foreword * Invited Papers * The ISO Guide to the Expression of Uncertainty in Measurement: A Bridge between Statistics and Metrology * Bootstrap Algorithms and Applications * The TTRSs: 13 Oriented Constraints for Dimensioning, Tolerancing & Inspection * Graded Reference Data Sets and Performance Profiles for Testing Software Used in Metrology * Uncertainty in Chemical Measurement * Mathematical Methods for Data Analysis in Medical Applications * High-Dimensional Empirical Linear Prediction * Wavelet Methods in Signal Processing * Software Problems in Calibration Services: A Case Study * Robust Alternatives to Least Squares * Gaining Information from Biomagnetic Measurements * Full Papers * Increase of Information in the Course of Measurement * A Framework for Model Validation and Software Testing in Regression * Certification of Algorithms for Determination of Signal Extreme Values during Measurement * A Method for Evaluating Trends in Ozone-Concentration Data and Its Application to Data from the UK Rural Ozone Monitoring Network * Identification of Signal Components by Stochastic Modelling in Measurements of Evoked Magnetic Fields from Peripheral Nerves * High Precision 3D-Calibration of Cylindrical Standards * Magnetic Dipole Estimations for MCG-Data * Transfer Functions of Discrete Spline Filters * An Approximation Method for the Linearization of Tridimensional Metrology Problems * Regularization Algorithms for Image Reconstruction from Projections * Quality of Experimental Data in Hydrodynamic Research * Stochastic Drift Models for the Determination of Calibration Intervals * Short Communications * Projection Method for Lidar Measurement * Photon Flux Measurements by Regularised Solution of Integral Equations * Correct Solutions of Fit Problems in Different Experimental Situations * An Algorithm for the Nonlinear TLS Problem in Polynomial Fitting * Designing Axially Symmetric Electromechanical Systems of 
Superconducting Magnetic Levitation in Matlab Environment * Data Flow Evaluation in Metrology * A Generalized Data Model for Integrating Clinical Data and Biosignal Records of Patients * Assessment of Three-Dimensional Structures in Clinical Dentistry * Maximum Entropy and Bayesian Approaches to Parameter Estimation in Mass Metrology * Amplitude and Phase Determination of Sinusoidal Vibration in the Nanometer Range using Quadrature Signals * A Class of Symmetric Compactly Supported Wavelets and Associated Dual Bases * Analysis of Surface Topography by Maximum Entropy Power Spectrum Estimation * Influence of Different Kinds of Errors on Imaging Results in Optical Tomography * Application of the Laser Interferometry for Automatic Calibration of Height Setting Micrometer * Author Index

  15. System dynamic modeling: an alternative method for budgeting.

    PubMed

    Srijariya, Witsanuchai; Riewpaiboon, Arthorn; Chaikledkaew, Usa

    2008-03-01

    To construct, validate, and simulate a system dynamic financial model and compare it against the conventional method. The study was a cross-sectional analysis of secondary data retrieved from the National Health Security Office (NHSO) in the fiscal year 2004. The sample consisted of all emergency patients who received emergency services outside their registered hospital catchment area. The dependent variable used was the amount of reimbursed money. Two types of model were constructed, namely, the system dynamic model using the STELLA software and the multiple linear regression model. The outputs of both methods were compared. The study covered 284,716 patients from various levels of providers. The system dynamic model had the capability of producing various types of outputs, for example, financial and graphical analyses. For the regression analysis, statistically significant predictors were composed of service types (outpatient or inpatient), operating procedures, length of stay, illness types (accident or not), hospital characteristics, age, and hospital location (adjusted R² = 0.74). The total budget arrived at from using the system dynamic model and regression model was US$12,159,614.38 and US$7,301,217.18, respectively, whereas the actual NHSO reimbursement cost was US$12,840,805.69. The study illustrated that the system dynamic model is a useful financial management tool, although it is not easy to construct. The model is not only more accurate in prediction but is also more capable of analyzing large and complex real-world situations than the conventional method.

  16. Software Reviews.

    ERIC Educational Resources Information Center

    Teles, Elizabeth, Ed.; And Others

    1990-01-01

    Reviewed are two computer software packages for Macintosh microcomputers including "Phase Portraits," an exploratory graphics tool for studying first-order planar systems; and "MacMath," a set of programs for exploring differential equations, linear algebra, and other mathematical topics. Features, ease of use, cost, availability, and hardware…

  17. In vitro Cell Viability by CellProfiler® Software as Equivalent to MTT Assay.

    PubMed

    Gasparini, Luciana S; Macedo, Nayana D; Pimentel, Elisângela F; Fronza, Marcio; Junior, Valdemar L; Borges, Warley S; Cole, Eduardo R; Andrade, Tadeu U; Endringer, Denise C; Lenz, Dominik

    2017-07-01

    This study evaluated in vitro cell viability by the colorimetric MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) assay compared to image analysis by CellProfiler® software. Hepatoma (Hepa-1c1c7) and fibroblast (L929) cells were exposed to isolated substances, camptothecin, lycorine, tazettine, albomaculine, 3-epimacronine, trispheridine, galanthine and Padina gymnospora, Sargassum sp. methanolic extract, and Habranthus itaobinus Ravenna ethyl acetate in different concentrations. After the MTT assay, cells were stained with the Panotic dye kit. Cell images were obtained with an inverted microscope equipped with a digital camera. The images were analyzed by CellProfiler®. No cytotoxicity was detected at the highest concentration analyzed for 3-epimacronine, albomaculine, galanthine, trispheridine, P. gymnospora extract and Sargassum sp. extract. Tazettine offered cytotoxicity only against the Hepa-1c1c7 cell line. Lycorine, camptothecin, and H. itaobinus extract exhibited cytotoxic effects in both cell lines. The two viability methods were correlated, as demonstrated by the Bland-Altman test: the differences were normally distributed, with the mean difference between the two methods close to zero (bias value 3.0263). The error was within the limits of the confidence intervals and these values had a narrow difference. The correlation between the two methods was demonstrated by the linear regression plotted as R². CellProfiler® image analysis presented similar results to the MTT assay in the identification of viable cells, and image analysis may assist part of biological analysis procedures. The presented methodology is inexpensive and reproducible. In vitro cell viability assessment with the MTT assay may be replaced by image analysis by CellProfiler®. Abbreviations: HPLC: high pressure liquid chromatography; MTT: 3-(4,5-dimethylthiazolyl-2)-2,5-diphenyltetrazolium bromide.

  18. Post-processing through linear regression

    NASA Astrophysics Data System (ADS)

    van Schaeybroeck, B.; Vannitsem, S.

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times, the regression schemes (EVMOS, TDTR) that yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
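
    The "optimal variability" criterion mentioned above can be illustrated by comparing an OLS correction with the classical single-predictor geometric-mean regression, whose slope is the ratio of the standard deviations signed by the correlation. This Python sketch is an assumption-laden illustration: the paper's GM scheme is described as new and handles several predictors, so the textbook form below is not necessarily the authors' method, and all function names are hypothetical.

```python
import math


def _moments(forecast, obs):
    # Means and centered sums of squares and cross-products.
    n = len(forecast)
    mf, mo = sum(forecast) / n, sum(obs) / n
    sff = sum((f - mf) ** 2 for f in forecast)
    soo = sum((o - mo) ** 2 for o in obs)
    sfo = sum((f - mf) * (o - mo) for f, o in zip(forecast, obs))
    return mf, mo, sff, soo, sfo


def ols_correction(forecast, obs):
    """OLS: minimizes squared error but shrinks the forecast variability."""
    mf, mo, sff, _, sfo = _moments(forecast, obs)
    slope = sfo / sff
    return slope, mo - slope * mf


def gm_correction(forecast, obs):
    """Geometric-mean regression: slope = sign(cov) * sd(obs) / sd(forecast)."""
    mf, mo, sff, soo, sfo = _moments(forecast, obs)
    slope = math.copysign(math.sqrt(soo / sff), sfo)
    return slope, mo - slope * mf


def apply_correction(coeffs, forecast):
    """Apply a (slope, intercept) pair to raw forecast values."""
    slope, intercept = coeffs
    return [intercept + slope * f for f in forecast]
```

    By construction, the GM-corrected forecast has the same variance as the observations, whereas the OLS-corrected forecast has variance reduced by the squared correlation.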

  19. Linear regression metamodeling as a tool to summarize and present simulation model results.

    PubMed

    Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M

    2013-10-01

    Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.

  20. Using crosscorrelation techniques to determine the impulse response of linear systems

    NASA Technical Reports Server (NTRS)

    Dallabetta, Michael J.; Li, Harry W.; Demuth, Howard B.

    1993-01-01

    A crosscorrelation method of measuring the impulse response of linear systems is presented. The technique, implementation, and limitations of this method are discussed. A simple system is designed and built using discrete components and the impulse response of a linear circuit is measured. Theoretical and software simulation results are presented.
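
    The idea behind the crosscorrelation method can be sketched in a few lines of Python: when a linear system is driven by a white, unit-variance input, the crosscorrelation between input and output at lag k approximates the impulse response at k. The simulated FIR system, the random ±1 test signal, and the function names below are all assumptions made for illustration; the record does not describe the authors' circuit or test signal in this detail.

```python
import random


def simulate_fir(h, x):
    """Output of a causal FIR system with impulse response h driven by x."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y


def crosscorrelate(x, y, max_lag):
    """Estimate R_xy[lag] = mean(x[i] * y[i + lag]) for lag = 0..max_lag-1."""
    n = len(x)
    return [
        sum(x[i] * y[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(max_lag)
    ]


def estimate_impulse_response(h_true, n_samples=20000, seed=0):
    """Drive the system with random +/-1 noise and crosscorrelate input and output."""
    rng = random.Random(seed)
    x = [rng.choice((-1.0, 1.0)) for _ in range(n_samples)]
    y = simulate_fir(h_true, x)
    return crosscorrelate(x, y, len(h_true))
```

    The estimate is statistical: its error shrinks roughly with the square root of the number of samples, which is one practical limitation of the approach.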

  1. Combining ultrasonography and noncontrast helical computerized tomography to evaluate Holmium laser lithotripsy

    PubMed Central

    Mi, Jia; Li, Jie; Zhang, Qinglu; Wang, Xing; Liu, Hongyu; Cao, Yanlu; Liu, Xiaoyan; Sun, Xiao; Shang, Mengmeng; Liu, Qing

    2016-01-01

    The purpose of the study was to establish a mathematical model for correlating the combination of ultrasonography and noncontrast helical computerized tomography (NCHCT) with the total energy of Holmium laser lithotripsy. In this study, from March 2013 to February 2014, 180 patients with single urinary calculus were examined using ultrasonography and NCHCT before Holmium laser lithotripsy. The calculus location and size, acoustic shadowing (AS) level, twinkling artifact intensity (TAI), and CT value were all documented. The total energy of lithotripsy (TEL) and the calculus composition were also recorded postoperatively. Data were analyzed using Spearman's rank correlation coefficient, with the SPSS 17.0 software package. Multiple linear regression was also used for further statistical analysis. A significant difference in the TEL was observed between renal calculi and ureteral calculi (r = –0.565, P < 0.001), and there was a strong correlation between the calculus size and the TEL (r = 0.675, P < 0.001). The difference in the TEL between the calculi with and without AS was highly significant (r = 0.325, P < 0.001). The CT value of the calculi was significantly correlated with the TEL (r = 0.386, P < 0.001). A correlation between the TAI and TEL was also observed (r = 0.391, P < 0.001). Multiple linear regression analysis revealed that the location, size, and TAI of the calculi were related to the TEL, and the location and size were statistically significant predictors (adjusted R² = 0.498, P < 0.001). A mathematical model correlating the combination of ultrasonography and NCHCT with TEL was established; this model may provide a foundation to guide the use of energy in Holmium laser lithotripsy. The TEL can be estimated by the location, size, and TAI of the calculus. PMID:27930563

  2. Seasonal Effect on Ocular Sun Exposure and Conjunctival UV Autofluorescence.

    PubMed

    Haworth, Kristina M; Chandler, Heather L

    2017-02-01

    To evaluate feasibility and repeatability of measures for ocular sun exposure and conjunctival ultraviolet autofluorescence (UVAF), and to test for relationships between the outcomes. Fifty volunteers were seen for two visits 14 ± 2 days apart. Ocular sun exposure was estimated over a 2-week time period using questionnaires that quantified time outdoors and ocular protection habits. Conjunctival UVAF was imaged using a Nikon D7000 camera system equipped with appropriate flash and filter system; image analysis was done using ImageJ software. Repeatability estimates were made using Bland-Altman plots with mean differences and 95% limits of agreement calculated. Non-normally distributed data were transformed by either log10 or square root methods. Linear regression was conducted to evaluate relationships between measures. Mean (±SD) values for ocular sun exposure and conjunctival UVAF were 8.86 (±11.97) hours and 9.15 (±9.47) mm², respectively. Repeatability was found to be acceptable for both ocular sun exposure and conjunctival UVAF. Univariate linear regression showed outdoor occupation to be a predictor of higher ocular sun exposure; outdoor occupation and winter season of collection both predicted higher total UVAF. Furthermore, increased portion of day spent outdoors while working was associated with increased total conjunctival UVAF. We demonstrate feasibility and repeatability of estimating ocular sun exposure using a previously unreported method and for conjunctival UVAF in a group of subjects residing in Ohio. Seasonal temperature variation may have influenced time outdoors and ultimately calculation of ocular sun exposure. As winter season of collection and outdoor occupation both predicted higher total UVAF, our data suggest that ocular sun exposure is associated with conjunctival UVAF and, possibly, that UVAF remains for at least several months after sun exposure.
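
    The repeatability statistics named above (mean difference and 95% limits of agreement) follow the standard Bland-Altman calculation, sketched here in Python. The function name and the sample visit values used in any call are hypothetical; the limits use the conventional mean difference ± 1.96 standard deviations of the paired differences.

```python
import statistics


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired repeat measurements.

    bias is the mean of the paired differences; the 95% limits of agreement
    are bias +/- 1.96 times the sample standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

    A bias near zero with narrow limits indicates that the two visits give interchangeable values; skewed differences are usually transformed (for example by log10 or square root) before the limits are computed.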

  3. The relationship between emotional intelligence and job stress in the faculty of medicine in Isfahan University of Medical Sciences

    PubMed Central

    YAMANI, NIKOO; SHAHABI, MARYAM; HAGHANI, FARIBA

    2014-01-01

    Introduction: Health care professionals, especially clinicians, experience considerable job stress (JS). Emotional intelligence (EI) is among the variables that appear to be associated with stress. It is also among the means individuals adopt to resist JS in the workplace. Thus, this study aims to investigate the relationship between EI and JS in the faculty members of Isfahan University of Medical Sciences (IUMS). Methods: This was a correlational study performed on 202 faculty members of IUMS. The data were gathered through two valid and reliable questionnaires (Bradberry EI questionnaire and JS questionnaire) and analyzed with SPSS software using descriptive statistics, Pearson correlation coefficient, t-test, analysis of variance (ANOVA) and linear regression analysis (α=0.05). Results: 142 individuals (70.30%) filled out the questionnaires. 75% of the respondents were male and 98% were married. There was an inverse correlation between the total score of EI and the level of JS (r=-0.235, p=0.005). Moreover, among the factors of EI, self-awareness and self-management scores had a significant inverse relationship with the level of JS. Linear regression analysis showed that the EI factors explained approximately 7% of the variance of JS levels of the teachers. Conclusions: Individuals with high EI have less JS. Since EI can be taught, it can be expected that the JS of faculty members can be reduced by training them in emotional intelligence. Therefore, it is recommended that short-term training courses be scheduled and designed based on the concepts of EI for teachers, particularly clinicians. PMID:25512914

  4. Dual energy X-ray absorptiometry spine scans to determine abdominal fat in post-menopausal women

    PubMed Central

    Bea, J. W.; Blew, R. M.; Going, S. B.; Hsu, C-H; Lee, M. C.; Lee, V. R.; Caan, B.J.; Kwan, M.L.; Lohman, T. G.

    2016-01-01

    Body composition may be a better predictor of chronic disease risk than body mass index (BMI) in older populations. Objectives: We sought to validate spine fat fraction (%) from dual energy X-ray absorptiometry (DXA) spine scans as a proxy for total abdominal fat. Methods: Total body DXA scan abdominal fat regions of interest (ROI) that have been previously validated by magnetic resonance imaging were assessed among healthy, postmenopausal women who also had antero-posterior spine scans (n=103). ROIs were 1) lumbar vertebrae L2-L4 and 2) L2-Iliac Crest (L2-IC), manually selected by two independent raters, and 3) trunk, auto-selected by DXA software. Intra-class correlation coefficients evaluated intra- and inter-rater reliability on a random subset (n=25). Linear regression models, validated by bootstrapping, assessed the relationship between spine fat fraction (%) and total abdominal fat (%) ROIs. Results: Mean age, BMI and total body fat were 66.1 ± 4.8 y, 25.8 ± 3.8 kg/m² and 40.0 ± 6.6%, respectively. There were no significant differences within or between raters. Linear regression models adjusted for several participant and scan characteristics were equivalent to using only BMI and spine fat fraction. The model predicted L2-L4 (Adj. R²: 0.83) and L2-IC (Adj. R²: 0.84) abdominal fat (%) well; the adjusted R² for trunk fat (%) was 0.78. Model validation demonstrated minimal over-fitting (Adj. R²: 0.82, 0.83, and 0.77 for L2-L4, L2-IC, and trunk fat respectively). Conclusions: The strong correlation between spine fat fraction and DXA abdominal fat measures makes it suitable for further development in post-menopausal chronic disease risk prediction models. PMID:27416964

  5. Seasonal Effect on Ocular Sun Exposure and Conjunctival UV Autofluorescence

    PubMed Central

    Haworth, Kristina M.; Chandler, Heather L.

    2016-01-01

    Purpose: To evaluate feasibility and repeatability of measures for ocular sun exposure and conjunctival ultraviolet autofluorescence (UVAF), and to test for relationships between the outcomes. Methods: Fifty volunteers were seen for 2 visits 14±2 days apart. Ocular sun exposure was estimated over a two-week time period using questionnaires that quantified time outdoors and ocular protection habits. Conjunctival UVAF was imaged using a Nikon D7000 camera system equipped with appropriate flash and filter system; image analysis was done using ImageJ software. Repeatability estimates were made using Bland-Altman plots with mean differences and 95% limits of agreement calculated. Non-normally distributed data were transformed by either log10 or square root methods. Linear regression was conducted to evaluate relationships between measures. Results: Mean (±SD) values for ocular sun exposure and conjunctival UVAF were 8.86 (±11.97) hours and 9.15 (±9.47) mm², respectively. Repeatability was found to be acceptable for both ocular sun exposure and conjunctival UVAF. Univariate linear regression showed outdoor occupation to be a predictor of higher ocular sun exposure; outdoor occupation and winter season of collection both predicted higher total UVAF. Furthermore, increased portion of day spent outdoors while working was associated with increased total conjunctival UVAF. Conclusions: We demonstrate feasibility and repeatability of estimating ocular sun exposure using a previously unreported method and for conjunctival UVAF in a group of subjects residing in Ohio. Seasonal temperature variation may have influenced time outdoors and ultimately calculation of ocular sun exposure. As winter season of collection and outdoor occupation both predicted higher total UVAF, our data suggest that ocular sun exposure is associated with conjunctival UVAF and, possibly, that UVAF remains for at least several months following sun exposure. PMID:27820717
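
    The repeatability analysis named in the abstract above (mean difference with 95% limits of agreement) can be illustrated with a minimal sketch. The function name `bland_altman` and the sample values are hypothetical, and the usual bias ± 1.96 SD form of the limits is assumed; the abstract does not give the authors' exact computation.

```python
from statistics import mean, stdev


def bland_altman(visit1, visit2):
    """Return (mean difference, lower limit, upper limit) for paired measurements.

    Assumes the conventional limits of agreement: bias +/- 1.96 * SD of the differences.
    """
    if len(visit1) != len(visit2):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(visit1, visit2)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical repeated measurements (illustrative values only, not study data).
visit1 = [8.0, 10.5, 7.2, 12.1, 9.4]
visit2 = [7.5, 11.0, 7.0, 12.9, 9.0]
bias, lower, upper = bland_altman(visit1, visit2)
print(f"mean difference = {bias:.3f}, limits of agreement = ({lower:.3f}, {upper:.3f})")
```

    A Bland-Altman plot then shows each pair's difference against its mean, with horizontal lines at the three returned values.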

  6. Structure-function relationships using spectral-domain optical coherence tomography: comparison with scanning laser polarimetry.

    PubMed

    Aptel, Florent; Sayous, Romain; Fortoul, Vincent; Beccat, Sylvain; Denis, Philippe

    2010-12-01

    To evaluate and compare the regional relationships between visual field sensitivity and retinal nerve fiber layer (RNFL) thickness as measured by spectral-domain optical coherence tomography (OCT) and scanning laser polarimetry. Prospective cross-sectional study. One hundred and twenty eyes of 120 patients (40 with healthy eyes, 40 with suspected glaucoma, and 40 with glaucoma) were tested on Cirrus-OCT, GDx VCC, and standard automated perimetry. Raw data on RNFL thickness were extracted for 256 peripapillary sectors of 1.40625 degrees each for the OCT measurement ellipse and 64 peripapillary sectors of 5.625 degrees each for the GDx VCC measurement ellipse. Correlations between peripapillary RNFL thickness in 6 sectors and visual field sensitivity in the 6 corresponding areas were evaluated using linear and logarithmic regression analysis. Receiver operating curve areas were calculated for each instrument. With spectral-domain OCT, the correlations (r(2)) between RNFL thickness and visual field sensitivity ranged from 0.082 (nasal RNFL and corresponding visual field area, linear regression) to 0.726 (supratemporal RNFL and corresponding visual field area, logarithmic regression). By comparison, with GDx-VCC, the correlations ranged from 0.062 (temporal RNFL and corresponding visual field area, linear regression) to 0.362 (supratemporal RNFL and corresponding visual field area, logarithmic regression). In pairwise comparisons, these structure-function correlations were generally stronger with spectral-domain OCT than with GDx VCC and with logarithmic regression than with linear regression. The largest areas under the receiver operating curve were seen for OCT superior thickness (0.963 ± 0.022; P < .001) in eyes with glaucoma and for OCT average thickness (0.888 ± 0.072; P < .001) in eyes with suspected glaucoma. 
    The structure-function relationship was significantly stronger with spectral-domain OCT than with scanning laser polarimetry, and was better expressed logarithmically than linearly. Measurements with these 2 instruments should not be considered to be interchangeable. Copyright © 2010 Elsevier Inc. All rights reserved.
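
    The comparison of linear and logarithmic regression by r² can be sketched as follows. The form y = a + b·ln(x) for the logarithmic model is an assumption (the abstract does not specify it), and the data are synthetic values built to be exactly logarithmic.

```python
import math


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Synthetic data: the response is constructed as an exact logarithmic function.
thickness = [40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
sensitivity = [10.0 * math.log(t) - 20.0 for t in thickness]

# Linear model: regress y on x. Logarithmic model: regress y on ln(x).
r2_linear = r_squared(thickness, sensitivity)
r2_log = r_squared([math.log(t) for t in thickness], sensitivity)
print(f"r2 linear = {r2_linear:.4f}, r2 logarithmic = {r2_log:.4f}")
```

    Both models have two parameters, so their r² values can be compared directly without adjusting for model size.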

  7. A Simulation-Based Comparison of Several Stochastic Linear Regression Methods in the Presence of Outliers.

    ERIC Educational Resources Information Center

    Rule, David L.

    Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…

  8. Unit Cohesion and the Surface Navy: Does Cohesion Affect Performance

    DTIC Science & Technology

    1989-12-01

    v. 68, 1968. Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. Rand Corporation R-2607...Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. SAS User’s Guide: Basics, Version 5 ed

  9. Comparison of Selection Procedures and Validation of Criterion Used in Selection of Significant Control Variates of a Simulation Model

    DTIC Science & Technology

    1990-03-01

    and M.H. Kutner. Applied Linear Regression Models. Homewood IL: Richard D. Irwin Inc., 1983. Pritsker, A. Alan B. Introduction to Simulation and SLAM...Control Variates in Simulation," European Journal of Operational Research, 42: (1989). Neter, J., W. Wasserman, and M.H. Kutner. Applied Linear Regression Models

  10. Comparing Regression Coefficients between Nested Linear Models for Clustered Data with Generalized Estimating Equations

    ERIC Educational Resources Information Center

    Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer

    2013-01-01

    Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…

  11. Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course

    ERIC Educational Resources Information Center

    Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna

    2010-01-01

    Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…

  12. What Is Wrong with ANOVA and Multiple Regression? Analyzing Sentence Reading Times with Hierarchical Linear Models

    ERIC Educational Resources Information Center

    Richter, Tobias

    2006-01-01

    Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…

  13. Some Applied Research Concerns Using Multiple Linear Regression Analysis.

    ERIC Educational Resources Information Center

    Newman, Isadore; Fraas, John W.

    The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…

  14. Using Simple Linear Regression to Assess the Success of the Montreal Protocol in Reducing Atmospheric Chlorofluorocarbons

    ERIC Educational Resources Information Center

    Nelson, Dean

    2009-01-01

    Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…

  15. Quantum State Tomography via Linear Regression Estimation

    PubMed Central

    Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan

    2013-01-01

    A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d^4) where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519

  16. Applications of statistics to medical science, III. Correlation and regression.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.

  17. A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.

    PubMed

    Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S

    2017-06-01

    The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LET_d) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LET_d-based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum- and linear LET_d-based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.
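
    Weighted polynomial fitting and the comparison of nested polynomial degrees can be sketched as below. This is a minimal illustration under stated assumptions: weights are taken as 1/σ², the nested models are compared with the extra-sum-of-squares F statistic (the abstract does not name the test used), and all data values and function names are hypothetical. Turning F into a P value requires the F distribution, which is outside the Python standard library and is omitted here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def weighted_polyfit(x, y, w, degree):
    """Minimise sum_i w_i * (y_i - sum_k c_k x_i**k)**2 via the normal equations.

    Returns (coefficients in increasing power, weighted residual sum of squares).
    """
    p = degree + 1
    ata = [[sum(wi * xi ** (j + k) for xi, wi in zip(x, w)) for k in range(p)]
           for j in range(p)]
    aty = [sum(wi * yi * xi ** j for xi, yi, wi in zip(x, y, w)) for j in range(p)]
    coef = solve(ata, aty)
    rss = sum(wi * (yi - sum(c * xi ** k for k, c in enumerate(coef))) ** 2
              for xi, yi, wi in zip(x, y, w))
    return coef, rss


def nested_f(rss_low, rss_high, p_low, p_high, n):
    """Extra-sum-of-squares F statistic for a lower-degree model nested in a higher one."""
    return ((rss_low - rss_high) / (p_high - p_low)) / (rss_high / (n - p_high))


# Hypothetical (LET, RBE, uncertainty) values, for illustration only.
let = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 15.0]
rbe = [1.02, 1.05, 1.10, 1.18, 1.24, 1.35, 1.44, 1.66]
sigma = [0.05, 0.05, 0.04, 0.04, 0.05, 0.06, 0.08, 0.10]
weights = [1.0 / s ** 2 for s in sigma]  # assumed inverse-variance weighting

_, rss_lin = weighted_polyfit(let, rbe, weights, 1)
_, rss_quad = weighted_polyfit(let, rbe, weights, 2)
f_stat = nested_f(rss_lin, rss_quad, 2, 3, len(let))
print(f"weighted RSS: linear = {rss_lin:.3f}, quadratic = {rss_quad:.3f}, F = {f_stat:.2f}")
```

    Setting every weight to 1 gives the unweighted fit, so the same code can show how the uncertainties change which degree is preferred.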

  18. SAMPA: A free software tool for skin and membrane permeation data analysis.

    PubMed

    Bezrouk, Aleš; Fiala, Zdeněk; Kotingová, Lenka; Krulichová, Iva Selke; Kopečná, Monika; Vávrová, Kateřina

    2017-10-01

    Skin and membrane permeation experiments comprise an important step in the development of a transdermal or topical formulation or toxicological risk assessment. The standard method for analyzing these data relies on the linear part of a permeation profile. However, it is difficult to objectively determine when the profile becomes linear, or the experiment duration may be insufficient to reach a maximum or steady state. Here, we present a software tool for Skin And Membrane Permeation data Analysis, SAMPA, that is easy to use and overcomes several of these difficulties. The SAMPA method and software have been validated on in vitro and in vivo permeation data on human, pig and rat skin and model stratum corneum lipid membranes using compounds that range from highly lipophilic polycyclic aromatic hydrocarbons to highly hydrophilic antiviral drug, with and without two permeation enhancers. The SAMPA performance was compared with the standard method using a linear part of the permeation profile and a complex mathematical model. SAMPA is a user-friendly, open-source software tool for analyzing the data obtained from skin and membrane permeation experiments. It runs on a Microsoft Windows platform and is freely available as a Supporting file to this article. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Regression of non-linear coupling of noise in LIGO detectors

    NASA Astrophysics Data System (ADS)

    Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.

    2018-03-01

    In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data. The effort to improve their sensitivity has never stopped since then. The goal to achieve design sensitivity is challenging. Environmental and instrumental noise couple to the detector output with different, linear and non-linear, coupling mechanisms. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.

  20. Health Monitors for Chronic Disease by Gait Analysis with Mobile Phones

    PubMed Central

    Juen, Joshua; Cheng, Qian; Prieto-Centurion, Valentin; Krishnan, Jerry A.

    2014-01-01

    We have developed GaitTrack, a phone application to detect health status while the smartphone is carried normally. GaitTrack software monitors walking patterns, using only accelerometers embedded in phones to record spatiotemporal motion, without the need for sensors external to the phone. Our software transforms smartphones into health monitors, using eight parameters of phone motion transformed into body motion by the gait model. GaitTrack is designed to detect health status while the smartphone is carried during normal activities, namely, free-living walking. The current method for assessing free-living walking is medical accelerometers, so we present evidence that mobile phones running our software are more accurate. We then show our gait model is more accurate than medical pedometers for counting steps of patients with chronic disease. Our gait model was evaluated in a pilot study involving 30 patients with chronic lung disease. The six-minute walk test (6MWT) is a major assessment for chronic heart and lung disease, including congestive heart failure and especially chronic obstructive pulmonary disease (COPD), affecting millions of persons. The 6MWT consists of walking back and forth along a measured distance for 6 minutes. The gait model using linear regression performed with 94.13% accuracy in measuring walk distance, compared with the established standard of direct observation. We also evaluated a different statistical model using the same gait parameters to predict health status through lung function. This gait model has high accuracy when applied to demographic cohorts, for example, 89.22% accuracy testing the cohort of 12 female patients with ages 50–64 years. PMID:24694291

  1. QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions.

    PubMed

    Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan

    2012-12-01

    A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.

  2. Environmental Health Monitor: Advanced Development of Temperature Sensor Suite.

    DTIC Science & Technology

    1995-07-30

    systems was implemented using program code existing at Veritay. The software, written in Microsoft® QuickBASIC, facilitated program changes for...currently unforeseen reason re-calibration is needed, this can be readily accommodated by a straightforward change in the software program---without...unit. A linear relationship between these differences was obtained using curve fitting software. The ½/-inch globe to 6-inch globe correlation was

  3. Conventional multi-slice computed tomography (CT) and cone-beam CT (CBCT) for computer-aided implant placement. Part II: reliability of mucosa-supported stereolithographic guides.

    PubMed

    Arisan, Volkan; Karabuda, Zihni Cüneyt; Pişkin, Bülent; Özdemir, Tayfun

    2013-12-01

    Deviations of implants that were placed by conventional computed tomography (CT)- or cone beam CT (CBCT)-derived mucosa-supported stereolithographic (SLA) surgical guides were analyzed in this study. Eleven patients were randomly scanned by a multi-slice CT (CT group) or a CBCT scanner (CBCT group). A total of 108 implants were planned on the software and placed using SLA guides. A new CT or CBCT scan was obtained and merged with the planning data to identify the deviations between the planned and placed implants. Results were analyzed by Mann-Whitney U test and multiple regressions (p < .05). Mean angular and linear deviations in the CT group were 3.30° (SD 0.36), and 0.75 (SD 0.32) and 0.80 mm (SD 0.35) at the implant shoulder and tip, respectively. In the CBCT group, mean angular and linear deviations were 3.47° (SD 0.37), and 0.81 (SD 0.32) and 0.87 mm (SD 0.32) at the implant shoulder and tip, respectively. No statistically significant differences were detected between the CT and CBCT groups (p = .169 and p = .551, p = .113 for angular and linear deviations, respectively). Implant placement via CT- or CBCT-derived mucosa-supported SLA guides yielded similar deviation values. Results should be confirmed on alternative CBCT scanners. © 2012 Wiley Periodicals, Inc.

  4. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  5. SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

    PubMed Central

    Zhu, Liping; Huang, Mian; Li, Runze

    2012-01-01

    This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536

  6. On fitting generalized linear mixed-effects models for binary responses using different statistical packages.

    PubMed

    Zhang, Hui; Lu, Naiji; Feng, Changyong; Thurston, Sally W; Xia, Yinglin; Zhu, Liang; Tu, Xin M

    2011-09-10

    The generalized linear mixed-effects model (GLMM) is a popular paradigm to extend models for cross-sectional data to a longitudinal setting. When applied to modeling binary responses, different software packages and even different procedures within a package may give quite different results. In this report, we describe the statistical approaches that underlie these different procedures and discuss their strengths and weaknesses when applied to fit correlated binary responses. We then illustrate these considerations by applying these procedures implemented in some popular software packages to simulated and real study data. Our simulation results indicate a lack of reliability for most of the procedures considered, which carries significant implications for applying such popular software packages in practice. Copyright © 2011 John Wiley & Sons, Ltd.

  7. An Excel Solver Exercise to Introduce Nonlinear Regression

    ERIC Educational Resources Information Center

    Pinder, Jonathan P.

    2013-01-01

    Business students taking business analytics courses that have significant predictive modeling components, such as marketing research, data mining, forecasting, and advanced financial modeling, are introduced to nonlinear regression using application software that is a "black box" to the students. Thus, although correct models are…

  8. Validation of a Video Analysis Software Package for Quantifying Movement Velocity in Resistance Exercises.

    PubMed

    Sañudo, Borja; Rueda, David; Pozo-Cruz, Borja Del; de Hoyo, Moisés; Carrasco, Luis

    2016-10-01

    Sañudo, B, Rueda, D, del Pozo-Cruz, B, de Hoyo, M, and Carrasco, L. Validation of a video analysis software package for quantifying movement velocity in resistance exercises. J Strength Cond Res 30(10): 2934-2941, 2016-The aim of this study was to establish the validity of a video analysis software package in measuring mean propulsive velocity (MPV) and the maximal velocity during bench press. Twenty-one healthy males (21 ± 1 year) with weight training experience were recruited, and the MPV and the maximal velocity of the concentric phase (Vmax) were compared with a linear position transducer system during a standard bench press exercise. Participants performed a 1 repetition maximum test using the supine bench press exercise. The testing procedures involved the simultaneous assessment of bench press propulsive velocity using 2 kinematic (linear position transducer and semi-automated tracking software) systems. High Pearson's correlation coefficients for MPV and Vmax between both devices (r = 0.473 to 0.993) were observed. The intraclass correlation coefficients for barbell velocity data and the kinematic data obtained from video analysis were high (>0.79). In addition, the low coefficients of variation indicate that measurements had low variability. Finally, Bland-Altman plots with the limits of agreement of the MPV and Vmax with different loads showed a negative trend, which indicated that the video analysis had higher values than the linear transducer. In conclusion, this study has demonstrated that the software used for the video analysis was an easy to use and cost-effective tool with a very high degree of concurrent validity. This software can be used to evaluate changes in velocity of training load in resistance training, which may be important for the prescription and monitoring of training programmes.

  9. Sci—Fri PM: Topics — 05: Experience with linac simulation software in a teaching environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carlone, Marco; Harnett, Nicole; Jaffray, David

    Medical linear accelerator education is usually restricted to use of academic textbooks and supervised access to accelerators. To facilitate the learning process, simulation software was developed to reproduce the effect of medical linear accelerator beam adjustments on resulting clinical photon beams. The purpose of this report is to briefly describe the method of operation of the software as well as the initial experience with it in a teaching environment. To first and higher orders, all components of medical linear accelerators can be described by analytical solutions. When appropriate calibrations are applied, these analytical solutions can accurately simulate the performance of all linear accelerator sub-components. Grouped together, an overall medical linear accelerator model can be constructed. Fifteen expressions in total were coded using MATLAB v 7.14. The program was called SIMAC. The SIMAC program was used in an accelerator technology course offered at our institution; 14 delegates attended the course. The professional breakdown of the participants was: 5 physics residents, 3 accelerator technologists, 4 regulators and 1 physics associate. The course consisted of didactic lectures supported by labs using SIMAC. At the conclusion of the course, eight of thirteen delegates were able to successfully perform advanced beam adjustments after two days of theory and use of the linac simulator program. We suggest that this demonstrates good proficiency in understanding of the accelerator physics, which we hope will translate to a better ability to understand real world beam adjustments on a functioning medical linear accelerator.

  10. Prediction of siRNA potency using sparse logistic regression.

    PubMed

    Hu, Wei; Hu, John

    2014-06-01

    RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
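
    How a sparsity penalty drives uninformative weights to exactly zero can be sketched as follows. The sketch assumes an L1-penalised logistic regression solved by proximal gradient descent with soft-thresholding; the abstract does not state the penalty or solver the authors used, and the two ±1-coded features here are hypothetical stand-ins for sequence features.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero and clip small values to 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sparse_logistic(X, y, lam, step=0.5, iters=2000):
    """Minimise mean logistic loss + lam * sum|w_j| (intercept unpenalised)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        err = [pi - yi for pi, yi in zip(probs, y)]
        grad_w = [sum(e * row[j] for e, row in zip(err, X)) / n for j in range(p)]
        grad_b = sum(err) / n
        # Gradient step on the smooth loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


# Hypothetical balanced design: feature 0 predicts the label 75% of the time,
# feature 1 carries no information about the label.
X, y = [], []
for x0 in (1.0, -1.0):
    for x1 in (1.0, -1.0):
        majority = 1 if x0 > 0 else 0
        for label in (majority, majority, majority, 1 - majority):
            X.append([x0, x1])
            y.append(label)

weights, intercept = sparse_logistic(X, y, lam=0.05)
print("weights:", weights, "intercept:", intercept)
```

    The weight of the uninformative feature stays at exactly zero rather than merely small, which is what makes counting nonzero weights meaningful when comparing models.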

  11. An Examination of the Demographic and Career Progression of Air Force Institute of Technology Cost Analysis Graduates.

    DTIC Science & Technology

    1997-09-01

    program include the ACEIT software training and the combination of Department of Defense (DOD) application, regression, and statistics. The weaknesses...and Integrated Tools (ACEIT) software and training could not be praised enough. AFIT vs. Civilian Institutions. The GCA program provides a Department...very useful to the graduates and beneficial to their careers. The main strengths of the program include the ACEIT software training and the combination

  12. Predictive and mechanistic multivariate linear regression models for reaction development

    PubMed Central

    Santiago, Celine B.; Guo, Jing-Yao

    2018-01-01

    Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711

  13. Adding a Parameter Increases the Variance of an Estimated Regression Function

    ERIC Educational Resources Information Center

    Withers, Christopher S.; Nadarajah, Saralees

    2011-01-01

    The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
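
    The claim in the title can be checked in the simplest case, where an intercept-only model is extended to a straight line. The sketch assumes ordinary least squares with independent, equal-variance errors, which may be narrower than the setting the article treats; the function name and design points are hypothetical. Variances are reported in units of the error variance σ².

```python
def variance_factors(x, x0):
    """Return Var(fitted value at x0) / sigma^2 for two nested OLS models.

    First value:  intercept-only model,  1/n.
    Second value: straight-line model,   1/n + (x0 - xbar)^2 / Sxx.
    """
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return 1.0 / n, 1.0 / n + (x0 - xbar) ** 2 / sxx


# Hypothetical design points.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
small, large = variance_factors(x, 5.0)
print(f"intercept only: {small:.2f} sigma^2, with slope: {large:.2f} sigma^2")

# At the mean of x the extra term vanishes and the two variances coincide.
at_mean_small, at_mean_large = variance_factors(x, 3.0)
```

    The added term (x0 − x̄)²/Sxx is never negative, so the extra parameter cannot lower the variance of the fitted value and raises it everywhere except at the mean of the design points.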

  14. Using nonlinear quantile regression to estimate the self-thinning boundary curve

    Treesearch

    Quang V. Cao; Thomas J. Dean

    2015-01-01

    The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...

  15. Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.

    PubMed

    Habib, I H I; Hassouna, M E M; Zaki, G A

    2005-03-01

    Typical anti-mucolytic drugs called salbutamol hydrochloride and bromhexine sulfate encountered in tablets were determined simultaneously either by using linear regression at zero-crossing wavelengths of the first derivative of the UV spectra or by application of the multiple linear partial least squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.

  16. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    PubMed

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  17. Modeling the frequency of opposing left-turn conflicts at signalized intersections using generalized linear regression models.

    PubMed

    Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei

    2014-01-01

    The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binomial distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.

  18. Online Statistical Modeling (Regression Analysis) for Independent Responses

    NASA Astrophysics Data System (ADS)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Statistical models have now been developed in various directions to handle varied types of data and complex relationships. A rich variety of recent, advanced statistical modelling methods is available in open source software (one example is R). However, these advanced methods are not very friendly to novice R users, since they rely on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and shiny), so that the most recent and advanced statistical modelling methods are readily available, accessible and applicable on the web. We have previously made interfaces in the form of e-tutorials for several modern and advanced statistical modelling methods in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and makes it easier to compare models in order to find the most appropriate one for the data.

  19. Cognitive Factors Related to Drug Abuse Among a Sample of Iranian Male Medical College Students

    PubMed Central

    Jalilian, Farzad; Ataee, Mari; Matin, Behzad Karami; Ahmadpanah, Mohammad; Jouybari, Touraj Ahmadi; Eslami, Ahmad Ali; Mahboubi, Mohammad; Alavijeh, Mehdi Mirzaei

    2015-01-01

    Background: Drug abuse is one of the most serious social problems in many countries. College students, particularly in their first year of education, are considered one of the at-risk groups for drug abuse. The present study aimed to determine cognitive factors related to drug abuse among a sample of Iranian male medical college students based on the social cognitive theory (SCT). Method: This cross-sectional study was carried out on 425 Iranian male medical college students who were randomly selected to participate voluntarily in the study. The participants filled out a self-administered questionnaire. Data were analyzed by the SPSS software (ver. 21.0) using bivariate correlations, logistic and linear regression at the 95% significance level. Results: Attitude, outcome expectation, outcome expectancies, subjective norms, and self-control were cognitive factors that accounted for 49% of the variation in the outcome measure of the intention to abuse drugs. Logistic regression showed that attitude (OR=1.062), outcome expectancies (OR=1.115), and subjective norms (OR=1.269) were the most influential predictors for drug abuse. Conclusions: The findings suggest that the design and implementation of educational programs may be useful to increase negative attitude, outcome expectancies, and subjective norms towards drug abuse for college students in order to prevent drug abuse. PMID:26156919

  20. Inverse odds ratio-weighted estimation for causal mediation analysis.

    PubMed

    Tchetgen Tchetgen, Eric J

    2013-11-20

    An important scientific goal of studies in the health and social sciences is increasingly to determine to what extent the total effect of a point exposure is mediated by an intermediate variable on the causal pathway between the exposure and the outcome. A causal framework has recently been proposed for mediation analysis, which gives rise to new definitions, formal identification results and novel estimators of direct and indirect effects. In the present paper, the author describes a new inverse odds ratio-weighted approach to estimate so-called natural direct and indirect effects. The approach, which uses as a weight the inverse of an estimate of the odds ratio function relating the exposure and the mediator, is universal in that it can be used to decompose total effects in a number of regression models commonly used in practice. Specifically, the approach may be used for effect decomposition in generalized linear models with a nonlinear link function, and in a number of other commonly used models such as the Cox proportional hazards regression for a survival outcome. The approach is simple and can be implemented in standard software provided a weight can be specified for each observation. An additional advantage of the method is that it easily incorporates multiple mediators of a categorical, discrete or continuous nature. Copyright © 2013 John Wiley & Sons, Ltd.

  1. The Seismic Tool-Kit (STK): An Open Source Software For Learning the Basis of Signal Processing and Seismology.

    NASA Astrophysics Data System (ADS)

    Reymond, D.

    2016-12-01

    We present an open source software project (GNU public license), named STK: Seismic Tool-Kit, that is dedicated mainly to learning signal processing and seismology. The STK project, which started in 2007, is hosted by SourceForge.net and counts more than 20000 downloads at the date of writing. The STK project is composed of two main branches. First, a graphical interface dedicated to signal processing in the SAC format (SAC_ASCII and SAC_BIN), where the signal can be plotted, zoomed, filtered, integrated, differentiated, etc. (a large variety of IFR and FIR filters is proposed). The passage to the frequency domain via the Fourier transform is used to introduce the estimation of the spectral density of the signal, with visualization of the Power Spectral Density (PSD) in linear or log scale, and also the evolutive time-frequency representation (or sonagram). Three-component signals can also be processed to estimate their polarization properties, either for a given window or for evolutive windows along time. This polarization analysis is useful for extracting polarized noise and differentiating P waves, Rayleigh waves, Love waves, etc. Second, a panel of utility programs is proposed for working in terminal mode, with basic programs for computing azimuth and distance in spherical geometry, inter/auto-correlation, spectral density, time-frequency for an entire directory of signals, focal planes, and main components axis, radiation pattern of P waves, polarization analysis of different waves (including noise), under/over-sampling the signals, cubic-spline smoothing, and linear/non-linear regression analysis of data sets. STK is developed in C/C++, mainly under Linux, and it has also been partially implemented under MS-Windows. STK has been used in some schools for viewing and plotting seismic records provided by IRIS, and it has been used as practical support for teaching the basics of signal processing.
    Useful links: http://sourceforge.net/projects/seismic-toolkit/ and http://sourceforge.net/p/seismic-toolkit/wiki/browse_pages/

  2. VENVAL : a plywood mill cost accounting program

    Treesearch

    Henry Spelter

    1991-01-01

    This report documents a package of computer programs called VENVAL. These programs prepare plywood mill data for a linear programming (LP) model that, in turn, calculates the optimum mix of products to make, given a set of technologies and market prices. (The software to solve a linear program is not provided and must be obtained separately.) Linear programming finds...

  3. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  4. Image interpolation via regularized local linear regression.

    PubMed

    Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang

    2011-12-01

    The linear regression model is a very attractive tool to design effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, it is shown that interpolation with OLS may have some undesirable properties from a robustness point of view: even small amounts of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting with the linear regression model where we replace the OLS error norm with the moving least squares (MLS) error norm leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the l(2)-norm as the estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness preserving constraint. The optimal model parameters can be obtained with a closed-form solution by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance with the state-of-the-art interpolation algorithms, especially in image edge structure preservation. © 2011 IEEE

  5. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics

    PubMed Central

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075

  6. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    PubMed

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  7. Comparison of various error functions in predicting the optimum isotherm by linear and non-linear regression analysis for the sorption of basic red 9 by activated carbon.

    PubMed

    Kumar, K Vasanth; Porkodi, K; Rocha, F

    2008-01-15

    A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions namely coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), the average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r(2) was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K(2) was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
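    The error functions named in this abstract can be sketched in a few lines. The abstract does not give their formulas, so the definitions below are an assumption: they follow forms commonly quoted in the sorption literature, with n data points and p isotherm parameters. The function name and arguments are hypothetical.

```python
import math

def error_functions(measured, predicted, n_params):
    """Return ERRSQ, EABS, ARE, HYBRID and MPSD for one fitted isotherm.

    measured, predicted: equal-length lists of equilibrium uptakes (non-zero).
    n_params: number of isotherm parameters p (assumed definitions, see lead-in).
    """
    n = len(measured)
    dof = n - n_params  # degrees of freedom used by HYBRID and MPSD
    resid = [m - c for m, c in zip(measured, predicted)]
    rel = [r / m for r, m in zip(resid, measured)]  # relative residuals
    return {
        "ERRSQ": sum(r * r for r in resid),               # sum of squared errors
        "EABS": sum(abs(r) for r in resid),               # sum of absolute errors
        "ARE": 100.0 / n * sum(abs(v) for v in rel),      # average relative error, %
        "HYBRID": 100.0 / dof * sum(r * r / m for r, m in zip(resid, measured)),
        "MPSD": 100.0 * math.sqrt(sum(v * v for v in rel) / dof),
    }
```

    In a non-linear fit, one of these values would be minimized over the isotherm parameters; the abstract's point is that the size of the error function alone should not decide which isotherm is chosen.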

  8. Applied Multiple Linear Regression: A General Research Strategy

    ERIC Educational Resources Information Center

    Smith, Brandon B.

    1969-01-01

    Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)

  9. Immobilization of Cellulase from Bacillus subtilis UniMAP-KB01 on Multi-walled Carbon Nanotubes for Biofuel Production

    NASA Astrophysics Data System (ADS)

    Naresh, Sandrasekaran; Hoong Shuit, Siew; Kunasundari, Balakrishnan; Hoo Peng, Yong; Qi, Hwa Ng; Teoh, Yi Peng

    2018-03-01

    Bacillus subtilis UniMAP-KB01, a cellulase producer, was isolated from Malaysian mangrove soil. Morphological identification showed that the B. subtilis isolate is rod-shaped and is a gram-positive bacterium. The growth profile of the isolated B. subtilis was established by measuring optical density (OD) at 600 nm at 1-hour intervals. Polymath software was employed to plot the growth profile, and the non-linear plot established gave a linear regression precision value, R2, of 0.9602, a root mean square deviation (RMSD) of 0.0176 and a variance of 0.0025. The hydrolysis capacity testing revealed a cellulolytic index of 2.83 ± 0.46 after staining with Gram’s Iodine. The crude enzyme harvested after 24 hours of incubation in carboxymethylcellulose (CMC) broth at 45°C and 100 RPM was tested for enzyme activity. Through the Filter Paper Assay (FPA), the cellulase activity was calculated to be 0.05 U/mL. The hydrolysis capacity testing and FPA showed an acceptable value for thermophilic bacterial enzyme activity. Thus, this isolated strain is considered to have potential for producing thermostable cellulase, which will be immobilized onto multi-walled carbon nanotubes, and the cellulolytic activity will be characterized for biofuel production.

  10. Statistical Modelling of Temperature and Moisture Uptake of Biochars Exposed to Selected Relative Humidity of Air.

    PubMed

    Bastistella, Luciane; Rousset, Patrick; Aviz, Antonio; Caldeira-Pires, Armando; Humbert, Gilles; Nogueira, Manoel

    2018-02-09

    New experimental techniques, as well as modern variants on known methods, have recently been employed to investigate the fundamental reactions underlying the oxidation of biochar. The purpose of this paper was to experimentally and statistically study how the relative humidity of air, mass, and particle size of four biochars influenced the adsorption of water and the increase in temperature. A random factorial design was employed using the intuitive statistical software Xlstat. A simple linear regression model and an analysis of variance with a pairwise comparison were performed. The experimental study was carried out on the wood of Quercus pubescens , Cyclobalanopsis glauca , Trigonostemon huangmosun , and Bambusa vulgaris , and involved five relative humidity conditions (22, 43, 75, 84, and 90%), two mass samples (0.1 and 1 g), and two particle sizes (powder and piece). Two response variables including water adsorption and temperature increase were analyzed and discussed. The temperature did not increase linearly with the adsorption of water. Temperature was modeled by nine explanatory variables, while water adsorption was modeled by eight. Five variables, including factors and their interactions, were found to be common to the two models. Sample mass and relative humidity influenced the two qualitative variables, while particle size and biochar type only influenced the temperature.

  11. Prediction of age and gender using digital radiographic method: A retrospective study.

    PubMed

    Poongodi, V; Kanmani, R; Anandi, M S; Krithika, C L; Kannan, A; Raghuram, P H

    2015-08-01

    To investigate age and sex based on the gonial angle and the width and breadth of the ramus of the mandible using digital orthopantomographs. A total of 200 panoramic radiographic images were selected. The age of the individuals ranged between 4 and 75 years and included both genders, males (113) and females (87). The selected radiographic images were measured using the KLONK image measurement software tool with linear and angular measurements. The investigated radiographs were collected from the records of SRM Dental College, Department of Oral Medicine and Radiology. Radiographs with any pathology, facial deformities, no observable mental foramen, congenital deformities, magnification, or distortion were excluded. The mean, median, standard deviation, and first and third quartiles were derived, and linear regression was used to check the correlation of age and gender with the angle of the mandible and the height and width of the ramus of the mandible. The radiographic method is a simpler and more cost-effective method of age identification compared with histological and biochemical methods. The mandible is the strongest facial bone after the skull and pelvic bone. Many previous studies have validated its use for predicting age and gender. Radiographic and tomographic images have become an essential aid for human identification in forensic dentistry. Forensic dentists can choose the most appropriate method, since the validity of age and gender estimation crucially depends on the method used and its proper application.

  12. Estimate the contribution of incubation parameters influence egg hatchability using multiple linear regression analysis

    PubMed Central

    Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma

    2016-01-01

    Aim: This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666

  13. Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.

    PubMed

    Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko

    2016-03-01

    In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
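    The split-half design described in this abstract (estimate on one random half, assess on the held-out half) can be sketched as follows. This is a minimal illustration only: the abstract does not list its predictors or fit measure, so a single hypothetical predictor, ordinary least squares, and mean squared error are assumed, and all function names are invented for the example.

```python
import random

def split_halves(records, seed=0):
    """Randomly divide records into an estimation half and a hold-out half."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def fit_line(pairs):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(pairs, slope, intercept):
    """Mean squared prediction error of the fitted line on the given pairs."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in pairs) / len(pairs)
```

    Calling `mse` on the estimation half gives the in-sample fit and calling it on the hold-out half gives the out-of-sample accuracy; a competing model would be scored on the same two halves.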

  14. Linear Discriminant Analysis on a Spreadsheet.

    ERIC Educational Resources Information Center

    Busbey, Arthur Bresnahan III

    1989-01-01

    Described is a software package, "Trapeze," within which a routine called LinDis can be used. Discussed are teaching methods, the linear discriminant model and equations, the LinDis worksheet, and an example. The set up for this routine is included. (CW)

  15. Finite element modeling of concrete structures strengthened with FRP laminates

    DOT National Transportation Integrated Search

    2001-05-01

    Linear and non-linear method models were developed for a reinforced concrete bridge that had been strengthened with fiber reinforced polymer (FRP) composites. ANSYS and SAP2000 modeling software were used; however, most of the development effort used...

  16. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

    PubMed

    Bennett, Bradley C; Husby, Chad E

    2008-03-28

    Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.

  17. An Application to the Prediction of LOD Change Based on General Regression Neural Network

    NASA Astrophysics Data System (ADS)

    Zhang, X. H.; Wang, Q. J.; Zhu, J. J.; Zhang, H.

    2011-07-01

    Traditional prediction of the LOD (length of day) change was based on linear models, such as the least square model and the autoregressive technique, etc. Due to the complex non-linear features of the LOD variation, the performances of the linear model predictors are not fully satisfactory. This paper applies a non-linear neural network - general regression neural network (GRNN) model to forecast the LOD change, and the results are analyzed and compared with those obtained with the back propagation neural network and other models. The comparison shows that the performance of the GRNN model in the prediction of the LOD change is efficient and feasible.

  18. Solving a mixture of many random linear equations by tensor decomposition and alternating minimization.

    DOT National Transportation Integrated Search

    2016-09-01

    We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...

  19. Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation

    NASA Astrophysics Data System (ADS)

    Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.

    A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OC non-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC) pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OC non-comb is allowed to obtain a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OC non-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.
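    Three of the slope estimators compared in this abstract can be sketched side by side: ordinary least squares, Deming regression, and the ratio of averages (ROA). The sketch below uses the textbook closed form of Deming regression with a single assumed parameter `delta`, the ratio of the error variance in y (OC) to that in x (EC); it is not the authors' implementation, and the York regression with per-point uncertainties is not shown.

```python
import math

def _moments(x, y):
    """Sample means, variances and covariance of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return mx, my, sxx, syy, sxy

def ols_fit(x, y):
    """Ordinary least squares: assumes all measurement error is in y."""
    mx, my, sxx, _, sxy = _moments(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx

def deming_fit(x, y, delta=1.0):
    """Deming regression: errors in both variables, delta = var(err_y) / var(err_x)."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff * diff + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    return slope, my - slope * mx

def ratio_of_averages(x, y):
    """ROA slope estimate: mean(y) / mean(x), which forces a zero intercept."""
    return sum(y) / sum(x)
```

    With x as EC and y as OC, the slope estimates (OC/EC) pri and the intercept plays the role of OC non-comb; ROA has no intercept, which matches the abstract's remark that it assumes a zero OC non-comb value.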

  20. Evaluating Federal Information Technology Program Success Based on Earned Value Management

    ERIC Educational Resources Information Center

    Moy, Mae N.

    2016-01-01

    Despite the use of earned value management (EVM) techniques to track development progress, federal information technology (IT) software programs continue to fail by not meeting identified business requirements. The purpose of this logistic regression study was to examine, using IT software data from federal agencies from 2011 to 2014, whether a relationship…

  1. Clinical Evaluation of the BD FACSPresto™ Near-Patient CD4 Counter in Kenya

    PubMed Central

    Angira, Francis; Akoth, Benta; Omolo, Paul; Opollo, Valarie; Bornheimer, Scott; Judge, Kevin; Tilahun, Henok; Lu, Beverly; Omana-Zapata, Imelda; Zeh, Clement

    2016-01-01

    Background The BD FACSPresto™ Near-Patient CD4 Counter was developed to expand HIV/AIDS management in resource-limited settings. It measures absolute CD4 counts (AbsCD4), percent CD4 (%CD4), and hemoglobin (Hb) from a single drop of capillary or venous blood in approximately 23 minutes, with throughput of 10 samples per hour. We assessed the performance of the BD FACSPresto system, evaluating accuracy, stability, linearity, precision, and reference intervals using capillary and venous blood at KEMRI/CDC HIV-research laboratory, Kisumu, Kenya, and precision and linearity at BD Biosciences, California, USA. Methods For accuracy, venous samples were tested using the BD FACSCalibur™ instrument with BD Tritest™ CD3/CD4/CD45 reagent, BD Trucount™ tubes, and BD Multiset™ software for AbsCD4 and %CD4, and the Sysmex™ KX-21N for Hb. Stability studies evaluated duration of staining (18–120-minute incubation), and effects of venous blood storage <6–24 hours post-draw. A normal cohort was tested for reference intervals. Precision covered multiple days, operators, and instruments. Linearity required mixing two pools of samples, to obtain evenly spaced concentrations for AbsCD4, total lymphocytes, and Hb. Results AbsCD4 and %CD4 venous/capillary (N = 189/ N = 162) accuracy results gave Deming regression slopes within 0.97–1.03 and R2 ≥0.96. For Hb, Deming regression results were R2 ≥0.94 and slope ≥0.94 for both venous and capillary samples. Stability varied within 10% 2 hours after staining and for venous blood stored less than 24 hours. Reference intervals results showed that gender—but not age—differences were statistically significant (p<0.05). Precision results had <3.5% coefficient of variation for AbsCD4, %CD4, and Hb, except for low AbsCD4 samples (<6.8%). Linearity was 42–4,897 cells/μL for AbsCD4, 182–11,704 cells/μL for total lymphocytes, and 2–24 g/dL for Hb. 
    Conclusions The BD FACSPresto system provides accurate, precise clinical results for capillary or venous blood samples and is suitable for near-patient CD4 testing. Trial Registration ClinicalTrials.gov NCT02396355 PMID:27483008

  2. Hypothesis testing in functional linear regression models with Neyman's truncation and wavelet thresholding for longitudinal data.

    PubMed

    Yang, Xiaowei; Nie, Kun

    2008-03-15

    Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.

  3. Development of non-linear models predicting daily fine particle concentrations using aerosol optical depth retrievals and ground-based measurements at a municipality in the Brazilian Amazon region

    NASA Astrophysics Data System (ADS)

    Gonçalves, Karen dos Santos; Winkler, Mirko S.; Benchimol-Barbosa, Paulo Roberto; de Hoogh, Kees; Artaxo, Paulo Eduardo; de Souza Hacon, Sandra; Schindler, Christian; Künzli, Nino

    2018-07-01

    Epidemiological studies generally use measurements of particulate matter with a diameter of less than 2.5 μm (PM2.5) from monitoring networks. Satellite aerosol optical depth (AOD) data have considerable potential for predicting PM2.5 concentrations, and thus provide an alternative method for producing knowledge regarding the level of pollution and its health impact in areas where no ground PM2.5 measurements are available. This is the case in the Brazilian Amazon rainforest region, where forest fires are frequent sources of high pollution. In this study, we applied a non-linear model for predicting PM2.5 concentration from AOD retrievals using interaction terms between average temperature, relative humidity, the sine and cosine of the date over a period of 365.25 days, and the square of the lagged relative residual. Regression performance statistics were tested by comparing the goodness of fit and R2 based on results from linear regression and non-linear regression for six different models. The regression results for non-linear prediction showed the best performance, explaining on average 82% of the daily PM2.5 concentrations when considering the whole period studied. In the context of Amazonia, it was the first study predicting PM2.5 concentrations using the latest high-resolution AOD products in combination with the testing of non-linear model performance. Our results permitted a reliable prediction considering the AOD-PM2.5 relationship and set the basis for further investigations on air pollution impacts in the complex context of the Brazilian Amazon region.
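
    The seasonal predictors mentioned in this abstract (sine and cosine of the date with a 365.25-day period) can be illustrated with a short sketch. The function name and the reference date are assumptions made for the example; the abstract does not say which origin the authors used, and the interaction terms of their model are not reproduced here.

```python
import math
from datetime import date

PERIOD_DAYS = 365.25  # period stated in the abstract


def seasonal_terms(day, origin=date(2000, 1, 1)):
    """Return (sin, cos) of the annual phase of `day`.

    `origin` is a hypothetical reference date; shifting it only
    shifts the phase of the two harmonic terms.
    """
    elapsed = (day - origin).days
    angle = 2.0 * math.pi * elapsed / PERIOD_DAYS
    return math.sin(angle), math.cos(angle)


# Example: harmonic terms for an arbitrary dry-season date.
sin_term, cos_term = seasonal_terms(date(2012, 8, 15))
```

    Using both terms lets a regression fit an annual cycle of any phase with two linear coefficients, which is the usual reason for including them as a pair.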

  4. Linear Proof-Mass Actuator

    NASA Technical Reports Server (NTRS)

    Holloway, Sidney E., III; Crossley, Edward A.; Miller, James B.; Jones, Irby W.; Davis, C. Calvin; Behun, Vaughn D.; Goodrich, Lewis R., Sr.

    1995-01-01

    Linear proof-mass actuator (LPMA) is friction-driven linear mass actuator capable of applying controlled force to structure in outer space to damp out oscillations. Capable of high accelerations and provides smooth, bidirectional travel of mass. Design eliminates gears and belts. LPMA strong enough to be used terrestrially where linear actuators needed to excite or damp out oscillations. High flexibility designed into LPMA by varying size of motors, mass, and length of stroke, and by modifying control software.

  5. Cone-beam computed tomography for lung cancer - validation with CT and monitoring tumour response during chemo-radiation therapy.

    PubMed

    Michienzi, Alissa; Kron, Tomas; Callahan, Jason; Plumridge, Nikki; Ball, David; Everitt, Sarah

    2017-04-01

    Cone-beam computed tomography (CBCT) is a valuable image-guidance tool in radiation therapy (RT). This study was initiated to assess the accuracy of CBCT for quantifying non-small cell lung cancer (NSCLC) tumour volumes compared to the anatomical 'gold standard', CT. Tumour regression or progression on CBCT was also analysed. Patients with Stage I-III NSCLC, prescribed 60 Gy in 30 fractions RT with concurrent platinum-based chemotherapy, routine CBCT and enrolled in a prospective study of serial PET/CT (baseline, weeks two and four) were eligible. Time-matched CBCT and CT gross tumour volumes (GTVs) were manually delineated by a single observer on MIM software, and were analysed descriptively and using Pearson's correlation coefficient (r) and linear regression (R²). Of 94 CT/CBCT pairs, 30 patients were eligible for inclusion. The mean (± SD) CT GTV vs CBCT GTV on the four time-matched pairs were 95 (±182) vs 98.8 (±160.3), 73.6 (±132.4) vs 70.7 (±96.6), 54.7 (±92.9) vs 61.0 (±98.8) and 61.3 (±53.3) vs 62.1 (±47.9) respectively. Pearson's correlation coefficient (r) was 0.98 (95% CI 0.97-0.99, p < 0.001). The mean (±SD) CT/CBCT Dice's similarity coefficient was 0.66 (±0.16). Of 289 CBCT scans, tumours in 27 (90%) patients regressed by a mean (±SD) rate of 1.5% (±0.75) per fraction. The mean (±SD) GTV regression was 43.1% (±23.1) from the first to final CBCT. Primary lung tumour volumes observed on CBCT and time-matched CT are highly correlated (although not identical), thereby validating observations of GTV regression on CBCT in NSCLC. © 2016 The Royal Australian and New Zealand College of Radiologists.
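
    For readers unfamiliar with the overlap metric reported above, the following sketch computes Dice's similarity coefficient for two delineated volumes. Representing each volume as a set of voxel index tuples is an assumption made for the example; the voxel coordinates are made up and are not study data.

```python
def dice_coefficient(volume_a, volume_b):
    """Dice's similarity coefficient: 2|A ∩ B| / (|A| + |B|).

    Each volume is an iterable of voxel indices, e.g. (i, j, k) tuples.
    Returns 1.0 when both volumes are empty.
    """
    a, b = set(volume_a), set(volume_b)
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical delineations sharing two of three voxels each.
ct_gtv = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
cbct_gtv = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
overlap = dice_coefficient(ct_gtv, cbct_gtv)  # 2 * 2 / (3 + 3)
```

    Two volumes can be almost equal in size, and therefore highly correlated, while overlapping only partly in space, which is consistent with the abstract reporting both a high r and a moderate Dice coefficient.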

  6. Predictive Utility of Marketed Volumetric Software Tools in Subjects at Risk for Alzheimer's: Do Regions Outside the Hippocampus Matter?

    PubMed Central

    Tanpitukpongse, Teerath P.; Mazurowski, Maciej A.; Ikhena, John; Petrella, Jeffrey R.

    2016-01-01

    Background and Purpose To assess prognostic efficacy of individual versus combined regional volumetrics in two commercially-available brain volumetric software packages for predicting conversion of patients with mild cognitive impairment to Alzheimer's disease. Materials and Methods Data was obtained through the Alzheimer's Disease Neuroimaging Initiative. 192 subjects (mean age 74.8 years, 39% female) diagnosed with mild cognitive impairment at baseline were studied. All had T1WI MRI sequences at baseline and 3-year clinical follow-up. Analysis was performed with NeuroQuant® and Neuroreader™. Receiver operating characteristic curves assessing the prognostic efficacy of each software package were generated using a univariable approach employing individual regional brain volumes, as well as two multivariable approaches (multiple regression and random forest), combining multiple volumes. Results On univariable analysis of 11 NeuroQuant® and 11 Neuroreader™ regional volumes, hippocampal volume had the highest area under the curve for both software packages (0.69 NeuroQuant®, 0.68 Neuroreader™), and was not significantly different (p > 0.05) between packages. Multivariable analysis did not increase the area under the curve for either package (0.63 logistic regression, 0.60 random forest NeuroQuant®; 0.65 logistic regression, 0.62 random forest Neuroreader™). Conclusion Of the multiple regional volume measures available in FDA-cleared brain volumetric software packages, hippocampal volume remains the best single predictor of conversion of mild cognitive impairment to Alzheimer's disease at 3-year follow-up. Combining volumetrics did not add additional prognostic efficacy. Therefore, future prognostic studies in MCI, combining such tools with demographic and other biomarker measures, are justified in using hippocampal volume as the only volumetric biomarker. PMID:28057634

  7. Stratification for the propensity score compared with linear regression techniques to assess the effect of treatment or exposure.

    PubMed

    Senn, Stephen; Graf, Erika; Caputo, Angelika

    2007-12-30

    Stratifying and matching by the propensity score are increasingly popular approaches to deal with confounding in medical studies investigating effects of a treatment or exposure. A more traditional alternative technique is the direct adjustment for confounding in regression models. This paper discusses fundamental differences between the two approaches, with a focus on linear regression and propensity score stratification, and identifies points to be considered for an adequate comparison. The treatment estimators are examined for unbiasedness and efficiency. This is illustrated in an application to real data and supplemented by an investigation on properties of the estimators for a range of underlying linear models. We demonstrate that in specific circumstances the propensity score estimator is identical to the effect estimated from a full linear model, even if it is built on coarser covariate strata than the linear model. As a consequence the coarsening property of the propensity score (adjustment for a one-dimensional confounder instead of a high-dimensional covariate) may be viewed as a way to implement a pre-specified, richly parametrized linear model. We conclude that the propensity score estimator inherits the potential for overfitting and that care should be taken to restrict covariates to those relevant for outcome. Copyright (c) 2007 John Wiley & Sons, Ltd.

  8. Finite element modeling of reinforced concrete structures strengthened with FRP laminates : final report.

    DOT National Transportation Integrated Search

    2001-05-01

    Linear and non-linear finite element method models were developed for a reinforced concrete bridge that had been strengthened with fiber reinforced polymer composites. ANSYS and SAP2000 modeling software were used; however, most of the development ef...

  9. Mental Models of Software Forecasting

    NASA Technical Reports Server (NTRS)

    Hihn, J.; Griesel, A.; Bruno, K.; Fouser, T.; Tausworthe, R.

    1993-01-01

    The majority of software engineers resist the use of the currently available cost models. One problem is that the mathematical and statistical models that are currently available do not correspond with the mental models of the software engineers. In an earlier JPL-funded study (Hihn and Habib-agahi, 1991) it was found that software engineers prefer to use analogical or analogy-like techniques to derive size and cost estimates, whereas current CERs hide any analogy in the regression equations. In addition, the currently available models depend upon information which is not available during early planning when the most important forecasts must be made.

  10. The Effects of Multiple Linked Representations on Student Learning in Mathematics.

    ERIC Educational Resources Information Center

    Ozgun-Koca, S. Asli

    This study investigated the effects on student understanding of linear relationships using the linked representation software VideoPoint as compared to using semi-linked representation software. It investigated students' attitudes towards and preferences for mathematical representations--equations, tables, or graphs. An Algebra I class was divided…

  11. Interaction Patterns in Synchronous Online Calculus and Linear Algebra Recitations

    ERIC Educational Resources Information Center

    Mayer, Greg; Hendricks, Cher

    2014-01-01

    This study describes interaction patterns observed during a pilot project that explored the use of web-conferencing (WC) software in two undergraduate distance education courses offered to advanced high-school students. The pilot program replaced video-conferencing technology with WC software during recitations, so as to increase participation in…

  12. Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.

    PubMed

    Trninić, Marko; Jeličić, Mario; Papić, Vladan

    2015-07-01

    In kinesiology, medicine, biology and psychology, in which research focus is on dynamical self-organized systems, complex connections exist between variables. Non-linear nature of complex systems has been discussed and explained by the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models, and by (b) including all basketball athletes in the same sample of participants regardless of their playing position. In this paper the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players divided in three groups according to their playing time (8 minutes and more per game) and playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear (general model) and non-linear (general model) regression models were calculated simultaneously and separately for each group. The conclusion is viable: non-linear regressions are frequently superior to linear correlations when interpreting actual association logic among research variables.

  13. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

  14. Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

    PubMed

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A

    2013-01-01

    Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.

  15. Analysis and prediction of flow from local source in a river basin using a Neuro-fuzzy modeling tool.

    PubMed

    Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi

    2007-10-01

    Traditionally, the multiple linear regression technique has been one of the most widely used models in simulating hydrological time series. However, when the nonlinear phenomenon is significant, multiple linear regression will fail to develop an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for the purpose of predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to provide the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, a multiple linear regression analysis that was being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flow while the low and medium flow magnitudes were estimated closer to the observed data. The comparison of the prediction accuracy of the neuro-fuzzy and linear regression methods indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model was able to improve the root mean square error (RMSE) and mean absolute percentage error (MAPE) values of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling of flow dynamics in the study area.
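
    The two error indices quoted in this abstract, and the percentage improvement of one forecast over another, can be computed as in the sketch below. The function names and the flow values are illustrative assumptions, not data from the study, and the abstract does not state exactly how its improvement percentages were derived.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def relative_improvement(baseline_error, candidate_error):
    """Percentage by which the candidate error is lower than the baseline error."""
    return 100.0 * (baseline_error - candidate_error) / baseline_error


# Hypothetical observed flows and forecasts from two models.
observed = [100.0, 200.0]
baseline_forecast = [120.0, 180.0]
candidate_forecast = [110.0, 190.0]

gain = relative_improvement(
    rmse(observed, baseline_forecast),
    rmse(observed, candidate_forecast),
)
```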

  16. An hourly PM10 diagnosis model for the Bilbao metropolitan area using a linear regression methodology.

    PubMed

    González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O

    2013-07-01

    There is extensive evidence of the negative impacts on health linked to the rise of the regional background of particulate matter (PM10) levels. These levels are often increased over urban areas, becoming one of the main air pollution concerns. This is the case in the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7-year historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative (log [NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity) and qualitative (working days/weekends, season [winter/summer], the hour [from 00 to 23 UTC] and precipitation/no precipitation). Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms following Sawa's Bayesian Information Criterion (INT-BIC). Each type of model is calculated using two different periods: a training dataset (6 years) and a testing dataset (1 year). The results of each type of model show that the INT-BIC-based model (R² = 0.42) is the best. Results were R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to the state-of-the-art methodology. The related error calculated for longer time intervals (monthly or seasonal means) diminished significantly (R of 0.75-0.80 for monthly means and R of 0.80 to 0.98 for seasonal means) with respect to shorter periods.

  17. Visual field progression in glaucoma: estimating the overall significance of deterioration with permutation analyses of pointwise linear regression (PoPLR).

    PubMed

    O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H

    2012-10-01

    To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.

  18. A Model Comparison for Count Data with a Positively Skewed Distribution with an Application to the Number of University Mathematics Courses Completed

    ERIC Educational Resources Information Center

    Liou, Pey-Yan

    2009-01-01

    The current study examines three regression models: OLS (ordinary least square) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…

  19. Use of AMMI and linear regression models to analyze genotype-environment interaction in durum wheat.

    PubMed

    Nachit, M M; Nachit, G; Ketata, H; Gauch, H G; Zobel, R W

    1992-03-01

    The joint durum wheat (Triticum turgidum L var 'durum') breeding program of the International Maize and Wheat Improvement Center (CIMMYT) and the International Center for Agricultural Research in the Dry Areas (ICARDA) for the Mediterranean region employs extensive multilocation testing. Multilocation testing produces significant genotype-environment (GE) interaction that reduces the accuracy for estimating yield and selecting appropriate germ plasm. The sum of squares (SS) of GE interaction was partitioned by linear regression techniques into joint, genotypic, and environmental regressions, and by Additive Main effects and the Multiplicative Interactions (AMMI) model into five significant Interaction Principal Component Axes (IPCA). The AMMI model was more effective in partitioning the interaction SS than the linear regression technique. The SS contained in the AMMI model was 6 times higher than the SS for all three regressions. Postdictive assessment recommended the use of the first five IPCA axes, while predictive assessment AMMI1 (main effects plus IPCA1). After elimination of random variation, AMMI1 estimates for genotypic yields within sites were more precise than unadjusted means. This increased precision was equivalent to increasing the number of replications by a factor of 3.7.

  20. FIRE: an SPSS program for variable selection in multiple linear regression analysis via the relative importance of predictors.

    PubMed

    Lorenzo-Seva, Urbano; Ferrando, Pere J

    2011-03-01

    We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

  1. Linear regression based on Minimum Covariance Determinant (MCD) and TELBS methods on the productivity of phytoplankton

    NASA Astrophysics Data System (ADS)

    Gusriani, N.; Firdaniza

    2018-03-01

    The presence of outliers in multiple linear regression analysis causes the Gaussian assumption to be violated. If the least squares method is nevertheless applied to such data, it will produce a model that does not represent most of the data. A regression method that is robust to outliers is therefore needed. This paper compares the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contain outliers. Based on the robust coefficient of determination, the MCD method produces a better model than the TELBS method.

  2. Orthogonal Projection in Teaching Regression and Financial Mathematics

    ERIC Educational Resources Information Center

    Kachapova, Farida; Kachapov, Ilias

    2010-01-01

    Two improvements in teaching linear regression are suggested. The first is to include the population regression model at the beginning of the topic. The second is to use a geometric approach: to interpret the regression estimate as an orthogonal projection and the estimation error as the distance (which is minimized by the projection). Linear…
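
    The geometric reading described in this abstract can be checked numerically: after a least-squares fit, the residual vector is orthogonal to every column of the design (here the constant column and x), which is what makes the fitted values an orthogonal projection of y. The sketch below uses simple regression and made-up numbers; the function names are assumptions for the example.

```python
def ols_fit(x, y):
    """Least-squares intercept and slope for simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Hypothetical data that do not lie on a line.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]

intercept, slope = ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]
residual = [yi - fi for yi, fi in zip(y, fitted)]

# Both inner products vanish (up to rounding) for a least-squares fit.
orth_to_constant = dot(residual, [1.0] * len(x))
orth_to_x = dot(residual, x)
```

    Because the residual is orthogonal to the fitted vector as well, the squared lengths satisfy the Pythagorean relation |y|² = |fitted|² + |residual|².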

  3. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  4. Effect of Stress Corrosion and Cyclic Fatigue on Fluorapatite Glass-Ceramic

    NASA Astrophysics Data System (ADS)

    Joshi, Gaurav V.

    2011-12-01

    Objective: The objective of this study was to test the following hypotheses: 1. Both cyclic degradation and stress corrosion mechanisms result in subcritical crack growth in a fluorapatite glass-ceramic. 2. There is an interactive effect of stress corrosion and cyclic fatigue to cause subcritical crack growth (SCG) for this material. 3. The material that exhibits rising toughness curve (R-curve) behavior also exhibits a cyclic degradation mechanism. Materials and Methods: The material tested was a fluorapatite glass-ceramic (IPS e.max ZirPress, Ivoclar-Vivadent). Rectangular beam specimens with dimensions of 25 mm x 4 mm x 1.2 mm were fabricated using the press-on technique. Two groups of specimens (N=30) with polished (15 μm) or air abraded surface were tested under rapid monotonic loading. Additional polished specimens were subjected to cyclic loading at two frequencies, 2 Hz (N=44) and 10 Hz (N=36), and at different stress amplitudes. All tests were performed using a fully articulating four-point flexure fixture in deionized water at 37°C. The SCG parameters were determined by using a statistical approach by Munz and Fett (1999). The fatigue lifetime data were fit to a general log-linear model in ALTA PRO software (Reliasoft). Fractographic techniques were used to determine the critical flaw sizes to estimate fracture toughness. To determine the presence of R-curve behavior, non-linear regression was used. Results: Increasing the frequency of cycling did not cause a significant decrease in lifetime. The parameters of the general log-linear model showed that only stress corrosion has a significant effect on lifetime. The parameters are presented in the following table.* SCG parameters (n=19-21) were similar for both frequencies. The regression model showed that the fracture toughness was significantly dependent (p<0.05) on critical flaw size. Conclusions: 1. Cyclic fatigue does not have a significant effect on the SCG in the fluorapatite glass-ceramic IPS e.max ZirPress. 2. There was no interactive effect between cyclic degradation and stress corrosion for this material. 3. The material exhibited a low level of R-curve behavior. It did not exhibit cyclic degradation. *Please refer to dissertation for table.

  5. Analysis of Learning Curve Fitting Techniques.

    DTIC Science & Technology

    1987-09-01

    1986. 15. Neter, John and others. Applied Linear Regression Models. Homewood IL: Irwin, 19-33. 16. SAS User’s Guide: Basics, Version 5 Edition. SAS... Linear Regression Techniques (15:23-52). Random errors are assumed to be normally distributed when using ordinary least-squares, according to Johnston...lot estimated by the improvement curve formula. For a more detailed explanation of the ordinary least-squares technique, see Neter, et al., Applied

  6. On vertical profile of ozone at Syowa

    NASA Technical Reports Server (NTRS)

    Chubachi, Shigeru

    1994-01-01

    The difference in the vertical ozone profile at Syowa between 1966-1981 and 1982-1988 is shown. The month-height cross section of the slope of the linear regressions between ozone partial pressure and 100-mb temperature is also shown. The vertically integrated values of the slopes are in close agreement with the slopes calculated by linear regression of Dobson total ozone on 100-mb temperature in the period of 1982-1988.

  7. Binding affinity toward human prion protein of some anti-prion compounds - Assessment based on QSAR modeling, molecular docking and non-parametric ranking.

    PubMed

    Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija

    2018-01-01

    The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrPC) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Classification of sodium MRI data of cartilage using machine learning.

    PubMed

    Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R

    2015-11-01

    To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.

  9. Nonlinear isochrones in murine left ventricular pressure-volume loops: how well does the time-varying elastance concept hold?

    PubMed

    Claessens, T E; Georgakopoulos, D; Afanasyeva, M; Vermeersch, S J; Millar, H D; Stergiopulos, N; Westerhof, N; Verdonck, P R; Segers, P

    2006-04-01

    The linear time-varying elastance theory is frequently used to describe the change in ventricular stiffness during the cardiac cycle. The concept assumes that all isochrones (i.e., curves that connect pressure-volume data occurring at the same time) are linear and have a common volume intercept. Of specific interest is the steepest isochrone, the end-systolic pressure-volume relationship (ESPVR), of which the slope serves as an index for cardiac contractile function. Pressure-volume measurements, achieved with a combined pressure-conductance catheter in the left ventricle of 13 open-chest anesthetized mice, showed a marked curvilinearity of the isochrones. We therefore analyzed the shape of the isochrones by using six regression algorithms (two linear, two quadratic, and two logarithmic, each with a fixed or time-varying intercept) and discussed the consequences for the elastance concept. Our main observations were 1) the volume intercept varies considerably with time; 2) isochrones are equally well described by using quadratic or logarithmic regression; 3) linear regression with a fixed intercept shows poor correlation (R² < 0.75) during isovolumic relaxation and early filling; and 4) logarithmic regression is superior in estimating the fixed volume intercept of the ESPVR. In conclusion, the linear time-varying elastance fails to provide a sufficiently robust model to account for changes in pressure and volume during the cardiac cycle in the mouse ventricle. A new framework accounting for the nonlinear shape of the isochrones needs to be developed.

  10. Does Nonlinear Modeling Play a Role in Plasmid Bioprocess Monitoring Using Fourier Transform Infrared Spectra?

    PubMed

    Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M

    2017-06-01

    The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performances compared to the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version could not be found. The use of nonlinear methods, however, shall be judged regarding the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods herein studied are able to successfully monitor the variables under study.

  11. Application of software technology to automatic test data analysis

    NASA Technical Reports Server (NTRS)

    Stagner, J. R.

    1991-01-01

    The verification process for a major software subsystem was partially automated as part of a feasibility demonstration. The methods employed are generally useful and applicable to other types of subsystems. The effort resulted in substantial savings in test engineer analysis time and offers a method for inclusion of automatic verification as a part of regression testing.

  12. Evaluation of open source data mining software packages

    Treesearch

    Bonnie Ruefenacht; Greg Liknes; Andrew J. Lister; Haans Fisk; Dan Wendt

    2009-01-01

    Since 2001, the USDA Forest Service (USFS) has used classification and regression-tree technology to map USFS Forest Inventory and Analysis (FIA) biomass, forest type, forest type groups, and National Forest vegetation. This prior work used Cubist/See5 software for the analyses. The objective of this project, sponsored by the Remote Sensing Steering Committee (RSSC),...

  13. Heuristic Identification of Biological Architectures for Simulating Complex Hierarchical Genetic Interactions

    PubMed Central

    Moore, Jason H; Amos, Ryan; Kiralis, Jeff; Andrews, Peter C

    2015-01-01

    Simulation plays an essential role in the development of new computational and statistical methods for the genetic analysis of complex traits. Most simulations start with a statistical model using methods such as linear or logistic regression that specify the relationship between genotype and phenotype. This is appealing due to its simplicity and because these statistical methods are commonly used in genetic analysis. It is our working hypothesis that simulations need to move beyond simple statistical models to more realistically represent the biological complexity of genetic architecture. The goal of the present study was to develop a prototype genotype–phenotype simulation method and software that are capable of simulating complex genetic effects within the context of a hierarchical biology-based framework. Specifically, our goal is to simulate multilocus epistasis or gene–gene interaction where the genetic variants are organized within the framework of one or more genes, their regulatory regions and other regulatory loci. We introduce here the Heuristic Identification of Biological Architectures for simulating Complex Hierarchical Interactions (HIBACHI) method and prototype software for simulating data in this manner. This approach combines a biological hierarchy, a flexible mathematical framework, a liability threshold model for defining disease endpoints, and a heuristic search strategy for identifying high-order epistatic models of disease susceptibility. We provide several simulation examples using genetic models exhibiting independent main effects and three-way epistatic effects. PMID:25395175

  14. MSP-Tool: a VBA-based software tool for the analysis of multispecimen paleointensity data

    NASA Astrophysics Data System (ADS)

    Monster, Marilyn; de Groot, Lennart; Dekkers, Mark

    2015-12-01

    The multispecimen protocol (MSP) is a method to estimate the Earth's magnetic field's past strength from volcanic rocks or archeological materials. By reducing the amount of heating steps and aligning the specimens parallel to the applied field, thermochemical alteration and multi-domain effects are minimized. We present a new software tool, written for Microsoft Excel 2010 in Visual Basic for Applications (VBA), that evaluates paleointensity data acquired using this protocol. In addition to the three ratios (standard, fraction-corrected and domain-state-corrected) calculated following Dekkers and Böhnel (2006) and Fabian and Leonhardt (2010) and a number of other parameters proposed by Fabian and Leonhardt (2010), it also provides several reliability criteria. These include an alteration criterion, whether or not the linear regression intersects the y axis within the theoretically prescribed range, and two directional checks. Overprints and misalignment are detected by isolating the remaining natural remanent magnetization (NRM) and the partial thermoremanent magnetization (pTRM) gained and comparing their declinations and inclinations. The NRM remaining and pTRM gained are then used to calculate alignment-corrected multispecimen plots. Data are analyzed using bootstrap statistics. The program was tested on lava samples that were given a full TRM and that acquired their pTRMs at angles of 0, 15, 30 and 90° with respect to their NRMs. MSP-Tool adequately detected and largely corrected these artificial alignment errors.

  15. Evaluation of some facial anthropometric parameters in an Iranian population: infancy through adolescence.

    PubMed

    Jahanbin, Arezoo; Rashed, Roozbeh; Yazdani, Roghayeh; Shahri, Naser Mahdavi; Kianifar, Hamidreza

    2013-05-01

    By finding the mean value of anthropometric parameters in normal samples of a population, it is possible to create a template for facial analysis. The aim of our study was to measure the anthropometric parameters in 0- to 12-year-old girls of Fars ethnic origin in the Northeast of Iran. Six hundred sixty-two newborn to 12-year-old girls of Fars ethnic origin participated in the study. A digital camera was used to take frontal full-face photographs of each child. Thirteen measurements were taken with the Smile Analyzer software: al-al, ch-ch, en-en, ex-ex, ft'-ft', go'-go', t-t, zy'-zy', n'-gn', n'-sn, t-g', t-gn', t-sn. Data were analyzed using the SPSS software at the significance level of 0.05. In almost all parameters, we found significant growth acceleration between 2 and 4 years as well as 5 and 6 years of age. Another growth spurt was seen between 9 and 11 years, although it was less noticeable. Comparing the linear regression equations suggests that different craniofacial dimensions do not grow similarly. By age, craniofacial dimensions change at different rates. Different craniofacial dimensions do not grow at consistent rates. Some parts grow slower compared with others. The intercanthal width has the slowest growth. Facial height shows the fastest growth.

  16. The influence of inspiratory effort and emphysema on pulmonary nodule volumetry reproducibility.

    PubMed

    Moser, J B; Mak, S M; McNulty, W H; Padley, S; Nair, A; Shah, P L; Devaraj, A

    2017-11-01

    To evaluate the impact of inspiratory effort and emphysema on reproducibility of pulmonary nodule volumetry. Eighty-eight nodules in 24 patients with emphysema were studied retrospectively. All patients had undergone volumetric inspiratory and end-expiratory thoracic computed tomography (CT) for consideration of bronchoscopic lung volume reduction. Inspiratory and expiratory nodule volumes were measured using commercially available software. Local emphysema extent was established by analysing a segmentation area extended circumferentially around each nodule (quantified as percent of lung with density of -950 HU or less). Lung volumes were established using the same software. Differences in inspiratory and expiratory nodule volumes were illustrated using the Bland-Altman test. The influences of percentage reduction in lung volume at expiration, local emphysema extent, and nodule size on nodule volume variability were tested with multiple linear regression. The majority of nodules (59/88 [67%]) showed an increased volume at expiration. Mean difference in nodule volume between expiration and inspiration was +7.5% (95% confidence interval: -24.1, 39.1%). No relationships were demonstrated between nodule volume variability and emphysema extent, degree of expiration, or nodule size. Expiration causes a modest increase in volumetry-derived nodule volumes; however, the effect is unpredictable. Local emphysema extent had no significant effect on volume variability in the present cohort. Copyright © 2017 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
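
    The Bland-Altman comparison mentioned above reduces to a mean difference (bias) and an interval around it. The following is a minimal sketch under stated assumptions: the paired volumes are hypothetical, the function name `bland_altman` is illustrative, and the limits follow the conventional mean ± 1.96 SD definition, which may differ from the exact procedure used in the study.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits of agreement use the conventional bias +/- 1.96 * SD of the
    paired differences (sample standard deviation).
    """
    differences = [b - a for a, b in zip(first, second)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical nodule volumes (arbitrary units) at inspiration and expiration.
inspiratory = [100.0, 150.0, 200.0, 250.0]
expiratory = [110.0, 145.0, 215.0, 260.0]

bias, lower, upper = bland_altman(inspiratory, expiratory)
```

    A positive bias with limits that straddle zero would correspond to the pattern reported in the abstract: a modest average increase at expiration whose direction for any single nodule is not predictable.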

  17. Occlusal wear and occlusal condition in a convenience sample of young adults.

    PubMed

    Van't Spijker, A; Kreulen, C M; Bronkhorst, E M; Creugers, N H J

    2015-01-01

    To study progression of tooth wear quantitatively in a convenience sample of young adults and to assess possible correlations with occlusal conditions. Twenty-eight dental students participated in a three-year follow up study on tooth wear. Visible wear facets on full arch gypsum casts were assessed using a flatbed scanner and measuring software. Regression analyses were used to assess possible associations between the registered occlusal conditions 'occlusal guidance scheme', 'vertical overbite', 'horizontal overbite', 'depth of sagittal curve', 'canine Angle class relation', 'history of orthodontic treatment', and 'self-reported grinding/clenching' (independent variables) and increase of wear facets (dependent variable). Mean increase in facet surface areas ranged from 1.2 mm² (premolars, incisors) to 3.4 mm² (molars); the relative increase ranged from 15% to 23%. Backward regression analysis showed no significant relation for 'group function', 'vertical overbite', 'depth of sagittal curve', 'history of orthodontic treatment' or 'self-reported clenching'. The final multiple linear regression model showed significant associations amongst 'anterior protected articulation' and 'horizontal overbite' and increase of facet surface areas. For all teeth combined, only 'anterior protected articulation' had a significant effect. 'Self reported grinding' did not have a significant effect (p>0.07). In this study 'anterior protected articulation' and 'horizontal overbite' were significantly associated with the progression of tooth wear. Self reported grinding was not significantly associated with progression of tooth wear. Occlusal conditions such as anterior protected articulation and horizontal overbite seem to have an effect on the progression of occlusal tooth wear in this convenience sample of young adults. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Application of General Regression Neural Network to the Prediction of LOD Change

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-Hong; Wang, Qi-Jie; Zhu, Jian-Jun; Zhang, Hao

    2012-01-01

    Traditional methods for predicting the change in length of day (LOD change) are mainly based on linear models, such as the least squares model and the autoregression model. However, the LOD change comprises complicated non-linear factors, and the prediction performance of linear models is often unsatisfactory. Thus, a non-linear neural network, the general regression neural network (GRNN) model, is applied to predict the LOD change, and the result is compared with the predictions obtained with the BP (back propagation) neural network model and other models. The comparison shows that applying the GRNN to the prediction of the LOD change is highly effective and feasible.

  19. Estimating effects of limiting factors with regression quantiles

    USGS Publications Warehouse

    Cade, B.S.; Terrell, J.W.; Schroeder, R.L.

    1999-01-01

    In a recent Concepts paper in Ecology, Thomson et al. emphasized that assumptions of conventional correlation and regression analyses fundamentally conflict with the ecological concept of limiting factors, and they called for new statistical procedures to address this problem. The analytical issue is that unmeasured factors may be the active limiting constraint and may induce a pattern of unequal variation in the biological response variable through an interaction with the measured factors. Consequently, changes near the maxima, rather than at the center of response distributions, are better estimates of the effects expected when the observed factor is the active limiting constraint. Regression quantiles provide estimates for linear models fit to any part of a response distribution, including near the upper bounds, and require minimal assumptions about the form of the error distribution. Regression quantiles extend the concept of one-sample quantiles to the linear model by solving an optimization problem of minimizing an asymmetric function of absolute errors. Rank-score tests for regression quantiles provide tests of hypotheses and confidence intervals for parameters in linear models with heteroscedastic errors, conditions likely to occur in models of limiting ecological relations. We used selected regression quantiles (e.g., 5th, 10th, ..., 95th) and confidence intervals to test hypotheses that parameters equal zero for estimated changes in average annual acorn biomass due to forest canopy cover of oak (Quercus spp.) and oak species diversity. Regression quantiles also were used to estimate changes in glacier lily (Erythronium grandiflorum) seedling numbers as a function of lily flower numbers, rockiness, and pocket gopher (Thomomys talpoides fossor) activity, data that motivated the query by Thomson et al. for new statistical procedures. 
Both example applications showed that effects of limiting factors estimated by changes in some upper regression quantile (e.g., 90-95th) were greater than if effects were estimated by changes in the means from standard linear model procedures. Estimating a range of regression quantiles (e.g., 5-95th) provides a comprehensive description of biological response patterns for exploratory and inferential analyses in observational studies of limiting factors, especially when sampling large spatial and temporal scales.
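
    The "asymmetric function of absolute errors" that regression quantiles minimize can be illustrated in the one-sample case the abstract refers to. The following is a minimal sketch, not the authors' procedure: the data and the function names are assumptions for illustration, and a brute-force search over observed values stands in for the linear-programming solvers used in practice.

```python
def check_loss(residual, tau):
    """Asymmetric absolute loss: weight tau above the fit, (1 - tau) below it."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def total_loss(data, candidate, tau):
    """Sum of check losses when `candidate` is used as the quantile estimate."""
    return sum(check_loss(value - candidate, tau) for value in data)


def one_sample_quantile(data, tau):
    """Brute-force minimizer; a minimizer always exists among the observed values."""
    return min(sorted(data), key=lambda c: total_loss(data, c, tau))


# Hypothetical response values.
data = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0, 3.5, 8.0, 7.0, 6.0]

median = one_sample_quantile(data, 0.5)
upper = one_sample_quantile(data, 0.9)
```

    With tau = 0.5 the loss is symmetric and the minimizer is the sample median; raising tau toward 0.9 penalizes under-prediction more heavily and pulls the estimate toward the upper part of the distribution, which is the region the abstract associates with an active limiting factor. Replacing the single candidate value with a linear predictor gives the regression-quantile problem.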

  20. Evaluating the effect of a third-party implementation of resolution recovery on the quality of SPECT bone scan imaging using visual grading regression.

    PubMed

    Hay, Peter D; Smith, Julie; O'Connor, Richard A

    2016-02-01

    The aim of this study was to evaluate the benefits to SPECT bone scan image quality when applying resolution recovery (RR) during image reconstruction using software provided by a third-party supplier. Bone SPECT data from 90 clinical studies were reconstructed retrospectively using software supplied independent of the gamma camera manufacturer. The current clinical datasets contain 120×10 s projections and are reconstructed using an iterative method with a Butterworth postfilter. Five further reconstructions were created with the following characteristics: 10 s projections with a Butterworth postfilter (to assess intraobserver variation); 10 s projections with a Gaussian postfilter with and without RR; and 5 s projections with a Gaussian postfilter with and without RR. Two expert observers were asked to rate image quality on a five-point scale relative to our current clinical reconstruction. Datasets were anonymized and presented in random order. The benefits of RR on image scores were evaluated using ordinal logistic regression (visual grading regression). The application of RR during reconstruction increased the probability of both observers of scoring image quality as better than the current clinical reconstruction even where the dataset contained half the normal counts. Type of reconstruction and observer were both statistically significant variables in the ordinal logistic regression model. Visual grading regression was found to be a useful method for validating the local introduction of technological developments in nuclear medicine imaging. RR, as implemented by the independent software supplier, improved bone SPECT image quality when applied during image reconstruction. In the majority of clinical cases, acquisition times for bone SPECT intended for the purposes of localization can safely be halved (from 10 s projections to 5 s) when RR is applied.

  1. On the use and misuse of scalar scores of confounders in design and analysis of observational studies.

    PubMed

    Pfeiffer, R M; Riedl, R

    2015-08-15

    We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yield unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

  2. A General Sparse Tensor Framework for Electronic Structure Theory

    DOE PAGES

    Manzer, Samuel; Epifanovsky, Evgeny; Krylov, Anna I.; ...

    2017-01-24

    Linear-scaling algorithms must be developed in order to extend the domain of applicability of electronic structure theory to molecules of any desired size. But, the increasing complexity of modern linear-scaling methods makes code development and maintenance a significant challenge. A major contributor to this difficulty is the lack of robust software abstractions for handling block-sparse tensor operations. We therefore report the development of a highly efficient symbolic block-sparse tensor library in order to provide access to high-level software constructs to treat such problems. Our implementation supports arbitrary multi-dimensional sparsity in all input and output tensors. We then avoid cumbersome machine-generated code by implementing all functionality as a high-level symbolic C++ language library and demonstrate that our implementation attains very high performance for linear-scaling sparse tensor contractions.

  3. Association of digital cushion thickness with sole temperature measured with the use of infrared thermography.

    PubMed

    Oikonomou, G; Trojacanec, P; Ganda, E K; Bicalho, M L S; Bicalho, R C

    2014-07-01

    The main objective of this study was to investigate the association between digital cushion thickness and sole temperature measured by infrared thermography. Data were collected from 216 lactating Holstein cows at 4 to 10d in milk (DIM). Cows were locomotion scored and sole temperature was measured after claw trimming (a minimum delay of 3 min was allowed for the hoof to cool) using an infrared thermography camera. Temperature was measured at the typical ulcer site of the lateral digit of the left hind foot. Immediately after the thermographic image was obtained, the thickness of the digital cushion was measured by ultrasonography. Rumen fluid samples were collected with a stomach tube and sample pH was measured immediately after collection. Additionally, a blood sample was obtained and used for measurements of serum concentrations of β-hydroxybutyrate (BHBA), nonesterified fatty acids (NEFA), and haptoglobin. To evaluate the associations of digital cushion thickness with sole temperature, a linear regression model was built using the GLIMMIX procedure in SAS software (SAS Institute Inc., Cary, NC). Sole temperature was the response variable, and digital cushion thickness quartiles, locomotion score group, rumen fluid pH, rumen fluid sample volume, environmental temperature, age in days, and serum levels of NEFA, BHBA, and haptoglobin were fitted in the model. Only significant variables were retained in the final model. Simple linear regression scatter plots were used to illustrate associations between sole temperature (measured by infrared thermography at the typical ulcer site) and environmental temperature and between NEFA and BHBA serum levels and haptoglobin. One-way ANOVA was used to compare rumen fluid pH for different locomotion score groups and for different digital cushion quartiles. 
Results from the multivariable linear regression model showed that sole temperature increased as locomotion scores increased and decreased as digital cushion thickness increased. These results were adjusted for environmental temperature, which was significantly associated with sole temperature. Serum levels of NEFA, BHBA, and haptoglobin were not associated with sole temperature. However, significant correlations existed between serum levels of NEFA and haptoglobin and between serum levels of BHBA and haptoglobin. Rumen fluid pH was not associated with either locomotion score or digital cushion thickness. In conclusion, we show here that digital cushion thickness was associated with sole temperature in cows at 4 to 10 DIM. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  4. Bayesian Asymmetric Regression as a Means to Estimate and Evaluate Oral Reading Fluency Slopes

    ERIC Educational Resources Information Center

    Solomon, Benjamin G.; Forsberg, Ole J.

    2017-01-01

    Bayesian techniques have become increasingly present in the social sciences, fueled by advances in computer speed and the development of user-friendly software. In this paper, we forward the use of Bayesian Asymmetric Regression (BAR) to monitor intervention responsiveness when using Curriculum-Based Measurement (CBM) to assess oral reading…

  5. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    ERIC Educational Resources Information Center

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  6. Application of third molar development and eruption models in estimating dental age in Malay sub-adults.

    PubMed

    Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc

    2015-08-01

    The third molar development (TMD) has been widely utilized as one of the radiographic methods for dental age estimation. By using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation in individual method models and the combined model (TMD and TME) based on the classic regressions of multiple linear and principal component analysis. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and Olze were employed to stage the TMD and TME, respectively. The data were divided to develop three respective models based on the two regressions of multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between each model. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² increased in the linear regressions of the combined model as compared to the individual models. An overall decrease in RMSE was detected in the combined model as compared to TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, the combined model exhibited a low adjusted R² and a high RMSE, except in males. Dental age estimation is better predicted using the combined model in multiple linear regression models. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
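
    The fit statistics named in this abstract (R², adjusted R², and RMSE) can be sketched as follows. This is a minimal illustration using the standard textbook definitions; the function names and the age values are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R² for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical ages in years (illustrative only, not data from the study).
observed = [15.0, 17.0, 19.0, 21.0]
predicted = [15.5, 16.5, 19.5, 20.5]

r2 = r_squared(observed, predicted)
print(rmse(observed, predicted), r2, adjusted_r_squared(r2, len(observed), 1))
```

    Adjusted R² is the quantity to compare when models differ in the number of predictors, as the individual and combined models do here, because plain R² cannot decrease when a predictor is added.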

  7. 40 CFR 1066.220 - Linearity verification for chassis dynamometer systems.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... dynamometer speed and torque at least as frequently as indicated in Table 1 of § 1066.215. The intent of... linear regression and the linearity criteria specified in Table 1 of this section. (b) Performance requirements. If a measurement system does not meet the applicable linearity criteria in Table 1 of this...

  8. A Learning Progression Should Address Regression: Insights from Developing Non-Linear Reasoning in Ecology

    ERIC Educational Resources Information Center

    Hovardas, Tasos

    2016-01-01

    Although ecological systems at varying scales involve non-linear interactions, learners insist on thinking in a linear fashion when they deal with ecological phenomena. The overall objective of the present contribution was to propose a hypothetical learning progression for developing non-linear reasoning in prey-predator systems and to provide…

  9. Application of Hierarchical Linear Models/Linear Mixed-Effects Models in School Effectiveness Research

    ERIC Educational Resources Information Center

    Ker, H. W.

    2014-01-01

    Multilevel data are very common in educational research. Hierarchical linear models/linear mixed-effects models (HLMs/LMEs) are now often used to analyze multilevel data. This paper discusses the problems of utilizing ordinary regressions for modeling multilevel educational data, compares the data analytic results from three regression…

  10. Estimating normative limits of Heidelberg Retina Tomograph optic disc rim area with quantile regression.

    PubMed

    Artes, Paul H; Crabb, David P

    2010-01-01

    To investigate why the specificity of the Moorfields Regression Analysis (MRA) of the Heidelberg Retina Tomograph (HRT) varies with disc size, and to derive accurate normative limits for neuroretinal rim area to address this problem. Two datasets from healthy subjects (Manchester, UK, n = 88; Halifax, Nova Scotia, Canada, n = 75) were used to investigate the physiological relationship between the optic disc and neuroretinal rim area. Normative limits for rim area were derived by quantile regression (QR) and compared with those of the MRA (derived by linear regression). Logistic regression analyses were performed to quantify the association between disc size and positive classifications with the MRA, as well as with the QR-derived normative limits. In both datasets, the specificity of the MRA depended on optic disc size. The odds of observing a borderline or outside-normal-limits classification increased by approximately 10% for each 0.1 mm² increase in disc area (P < 0.1). The lower specificity of the MRA with large optic discs could be explained by the failure of linear regression to model the extremes of the rim area distribution (observations far from the mean). In comparison, the normative limits predicted by QR were larger for smaller discs (less specific, more sensitive), and smaller for larger discs, such that false-positive rates became independent of optic disc size. Normative limits derived by quantile regression appear to remove the size-dependence of specificity with the MRA. Because quantile regression does not rely on the restrictive assumptions of standard linear regression, it may be a more appropriate method for establishing normative limits in other clinical applications where the underlying distributions are nonnormal or have nonconstant variance.
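
    The contrast between quantile regression and mean-based linear regression can be illustrated with the quantile ("pinball") loss that quantile regression minimizes. The sketch below is an assumption-laden toy, not the authors' procedure: it fits only a constant (no covariate such as disc area), the helper names are hypothetical, and the numbers are invented.

```python
def pinball_loss(y, prediction, tau):
    """Quantile loss: under-predictions cost tau, over-predictions cost 1 - tau."""
    residual = y - prediction
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant_quantile(values, tau):
    """Return the observed value that minimizes the total pinball loss.

    For a constant-only model this minimizer is a sample tau-quantile.
    """
    return min(values, key=lambda c: sum(pinball_loss(y, c, tau) for y in values))


# Invented data with one extreme observation.
data = [1, 2, 3, 4, 100]

print(sum(data) / len(data))              # the mean is pulled toward the outlier
print(best_constant_quantile(data, 0.5))  # the median is not
print(best_constant_quantile(data, 0.1))  # a lower limit, targeted directly
```

    Because the loss targets a chosen quantile directly, a lower normative limit does not have to be derived from a mean and an assumed constant, normal error distribution.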

  11. Statistical analysis of water-quality data containing multiple detection limits II: S-language software for nonparametric distribution modeling and hypothesis testing

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2007-01-01

    Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
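
    One common way to apply K-M to left-censored data is to subtract every value from a constant larger than the maximum, so that nondetects become right-censored observations. The abstract does not say whether the S-language software works this way, so the Python sketch below is an assumption for illustration; the function names and the concentrations are hypothetical.

```python
def kaplan_meier(times, is_event):
    """Return [(t, S(t))] at each distinct event time for right-censored data."""
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, is_event) if e}):
        # Observations censored at t are still counted as at risk at t.
        at_risk = sum(1 for u in times if u >= t)
        events = sum(1 for u, e in zip(times, is_event) if e and u == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


def km_left_censored(values, detected, flip_constant):
    """Estimate P(X < x) for left-censored data by flipping the scale.

    values: measured value, or the detection limit for a nondetect.
    detected: True for a measured value, False for a nondetect ("< limit").
    flip_constant: any constant larger than max(values) (an assumption here).
    """
    flipped = [flip_constant - v for v in values]
    curve = kaplan_meier(flipped, detected)
    # S_flipped(t) = P(flip_constant - X > t) = P(X < flip_constant - t)
    return sorted((flip_constant - t, s) for t, s in curve)


# Hypothetical concentrations: "<1", 2, 3.
print(km_left_censored([1, 2, 3], [False, True, True], flip_constant=4))
```

    The estimate is only defined at observed detected values, which mirrors the abstract's point that K-M methods neither interpolate nor extrapolate.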

  12. Functional Additive Mixed Models

    PubMed Central

    Scheipl, Fabian; Staicu, Ana-Maria; Greven, Sonja

    2014-01-01

    We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms. Estimation and inference in this framework is based on standard additive mixed models, allowing us to take advantage of established methods and robust, flexible algorithms. We provide easy-to-use open source software in the pffr() function for the R-package refund. Simulations show that the proposed method recovers relevant effects reliably, handles small sample sizes well and also scales to larger data sets. Applications with spatially and longitudinally observed functional data demonstrate the flexibility in modeling and interpretability of results of our approach. PMID:26347592

  13. Office workers' computer use patterns are associated with workplace stressors.

    PubMed

    Eijckelhof, Belinda H W; Huysmans, Maaike A; Blatter, Birgitte M; Leider, Priscilla C; Johnson, Peter W; van Dieën, Jaap H; Dennerlein, Jack T; van der Beek, Allard J

    2014-11-01

    This field study examined associations between workplace stressors and office workers' computer use patterns. We collected keyboard and mouse activities of 93 office workers (68F, 25M) for approximately two work weeks. Linear regression analyses examined the associations between self-reported effort, reward, overcommitment, and perceived stress and software-recorded computer use duration, number of short and long computer breaks, and pace of input device usage. Daily duration of computer use was, on average, 30 min longer for workers with high compared to low levels of overcommitment and perceived stress. The number of short computer breaks (30 s-5 min long) was approximately 20% lower for those with high compared to low effort and for those with low compared to high reward. These outcomes support the hypothesis that office workers' computer use patterns vary across individuals with different levels of workplace stressors. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  14. Functional Additive Mixed Models.

    PubMed

    Scheipl, Fabian; Staicu, Ana-Maria; Greven, Sonja

    2015-04-01

    We propose an extensive framework for additive regression models for correlated functional responses, allowing for multiple partially nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data. Additionally, our framework includes linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. It accommodates densely or sparsely observed functional responses and predictors which may be observed with additional error and includes both spline-based and functional principal component-based terms. Estimation and inference in this framework is based on standard additive mixed models, allowing us to take advantage of established methods and robust, flexible algorithms. We provide easy-to-use open source software in the pffr() function for the R-package refund. Simulations show that the proposed method recovers relevant effects reliably, handles small sample sizes well and also scales to larger data sets. Applications with spatially and longitudinally observed functional data demonstrate the flexibility in modeling and interpretability of results of our approach.

  15. The QSAR study of flavonoid-metal complexes scavenging •OH free radical

    NASA Astrophysics Data System (ADS)

    Wang, Bo-chu; Qian, Jun-zhen; Fan, Ying; Tan, Jun

    2014-10-01

    Flavonoid-metal complexes have antioxidant activities. However, quantitative structure-activity relationships (QSAR) of flavonoid-metal complexes and their antioxidant activities have still not been tackled. On the basis of 21 structures of flavonoid-metal complexes and their antioxidant activities for scavenging the •OH free radical, we optimised their structures using the Gaussian 03 software package and subsequently calculated and chose 18 quantum chemistry descriptors such as dipole, charge and energy. We then selected, by stepwise linear regression, several quantum chemistry descriptors that are very important to the IC50 of flavonoid-metal complexes for scavenging the •OH free radical. Meanwhile, we obtained 4 new variables through principal component analysis. Finally, we built the QSAR models based on those important quantum chemistry descriptors and the 4 new variables as the independent variables and the IC50 as the dependent variable using an Artificial Neural Network (ANN), and we validated the two models using experimental data. These results show that the two models in this paper are reliable and predictive.

  16. Study on energy saving of subway station based on orthogonal experimental method

    NASA Astrophysics Data System (ADS)

    Guo, Lei

    2017-05-01

    Because it is quick, efficient, and carries large volumes of passengers, the subway has become an important way to relieve urban traffic congestion. Because the subway environment changes with external factors such as temperature and passenger load, a three-dimensional numerical simulation study of the air distribution on a subway platform is conducted using CFD software. The influence of different loads (the supply air temperature and velocity of the air conditioning, personnel load, and heat flux of the wall) on the subway platform flow field is also analysed. The orthogonal experiment method is applied to the numerical simulation analysis of human comfort under different parameters. Based on those results, the functional relationship between human comfort and the boundary conditions of the platform is obtained by multiple linear regression fitting, and the order of the major boundary conditions that affect human comfort is determined. The above study provides a theoretical basis for the final energy-saving strategies.

  17. Multivariate meta-analysis using individual participant data

    PubMed Central

    Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.

    2016-01-01

    When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484

  18. Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

    NASA Technical Reports Server (NTRS)

    McKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

    2005-01-01

    Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
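
    The model form described here, an intercept plus one fixed-power term per regression variable, stays linear in its coefficients once the powers are fixed. The sketch below shows only the prediction step under that reading; the function names, coefficients, and powers are hypothetical placeholders, not values from the report.

```python
import math


def predict_power_model(intercept, terms, x):
    """Evaluate intercept + sum(coef_i * x_i ** power_i).

    terms: list of (coefficient, power) pairs, one per regression variable.
    x: list of regression-variable values in the same order.
    """
    return intercept + sum(c * xi ** p for (c, p), xi in zip(terms, x))


def predict_night_edr(intercept, terms, x):
    """Night model regresses on ln(EDR); exponentiate to return to EDR units."""
    return math.exp(predict_power_model(intercept, terms, x))


# Hypothetical coefficients and inputs, for illustration only.
day_terms = [(2.0, 2.0), (0.5, 1.0)]
print(predict_power_model(1.0, day_terms, [3.0, 4.0]))  # 1 + 2*3**2 + 0.5*4
```

    With the powers held fixed, the coefficients can be estimated by ordinary least squares on the transformed variables x_i ** power_i.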

  19. Spatio-temporal water quality mapping from satellite images using geographically and temporally weighted regression

    NASA Astrophysics Data System (ADS)

    Chu, Hone-Jay; Kong, Shish-Jeng; Chang, Chih-Hua

    2018-03-01

    The turbidity (TB) of a water body varies with time and space. Water quality is traditionally estimated via linear regression based on satellite images. However, estimating and mapping water quality require a spatio-temporal nonstationary model, while TB mapping necessitates the use of geographically and temporally weighted regression (GTWR) and geographically weighted regression (GWR) models, both of which are more precise than linear regression. Given the temporal nonstationary models for mapping water quality, GTWR offers the best option for estimating regional water quality. Compared with GWR, GTWR provides highly reliable information for water quality mapping, boasts a relatively high goodness of fit, improves the explanation of variance from 44% to 87%, and shows a sufficient space-time explanatory power. The seasonal patterns of TB and the main spatial patterns of TB variability can be identified using the estimated TB maps from GTWR and by conducting an empirical orthogonal function (EOF) analysis.
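
    The idea behind GWR and GTWR can be sketched as a weighted least-squares fit in which observations closer to the target location (and, for GTWR, closer in time) receive larger weights. The Gaussian kernel, the single predictor, the `time_scale` parameter, and all names and numbers below are assumptions chosen for illustration; they are not the authors' implementation.

```python
import math


def local_weighted_fit(observations, target, bandwidth, time_scale=0.0):
    """Fit response = intercept + slope * predictor around one target point.

    observations: list of (x, y, t, predictor, response) tuples.
    target: (x, y, t) location and time at which to estimate coefficients.
    time_scale: 0.0 ignores time (GWR-like); > 0 adds temporal distance
                to the squared space-time distance (GTWR-like).
    """
    tx, ty, tt = target
    weights = []
    for x, y, t, _, _ in observations:
        dist_sq = (x - tx) ** 2 + (y - ty) ** 2 + time_scale * (t - tt) ** 2
        weights.append(math.exp(-0.5 * dist_sq / bandwidth ** 2))
    total = sum(weights)
    mean_p = sum(w * o[3] for w, o in zip(weights, observations)) / total
    mean_r = sum(w * o[4] for w, o in zip(weights, observations)) / total
    sxx = sum(w * (o[3] - mean_p) ** 2 for w, o in zip(weights, observations))
    sxy = sum(w * (o[3] - mean_p) * (o[4] - mean_r)
              for w, o in zip(weights, observations))
    slope = sxy / sxx
    return mean_r - slope * mean_p, slope


# Invented data: same site, two dates with different predictor-response relations.
early = [(0.0, 0.0, 0.0, p, 2.0 * p) for p in (1.0, 2.0, 3.0)]
late = [(0.0, 0.0, 10.0, p, 4.0 + p) for p in (1.0, 2.0, 3.0)]

print(local_weighted_fit(early + late, (0.0, 0.0, 0.0), 1.0))                  # time ignored
print(local_weighted_fit(early + late, (0.0, 0.0, 0.0), 1.0, time_scale=1.0))  # time weighted
```

    With time ignored, the two dates are pooled into a single compromise fit; with temporal weighting, the fit at the early date recovers the early relationship, which is the kind of temporal nonstationarity the abstract attributes to GTWR.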

  20. Mental chronometry with simple linear regression.

    PubMed

    Chen, J Y

    1997-10-01

    Typically, mental chronometry is performed by means of introducing an independent variable postulated to affect selectively some stage of a presumed multistage process. However, the effect could be a global one that spreads proportionally over all stages of the process. Currently, there is no method to test this possibility although simple linear regression might serve the purpose. In the present study, the regression approach was tested with tasks (memory scanning and mental rotation) that involved a selective effect and with a task (word superiority effect) that involved a global effect, by the dominant theories. The results indicate (1) the manipulation of the size of a memory set or of angular disparity affects the intercept of the regression function that relates the times for memory scanning with different set sizes or for mental rotation with different angular disparities and (2) the manipulation of context affects the slope of the regression function that relates the times for detecting a target character under word and nonword conditions. These ratify the regression approach as a useful method for doing mental chronometry.
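
    The regression logic in this abstract can be illustrated by regressing response times in one condition on response times in another: an effect that adds a constant shows up in the intercept, while an effect that scales every time proportionally shows up in the slope. The sketch below uses ordinary least squares and invented response times; the names and numbers are hypothetical.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Invented response times in milliseconds for a baseline condition.
baseline = [400.0, 500.0, 600.0]

# An additive effect (same delay everywhere) versus a proportional effect.
additive = [t + 50.0 for t in baseline]
proportional = [1.2 * t for t in baseline]

print(simple_linear_regression(baseline, additive))      # intercept shifts, slope stays 1
print(simple_linear_regression(baseline, proportional))  # slope changes, intercept stays 0
```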
