Sample records for multi linear regression

  1. On the null distribution of Bayes factors in linear regression

    USDA-ARS?s Scientific Manuscript database

    We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...

  2. Delineating chalk sand distribution of Ekofisk formation using probabilistic neural network (PNN) and stepwise regression (SWR): Case study Danish North Sea field

    NASA Astrophysics Data System (ADS)

    Haris, A.; Nafian, M.; Riyanto, A.

    2017-07-01

    Danish North Sea Fields consist of several formations (Ekofisk, Tor, and Cromer Knoll) that was started from the age of Paleocene to Miocene. In this study, the integration of seismic and well log data set is carried out to determine the chalk sand distribution in the Danish North Sea field. The integration of seismic and well log data set is performed by using the seismic inversion analysis and seismic multi-attribute. The seismic inversion algorithm, which is used to derive acoustic impedance (AI), is model-based technique. The derived AI is then used as external attributes for the input of multi-attribute analysis. Moreover, the multi-attribute analysis is used to generate the linear and non-linear transformation of among well log properties. In the case of the linear model, selected transformation is conducted by weighting step-wise linear regression (SWR), while for the non-linear model is performed by using probabilistic neural networks (PNN). The estimated porosity, which is resulted by PNN shows better suited to the well log data compared with the results of SWR. This result can be understood since PNN perform non-linear regression so that the relationship between the attribute data and predicted log data can be optimized. The distribution of chalk sand has been successfully identified and characterized by porosity value ranging from 23% up to 30%.

  3. A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

    NASA Astrophysics Data System (ADS)

    Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

    2018-04-01

    In this paper, we propose a hybrid model which is a combination of multiple linear regression model and fuzzy c-means method. This research involved a relationship between 20 variates of the top soil that are analyzed prior to planting of paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granary in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using multiple linear regression model and a combination of multiple linear regression model and fuzzy c-means method. Analysis of normality and multicollinearity indicate that the data is normally scattered without multicollinearity among independent variables. Analysis of fuzzy c-means cluster the yield of paddy into two clusters before the multiple linear regression model can be used. The comparison between two method indicate that the hybrid of multiple linear regression model and fuzzy c-means method outperform the multiple linear regression model with lower value of mean square error.

  4. Genomic prediction based on data from three layer lines using non-linear regression models.

    PubMed

    Huang, Heyun; Windig, Jack J; Vereijken, Addie; Calus, Mario P L

    2014-11-06

    Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.

  5. Development of Super-Ensemble techniques for ocean analyses: the Mediterranean Sea case

    NASA Astrophysics Data System (ADS)

    Pistoia, Jenny; Pinardi, Nadia; Oddo, Paolo; Collins, Matthew; Korres, Gerasimos; Drillet, Yann

    2017-04-01

    Short-term ocean analyses for Sea Surface Temperature SST in the Mediterranean Sea can be improved by a statistical post-processing technique, called super-ensemble. This technique consists in a multi-linear regression algorithm applied to a Multi-Physics Multi-Model Super-Ensemble (MMSE) dataset, a collection of different operational forecasting analyses together with ad-hoc simulations produced by modifying selected numerical model parameterizations. A new linear regression algorithm based on Empirical Orthogonal Function filtering techniques is capable to prevent overfitting problems, even if best performances are achieved when we add correlation to the super-ensemble structure using a simple spatial filter applied after the linear regression. Our outcomes show that super-ensemble performances depend on the selection of an unbiased operator and the length of the learning period, but the quality of the generating MMSE dataset has the largest impact on the MMSE analysis Root Mean Square Error (RMSE) evaluated with respect to observed satellite SST. Lower RMSE analysis estimates result from the following choices: 15 days training period, an overconfident MMSE dataset (a subset with the higher quality ensemble members), and the least square algorithm being filtered a posteriori.

  6. Prediction models for CO2 emission in Malaysia using best subsets regression and multi-linear regression

    NASA Astrophysics Data System (ADS)

    Tan, C. H.; Matjafri, M. Z.; Lim, H. S.

    2015-10-01

    This paper presents the prediction models which analyze and compute the CO2 emission in Malaysia. Each prediction model for CO2 emission will be analyzed based on three main groups which is transportation, electricity and heat production as well as residential buildings and commercial and public services. The prediction models were generated using data obtained from World Bank Open Data. Best subset method will be used to remove irrelevant data and followed by multi linear regression to produce the prediction models. From the results, high R-square (prediction) value was obtained and this implies that the models are reliable to predict the CO2 emission by using specific data. In addition, the CO2 emissions from these three groups are forecasted using trend analysis plots for observation purpose.

  7. Comparison of two-concentration with multi-concentration linear regressions: Retrospective data analysis of multiple regulated LC-MS bioanalytical projects.

    PubMed

    Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi

    2013-09-01

    Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). Furthermore, examples are given as how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and it significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.

  8. Multivariate meta-analysis for non-linear and other multi-parameter associations

    PubMed Central

    Gasparrini, A; Armstrong, B; Kenward, M G

    2012-01-01

    In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043

  9. Comparison of buried sand ridges and regressive sand ridges on the outer shelf of the East China Sea

    NASA Astrophysics Data System (ADS)

    Wu, Ziyin; Jin, Xianglong; Zhou, Jieqiong; Zhao, Dineng; Shang, Jihong; Li, Shoujun; Cao, Zhenyi; Liang, Yuyang

    2017-06-01

    Based on multi-beam echo soundings and high-resolution single-channel seismic profiles, linear sand ridges in U14 and U2 on the East China Sea (ECS) shelf are identified and compared in detail. Linear sand ridges in U14 are buried sand ridges, which are 90 m below the seafloor. It is presumed that these buried sand ridges belong to the transgressive systems tract (TST) formed 320-200 ka ago and that their top interface is the maximal flooding surface (MFS). Linear sand ridges in U2 are regressive sand ridges. It is presumed that these buried sand ridges belong to the TST of the last glacial maximum (LGM) and that their top interface is the MFS of the LGM. Four sub-stage sand ridges of U2 are discerned from the high-resolution single-channel seismic profile and four strikes of regressive sand ridges are distinguished from the submarine topographic map based on the multi-beam echo soundings. These multi-stage and multi-strike linear sand ridges are the response of, and evidence for, the evolution of submarine topography with respect to sea-level fluctuations since the LGM. Although the difference in the age of formation between U14 and U2 is 200 ka and their sequences are 90 m apart, the general strikes of the sand ridges are similar. This indicates that the basic configuration of tidal waves on the ECS shelf has been stable for the last 200 ka. A basic evolutionary model of the strata of the ECS shelf is proposed, in which sea-level change is the controlling factor. During the sea-level change of about 100 ka, five to six strata are developed and the sand ridges develop in the TST. A similar story of the evolution of paleo-topography on the ECS shelf has been repeated during the last 300 ka.

  10. Finite-sample and asymptotic sign-based tests for parameters of non-linear quantile regression with Markov noise

    NASA Astrophysics Data System (ADS)

    Sirenko, M. A.; Tarasenko, P. F.; Pushkarev, M. I.

    2017-01-01

    One of the most noticeable features of sign-based statistical procedures is an opportunity to build an exact test for simple hypothesis testing of parameters in a regression model. In this article, we expanded a sing-based approach to the nonlinear case with dependent noise. The examined model is a multi-quantile regression, which makes it possible to test hypothesis not only of regression parameters, but of noise parameters as well.

  11. Using Simulation Technique to overcome the multi-collinearity problem for estimating fuzzy linear regression parameters.

    NASA Astrophysics Data System (ADS)

    Mansoor Gorgees, Hazim; Hilal, Mariam Mohammed

    2018-05-01

    Fatigue cracking is one of the common types of pavement distresses and is an indicator of structural failure; cracks allow moisture infiltration, roughness, may further deteriorate to a pothole. Some causes of pavement deterioration are: traffic loading; environment influences; drainage deficiencies; materials quality problems; construction deficiencies and external contributors. Many researchers have made models that contain many variables like asphalt content, asphalt viscosity, fatigue life, stiffness of asphalt mixture, temperature and other parameters that affect the fatigue life. For this situation, a fuzzy linear regression model was employed and analyzed by using the traditional methods and our proposed method in order to overcome the multi-collinearity problem. The total spread error was used as a criterion to compare the performance of the studied methods. Simulation program was used to obtain the required results.

  12. A comparison of radiometric correction techniques in the evaluation of the relationship between LST and NDVI in Landsat imagery.

    PubMed

    Tan, Kok Chooi; Lim, Hwee San; Matjafri, Mohd Zubir; Abdullah, Khiruddin

    2012-06-01

    Atmospheric corrections for multi-temporal optical satellite images are necessary, especially in change detection analyses, such as normalized difference vegetation index (NDVI) rationing. Abrupt change detection analysis using remote-sensing techniques requires radiometric congruity and atmospheric correction to monitor terrestrial surfaces over time. Two atmospheric correction methods were used for this study: relative radiometric normalization and the simplified method for atmospheric correction (SMAC) in the solar spectrum. A multi-temporal data set consisting of two sets of Landsat images from the period between 1991 and 2002 of Penang Island, Malaysia, was used to compare NDVI maps, which were generated using the proposed atmospheric correction methods. Land surface temperature (LST) was retrieved using ATCOR3_T in PCI Geomatica 10.1 image processing software. Linear regression analysis was utilized to analyze the relationship between NDVI and LST. This study reveals that both of the proposed atmospheric correction methods yielded high accuracy through examination of the linear correlation coefficients. To check for the accuracy of the equation obtained through linear regression analysis for every single satellite image, 20 points were randomly chosen. The results showed that the SMAC method yielded a constant value (in terms of error) to predict the NDVI value from linear regression analysis-derived equation. The errors (average) from both proposed atmospheric correction methods were less than 10%.

  13. [Research on the method of interference correction for nondispersive infrared multi-component gas analysis].

    PubMed

    Sun, You-Wen; Liu, Wen-Qing; Wang, Shi-Mei; Huang, Shu-Hua; Yu, Xiao-Man

    2011-10-01

    A method of interference correction for nondispersive infrared multi-component gas analysis was described. According to the successive integral gas absorption models and methods, the influence of temperature and air pressure on the integral line strengths and linetype was considered, and based on Lorentz detuning linetypes, the absorption cross sections and response coefficients of H2O, CO2, CO, and NO on each filter channel were obtained. The four dimension linear regression equations for interference correction were established by response coefficients, the absorption cross interference was corrected by solving the multi-dimensional linear regression equations, and after interference correction, the pure absorbance signal on each filter channel was only controlled by the corresponding target gas concentration. When the sample cell was filled with gas mixture with a certain concentration proportion of CO, NO and CO2, the pure absorbance after interference correction was used for concentration inversion, the inversion concentration error for CO2 is 2.0%, the inversion concentration error for CO is 1.6%, and the inversion concentration error for NO is 1.7%. Both the theory and experiment prove that the interference correction method proposed for NDIR multi-component gas analysis is feasible.

  14. Comprehensive ripeness-index for prediction of ripening level in mangoes by multivariate modelling of ripening behaviour

    NASA Astrophysics Data System (ADS)

    Eyarkai Nambi, Vijayaram; Thangavel, Kuladaisamy; Manickavasagan, Annamalai; Shahir, Sultan

    2017-01-01

    Prediction of ripeness level in climacteric fruits is essential for post-harvest handling. An index capable of predicting ripening level with minimum inputs would be highly beneficial to the handlers, processors and researchers in fruit industry. A study was conducted with Indian mango cultivars to develop a ripeness index and associated model. Changes in physicochemical, colour and textural properties were measured throughout the ripening period and the period was classified into five stages (unripe, early ripe, partially ripe, ripe and over ripe). Multivariate regression techniques like partial least square regression, principal component regression and multi linear regression were compared and evaluated for its prediction. Multi linear regression model with 12 parameters was found more suitable in ripening prediction. Scientific variable reduction method was adopted to simplify the developed model. Better prediction was achieved with either 2 or 3 variables (total soluble solids, colour and acidity). Cross validation was done to increase the robustness and it was found that proposed ripening index was more effective in prediction of ripening stages. Three-variable model would be suitable for commercial applications where reasonable accuracies are sufficient. However, 12-variable model can be used to obtain more precise results in research and development applications.

  15. Multi-sensory landscape assessment: the contribution of acoustic perception to landscape evaluation.

    PubMed

    Gan, Yonghong; Luo, Tao; Breitung, Werner; Kang, Jian; Zhang, Tianhai

    2014-12-01

    In this paper, the contribution of visual and acoustic preference to multi-sensory landscape evaluation was quantitatively compared. The real landscapes were treated as dual-sensory ambiance and separated into visual landscape and soundscape. Both were evaluated by 63 respondents in laboratory conditions. The analysis of the relationship between respondent's visual and acoustic preference as well as their respective contribution to landscape preference showed that (1) some common attributes are universally identified in assessing visual, aural and audio-visual preference, such as naturalness or degree of human disturbance; (2) with acoustic and visual preferences as variables, a multi-variate linear regression model can satisfactorily predict landscape preference (R(2 )= 0.740), while the coefficients of determination for a unitary linear regression model were 0.345 and 0.720 for visual and acoustic preference as predicting factors, respectively; (3) acoustic preference played a much more important role in landscape evaluation than visual preference in this study (the former is about 4.5 times of the latter), which strongly suggests a rethinking of the role of soundscape in environment perception research and landscape planning practice.

  16. Investigating the detection of multi-homed devices independent of operating systems

    DTIC Science & Technology

    2017-09-01

    timestamp data was used to estimate clock skews using linear regression and linear optimization methods. Analysis revealed that detection depends on...the consistency of the estimated clock skew. Through vertical testing, it was also shown that clock skew consistency depends on the installed...optimization methods. Analysis revealed that detection depends on the consistency of the estimated clock skew. Through vertical testing, it was also

  17. Selection of higher order regression models in the analysis of multi-factorial transcription data.

    PubMed

    Prazeres da Costa, Olivia; Hoffman, Arthur; Rey, Johannes W; Mansmann, Ulrich; Buch, Thorsten; Tresch, Achim

    2014-01-01

    Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment investigates the influence of three variable factors: the genetic background of the mice from which the macrophages were obtained, Yersinia enterocolitica infection (two strains, and a mock control), and treatment/non-treatment with interferon-γ. We set up four different linear regression models in a hierarchical order. We introduce the eruption plot as a new practical tool for model selection complementary to global testing. It visually compares the size and significance of effect estimates between two nested models. Using this methodology we were able to select the most appropriate model by keeping only relevant factors showing additional explanatory power. Application to experimental data allowed us to qualify the interaction of factors as either neutral (no interaction), alleviating (co-occurring effects are weaker than expected from the single effects), or aggravating (stronger than expected). We find a biologically meaningful gene cluster of putative C2TA target genes that appear to be co-regulated with MHC class II genes. We introduced the eruption plot as a tool for visual model comparison to identify relevant higher order interactions in the analysis of expression data obtained under the influence of multiple factors. We conclude that model selection in higher order linear regression models should generally be performed for the analysis of multi-factorial microarray data.

  18. Multi-Axis Identifiability Using Single-Surface Parameter Estimation Maneuvers on the X-48B Blended Wing Body

    NASA Technical Reports Server (NTRS)

    Ratnayake, Nalin A.; Koshimoto, Ed T.; Taylor, Brian R.

    2011-01-01

    The problem of parameter estimation on hybrid-wing-body type aircraft is complicated by the fact that many design candidates for such aircraft involve a large number of aero- dynamic control effectors that act in coplanar motion. This fact adds to the complexity already present in the parameter estimation problem for any aircraft with a closed-loop control system. Decorrelation of system inputs must be performed in order to ascertain individual surface derivatives with any sort of mathematical confidence. Non-standard control surface configurations, such as clamshell surfaces and drag-rudder modes, further complicate the modeling task. In this paper, asymmetric, single-surface maneuvers are used to excite multiple axes of aircraft motion simultaneously. Time history reconstructions of the moment coefficients computed by the solved regression models are then compared to each other in order to assess relative model accuracy. The reduced flight-test time required for inner surface parameter estimation using multi-axis methods was found to come at the cost of slightly reduced accuracy and statistical confidence for linear regression methods. Since the multi-axis maneuvers captured parameter estimates similar to both longitudinal and lateral-directional maneuvers combined, the number of test points required for the inner, aileron-like surfaces could in theory have been reduced by 50%. While trends were similar, however, individual parameters as estimated by a multi-axis model were typically different by an average absolute difference of roughly 15-20%, with decreased statistical significance, than those estimated by a single-axis model. The multi-axis model exhibited an increase in overall fit error of roughly 1-5% for the linear regression estimates with respect to the single-axis model, when applied to flight data designed for each, respectively.

  19. Virtual Beach v2.2 User Guide

    EPA Science Inventory

    Virtual Beach version 2.2 (VB 2.2) is a decision support tool. It is designed to construct site-specific Multi-Linear Regression (MLR) models to predict pathogen indicator levels (or fecal indicator bacteria, FIB) at recreational beaches. MLR analysis has outperformed persisten...

  20. Combined genetic algorithm and multiple linear regression (GA-MLR) optimizer: Application to multi-exponential fluorescence decay surface.

    PubMed

    Fisz, Jacek J

    2006-12-07

    The optimization approach based on the genetic algorithm (GA) combined with multiple linear regression (MLR) method, is discussed. The GA-MLR optimizer is designed for the nonlinear least-squares problems in which the model functions are linear combinations of nonlinear functions. GA optimizes the nonlinear parameters, and the linear parameters are calculated from MLR. GA-MLR is an intuitive optimization approach and it exploits all advantages of the genetic algorithm technique. This optimization method results from an appropriate combination of two well-known optimization methods. The MLR method is embedded in the GA optimizer and linear and nonlinear model parameters are optimized in parallel. The MLR method is the only one strictly mathematical "tool" involved in GA-MLR. The GA-MLR approach simplifies and accelerates considerably the optimization process because the linear parameters are not the fitted ones. Its properties are exemplified by the analysis of the kinetic biexponential fluorescence decay surface corresponding to a two-excited-state interconversion process. A short discussion of the variable projection (VP) algorithm, designed for the same class of the optimization problems, is presented. VP is a very advanced mathematical formalism that involves the methods of nonlinear functionals, algebra of linear projectors, and the formalism of Fréchet derivatives and pseudo-inverses. Additional explanatory comments are added on the application of recently introduced the GA-NR optimizer to simultaneous recovery of linear and weakly nonlinear parameters occurring in the same optimization problem together with nonlinear parameters. The GA-NR optimizer combines the GA method with the NR method, in which the minimum-value condition for the quadratic approximation to chi(2), obtained from the Taylor series expansion of chi(2), is recovered by means of the Newton-Raphson algorithm. The application of the GA-NR optimizer to model functions which are multi-linear combinations of nonlinear functions, is indicated. The VP algorithm does not distinguish the weakly nonlinear parameters from the nonlinear ones and it does not apply to the model functions which are multi-linear combinations of nonlinear functions.

  1. Simulation of multi-stage nonlinear bone remodeling induced by fixed partial dentures of different configurations: a comparative clinical and numerical study.

    PubMed

    Liao, Zhipeng; Yoda, Nobuhiro; Chen, Junning; Zheng, Keke; Sasaki, Keiichi; Swain, Michael V; Li, Qing

    2017-04-01

    This paper aimed to develop a clinically validated bone remodeling algorithm by integrating bone's dynamic properties in a multi-stage fashion based on a four-year clinical follow-up of implant treatment. The configurational effects of fixed partial dentures (FPDs) were explored using a multi-stage remodeling rule. Three-dimensional real-time occlusal loads during maximum voluntary clenching were measured with a piezoelectric force transducer and were incorporated into a computerized tomography-based finite element mandibular model. Virtual X-ray images were generated based on simulation and statistically correlated with clinical data using linear regressions. The strain energy density-driven remodeling parameters were regulated over the time frame considered. A linear single-stage bone remodeling algorithm, with a single set of constant remodeling parameters, was found to poorly fit with clinical data through linear regression (low [Formula: see text] and R), whereas a time-dependent multi-stage algorithm better simulated the remodeling process (high [Formula: see text] and R) against the clinical results. The three-implant-supported and distally cantilevered FPDs presented noticeable and continuous bone apposition, mainly adjacent to the cervical and apical regions. The bridged and mesially cantilevered FPDs showed bone resorption or no visible bone formation in some areas. Time-dependent variation of bone remodeling parameters is recommended to better correlate remodeling simulation with clinical follow-up. The position of FPD pontics plays a critical role in mechanobiological functionality and bone remodeling. Caution should be exercised when selecting the cantilever FPD due to the risk of overloading bone resorption.

  2. Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications

    PubMed Central

    Huang, Jian; Zhang, Cun-Hui

    2013-01-01

    The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including the generalized linear models. We study the estimation, prediction, selection and sparsity properties of the weighted ℓ1-penalized estimator in sparse, high-dimensional settings where the number of predictors p can be much larger than the sample size n. Adaptive Lasso is considered as a special case. A multistage method is developed to approximate concave regularized estimation by applying an adaptive Lasso recursively. We provide prediction and estimation oracle inequalities for single- and multi-stage estimators, a general selection consistency theorem, and an upper bound for the dimension of the Lasso estimator. Important models including the linear regression, logistic regression and log-linear models are used throughout to illustrate the applications of the general results. PMID:24348100

  3. Horses Auto-Recruit Their Lungs by Inspiratory Breath Holding Following Recovery from General Anaesthesia

    PubMed Central

    Mosing, Martina; Waldmann, Andreas D.; MacFarlane, Paul; Iff, Samuel; Auer, Ulrike; Bohm, Stephan H.; Bettschart-Wolfensberger, Regula; Bardell, David

    2016-01-01

    This study evaluated the breathing pattern and distribution of ventilation in horses prior to and following recovery from general anaesthesia using electrical impedance tomography (EIT). Six horses were anaesthetised for 6 hours in dorsal recumbency. Arterial blood gas and EIT measurements were performed 24 hours before (baseline) and 1, 2, 3, 4, 5 and 6 hours after horses stood following anaesthesia. At each time point 4 representative spontaneous breaths were analysed. The percentage of the total breath length during which impedance remained greater than 50% of the maximum inspiratory impedance change (breath holding), the fraction of total tidal ventilation within each of four stacked regions of interest (ROI) (distribution of ventilation) and the filling time and inflation period of seven ROI evenly distributed over the dorso-ventral height of the lungs were calculated. Mixed effects multi-linear regression and linear regression were used and significance was set at p<0.05. All horses demonstrated inspiratory breath holding until 5 hours after standing. No change from baseline was seen for the distribution of ventilation during inspiration. Filling time and inflation period were more rapid and shorter in ventral and slower and longer in most dorsal ROI compared to baseline, respectively. In a mixed effects multi-linear regression, breath holding was significantly correlated with PaCO2 in both the univariate and multivariate regression. Following recovery from anaesthesia, horses showed inspiratory breath holding during which gas redistributed from ventral into dorsal regions of the lungs. This suggests auto-recruitment of lung tissue which would have been dependent and likely atelectic during anaesthesia. PMID:27331910

  4. Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data

    PubMed Central

    Swihart, Bruce J.; Caffo, Brian S.; Crainiceanu, Ciprian; Punjabi, Naresh M.

    2013-01-01

    Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of subjects and repeated measures within those subjects, as comparing diseased to non-diseased subjects while minimizing bias is of importance. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation allows synthesis of two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed. Supplementary material includes the analyzed data set as well as the code for a reproducible analysis. PMID:22241689

  5. Breeding value accuracy estimates for growth traits using random regression and multi-trait models in Nelore cattle.

    PubMed

    Boligon, A A; Baldi, F; Mercadante, M E Z; Lobo, R B; Pereira, R J; Albuquerque, L G

    2011-06-28

    We quantified the potential increase in accuracy of expected breeding value for weights of Nelore cattle, from birth to mature age, using multi-trait and random regression models on Legendre polynomials and B-spline functions. A total of 87,712 weight records from 8144 females were used, recorded every three months from birth to mature age from the Nelore Brazil Program. For random regression analyses, all female weight records from birth to eight years of age (data set I) were considered. From this general data set, a subset was created (data set II), which included only nine weight records: at birth, weaning, 365 and 550 days of age, and 2, 3, 4, 5, and 6 years of age. Data set II was analyzed using random regression and multi-trait models. The model of analysis included the contemporary group as fixed effects and age of dam as a linear and quadratic covariable. In the random regression analyses, average growth trends were modeled using a cubic regression on orthogonal polynomials of age. Residual variances were modeled by a step function with five classes. Legendre polynomials of fourth and sixth order were utilized to model the direct genetic and animal permanent environmental effects, respectively, while third-order Legendre polynomials were considered for maternal genetic and maternal permanent environmental effects. Quadratic polynomials were applied to model all random effects in random regression models on B-spline functions. Direct genetic and animal permanent environmental effects were modeled using three segments or five coefficients, and genetic maternal and maternal permanent environmental effects were modeled with one segment or three coefficients in the random regression models on B-spline functions. For both data sets (I and II), animals ranked differently according to expected breeding value obtained by random regression or multi-trait models. With random regression models, the highest gains in accuracy were obtained at ages with a low number of weight records. The results indicate that random regression models provide more accurate expected breeding values than the traditionally finite multi-trait models. Thus, higher genetic responses are expected for beef cattle growth traits by replacing a multi-trait model with random regression models for genetic evaluation. B-spline functions could be applied as an alternative to Legendre polynomials to model covariance functions for weights from birth to mature age.

  6. USING HYDROGRAPHIC DATA AND THE EPA VIRTUAL BEACH MODEL TO TEST PREDICTIONS OF BEACH BACTERIA CONCENTRATIONS

    EPA Science Inventory

    A modeling study of 2006 Huntington Beach (Lake Erie) beach bacteria concentrations indicates multi-variable linear regression (MLR) can effectively estimate bacteria concentrations compared to the persistence model. Our use of the Virtual Beach (VB) model affirms that fact. VB i...

  7. A novel multi-target regression framework for time-series prediction of drug efficacy.

    PubMed

    Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin

    2017-01-18

    Excavating from small samples is a challenging pharmacokinetic problem, where statistical methods can be applied. Pharmacokinetic data is special due to the small samples of high dimensionality, which makes it difficult to adopt conventional methods to predict the efficacy of traditional Chinese medicine (TCM) prescription. The main purpose of our study is to obtain some knowledge of the correlation in TCM prescription. Here, a novel method named Multi-target Regression Framework to deal with the problem of efficacy prediction is proposed. We employ the correlation between the values of different time sequences and add predictive targets of previous time as features to predict the value of current time. Several experiments are conducted to test the validity of our method and the results of leave-one-out cross-validation clearly manifest the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework demonstrates the best performance, and appears to be more suitable for this task.

  8. A novel multi-target regression framework for time-series prediction of drug efficacy

    PubMed Central

    Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin

    2017-01-01

    Excavating from small samples is a challenging pharmacokinetic problem, where statistical methods can be applied. Pharmacokinetic data is special due to the small samples of high dimensionality, which makes it difficult to adopt conventional methods to predict the efficacy of traditional Chinese medicine (TCM) prescription. The main purpose of our study is to obtain some knowledge of the correlation in TCM prescription. Here, a novel method named Multi-target Regression Framework to deal with the problem of efficacy prediction is proposed. We employ the correlation between the values of different time sequences and add predictive targets of previous time as features to predict the value of current time. Several experiments are conducted to test the validity of our method and the results of leave-one-out cross-validation clearly manifest the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework demonstrates the best performance, and appears to be more suitable for this task. PMID:28098186

  9. Multi-linear regression of sea level in the south west Pacific as a first step towards local sea level projections

    NASA Astrophysics Data System (ADS)

    Kumar, Vandhna; Meyssignac, Benoit; Melet, Angélique; Ganachaud, Alexandre

    2017-04-01

    Rising sea levels are a critical concern in small island nations. The problem is especially serious in the western south Pacific, where the total sea level rise over the last 60 years is up to 3 times the global average. In this study, we attempt to reconstruct sea levels at selected sites in the region (Suva, Lautoka, Noumea - Fiji and New Caledonia) as a mutiple-linear regression of atmospheric and oceanic variables. We focus on interannual-to-decadal scale variability, and lower (including the global mean sea level rise) over the 1979-2014 period. Sea levels are taken from tide gauge records and the ORAS4 reanalysis dataset, and are expressed as a sum of steric and mass changes as a preliminary step. The key development in our methodology is using leading wind stress curl as a proxy for the thermosteric component. This is based on the knowledge that wind stress curl anomalies can modulate the thermocline depth and resultant sea levels via Rossby wave propagation. The analysis is primarily based on correlation between local sea level and selected predictors, the dominant one being wind stress curl. In the first step, proxy boxes for wind stress curl are determined via regions of highest correlation. The proportion of sea level explained via linear regression is then removed, leaving a residual. This residual is then correlated with other locally acting potential predictors: halosteric sea level, the zonal and meridional wind stress components, and sea surface temperature. The statistically significant predictors are used in a multi-linear regression function to simulate the observed sea level. The method is able to reproduce between 40 to 80% of the variance in observed sea level. Based on the skill of the model, it has high potential in sea level projection and downscaling studies.

  10. Modeling of thermal degradation kinetics of the C-glucosyl xanthone mangiferin in an aqueous model solution as a function of pH and temperature and protective effect of honeybush extract matrix.

    PubMed

    Beelders, Theresa; de Beer, Dalene; Kidd, Martin; Joubert, Elizabeth

    2018-01-01

    Mangiferin, a C-glucosyl xanthone, abundant in mango and honeybush, is increasingly targeted for its bioactive properties and thus to enhance functional properties of food. The thermal degradation kinetics of mangiferin at pH3, 4, 5, 6 and 7 were each modeled at five temperatures ranging between 60 and 140°C. First-order reaction models were fitted to the data using non-linear regression to determine the reaction rate constant at each pH-temperature combination. The reaction rate constant increased with increasing temperature and pH. Comparison of the reaction rate constants at 100°C revealed an exponential relationship between the reaction rate constant and pH. The data for each pH were also modeled with the Arrhenius equation using non-linear and linear regression to determine the activation energy and pre-exponential factor. Activation energies decreased slightly with increasing pH. Finally, a multi-linear model taking into account both temperature and pH was developed for mangiferin degradation. Sterilization (121°C for 4min) of honeybush extracts dissolved at pH4, 5 and 7 did not cause noticeable degradation of mangiferin, although the multi-linear model predicted 34% degradation at pH7. The extract matrix is postulated to exert a protective effect as changes in potential precursor content could not fully explain the stability of mangiferin. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Multi-decadal trend and space-time variability of sea level over the Indian Ocean since the 1950s: impact of decadal climate modes

    NASA Astrophysics Data System (ADS)

    Han, W.; Stammer, D.; Meehl, G. A.; Hu, A.; Sienz, F.

    2016-12-01

    Sea level varies on decadal and multi-decadal timescales over the Indian Ocean. The variations are not spatially uniform, and can deviate considerably from the global mean sea level rise (SLR) due to various geophysical processes. One of these processes is the change of ocean circulation, which can be partly attributed to natural internal modes of climate variability. Over the Indian Ocean, the most influential climate modes on decadal and multi-decadal timescales are the Interdecadal Pacific Oscillation (IPO) and decadal variability of the Indian Ocean dipole (IOD). Here, we first analyze observational datasets to investigate the impacts of IPO and IOD on spatial patterns of decadal and interdecadal (hereafter decal) sea level variability & multi-decadal trend over the Indian Ocean since the 1950s, using a new statistical approach of Bayesian Dynamical Linear regression Model (DLM). The Bayesian DLM overcomes the limitation of "time-constant (static)" regression coefficients in conventional multiple linear regression model, by allowing the coefficients to vary with time and therefore measuring "time-evolving (dynamical)" relationship between climate modes and sea level. For the multi-decadal sea level trend since the 1950s, our results show that climate modes and non-climate modes (the part that cannot be explained by climate modes) have comparable contributions in magnitudes but with different spatial patterns, with each dominating different regions of the Indian Ocean. For decadal variability, climate modes are the major contributors for sea level variations over most region of the tropical Indian Ocean. The relative importance of IPO and decadal variability of IOD, however, varies spatially. For example, while IOD decadal variability dominates IPO in the eastern equatorial basin (85E-100E, 5S-5N), IPO dominates IOD in causing sea level variations in the tropical southwest Indian Ocean (45E-65E, 12S-2S). To help decipher the possible contribution of external forcing to the multi-decadal sea level trend and decadal variability, we also analyze the model outputs from NCAR's Community Earth System Model (CESM) Large Ensemble Experiments, and compare the results with our observational analyses.

  12. Multi-variant study of obesity risk genes in African Americans: The Jackson Heart Study.

    PubMed

    Liu, Shijian; Wilson, James G; Jiang, Fan; Griswold, Michael; Correa, Adolfo; Mei, Hao

    2016-11-30

    Genome-wide association study (GWAS) has been successful in identifying obesity risk genes by single-variant association analysis. For this study, we designed steps of analysis strategy and aimed to identify multi-variant effects on obesity risk among candidate genes. Our analyses were focused on 2137 African American participants with body mass index measured in the Jackson Heart Study and 657 common single nucleotide polymorphisms (SNPs) genotyped at 8 GWAS-identified obesity risk genes. Single-variant association test showed that no SNPs reached significance after multiple testing adjustment. The following gene-gene interaction analysis, which was focused on SNPs with unadjusted p-value<0.10, identified 6 significant multi-variant associations. Logistic regression showed that SNPs in these associations did not have significant linear interactions; examination of genetic risk score evidenced that 4 multi-variant associations had significant additive effects of risk SNPs; and haplotype association test presented that all multi-variant associations contained one or several combinations of particular alleles or haplotypes, associated with increased obesity risk. Our study evidenced that obesity risk genes generated multi-variant effects, which can be additive or non-linear interactions, and multi-variant study is an important supplement to existing GWAS for understanding genetic effects of obesity risk genes. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. Differences in Student Evaluations of Limited-Term Lecturers and Full-Time Faculty

    ERIC Educational Resources Information Center

    Cho, Jeong-Il; Otani, Koichiro; Kim, B. Joon

    2014-01-01

    This study compared student evaluations of teaching (SET) for limited-term lecturers (LTLs) and full-time faculty (FTF) using a Likert-scaled survey administered to students (N = 1,410) at the end of university courses. Data were analyzed using a general linear regression model to investigate the influence of multi-dimensional evaluation items on…

  14. Time-resolved flow reconstruction with indirect measurements using regression models and Kalman-filtered POD ROM

    NASA Astrophysics Data System (ADS)

    Leroux, Romain; Chatellier, Ludovic; David, Laurent

    2018-01-01

    This article is devoted to the estimation of time-resolved particle image velocimetry (TR-PIV) flow fields using a time-resolved point measurements of a voltage signal obtained by hot-film anemometry. A multiple linear regression model is first defined to map the TR-PIV flow fields onto the voltage signal. Due to the high temporal resolution of the signal acquired by the hot-film sensor, the estimates of the TR-PIV flow fields are obtained with a multiple linear regression method called orthonormalized partial least squares regression (OPLSR). Subsequently, this model is incorporated as the observation equation in an ensemble Kalman filter (EnKF) applied on a proper orthogonal decomposition reduced-order model to stabilize it while reducing the effects of the hot-film sensor noise. This method is assessed for the reconstruction of the flow around a NACA0012 airfoil at a Reynolds number of 1000 and an angle of attack of {20}°. Comparisons with multi-time delay-modified linear stochastic estimation show that both the OPLSR and EnKF combined with OPLSR are more accurate as they produce a much lower relative estimation error, and provide a faithful reconstruction of the time evolution of the velocity flow fields.

  15. MultiDK: A Multiple Descriptor Multiple Kernel Approach for Molecular Discovery and Its Application to Organic Flow Battery Electrolytes.

    PubMed

    Kim, Sungjin; Jinich, Adrián; Aspuru-Guzik, Alán

    2017-04-24

    We propose a multiple descriptor multiple kernel (MultiDK) method for efficient molecular discovery using machine learning. We show that the MultiDK method improves both the speed and accuracy of molecular property prediction. We apply the method to the discovery of electrolyte molecules for aqueous redox flow batteries. Using multiple-type-as opposed to single-type-descriptors, we obtain more relevant features for machine learning. Following the principle of "wisdom of the crowds", the combination of multiple-type descriptors significantly boosts prediction performance. Moreover, by employing multiple kernels-more than one kernel function for a set of the input descriptors-MultiDK exploits nonlinear relations between molecular structure and properties better than a linear regression approach. The multiple kernels consist of a Tanimoto similarity kernel and a linear kernel for a set of binary descriptors and a set of nonbinary descriptors, respectively. Using MultiDK, we achieve an average performance of r 2 = 0.92 with a test set of molecules for solubility prediction. We also extend MultiDK to predict pH-dependent solubility and apply it to a set of quinone molecules with different ionizable functional groups to assess their performance as flow battery electrolytes.

  16. Quantification of endocrine disruptors and pesticides in water by gas chromatography-tandem mass spectrometry. Method validation using weighted linear regression schemes.

    PubMed

    Mansilha, C; Melo, A; Rebelo, H; Ferreira, I M P L V O; Pinho, O; Domingues, V; Pinho, C; Gameiro, P

    2010-10-22

    A multi-residue methodology based on a solid phase extraction followed by gas chromatography-tandem mass spectrometry was developed for trace analysis of 32 compounds in water matrices, including estrogens and several pesticides from different chemical families, some of them with endocrine disrupting properties. Matrix standard calibration solutions were prepared by adding known amounts of the analytes to a residue-free sample to compensate matrix-induced chromatographic response enhancement observed for certain pesticides. Validation was done mainly according to the International Conference on Harmonisation recommendations, as well as some European and American validation guidelines with specifications for pesticides analysis and/or GC-MS methodology. As the assumption of homoscedasticity was not met for analytical data, weighted least squares linear regression procedure was applied as a simple and effective way to counteract the greater influence of the greater concentrations on the fitted regression line, improving accuracy at the lower end of the calibration curve. The method was considered validated for 31 compounds after consistent evaluation of the key analytical parameters: specificity, linearity, limit of detection and quantification, range, precision, accuracy, extraction efficiency, stability and robustness. Copyright © 2010 Elsevier B.V. All rights reserved.

  17. Can multi-slice or navigator-gated R2* MRI replace single-slice breath-hold acquisition for hepatic iron quantification?

    PubMed

    Loeffler, Ralf B; McCarville, M Beth; Wagstaff, Anne W; Smeltzer, Matthew P; Krafft, Axel J; Song, Ruitian; Hankins, Jane S; Hillenbrand, Claudia M

    2017-01-01

    Liver R2* values calculated from multi-gradient echo (mGRE) magnetic resonance images (MRI) are strongly correlated with hepatic iron concentration (HIC) as shown in several independently derived biopsy calibration studies. These calibrations were established for axial single-slice breath-hold imaging at the location of the portal vein. Scanning in multi-slice mode makes the exam more efficient, since whole-liver coverage can be achieved with two breath-holds and the optimal slice can be selected afterward. Navigator echoes remove the need for breath-holds and allow use in sedated patients. To evaluate if the existing biopsy calibrations can be applied to multi-slice and navigator-controlled mGRE imaging in children with hepatic iron overload, by testing if there is a bias-free correlation between single-slice R2* and multi-slice or multi-slice navigator controlled R2*. This study included MRI data from 71 patients with transfusional iron overload, who received an MRI exam to estimate HIC using gradient echo sequences. Patient scans contained 2 or 3 of the following imaging methods used for analysis: single-slice images (n = 71), multi-slice images (n = 69) and navigator-controlled images (n = 17). Small and large blood corrected region of interests were selected on axial images of the liver to obtain R2* values for all data sets. Bland-Altman and linear regression analysis were used to compare R2* values from single-slice images to those of multi-slice images and navigator-controlled images. Bland-Altman analysis showed that all imaging method comparisons were strongly associated with each other and had high correlation coefficients (0.98 ≤ r ≤ 1.00) with P-values ≤0.0001. Linear regression yielded slopes that were close to 1. We found that navigator-gated or breath-held multi-slice R2* MRI for HIC determination measures R2* values comparable to the biopsy-validated single-slice, single breath-hold scan. We conclude that these three R2* methods can be interchangeably used in existing R2*-HIC calibrations.

  18. Modeling of bromate formation by ozonation of surface waters in drinking water treatment.

    PubMed

    Legube, Bernard; Parinet, Bernard; Gelinet, Karine; Berne, Florence; Croue, Jean-Philippe

    2004-04-01

    The main objective of this paper is to try to develop statistically and chemically rational models for bromate formation by ozonation of clarified surface waters. The results presented here show that bromate formation by ozonation of natural waters in drinking water treatment is directly proportional to the "Ct" value ("Ctau" in this study). Moreover, this proportionality strongly depends on many parameters: increasing of pH, temperature and bromide level leading to an increase of bromate formation; ammonia and dissolved organic carbon concentrations causing a reverse effect. Taking into account limitation of theoretical modeling, we proposed to predict bromate formation by stochastic simulations (multi-linear regression and artificial neural networks methods) from 40 experiments (BrO(3)(-) vs. "Ctau") carried out with three sand filtered waters sampled on three different waterworks. With seven selected variables we used a simple architecture of neural networks, optimized by "neural connection" of SPSS Inc./Recognition Inc. The bromate modeling by artificial neural networks gives better result than multi-linear regression. The artificial neural networks model allowed us classifying variables by decreasing order of influence (for the studied cases in our variables scale): "Ctau", [N-NH(4)(+)], [Br(-)], pH, temperature, DOC, alkalinity.

  19. On the concept of sloped motion for free-floating wave energy converters.

    PubMed

    Payne, Grégory S; Pascal, Rémy; Vaillant, Guillaume

    2015-10-08

    A free-floating wave energy converter (WEC) concept whose power take-off (PTO) system reacts against water inertia is investigated herein. The main focus is the impact of inclining the PTO direction on the system performance. The study is based on a numerical model whose formulation is first derived in detail. Hydrodynamics coefficients are obtained using the linear boundary element method package WAMIT. Verification of the model is provided prior to its use for a PTO parametric study and a multi-objective optimization based on a multi-linear regression method. It is found that inclining the direction of the PTO at around 50° to the vertical is highly beneficial for the WEC performance in that it provides a high capture width ratio over a broad region of the wave period range.

  20. On the concept of sloped motion for free-floating wave energy converters

    PubMed Central

    Payne, Grégory S.; Pascal, Rémy; Vaillant, Guillaume

    2015-01-01

    A free-floating wave energy converter (WEC) concept whose power take-off (PTO) system reacts against water inertia is investigated herein. The main focus is the impact of inclining the PTO direction on the system performance. The study is based on a numerical model whose formulation is first derived in detail. Hydrodynamics coefficients are obtained using the linear boundary element method package WAMIT. Verification of the model is provided prior to its use for a PTO parametric study and a multi-objective optimization based on a multi-linear regression method. It is found that inclining the direction of the PTO at around 50° to the vertical is highly beneficial for the WEC performance in that it provides a high capture width ratio over a broad region of the wave period range. PMID:26543397

  1. Differential gene expression detection and sample classification using penalized linear regression models.

    PubMed

    Wu, Baolin

    2006-02-15

    Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in the microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and prove to be useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using the (1) penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discussed the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the (1) penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.

  2. A systematic study on the influencing parameters and improvement of quantitative analysis of multi-component with single marker method using notoginseng as research subject.

    PubMed

    Wang, Chao-Qun; Jia, Xiu-Hong; Zhu, Shu; Komatsu, Katsuko; Wang, Xuan; Cai, Shao-Qing

    2015-03-01

    A new quantitative analysis of multi-component with single marker (QAMS) method for 11 saponins (ginsenosides Rg1, Rb1, Rg2, Rh1, Rf, Re and Rd; notoginsenosides R1, R4, Fa and K) in notoginseng was established, when 6 of these saponins were individually used as internal referring substances to investigate the influences of chemical structure, concentrations of quantitative components, and purities of the standard substances on the accuracy of the QAMS method. The results showed that the concentration of the analyte in sample solution was the major influencing parameter, whereas the other parameters had minimal influence on the accuracy of the QAMS method. A new method for calculating the relative correction factors by linear regression was established (linear regression method), which demonstrated to decrease standard method differences of the QAMS method from 1.20%±0.02% - 23.29%±3.23% to 0.10%±0.09% - 8.84%±2.85% in comparison with the previous method. And the differences between external standard method and the QAMS method using relative correction factors calculated by linear regression method were below 5% in the quantitative determination of Rg1, Re, R1, Rd and Fa in 24 notoginseng samples and Rb1 in 21 notoginseng samples. And the differences were mostly below 10% in the quantitative determination of Rf, Rg2, R4 and N-K (the differences of these 4 constituents bigger because their contents lower) in all the 24 notoginseng samples. The results indicated that the contents assayed by the new QAMS method could be considered as accurate as those assayed by external standard method. In addition, a method for determining applicable concentration ranges of the quantitative components assayed by QAMS method was established for the first time, which could ensure its high accuracy and could be applied to QAMS methods of other TCMs. The present study demonstrated the practicability of the application of the QAMS method for the quantitative analysis of multi-component and the quality control of TCMs and TCM prescriptions. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. Boosting multi-state models.

    PubMed

    Reulen, Holger; Kneib, Thomas

    2016-04-01

    One important goal in multi-state modelling is to explore information about conditional transition-type-specific hazard rate functions by estimating influencing effects of explanatory variables. This may be performed using single transition-type-specific models if these covariate effects are assumed to be different across transition-types. To investigate whether this assumption holds or whether one of the effects is equal across several transition-types (cross-transition-type effect), a combined model has to be applied, for instance with the use of a stratified partial likelihood formulation. Here, prior knowledge about the underlying covariate effect mechanisms is often sparse, especially about ineffectivenesses of transition-type-specific or cross-transition-type effects. As a consequence, data-driven variable selection is an important task: a large number of estimable effects has to be taken into account if joint modelling of all transition-types is performed. A related but subsequent task is model choice: is an effect satisfactory estimated assuming linearity, or is the true underlying nature strongly deviating from linearity? This article introduces component-wise Functional Gradient Descent Boosting (short boosting) for multi-state models, an approach performing unsupervised variable selection and model choice simultaneously within a single estimation run. We demonstrate that features and advantages in the application of boosting introduced and illustrated in classical regression scenarios remain present in the transfer to multi-state models. As a consequence, boosting provides an effective means to answer questions about ineffectiveness and non-linearity of single transition-type-specific or cross-transition-type effects.

  4. Monthly monsoon rainfall forecasting using artificial neural networks

    NASA Astrophysics Data System (ADS)

    Ganti, Ravikumar

    2014-10-01

    Indian agriculture sector heavily depends on monsoon rainfall for successful harvesting. In the past, prediction of rainfall was mainly performed using regression models, which provide reasonable accuracy in the modelling and forecasting of complex physical systems. Recently, Artificial Neural Networks (ANNs) have been proposed as efficient tools for modelling and forecasting. A feed-forward multi-layer perceptron type of ANN architecture trained using the popular back-propagation algorithm was employed in this study. Other techniques investigated for modeling monthly monsoon rainfall include linear and non-linear regression models for comparison purposes. The data employed in this study include monthly rainfall and monthly average of the daily maximum temperature in the North Central region in India. Specifically, four regression models and two ANN model's were developed. The performance of various models was evaluated using a wide variety of standard statistical parameters and scatter plots. The results obtained in this study for forecasting monsoon rainfalls using ANNs have been encouraging. India's economy and agricultural activities can be effectively managed with the help of the availability of the accurate monsoon rainfall forecasts.

  5. Hourly predictive Levenberg-Marquardt ANN and multi linear regression models for predicting of dew point temperature

    NASA Astrophysics Data System (ADS)

    Zounemat-Kermani, Mohammad

    2012-08-01

    In this study, the ability of two models of multi linear regression (MLR) and Levenberg-Marquardt (LM) feed-forward neural network was examined to estimate the hourly dew point temperature. Dew point temperature is the temperature at which water vapor in the air condenses into liquid. This temperature can be useful in estimating meteorological variables such as fog, rain, snow, dew, and evapotranspiration and in investigating agronomical issues as stomatal closure in plants. The availability of hourly records of climatic data (air temperature, relative humidity and pressure) which could be used to predict dew point temperature initiated the practice of modeling. Additionally, the wind vector (wind speed magnitude and direction) and conceptual input of weather condition were employed as other input variables. The three quantitative standard statistical performance evaluation measures, i.e. the root mean squared error, mean absolute error, and absolute logarithmic Nash-Sutcliffe efficiency coefficient ( {| {{{Log}}({{NS}})} |} ) were employed to evaluate the performances of the developed models. The results showed that applying wind vector and weather condition as input vectors along with meteorological variables could slightly increase the ANN and MLR predictive accuracy. The results also revealed that LM-NN was superior to MLR model and the best performance was obtained by considering all potential input variables in terms of different evaluation criteria.

  6. Photometric redshift estimation based on data mining with PhotoRApToR

    NASA Astrophysics Data System (ADS)

    Cavuoti, S.; Brescia, M.; De Stefano, V.; Longo, G.

    2015-03-01

    Photometric redshifts (photo-z) are crucial to the scientific exploitation of modern panchromatic digital surveys. In this paper we present PhotoRApToR (Photometric Research Application To Redshift): a Java/C ++ based desktop application capable to solve non-linear regression and multi-variate classification problems, in particular specialized for photo-z estimation. It embeds a machine learning algorithm, namely a multi-layer neural network trained by the Quasi Newton learning rule, and special tools dedicated to pre- and post-processing data. PhotoRApToR has been successfully tested on several scientific cases. The application is available for free download from the DAME Program web site.

  7. On approaches to analyze the sensitivity of simulated hydrologic fluxes to model parameters in the community land model

    DOE PAGES

    Bao, Jie; Hou, Zhangshuan; Huang, Maoyi; ...

    2015-12-04

    Here, effective sensitivity analysis approaches are needed to identify important parameters or factors and their uncertainties in complex Earth system models composed of multi-phase multi-component phenomena and multiple biogeophysical-biogeochemical processes. In this study, the impacts of 10 hydrologic parameters in the Community Land Model on simulations of runoff and latent heat flux are evaluated using data from a watershed. Different metrics, including residual statistics, the Nash-Sutcliffe coefficient, and log mean square error, are used as alternative measures of the deviations between the simulated and field observed values. Four sensitivity analysis (SA) approaches, including analysis of variance based on the generalizedmore » linear model, generalized cross validation based on the multivariate adaptive regression splines model, standardized regression coefficients based on a linear regression model, and analysis of variance based on support vector machine, are investigated. Results suggest that these approaches show consistent measurement of the impacts of major hydrologic parameters on response variables, but with differences in the relative contributions, particularly for the secondary parameters. The convergence behaviors of the SA with respect to the number of sampling points are also examined with different combinations of input parameter sets and output response variables and their alternative metrics. This study helps identify the optimal SA approach, provides guidance for the calibration of the Community Land Model parameters to improve the model simulations of land surface fluxes, and approximates the magnitudes to be adjusted in the parameter values during parametric model optimization.« less

  8. Monopole and dipole estimation for multi-frequency sky maps by linear regression

    NASA Astrophysics Data System (ADS)

    Wehus, I. K.; Fuskeland, U.; Eriksen, H. K.; Banday, A. J.; Dickinson, C.; Ghosh, T.; Górski, K. M.; Lawrence, C. R.; Leahy, J. P.; Maino, D.; Reich, P.; Reich, W.

    2017-01-01

    We describe a simple but efficient method for deriving a consistent set of monopole and dipole corrections for multi-frequency sky map data sets, allowing robust parametric component separation with the same data set. The computational core of this method is linear regression between pairs of frequency maps, often called T-T plots. Individual contributions from monopole and dipole terms are determined by performing the regression locally in patches on the sky, while the degeneracy between different frequencies is lifted whenever the dominant foreground component exhibits a significant spatial spectral index variation. Based on this method, we present two different, but each internally consistent, sets of monopole and dipole coefficients for the nine-year WMAP, Planck 2013, SFD 100 μm, Haslam 408 MHz and Reich & Reich 1420 MHz maps. The two sets have been derived with different analysis assumptions and data selection, and provide an estimate of residual systematic uncertainties. In general, our values are in good agreement with previously published results. Among the most notable results are a relative dipole between the WMAP and Planck experiments of 10-15μK (depending on frequency), an estimate of the 408 MHz map monopole of 8.9 ± 1.3 K, and a non-zero dipole in the 1420 MHz map of 0.15 ± 0.03 K pointing towards Galactic coordinates (l,b) = (308°,-36°) ± 14°. These values represent the sum of any instrumental and data processing offsets, as well as any Galactic or extra-Galactic component that is spectrally uniform over the full sky.

  9. Factors associated with single-vehicle and multi-vehicle road traffic collision injuries in Ireland.

    PubMed

    Donnelly-Swift, Erica; Kelly, Alan

    2016-12-01

    Generalised linear regression models were used to identify factors associated with fatal/serious road traffic collision injuries for single- and multi-vehicle collisions. Single-vehicle collisions and multi-vehicle collisions occurring during the hours of darkness or on a wet road surface had reduced likelihood of a fatal/serious injury. Single-vehicle 'driver with passengers' collisions occurring at junctions or on a hill/gradient were less likely to result in a fatal/serious injury. Multi-vehicle rear-end/angle collisions had reduced likelihood of a fatal/serious injury. Single-vehicle 'driver only' collisions and multi-vehicle collisions occurring on a public/bank holiday or on a hill/gradient were more likely to result in a fatal/serious injury. Single-vehicle collisions involving male drivers had increased likelihood of a fatal/serious injury and single-vehicle 'driver with passengers' collisions involving drivers under the age of 25 years also had increased likelihood of a fatal/serious injury. Findings can enlighten decision-makers to circumstances leading to fatal/serious injuries.

  10. Physical Interpretation of the Correlation Between Multi-Angle Spectral Data and Canopy Height

    NASA Technical Reports Server (NTRS)

    Schull, M. A.; Ganguly, S.; Samanta, A.; Huang, D.; Shabanov, N. V.; Jenkins, J. P.; Chiu, J. C.; Marshak, A.; Blair, J. B.; Myneni, R. B.; hide

    2007-01-01

    Recent empirical studies have shown that multi-angle spectral data can be useful for predicting canopy height, but the physical reason for this correlation was not understood. We follow the concept of canopy spectral invariants, specifically escape probability, to gain insight into the observed correlation. Airborne Multi-Angle Imaging Spectrometer (AirMISR) and airborne Laser Vegetation Imaging Sensor (LVIS) data acquired during a NASA Terrestrial Ecology Program aircraft campaign underlie our analysis. Two multivariate linear regression models were developed to estimate LVIS height measures from 28 AirMISR multi-angle spectral reflectances and from the spectrally invariant escape probability at 7 AirMISR view angles. Both models achieved nearly the same accuracy, suggesting that canopy spectral invariant theory can explain the observed correlation. We hypothesize that the escape probability is sensitive to the aspect ratio (crown diameter to crown height). The multi-angle spectral data alone therefore may not provide enough information to retrieve canopy height globally

  11. Performance characteristics of LOX-H2, tangential-entry, swirl-coaxial, rocket injectors

    NASA Technical Reports Server (NTRS)

    Howell, Doug; Petersen, Eric; Clark, Jim

    1993-01-01

    Development of a high performing swirl-coaxial injector requires an understanding of fundamental performance characteristics. This paper addresses the findings of studies on cold flow atomic characterizations which provided information on the influence of fluid properties and element operating conditions on the produced droplet sprays. These findings are applied to actual rocket conditions. The performance characteristics of swirl-coaxial injection elements under multi-element hot-fire conditions were obtained by analysis of combustion performance data from three separate test series. The injection elements are described and test results are analyzed using multi-variable linear regression. A direct comparison of test results indicated that reduced fuel injection velocity improved injection element performance through improved propellant mixing.

  12. Multi-objective Optimization of Solar Irradiance and Variance at Pertinent Inclination Angles

    NASA Astrophysics Data System (ADS)

    Jain, Dhanesh; Lalwani, Mahendra

    2018-05-01

    The performance of photovoltaic panel gets highly affected bychange in atmospheric conditions and angle of inclination. This article evaluates the optimum tilt angle and orientation angle (surface azimuth angle) for solar photovoltaic array in order to get maximum solar irradiance and to reduce variance of radiation at different sets or subsets of time periods. Non-linear regression and adaptive neural fuzzy interference system (ANFIS) methods are used for predicting the solar radiation. The results of ANFIS are more accurate in comparison to non-linear regression. These results are further used for evaluating the correlation and applied for estimating the optimum combination of tilt angle and orientation angle with the help of general algebraic modelling system and multi-objective genetic algorithm. The hourly average solar irradiation is calculated at different combinations of tilt angle and orientation angle with the help of horizontal surface radiation data of Jodhpur (Rajasthan, India). The hourly average solar irradiance is calculated for three cases: zero variance, with actual variance and with double variance at different time scenarios. It is concluded that monthly collected solar radiation produces better result as compared to bimonthly, seasonally, half-yearly and yearly collected solar radiation. The profit obtained for monthly varying angle has 4.6% more with zero variance and 3.8% more with actual variance, than the annually fixed angle.

  13. Searching for the main anti-bacterial components in artificial Calculus bovis using UPLC and microcalorimetry coupled with multi-linear regression analysis.

    PubMed

    Zang, Qing-Ce; Wang, Jia-Bo; Kong, Wei-Jun; Jin, Cheng; Ma, Zhi-Jie; Chen, Jing; Gong, Qian-Feng; Xiao, Xiao-He

    2011-12-01

    The fingerprints of artificial Calculus bovis extracts from different solvents were established by ultra-performance liquid chromatography (UPLC) and the anti-bacterial activities of artificial C. bovis extracts on Staphylococcus aureus (S. aureus) growth were studied by microcalorimetry. The UPLC fingerprints were evaluated using hierarchical clustering analysis. Some quantitative parameters obtained from the thermogenic curves of S. aureus growth affected by artificial C. bovis extracts were analyzed using principal component analysis. The spectrum-effect relationships between UPLC fingerprints and anti-bacterial activities were investigated using multi-linear regression analysis. The results showed that peak 1 (taurocholate sodium), peak 3 (unknown compound), peak 4 (cholic acid), and peak 6 (chenodeoxycholic acid) are more significant than the other peaks with the standard parameter estimate 0.453, -0.166, 0.749, 0.025, respectively. So, compounds cholic acid, taurocholate sodium, and chenodeoxycholic acid might be the major anti-bacterial components in artificial C. bovis. Altogether, this work provides a general model of the combination of UPLC chromatography and anti-bacterial effect to study the spectrum-effect relationships of artificial C. bovis extracts, which can be used to discover the main anti-bacterial components in artificial C. bovis or other Chinese herbal medicines with anti-bacterial effects. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Using Linear Equating to Map PROMIS(®) Global Health Items and the PROMIS-29 V2.0 Profile Measure to the Health Utilities Index Mark 3.

    PubMed

    Hays, Ron D; Revicki, Dennis A; Feeny, David; Fayers, Peter; Spritzer, Karen L; Cella, David

    2016-10-01

    Preference-based health-related quality of life (HR-QOL) scores are useful as outcome measures in clinical studies, for monitoring the health of populations, and for estimating quality-adjusted life-years. This was a secondary analysis of data collected in an internet survey as part of the Patient-Reported Outcomes Measurement Information System (PROMIS(®)) project. To estimate Health Utilities Index Mark 3 (HUI-3) preference scores, we used the ten PROMIS(®) global health items, the PROMIS-29 V2.0 single pain intensity item and seven multi-item scales (physical functioning, fatigue, pain interference, depressive symptoms, anxiety, ability to participate in social roles and activities, sleep disturbance), and the PROMIS-29 V2.0 items. Linear regression analyses were used to identify significant predictors, followed by simple linear equating to avoid regression to the mean. The regression models explained 48 % (global health items), 61 % (PROMIS-29 V2.0 scales), and 64 % (PROMIS-29 V2.0 items) of the variance in the HUI-3 preference score. Linear equated scores were similar to observed scores, although differences tended to be larger for older study participants. HUI-3 preference scores can be estimated from the PROMIS(®) global health items or PROMIS-29 V2.0. The estimated HUI-3 scores from the PROMIS(®) health measures can be used for economic applications and as a measure of overall HR-QOL in research.

  15. Dynamic whole body PET parametric imaging: II. Task-oriented statistical estimation

    PubMed Central

    Karakatsanis, Nicolas A.; Lodge, Martin A.; Zhou, Y.; Wahl, Richard L.; Rahmim, Arman

    2013-01-01

    In the context of oncology, dynamic PET imaging coupled with standard graphical linear analysis has been previously employed to enable quantitative estimation of tracer kinetic parameters of physiological interest at the voxel level, thus, enabling quantitative PET parametric imaging. However, dynamic PET acquisition protocols have been confined to the limited axial field-of-view (~15–20cm) of a single bed position and have not been translated to the whole-body clinical imaging domain. On the contrary, standardized uptake value (SUV) PET imaging, considered as the routine approach in clinical oncology, commonly involves multi-bed acquisitions, but is performed statically, thus not allowing for dynamic tracking of the tracer distribution. Here, we pursue a transition to dynamic whole body PET parametric imaging, by presenting, within a unified framework, clinically feasible multi-bed dynamic PET acquisition protocols and parametric imaging methods. In a companion study, we presented a novel clinically feasible dynamic (4D) multi-bed PET acquisition protocol as well as the concept of whole body PET parametric imaging employing Patlak ordinary least squares (OLS) regression to estimate the quantitative parameters of tracer uptake rate Ki and total blood distribution volume V. In the present study, we propose an advanced hybrid linear regression framework, driven by Patlak kinetic voxel correlations, to achieve superior trade-off between contrast-to-noise ratio (CNR) and mean squared error (MSE) than provided by OLS for the final Ki parametric images, enabling task-based performance optimization. Overall, whether the observer's task is to detect a tumor or quantitatively assess treatment response, the proposed statistical estimation framework can be adapted to satisfy the specific task performance criteria, by adjusting the Patlak correlation-coefficient (WR) reference value. The multi-bed dynamic acquisition protocol, as optimized in the preceding companion study, was employed along with extensive Monte Carlo simulations and an initial clinical FDG patient dataset to validate and demonstrate the potential of the proposed statistical estimation methods. Both simulated and clinical results suggest that hybrid regression in the context of whole-body Patlak Ki imaging considerably reduces MSE without compromising high CNR. Alternatively, for a given CNR, hybrid regression enables larger reductions than OLS in the number of dynamic frames per bed, allowing for even shorter acquisitions of ~30min, thus further contributing to the clinical adoption of the proposed framework. Compared to the SUV approach, whole body parametric imaging can provide better tumor quantification, and can act as a complement to SUV, for the task of tumor detection. PMID:24080994

  16. Dynamic whole-body PET parametric imaging: II. Task-oriented statistical estimation.

    PubMed

    Karakatsanis, Nicolas A; Lodge, Martin A; Zhou, Y; Wahl, Richard L; Rahmim, Arman

    2013-10-21

    In the context of oncology, dynamic PET imaging coupled with standard graphical linear analysis has been previously employed to enable quantitative estimation of tracer kinetic parameters of physiological interest at the voxel level, thus, enabling quantitative PET parametric imaging. However, dynamic PET acquisition protocols have been confined to the limited axial field-of-view (~15-20 cm) of a single-bed position and have not been translated to the whole-body clinical imaging domain. On the contrary, standardized uptake value (SUV) PET imaging, considered as the routine approach in clinical oncology, commonly involves multi-bed acquisitions, but is performed statically, thus not allowing for dynamic tracking of the tracer distribution. Here, we pursue a transition to dynamic whole-body PET parametric imaging, by presenting, within a unified framework, clinically feasible multi-bed dynamic PET acquisition protocols and parametric imaging methods. In a companion study, we presented a novel clinically feasible dynamic (4D) multi-bed PET acquisition protocol as well as the concept of whole-body PET parametric imaging employing Patlak ordinary least squares (OLS) regression to estimate the quantitative parameters of tracer uptake rate Ki and total blood distribution volume V. In the present study, we propose an advanced hybrid linear regression framework, driven by Patlak kinetic voxel correlations, to achieve superior trade-off between contrast-to-noise ratio (CNR) and mean squared error (MSE) than provided by OLS for the final Ki parametric images, enabling task-based performance optimization. Overall, whether the observer's task is to detect a tumor or quantitatively assess treatment response, the proposed statistical estimation framework can be adapted to satisfy the specific task performance criteria, by adjusting the Patlak correlation-coefficient (WR) reference value. The multi-bed dynamic acquisition protocol, as optimized in the preceding companion study, was employed along with extensive Monte Carlo simulations and an initial clinical (18)F-deoxyglucose patient dataset to validate and demonstrate the potential of the proposed statistical estimation methods. Both simulated and clinical results suggest that hybrid regression in the context of whole-body Patlak Ki imaging considerably reduces MSE without compromising high CNR. Alternatively, for a given CNR, hybrid regression enables larger reductions than OLS in the number of dynamic frames per bed, allowing for even shorter acquisitions of ~30 min, thus further contributing to the clinical adoption of the proposed framework. Compared to the SUV approach, whole-body parametric imaging can provide better tumor quantification, and can act as a complement to SUV, for the task of tumor detection.

  17. Comparing Different Approaches for Mapping Urban Vegetation Cover from Landsat ETM+ Data: A Case Study on Brussels

    PubMed Central

    Van de Voorde, Tim; Vlaeminck, Jeroen; Canters, Frank

    2008-01-01

    Urban growth and its related environmental problems call for sustainable urban management policies to safeguard the quality of urban environments. Vegetation plays an important part in this as it provides ecological, social, health and economic benefits to a city's inhabitants. Remotely sensed data are of great value to monitor urban green and despite the clear advantages of contemporary high resolution images, the benefits of medium resolution data should not be discarded. The objective of this research was to estimate fractional vegetation cover from a Landsat ETM+ image with sub-pixel classification, and to compare accuracies obtained with multiple stepwise regression analysis, linear spectral unmixing and multi-layer perceptrons (MLP) at the level of meaningful urban spatial entities. Despite the small, but nevertheless statistically significant differences at pixel level between the alternative approaches, the spatial pattern of vegetation cover and estimation errors is clearly distinctive at neighbourhood level. At this spatially aggregated level, a simple regression model appears to attain sufficient accuracy. For mapping at a spatially more detailed level, the MLP seems to be the most appropriate choice. Brightness normalisation only appeared to affect the linear models, especially the linear spectral unmixing. PMID:27879914

  18. Lateral-Directional Parameter Estimation on the X-48B Aircraft Using an Abstracted, Multi-Objective Effector Model

    NASA Technical Reports Server (NTRS)

    Ratnayake, Nalin A.; Waggoner, Erin R.; Taylor, Brian R.

    2011-01-01

    The problem of parameter estimation on hybrid-wing-body aircraft is complicated by the fact that many design candidates for such aircraft involve a large number of aerodynamic control effectors that act in coplanar motion. This adds to the complexity already present in the parameter estimation problem for any aircraft with a closed-loop control system. Decorrelation of flight and simulation data must be performed in order to ascertain individual surface derivatives with any sort of mathematical confidence. Non-standard control surface configurations, such as clamshell surfaces and drag-rudder modes, further complicate the modeling task. In this paper, time-decorrelation techniques are applied to a model structure selected through stepwise regression for simulated and flight-generated lateral-directional parameter estimation data. A virtual effector model that uses mathematical abstractions to describe the multi-axis effects of clamshell surfaces is developed and applied. Comparisons are made between time history reconstructions and observed data in order to assess the accuracy of the regression model. The Cram r-Rao lower bounds of the estimated parameters are used to assess the uncertainty of the regression model relative to alternative models. Stepwise regression was found to be a useful technique for lateral-directional model design for hybrid-wing-body aircraft, as suggested by available flight data. Based on the results of this study, linear regression parameter estimation methods using abstracted effectors are expected to perform well for hybrid-wing-body aircraft properly equipped for the task.

  19. Can single empirical algorithms accurately predict inland shallow water quality status from high resolution, multi-sensor, multi-temporal satellite data?

    NASA Astrophysics Data System (ADS)

    Theologou, I.; Patelaki, M.; Karantzalos, K.

    2015-04-01

    Assessing and monitoring water quality status through timely, cost effective and accurate manner is of fundamental importance for numerous environmental management and policy making purposes. Therefore, there is a current need for validated methodologies which can effectively exploit, in an unsupervised way, the enormous amount of earth observation imaging datasets from various high-resolution satellite multispectral sensors. To this end, many research efforts are based on building concrete relationships and empirical algorithms from concurrent satellite and in-situ data collection campaigns. We have experimented with Landsat 7 and Landsat 8 multi-temporal satellite data, coupled with hyperspectral data from a field spectroradiometer and in-situ ground truth data with several physico-chemical and other key monitoring indicators. All available datasets, covering a 4 years period, in our case study Lake Karla in Greece, were processed and fused under a quantitative evaluation framework. The performed comprehensive analysis posed certain questions regarding the applicability of single empirical models across multi-temporal, multi-sensor datasets towards the accurate prediction of key water quality indicators for shallow inland systems. Single linear regression models didn't establish concrete relations across multi-temporal, multi-sensor observations. Moreover, the shallower parts of the inland system followed, in accordance with the literature, different regression patterns. Landsat 7 and 8 resulted in quite promising results indicating that from the recreation of the lake and onward consistent per-sensor, per-depth prediction models can be successfully established. The highest rates were for chl-a (r2=89.80%), dissolved oxygen (r2=88.53%), conductivity (r2=88.18%), ammonium (r2=87.2%) and pH (r2=86.35%), while the total phosphorus (r2=70.55%) and nitrates (r2=55.50%) resulted in lower correlation rates.

  20. Multi-frame linear regressive filter for the measurement of infrared pixel spatial response and MTF from sparse data.

    PubMed

    Huard, Edouard; Derelle, Sophie; Jaeck, Julien; Nghiem, Jean; Haïdar, Riad; Primot, Jérôme

    2018-03-05

    A challenging point in the prediction of the image quality of infrared imaging systems is the evaluation of the detector modulation transfer function (MTF). In this paper, we present a linear method to get a 2D continuous MTF from sparse spectral data. Within the method, an object with a predictable sparse spatial spectrum is imaged by the focal plane array. The sparse data is then treated to return the 2D continuous MTF with the hypothesis that all the pixels have an identical spatial response. The linearity of the treatment is a key point to estimate directly the error bars of the resulting detector MTF. The test bench will be presented along with measurement tests on a 25 μm pitch InGaAs detector.

  1. A practical data processing workflow for multi-OMICS projects.

    PubMed

    Kohl, Michael; Megger, Dominik A; Trippler, Martin; Meckel, Hagen; Ahrens, Maike; Bracht, Thilo; Weber, Frank; Hoffmann, Andreas-Claudius; Baba, Hideo A; Sitek, Barbara; Schlaak, Jörg F; Meyer, Helmut E; Stephan, Christian; Eisenacher, Martin

    2014-01-01

    Multi-OMICS approaches aim on the integration of quantitative data obtained for different biological molecules in order to understand their interrelation and the functioning of larger systems. This paper deals with several data integration and data processing issues that frequently occur within this context. To this end, the data processing workflow within the PROFILE project is presented, a multi-OMICS project that aims on identification of novel biomarkers and the development of new therapeutic targets for seven important liver diseases. Furthermore, a software called CrossPlatformCommander is sketched, which facilitates several steps of the proposed workflow in a semi-automatic manner. Application of the software is presented for the detection of novel biomarkers, their ranking and annotation with existing knowledge using the example of corresponding Transcriptomics and Proteomics data sets obtained from patients suffering from hepatocellular carcinoma. Additionally, a linear regression analysis of Transcriptomics vs. Proteomics data is presented and its performance assessed. It was shown, that for capturing profound relations between Transcriptomics and Proteomics data, a simple linear regression analysis is not sufficient and implementation and evaluation of alternative statistical approaches are needed. Additionally, the integration of multivariate variable selection and classification approaches is intended for further development of the software. Although this paper focuses only on the combination of data obtained from quantitative Proteomics and Transcriptomics experiments, several approaches and data integration steps are also applicable for other OMICS technologies. Keeping specific restrictions in mind the suggested workflow (or at least parts of it) may be used as a template for similar projects that make use of different high throughput techniques. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. Copyright © 2013 Elsevier B.V. All rights reserved.

  2. Multi-ethnic analysis reveals soluble L-selectin may be post-transcriptionally regulated by 3′UTR polymorphism: the Multi-Ethnic Study of Atherosclerosis (MESA)

    PubMed Central

    Berardi, Cecilia; Larson, Nicholas B.; Decker, Paul A.; Wassel, Christina L.; Kirsch, Phillip S.; Pankow, James S.; Sale, Michele M.; de Andrade, Mariza; Sicotte, Hugues; Tang, Weihong; Hanson, Naomi Q.; Tsai, Michael Y.; da Chen, Yii-Der I; Bielinski, Suzette J.

    2015-01-01

    L-selectin is constitutively expressed on leukocytes and mediates their interaction with endothelial cells during inflammation. Previous studies on the association of soluble L-selectin (sL-selectin) with cardiovascular disease (CVD) are inconsistent. Genetic variants associated with sL-selectin levels may be a better surrogate of levels over a lifetime. We explored the association of genetic variants and sL-selectin levels in a race/ethnicity stratified random sample of 2,403 participants in the Multi-Ethnic Study of Atherosclerosis (MESA). Through a genome-wide analysis with additive linear regression models, we found that rs12938 on the SELL gene accounted for a significant portion of the protein level variance across all four races/ethnicities. To evaluate potential additional associations, elastic net models were used for variants located in the SELL/SELP/SELE genetic region and an additional two SNPs, rs3917768 and rs4987361, were associated with sL-selectin levels in African Americans. These variants accounted for a portion of protein variance that ranged from 4% in Hispanic to 14% in African Americans. To investigate the relationship of these variants with CVD, 6,317 subjects were used. No significant association was found between any of the identified SNPs and carotid intima-media thickness or presence of carotid plaque using linear and logistic regression, respectively. Similarly no significant results were found for coronary artery calcium or coronary heart disease events. In conclusion, we found that variants within the SELL gene are associated with sL-selectin levels. Despite accounting for a significant portion of the protein level variance, none of the variants was associated with clinical or subclinical CVD. PMID:25576479

  3. Electromyographic analyses of muscle pre-activation induced by single joint exercise.

    PubMed

    Júnior, Valdinar A R; Bottaro, Martim; Pereira, Maria C C; Andrade, Marcelino M; P Júnior, Paulo R W; Carmo, Jake C

    2010-01-01

    To investigate whether performing a low-intensity, single-joint exercises for knee extensors was an efficient strategy for increasing the number of motor units recruited in the vastus lateralis muscle during a subsequent multi-joint exercises. Nine healthy male participants (23.33+/-3.46 yrs) underwent bouts of exercise in which knee extension and 45 degrees , and leg press exercises were performed in sequence. In the low-intensity bout (R30), 15 unilateral knee extensions were performed, followed by 15 repetitions of the leg presses at 30% and 60% of one maximum repetition load (1-MR), respectively. In the high-intensity bout (R60), the same sequence was performed, but the applied load was 60% of 1-MR for both exercises. A single set of 15 repetitions of the leg press at 60% of 1-MR was performed as a control exercise (CR). The surface electromyographic signals of the vastus lateralis muscle were recorded by means of a linear electrode array. The root mean square (RMS) values were determined for each repetition of the leg press, and linear regressions were calculated from these results. The slopes of the straight lines obtained were then normalized using the linear coefficients of the regression equations and compared using one-way ANOVAs for repeated measures. The slopes observed in the CR were significantly lower than those in the R30 and R60 (p<0.05). The results indicated that the recruitment of motor units was more effective when a single-joint exercise preceded the multi-joint exercise. Article registered in the Australian New Zealand Clinical Trials Registry (ANZCTR) under the number ACTRN12609000413224.

  4. An automated ranking platform for machine learning regression models for meat spoilage prediction using multi-spectral imaging and metabolic profiling.

    PubMed

    Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady

    2017-09-01

    Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imagining and biomimetic sensors started gaining popularity as rapid and efficient methods for assessing food quality, safety and authentication; as a sensible alternative to the expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithms before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, "MeatReg", a web-based application is presented, able to automate the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible of meat spoilage regardless of the packaging system applied. In particularly up to 7 regression methods were applied and these are ordinary least squares regression, stepwise linear regression, partial least square regression, principal component regression, support vector regression, random forest and k-nearest neighbours. MeatReg" was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC-MS and Multispectral imaging instrument. Population of total viable count, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta, were predicted. As a result, recommendations of which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case were obtained. The developed system is accessible via the link: www.sorfml.com. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Advanced statistics: linear regression, part I: simple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.

  6. Gas demand forecasting by a new artificial intelligent algorithm

    NASA Astrophysics Data System (ADS)

    Khatibi. B, Vahid; Khatibi, Elham

    2012-01-01

    Energy demand forecasting is a key issue for consumers and generators in all energy markets in the world. This paper presents a new forecasting algorithm for daily gas demand prediction. This algorithm combines a wavelet transform and forecasting models such as multi-layer perceptron (MLP), linear regression or GARCH. The proposed method is applied to real data from the UK gas markets to evaluate their performance. The results show that the forecasting accuracy is improved significantly by using the proposed method.

  7. Hepatic fat quantification: a prospective comparison of magnetic resonance spectroscopy and analysis methods for chemical-shift gradient echo magnetic resonance imaging with histologic assessment as the reference standard.

    PubMed

    Kang, Bo-Kyeong; Yu, Eun Sil; Lee, Seung Soo; Lee, Youngjoo; Kim, Namkug; Sirlin, Claude B; Cho, Eun Yoon; Yeom, Suk Keu; Byun, Jae Ho; Park, Seong Ho; Lee, Moon-Gyu

    2012-06-01

    The aims of this study were to assess the confounding effects of hepatic iron deposition, inflammation, and fibrosis on hepatic steatosis (HS) evaluation by magnetic resonance imaging (MRI) and magnetic resonance spectroscopy (MRS) and to assess the accuracies of MRI and MRS for HS evaluation, using histology as the reference standard. In this institutional review board-approved prospective study, 56 patients gave informed consents and underwent chemical-shift MRI and MRS of the liver on a 1.5-T magnetic resonance scanner. To estimate MRI fat fraction (FF), 4 analysis methods were used (dual-echo, triple-echo, multiecho, and multi-interference), and MRS FF was calculated with T2 correction. Degrees of HS, iron deposition, inflammation, and fibrosis were analyzed in liver resection (n = 37) and biopsy (n = 19) specimens. The confounding effects of histology on fat quantification were assessed by multiple linear regression analysis. Using the histologic degree of HS as the reference standard, the accuracies of each method in estimating HS and diagnosing an HS of 5% or greater were determined by linear regression and receiver operating characteristic analyses. Iron deposition significantly confounded estimations of FF by the dual-echo (P < 0.001) and triple-echo (P = 0.033) methods, whereas no histologic feature confounded the multiecho and multi-interference methods or MRS. The MRS (r = 0.95) showed the strongest correlation with histologic degree of HS, followed by the multiecho (r = 0.92), multi-interference (r = 0.91), triple-echo (r = 0.90), and dual-echo (r = 0.85) methods. For diagnosing HS, the areas under the curve tended to be higher for MRS (0.96) and the multiecho (0.95), multi-interference (0.95), and triple-echo (0.95) methods than for the dual-echo method (0.88) (P ≥ 0.13). The multiecho and multi-interference MRI methods and MRS can accurately quantify hepatic fat, with coexisting histologic abnormalities having no confounding effects.

  8. Stochastic search, optimization and regression with energy applications

    NASA Astrophysics Data System (ADS)

    Hannah, Lauren A.

    Designing clean energy systems will be an important task over the next few decades. One of the major roadblocks is a lack of mathematical tools to economically evaluate those energy systems. However, solutions to these mathematical problems are also of interest to the operations research and statistical communities in general. This thesis studies three problems that are of interest to the energy community itself or provide support for solution methods: R&D portfolio optimization, nonparametric regression and stochastic search with an observable state variable. First, we consider the one stage R&D portfolio optimization problem to avoid the sequential decision process associated with the multi-stage. The one stage problem is still difficult because of a non-convex, combinatorial decision space and a non-convex objective function. We propose a heuristic solution method that uses marginal project values---which depend on the selected portfolio---to create a linear objective function. In conjunction with the 0-1 decision space, this new problem can be solved as a knapsack linear program. This method scales well to large decision spaces. We also propose an alternate, provably convergent algorithm that does not exploit problem structure. These methods are compared on a solid oxide fuel cell R&D portfolio problem. Next, we propose Dirichlet Process mixtures of Generalized Linear Models (DPGLM), a new method of nonparametric regression that accommodates continuous and categorical inputs, and responses that can be modeled by a generalized linear model. We prove conditions for the asymptotic unbiasedness of the DP-GLM regression mean function estimate. We also give examples for when those conditions hold, including models for compactly supported continuous distributions and a model with continuous covariates and categorical response. We empirically analyze the properties of the DP-GLM and why it provides better results than existing Dirichlet process mixture regression models. We evaluate DP-GLM on several data sets, comparing it to modern methods of nonparametric regression like CART, Bayesian trees and Gaussian processes. Compared to existing techniques, the DP-GLM provides a single model (and corresponding inference algorithms) that performs well in many regression settings. Finally, we study convex stochastic search problems where a noisy objective function value is observed after a decision is made. There are many stochastic search problems whose behavior depends on an exogenous state variable which affects the shape of the objective function. Currently, there is no general purpose algorithm to solve this class of problems. We use nonparametric density estimation to take observations from the joint state-outcome distribution and use them to infer the optimal decision for a given query state. We propose two solution methods that depend on the problem characteristics: function-based and gradient-based optimization. We examine two weighting schemes, kernel-based weights and Dirichlet process-based weights, for use with the solution methods. The weights and solution methods are tested on a synthetic multi-product newsvendor problem and the hour-ahead wind commitment problem. Our results show that in some cases Dirichlet process weights offer substantial benefits over kernel based weights and more generally that nonparametric estimation methods provide good solutions to otherwise intractable problems.

  9. An extended Kalman-Bucy filter for atmospheric temperature profile retrieval with a passive microwave sounder

    NASA Technical Reports Server (NTRS)

    Ledsham, W. H.; Staelin, D. H.

    1978-01-01

    An extended Kalman-Bucy filter has been implemented for atmospheric temperature profile retrievals from observations made using the Scanned Microwave Spectrometer (SCAMS) instrument carried on the Nimbus 6 satellite. This filter has the advantage that it requires neither stationary statistics in the underlying processes nor linear production of the observed variables from the variables to be estimated. This extended Kalman-Bucy filter has yielded significant performance improvement relative to multiple regression retrieval methods. A multi-spot extended Kalman-Bucy filter has also been developed in which the temperature profiles at a number of scan angles in a scanning instrument are retrieved simultaneously. These multi-spot retrievals are shown to outperform the single-spot Kalman retrievals.

  10. Evaluation of force-velocity and power-velocity relationship of arm muscles.

    PubMed

    Sreckovic, Sreten; Cuk, Ivan; Djuric, Sasa; Nedeljkovic, Aleksandar; Mirkov, Dragan; Jaric, Slobodan

    2015-08-01

    A number of recent studies have revealed an approximately linear force-velocity (F-V) and, consequently, a parabolic power-velocity (P-V) relationship of multi-joint tasks. However, the measurement characteristics of their parameters have been neglected, particularly those regarding arm muscles, which could be a problem for using the linear F-V model in both research and routine testing. Therefore, the aims of the present study were to evaluate the strength, shape, reliability, and concurrent validity of the F-V relationship of arm muscles. Twelve healthy participants performed maximum bench press throws against loads ranging from 20 to 70 % of their maximum strength, and linear regression model was applied on the obtained range of F and V data. One-repetition maximum bench press and medicine ball throw tests were also conducted. The observed individual F-V relationships were exceptionally strong (r = 0.96-0.99; all P < 0.05) and fairly linear, although it remains unresolved whether a polynomial fit could provide even stronger relationships. The reliability of parameters obtained from the linear F-V regressions proved to be mainly high (ICC > 0.80), while their concurrent validity regarding directly measured F, P, and V ranged from high (for maximum F) to medium-to-low (for maximum P and V). The findings add to the evidence that the linear F-V and, consequently, parabolic P-V models could be used to study the mechanical properties of muscular systems, as well as to design a relatively simple, reliable, and ecologically valid routine test of the muscle ability of force, power, and velocity production.

  11. Accounting for autocorrelation in multi-drug resistant tuberculosis predictors using a set of parsimonious orthogonal eigenvectors aggregated in geographic space.

    PubMed

    Jacob, Benjamin J; Krapp, Fiorella; Ponce, Mario; Gottuzzo, Eduardo; Griffith, Daniel A; Novak, Robert J

    2010-05-01

    Spatial autocorrelation is problematic for classical hierarchical cluster detection tests commonly used in multi-drug resistant tuberculosis (MDR-TB) analyses as considerable random error can occur. Therefore, when MDRTB clusters are spatially autocorrelated the assumption that the clusters are independently random is invalid. In this research, a product moment correlation coefficient (i.e., the Moran's coefficient) was used to quantify local spatial variation in multiple clinical and environmental predictor variables sampled in San Juan de Lurigancho, Lima, Peru. Initially, QuickBird 0.61 m data, encompassing visible bands and the near infra-red bands, were selected to synthesize images of land cover attributes of the study site. Data of residential addresses of individual patients with smear-positive MDR-TB were geocoded, prevalence rates calculated and then digitally overlaid onto the satellite data within a 2 km buffer of 31 georeferenced health centers, using a 10 m2 grid-based algorithm. Geographical information system (GIS)-gridded measurements of each health center were generated based on preliminary base maps of the georeferenced data aggregated to block groups and census tracts within each buffered area. A three-dimensional model of the study site was constructed based on a digital elevation model (DEM) to determine terrain covariates associated with the sampled MDR-TB covariates. Pearson's correlation was used to evaluate the linear relationship between the DEM and the sampled MDR-TB data. A SAS/GIS(R) module was then used to calculate univariate statistics and to perform linear and non-linear regression analyses using the sampled predictor variables. The estimates generated from a global autocorrelation analyses were then spatially decomposed into empirical orthogonal bases using a negative binomial regression with a non-homogeneous mean. Results of the DEM analyses indicated a statistically non-significant, linear relationship between georeferenced health centers and the sampled covariate elevation. The data exhibited positive spatial autocorrelation and the decomposition of Moran's coefficient into uncorrelated, orthogonal map pattern components revealed global spatial heterogeneities necessary to capture latent autocorrelation in the MDR-TB model. It was thus shown that Poisson regression analyses and spatial eigenvector mapping can elucidate the mechanics of MDR-TB transmission by prioritizing clinical and environmental-sampled predictor variables for identifying high risk populations.

  12. Walkability and cardiometabolic risk factors: Cross-sectional and longitudinal associations from the Multi-Ethnic Study of Atherosclerosis.

    PubMed

    Braun, Lindsay M; Rodríguez, Daniel A; Evenson, Kelly R; Hirsch, Jana A; Moore, Kari A; Diez Roux, Ana V

    2016-05-01

    We used data from 3227 older adults in the Multi-Ethnic Study of Atherosclerosis (2004-2012) to explore cross-sectional and longitudinal associations between walkability and cardiometabolic risk factors. In cross-sectional analyses, linear regression was used to estimate associations of Street Smart Walk Score® with glucose, triglycerides, HDL and LDL cholesterol, systolic and diastolic blood pressure, and waist circumference, while logistic regression was used to estimate associations with odds of metabolic syndrome. Econometric fixed effects models were used to estimate longitudinal associations of changes in walkability with changes in each risk factor among participants who moved residential locations between 2004 and 2012 (n=583). Most cross-sectional and longitudinal associations were small and statistically non-significant. We found limited evidence that higher walkability was cross-sectionally associated with lower blood pressure but that increases in walkability were associated with increases in triglycerides and blood pressure over time. Further research over longer time periods is needed to understand the potential for built environment interventions to improve cardiometabolic health. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Application of multi-scale wavelet entropy and multi-resolution Volterra models for climatic downscaling

    NASA Astrophysics Data System (ADS)

    Sehgal, V.; Lakhanpal, A.; Maheswaran, R.; Khosa, R.; Sridhar, Venkataramana

    2018-01-01

    This study proposes a wavelet-based multi-resolution modeling approach for statistical downscaling of GCM variables to mean monthly precipitation for five locations at Krishna Basin, India. Climatic dataset from NCEP is used for training the proposed models (Jan.'69 to Dec.'94) and are applied to corresponding CanCM4 GCM variables to simulate precipitation for the validation (Jan.'95-Dec.'05) and forecast (Jan.'06-Dec.'35) periods. The observed precipitation data is obtained from the India Meteorological Department (IMD) gridded precipitation product at 0.25 degree spatial resolution. This paper proposes a novel Multi-Scale Wavelet Entropy (MWE) based approach for clustering climatic variables into suitable clusters using k-means methodology. Principal Component Analysis (PCA) is used to obtain the representative Principal Components (PC) explaining 90-95% variance for each cluster. A multi-resolution non-linear approach combining Discrete Wavelet Transform (DWT) and Second Order Volterra (SoV) is used to model the representative PCs to obtain the downscaled precipitation for each downscaling location (W-P-SoV model). The results establish that wavelet-based multi-resolution SoV models perform significantly better compared to the traditional Multiple Linear Regression (MLR) and Artificial Neural Networks (ANN) based frameworks. It is observed that the proposed MWE-based clustering and subsequent PCA, helps reduce the dimensionality of the input climatic variables, while capturing more variability compared to stand-alone k-means (no MWE). The proposed models perform better in estimating the number of precipitation events during the non-monsoon periods whereas the models with clustering without MWE over-estimate the rainfall during the dry season.

  14. Extraction of object skeletons in multispectral imagery by the orthogonal regression fitting

    NASA Astrophysics Data System (ADS)

    Palenichka, Roman M.; Zaremba, Marek B.

    2003-03-01

    Accurate and automatic extraction of skeletal shape of objects of interest from satellite images provides an efficient solution to such image analysis tasks as object detection, object identification, and shape description. The problem of skeletal shape extraction can be effectively solved in three basic steps: intensity clustering (i.e. segmentation) of objects, extraction of a structural graph of the object shape, and refinement of structural graph by the orthogonal regression fitting. The objects of interest are segmented from the background by a clustering transformation of primary features (spectral components) with respect to each pixel. The structural graph is composed of connected skeleton vertices and represents the topology of the skeleton. In the general case, it is a quite rough piecewise-linear representation of object skeletons. The positions of skeleton vertices on the image plane are adjusted by means of the orthogonal regression fitting. It consists of changing positions of existing vertices according to the minimum of the mean orthogonal distances and, eventually, adding new vertices in-between if a given accuracy if not yet satisfied. Vertices of initial piecewise-linear skeletons are extracted by using a multi-scale image relevance function. The relevance function is an image local operator that has local maximums at the centers of the objects of interest.

  15. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.

  16. Electrochemical Behavior and Determination of Chlorogenic Acid Based on Multi-Walled Carbon Nanotubes Modified Screen-Printed Electrode

    PubMed Central

    Ma, Xiaoyan; Yang, Hongqiao; Xiong, Huabin; Li, Xiaofen; Gao, Jinting; Gao, Yuntao

    2016-01-01

    In this paper, the multi-walled carbon nanotubes modified screen-printed electrode (MWCNTs/SPE) was prepared and the MWCNTs/SPE was employed for the electrochemical determination of the antioxidant substance chlorogenic acids (CGAs). A pair of well-defined redox peaks of CGA was observed at the MWCNTs/SPE in 0.10 mol/L acetic acid-sodium acetate buffer (pH 6.2) and the electrode process was adsorption-controlled. Cyclic voltammetry (CV) and differential pulse voltammetry (DPV) methods for the determination of CGA were proposed based on the MWCNTs/SPE. Under the optimal conditions, the proposed method exhibited linear ranges from 0.17 to 15.8 µg/mL, and the linear regression equation was Ipa (µA) = 4.1993 C (×10−5 mol/L) + 1.1039 (r = 0.9976) and the detection limit for CGA could reach 0.12 µg/mL. The recovery of matrine was 94.74%–106.65% (RSD = 2.92%) in coffee beans. The proposed method is quick, sensitive, reliable, and can be used for the determination of CGA. PMID:27801797

  17. Riemannian multi-manifold modeling and clustering in brain networks

    NASA Astrophysics Data System (ADS)

    Slavakis, Konstantinos; Salsabilian, Shiva; Wack, David S.; Muldoon, Sarah F.; Baidoo-Williams, Henry E.; Vettel, Jean M.; Cieslak, Matthew; Grafton, Scott T.

    2017-08-01

    This paper introduces Riemannian multi-manifold modeling in the context of brain-network analytics: Brainnetwork time-series yield features which are modeled as points lying in or close to a union of a finite number of submanifolds within a known Riemannian manifold. Distinguishing disparate time series amounts thus to clustering multiple Riemannian submanifolds. To this end, two feature-generation schemes for brain-network time series are put forth. The first one is motivated by Granger-causality arguments and uses an auto-regressive moving average model to map low-rank linear vector subspaces, spanned by column vectors of appropriately defined observability matrices, to points into the Grassmann manifold. The second one utilizes (non-linear) dependencies among network nodes by introducing kernel-based partial correlations to generate points in the manifold of positivedefinite matrices. Based on recently developed research on clustering Riemannian submanifolds, an algorithm is provided for distinguishing time series based on their Riemannian-geometry properties. Numerical tests on time series, synthetically generated from real brain-network structural connectivity matrices, reveal that the proposed scheme outperforms classical and state-of-the-art techniques in clustering brain-network states/structures.

  18. Expanding the occupational health methodology: A concatenated artificial neural network approach to model the burnout process in Chinese nurses.

    PubMed

    Ladstätter, Felix; Garrosa, Eva; Moreno-Jiménez, Bernardo; Ponsoda, Vicente; Reales Aviles, José Manuel; Dai, Junming

    2016-01-01

    Artificial neural networks are sophisticated modelling and prediction tools capable of extracting complex, non-linear relationships between predictor (input) and predicted (output) variables. This study explores this capacity by modelling non-linearities in the hardiness-modulated burnout process with a neural network. Specifically, two multi-layer feed-forward artificial neural networks are concatenated in an attempt to model the composite non-linear burnout process. Sensitivity analysis, a Monte Carlo-based global simulation technique, is then utilised to examine the first-order effects of the predictor variables on the burnout sub-dimensions and consequences. Results show that (1) this concatenated artificial neural network approach is feasible to model the burnout process, (2) sensitivity analysis is a prolific method to study the relative importance of predictor variables and (3) the relationships among variables involved in the development of burnout and its consequences are to different degrees non-linear. Many relationships among variables (e.g., stressors and strains) are not linear, yet researchers use linear methods such as Pearson correlation or linear regression to analyse these relationships. Artificial neural network analysis is an innovative method to analyse non-linear relationships and in combination with sensitivity analysis superior to linear methods.

  19. Multi-model ensemble combinations of the water budget in the East/Japan Sea

    NASA Astrophysics Data System (ADS)

    HAN, S.; Hirose, N.; Usui, N.; Miyazawa, Y.

    2016-02-01

    The water balance of East/Japan Sea is determined mainly by inflow and outflow through the Korea/Tsushima, Tsugaru and Soya/La Perouse Straits. However, the volume transports measured at three straits remain quantitatively unbalanced. This study examined the seasonal variation of the volume transport using the multiple linear regression and ridge regression of multi-model ensemble (MME) methods to estimate physically consistent circulation in East/Japan Sea by using four different data assimilation models. The MME outperformed all of the single models by reducing uncertainties, especially the multicollinearity problem with the ridge regression. However, the regression constants turned out to be inconsistent with each other if the MME was applied separately for each strait. The MME for a connected system was thus performed to find common constants for these straits. The estimation of this MME was found to be similar to the MME result of sea level difference (SLD). The estimated mean transport (2.42 Sv) was smaller than the measurement data at the Korea/Tsushima Strait, but the calibrated transport of the Tsugaru Strait (1.63 Sv) was larger than the observed data. The MME results of transport and SLD also suggested that the standard deviation (STD) of the Korea/Tsushima Strait is larger than the STD of the observation, whereas the estimated results were almost identical to that observed for the Tsugaru and Soya/La Perouse Straits. The similarity between MME results enhances the reliability of the present MME estimation.

  20. Multi-model ensemble estimation of volume transport through the straits of the East/Japan Sea

    NASA Astrophysics Data System (ADS)

    Han, Sooyeon; Hirose, Naoki; Usui, Norihisa; Miyazawa, Yasumasa

    2016-01-01

    The volume transports measured at the Korea/Tsushima, Tsugaru, and Soya/La Perouse Straits remain quantitatively inconsistent. However, data assimilation models at least provide a self-consistent budget despite subtle differences among the models. This study examined the seasonal variation of the volume transport using the multiple linear regression and ridge regression of multi-model ensemble (MME) methods to estimate more accurately transport at these straits by using four different data assimilation models. The MME outperformed all of the single models by reducing uncertainties, especially the multicollinearity problem with the ridge regression. However, the regression constants turned out to be inconsistent with each other if the MME was applied separately for each strait. The MME for a connected system was thus performed to find common constants for these straits. The estimation of this MME was found to be similar to the MME result of sea level difference (SLD). The estimated mean transport (2.43 Sv) was smaller than the measurement data at the Korea/Tsushima Strait, but the calibrated transport of the Tsugaru Strait (1.63 Sv) was larger than the observed data. The MME results of transport and SLD also suggested that the standard deviation (STD) of the Korea/Tsushima Strait is larger than the STD of the observation, whereas the estimated results were almost identical to that observed for the Tsugaru and Soya/La Perouse Straits. The similarity between MME results enhances the reliability of the present MME estimation.

  1. Kinetic analysis of non-isothermal solid-state reactions: multi-stage modeling without assumptions in the reaction mechanism.

    PubMed

    Pomerantsev, Alexey L; Kutsenova, Alla V; Rodionova, Oxana Ye

    2017-02-01

    A novel non-linear regression method for modeling non-isothermal thermogravimetric data is proposed. Experiments for several heating rates are analyzed simultaneously. The method is applicable to complex multi-stage processes when the number of stages is unknown. Prior knowledge of the type of kinetics is not required. The main idea is a consequent estimation of parameters when the overall model is successively changed from one level of modeling to another. At the first level, the Avrami-Erofeev functions are used. At the second level, the Sestak-Berggren functions are employed with the goal to broaden the overall model. The method is tested using both simulated and real-world data. A comparison of the proposed method with a recently published 'model-free' deconvolution method is presented.

  2. Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

    PubMed Central

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-01-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882

  3. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    PubMed

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  4. Combined analysis of magnetic and gravity anomalies using normalized source strength (NSS)

    NASA Astrophysics Data System (ADS)

    Li, L.; Wu, Y.

    2017-12-01

    Gravity field and magnetic field belong to potential fields which lead inherent multi-solution. Combined analysis of magnetic and gravity anomalies based on Poisson's relation is used to determinate homology gravity and magnetic anomalies and decrease the ambiguity. The traditional combined analysis uses the linear regression of the reduction to pole (RTP) magnetic anomaly to the first order vertical derivative of the gravity anomaly, and provides the quantitative or semi-quantitative interpretation by calculating the correlation coefficient, slope and intercept. In the calculation process, due to the effect of remanent magnetization, the RTP anomaly still contains the effect of oblique magnetization. In this case the homology gravity and magnetic anomalies display irrelevant results in the linear regression calculation. The normalized source strength (NSS) can be transformed from the magnetic tensor matrix, which is insensitive to the remanence. Here we present a new combined analysis using NSS. Based on the Poisson's relation, the gravity tensor matrix can be transformed into the pseudomagnetic tensor matrix of the direction of geomagnetic field magnetization under the homologous condition. The NSS of pseudomagnetic tensor matrix and original magnetic tensor matrix are calculated and linear regression analysis is carried out. The calculated correlation coefficient, slope and intercept indicate the homology level, Poisson's ratio and the distribution of remanent respectively. We test the approach using synthetic model under complex magnetization, the results show that it can still distinguish the same source under the condition of strong remanence, and establish the Poisson's ratio. Finally, this approach is applied in China. The results demonstrated that our approach is feasible.

  5. Determination of pKa values of alendronate sodium in aqueous solution by piecewise linear regression based on acid-base potentiometric titration.

    PubMed

    Ke, Jing; Dou, Hanfei; Zhang, Ximin; Uhagaze, Dushimabararezi Serge; Ding, Xiali; Dong, Yuming

    2016-12-01

    As a mono-sodium salt form of alendronic acid, alendronate sodium presents multi-level ionization for the dissociation of its four hydroxyl groups. The dissociation constants of alendronate sodium were determined in this work by studying the piecewise linear relationship between volume of titrant and pH value based on acid-base potentiometric titration reaction. The distribution curves of alendronate sodium were drawn according to the determined pKa values. There were 4 dissociation constants (pKa 1 =2.43, pKa 2 =7.55, pKa 3 =10.80, pKa 4 =11.99, respectively) of alendronate sodium, and 12 existing forms, of which 4 could be ignored, existing in different pH environments.

  6. Statistical post-processing of seasonal multi-model forecasts: Why is it so hard to beat the multi-model mean?

    NASA Astrophysics Data System (ADS)

    Siegert, Stefan

    2017-04-01

    Initialised climate forecasts on seasonal time scales, run several months or even years ahead, are now an integral part of the battery of products offered by climate services world-wide. The availability of seasonal climate forecasts from various modeling centres gives rise to multi-model ensemble forecasts. Post-processing such seasonal-to-decadal multi-model forecasts is challenging 1) because the cross-correlation structure between multiple models and observations can be complicated, 2) because the amount of training data to fit the post-processing parameters is very limited, and 3) because the forecast skill of numerical models tends to be low on seasonal time scales. In this talk I will review new statistical post-processing frameworks for multi-model ensembles. I will focus particularly on Bayesian hierarchical modelling approaches, which are flexible enough to capture commonly made assumptions about collective and model-specific biases of multi-model ensembles. Despite the advances in statistical methodology, it turns out to be very difficult to out-perform the simplest post-processing method, which just recalibrates the multi-model ensemble mean by linear regression. I will discuss reasons for this, which are closely linked to the specific characteristics of seasonal multi-model forecasts. I explore possible directions for improvements, for example using informative priors on the post-processing parameters, and jointly modelling forecasts and observations.

  7. SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression *

    PubMed Central

    Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.

    2014-01-01

    The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling’s T2 test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPREM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM out-performs other state-of-the-art methods. PMID:26527844

  8. Pseudo-second order models for the adsorption of safranin onto activated carbon: comparison of linear and non-linear regression methods.

    PubMed

    Kumar, K Vasanth

    2007-04-02

    Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to pseudo-second order model of Ho, Sobkowsk and Czerwinski, Blanchard et al. and Ritchie by linear and non-linear regression methods. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo-second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo-second order model but with different assumptions. The best fit of experimental data in Ho's pseudo-second order expression by linear and non-linear regression method showed that Ho pseudo-second order model was a better kinetic expression when compared to other pseudo-second order kinetic expressions.

  9. Multi-linear regression models predict the effects of water chemistry on acute lead toxicity to Ceriodaphnia dubia and Pimephales promelas.

    PubMed

    Esbaugh, A J; Brix, K V; Mager, E M; Grosell, M

    2011-09-01

    The current study examined the acute toxicity of lead (Pb) to Ceriodaphnia dubia and Pimephales promelas in a variety of natural waters. The natural waters were selected to range in pertinent water chemistry parameters such as calcium, pH, total CO(2) and dissolved organic carbon (DOC). Acute toxicity was determined for C. dubia and P. promelas using standard 48h and 96h protocols, respectively. For both organisms acute toxicity varied markedly according to water chemistry, with C. dubia LC50s ranging from 29 to 180μg/L and P. promelas LC50s ranging from 41 to 3598μg/L. Additionally, no Pb toxicity was observed for P. promelas in three alkaline natural waters. With respect to water chemistry parameters, DOC had the strongest protective impact for both organisms. A multi-linear regression (MLR) approach combining previous lab data and the current data was used to identify the relative importance of individual water chemistry components in predicting acute Pb toxicity for both species. As anticipated, the P. promelas best-fit MLR model combined DOC, calcium and pH. Unexpectedly, in the C. dubiaMLR model the importance of pH, TCO(2) and calcium was minimal while DOC and ionic strength were the controlling water quality variables. Adjusted R(2) values of 0.82 and 0.64 for the P. promelas and C. dubia models, respectively, are comparable to previously developed biotic ligand models for other metals. Copyright © 2011 Elsevier Inc. All rights reserved.

  10. Modelling the Relationship Between Land Surface Temperature and Landscape Patterns of Land Use Land Cover Classification Using Multi Linear Regression Models

    NASA Astrophysics Data System (ADS)

    Bernales, A. M.; Antolihao, J. A.; Samonte, C.; Campomanes, F.; Rojas, R. J.; dela Serna, A. M.; Silapan, J.

    2016-06-01

    The threat of the ailments related to urbanization like heat stress is very prevalent. There are a lot of things that can be done to lessen the effect of urbanization to the surface temperature of the area like using green roofs or planting trees in the area. So land use really matters in both increasing and decreasing surface temperature. It is known that there is a relationship between land use land cover (LULC) and land surface temperature (LST). Quantifying this relationship in terms of a mathematical model is very important so as to provide a way to predict LST based on the LULC alone. This study aims to examine the relationship between LST and LULC as well as to create a model that can predict LST using class-level spatial metrics from LULC. LST was derived from a Landsat 8 image and LULC classification was derived from LiDAR and Orthophoto datasets. Class-level spatial metrics were created in FRAGSTATS with the LULC and LST as inputs and these metrics were analysed using a statistical framework. Multi linear regression was done to create models that would predict LST for each class and it was found that the spatial metric "Effective mesh size" was a top predictor for LST in 6 out of 7 classes. The model created can still be refined by adding a temporal aspect by analysing the LST of another farming period (for rural areas) and looking for common predictors between LSTs of these two different farming periods.

  11. Comparative data mining analysis for information retrieval of MODIS images: monitoring lake turbidity changes at Lake Okeechobee, Florida

    NASA Astrophysics Data System (ADS)

    Chang, Ni-Bin; Daranpob, Ammarin; Yang, Y. Jeffrey; Jin, Kang-Ren

    2009-09-01

    In the remote sensing field, a frequently recurring question is: Which computational intelligence or data mining algorithms are most suitable for the retrieval of essential information given that most natural systems exhibit very high non-linearity. Among potential candidates might be empirical regression, neural network model, support vector machine, genetic algorithm/genetic programming, analytical equation, etc. This paper compares three types of data mining techniques, including multiple non-linear regression, artificial neural networks, and genetic programming, for estimating multi-temporal turbidity changes following hurricane events at Lake Okeechobee, Florida. This retrospective analysis aims to identify how the major hurricanes impacted the water quality management in 2003-2004. The Moderate Resolution Imaging Spectroradiometer (MODIS) Terra 8-day composite imageries were used to retrieve the spatial patterns of turbidity distributions for comparison against the visual patterns discernible in the in-situ observations. By evaluating four statistical parameters, the genetic programming model was finally selected as the most suitable data mining tool for classification in which the MODIS band 1 image and wind speed were recognized as the major determinants by the model. The multi-temporal turbidity maps generated before and after the major hurricane events in 2003-2004 showed that turbidity levels were substantially higher after hurricane episodes. The spatial patterns of turbidity confirm that sediment-laden water travels to the shore where it reduces the intensity of the light necessary to submerged plants for photosynthesis. This reduction results in substantial loss of biomass during the post-hurricane period.

  12. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    PubMed

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  13. Fuzzy regression modeling for tool performance prediction and degradation detection.

    PubMed

    Li, X; Er, M J; Lim, B S; Zhou, J H; Gan, O P; Rutkowski, L

    2010-10-01

    In this paper, the viability of using Fuzzy-Rule-Based Regression Modeling (FRM) algorithm for tool performance and degradation detection is investigated. The FRM is developed based on a multi-layered fuzzy-rule-based hybrid system with Multiple Regression Models (MRM) embedded into a fuzzy logic inference engine that employs Self Organizing Maps (SOM) for clustering. The FRM converts a complex nonlinear problem to a simplified linear format in order to further increase the accuracy in prediction and rate of convergence. The efficacy of the proposed FRM is tested through a case study - namely to predict the remaining useful life of a ball nose milling cutter during a dry machining process of hardened tool steel with a hardness of 52-54 HRc. A comparative study is further made between four predictive models using the same set of experimental data. It is shown that the FRM is superior as compared with conventional MRM, Back Propagation Neural Networks (BPNN) and Radial Basis Function Networks (RBFN) in terms of prediction accuracy and learning speed.

  14. A novel method for determination of aragonite saturation state on the continental shelf of central Oregon using multi-parameter relationships with hydrographic data

    NASA Astrophysics Data System (ADS)

    Juranek, L. W.; Feely, R. A.; Peterson, W. T.; Alin, S. R.; Hales, B.; Lee, K.; Sabine, C. L.; Peterson, J.

    2009-12-01

    We developed a multiple linear regression model to robustly determine aragonite saturation state (Ωarag) from observations of temperature and oxygen (R2 = 0.987, RMS error 0.053), using data collected in the Pacific Northwest region in late May 2007. The seasonal evolution of Ωarag near central Oregon was evaluated by applying the regression model to a monthly (winter)/bi-weekly (summer) water-column hydrographic time-series collected over the shelf and slope in 2007. The Ωarag predicted by the regression model was less than 1, the thermodynamic calcification/dissolution threshold, over shelf/slope bottom waters throughout the entire 2007 upwelling season (May-November), with the Ωarag = 1 horizon shoaling to 30 m by late summer. The persistence of water with Ωarag < 1 on the continental shelf has not been previously noted and could have notable ecological consequences for benthic and pelagic calcifying organisms such as mussels, oysters, abalone, echinoderms, and pteropods.

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Qichun; Zhang, Xuesong; Xu, Xingya

    Riverine carbon cycling is an important, but insufficiently investigated component of the global carbon cycle. Analyses of environmental controls on riverine carbon cycling are critical for improved understanding of mechanisms regulating carbon processing and storage along the terrestrial-aquatic continuum. Here, we compile and analyze riverine dissolved organic carbon (DOC) concentration data from 1402 United States Geological Survey (USGS) gauge stations to examine the spatial variability and environmental controls of DOC concentrations in the United States (U.S.) surface waters. DOC concentrations exhibit high spatial variability, with an average of 6.42 ± 6.47 mg C/ L (Mean ± Standard Deviation). In general,more » high DOC concentrations occur in the Upper Mississippi River basin and the Southeastern U.S., while low concentrations are mainly distributed in the Western U.S. Single-factor analysis indicates that slope of drainage areas, wetlands, forests, percentage of first-order streams, and instream nutrients (such as nitrogen and phosphorus) pronouncedly influence DOC concentrations, but the explanatory power of each bivariate model is lower than 35%. Analyses based on the general multi-linear regression models suggest DOC concentrations are jointly impacted by multiple factors. Soil properties mainly show positive correlations with DOC concentrations; forest and shrub lands have positive correlations with DOC concentrations, but urban area and croplands demonstrate negative impacts; total instream phosphorus and dam density correlate positively with DOC concentrations. Notably, the relative importance of these environmental controls varies substantially across major U.S. water resource regions. In addition, DOC concentrations and environmental controls also show significant variability from small streams to large rivers, which may be caused by changing carbon sources and removal rates by river orders. In sum, our results reveal that general multi-linear regression analysis of twenty one terrestrial and aquatic environmental factors can partially explain (56%) the DOC concentration variation. In conclusion, this study highlights the complexity of the interactions among these environmental factors in determining DOC concentrations, thus calls for processes-based, non-linear methodologies to constrain uncertainties in riverine DOC cycling.« less

  16. Linear regression crash prediction models : issues and proposed solutions.

    DOT National Transportation Integrated Search

    2010-05-01

    The paper develops a linear regression model approach that can be applied to : crash data to predict vehicle crashes. The proposed approach involves novice data aggregation : to satisfy linear regression assumptions; namely error structure normality ...

  17. Temporal Drivers of Liking Based on Functional Data Analysis and Non-Additive Models for Multi-Attribute Time-Intensity Data of Fruit Chews.

    PubMed

    Kuesten, Carla; Bi, Jian

    2018-06-03

    Conventional drivers of liking analysis was extended with a time dimension into temporal drivers of liking (TDOL) based on functional data analysis methodology and non-additive models for multiple-attribute time-intensity (MATI) data. The non-additive models, which consider both direct effects and interaction effects of attributes to consumer overall liking, include Choquet integral and fuzzy measure in the multi-criteria decision-making, and linear regression based on variance decomposition. Dynamics of TDOL, i.e., the derivatives of the relative importance functional curves were also explored. Well-established R packages 'fda', 'kappalab' and 'relaimpo' were used in the paper for developing TDOL. Applied use of these methods shows that the relative importance of MATI curves offers insights for understanding the temporal aspects of consumer liking for fruit chews.

  18. Advances in simultaneous atmospheric profile and cloud parameter regression based retrieval from high-spectral resolution radiance measurements

    NASA Astrophysics Data System (ADS)

    Weisz, Elisabeth; Smith, William L.; Smith, Nadia

    2013-06-01

    The dual-regression (DR) method retrieves information about the Earth surface and vertical atmospheric conditions from measurements made by any high-spectral resolution infrared sounder in space. The retrieved information includes temperature and atmospheric gases (such as water vapor, ozone, and carbon species) as well as surface and cloud top parameters. The algorithm was designed to produce a high-quality product with low latency and has been demonstrated to yield accurate results in real-time environments. The speed of the retrieval is achieved through linear regression, while accuracy is achieved through a series of classification schemes and decision-making steps. These steps are necessary to account for the nonlinearity of hyperspectral retrievals. In this work, we detail the key steps that have been developed in the DR method to advance accuracy in the retrieval of nonlinear parameters, specifically cloud top pressure. The steps and their impact on retrieval results are discussed in-depth and illustrated through relevant case studies. In addition to discussing and demonstrating advances made in addressing nonlinearity in a linear geophysical retrieval method, advances toward multi-instrument geophysical analysis by applying the DR to three different operational sounders in polar orbit are also noted. For any area on the globe, the DR method achieves consistent accuracy and precision, making it potentially very valuable to both the meteorological and environmental user communities.

  19. Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment

    ERIC Educational Resources Information Center

    Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos

    2013-01-01

    In order to interpret laboratory experimental data, undergraduate students are used to perform linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…

  20. Covariance functions for body weight from birth to maturity in Nellore cows.

    PubMed

    Boligon, A A; Mercadante, M E Z; Forni, S; Lôbo, R B; Albuquerque, L G

    2010-03-01

    The objective of this study was to estimate (co)variance functions using random regression models on Legendre polynomials for the analysis of repeated measures of BW from birth to adult age. A total of 82,064 records from 8,145 females were analyzed. Different models were compared. The models included additive direct and maternal effects, and animal and maternal permanent environmental effects as random terms. Contemporary group and dam age at calving (linear and quadratic effect) were included as fixed effects, and orthogonal Legendre polynomials of animal age (cubic regression) were considered as random covariables. Eight models with polynomials of third to sixth order were used to describe additive direct and maternal effects, and animal and maternal permanent environmental effects. Residual effects were modeled using 1 (i.e., assuming homogeneity of variances across all ages) or 5 age classes. The model with 5 classes was the best to describe the trajectory of residuals along the growth curve. The model including fourth- and sixth-order polynomials for additive direct and animal permanent environmental effects, respectively, and third-order polynomials for maternal genetic and maternal permanent environmental effects were the best. Estimates of (co)variance obtained with the multi-trait and random regression models were similar. Direct heritability estimates obtained with the random regression models followed a trend similar to that obtained with the multi-trait model. The largest estimates of maternal heritability were those of BW taken close to 240 d of age. In general, estimates of correlation between BW from birth to 8 yr of age decreased with increasing distance between ages.

  1. Effectiveness of a worksite mindfulness-based multi-component intervention on lifestyle behaviors

    PubMed Central

    2014-01-01

    Introduction Overweight and obesity are associated with an increased risk of morbidity. Mindfulness training could be an effective strategy to optimize lifestyle behaviors related to body weight gain. The aim of this study was to evaluate the effectiveness of a worksite mindfulness-based multi-component intervention on vigorous physical activity in leisure time, sedentary behavior at work, fruit intake and determinants of these behaviors. The control group received information on existing lifestyle behavior- related facilities that were already available at the worksite. Methods In a randomized controlled trial design (n = 257), 129 workers received a mindfulness training, followed by e-coaching, lunch walking routes and fruit. Outcome measures were assessed at baseline and after 6 and 12 months using questionnaires. Physical activity was also measured using accelerometers. Effects were analyzed using linear mixed effect models according to the intention-to-treat principle. Linear regression models (complete case analyses) were used as sensitivity analyses. Results There were no significant differences in lifestyle behaviors and determinants of these behaviors between the intervention and control group after 6 or 12 months. The sensitivity analyses showed effect modification for gender in sedentary behavior at work at 6-month follow-up, although the main analyses did not. Conclusions This study did not show an effect of a worksite mindfulness-based multi-component intervention on lifestyle behaviors and behavioral determinants after 6 and 12 months. The effectiveness of a worksite mindfulness-based multi-component intervention as a health promotion intervention for all workers could not be established. PMID:24467802

  2. Synthesis, spectral studies and antimicrobial activities of some 2-naphthyl pyrazoline derivatives

    NASA Astrophysics Data System (ADS)

    Sakthinathan, S. P.; Vanangamudi, G.; Thirunarayanan, G.

    A series of 2-naphthyl pyrazolines were synthesized by the cyclization of 2-naphthyl chalcones and phenylhydrazine hydrochloride in the presence of sodium acetate. The yields of pyrazoline derivatives are more than 80%. The synthesized pyrazolines were characterized by their physical constants, IR, 1H, 13C and MS spectra. From the IR and NMR spectra the Cdbnd N (cm-1) stretches, the pyrazoline ring proton chemical shifts (ppm) of δ, Hb and Hc and also the carbon chemical shifts (ppm) of δCdbnd N are correlated with Hammett substituent constants, F and R, and Swain-Lupton's parameters using single and multi-regression analyses. From the results of linear regression analysis, the effect of substituents on the group frequencies has been predicted. The antimicrobial activities of all synthesized pyrazolines have been studied.

  3. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  4. Estimating PM2.5 Concentrations in Xi'an City Using a Generalized Additive Model with Multi-Source Monitoring Data

    PubMed Central

    Song, Yong-Ze; Yang, Hong-Lei; Peng, Jun-Huan; Song, Yi-Rong; Sun, Qian; Li, Yuan

    2015-01-01

    Particulate matter with an aerodynamic diameter <2.5 μm (PM2.5) represents a severe environmental problem and is of negative impact on human health. Xi'an City, with a population of 6.5 million, is among the highest concentrations of PM2.5 in China. In 2013, in total, there were 191 days in Xi’an City on which PM2.5 concentrations were greater than 100 μg/m3. Recently, a few studies have explored the potential causes of high PM2.5 concentration using remote sensing data such as the MODIS aerosol optical thickness (AOT) product. Linear regression is a commonly used method to find statistical relationships among PM2.5 concentrations and other pollutants, including CO, NO2, SO2, and O3, which can be indicative of emission sources. The relationships of these variables, however, are usually complicated and non-linear. Therefore, a generalized additive model (GAM) is used to estimate the statistical relationships between potential variables and PM2.5 concentrations. This model contains linear functions of SO2 and CO, univariate smoothing non-linear functions of NO2, O3, AOT and temperature, and bivariate smoothing non-linear functions of location and wind variables. The model can explain 69.50% of PM2.5 concentrations, with R2 = 0.691, which improves the result of a stepwise linear regression (R2 = 0.582) by 18.73%. The two most significant variables, CO concentration and AOT, represent 20.65% and 19.54% of the deviance, respectively, while the three other gas-phase concentrations, SO2, NO2, and O3 account for 10.88% of the total deviance. These results show that in Xi'an City, the traffic and other industrial emissions are the primary source of PM2.5. Temperature, location, and wind variables also non-linearly related with PM2.5. PMID:26540446

  5. A parameter estimation subroutine package

    NASA Technical Reports Server (NTRS)

    Bierman, G. J.; Nead, M. W.

    1978-01-01

    Linear least squares estimation and regression analyses continue to play a major role in orbit determination and related areas. A library of FORTRAN subroutines were developed to facilitate analyses of a variety of estimation problems. An easy to use, multi-purpose set of algorithms that are reasonably efficient and which use a minimal amount of computer storage are presented. Subroutine inputs, outputs, usage and listings are given, along with examples of how these routines can be used. The routines are compact and efficient and are far superior to the normal equation and Kalman filter data processing algorithms that are often used for least squares analyses.

  6. Quantifying Main Trends in Lysozyme Nucleation: The Effects of Precipitant Concentration, Supersaturation and Impurities

    NASA Technical Reports Server (NTRS)

    Burke, Michael W.; Leardi, Riccardo; Judge, Russell A.; Pusey, Marc L.; Whitaker, Ann F. (Technical Monitor)

    2001-01-01

    Full factorial experimental design incorporating multi-linear regression analysis of the experimental data allows quick identification of main trends and effects using a limited number of experiments. In this study these techniques were employed to identify the effect of precipitant concentration, supersaturation, and the presence of an impurity, the physiological lysozyme dimer, on the nucleation rate and crystal dimensions of the tetragonal forin of chicken egg white lysozyme. Decreasing precipitant concentration, increasing supers aturation, and increasing impurity, were found to increase crystal numbers. The crystal axial ratio decreased with increasing precipitant concentration, independent of impurity.

  7. Groundwater-level prediction using multiple linear regression and artificial neural network techniques: a comparative assessment

    NASA Astrophysics Data System (ADS)

    Sahoo, Sasmita; Jha, Madan K.

    2013-12-01

    The potential of multiple linear regression (MLR) and artificial neural network (ANN) techniques in predicting transient water levels over a groundwater basin were compared. MLR and ANN modeling was carried out at 17 sites in Japan, considering all significant inputs: rainfall, ambient temperature, river stage, 11 seasonal dummy variables, and influential lags of rainfall, ambient temperature, river stage and groundwater level. Seventeen site-specific ANN models were developed, using multi-layer feed-forward neural networks trained with Levenberg-Marquardt backpropagation algorithms. The performance of the models was evaluated using statistical and graphical indicators. Comparison of the goodness-of-fit statistics of the MLR models with those of the ANN models indicated that there is better agreement between the ANN-predicted groundwater levels and the observed groundwater levels at all the sites, compared to the MLR. This finding was supported by the graphical indicators and the residual analysis. Thus, it is concluded that the ANN technique is superior to the MLR technique in predicting spatio-temporal distribution of groundwater levels in a basin. However, considering the practical advantages of the MLR technique, it is recommended as an alternative and cost-effective groundwater modeling tool.

  8. Transmission of linear regression patterns between time series: From relationship in time series to complex networks

    NASA Astrophysics Data System (ADS)

    Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

    2014-07-01

    The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.

  9. Transmission of linear regression patterns between time series: from relationship in time series to complex networks.

    PubMed

    Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

    2014-07-01

    The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We tackle fundamental research that presents a simple and efficient computational scheme: a linear regression patterns transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequency of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.

  10. Linking brain-wide multivoxel activation patterns to behaviour: Examples from language and math.

    PubMed

    Raizada, Rajeev D S; Tsao, Feng-Ming; Liu, Huei-Mei; Holloway, Ian D; Ansari, Daniel; Kuhl, Patricia K

    2010-05-15

    A key goal of cognitive neuroscience is to find simple and direct connections between brain and behaviour. However, fMRI analysis typically involves choices between many possible options, with each choice potentially biasing any brain-behaviour correlations that emerge. Standard methods of fMRI analysis assess each voxel individually, but then face the problem of selection bias when combining those voxels into a region-of-interest, or ROI. Multivariate pattern-based fMRI analysis methods use classifiers to analyse multiple voxels together, but can also introduce selection bias via data-reduction steps as feature selection of voxels, pre-selecting activated regions, or principal components analysis. We show here that strong brain-behaviour links can be revealed without any voxel selection or data reduction, using just plain linear regression as a classifier applied to the whole brain at once, i.e. treating each entire brain volume as a single multi-voxel pattern. The brain-behaviour correlations emerged despite the fact that the classifier was not provided with any information at all about subjects' behaviour, but instead was given only the neural data and its condition-labels. Surprisingly, more powerful classifiers such as a linear SVM and regularised logistic regression produce very similar results. We discuss some possible reasons why the very simple brain-wide linear regression model is able to find correlations with behaviour that are as strong as those obtained on the one hand from a specific ROI and on the other hand from more complex classifiers. In a manner which is unencumbered by arbitrary choices, our approach offers a method for investigating connections between brain and behaviour which is simple, rigorous and direct. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  11. Validating Variational Bayes Linear Regression Method With Multi-Central Datasets.

    PubMed

    Murata, Hiroshi; Zangwill, Linda M; Fujino, Yuri; Matsuura, Masato; Miki, Atsuya; Hirasawa, Kazunori; Tanito, Masaki; Mizoue, Shiro; Mori, Kazuhiko; Suzuki, Katsuyoshi; Yamashita, Takehiro; Kashiwagi, Kenji; Shoji, Nobuyuki; Asaoka, Ryo

    2018-04-01

    To validate the prediction accuracy of variational Bayes linear regression (VBLR) with two datasets external to the training dataset. The training dataset consisted of 7268 eyes of 4278 subjects from the University of Tokyo Hospital. The Japanese Archive of Multicentral Databases in Glaucoma (JAMDIG) dataset consisted of 271 eyes of 177 patients, and the Diagnostic Innovations in Glaucoma Study (DIGS) dataset includes 248 eyes of 173 patients, which were used for validation. Prediction accuracy was compared between the VBLR and ordinary least squared linear regression (OLSLR). First, OLSLR and VBLR were carried out using total deviation (TD) values at each of the 52 test points from the second to fourth visual fields (VFs) (VF2-4) to 2nd to 10th VF (VF2-10) of each patient in JAMDIG and DIGS datasets, and the TD values of the 11th VF test were predicted every time. The predictive accuracy of each method was compared through the root mean squared error (RMSE) statistic. OLSLR RMSEs with the JAMDIG and DIGS datasets were between 31 and 4.3 dB, and between 19.5 and 3.9 dB. On the other hand, VBLR RMSEs with JAMDIG and DIGS datasets were between 5.0 and 3.7, and between 4.6 and 3.6 dB. There was statistically significant difference between VBLR and OLSLR for both datasets at every series (VF2-4 to VF2-10) (P < 0.01 for all tests). However, there was no statistically significant difference in VBLR RMSEs between JAMDIG and DIGS datasets at any series of VFs (VF2-2 to VF2-10) (P > 0.05). VBLR outperformed OLSLR to predict future VF progression, and the VBLR has a potential to be a helpful tool at clinical settings.

  12. Linking brain-wide multivoxel activation patterns to behaviour: Examples from language and math

    PubMed Central

    Raizada, Rajeev D.S.; Tsao, Feng-Ming; Liu, Huei-Mei; Holloway, Ian D.; Ansari, Daniel; Kuhl, Patricia K.

    2010-01-01

    A key goal of cognitive neuroscience is to find simple and direct connections between brain and behaviour. However, fMRI analysis typically involves choices between many possible options, with each choice potentially biasing any brain–behaviour correlations that emerge. Standard methods of fMRI analysis assess each voxel individually, but then face the problem of selection bias when combining those voxels into a region-of-interest, or ROI. Multivariate pattern-based fMRI analysis methods use classifiers to analyse multiple voxels together, but can also introduce selection bias via data-reduction steps as feature selection of voxels, pre-selecting activated regions, or principal components analysis. We show here that strong brain–behaviour links can be revealed without any voxel selection or data reduction, using just plain linear regression as a classifier applied to the whole brain at once, i.e. treating each entire brain volume as a single multi-voxel pattern. The brain–behaviour correlations emerged despite the fact that the classifier was not provided with any information at all about subjects' behaviour, but instead was given only the neural data and its condition-labels. Surprisingly, more powerful classifiers such as a linear SVM and regularised logistic regression produce very similar results. We discuss some possible reasons why the very simple brain-wide linear regression model is able to find correlations with behaviour that are as strong as those obtained on the one hand from a specific ROI and on the other hand from more complex classifiers. In a manner which is unencumbered by arbitrary choices, our approach offers a method for investigating connections between brain and behaviour which is simple, rigorous and direct. PMID:20132896

  13. Estimating severity of sideways fall using a generic multi linear regression model based on kinematic input variables.

    PubMed

    van der Zijden, A M; Groen, B E; Tanck, E; Nienhuis, B; Verdonschot, N; Weerdesteyn, V

    2017-03-21

    Many research groups have studied fall impact mechanics to understand how fall severity can be reduced to prevent hip fractures. Yet, direct impact force measurements with force plates are restricted to a very limited repertoire of experimental falls. The purpose of this study was to develop a generic model for estimating hip impact forces (i.e. fall severity) in in vivo sideways falls without the use of force plates. Twelve experienced judokas performed sideways Martial Arts (MA) and Block ('natural') falls on a force plate, both with and without a mat on top. Data were analyzed to determine the hip impact force and to derive 11 selected (subject-specific and kinematic) variables. Falls from kneeling height were used to perform a stepwise regression procedure to assess the effects of these input variables and build the model. The final model includes four input variables, involving one subject-specific measure and three kinematic variables: maximum upper body deceleration, body mass, shoulder angle at the instant of 'maximum impact' and maximum hip deceleration. The results showed that estimated and measured hip impact forces were linearly related (explained variances ranging from 46 to 63%). Hip impact forces of MA falls onto the mat from a standing position (3650±916N) estimated by the final model were comparable with measured values (3698±689N), even though these data were not used for training the model. In conclusion, a generic linear regression model was developed that enables the assessment of fall severity through kinematic measures of sideways falls, without using force plates. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Least median of squares and iteratively re-weighted least squares as robust linear regression methods for fluorimetric determination of α-lipoic acid in capsules in ideal and non-ideal cases of linearity.

    PubMed

    Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F

    2018-06-01

    This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.

  15. The use of experimental design in the development of an HPLC-ECD method for the analysis of captopril.

    PubMed

    Khamanga, Sandile M; Walker, Roderick B

    2011-01-15

    An accurate, sensitive and specific high performance liquid chromatography-electrochemical detection (HPLC-ECD) method that was developed and validated for captopril (CPT) is presented. Separation was achieved using a Phenomenex(®) Luna 5 μm (C(18)) column and a mobile phase comprised of phosphate buffer (adjusted to pH 3.0): acetonitrile in a ratio of 70:30 (v/v). Detection was accomplished using a full scan multi channel ESA Coulometric detector in the "oxidative-screen" mode with the upstream electrode (E(1)) set at +600 mV and the downstream (analytical) electrode (E(2)) set at +950 mV, while the potential of the guard cell was maintained at +1050 mV. The detector gain was set at 300. Experimental design using central composite design (CCD) was used to facilitate method development. Mobile phase pH, molarity and concentration of acetonitrile (ACN) were considered the critical factors to be studied to establish the retention time of CPT and cyclizine (CYC) that was used as the internal standard. Twenty experiments including centre points were undertaken and a quadratic model was derived for the retention time for CPT using the experimental data. The method was validated for linearity, accuracy, precision, limits of quantitation and detection, as per the ICH guidelines. The system was found to produce sharp and well-resolved peaks for CPT and CYC with retention times of 3.08 and 7.56 min, respectively. Linear regression analysis for the calibration curve showed a good linear relationship with a regression coefficient of 0.978 in the concentration range of 2-70 μg/mL. The linear regression equation was y=0.0131x+0.0275. The limits of detection (LOQ) and quantitation (LOD) were found to be 2.27 and 0.6 μg/mL, respectively. The method was used to analyze CPT in tablets. The wide range for linearity, accuracy, sensitivity, short retention time and composition of the mobile phase indicated that this method is better for the quantification of CPT than the pharmacopoeial methods. Copyright © 2010 Elsevier B.V. All rights reserved.

  16. Nonlinear information fusion algorithms for data-efficient multi-fidelity modelling.

    PubMed

    Perdikaris, P; Raissi, M; Damianou, A; Lawrence, N D; Karniadakis, G E

    2017-02-01

    Multi-fidelity modelling enables accurate inference of quantities of interest by synergistically combining realizations of low-cost/low-fidelity models with a small set of high-fidelity observations. This is particularly effective when the low- and high-fidelity models exhibit strong correlations, and can lead to significant computational gains over approaches that solely rely on high-fidelity models. However, in many cases of practical interest, low-fidelity models can only be well correlated to their high-fidelity counterparts for a specific range of input parameters, and potentially return wrong trends and erroneous predictions if probed outside of their validity regime. Here we put forth a probabilistic framework based on Gaussian process regression and nonlinear autoregressive schemes that is capable of learning complex nonlinear and space-dependent cross-correlations between models of variable fidelity, and can effectively safeguard against low-fidelity models that provide wrong trends. This introduces a new class of multi-fidelity information fusion algorithms that provide a fundamental extension to the existing linear autoregressive methodologies, while still maintaining the same algorithmic complexity and overall computational cost. The performance of the proposed methods is tested in several benchmark problems involving both synthetic and real multi-fidelity datasets from computational fluid dynamics simulations.

  17. 3D Multi-segment foot kinematics in children: A developmental study in typically developing boys.

    PubMed

    Deschamps, Kevin; Staes, Filip; Peerlinck, Kathelijne; Van Geet, Christel; Hermans, Cedric; Matricali, Giovanni Arnoldo; Lobet, Sebastien

    2017-02-01

    The relationship between age and 3D rotations objectivized with multisegment foot models has not been quantified until now. The purpose of this study was therefore to investigate the relationship between age and multi-segment foot kinematics in a cross-sectional database. Barefoot multi-segment foot kinematics of thirty two typically developing boys, aged 6-20 years, were captured with the Rizzoli Multi-segment Foot Model. One-dimensional statistical parametric mapping linear regression was used to examine the relationship between age and 3D inter-segment rotations of the dominant leg during the full gait cycle. Age was significantly correlated with sagittal plane kinematics of the midfoot and the calcaneus-metatarsus inter-segment angle (p<0.0125). Age was also correlated with the transverse plane kinematics of the calcaneus-metatarsus angle (p<0.0001). Gait labs should consider age related differences and variability if optimal decision making is pursued. It remains unclear if this is of interest for all foot models, however, the current study highlights that this is of particular relevance for foot models which incorporate a separate midfoot segment. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Digital Image Restoration Under a Regression Model - The Unconstrained, Linear Equality and Inequality Constrained Approaches

    DTIC Science & Technology

    1974-01-01

    REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January 1974 Nelson Delfino d’Avila Mascarenha;? Image...Report 520 DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES January...a two- dimensional form adequately describes the linear model . A dis- cretization is performed by using quadrature methods. By trans

  19. Element enrichment factor calculation using grain-size distribution and functional data regression.

    PubMed

    Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R

    2015-01-01

    In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Development of efficiency indicators of operating room management for multi-institutional comparisons.

    PubMed

    Tanaka, Masayuki; Lee, Jason; Ikai, Hiroshi; Imanaka, Yuichi

    2013-04-01

    The efficiency of a hospital's operating room (OR) management can affect its overall profitability. However, existing indicators that assess OR management efficiency do not take into account differences in hospital size, manpower and functional characteristics, thereby rendering them unsuitable for multi-institutional comparisons. The aim of this study was to develop indicators of OR management efficiency that would take into account differences in hospital size and manpower, which may then be applied to multi-institutional comparisons. Using administrative data from 224 hospitals in Japan from 2008 to 2010, we performed four multiple linear regression analyses at the hospital level, in which the dependent variables were the number of operations per OR per month, procedural fees per OR per month, total utilization times per OR per month and total fees per OR per month for each of the models. The expected values of these four indicators were produced using multiple regression analysis results, adjusting for differences in hospital size and manpower, which are beyond the control of process owners' management. However, more than half of the variations in three of these four indicators were shown to be explained by differences in hospital size and manpower. Using the ratio of observed to expected values (OE ratio), as well as the difference between the two values (OE difference) allows hospitals to identify weaknesses in efficiency with more validity when compared to unadjusted indicators. The new indicators may support the improvement and sustainment of a high-quality health care system. © 2012 Blackwell Publishing Ltd.

  1. Who Will Win?: Predicting the Presidential Election Using Linear Regression

    ERIC Educational Resources Information Center

    Lamb, John H.

    2007-01-01

    This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…

  2. A multi-criteria index for ecological evaluation of tropical agriculture in southeastern Mexico.

    PubMed

    Huerta, Esperanza; Kampichler, Christian; Ochoa-Gaona, Susana; De Jong, Ben; Hernandez-Daumas, Salvador; Geissen, Violette

    2014-01-01

    The aim of this study was to generate an easy to use index to evaluate the ecological state of agricultural land from a sustainability perspective. We selected environmental indicators, such as the use of organic soil amendments (green manure) versus chemical fertilizers, plant biodiversity (including crop associations), variables which characterize soil conservation of conventional agricultural systems, pesticide use, method and frequency of tillage. We monitored the ecological state of 52 agricultural plots to test the performance of the index. The variables were hierarchically aggregated with simple mathematical algorithms, if-then rules, and rule-based fuzzy models, yielding the final multi-criteria index with values from 0 (worst) to 1 (best conditions). We validated the model through independent evaluation by experts, and we obtained a linear regression with an r2 = 0.61 (p = 2.4e-06, d.f. = 49) between index output and the experts' evaluation.

  3. A Multi-Criteria Index for Ecological Evaluation of Tropical Agriculture in Southeastern Mexico

    PubMed Central

    Huerta, Esperanza; Kampichler, Christian; Ochoa-Gaona, Susana; De Jong, Ben; Hernandez-Daumas, Salvador; Geissen, Violette

    2014-01-01

    The aim of this study was to generate an easy to use index to evaluate the ecological state of agricultural land from a sustainability perspective. We selected environmental indicators, such as the use of organic soil amendments (green manure) versus chemical fertilizers, plant biodiversity (including crop associations), variables which characterize soil conservation of conventional agricultural systems, pesticide use, method and frequency of tillage. We monitored the ecological state of 52 agricultural plots to test the performance of the index. The variables were hierarchically aggregated with simple mathematical algorithms, if-then rules, and rule-based fuzzy models, yielding the final multi-criteria index with values from 0 (worst) to 1 (best conditions). We validated the model through independent evaluation by experts, and we obtained a linear regression with an r2 = 0.61 (p = 2.4e-06, d.f. = 49) between index output and the experts’ evaluation. PMID:25405980

  4. The microcomputer scientific software series 2: general linear model--regression.

    Treesearch

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  5. Multi-Target Regression via Robust Low-Rank Learning.

    PubMed

    Zhen, Xiantong; Yu, Mengyang; He, Xiaofei; Li, Shuo

    2018-02-01

    Multi-target regression has recently regained great popularity due to its capability of simultaneously learning multiple relevant regression tasks and its wide applications in data mining, computer vision and medical image analysis, while great challenges arise from jointly handling inter-target correlations and input-output relationships. In this paper, we propose Multi-layer Multi-target Regression (MMR) which enables simultaneously modeling intrinsic inter-target correlations and nonlinear input-output relationships in a general framework via robust low-rank learning. Specifically, the MMR can explicitly encode inter-target correlations in a structure matrix by matrix elastic nets (MEN); the MMR can work in conjunction with the kernel trick to effectively disentangle highly complex nonlinear input-output relationships; the MMR can be efficiently solved by a new alternating optimization algorithm with guaranteed convergence. The MMR leverages the strength of kernel methods for nonlinear feature learning and the structural advantage of multi-layer learning architectures for inter-target correlation modeling. More importantly, it offers a new multi-layer learning paradigm for multi-target regression which is endowed with high generality, flexibility and expressive ability. Extensive experimental evaluation on 18 diverse real-world datasets demonstrates that our MMR can achieve consistently high performance and outperforms representative state-of-the-art algorithms, which shows its great effectiveness and generality for multivariate prediction.

  6. Comparison of multi-subject ICA methods for analysis of fMRI data

    PubMed Central

    Erhardt, Erik Barry; Rachakonda, Srinivas; Bedrick, Edward; Allen, Elena; Adali, Tülay; Calhoun, Vince D.

    2010-01-01

    Spatial independent component analysis (ICA) applied to functional magnetic resonance imaging (fMRI) data identifies functionally connected networks by estimating spatially independent patterns from their linearly mixed fMRI signals. Several multi-subject ICA approaches estimating subject-specific time courses (TCs) and spatial maps (SMs) have been developed, however there has not yet been a full comparison of the implications of their use. Here, we provide extensive comparisons of four multi-subject ICA approaches in combination with data reduction methods for simulated and fMRI task data. For multi-subject ICA, the data first undergo reduction at the subject and group levels using principal component analysis (PCA). Comparisons of subject-specific, spatial concatenation, and group data mean subject-level reduction strategies using PCA and probabilistic PCA (PPCA) show that computationally intensive PPCA is equivalent to PCA, and that subject-specific and group data mean subject-level PCA are preferred because of well-estimated TCs and SMs. Second, aggregate independent components are estimated using either noise free ICA or probabilistic ICA (PICA). Third, subject-specific SMs and TCs are estimated using back-reconstruction. We compare several direct group ICA (GICA) back-reconstruction approaches (GICA1-GICA3) and an indirect back-reconstruction approach, spatio-temporal regression (STR, or dual regression). Results show the earlier group ICA (GICA1) approximates STR, however STR has contradictory assumptions and may show mixed-component artifacts in estimated SMs. Our evidence-based recommendation is to use GICA3, introduced here, with subject-specific PCA and noise-free ICA, providing the most robust and accurate estimated SMs and TCs in addition to offering an intuitive interpretation. PMID:21162045

  7. Water requirements of canine athletes during multi-day exercise.

    PubMed

    Stephens-Brown, Lara; Davis, Michael

    2018-03-23

    Exercise increases water requirements, but there is little information regarding water loss in dogs performing multi-day exercise OBJECTIVES: Quantify the daily water turnover of working dogs during multi-day exercise and establish the suitability of SC administration of tracer to determine water turnover. Fifteen privately owned Labrador retrievers trained for explosive detection duties and 16 privately owned Alaskan Huskies conditioned for mid-distance racing. All dogs received 0.3 g D 2 O/kg body weight by IV infusion, gavage, or SC injection before the start of a multi-day exercise challenge. Explosive detection dogs conducted 5 days of simulated off-leash explosive detection activity. Alaskan sled dogs completed a mid-distance stage race totaling 222 km in 2 days. Total body water (TBW) and daily water turnover were calculated using both indicator dilution and elimination regression techniques. Total body water (% of body weight) varied from 60% ± 8.6% in minimally conditioned Labrador retrievers to 74% ± 4.5% in highly conditioned Labrador retrievers. Daily water turnover was as high as 45% of TBW during exercise in cold conditions. There was no effect of sex or speed on daily water turnover. There was good agreement between results calculated using the indicator dilution approach and those calculated using a semilog linear regression approach when indicator isotope was administered IV or SC. Water requirements are influenced primarily by the amount of work done. SC administration of isotope-labeled water offers a simple and accurate alternative method for metabolic studies. © 2018 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.

  8. Single-chip source-free terahertz spectroscope across 0.04-0.99 THz: combining sub-wavelength near-field sensing and regression analysis.

    PubMed

    Wu, Xue; Sengupta, Kaushik

    2018-03-19

    This paper demonstrates a methodology to miniaturize THz spectroscopes into a single silicon chip by eliminating traditional solid-state architectural components such as complex tunable THz and optical sources, nonlinear mixing and amplifiers. The proposed method achieves this by extracting incident THz spectral signatures from the surface of an on-chip antenna itself. The information is sensed through the spectrally-sensitive 2D distribution of the impressed current surface under the THz incident field. By converting the antenna from a single-port to a massively multi-port architecture with integrated electronics and deep subwavelength sensing, THz spectral estimation is converted into a linear estimation problem. We employ rigorous regression techniques and analysis to demonstrate a single silicon chip system operating at room temperature across 0.04-0.99 THz with 10 MHz accuracy in spectrum estimation of THz tones across the entire spectrum.

  9. Development and validation of multivariate calibration methods for simultaneous estimation of Paracetamol, Enalapril maleate and hydrochlorothiazide in pharmaceutical dosage form

    NASA Astrophysics Data System (ADS)

    Singh, Veena D.; Daharwal, Sanjay J.

    2017-01-01

    Three multivariate calibration spectrophotometric methods were developed for simultaneous estimation of Paracetamol (PARA), Enalapril maleate (ENM) and Hydrochlorothiazide (HCTZ) in tablet dosage form; namely multi-linear regression calibration (MLRC), trilinear regression calibration method (TLRC) and classical least square (CLS) method. The selectivity of the proposed methods were studied by analyzing the laboratory prepared ternary mixture and successfully applied in their combined dosage form. The proposed methods were validated as per ICH guidelines and good accuracy; precision and specificity were confirmed within the concentration range of 5-35 μg mL- 1, 5-40 μg mL- 1 and 5-40 μg mL- 1of PARA, HCTZ and ENM, respectively. The results were statistically compared with reported HPLC method. Thus, the proposed methods can be effectively useful for the routine quality control analysis of these drugs in commercial tablet dosage form.

  10. Indirect synthesis of multi-degree of freedom transient systems. [linear programming for a kinematically linear system

    NASA Technical Reports Server (NTRS)

    Pilkey, W. D.; Chen, Y. H.

    1974-01-01

    An indirect synthesis method is used in the efficient optimal design of multi-degree of freedom, multi-design element, nonlinear, transient systems. A limiting performance analysis which requires linear programming for a kinematically linear system is presented. The system is selected using system identification methods such that the designed system responds as closely as possible to the limiting performance. The efficiency is a result of the method avoiding the repetitive systems analyses accompanying other numerical optimization methods.

  11. SAND: an automated VLBI imaging and analysing pipeline - I. Stripping component trajectories

    NASA Astrophysics Data System (ADS)

    Zhang, M.; Collioud, A.; Charlot, P.

    2018-02-01

    We present our implementation of an automated very long baseline interferometry (VLBI) data-reduction pipeline that is dedicated to interferometric data imaging and analysis. The pipeline can handle massive VLBI data efficiently, which makes it an appropriate tool to investigate multi-epoch multiband VLBI data. Compared to traditional manual data reduction, our pipeline provides more objective results as less human interference is involved. The source extraction is carried out in the image plane, while deconvolution and model fitting are performed in both the image plane and the uv plane for parallel comparison. The output from the pipeline includes catalogues of CLEANed images and reconstructed models, polarization maps, proper motion estimates, core light curves and multiband spectra. We have developed a regression STRIP algorithm to automatically detect linear or non-linear patterns in the jet component trajectories. This algorithm offers an objective method to match jet components at different epochs and to determine their proper motions.

  12. Conventional multi-slice computed tomography (CT) and cone-beam CT (CBCT) for computer-aided implant placement. Part II: reliability of mucosa-supported stereolithographic guides.

    PubMed

    Arisan, Volkan; Karabuda, Zihni Cüneyt; Pişkin, Bülent; Özdemir, Tayfun

    2013-12-01

    Deviations of implants that were placed by conventional computed tomography (CT)- or cone beam CT (CBCT)-derived mucosa-supported stereolithographic (SLA) surgical guides were analyzed in this study. Eleven patients were randomly scanned by a multi-slice CT (CT group) or a CBCT scanner (CBCT group). A total of 108 implants were planned on the software and placed using SLA guides. A new CT or CBCT scan was obtained and merged with the planning data to identify the deviations between the planned and placed implants. Results were analyzed by Mann-Whitney U test and multiple regressions (p < .05). Mean angular and linear deviations in the CT group were 3.30° (SD 0.36), and 0.75 (SD 0.32) and 0.80 mm (SD 0.35) at the implant shoulder and tip, respectively. In the CBCT group, mean angular and linear deviations were 3.47° (SD 0.37), and 0.81 (SD 0.32) and 0.87 mm (SD 0.32) at the implant shoulder and tip, respectively. No statistically significant differences were detected between the CT and CBCT groups (p = .169 and p = .551, p = .113 for angular and linear deviations, respectively). Implant placement via CT- or CBCT-derived mucosa-supported SLA guides yielded similar deviation values. Results should be confirmed on alternative CBCT scanners. © 2012 Wiley Periodicals, Inc.

  13. Applying the methodology of Design of Experiments to stability studies: a Partial Least Squares approach for evaluation of drug stability.

    PubMed

    Jordan, Nika; Zakrajšek, Jure; Bohanec, Simona; Roškar, Robert; Grabnar, Iztok

    2018-05-01

    The aim of the present research is to show that the methodology of Design of Experiments can be applied to stability data evaluation, as they can be seen as multi-factor and multi-level experimental designs. Linear regression analysis is usually an approach for analyzing stability data, but multivariate statistical methods could also be used to assess drug stability during the development phase. Data from a stability study for a pharmaceutical product with hydrochlorothiazide (HCTZ) as an unstable drug substance was used as a case example in this paper. The design space of the stability study was modeled using Umetrics MODDE 10.1 software. We showed that a Partial Least Squares model could be used for a multi-dimensional presentation of all data generated in a stability study and for determination of the relationship among factors that influence drug stability. It might also be used for stability predictions and potentially for the optimization of the extent of stability testing needed to determine shelf life and storage conditions, which would be time and cost-effective for the pharmaceutical industry.

  14. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    PubMed

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of acute myocardial infarction (AMI) from 1999 to 2013 in Tianjin incidence rate with Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on actual population, CAT test had much stronger statistical power than linear regression analysis for both overall incidence trend and age specific incidence trend (Cochran-Armitage trend P value

  15. Simulation of groundwater level variations using wavelet combined with neural network, linear regression and support vector machine

    NASA Astrophysics Data System (ADS)

    Ebrahimi, Hadi; Rajaee, Taher

    2017-01-01

    Simulation of groundwater level (GWL) fluctuations is an important task in management of groundwater resources. In this study, the effect of wavelet analysis on the training of the artificial neural network (ANN), multi linear regression (MLR) and support vector regression (SVR) approaches was investigated, and the ANN, MLR and SVR along with the wavelet-ANN (WNN), wavelet-MLR (WLR) and wavelet-SVR (WSVR) models were compared in simulating one-month-ahead of GWL. The only variable used to develop the models was the monthly GWL data recorded over a period of 11 years from two wells in the Qom plain, Iran. The results showed that decomposing GWL time series into several sub-time series, extremely improved the training of the models. For both wells 1 and 2, the Meyer and Db5 wavelets produced better results compared to the other wavelets; which indicated wavelet types had similar behavior in similar case studies. The optimal number of delays was 6 months, which seems to be due to natural phenomena. The best WNN model, using Meyer mother wavelet with two decomposition levels, simulated one-month-ahead with RMSE values being equal to 0.069 m and 0.154 m for wells 1 and 2, respectively. The RMSE values for the WLR model were 0.058 m and 0.111 m, and for WSVR model were 0.136 m and 0.060 m for wells 1 and 2, respectively.

  16. Time-resolved perfusion imaging at the angiography suite: preclinical comparison of a new flat-detector application to computed tomography perfusion.

    PubMed

    Jürgens, Julian H W; Schulz, Nadine; Wybranski, Christian; Seidensticker, Max; Streit, Sebastian; Brauner, Jan; Wohlgemuth, Walter A; Deuerling-Zheng, Yu; Ricke, Jens; Dudeck, Oliver

    2015-02-01

    The objective of this study was to compare the parameter maps of a new flat-panel detector application for time-resolved perfusion imaging in the angiography room (FD-CTP) with computed tomography perfusion (CTP) in an experimental tumor model. Twenty-four VX2 tumors were implanted into the hind legs of 12 rabbits. Three weeks later, FD-CTP (Artis zeego; Siemens) and CTP (SOMATOM Definition AS +; Siemens) were performed. The parameter maps for the FD-CTP were calculated using a prototype software, and those for the CTP were calculated with VPCT-body software on a dedicated syngo MultiModality Workplace. The parameters were compared using Pearson product-moment correlation coefficient and linear regression analysis. The Pearson product-moment correlation coefficient showed good correlation values for both the intratumoral blood volume of 0.848 (P < 0.01) and the blood flow of 0.698 (P < 0.01). The linear regression analysis of the perfusion between FD-CTP and CTP showed for the blood volume a regression equation y = 4.44x + 36.72 (P < 0.01) and for the blood flow y = 0.75x + 14.61 (P < 0.01). This preclinical study provides evidence that FD-CTP allows a time-resolved (dynamic) perfusion imaging of tumors similar to CTP, which provides the basis for clinical applications such as the assessment of tumor response to locoregional therapies directly in the angiography suite.

  17. [Ultrasonic measurements of fetal thalamus, caudate nucleus and lenticular nucleus in prenatal diagnosis].

    PubMed

    Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei

    2015-05-19

    To establish the reference values of thalamus, caudate nucleus and lenticular nucleus diameters through fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014. And the transverse and length diameters of thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curve of fetal diameter changes and gestational weeks of pregnancy. P < 0.05 was considered as having statistical significance. The linear regression equation of fetal thalamic length diameter and gestational week was: Y = 0.051X+0.201, R = 0.876, linear regression equation of thalamic transverse diameter and fetal gestational week was: Y = 0.031X+0.229, R = 0.817, linear regression equation of fetal head of caudate nucleus length diameter and gestational age was: Y = 0.033X+0.101, R = 0.722, linear regression equation of fetal head of caudate nucleus transverse diameter and gestational week was: R = 0.025 - 0.046, R = 0.711, linear regression equation of fetal lentiform nucleus length diameter and gestational week was: Y = 0.046+0.229, R = 0.765, linear regression equation of fetal lentiform nucleus diameter and gestational week was: Y = 0.025 - 0.05, R = 0.772. Ultrasonic measurement of diameter of fetal thalamus caudate nucleus, and lenticular nucleus through thalamic transverse section is simple and convenient. And measurements increase with fetal gestational weeks and there is linear regression relationship between them.

  18. Local Linear Regression for Data with AR Errors.

    PubMed

    Li, Runze; Li, Yan

    2009-07-01

    In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into the local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using local linear regression method and the profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of real data set.

  19. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…

  20. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate the simple linear regression. The first one is based on the famous Galton's data set on heredity. We use the lm R command and get coefficients estimates, standard error of the error, R2, residuals …In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate on multiple linear regression. This pratical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).

  1. Integration of least angle regression with empirical Bayes for multi-locus genome-wide association studies

    USDA-ARS?s Scientific Manuscript database

    Multi-locus genome-wide association studies has become the state-of-the-art procedure to identify quantitative trait loci (QTL) associated with traits simultaneously. However, implementation of multi-locus model is still difficult. In this study, we integrated least angle regression with empirical B...

  2. Morse Code, Scrabble, and the Alphabet

    ERIC Educational Resources Information Center

    Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss

    2004-01-01

    In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…

  3. A comparison of two adaptive multivariate analysis methods (PLSR and ANN) for winter wheat yield forecasting using Landsat-8 OLI images

    NASA Astrophysics Data System (ADS)

    Chen, Pengfei; Jing, Qi

    2017-02-01

    An assumption that the non-linear method is more reasonable than the linear method when canopy reflectance is used to establish the yield prediction model was proposed and tested in this study. For this purpose, partial least squares regression (PLSR) and artificial neural networks (ANN), represented linear and non-linear analysis method, were applied and compared for wheat yield prediction. Multi-period Landsat-8 OLI images were collected at two different wheat growth stages, and a field campaign was conducted to obtain grain yields at selected sampling sites in 2014. The field data were divided into a calibration database and a testing database. Using calibration data, a cross-validation concept was introduced for the PLSR and ANN model construction to prevent over-fitting. All models were tested using the test data. The ANN yield-prediction model produced R2, RMSE and RMSE% values of 0.61, 979 kg ha-1, and 10.38%, respectively, in the testing phase, performing better than the PLSR yield-prediction model, which produced R2, RMSE, and RMSE% values of 0.39, 1211 kg ha-1, and 12.84%, respectively. Non-linear method was suggested as a better method for yield prediction.

  4. Covariate Selection for Multilevel Models with Missing Data

    PubMed Central

    Marino, Miguel; Buxton, Orfeu M.; Li, Yi

    2017-01-01

    Missing covariate data hampers variable selection in multilevel regression settings. Current variable selection techniques for multiply-imputed data commonly address missingness in the predictors through list-wise deletion and stepwise-selection methods which are problematic. Moreover, most variable selection methods are developed for independent linear regression models and do not accommodate multilevel mixed effects regression models with incomplete covariate data. We develop a novel methodology that is able to perform covariate selection across multiply-imputed data for multilevel random effects models when missing data is present. Specifically, we propose to stack the multiply-imputed data sets from a multiple imputation procedure and to apply a group variable selection procedure through group lasso regularization to assess the overall impact of each predictor on the outcome across the imputed data sets. Simulations confirm the advantageous performance of the proposed method compared with the competing methods. We applied the method to reanalyze the Healthy Directions-Small Business cancer prevention study, which evaluated a behavioral intervention program targeting multiple risk-related behaviors in a working-class, multi-ethnic population. PMID:28239457

  5. Depressive disorder in pregnant Latin women: does intimate partner violence matter?

    PubMed

    Fonseca-Machado, Mariana de Oliveira; Alves, Lisiane Camargo; Monteiro, Juliana Cristina Dos Santos; Stefanello, Juliana; Nakano, Ana Márcia Spanó; Haas, Vanderlei José; Gomes-Sponholz, Flávia

    2015-05-01

    To identify the association of antenatal depressive symptoms with intimate partner violence during the current pregnancy in Brazilian women. Intimate partner violence is an important risk factor for antenatal depression. To the authors' knowledge, there has been no study to date that assessed the association between intimate partner violence during pregnancy and antenatal depressive symptoms among Brazilian women. Cross-sectional study. Three hundred and fifty-eight pregnant women were enrolled in the study. The Edinburgh Postnatal Depression Scale and an adapted version of the instrument used in the World Health Organization Multi-country Study on Women's Health and Domestic Violence were used to measure antenatal depressive symptoms and psychological, physical and sexual acts of intimate partner violence during the current pregnancy respectively. Multiple logistic regression and multiple linear regression were used for data analysis. The prevalence of antenatal depressive symptoms, as determined by the cut-off score of 12 in the Edinburgh Postnatal Depression Scale, was 28·2% (101). Of the participants, 63 (17·6%) reported some type of intimate partner violence during pregnancy. Among them, 60 (95·2%) reported suffering psychological violence, 23 (36·5%) physical violence and one (1·6%) sexual violence. Multiple logistic regression and multiple linear regression indicated that antenatal depressive symptoms are extremely associated with intimate partner violence during pregnancy. Among Brazilian women, exposure to intimate partner violence during pregnancy increases the chances of experiencing antenatal depressive symptoms. Clinical nurses and nurses midwifes should pay attention to the particularities of Brazilian women, especially with regard to the occurrence of intimate partner violence, whose impacts on the mental health of this population are extremely significant, both during the gestational period and postpartum. © 2015 John Wiley & Sons Ltd.

  6. Very-short-term wind power prediction by a hybrid model with single- and multi-step approaches

    NASA Astrophysics Data System (ADS)

    Mohammed, E.; Wang, S.; Yu, J.

    2017-05-01

    Very-short-term wind power prediction (VSTWPP) has played an essential role for the operation of electric power systems. This paper aims at improving and applying a hybrid method of VSTWPP based on historical data. The hybrid method is combined by multiple linear regressions and least square (MLR&LS), which is intended for reducing prediction errors. The predicted values are obtained through two sub-processes:1) transform the time-series data of actual wind power into the power ratio, and then predict the power ratio;2) use the predicted power ratio to predict the wind power. Besides, the proposed method can include two prediction approaches: single-step prediction (SSP) and multi-step prediction (MSP). WPP is tested comparatively by auto-regressive moving average (ARMA) model from the predicted values and errors. The validity of the proposed hybrid method is confirmed in terms of error analysis by using probability density function (PDF), mean absolute percent error (MAPE) and means square error (MSE). Meanwhile, comparison of the correlation coefficients between the actual values and the predicted values for different prediction times and window has confirmed that MSP approach by using the hybrid model is the most accurate while comparing to SSP approach and ARMA. The MLR&LS is accurate and promising for solving problems in WPP.

  7. Evaluation of a Microbiological Multi-Residue System on the detection of antibacterial substances in ewe milk.

    PubMed

    Althaus, Rafael; Berruga, Maria Isabel; Montero, Ana; Roca, Marta; Molina, Maria Pilar

    2009-01-19

    To protect both, public health and the dairy industry, from the presence of antibiotic residues in milk, control programmes have been established, which include the needed screening tests. This work focuses on the application of a Microbiological Multi-Residue System in ewe milk, a method based on the use of six different plates, each seeded with one of the following bacteria: Geobacillus stearothermophilus var. calidolactis (beta-lactams), Bacillus subtilis at pH 8.0 (aminoglycosides), Kocuria rhizophila (macrolides), Escherichia coli (quinolones), B. cereus (tetracyclines) and B. subtilis at pH 7.0 (sulphonamides), respectively. Twenty-three antimicrobial substances were analysed and a logistic regression was established for each substance assayed to relate the antibiotic concentration and the zone of microbial growth inhibition. Great linearity in the response was observed (regression coefficients of over 0.97). This fact suggests the possibility of establishing a decision level of antibiotic concentrations near to the Maximum Residue Limits (MRL). Zones of inhibition were suggested as proposed action levels for the different antimicrobial groups (diameters of inhibition of 18 mm for the aminoglycoside, beta-lactam and sulphonamide plates; 19 mm for the tetracycline plate, 21 mm for the macrolide plate, and 24 mm for the quinolone plate). Specificity and cross-reactivity were also assayed.

  8. Monitoring Springs in the Mojave Desert Using Landsat Time Series Analysis

    NASA Technical Reports Server (NTRS)

    Potter, Christopher S.

    2018-01-01

    The purpose of this study, based on Landsat satellite data was to characterize variations and trends over 30 consecutive years (1985-2016) in perennial vegetation green cover at over 400 confirmed Mojave Desert spring locations. These springs were surveyed between in 2015 and 2016 on lands managed in California by the U.S. Bureau of Land Management (BLM) and on several land trusts within the Barstow, Needles, and Ridgecrest BLM Field Offices. The normalized difference vegetation index (NDVI) from July Landsat images was computed at each spring location and a trend model was first fit to the multi-year NDVI time series using least squares linear regression.Â

  9. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  10. Reversed inverse regression for the univariate linear calibration and its statistical properties derived using a new methodology

    NASA Astrophysics Data System (ADS)

    Kang, Pilsang; Koo, Changhoi; Roh, Hokyu

    2017-11-01

    Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.

  11. A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.

    PubMed

    Ferrari, Alberto; Comelli, Mario

    2016-12-01

    In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. This clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods is available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; plus, we describe results from the application of these methods on data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  13. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life study has an important role in health care especially in chronic diseases, in clinical judgment and in medical resources supplying. Statistical tools like linear regression are widely used to assess the predictors of quality of life. But when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients, using quantile regression model and compare to linear regression. A cross-sectional study conducted on 119 breast cancer patients that admitted and treated in chemotherapy ward of Namazi hospital in Shiraz. We used QLQ-C30 questionnaire to assessment quality of life in these patients. A quantile regression was employed to assess the assocciated factors and the results were compared to linear regression. All analysis carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In spite of linear regression, financial difficulties were not significant in quantile regression analysis and dyspnea was only significant for first quartile. Also emotion functioning and duration of disease statistically predicted the QOL score in the third quartile. The results have demonstrated that using quantile regression leads to better interpretation and richer inference about predictors of the breast cancer patient quality of life.

  14. Interpretation of commonly used statistical regression models.

    PubMed

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.

  15. Immobilization of Cellulase from Bacillus subtilis UniMAP-KB01 on Multi-walled Carbon Nanotubes for Biofuel Production

    NASA Astrophysics Data System (ADS)

    Naresh, Sandrasekaran; Hoong Shuit, Siew; Kunasundari, Balakrishnan; Hoo Peng, Yong; Qi, Hwa Ng; Teoh, Yi Peng

    2018-03-01

    Bacillus subtilis UniMAP-KB01, a cellulase producer was isolated from Malaysian mangrove soil. Through morphological identification it was observed that the B. subtilis appears to be in rod shaped and identified as a gram positive bacterium. Growth profile of isolated B. subtilis was established by measuring optical density (OD) at 600 nm for every 1 hour intervals. Polymath software was employed to plot the growth profile and the non-linear plot established gave the precision value of linear regression, R2 of 0.9602, root mean square deviation (RMSD) of 0.0176 and variance of 0.0025. The hydrolysis capacity testing revealed the cellulolytic index of 2.83 ± 0.46 after stained with Gram’s Iodine. The harvested crude enzyme after 24 hours incubation in carboxymethylcellulose (CMC) broth at 45°C and 100 RPM, was tested for enzyme activity. Through Filter Paper Assay (FPA), the cellulase activity was calculated to be 0.05 U/mL. The hydrolysis capacity testing and FPA shown an acceptable value for thermophilic bacterial enzyme activity. Thus, this isolated strain reasoned to be potential for producing thermostable cellulase which will be immobilized onto multi-walled carbon nanotubes and the cellulolytic activity will be characterized for biofuel production.

  16. Multi-disease analysis of maternal antibody decay using non-linear mixed models accounting for censoring.

    PubMed

    Goeyvaerts, Nele; Leuridan, Elke; Faes, Christel; Van Damme, Pierre; Hens, Niel

    2015-09-10

    Biomedical studies often generate repeated measures of multiple outcomes on a set of subjects. It may be of interest to develop a biologically intuitive model for the joint evolution of these outcomes while assessing inter-subject heterogeneity. Even though it is common for biological processes to entail non-linear relationships, examples of multivariate non-linear mixed models (MNMMs) are still fairly rare. We contribute to this area by jointly analyzing the maternal antibody decay for measles, mumps, rubella, and varicella, allowing for a different non-linear decay model for each infectious disease. We present a general modeling framework to analyze multivariate non-linear longitudinal profiles subject to censoring, by combining multivariate random effects, non-linear growth and Tobit regression. We explore the hypothesis of a common infant-specific mechanism underlying maternal immunity using a pairwise correlated random-effects approach and evaluating different correlation matrix structures. The implied marginal correlation between maternal antibody levels is estimated using simulations. The mean duration of passive immunity was less than 4 months for all diseases with substantial heterogeneity between infants. The maternal antibody levels against rubella and varicella were found to be positively correlated, while little to no correlation could be inferred for the other disease pairs. For some pairs, computational issues occurred with increasing correlation matrix complexity, which underlines the importance of further developing estimation methods for MNMMs. Copyright © 2015 John Wiley & Sons, Ltd.

  17. Dosing algorithm for warfarin using CYP2C9 and VKORC1 genotyping from a multi-ethnic population: comparison with other equations.

    PubMed

    Wu, Alan H B; Wang, Ping; Smith, Andrew; Haller, Christine; Drake, Katherine; Linder, Mark; Valdes, Roland

    2008-02-01

    Polymorphism in the genes for cytochrome (CYP)2C9 and the vitamin K epoxide reductase complex subunit 1 (VKORC1) affect the pharmacokinetics and pharmacodynamics of warfarin. We developed and validated a warfarin-dosing algorithm for a multi-ethnic population that predicts the best dose for stable anticoagulation, and compared its performance against other regression equations. We determined the allele and haplotype frequencies of genes for CYP2C9 and VKORC1 on 167 Caucasian, African-American, Asian and Hispanic patients on warfarin. On a subset where complete data were available (n=92), we developed a dosing equation that predicts the actual dose needed to maintain target anticoagulation using demographic variables and genotypes. This regression was validated against an independent group of subjects. We also applied our data to five other published warfarin-dosing equations. The allele frequency for CYP2C9*2 and *3 and the A allele for VKORC1 3673 was similar to previously published reports. For Caucasians and Asians, VKORC1 SNPs were in Hardy-Weinberg linkage equilibrium. Some VKORC1 SNPs among the African-American population and one SNP among Hispanics were not in equilibrium. The linear regression of predicted versus actual warfarin dose produced r-values of 0.71 for the training set and 0.67 for the validation set. The regression coefficient improved (to r=0.78 and 0.75, respectively) when rare genotypes were eliminated or when the 7566 VKORC1 genotype was added to the model. All of the regression models tested produced a similar degree of correlation. The exclusion of rare genotypes that are more associated with certain ethnicities improved the model. Minor improvements in algorithms can be observed with the inclusion of ethnicity and more CYP2C9 and VKORC1 SNPs as variables. Major improvements will likely require the identification of new gene associations with warfarin dosing.

  18. Use of probabilistic weights to enhance linear regression myoelectric control

    NASA Astrophysics Data System (ADS)

    Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.

    2015-12-01

    Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.

  19. Automated retrieval of forest structure variables based on multi-scale texture analysis of VHR satellite imagery

    NASA Astrophysics Data System (ADS)

    Beguet, Benoit; Guyon, Dominique; Boukir, Samia; Chehata, Nesrine

    2014-10-01

    The main goal of this study is to design a method to describe the structure of forest stands from Very High Resolution satellite imagery, relying on some typical variables such as crown diameter, tree height, trunk diameter, tree density and tree spacing. The emphasis is placed on the automatization of the process of identification of the most relevant image features for the forest structure retrieval task, exploiting both spectral and spatial information. Our approach is based on linear regressions between the forest structure variables to be estimated and various spectral and Haralick's texture features. The main drawback of this well-known texture representation is the underlying parameters which are extremely difficult to set due to the spatial complexity of the forest structure. To tackle this major issue, an automated feature selection process is proposed which is based on statistical modeling, exploring a wide range of parameter values. It provides texture measures of diverse spatial parameters hence implicitly inducing a multi-scale texture analysis. A new feature selection technique, we called Random PRiF, is proposed. It relies on random sampling in feature space, carefully addresses the multicollinearity issue in multiple-linear regression while ensuring accurate prediction of forest variables. Our automated forest variable estimation scheme was tested on Quickbird and Pléiades panchromatic and multispectral images, acquired at different periods on the maritime pine stands of two sites in South-Western France. It outperforms two well-established variable subset selection techniques. It has been successfully applied to identify the best texture features in modeling the five considered forest structure variables. The RMSE of all predicted forest variables is improved by combining multispectral and panchromatic texture features, with various parameterizations, highlighting the potential of a multi-resolution approach for retrieving forest structure variables from VHR satellite images. Thus an average prediction error of ˜ 1.1 m is expected on crown diameter, ˜ 0.9 m on tree spacing, ˜ 3 m on height and ˜ 0.06 m on diameter at breast height.

  20. High Maternal Blood Mercury Level Is Associated with Low Verbal IQ in Children.

    PubMed

    Jeong, Kyoung Sook; Park, Hyewon; Ha, Eunhee; Shin, Jiyoung; Hong, Yun Chul; Ha, Mina; Park, Hyesook; Kim, Bung Nyun; Lee, Boeun; Lee, Soo Jeong; Lee, Kyung Yeon; Kim, Ja Hyeong; Kim, Yangho

    2017-07-01

    The objective of the present study was to investigate the relationship of IQ in children with maternal blood mercury concentration during late pregnancy. The present study is a component of the Mothers and Children's Environmental Health (MOCEH) study, a multi-center birth cohort project in Korea that began in 2006. The study cohort consisted of 553 children whose mothers underwent testing for blood mercury during late pregnancy. The children were given the Korean language version of the Wechsler Preschool and Primary Scale of Intelligence, revised edition (WPPSI-R) at 60 months of age. Multivariate linear regression analysis, with adjustment for covariates, was used to assess the relationship between verbal, performance, and total IQ in children and blood mercury concentration of mothers during late pregnancy. The results of multivariate linear regression analysis indicated that a doubling of blood mercury was associated with the decrease in verbal and total IQ by 2.482 (95% confidence interval [CI], 0.749-4.214) and 2.402 (95% CI, 0.526-4.279), respectively, after adjustment. This inverse association remained after further adjustment for blood lead concentration. Fish intake is an effect modifier of child IQ. In conclusion, high maternal blood mercury level is associated with low verbal IQ in children. © 2017 The Korean Academy of Medical Sciences.

  1. Improving medium-range ensemble streamflow forecasts through statistical post-processing

    NASA Astrophysics Data System (ADS)

    Mendoza, Pablo; Wood, Andy; Clark, Elizabeth; Nijssen, Bart; Clark, Martyn; Ramos, Maria-Helena; Nowak, Kenneth; Arnold, Jeffrey

    2017-04-01

    Probabilistic hydrologic forecasts are a powerful source of information for decision-making in water resources operations. A common approach is the hydrologic model-based generation of streamflow forecast ensembles, which can be implemented to account for different sources of uncertainties - e.g., from initial hydrologic conditions (IHCs), weather forecasts, and hydrologic model structure and parameters. In practice, hydrologic ensemble forecasts typically have biases and spread errors stemming from errors in the aforementioned elements, resulting in a degradation of probabilistic properties. In this work, we compare several statistical post-processing techniques applied to medium-range ensemble streamflow forecasts obtained with the System for Hydromet Applications, Research and Prediction (SHARP). SHARP is a fully automated prediction system for the assessment and demonstration of short-term to seasonal streamflow forecasting applications, developed by the National Center for Atmospheric Research, University of Washington, U.S. Army Corps of Engineers, and U.S. Bureau of Reclamation. The suite of post-processing techniques includes linear blending, quantile mapping, extended logistic regression, quantile regression, ensemble analogs, and the generalized linear model post-processor (GLMPP). We assess and compare these techniques using multi-year hindcasts in several river basins in the western US. This presentation discusses preliminary findings about the effectiveness of the techniques for improving probabilistic skill, reliability, discrimination, sharpness and resolution.

  2. Depressive mood mediates the influence of social support on health-related quality of life in elderly, multimorbid patients

    PubMed Central

    2014-01-01

    Background It is not well established how psychosocial factors like social support and depression affect health-related quality of life in multimorbid and elderly patients. We investigated whether depressive mood mediates the influence of social support on health-related quality of life. Methods Cross-sectional data of 3,189 multimorbid patients from the baseline assessment of the German MultiCare cohort study were used. Mediation was tested using the approach described by Baron and Kenny based on multiple linear regression, and controlling for socioeconomic variables and burden of multimorbidity. Results Mediation analyses confirmed that depressive mood mediates the influence of social support on health-related quality of life (Sobel’s p < 0.001). Multiple linear regression showed that the influence of depressive mood (β = −0.341, p < 0.01) on health-related quality of life is greater than the influence of multimorbidity (β = −0.234, p < 0.01). Conclusion Social support influences health-related quality of life, but this association is strongly mediated by depressive mood. Depression should be taken into consideration in research on multimorbidity, and clinicians should be aware of its importance when caring for multimorbid patients. Trial registration ISRCTN89818205 PMID:24708815

  3. Bayesian Travel Time Inversion adopting Gaussian Process Regression

    NASA Astrophysics Data System (ADS)

    Mauerberger, S.; Holschneider, M.

    2017-12-01

    A major application in seismology is the determination of seismic velocity models. Travel time measurements are putting an integral constraint on the velocity between source and receiver. We provide insight into travel time inversion from a correlation-based Bayesian point of view. Therefore, the concept of Gaussian process regression is adopted to estimate a velocity model. The non-linear travel time integral is approximated by a 1st order Taylor expansion. A heuristic covariance describes correlations amongst observations and a priori model. That approach enables us to assess a proxy of the Bayesian posterior distribution at ordinary computational costs. No multi dimensional numeric integration nor excessive sampling is necessary. Instead of stacking the data, we suggest to progressively build the posterior distribution. Incorporating only a single evidence at a time accounts for the deficit of linearization. As a result, the most probable model is given by the posterior mean whereas uncertainties are described by the posterior covariance.As a proof of concept, a synthetic purely 1d model is addressed. Therefore a single source accompanied by multiple receivers is considered on top of a model comprising a discontinuity. We consider travel times of both phases - direct and reflected wave - corrupted by noise. Left and right of the interface are assumed independent where the squared exponential kernel serves as covariance.

  4. Assessing the Causality between Blood Pressure and Retinal Vascular Caliber through Mendelian Randomisation

    NASA Astrophysics Data System (ADS)

    Li, Ling-Jun; Liao, Jiemin; Cheung, Carol Yim-Lui; Ikram, M. Kamran; Shyong, Tai E.; Wong, Tien-Yin; Cheng, Ching-Yu

    2016-02-01

    We aimed to determine the association between blood pressure (BP) and retinal vascular caliber changes that were free from confounders and reverse causation by using Mendelian randomisation. A total of 6528 participants from a multi-ethnic cohort (Chinese, Malays, and Indians) in Singapore were included in this study. Retinal arteriolar and venular caliber was measured by a semi-automated computer program. Genotyping was done using Illumina 610-quad chips. Meta-analysis of association between BP, and retinal arteriolar and venular caliber across three ethnic groups was performed both in conventional linear regression and Mendelian randomisation framework with a genetic risk score of BP as an instrumental variable. In multiple linear regression models, each 10 mm Hg increase in systolic BP, diastolic BP, and mean arterial BP (MAP) was associated with significant decreases in retinal arteriolar caliber of a 1.4, 3.0, and 2.6 μm, and significant decreases in retinal venular caliber of a 0.6, 0.7, and 0.9 μm, respectively. In a Mendelian randomisation model, only associations between DBP and MAP and retinal arteriolar narrowing remained yet its significance was greatly reduced. Our data showed weak evidence of a causal relationship between elevated BP and retinal arteriolar narrowing.

  5. Emotional exhaustion and cognitive performance in apparently healthy teachers: a longitudinal multi-source study.

    PubMed

    Feuerhahn, Nicolas; Stamov-Roßnagel, Christian; Wolfram, Maren; Bellingrath, Silja; Kudielka, Brigitte M

    2013-10-01

    We investigate how emotional exhaustion (EE), the core component of burnout, relates to cognitive performance, job performance and health. Cognitive performance was assessed by self-rated cognitive stress symptoms, self-rated and peer-rated cognitive impairments in everyday tasks and a neuropsychological test of learning and memory (LGT-3); job performance and physical health were gauged by self-reports. Cross-sectional linear regression analyses in a sample of 100 teachers confirm that EE is negatively related to cognitive performance as assessed by self-rating and peer-rating as well as neuropsychological testing (all p < .05). Longitudinal linear regression analyses confirm similar trends (p < .10) for self-rated and peer-rated cognitive performance. Executive control deficits might explain impaired cognitive performance in EE. In longitudinal analyses, EE also significantly predicts physical health. Contrary to our expectations, EE does not affect job performance. When reversed causation is tested, none of the outcome variables at Time 1 predict EE at Time 2. This speaks against cognitive dysfunctioning serving as a vulnerability factor for exhaustion. In sum, results underpin the negative consequences of EE for cognitive performance and health, which are relevant for individuals and organizations alike. In this way, findings might contribute to the understanding of the burnout syndrome. Copyright © 2012 John Wiley & Sons, Ltd.

  6. Simplified large African carnivore density estimators from track indices.

    PubMed

    Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J

    2016-01-01

    The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not intercept at zero, but may fit the data better than linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. We did simple linear regression with intercept analysis and simple linear regression through the origin and used the confidence interval for ß in the linear model y  =  αx  + ß, Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant ( P  > 0.05). The other four models with intercept and the six models thorough origin were all significant ( P  < 0.05). The models using linear regression with intercept all included zero in the confidence interval for ß and the null hypothesis that ß = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km 2 or higher. To improve the current models, we need independent data to validate the models and data to test for non-linear relationship between track indices and true density at low densities.

  7. [From clinical judgment to linear regression model.

    PubMed

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R 2 ) indicates the importance of independent variables in the outcome.

  8. Fourier transform infrared reflectance spectra of latent fingerprints: a biometric gauge for the age of an individual.

    PubMed

    Hemmila, April; McGill, Jim; Ritter, David

    2008-03-01

    To determine if changes in fingerprint infrared spectra linear with age can be found, partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.

  9. Linearity versus Nonlinearity of Offspring-Parent Regression: An Experimental Study of Drosophila Melanogaster

    PubMed Central

    Gimelfarb, A.; Willis, J. H.

    1994-01-01

    An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818

  10. A linearly frequency-swept high-speed-rate multi-wavelength laser for optical coherence tomography

    NASA Astrophysics Data System (ADS)

    Wang, Qiyu; Wang, Zhaoying; Yuan, Quan; Ma, Rui; Du, Tao; Yang, Tianxin

    2017-02-01

    We proposed and demonstrated a linearly frequency-swept multi-wavelength laser source for optical coherence tomography (OCT) eliminating the need of wavenumber space resampling in the postprocessing progress. The source consists of a multi-wavelength fiber laser source (MFS) and an optical sweeping loop. In this novel laser source, an equally spaced multi-wavelength laser is swept simultaneously by a certain step each time in the frequency domain in the optical sweeping loop. The sweeping step is determined by radio frequency (RF) signal which can be precisely controlled. Thus the sweeping behavior strictly maintains a linear relationship between time and frequency. We experimentally achieved linear time-frequency sweeping at a sweeping rate of 400 kHz with our laser source.

  11. An Analysis of Terrestrial and Aquatic Environmental Controls of Riverine Dissolved Organic Carbon in the Conterminous United States

    DOE PAGES

    Yang, Qichun; Zhang, Xuesong; Xu, Xingya; ...

    2017-05-29

    Riverine carbon cycling is an important, but insufficiently investigated component of the global carbon cycle. Analyses of environmental controls on riverine carbon cycling are critical for improved understanding of mechanisms regulating carbon processing and storage along the terrestrial-aquatic continuum. Here, we compile and analyze riverine dissolved organic carbon (DOC) concentration data from 1402 United States Geological Survey (USGS) gauge stations to examine the spatial variability and environmental controls of DOC concentrations in the United States (U.S.) surface waters. DOC concentrations exhibit high spatial variability, with an average of 6.42 ± 6.47 mg C/ L (Mean ± Standard Deviation). In general,more » high DOC concentrations occur in the Upper Mississippi River basin and the Southeastern U.S., while low concentrations are mainly distributed in the Western U.S. Single-factor analysis indicates that slope of drainage areas, wetlands, forests, percentage of first-order streams, and instream nutrients (such as nitrogen and phosphorus) pronouncedly influence DOC concentrations, but the explanatory power of each bivariate model is lower than 35%. Analyses based on the general multi-linear regression models suggest DOC concentrations are jointly impacted by multiple factors. Soil properties mainly show positive correlations with DOC concentrations; forest and shrub lands have positive correlations with DOC concentrations, but urban area and croplands demonstrate negative impacts; total instream phosphorus and dam density correlate positively with DOC concentrations. Notably, the relative importance of these environmental controls varies substantially across major U.S. water resource regions. In addition, DOC concentrations and environmental controls also show significant variability from small streams to large rivers, which may be caused by changing carbon sources and removal rates by river orders. In sum, our results reveal that general multi-linear regression analysis of twenty one terrestrial and aquatic environmental factors can partially explain (56%) the DOC concentration variation. In conclusion, this study highlights the complexity of the interactions among these environmental factors in determining DOC concentrations, thus calls for processes-based, non-linear methodologies to constrain uncertainties in riverine DOC cycling.« less

  12. Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.

    PubMed

    Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C

    2014-03-01

    In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electro-myographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve similar performance as KRR at much lower computational costs. Especially ME, a physiologically inspired extension of linear regression represents a promising candidate for the next generation of prosthetic devices.

  13. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  14. An Expert System for the Evaluation of Cost Models

    DTIC Science & Technology

    1990-09-01

    contrast to the condition of equal error variance, called homoscedasticity. (Reference: Applied Linear Regression Models by John Neter - page 423...normal. (Reference: Applied Linear Regression Models by John Neter - page 125) Click Here to continue -> Autocorrelation Click Here for the index - Index...over time. Error terms correlated over time are said to be autocorrelated or serially correlated. (REFERENCE: Applied Linear Regression Models by John

  15. Assessment of mechanical properties of isolated bovine intervertebral discs from multi-parametric magnetic resonance imaging.

    PubMed

    Recuerda, Maximilien; Périé, Delphine; Gilbert, Guillaume; Beaudoin, Gilles

    2012-10-12

    The treatment planning of spine pathologies requires information on the rigidity and permeability of the intervertebral discs (IVDs). Magnetic resonance imaging (MRI) offers great potential as a sensitive and non-invasive technique for describing the mechanical properties of IVDs. However, the literature reported small correlation coefficients between mechanical properties and MRI parameters. Our hypothesis is that the compressive modulus and the permeability of the IVD can be predicted by a linear combination of MRI parameters. Sixty IVDs were harvested from bovine tails, and randomly separated in four groups (in-situ, digested-6h, digested-18h, digested-24h). Multi-parametric MRI acquisitions were used to quantify the relaxation times T1 and T2, the magnetization transfer ratio MTR, the apparent diffusion coefficient ADC and the fractional anisotropy FA. Unconfined compression, confined compression and direct permeability measurements were performed to quantify the compressive moduli and the hydraulic permeabilities. Differences between groups were evaluated from a one way ANOVA. Multi linear regressions were performed between dependent mechanical properties and independent MRI parameters to verify our hypothesis. A principal component analysis was used to convert the set of possibly correlated variables into a set of linearly uncorrelated variables. Agglomerative Hierarchical Clustering was performed on the 3 principal components. Multilinear regressions showed that 45 to 80% of the Young's modulus E, the aggregate modulus in absence of deformation HA0, the radial permeability kr and the axial permeability in absence of deformation k0 can be explained by the MRI parameters within both the nucleus pulposus and the annulus pulposus. The principal component analysis reduced our variables to two principal components with a cumulative variability of 52-65%, which increased to 70-82% when considering the third principal component. The dendograms showed a natural division into four clusters for the nucleus pulposus and into three or four clusters for the annulus fibrosus. The compressive moduli and the permeabilities of isolated IVDs can be assessed mostly by MT and diffusion sequences. However, the relationships have to be improved with the inclusion of MRI parameters more sensitive to IVD degeneration. Before the use of this technique to quantify the mechanical properties of IVDs in vivo on patients suffering from various diseases, the relationships have to be defined for each degeneration state of the tissue that mimics the pathology. Our MRI protocol associated to principal component analysis and agglomerative hierarchical clustering are promising tools to classify the degenerated intervertebral discs and further find biomarkers and predictive factors of the evolution of the pathologies.

  16. Multivariate decoding of brain images using ordinal regression.

    PubMed

    Doyle, O M; Ashburner, J; Zelaya, F O; Williams, S C R; Mehta, M A; Marquand, A F

    2013-11-01

    Neuroimaging data are increasingly being used to predict potential outcomes or groupings, such as clinical severity, drug dose response, and transitional illness states. In these examples, the variable (target) we want to predict is ordinal in nature. Conventional classification schemes assume that the targets are nominal and hence ignore their ranked nature, whereas parametric and/or non-parametric regression models enforce a metric notion of distance between classes. Here, we propose a novel, alternative multivariate approach that overcomes these limitations - whole brain probabilistic ordinal regression using a Gaussian process framework. We applied this technique to two data sets of pharmacological neuroimaging data from healthy volunteers. The first study was designed to investigate the effect of ketamine on brain activity and its subsequent modulation with two compounds - lamotrigine and risperidone. The second study investigates the effect of scopolamine on cerebral blood flow and its modulation using donepezil. We compared ordinal regression to multi-class classification schemes and metric regression. Considering the modulation of ketamine with lamotrigine, we found that ordinal regression significantly outperformed multi-class classification and metric regression in terms of accuracy and mean absolute error. However, for risperidone ordinal regression significantly outperformed metric regression but performed similarly to multi-class classification both in terms of accuracy and mean absolute error. For the scopolamine data set, ordinal regression was found to outperform both multi-class and metric regression techniques considering the regional cerebral blood flow in the anterior cingulate cortex. Ordinal regression was thus the only method that performed well in all cases. Our results indicate the potential of an ordinal regression approach for neuroimaging data while providing a fully probabilistic framework with elegant approaches for model selection. Copyright © 2013. Published by Elsevier Inc.

  17. Compound Identification Using Penalized Linear Regression on Metabolomics

    PubMed Central

    Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho

    2014-01-01

    Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values so that the data become high dimensional data suffering from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of the ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson’s correlation along with the penalized linear regression are proposed in this study. PMID:27212894

  18. Walk Score® and Transit Score® and Walking in the Multi-Ethnic Study of Atherosclerosis

    PubMed Central

    Hirsch, Jana A.; Moore, Kari A.; Evenson, Kelly R.; Rodriguez, Daniel A; Diez Roux, Ana V.

    2013-01-01

    Background Walk Score® and Transit Score® are open-source measures of the neighborhood built environment to support walking (“walkability”) and access to transportation. Purpose To investigate associations of Street Smart Walk Score and Transit Score with self-reported transport and leisure walking using data from a large multi-city and diverse population-based sample of adults. Methods Data from a sample of 4552 residents of Baltimore MD; Chicago IL; Forsyth County NC; Los Angeles CA; New York NY; and St. Paul MN from the Multi-Ethnic Study of Atherosclerosis (2010–2012) were linked to Walk Score and Transit Score (collected in 2012). Logistic and linear regression models estimated ORs of not walking and mean differences in minutes walked, respectively, associated with continuous and categoric Walk Score and Transit Score. All analyses were conducted in 2012. Results After adjustment for site, key sociodemographic, and health variables, a higher Walk Score was associated with lower odds of not walking for transport and more minutes/week of transport walking. Compared to those in a “walker’s paradise,” lower categories of Walk Score were associated with a linear increase in odds of not transport walking and a decline in minutes of leisure walking. An increase in Transit Score was associated with lower odds of not transport walking or leisure walking, and additional minutes/week of leisure walking. Conclusions Walk Score and Transit Score appear to be useful as measures of walkability in analyses of neighborhood effects. PMID:23867022

  19. New Approach To Hour-By-Hour Weather Forecast

    NASA Astrophysics Data System (ADS)

    Liao, Q. Q.; Wang, B.

    2017-12-01

    Fine hourly forecast in single station weather forecast is required in many human production and life application situations. Most previous MOS (Model Output Statistics) which used a linear regression model are hard to solve nonlinear natures of the weather prediction and forecast accuracy has not been sufficient at high temporal resolution. This study is to predict the future meteorological elements including temperature, precipitation, relative humidity and wind speed in a local region over a relatively short period of time at hourly level. By means of hour-to-hour NWP (Numeral Weather Prediction)meteorological field from Forcastio (https://darksky.net/dev/docs/forecast) and real-time instrumental observation including 29 stations in Yunnan and 3 stations in Tianjin of China from June to October 2016, predictions are made of the 24-hour hour-by-hour ahead. This study presents an ensemble approach to combine the information of instrumental observation itself and NWP. Use autoregressive-moving-average (ARMA) model to predict future values of the observation time series. Put newest NWP products into the equations derived from the multiple linear regression MOS technique. Handle residual series of MOS outputs with autoregressive (AR) model for the linear property presented in time series. Due to the complexity of non-linear property of atmospheric flow, support vector machine (SVM) is also introduced . Therefore basic data quality control and cross validation makes it able to optimize the model function parameters , and do 24 hours ahead residual reduction with AR/SVM model. Results show that AR model technique is better than corresponding multi-variant MOS regression method especially at the early 4 hours when the predictor is temperature. MOS-AR combined model which is comparable to MOS-SVM model outperform than MOS. Both of their root mean square error and correlation coefficients for 2 m temperature are reduced to 1.6 degree Celsius and 0.91 respectively. The forecast accuracy of 24- hour forecast deviation no more than 2 degree Celsius is 78.75 % for MOS-AR model and 81.23 % for AR model.

  20. Control Variate Selection for Multiresponse Simulation.

    DTIC Science & Technology

    1987-05-01

    M. H. Knuter, Applied Linear Regression Mfodels, Richard D. Erwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon...1982. Neter, J., V. Wasserman, and M. H. Knuter, Applied Linear Regression .fodels, Richard D. Erwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F...Aspects of J%,ultivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982. dY Neter, J., W. Wasserman, and M. H. Knuter, Applied Linear Regression Mfodels

  1. An Investigation of the Fit of Linear Regression Models to Data from an SAT[R] Validity Study. Research Report 2011-3

    ERIC Educational Resources Information Center

    Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael

    2011-01-01

    This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…

  2. High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.

    PubMed

    Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D

    2018-05-30

    NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.

  3. Quantile Regression in the Study of Developmental Sciences

    PubMed Central

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596

  4. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393

  5. Satellite remote sensing of fine particulate air pollutants over Indian mega cities

    NASA Astrophysics Data System (ADS)

    Sreekanth, V.; Mahesh, B.; Niranjan, K.

    2017-11-01

    In the backdrop of the need for high spatio-temporal resolution data on PM2.5 mass concentrations for health and epidemiological studies over India, empirical relations between Aerosol Optical Depth (AOD) and PM2.5 mass concentrations are established over five Indian mega cities. These relations are sought to predict the surface PM2.5 mass concentrations from high resolution columnar AOD datasets. Current study utilizes multi-city public domain PM2.5 data (from US Consulate and Embassy's air monitoring program) and MODIS AOD, spanning for almost four years. PM2.5 is found to be positively correlated with AOD. Station-wise linear regression analysis has shown spatially varying regression coefficients. Similar analysis has been repeated by eliminating data from the elevated aerosol prone seasons, which has improved the correlation coefficient. The impact of the day to day variability in the local meteorological conditions on the AOD-PM2.5 relationship has been explored by performing a multiple regression analysis. A cross-validation approach for the multiple regression analysis considering three years of data as training dataset and one-year data as validation dataset yielded an R value of ∼0.63. The study was concluded by discussing the factors which can improve the relationship.

  6. Extreme Sparse Multinomial Logistic Regression: A Fast and Robust Framework for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen

    2017-12-01

    Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.

  7. A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

    EPA Science Inventory

    We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...

  8. Enhancing oil removal from water by immobilizing multi-wall carbon nanotubes on the surface of polyurethane foam.

    PubMed

    Keshavarz, Alireza; Zilouei, Hamid; Abdolmaleki, Amir; Asadinezhad, Ahmad

    2015-07-01

    A surface modification method was carried out to enhance the light crude oil sorption capacity of polyurethane foam (PUF) through immobilization of multi-walled carbon nanotube (MWCNT) on the foam surface at various concentrations. The developed sorbent was characterized using scanning electron microscopy, Fourier transform infrared spectroscopy, thermogravimetric analysis, and tensile elongation test. The results obtained from thermogravimetric and tensile elongation tests showed the improvement of thermal and mechanical resistance of surface-modified foam. The experimental data also revealed that the immobilization of MWCNT on PUF surface enhanced the sorption capacity of light crude oil and reduced water sorption. The highest oil removal capacity was obtained for 1 wt% MWCNT on PUF surface which was 21.44% enhancement in light crude oil sorption compared to the blank PUF. The reusability of surface modified PUF was determined through four cycles of chemical regeneration using petroleum ether. The adsorption of light crude oil with 30 g initial mass showed that 85.45% of the initial oil sorption capacity of this modified sorbent was remained after four regeneration cycles. Equilibrium isotherms for adsorption of oil were analyzed by the Freundlich, Langmuir, Temkin, and Redlich-Peterson models through linear and non-linear regression methods. Results of equilibrium revealed that Langmuir isotherm is the best fitting model and non-linear method is a more accurate way to predict the parameters involved in the isotherms. The overall findings suggested the promising potentials of the developed sorbent in order to be efficiently used in large-scale oil spill cleanup. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Pseudo second order kinetics and pseudo isotherms for malachite green onto activated carbon: comparison of linear and non-linear regression methods.

    PubMed

    Kumar, K Vasanth; Sivanesan, S

    2006-08-25

    Pseudo second order kinetic expressions of Ho, Sobkowsk and Czerwinski, Blanachard et al. and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear method. Non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and Ritchie's pseudo second order model were the same. Non-linear regression analysis showed that both Blanachard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of experimental data in Ho's pseudo second order expression by linear and non-linear regression method showed that Ho pseudo second order model was a better kinetic expression when compared to other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, q(e), was predicted from Ho pseudo second order expression and were fitted to the Langmuir, Freundlich and Redlich Peterson expressions by both linear and non-linear method to obtain the pseudo isotherms. The best fitting pseudo isotherm was found to be the Langmuir and Redlich Peterson isotherm. Redlich Peterson is a special case of Langmuir when the constant g equals unity.

  10. Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors

    DTIC Science & Technology

    2015-07-15

    Long-term effects on cancer survivors’ quality of life of physical training versus physical training combined with cognitive-behavioral therapy ...COMPARISON OF NEURAL NETWORK AND LINEAR REGRESSION MODELS IN STATISTICALLY PREDICTING MENTAL AND PHYSICAL HEALTH STATUS OF BREAST...34Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors

  11. Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage

    NASA Astrophysics Data System (ADS)

    Cepowski, Tomasz

    2017-06-01

    The paper presents mathematical relationships that allow us to forecast the estimated main engine power of new container ships, based on data concerning vessels built in 2005-2015. The presented approximations allow us to estimate the engine power based on the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimation of container ship engine power needed in preliminary parametric design of the ship. It follows from the above that the use of multiple linear regression to predict the main engine power of a container ship brings more accurate solutions than simple linear regression.

  12. Solvent-free synthesis, spectral correlations and antimicrobial activities of some aryl E 2-propen-1-ones

    NASA Astrophysics Data System (ADS)

    Sathiyamoorthi, K.; Mala, V.; Sakthinathan, S. P.; Kamalakkannan, D.; Suresh, R.; Vanangamudi, G.; Thirunarayanan, G.

    2013-08-01

    Totally 38 aryl E 2-propen-1-ones including nine substituted styryl 4-iodophenyl ketones have been synthesised using solvent-free SiO2-H3PO4 catalyzed Aldol condensation between respective methyl ketones and substituted benzaldehydes under microwave irradiation. The yields of the ketones are more than 80%. The synthesised chalcones were characterized by their analytical, physical and spectroscopic data. The spectral frequencies of synthesised substituted styryl 4-iodophenyl ketones have been correlated with Hammett substituent constants, F and R parameters using single and multi-linear regression analysis. The antimicrobial activities of 4-iodophenyl chalcones have been studied using Bauer-Kirby method.

  13. Quantifying Main Trends in Lysozyme Nucleation: The Effect of Precipitant Concentration and Impurities

    NASA Technical Reports Server (NTRS)

    Burke, Michael W.; Judge, Russell A.; Pusey, Marc L.; Rose, M. Franklin (Technical Monitor)

    2000-01-01

    Full factorial experiment design incorporating multi-linear regression analysis of the experimental data allows the main trends and effects to be quickly identified while using only a limited number of experiments. These techniques were used to identify the effect of precipitant concentration and the presence of an impurity, the physiological lysozyme dimer, on the nucleation rate and crystal dimensions of the tetragonal form of chicken egg white lysozyme. Increasing precipitant concentration was found to decrease crystal numbers, the magnitude of this effect also depending on the supersaturation. The presence of the dimer generally increased nucleation. The crystal axial ratio decreased with increasing precipitant concentration independent of impurity.

  14. Estimation of Standard Error of Regression Effects in Latent Regression Models Using Binder's Linearization. Research Report. ETS RR-07-09

    ERIC Educational Resources Information Center

    Li, Deping; Oranje, Andreas

    2007-01-01

    Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…

  15. A multi-criteria evaluation system for marine litter pollution based on statistical analyses of OSPAR beach litter monitoring time series.

    PubMed

    Schulz, Marcus; Neumann, Daniel; Fleet, David M; Matthies, Michael

    2013-12-01

    During the last decades, marine pollution with anthropogenic litter has become a worldwide major environmental concern. Standardized monitoring of litter since 2001 on 78 beaches selected within the framework of the Convention for the Protection of the Marine Environment of the North-East Atlantic (OSPAR) has been used to identify temporal trends of marine litter. Based on statistical analyses of this dataset a two-part multi-criteria evaluation system for beach litter pollution of the North-East Atlantic and the North Sea is proposed. Canonical correlation analyses, linear regression analyses, and non-parametric analyses of variance were used to identify different temporal trends. A classification of beaches was derived from cluster analyses and served to define different states of beach quality according to abundances of 17 input variables. The evaluation system is easily applicable and relies on the above-mentioned classification and on significant temporal trends implied by significant rank correlations. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Regression assumptions in clinical psychology research practice-a systematic review of common misconceptions.

    PubMed

    Ernst, Anja F; Albers, Casper J

    2017-01-01

    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking.

  17. Regression assumptions in clinical psychology research practice—a systematic review of common misconceptions

    PubMed Central

    Ernst, Anja F.

    2017-01-01

    Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. These lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongfully held for a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA-recommendations. This paper appeals for a heightened awareness for and increased transparency in the reporting of statistical assumption checking. PMID:28533971

  18. Rapid Multi-Tracer PET Tumor Imaging With F-FDG and Secondary Shorter-Lived Tracers.

    PubMed

    Black, Noel F; McJames, Scott; Kadrmas, Dan J

    2009-10-01

    Rapid multi-tracer PET, where two to three PET tracers are rapidly scanned with staggered injections, can recover certain imaging measures for each tracer based on differences in tracer kinetics and decay. We previously showed that single-tracer imaging measures can be recovered to a certain extent from rapid dual-tracer (62)Cu - PTSM (blood flow) + (62)Cu - ATSM (hypoxia) tumor imaging. In this work, the feasibility of rapidly imaging (18)F-FDG plus one or two of these shorter-lived secondary tracers was evaluated in the same tumor model. Dynamic PET imaging was performed in four dogs with pre-existing tumors, and the raw scan data was combined to emulate 60 minute long dual- and triple-tracer scans, using the single-tracer scans as gold standards. The multi-tracer data were processed for static (SUV) and kinetic (K(1), K(net)) endpoints for each tracer, followed by linear regression analysis of multi-tracer versus single-tracer results. Static and quantitative dynamic imaging measures of FDG were both accurately recovered from the multi-tracer scans, closely matching the single-tracer FDG standards (R > 0.99). Quantitative blood flow information, as measured by PTSM K(1) and SUV, was also accurately recovered from the multi-tracer scans (R = 0.97). Recovery of ATSM kinetic parameters proved more difficult, though the ATSM SUV was reasonably well recovered (R = 0.92). We conclude that certain additional information from one to two shorter-lived PET tracers may be measured in a rapid multi-tracer scan alongside FDG without compromising the assessment of glucose metabolism. Such additional and complementary information has the potential to improve tumor characterization in vivo, warranting further investigation of rapid multi-tracer techniques.

  19. Rapid Multi-Tracer PET Tumor Imaging With 18F-FDG and Secondary Shorter-Lived Tracers

    PubMed Central

    Black, Noel F.; McJames, Scott; Kadrmas, Dan J.

    2009-01-01

    Rapid multi-tracer PET, where two to three PET tracers are rapidly scanned with staggered injections, can recover certain imaging measures for each tracer based on differences in tracer kinetics and decay. We previously showed that single-tracer imaging measures can be recovered to a certain extent from rapid dual-tracer 62Cu – PTSM (blood flow) + 62Cu — ATSM (hypoxia) tumor imaging. In this work, the feasibility of rapidly imaging 18F-FDG plus one or two of these shorter-lived secondary tracers was evaluated in the same tumor model. Dynamic PET imaging was performed in four dogs with pre-existing tumors, and the raw scan data was combined to emulate 60 minute long dual- and triple-tracer scans, using the single-tracer scans as gold standards. The multi-tracer data were processed for static (SUV) and kinetic (K1, Knet) endpoints for each tracer, followed by linear regression analysis of multi-tracer versus single-tracer results. Static and quantitative dynamic imaging measures of FDG were both accurately recovered from the multi-tracer scans, closely matching the single-tracer FDG standards (R > 0.99). Quantitative blood flow information, as measured by PTSM K1 and SUV, was also accurately recovered from the multi-tracer scans (R = 0.97). Recovery of ATSM kinetic parameters proved more difficult, though the ATSM SUV was reasonably well recovered (R = 0.92). We conclude that certain additional information from one to two shorter-lived PET tracers may be measured in a rapid multi-tracer scan alongside FDG without compromising the assessment of glucose metabolism. Such additional and complementary information has the potential to improve tumor characterization in vivo, warranting further investigation of rapid multi-tracer techniques. PMID:20046800

  20. Estimating linear temporal trends from aggregated environmental monitoring data

    USGS Publications Warehouse

    Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.

    2017-01-01

    Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs is often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling variation and process variations to be separated. We used simulated time-series to compare linear trend estimations from three state-space models, a simple linear regression model, and an auto-regressive model. We also compared the performance of these five models to estimate trends from a long term monitoring program. We specifically estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression had the best performance of all the given models because it was best able to recover parameters and had consistent numerical convergence. Conversely, the simple linear regression did the worst job estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when the models converged. We found that a simple linear regression performed better than more complex autoregression and state-space models when used to analyze aggregated environmental monitoring data.

  1. Wavelet-linear genetic programming: A new approach for modeling monthly streamflow

    NASA Astrophysics Data System (ADS)

    Ravansalar, Masoud; Rajaee, Taher; Kisi, Ozgur

    2017-06-01

    The streamflows are important and effective factors in stream ecosystems and its accurate prediction is an essential and important issue in water resources and environmental engineering systems. A hybrid wavelet-linear genetic programming (WLGP) model, which includes a discrete wavelet transform (DWT) and a linear genetic programming (LGP) to predict the monthly streamflow (Q) in two gauging stations, Pataveh and Shahmokhtar, on the Beshar River at the Yasuj, Iran were used in this study. In the proposed WLGP model, the wavelet analysis was linked to the LGP model where the original time series of streamflow were decomposed into the sub-time series comprising wavelet coefficients. The results were compared with the single LGP, artificial neural network (ANN), a hybrid wavelet-ANN (WANN) and Multi Linear Regression (MLR) models. The comparisons were done by some of the commonly utilized relevant physical statistics. The Nash coefficients (E) were found as 0.877 and 0.817 for the WLGP model, for the Pataveh and Shahmokhtar stations, respectively. The comparison of the results showed that the WLGP model could significantly increase the streamflow prediction accuracy in both stations. Since, the results demonstrate a closer approximation of the peak streamflow values by the WLGP model, this model could be utilized for the simulation of cumulative streamflow data prediction in one month ahead.

  2. Comparing The Effectiveness of a90/95 Calculations (Preprint)

    DTIC Science & Technology

    2006-09-01

    Nachtsheim, John Neter, William Li, Applied Linear Statistical Models , 5th ed., McGraw-Hill/Irwin, 2005 5. Mood, Graybill and Boes, Introduction...curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model There... linear . For example is a linear model ; is not. 2. Uniform variance (homoscedasticity

  3. Application of glas laser altimetry to detect elevation changes in East Antarctica

    NASA Astrophysics Data System (ADS)

    Scaioni, M.; Tong, X.; Li, R.

    2013-10-01

    In this paper the use of ICESat/GLAS laser altimeter for estimating multi-temporal elevation changes on polar ice sheets is afforded. Due to non-overlapping laser spots during repeat passes, interpolation methods are required to make comparisons. After reviewing the main methods described in the literature (crossover point analysis, cross-track DEM projection, space-temporal regressions), the last one has been chosen for its capability of providing more elevation change rate measurements. The standard implementation of the space-temporal linear regression technique has been revisited and improved to better cope with outliers and to check the estimability of model's parameters. GLAS data over the PANDA route in East Antarctica have been used for testing. Obtained results have been quite meaningful from a physical point of view, confirming the trend reported by the literature of a constant snow accumulation in the area during the two past decades, unlike the most part of the continent that has been losing mass.

  4. Intimate partner violence and anxiety disorders in pregnancy: the importance of vocational training of the nursing staff in facing them1

    PubMed Central

    Fonseca-Machado, Mariana de Oliveira; Monteiro, Juliana Cristina dos Santos; Haas, Vanderlei José; Abrão, Ana Cristina Freitas de Vilhena; Gomes-Sponholz, Flávia

    2015-01-01

    Objective: to identify the relationship between posttraumatic stress disorder, trait and state anxiety, and intimate partner violence during pregnancy. Method: observational, cross-sectional study developed with 358 pregnant women. The Posttraumatic Stress Disorder Checklist - Civilian Version was used, as well as the State-Trait Anxiety Inventory and an adapted version of the instrument used in the World Health Organization Multi-country Study on Women's Health and Domestic Violence. Results: after adjusting to the multiple logistic regression model, intimate partner violence, occurred during pregnancy, was associated with the indication of posttraumatic stress disorder. The adjusted multiple linear regression models showed that the victims of violence, in the current pregnancy, had higher symptom scores of trait and state anxiety than non-victims. Conclusion: recognizing the intimate partner violence as a clinically relevant and identifiable risk factor for the occurrence of anxiety disorders during pregnancy can be a first step in the prevention thereof. PMID:26487135

  5. Correlation and simple linear regression.

    PubMed

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.

  6. Estimation of continuous multi-DOF finger joint kinematics from surface EMG using a multi-output Gaussian Process.

    PubMed

    Ngeo, Jimson; Tamei, Tomoya; Shibata, Tomohiro

    2014-01-01

    Surface electromyographic (EMG) signals have often been used in estimating upper and lower limb dynamics and kinematics for the purpose of controlling robotic devices such as robot prosthesis and finger exoskeletons. However, in estimating multiple and a high number of degrees-of-freedom (DOF) kinematics from EMG, output DOFs are usually estimated independently. In this study, we estimate finger joint kinematics from EMG signals using a multi-output convolved Gaussian Process (Multi-output Full GP) that considers dependencies between outputs. We show that estimation of finger joints from muscle activation inputs can be improved by using a regression model that considers inherent coupling or correlation within the hand and finger joints. We also provide a comparison of estimation performance between different regression methods, such as Artificial Neural Networks (ANN) which is used by many of the related studies. We show that using a multi-output GP gives improved estimation compared to multi-output ANN and even dedicated or independent regression models.

  7. Gender difference in cycling speed and age of winning performers in ultra-cycling - the 508-mile "Furnace Creek" from 1983 to 2012.

    PubMed

    Rüst, Christoph Alexander; Rosemann, Thomas; Lepers, Romuald; Knechtle, Beat

    2015-01-01

    We analysed (i) the gender difference in cycling speed and (ii) the age of winning performers in the 508-mile "Furnace Creek 508". Changes in cycling speeds and gender differences from 1983 to 2012 were analysed using linear, non-linear and hierarchical multi-level regression analyses for the annual three fastest women and men. Cycling speed increased non-linearly in men from 14.6 (s = 0.3) km · h(-1) (1983) to 27.1 (s = 0.7) km · h(-1) (2012) and non-linearly in women from 11.0 (s = 0.3) km · h(-1) (1984) to 24.2 (s = 0.2) km · h(-1) (2012). The gender difference in cycling speed decreased linearly from 26.2 (s = 0.5)% (1984) to 10.7 (s = 1.9)% (2012). The age of winning performers increased from 26 (s = 2) years (1984) to 43 (s = 11) years (2012) in women and from 33 (s = 6) years (1983) to 50 (s = 5) years (2012) in men. To summarise, these results suggest that (i) women will be able to narrow the gender gap in cycling speed in the near future in an ultra-endurance cycling race such as the "Furnace Creek 508" due to the linear decrease in gender difference and (ii) the maturity of these athletes has changed during the last three decades where winning performers become older and faster across years.

  8. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    PubMed

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

    A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R 2 ), using R 2 as the primary metric of assay agreement. However, the use of R 2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  9. Multivariate Regression Analysis of Winter Ozone Events in the Uinta Basin of Eastern Utah, USA

    NASA Astrophysics Data System (ADS)

    Mansfield, M. L.

    2012-12-01

    I report on a regression analysis of a number of variables that are involved in the formation of winter ozone in the Uinta Basin of Eastern Utah. One goal of the analysis is to develop a mathematical model capable of predicting the daily maximum ozone concentration from values of a number of independent variables. The dependent variable is the daily maximum ozone concentration at a particular site in the basin. Independent variables are (1) daily lapse rate, (2) daily "basin temperature" (defined below), (3) snow cover, (4) midday solar zenith angle, (5) monthly oil production, (6) monthly gas production, and (7) the number of days since the beginning of a multi-day inversion event. Daily maximum temperature and daily snow cover data are available at ten or fifteen different sites throughout the basin. The daily lapse rate is defined operationally as the slope of the linear least-squares fit to the temperature-altitude plot, and the "basin temperature" is defined as the value assumed by the same least-squares line at an altitude of 1400 m. A multi-day inversion event is defined as a set of consecutive days for which the lapse rate remains positive. The standard deviation in the accuracy of the model is about 10 ppb. The model has been combined with historical climate and oil & gas production data to estimate historical ozone levels.

  10. U.S. Army Armament Research, Development and Engineering Center Grain Evaluation Software to Numerically Predict Linear Burn Regression for Solid Propellant Grain Geometries

    DTIC Science & Technology

    2017-10-01

    ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID PROPELLANT GRAIN GEOMETRIES Brian...author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation...U.S. ARMY ARMAMENT RESEARCH, DEVELOPMENT AND ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID

  11. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  12. A piecewise regression approach for determining biologically relevant hydraulic thresholds for the protection of fish at river infrastructure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boys, Craig A.; Robinson, Wayne; Miller, Brett

    2016-05-13

    Barotrauma injury can occur when fish are exposed to rapid decompression during downstream passage through river infrastructure. A piecewise regression approach was used to objectively quantify barotrauma injury thresholds in two physoclistous species (Murray cod Maccullochella peelii and silver perch Bidyanus bidyanus) following simulated infrastructure passage in barometric chambers. The probability of injuries such as swim bladder rupture; exophthalmia; and haemorrhage and emphysema in various organs increased as the ratio between the lowest exposure pressure and the acclimation pressure (ratio of pressure change RPCE/A) fell. The relationship was typically non-linear and piecewise regression was able to quantify thresholds in RPCE/Amore » that once exceeded resulted in a substantial increase in barotrauma injury. Thresholds differed among injury types and between species but by applying a multi-species precautionary principle, the maintenance of exposure pressures at river infrastructure above 70% of acclimation pressure (RPCE/A of 0.7) should sufficiently protect downstream migrating juveniles of these two physoclistous species. These findings have important implications for determining the risk posed by current infrastructures and informing the design and operation of new ones.« less

  13. A Constrained Linear Estimator for Multiple Regression

    ERIC Educational Resources Information Center

    Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.

    2010-01-01

    "Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…

  14. Limits of detection and decision. Part 3

    NASA Astrophysics Data System (ADS)

    Voigtman, E.

    2008-02-01

    It has been shown that the MARLAP (Multi-Agency Radiological Laboratory Analytical Protocols) for estimating the Currie detection limit, which is based on 'critical values of the non-centrality parameter of the non-central t distribution', is intrinsically biased, even if no calibration curve or regression is used. This completed the refutation of the method, begun in Part 2. With the field cleared of obstructions, the true theory underlying Currie's limits of decision, detection and quantification, as they apply in a simple linear chemical measurement system (CMS) having heteroscedastic, Gaussian measurement noise and using weighted least squares (WLS) processing, was then derived. Extensive Monte Carlo simulations were performed, on 900 million independent calibration curves, for linear, "hockey stick" and quadratic noise precision models (NPMs). With errorless NPM parameters, all the simulation results were found to be in excellent agreement with the derived theoretical expressions. Even with as much as 30% noise on all of the relevant NPM parameters, the worst absolute errors in rates of false positives and false negatives, was only 0.3%.

  15. On the design of classifiers for crop inventories

    NASA Technical Reports Server (NTRS)

    Heydorn, R. P.; Takacs, H. C.

    1986-01-01

    Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations in linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.

  16. Influence of Teachers' Personal Health Behaviors on Operationalizing Obesity Prevention Policy in Head Start Preschools: A Project of the Children's Healthy Living Program (CHL).

    PubMed

    Esquivel, Monica Kazlausky; Nigg, Claudio R; Fialkowski, Marie K; Braun, Kathryn L; Li, Fenfang; Novotny, Rachel

    2016-05-01

    To quantify the Head Start (HS) teacher mediating and moderating influence on the effect of a wellness policy intervention. Intervention trial within a larger randomized community trial. HS preschools in Hawaii. Twenty-three HS classrooms located within 2 previously randomized communities. Seven-month multi-component intervention with policy changes to food served and service style, initiatives for employee wellness, classroom activities for preschoolers promoting physical activity (PA) and healthy eating, and training and technical assistance. The Environment and Policy Assessment and Observation (EPAO) classroom scores and teacher questionnaires assessing on knowledge, beliefs, priorities, and misconceptions around child nutrition and changes in personal health behaviors and status were the main outcome measures. Paired t tests and linear regression analysis tested the intervention effects on the classroom and mediating and moderating effects of the teacher variables on the classroom environment. General linear model test showed greater intervention effect on the EPAO score where teachers reported higher than average improvements in their own health status and behaviors (estimate [SE] = -2.47 (0.78), P < .05). Strategies to improve teacher health status and behaviors included in a multi-component policy intervention aimed at child obesity prevention may produce a greater effect on classroom environments. Copyright © 2016 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.

  17. Multi-timescale sediment responses across a human impacted river-estuary system

    NASA Astrophysics Data System (ADS)

    Chen, Yining; Chen, Nengwang; Li, Yan; Hong, Huasheng

    2018-05-01

    Hydrological processes regulating sediment transport from land to sea have been widely studied. However, anthropogenic factors controlling the river flow-sediment regime and subsequent response of the estuary are still poorly understood. Here we conducted a multi-timescale analysis on flow and sediment discharges during the period 1967-2014 for the two tributaries of the Jiulong River in Southeast China. The long-term flow-sediment relationship remained linear in the North River throughout the period, while the linearity showed a remarkable change after 1995 in the West River, largely due to construction of dams and reservoirs in the upland watershed. Over short timescales, rainstorm events caused the changes of suspended sediment concentration (SSC) in the rivers. Regression analysis using synchronous SSC data in a wet season (2009) revealed a delayed response (average 5 days) of the estuary to river input, and a box-model analysis established a quantitative relationship to further describe the response of the estuary to the river sediment input over multiple timescales. The short-term response is determined by both the vertical SSC-salinity changes and the sediment trapping rate in the estuary. However, over the long term, the reduction of riverine sediment yield increased marine sediments trapped into the estuary. The results of this study indicate that human activities (e.g., dams) have substantially altered sediment delivery patterns and river-estuary interactions at multiple timescales.

  18. [Application of artificial neural networks on the prediction of surface ozone concentrations].

    PubMed

    Shen, Lu-Lu; Wang, Yu-Xuan; Duan, Lei

    2011-08-01

    Ozone is an important secondary air pollutant in the lower atmosphere. In order to predict the hourly maximum ozone one day in advance based on the meteorological variables for the Wanqingsha site in Guangzhou, Guangdong province, a neural network model (Multi-Layer Perceptron) and a multiple linear regression model were used and compared. Model inputs are meteorological parameters (wind speed, wind direction, air temperature, relative humidity, barometric pressure and solar radiation) of the next day and hourly maximum ozone concentration of the previous day. The OBS (optimal brain surgeon) was adopted to prune the neutral work, to reduce its complexity and to improve its generalization ability. We find that the pruned neural network has the capacity to predict the peak ozone, with an agreement index of 92.3%, the root mean square error of 0.0428 mg/m3, the R-square of 0.737 and the success index of threshold exceedance 77.0% (the threshold O3 mixing ratio of 0.20 mg/m3). When the neural classifier was added to the neural network model, the success index of threshold exceedance increased to 83.6%. Through comparison of the performance indices between the multiple linear regression model and the neural network model, we conclud that that neural network is a better choice to predict peak ozone from meteorological forecast, which may be applied to practical prediction of ozone concentration.

  19. Artificial Neural Networks: A Novel Approach to Analysing the Nutritional Ecology of a Blowfly Species, Chrysomya megacephala

    PubMed Central

    Bianconi, André; Zuben, Cláudio J. Von; Serapião, Adriane B. de S.; Govone, José S.

    2010-01-01

    Bionomic features of blowflies may be clarified and detailed by the deployment of appropriate modelling techniques such as artificial neural networks, which are mathematical tools widely applied to the resolution of complex biological problems. The principal aim of this work was to use three well-known neural networks, namely Multi-Layer Perceptron (MLP), Radial Basis Function (RBF), and Adaptive Neural Network-Based Fuzzy Inference System (ANFIS), to ascertain whether these tools would be able to outperform a classical statistical method (multiple linear regression) in the prediction of the number of resultant adults (survivors) of experimental populations of Chrysomya megacephala (F.) (Diptera: Calliphoridae), based on initial larval density (number of larvae), amount of available food, and duration of immature stages. The coefficient of determination (R2) derived from the RBF was the lowest in the testing subset in relation to the other neural networks, even though its R2 in the training subset exhibited virtually a maximum value. The ANFIS model permitted the achievement of the best testing performance. Hence this model was deemed to be more effective in relation to MLP and RBF for predicting the number of survivors. All three networks outperformed the multiple linear regression, indicating that neural models could be taken as feasible techniques for predicting bionomic variables concerning the nutritional dynamics of blowflies. PMID:20569135

  20. Land cover in the Guayas Basin using SAR images from low resolution ASAR Global mode to high resolution Sentinel-1 images

    NASA Astrophysics Data System (ADS)

    Bourrel, Luc; Brodu, Nicolas; Frappart, Frédéric

    2016-04-01

    Remotely sensed images allow a frequent monitoring of land cover variations at regional and global scale. Recently launched Sentinel-1 satellite offers a global cover of land areas at an unprecedented spatial (20 m) and temporal (6 days at the Equator). We propose here to compare the performances of commonly used supervised classification techniques (i.e., k-nearest neighbors, linear and Gaussian support vector machines, naive Bayes, linear and quadratic discriminant analyzes, adaptative boosting, loggit regression, ridge regression with one-vs-one voting, random forest, extremely randomized trees) for land cover applications in the Guayas Basin, the largest river basin of the Pacific coast of Ecuator (area ~32,000 km²). The reason of this choice is the importance of this region in Ecuatorian economy as its watershed represents 13% of the total area of Ecuador where 40% of the Ecuadorian population lives. It also corresponds to the most productive region of Ecuador for agriculture and aquaculture. Fifty percents of the country shrimp farming production comes from this watershed, and represents with agriculture the largest source of revenue of the country. Similar comparisons are also performed using ENVISAT ASAR images acquired in global mode (1 km of spatial resolution). Accuracy of the results will be achieved using land cover map derived from multi-spectral images.

  1. Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

    PubMed Central

    Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

    2012-01-01

    Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R2. Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost~0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost~FDGpre0.93, p<0.001). Univariate mixture model fits of FDGpre improved R2 from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

  2. Numerical investigation of multi-beam laser heterodyne measurement with ultra-precision for linear expansion coefficient of metal based on oscillating mirror modulation

    NASA Astrophysics Data System (ADS)

    Li, Yan-Chao; Wang, Chun-Hui; Qu, Yang; Gao, Long; Cong, Hai-Fang; Yang, Yan-Ling; Gao, Jie; Wang, Ao-You

    2011-01-01

    This paper proposes a novel method of multi-beam laser heterodyne measurement for metal linear expansion coefficient. Based on the Doppler effect and heterodyne technology, the information is loaded of length variation to the frequency difference of the multi-beam laser heterodyne signal by the frequency modulation of the oscillating mirror, this method can obtain many values of length variation caused by temperature variation after the multi-beam laser heterodyne signal demodulation simultaneously. Processing these values by weighted-average, it can obtain length variation accurately, and eventually obtain the value of linear expansion coefficient of metal by the calculation. This novel method is used to simulate measurement for linear expansion coefficient of metal rod under different temperatures by MATLAB, the obtained result shows that the relative measurement error of this method is just 0.4%.

  3. Assessing the role of feed water constituents in irreversible membrane fouling of pilot-scale ultrafiltration drinking water treatment systems.

    PubMed

    Peiris, R H; Jaklewicz, M; Budman, H; Legge, R L; Moresoli, C

    2013-06-15

    Fluorescence excitation-emission matrix (EEM) approach together with principal component analysis (PCA) was used for assessing hydraulically irreversible fouling of three pilot-scale ultrafiltration (UF) systems containing full-scale and bench-scale hollow fiber membrane modules in drinking water treatment. These systems were operated for at least three months with extensive cycles of permeation, combination of back-pulsing and scouring and chemical cleaning. The principal component (PC) scores generated from the PCA of the fluorescence EEMs were found to be related to humic substances (HS), protein-like and colloidal/particulate matter content. PC scores of HS- and protein-like matter of the UF feed water, when considered separately, showed reasonably good correlations with the rate of hydraulically irreversible fouling for long-term UF operations. In contrast, comparatively weaker correlations for PC scores of colloidal/particulate matter and the rate of hydraulically irreversible fouling were obtained for all UF systems. Since, individual correlations could not fully explain the evolution of the rate of irreversible fouling, multi-linear regression models were developed to relate the combined effect of HS-like, protein-like and colloidal/particulate matter PC scores to the rate of hydraulically irreversible fouling for each specific UF system. These multi-linear regression models revealed significant individual and combined contribution of HS- and protein-like matter to the rate of hydraulically irreversible fouling, with protein-like matter generally showing the greatest contribution. The contribution of colloidal/particulate matter to the rate of hydraulically irreversible fouling was not as significant. The addition of polyaluminum chloride, as coagulant, to UF feed appeared to have a positive impact in reducing hydraulically irreversible fouling by these constituents. The proposed approach has applications in quantifying the individual and synergistic contribution of major natural water constituents to the rate of hydraulically irreversible membrane fouling and shows potential for controlling UF irreversible fouling in the production of drinking water. Copyright © 2013 Elsevier Ltd. All rights reserved.

  4. Low-rank regularization for learning gene expression programs.

    PubMed

    Ye, Guibo; Tang, Mengfan; Cai, Jian-Feng; Nie, Qing; Xie, Xiaohui

    2013-01-01

    Learning gene expression programs directly from a set of observations is challenging due to the complexity of gene regulation, high noise of experimental measurements, and insufficient number of experimental measurements. Imposing additional constraints with strong and biologically motivated regularizations is critical in developing reliable and effective algorithms for inferring gene expression programs. Here we propose a new form of regulation that constrains the number of independent connectivity patterns between regulators and targets, motivated by the modular design of gene regulatory programs and the belief that the total number of independent regulatory modules should be small. We formulate a multi-target linear regression framework to incorporate this type of regulation, in which the number of independent connectivity patterns is expressed as the rank of the connectivity matrix between regulators and targets. We then generalize the linear framework to nonlinear cases, and prove that the generalized low-rank regularization model is still convex. Efficient algorithms are derived to solve both the linear and nonlinear low-rank regularized problems. Finally, we test the algorithms on three gene expression datasets, and show that the low-rank regularization improves the accuracy of gene expression prediction in these three datasets.

  5. Resting Heart Rate as Predictor for Left Ventricular Dysfunction and Heart Failure: The Multi-Ethnic Study of Atherosclerosis

    PubMed Central

    Opdahl, Anders; Venkatesh, Bharath Ambale; Fernandes, Veronica R. S.; Wu, Colin O.; Nasir, Khurram; Choi, Eui-Young; Almeida, Andre L. C.; Rosen, Boaz; Carvalho, Benilton; Edvardsen, Thor; Bluemke, David A.; Lima, Joao A. C.

    2014-01-01

    OBJECTIVE To investigate the relationship between baseline resting heart rate and incidence of heart failure (HF) and global and regional left ventricular (LV) dysfunction. BACKGROUND The association of resting heart rate to HF and LV function is not well described in an asymptomatic multi-ethnic population. METHODS Participants in the Multi-Ethnic Study of Atherosclerosis had resting heart rate measured at inclusion. Incident HF was registered (n=176) during follow-up (median 7 years) in those who underwent cardiac MRI (n=5000). Changes in ejection fraction (ΔEF) and peak circumferential strain (Δεcc) were measured as markers of developing global and regional LV dysfunction in 1056 participants imaged at baseline and 5 years later. Time to HF (Cox model) and Δεcc and ΔEF (multiple linear regression models) were adjusted for demographics, traditional cardiovascular risk factors, calcium score, LV end-diastolic volume and mass in addition to resting heart rate. RESULTS Cox analysis demonstrated that for 1 bpm increase in resting heart rate there was a 4% greater adjusted relative risk for incident HF (Hazard Ratio: 1.04 (1.02, 1.06 (95% CI); P<0.001). Adjusted multiple regression models demonstrated that resting heart rate was positively associated with deteriorating εcc and decrease in EF, even in analyses when all coronary heart disease events were excluded from the model. CONCLUSION Elevated resting heart rate is associated with increased risk for incident HF in asymptomatic participants in MESA. Higher heart rate is related to development of regional and global LV dysfunction independent of subclinical atherosclerosis and coronary heart disease. PMID:24412444

  6. Linear regression analysis of survival data with missing censoring indicators.

    PubMed

    Wang, Qihua; Dinse, Gregg E

    2011-04-01

    Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.

  7. An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.

    DTIC Science & Technology

    1983-09-01

    books Applied Linear Regression [Ref. 39], and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event...Indexes, Master’s Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Stanford, Applied Linear Regression , Wiley, 1980. 40

  8. Testing hypotheses for differences between linear regression lines

    Treesearch

    Stanley J. Zarnoch

    2009-01-01

    Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...

  9. Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.

    ERIC Educational Resources Information Center

    Schafer, William D.; Wang, Yuh-Yin

    A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…

  10. Teaching the Concept of Breakdown Point in Simple Linear Regression.

    ERIC Educational Resources Information Center

    Chan, Wai-Sum

    2001-01-01

    Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…

  11. Estimating monotonic rates from biological data using local linear regression.

    PubMed

    Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R

    2017-03-01

    Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.

  12. Liver attenuation, pericardial adipose tissue, obesity, and insulin resistance: the Multi-Ethnic Study of Atherosclerosis (MESA).

    PubMed

    McAuley, Paul A; Hsu, Fang-Chi; Loman, Kurt K; Carr, J Jeffrey; Budoff, Matthew J; Szklo, Moyses; Sharrett, A Richey; Ding, Jingzhong

    2011-09-01

    Insulin resistance is linked to general and abdominal obesity, but its relation to hepatic lipid content and pericardial adipose tissue is less clear. The purpose of this study was to examine cross-sectional associations of liver attenuation, pericardial adipose tissue, BMI, and waist circumference with insulin resistance. We measured liver attenuation and pericardial adipose tissue using the existing cardiac computed tomography scans in 5,291 individuals free of clinical cardiovascular disease and diabetes in the Multi-Ethnic Study of Atherosclerosis (MESA) during the study's baseline visit (2000-2002). Low liver attenuation was defined as the lowest quartile and high pericardial adipose tissue as the upper quartile of volume (cm(3)). We used standard clinical definitions for obesity and abdominal obesity. Insulin resistance was assessed by the homeostasis model assessment of insulin resistance (HOMA(IR)) index. In multivariate linear regression with all adiposity measures in the model simultaneously, all adiposity measures were significantly (P < 0.0001) associated with insulin resistance: regression coefficients (±s.e.) were 0.31 (±0.02) for low liver attenuation, 0.27 (±0.02) for high pericardial adipose tissue, 0.27 (±0.02) for obesity, and 0.32 (±0.02) for abdominal obesity. We found significant differences (P = 0.003) between standardized liver attenuation and insulin resistance by ethnicity: regression coefficients per 1 s.d. increment were 0.10 ± 0.01 for whites, 0.11 ± 0.02 for Chinese, 0.08 ± 0.2 for blacks, and 0.14 ± 0.01 for Hispanics. Liver attenuation and pericardial adipose tissue were associated with insulin resistance, independent of BMI and waist circumference.

  13. Lagged kernel machine regression for identifying time windows of susceptibility to exposures of complex mixtures.

    PubMed

    Liu, Shelley H; Bobb, Jennifer F; Lee, Kyu Ha; Gennings, Chris; Claus Henn, Birgit; Bellinger, David; Austin, Christine; Schnaas, Lourdes; Tellez-Rojo, Martha M; Hu, Howard; Wright, Robert O; Arora, Manish; Coull, Brent A

    2018-07-01

    The impact of neurotoxic chemical mixtures on children's health is a critical public health concern. It is well known that during early life, toxic exposures may impact cognitive function during critical time intervals of increased vulnerability, known as windows of susceptibility. Knowledge on time windows of susceptibility can help inform treatment and prevention strategies, as chemical mixtures may affect a developmental process that is operating at a specific life phase. There are several statistical challenges in estimating the health effects of time-varying exposures to multi-pollutant mixtures, such as: multi-collinearity among the exposures both within time points and across time points, and complex exposure-response relationships. To address these concerns, we develop a flexible statistical method, called lagged kernel machine regression (LKMR). LKMR identifies critical exposure windows of chemical mixtures, and accounts for complex non-linear and non-additive effects of the mixture at any given exposure window. Specifically, LKMR estimates how the effects of a mixture of exposures change with the exposure time window using a Bayesian formulation of a grouped, fused lasso penalty within a kernel machine regression (KMR) framework. A simulation study demonstrates the performance of LKMR under realistic exposure-response scenarios, and demonstrates large gains over approaches that consider each time window separately, particularly when serial correlation among the time-varying exposures is high. Furthermore, LKMR demonstrates gains over another approach that inputs all time-specific chemical concentrations together into a single KMR. We apply LKMR to estimate associations between neurodevelopment and metal mixtures in Early Life Exposures in Mexico and Neurotoxicology, a prospective cohort study of child health in Mexico City.

  14. Locally linear regression for pose-invariant face recognition.

    PubMed

    Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen

    2007-07-01

    The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show distinct advantage of the proposed method over Eigen light-field method.

  15. [Frailty, disability and multi-morbidity: the relationship with quality of life and healthcare costs in elderly people].

    PubMed

    Lutomski, Jennifer E; Baars, Maria A E; Boter, Han; Buurman, Bianca M; den Elzen, Wendy P J; Jansen, Aaltje P D; Kempen, Gertrudis I J M; Steunenberg, Bas; Steyerberg, Ewout W; Olde Rikkert, Marcel G M; Melis, René J F

    2014-01-01

    To assess the independent and combined impact of frailty, multi-morbidity, and activities of daily living (ADL) limitations on self-reported quality of life and healthcare costs in elderly people. Cross-sectional, descriptive study. Data came from The Older Persons and Informal Caregivers Minimum DataSet (TOPICS-MDS), a pooled dataset with information from 41 projects across the Netherlands from the Dutch national care for the Elderly programme. Frailty, multi-morbidity and ADL limitations, and the interactions between these domains, were used as predictors in regression analyses with quality of life and healthcare costs as outcome measures. Analyses were stratified by living situation (independent or care home). Directionality and magnitude of associations were assessed using linear mixed models. A total of 11,093 elderly people were interviewed. A substantial proportion of elderly people living independently reported frailty, multi-morbidity, and/or ADL limitations (56.4%, 88.3% and 41.4%, respectively), as did elderly people living in a care home (88.7%, 89.2% and 77,3%, respectively). One-third of elderly people living at home (31.9%) reported all three conditions compared with two-thirds of elderly people living in a care home (68.3%). In the multivariable analysis, frailty had a strong impact on outcomes independently of multi-morbidity and ADL limitations. Elderly people experiencing problems across all three domains reported the poorest quality-of-life scores and the highest healthcare costs, irrespective of their living situation. Frailty, multi-morbidity and ADL limitations are complementary measurements, which together provide a more holistic understanding of health status in elderly people. A multi-dimensional approach is important in mapping the complex relationships between these measurements on the one hand and the quality of life and healthcare costs on the other.

  16. Sensitivity of Chemical Shift-Encoded Fat Quantification to Calibration of Fat MR Spectrum

    PubMed Central

    Wang, Xiaoke; Hernando, Diego; Reeder, Scott B.

    2015-01-01

    Purpose To evaluate the impact of different fat spectral models on proton density fat-fraction (PDFF) quantification using chemical shift-encoded (CSE) MRI. Material and Methods Simulations and in vivo imaging were performed. In a simulation study, spectral models of fat were compared pairwise. Comparison of magnitude fitting and mixed fitting was performed over a range of echo times and fat fractions. In vivo acquisitions from 41 patients were reconstructed using 7 published spectral models of fat. T2-corrected STEAM-MRS was used as reference. Results Simulations demonstrate that imperfectly calibrated spectral models of fat result in biases that depend on echo times and fat fraction. Mixed fitting is more robust against this bias than magnitude fitting. Multi-peak spectral models showed much smaller differences among themselves than when compared to the single-peak spectral model. In vivo studies show all multi-peak models agree better (for mixed fitting, slope ranged from 0.967–1.045 using linear regression) with reference standard than the single-peak model (for mixed fitting, slope=0.76). Conclusion It is essential to use a multi-peak fat model for accurate quantification of fat with CSE-MRI. Further, fat quantification techniques using multi-peak fat models are comparable and no specific choice of spectral model is shown to be superior to the rest. PMID:25845713

  17. Self-consistent asset pricing models

    NASA Astrophysics Data System (ADS)

    Malevergne, Y.; Sornette, D.

    2007-08-01

    We discuss the foundations of factor or regression models in the light of the self-consistency condition that the market portfolio (and more generally the risk factors) is (are) constituted of the assets whose returns it is (they are) supposed to explain. As already reported in several articles, self-consistency implies correlations between the return disturbances. As a consequence, the alphas and betas of the factor model are unobservable. Self-consistency leads to renormalized betas with zero effective alphas, which are observable with standard OLS regressions. When the conditions derived from internal consistency are not met, the model is necessarily incomplete, which means that some sources of risk cannot be replicated (or hedged) by a portfolio of stocks traded on the market, even for infinite economies. Analytical derivations and numerical simulations show that, for arbitrary choices of the proxy which are different from the true market portfolio, a modified linear regression holds with a non-zero value αi at the origin between an asset i's return and the proxy's return. Self-consistency also introduces “orthogonality” and “normality” conditions linking the betas, alphas (as well as the residuals) and the weights of the proxy portfolio. Two diagnostics based on these orthogonality and normality conditions are implemented on a basket of 323 assets which have been components of the S&P500 in the period from January 1990 to February 2005. These two diagnostics show interesting departures from dynamical self-consistency starting about 2 years before the end of the Internet bubble. Assuming that the CAPM holds with the self-consistency condition, the OLS method automatically obeys the resulting orthogonality and normality conditions and therefore provides a simple way to self-consistently assess the parameters of the model by using proxy portfolios made only of the assets which are used in the CAPM regressions. Finally, the factor decomposition with the self-consistency condition derives a risk-factor decomposition in the multi-factor case which is identical to the principal component analysis (PCA), thus providing a direct link between model-driven and data-driven constructions of risk factors. This correspondence shows that PCA will therefore suffer from the same limitations as the CAPM and its multi-factor generalization, namely lack of out-of-sample explanatory power and predictability. In the multi-period context, the self-consistency conditions force the betas to be time-dependent with specific constraints.

  18. Effect of Malmquist bias on correlation studies with IRAS data base

    NASA Technical Reports Server (NTRS)

    Verter, Frances

    1993-01-01

    The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B) as well as ratios of these quantities. The linear correlations for Malmquist bias are corrected using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method invented here in which each galaxy observation is weighted by its sampling volume. The results of correlation and regressions among the sample are significantly changed in the anticipated sense that the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias eliminates the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.

  19. A primer for biomedical scientists on how to execute model II linear regression analysis.

    PubMed

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.

  20. A secure distributed logistic regression protocol for the detection of rare adverse drug events

    PubMed Central

    El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat

    2013-01-01

    Background There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. PMID:22871397

  1. A secure distributed logistic regression protocol for the detection of rare adverse drug events.

    PubMed

    El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat

    2013-05-01

    There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models.

  2. Statistical Methods in Ai: Rare Event Learning Using Associative Rules and Higher-Order Statistics

    NASA Astrophysics Data System (ADS)

    Iyer, V.; Shetty, S.; Iyengar, S. S.

    2015-07-01

    Rare event learning has not been actively researched since lately due to the unavailability of algorithms which deal with big samples. The research addresses spatio-temporal streams from multi-resolution sensors to find actionable items from a perspective of real-time algorithms. This computing framework is independent of the number of input samples, application domain, labelled or label-less streams. A sampling overlap algorithm such as Brooks-Iyengar is used for dealing with noisy sensor streams. We extend the existing noise pre-processing algorithms using Data-Cleaning trees. Pre-processing using ensemble of trees using bagging and multi-target regression showed robustness to random noise and missing data. As spatio-temporal streams are highly statistically correlated, we prove that a temporal window based sampling from sensor data streams converges after n samples using Hoeffding bounds. Which can be used for fast prediction of new samples in real-time. The Data-cleaning tree model uses a nonparametric node splitting technique, which can be learned in an iterative way which scales linearly in memory consumption for any size input stream. The improved task based ensemble extraction is compared with non-linear computation models using various SVM kernels for speed and accuracy. We show using empirical datasets the explicit rule learning computation is linear in time and is only dependent on the number of leafs present in the tree ensemble. The use of unpruned trees (t) in our proposed ensemble always yields minimum number (m) of leafs keeping pre-processing computation to n × t log m compared to N2 for Gram Matrix. We also show that the task based feature induction yields higher Qualify of Data (QoD) in the feature space compared to kernel methods using Gram Matrix.

  3. Analyzing Multilevel Data: Comparing Findings from Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2013-01-01

    This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…

  4. The numerical solution of linear multi-term fractional differential equations: systems of equations

    NASA Astrophysics Data System (ADS)

    Edwards, John T.; Ford, Neville J.; Simpson, A. Charles

    2002-11-01

    In this paper, we show how the numerical approximation of the solution of a linear multi-term fractional differential equation can be calculated by reduction of the problem to a system of ordinary and fractional differential equations each of order at most unity. We begin by showing how our method applies to a simple class of problems and we give a convergence result. We solve the Bagley Torvik equation as an example. We show how the method can be applied to a general linear multi-term equation and give two further examples.

  5. Computerized tomography measured liver fat is associated with low levels of N-terminal pro-brain natriuretic protein (NT-proBNP). Multi-Ethnic Study of Atherosclerosis

    PubMed Central

    Sanchez, Otto A.; Mariana, Lazo-Elizondo; Irfan, Zeb; Tracy, Russell P; Bradley, Ryan; Duprez, Daniel A.; Bahrami, Hossein; Peralta, Carmen A.; Daniels, Lori B.; Lima, João A.; Maisel, Alan; Jacobs, David R.; MJ, Budoff

    2016-01-01

    Background and aims N-terminal pro B-type natriuretic peptide (NT-proBNP) is inversely associated with diabetes mellitus, obesity and metabolic syndrome. We aim to characterize the association between NT-proBNP and nonalcoholic fatty liver disease (NAFLD), a condition strongly associated with metabolic syndrome. Methods 4529 participants from the Multi-Ethnic Study of Atherosclerosis (MESA) free of cardiovascular disease, without self-reported liver disease and not diabetic at their baseline visit in 2000- 2002 were included in this analysis. NAFLD was defined by a liver attenuation <40 HU. Relative prevalence (RP) for NAFLD was assessed adjusted for age, race, and sex, percent of dietary calories derived from fat, total intentional exercise, alcoholic drinks per week, and interleukin-6 by quintiles of NT-proBNP. Adjusted linear spline model was used to characterize a non-linear association between NT-proBNP and liver fat. The inflection point (IP) was the NT-proBNP concentration where there was a change in slope in the association between liver attenuation and NT-proBNP. Results RP for NAFLD decreased by 30% from the lowest to the highest quintile of NT-proBNP, p = 0.01. We observed an inverse linear association between NT-proBNP and liver fat, which plateaued (IP) at an NT-proBNP concentration of 45 pg/mL. Linear regression coefficient (SE) per unit of NT-proBNP < and ≥ IP was of 0.05 (0.02), p = 0.001 and 0.0006 (0.0008), p = 0.5, respectively, differences between slopes p < 0.0001. Conclusions In this cross-sectional study of a community based multiethnic sample of non-diabetic adults, low levels of NT-proBNP are associated with greater prevalence of NAFLD. PMID:27085779

  6. Cardiovascular diseases and air pollution in Novi Sad, Serbia.

    PubMed

    Jevtić, Marija; Dragić, Nataša; Bijelović, Sanja; Popović, Milka

    2014-04-01

    A large body of evidence has documented that air pollutants have adverse effect on human health as well as on the environment. The aim of this study was to determine whether there was an association between outdoor concentrations of sulfur dioxide (SO2) and nitrogen dioxide (NO2) and a daily number of hospital admissions due to cardiovascular diseases (CVD) in Novi Sad, Serbia among patients aged above 18. The investigation was carried out during over a 3-year period (from January 1, 2007 to December 31, 2009) in the area of Novi Sad. The number (N = 10 469) of daily CVD (ICD-10: I00-I99) hospital admissions was collected according to patients' addresses. Daily mean levels of NO2 and SO2, measured in the ambient air of Novi Sad via a network of fixed samplers, have been used to put forward outdoor air pollution. Associations between air pollutants and hospital admissions were firstly analyzed by the use of the linear regression in a single polluted model, and then trough a single and multi-polluted adjusted generalized linear Poisson model. The single polluted model (without confounding factors) indicated that there was a linear increase in the number of hospital admissions due to CVD in relation to the linear increase in concentrations of SO2 (p = 0.015; 95% confidence interval (95% CI): 0.144-1.329, R(2) = 0.005) and NO2 (p = 0.007; 95% CI: 0.214-1.361, R(2) = 0.007). However, the single and multi-polluted adjusted models revealed that only NO2 was associated with the CVD (p = 0.016, relative risk (RR) = 1.049, 95% CI: 1.009-1.091 and p = 0.022, RR = 1.047, 95% CI: 1.007-1.089, respectively). This study shows a significant positive association between hospital admissions due to CVD and outdoor NO2 concentrations in the area of Novi Sad, Serbia.

  7. Development of Ensemble Model Based Water Demand Forecasting Model

    NASA Astrophysics Data System (ADS)

    Kwon, Hyun-Han; So, Byung-Jin; Kim, Seong-Hyeon; Kim, Byung-Seop

    2014-05-01

    In recent years, Smart Water Grid (SWG) concept has globally emerged over the last decade and also gained significant recognition in South Korea. Especially, there has been growing interest in water demand forecast and optimal pump operation and this has led to various studies regarding energy saving and improvement of water supply reliability. Existing water demand forecasting models are categorized into two groups in view of modeling and predicting their behavior in time series. One is to consider embedded patterns such as seasonality, periodicity and trends, and the other one is an autoregressive model that is using short memory Markovian processes (Emmanuel et al., 2012). The main disadvantage of the abovementioned model is that there is a limit to predictability of water demands of about sub-daily scale because the system is nonlinear. In this regard, this study aims to develop a nonlinear ensemble model for hourly water demand forecasting which allow us to estimate uncertainties across different model classes. The proposed model is consist of two parts. One is a multi-model scheme that is based on combination of independent prediction model. The other one is a cross validation scheme named Bagging approach introduced by Brieman (1996) to derive weighting factors corresponding to individual models. Individual forecasting models that used in this study are linear regression analysis model, polynomial regression, multivariate adaptive regression splines(MARS), SVM(support vector machine). The concepts are demonstrated through application to observed from water plant at several locations in the South Korea. Keywords: water demand, non-linear model, the ensemble forecasting model, uncertainty. Acknowledgements This subject is supported by Korea Ministry of Environment as "Projects for Developing Eco-Innovation Technologies (GT-11-G-02-001-6)

  8. Quantification of trace metals in infant formula premixes using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Cama-Moncunill, Raquel; Casado-Gavalda, Maria P.; Cama-Moncunill, Xavier; Markiewicz-Keszycka, Maria; Dixit, Yash; Cullen, Patrick J.; Sullivan, Carl

    2017-09-01

    Infant formula is a human milk substitute generally based upon fortified cow milk components. In order to mimic the composition of breast milk, trace elements such as copper, iron and zinc are usually added in a single operation using a premix. The correct addition of premixes must be verified to ensure that the target levels in infant formulae are achieved. In this study, a laser-induced breakdown spectroscopy (LIBS) system was assessed as a fast validation tool for trace element premixes. LIBS is a promising emission spectroscopic technique for elemental analysis, which offers real-time analyses, little to no sample preparation and ease of use. LIBS was employed for copper and iron determinations of premix samples ranging approximately from 0 to 120 mg/kg Cu/1640 mg/kg Fe. LIBS spectra are affected by several parameters, hindering subsequent quantitative analyses. This work aimed at testing three matrix-matched calibration approaches (simple-linear regression, multi-linear regression and partial least squares regression (PLS)) as means for precision and accuracy enhancement of LIBS quantitative analysis. All calibration models were first developed using a training set and then validated with an independent test set. PLS yielded the best results. For instance, the PLS model for copper provided a coefficient of determination (R2) of 0.995 and a root mean square error of prediction (RMSEP) of 14 mg/kg. Furthermore, LIBS was employed to penetrate through the samples by repetitively measuring the same spot. Consequently, LIBS spectra can be obtained as a function of sample layers. This information was used to explore whether measuring deeper into the sample could reduce possible surface-contaminant effects and provide better quantifications.

  9. Analyzing Multilevel Data: An Empirical Comparison of Parameter Estimates of Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2011-01-01

    Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…

  10. Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis

    ERIC Educational Resources Information Center

    Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

    2006-01-01

    Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…

  11. Classical Testing in Functional Linear Models.

    PubMed

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.

  12. Classical Testing in Functional Linear Models

    PubMed Central

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155

  13. A Linear Regression and Markov Chain Model for the Arabian Horse Registry

    DTIC Science & Technology

    1993-04-01

    as a tax deduction? Yes No T-4367 68 26. Regardless of previous equine tax deductions, do you consider your current horse activities to be... (Mark one...E L T-4367 A Linear Regression and Markov Chain Model For the Arabian Horse Registry Accesion For NTIS CRA&I UT 7 4:iC=D 5 D-IC JA" LI J:13tjlC,3 lO...the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses . A linear regression model was utilized to

  14. Absolute determination of copper and silver in ancient coins using 14 MeV neutrons

    NASA Astrophysics Data System (ADS)

    Chalouhi, Ch.; Hourani, E.; Loos, R.; Melki, S.

    1982-09-01

    A method for absolute determination of copper and silver in ancient coins is described. Activation analysis by 14 MeV neutrons is performed. In the experimental procedure emphasis is placed on corrections for neutrons and gamma attenuation. In the analytical procedure, a multi linear-regression calculation is used to separate different contributions to the 511 keV gamma peak. The precision in the absolute determination of Cu and Ag is better than 2% in recent coins of definite shapes, whereas it is a somewhat lower in ancient coins of irregular shapes. The method was applied to ancient coins provided by the Museum of the American University of Beirut. Overall consistency and suitability of the method were obtained.

  15. Training on the DSM-5 Cultural Formulation Interview Improves Cultural Competence in General Psychiatry Residents: A Multi-site Study.

    PubMed

    Mills, Stacia; Wolitzky-Taylor, Kate; Xiao, Anna Q; Bourque, Marie Claire; Rojas, Sandra M Peynado; Bhattacharya, Debanjana; Simpson, Annabelle K; Maye, Aleea; Lo, Pachida; Clark, Aaron; Lim, Russell; Lu, Francis G

    2016-10-01

    The authors assessed whether a 1-h didactic session on the DSM-5 Cultural Formulation Interview (CFI) improves cultural competence of general psychiatry residents. Psychiatry residents at six residency programs completed demographics and pre-intervention questionnaires, were exposed to a 1-h session on the CFI, and completed a post-intervention questionnaire. Repeated measures ANCOVA compared pre- to post-intervention change. Linear regression assessed whether previous cultural experience predicted post-intervention scores. Mean scores on the questionnaire significantly changed from pre- to post-intervention (p < 0.001). Previous cultural experience did not predict post-intervention scores. Psychiatry residents' cultural competence scores improved with a 1-h session on the CFI but with notable limitations.

  16. An improved multiple linear regression and data analysis computer program package

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  17. CO2 flux determination by closed-chamber methods can be seriously biased by inappropriate application of linear regression

    NASA Astrophysics Data System (ADS)

    Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.

    2007-07-01

    Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.

  18. Prediction accuracies for growth and wood attributes of interior spruce in space using genotyping-by-sequencing.

    PubMed

    Gamal El-Dien, Omnia; Ratcliffe, Blaise; Klápště, Jaroslav; Chen, Charles; Porth, Ilga; El-Kassaby, Yousry A

    2015-05-09

    Genomic selection (GS) in forestry can substantially reduce the length of breeding cycle and increase gain per unit time through early selection and greater selection intensity, particularly for traits of low heritability and late expression. Affordable next-generation sequencing technologies made it possible to genotype large numbers of trees at a reasonable cost. Genotyping-by-sequencing was used to genotype 1,126 Interior spruce trees representing 25 open-pollinated families planted over three sites in British Columbia, Canada. Four imputation algorithms were compared (mean value (MI), singular value decomposition (SVD), expectation maximization (EM), and a newly derived, family-based k-nearest neighbor (kNN-Fam)). Trees were phenotyped for several yield and wood attributes. Single- and multi-site GS prediction models were developed using the Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) and the Generalized Ridge Regression (GRR) to test different assumption about trait architecture. Finally, using PCA, multi-trait GS prediction models were developed. The EM and kNN-Fam imputation methods were superior for 30 and 60% missing data, respectively. The RR-BLUP GS prediction model produced better accuracies than the GRR indicating that the genetic architecture for these traits is complex. GS prediction accuracies for multi-site were high and better than those of single-sites while multi-site predictability produced the lowest accuracies reflecting type-b genetic correlations and deemed unreliable. The incorporation of genomic information in quantitative genetics analyses produced more realistic heritability estimates as half-sib pedigree tended to inflate the additive genetic variance and subsequently both heritability and gain estimates. Principle component scores as representatives of multi-trait GS prediction models produced surprising results where negatively correlated traits could be concurrently selected for using PCA2 and PCA3. The application of GS to open-pollinated family testing, the simplest form of tree improvement evaluation methods, was proven to be effective. Prediction accuracies obtained for all traits greatly support the integration of GS in tree breeding. While the within-site GS prediction accuracies were high, the results clearly indicate that single-site GS models ability to predict other sites are unreliable supporting the utilization of multi-site approach. Principle component scores provided an opportunity for the concurrent selection of traits with different phenotypic optima.

  19. Multi-Window Controllers for Autonomous Space Systems

    NASA Technical Reports Server (NTRS)

    Lurie, B, J.; Hadaegh, F. Y.

    1997-01-01

    Multi-window controllers select between elementary linear controllers using nonlinear windows based on the amplitude and frequency content of the feedback error. The controllers are relatively simple to implement and perform much better than linear controllers. The commanders for such controllers only order the destination point and are freed from generating the command time-profiles. The robotic missions rely heavily on the tasks of acquisition and tracking. For autonomous and optimal control of the spacecraft, the control bandwidth must be larger while the feedback can (and, therefore, must) be reduced.. Combining linear compensators via multi-window nonlinear summer guarantees minimum phase character of the combined transfer function. It is shown that the solution may require using several parallel branches and windows. Several examples of multi-window nonlinear controller applications are presented.

  20. Regression equations for estimation of annual peak-streamflow frequency for undeveloped watersheds in Texas using an L-moment-based, PRESS-minimized, residual-adjusted approach

    USGS Publications Warehouse

    Asquith, William H.; Roussel, Meghan C.

    2009-01-01

    Annual peak-streamflow frequency estimates are needed for flood-plain management; for objective assessment of flood risk; for cost-effective design of dams, levees, and other flood-control structures; and for design of roads, bridges, and culverts. Annual peak-streamflow frequency represents the peak streamflow for nine recurrence intervals of 2, 5, 10, 25, 50, 100, 200, 250, and 500 years. Common methods for estimation of peak-streamflow frequency for ungaged or unmonitored watersheds are regression equations for each recurrence interval developed for one or more regions; such regional equations are the subject of this report. The method is based on analysis of annual peak-streamflow data from U.S. Geological Survey streamflow-gaging stations (stations). Beginning in 2007, the U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, began a 3-year investigation concerning the development of regional equations to estimate annual peak-streamflow frequency for undeveloped watersheds in Texas. The investigation focuses primarily on 638 stations with 8 or more years of data from undeveloped watersheds and other criteria. The general approach is explicitly limited to the use of L-moment statistics, which are used in conjunction with a technique of multi-linear regression referred to as PRESS minimization. The approach used to develop the regional equations, which was refined during the investigation, is referred to as the 'L-moment-based, PRESS-minimized, residual-adjusted approach'. For the approach, seven unique distributions are fit to the sample L-moments of the data for each of 638 stations and trimmed means of the seven results of the distributions for each recurrence interval are used to define the station specific, peak-streamflow frequency. As a first iteration of regression, nine weighted-least-squares, PRESS-minimized, multi-linear regression equations are computed using the watershed characteristics of drainage area, dimensionless main-channel slope, and mean annual precipitation. The residuals of the nine equations are spatially mapped, and residuals for the 10-year recurrence interval are selected for generalization to 1-degree latitude and longitude quadrangles. The generalized residual is referred to as the OmegaEM parameter and represents a generalized terrain and climate index that expresses peak-streamflow potential not otherwise represented in the three watershed characteristics. The OmegaEM parameter was assigned to each station, and using OmegaEM, nine additional regression equations are computed. Because of favorable diagnostics, the OmegaEM equations are expected to be generally reliable estimators of peak-streamflow frequency for undeveloped and ungaged stream locations in Texas. The mean residual standard error, adjusted R-squared, and percentage reduction of PRESS by use of OmegaEM are 0.30log10, 0.86, and -21 percent, respectively. Inclusion of the OmegaEM parameter provides a substantial reduction in the PRESS statistic of the regression equations and removes considerable spatial dependency in regression residuals. Although the OmegaEM parameter requires interpretation on the part of analysts and the potential exists that different analysts could estimate different values for a given watershed, the authors suggest that typical uncertainty in the OmegaEM estimate might be about +or-0.1010. Finally, given the two ensembles of equations reported herein and those in previous reports, hydrologic design engineers and other analysts have several different methods, which represent different analytical tracks, to make comparisons of peak-streamflow frequency estimates for ungaged watersheds in the study area.

  1. The Uncertainty of Long-term Linear Trend in Global SST Due to Internal Variation

    NASA Astrophysics Data System (ADS)

    Lian, Tao

    2016-04-01

    In most parts of the global ocean, the magnitude of the long-term linear trend in sea surface temperature (SST) is much smaller than the amplitude of local multi-scale internal variation. One can thus use the record of a specified period to arbitrarily determine the value and the sign of the long-term linear trend in regional SST, and further leading to controversial conclusions on how global SST responds to global warming in the recent history. Analyzing the linear trend coefficient estimated by the ordinary least-square method indicates that the linear trend consists of two parts: One related to the long-term change, and the other related to the multi-scale internal variation. The sign of the long-term change can be correctly reproduced only when the magnitude of the linear trend coefficient is greater than a theoretical threshold which scales the influence from the multi-scale internal variation. Otherwise, the sign of the linear trend coefficient will depend on the phase of the internal variation, or in the other words, the period being used. An improved least-square method is then proposed to reduce the theoretical threshold. When apply the new method to a global SST reconstruction from 1881 to 2013, we find that in a large part of Pacific, the southern Indian Ocean and North Atlantic, the influence from the multi-scale internal variation on the sign of the linear trend coefficient can-not be excluded. Therefore, the resulting warming or/and cooling linear trends in these regions can-not be fully assigned to global warming.

  2. Biostatistics Series Module 6: Correlation and Linear Regression.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient ( r ). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r 2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation ( y = a + bx ), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.

  3. Biostatistics Series Module 6: Correlation and Linear Regression

    PubMed Central

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175

  4. Using the Coefficient of Determination "R"[superscript 2] to Test the Significance of Multiple Linear Regression

    ERIC Educational Resources Information Center

    Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

    2013-01-01

    This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)

  5. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

    PubMed

    Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

    2009-11-01

    G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.

  6. Guidelines and Procedures for Computing Time-Series Suspended-Sediment Concentrations and Loads from In-Stream Turbidity-Sensor and Streamflow Data

    USGS Publications Warehouse

    Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.

    2009-01-01

    In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.

  7. CO2 flux determination by closed-chamber methods can be seriously biased by inappropriate application of linear regression

    NASA Astrophysics Data System (ADS)

    Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.

    2007-11-01

    Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatlands sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration over time c(t) evolution in the chamber headspace and estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model which is considered to be caused by violations of the underlying model assumptions. Especially the effects of turbulence and pressure disturbances by the chamber deployment are suspected to have caused unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.

  8. Spatial decorrelation stretch of annual (2003-2014) Daymet precipitation summaries on a 1-km grid for California, Nevada, Arizona, and Utah.

    PubMed

    Ch Miliaresis, George

    2016-06-01

    A method is presented for elevation (H) and spatial position (X, Y) decorrelation stretch of annual precipitation summaries on a 1-km grid for SW USA for the period 2003 to 2014. Multiple linear regression analysis of the first and second principal component (PC) quantifies the variance in the multi-temporal precipitation imagery that is explained by X, Y, and elevation (h). The multi-temporal dataset is reconstructed from the PC1 and PC2 residual images and the later PCs by taking into account the variance that is not related to X, Y, and h. Clustering of the reconstructed precipitation dataset allowed the definition of positive (for example, in Sierra Nevada, Salt Lake City) and negative (for example, in San Joaquin Valley, Nevada, Colorado Plateau) precipitation anomalies. The temporal and spatial patterns defined from the spatially standardized multi-temporal precipitation imagery provide a tool of comparison for regions in different geographic environments according to the deviation from the precipitation amount that they are expected to receive as function of X, Y, and h. Such a standardization allows the definition of less or more sensitive to climatic change regions and gives an insight in the spatial impact of atmospheric circulation that causes the annual precipitation.

  9. Multi-segmental movements as a function of experience in karate.

    PubMed

    Zago, Matteo; Codari, Marina; Iaia, F Marcello; Sforza, Chiarella

    2017-08-01

    Karate is a martial art that partly depends on subjective scoring of complex movements. Principal component analysis (PCA)-based methods can identify the fundamental synergies (principal movements) of motor system, providing a quantitative global analysis of technique. In this study, we aimed at describing the fundamental multi-joint synergies of a karate performance, under the hypothesis that the latter are skilldependent; estimate karateka's experience level, expressed as years of practice. A motion capture system recorded traditional karate techniques of 10 professional and amateur karateka. At any time point, the 3D-coordinates of body markers produced posture vectors that were normalised, concatenated from all karateka and submitted to a first PCA. Five principal movements described both gross movement synergies and individual differences. A second PCA followed by linear regression estimated the years of practice using principal movements (eigenpostures and weighting curves) and centre of mass kinematics (error: 3.71 years; R2 = 0.91, P ≪ 0.001). Principal movements and eigenpostures varied among different karateka and as functions of experience. This approach provides a framework to develop visual tools for the analysis of motor synergies in karate, allowing to detect the multi-joint motor patterns that should be restored after an injury, or to be specifically trained to increase performance.

  10. Constructing general partial differential equations using polynomial and neural networks.

    PubMed

    Zjavka, Ladislav; Pedrycz, Witold

    2016-01-01

    Sum fraction terms can approximate multi-variable functions on the basis of discrete observations, replacing a partial differential equation definition with polynomial elementary data relation descriptions. Artificial neural networks commonly transform the weighted sum of inputs to describe overall similarity relationships of trained and new testing input patterns. Differential polynomial neural networks form a new class of neural networks, which construct and solve an unknown general partial differential equation of a function of interest with selected substitution relative terms using non-linear multi-variable composite polynomials. The layers of the network generate simple and composite relative substitution terms whose convergent series combinations can describe partial dependent derivative changes of the input variables. This regression is based on trained generalized partial derivative data relations, decomposed into a multi-layer polynomial network structure. The sigmoidal function, commonly used as a nonlinear activation of artificial neurons, may transform some polynomial items together with the parameters with the aim to improve the polynomial derivative term series ability to approximate complicated periodic functions, as simple low order polynomials are not able to fully make up for the complete cycles. The similarity analysis facilitates substitutions for differential equations or can form dimensional units from data samples to describe real-world problems. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Computation of nonlinear least squares estimator and maximum likelihood using principles in matrix calculus

    NASA Astrophysics Data System (ADS)

    Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.

    2017-11-01

    This paper uses matrix calculus techniques to obtain Nonlinear Least Squares Estimator (NLSE), Maximum Likelihood Estimator (MLE) and Linear Pseudo model for nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute MLE and NLSE. Anh [2] derived NLSE and MLE of a heteroscedatistic regression model. Lemcoff [3] discussed a procedure to get linear pseudo model for nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et.al used empirical process techniques to study the asymptotic of the LSE (Least-squares estimation) for the fitting of nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation

  12. A method for fitting regression splines with varying polynomial order in the linear mixed model.

    PubMed

    Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W

    2006-02-15

    The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.

  13. GIS Tools to Estimate Average Annual Daily Traffic

    DOT National Transportation Integrated Search

    2012-06-01

    This project presents five tools that were created for a geographical information system to estimate Annual Average Daily : Traffic using linear regression. Three of the tools can be used to prepare spatial data for linear regression. One tool can be...

  14. Modeling the relationship between photosynthetically active radiation and global horizontal irradiance using singular spectrum analysis

    NASA Astrophysics Data System (ADS)

    Zempila, Melina-Maria; Taylor, Michael; Bais, Alkiviadis; Kazadzis, Stelios

    2016-10-01

    We report on the construction of generic models to calculate photosynthetically active radiation (PAR) from global horizontal irradiance (GHI), and vice versa. Our study took place at stations of the Greek UV network (UVNET) and the Hellenic solar energy network (HNSE) with measurements from NILU-UV multi-filter radiometers and CM pyranometers, chosen due to their long (≈1 M record/site) high temporal resolution (≈1 min) record that captures a broad range of atmospheric environments and cloudiness conditions. The uncertainty of the PAR measurements is quantified to be ±6.5% while the uncertainty involved in GHI measurements is up to ≈±7% according to the manufacturer. We show how multi-linear regression and nonlinear neural network (NN) models, trained at a calibration site (Thessaloniki) can be made generic provided that the input-output time series are processed with multi-channel singular spectrum analysis (M-SSA). Without M-SSA, both linear and nonlinear models perform well only locally. M-SSA with 50 time-lags is found to be sufficient for identification of trend, periodic and noise components in aerosol, cloud parameters and irradiance, and to construct regularized noise models of PAR from GHI irradiances. Reconstructed PAR and GHI time series capture ≈95% of the variance of the cross-validated target measurements and have median absolute percentage errors <2%. The intra-site median absolute error of M-SSA processed models were ≈8.2±1.7 W/m2 for PAR and ≈9.2±4.2 W/m2 for GHI. When applying the models trained at Thessaloniki to other stations, the average absolute mean bias between the model estimates and measured values was found to be ≈1.2 W/m2 for PAR and ≈0.8 W/m2 for GHI. For the models, percentage errors are well within the uncertainty of the measurements at all sites. Generic NN models were found to perform marginally better than their linear counterparts.

  15. Estimating extent of mortality associated with the Douglas-fir beetle in the Central and Northern Rockies

    Treesearch

    Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini

    1999-01-01

    Data collected from Douglas-fir stands infected by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...

  16. [Prediction model of health workforce and beds in county hospitals of Hunan by multiple linear regression].

    PubMed

    Ling, Ru; Liu, Jiawang

    2011-12-01

    To construct prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan with stratified random sampling according to uniform questionnaires,and multiple linear regression analysis with 20 quotas selected by literature view was done. Independent variables in the multiple linear regression model on medical personnels in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the the population of aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory and fitting, and may be used for short- and mid-term forecasting.

  17. A new approach to correct the QT interval for changes in heart rate using a nonparametric regression model in beagle dogs.

    PubMed

    Watanabe, Hiroyuki; Miyazaki, Hiroyasu

    2006-01-01

    Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correlation models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.

  18. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    PubMed

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.

  19. Association between central sleep apnea and left ventricular structure: the Multi-Ethnic Study of Atherosclerosis.

    PubMed

    Javaheri, Sogol; Sharma, Ravi K; Bluemke, David A; Redline, Susan

    2017-08-01

    We assessed whether the presence of central sleep apnea is associated with adverse left ventricular structural changes. We analysed 1412 participants from the Multi-Ethnic Study of Atherosclerosis who underwent both overnight polysomnography and cardiac magnetic resonance imaging. Subjects had been recruited 10 years earlier when free of cardiovascular disease. Our main exposure is the presence of central sleep apnea as defined by central apnea-hypopnea index = 5 or the presence of Cheyne-Stokes breathing. Outcome variables were left ventricular mass/height, left ventricular ejection fraction, and left ventricular mass/volume ratio. Multivariate linear regression models adjusted for age, gender, race, waist circumference, tobacco use, hypertension, and the obstructive apnea-hypopnea index were fit for the outcomes. Of the 1412 participants, 27 (2%) individuals had central sleep apnea. After adjusting for covariates, the presence of central sleep apnea was significantly associated with elevated left ventricular mass/volume ratio (β = 0.11 ± 0.04 g mL -1 , P = 0.0071), an adverse cardiac finding signifying concentric remodelling. © 2017 European Sleep Research Society.

  20. Data Assimilation and Propagation of Uncertainty in Multiscale Cardiovascular Simulation

    NASA Astrophysics Data System (ADS)

    Schiavazzi, Daniele; Marsden, Alison

    2015-11-01

    Cardiovascular modeling is the application of computational tools to predict hemodynamics. State-of-the-art techniques couple a 3D incompressible Navier-Stokes solver with a boundary circulation model and can predict local and peripheral hemodynamics, analyze the post-operative performance of surgical designs and complement clinical data collection minimizing invasive and risky measurement practices. The ability of these tools to make useful predictions is directly related to their accuracy in representing measured physiologies. Tuning of model parameters is therefore a topic of paramount importance and should include clinical data uncertainty, revealing how this uncertainty will affect the predictions. We propose a fully Bayesian, multi-level approach to data assimilation of uncertain clinical data in multiscale circulation models. To reduce the computational cost, we use a stable, condensed approximation of the 3D model build by linear sparse regression of the pressure/flow rate relationship at the outlets. Finally, we consider the problem of non-invasively propagating the uncertainty in model parameters to the resulting hemodynamics and compare Monte Carlo simulation with Stochastic Collocation approaches based on Polynomial or Multi-resolution Chaos expansions.

  1. Thermal anomaly mapping from night MODIS imagery of USA, a tool for environmental assessment.

    PubMed

    Miliaresis, George Ch

    2013-02-01

    A method is presented for elevation, latitude and longitude decorrelation stretch of multi-temporal MODIS MYD11C3 imagery (monthly average night land surface temperature (LST) across USA and Mexico). Multiple linear regression analysis of principal components images (PCAs) quantifies the variance explained by elevation (H), latitude (LAT), and longitude (LON). The multi-temporal LST imagery is reconstructed from the residual images and selected PCAs by taking into account the portion of variance that is not related to H, LAT, LON. The reconstructed imagery presents the magnitude the standardized LST value per pixel deviates from the H, LAT, LON predicted. LST anomaly is defined as a region that presents either positive or negative reconstructed LST value. The environmental assessment of USA indicated that only for the 25 % of the study area (Mississippi drainage basin), the LST is predicted by the H, LAT, LON. Regions with milled climatic pattern were identified in the West Coast while the coldest climatic pattern is observed for Mid USA. Positive season invariant LST anomalies are identified in SW (Arizona, Sierra Nevada, etc.) and NE USA.

  2. ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches.

    PubMed

    Sharma, Ashok K; Srivastava, Gopal N; Roy, Ankita; Sharma, Vineet K

    2017-01-01

    The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84-0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better ( R 2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better ( R 2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules.

  3. ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches

    PubMed Central

    Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.

    2017-01-01

    The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969

  4. Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines.

    PubMed

    Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William

    2016-01-01

    Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biological meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.

  5. Large signal-to-noise ratio quantification in MLE for ARARMAX models

    NASA Astrophysics Data System (ADS)

    Zou, Yiqun; Tang, Xiafei

    2014-06-01

    It has been shown that closed-loop linear system identification by indirect method can be generally transferred to open-loop ARARMAX (AutoRegressive AutoRegressive Moving Average with eXogenous input) estimation. For such models, the gradient-related optimisation with large enough signal-to-noise ratio (SNR) can avoid the potential local convergence in maximum likelihood estimation. To ease the application of this condition, the threshold SNR needs to be quantified. In this paper, we build the amplitude coefficient which is an equivalence to the SNR and prove the finiteness of the threshold amplitude coefficient within the stability region. The quantification of threshold is achieved by the minimisation of an elaborately designed multi-variable cost function which unifies all the restrictions on the amplitude coefficient. The corresponding algorithm based on two sets of physically realisable system input-output data details the minimisation and also points out how to use the gradient-related method to estimate ARARMAX parameters when local minimum is present as the SNR is small. Then, the algorithm is tested on a theoretical AutoRegressive Moving Average with eXogenous input model for the derivation of the threshold and a gas turbine engine real system for model identification, respectively. Finally, the graphical validation of threshold on a two-dimensional plot is discussed.

  6. Chemical performance of multi-environment trials in lens (Lens culinaris M.).

    PubMed

    Karadavut, Ufuk; Palta, Cetin

    2010-01-15

    Genotype-environment (GE) interaction has been a major effect to determine stable lens (Lens culinaris (Medik.) Merr.) cultivars for chemical composition in Turkey. Utilization of the lines depends on their agronomic traits and stability of the chemical composition in diverse environments. The objectives of this study were: (i) to evaluate the influence of year and location on the chemical composition of lens genotypes; and (ii) to determine which cultivar is the most stable. Genotypes were evaluated over 3 years (2005, 2006 and 2007) at four locations in Turkey. Effects of year had the largest impact on all protein contents. GE interaction was analyzed by using linear regression techniques. Stability was estimated using the Eberhart and Russell method. 'Kişlik Kirmizi51' was the most stable cultivar for grain yield. The highest protein was obtained from 'Kişlik Kirmizi51' (4.6%) across environments. According to stability analysis, 'Firat 87' had the most stable chemical composition. This genotype had a regression coefficient (b(i) = 1) around unity, and deviations from regression values (delta(ij) = 0) around zero. Chemical composition was affected by year in this study. Temperature might have an effect on protein, oil, carbohydrate, fibre and ash. Firat 87 could be recommended for favourable environments. Copyright (c) 2009 Society of Chemical Industry.

  7. Sensitivity Analysis of Mechanical Parameters of Different Rock Layers to the Stability of Coal Roadway in Soft Rock Strata

    PubMed Central

    Zhao, Zeng-hui; Wang, Wei-ming; Gao, Xin; Yan, Ji-xing

    2013-01-01

    According to the geological characteristics of Xinjiang Ili mine in western area of China, a physical model of interstratified strata composed of soft rock and hard coal seam was established. Selecting the tunnel position, deformation modulus, and strength parameters of each layer as influencing factors, the sensitivity coefficient of roadway deformation to each parameter was firstly analyzed based on a Mohr-Columb strain softening model and nonlinear elastic-plastic finite element analysis. Then the effect laws of influencing factors which showed high sensitivity were further discussed. Finally, a regression model for the relationship between roadway displacements and multifactors was obtained by equivalent linear regression under multiple factors. The results show that the roadway deformation is highly sensitive to the depth of coal seam under the floor which should be considered in the layout of coal roadway; deformation modulus and strength of coal seam and floor have a great influence on the global stability of tunnel; on the contrary, roadway deformation is not sensitive to the mechanical parameters of soft roof; roadway deformation under random combinations of multi-factors can be deduced by the regression model. These conclusions provide theoretical significance to the arrangement and stability maintenance of coal roadway. PMID:24459447

  8. Spacebased Estimation of Moisture Transport in Marine Atmosphere Using Support Vector Regression

    NASA Technical Reports Server (NTRS)

    Xie, Xiaosu; Liu, W. Timothy; Tang, Benyang

    2007-01-01

    An improved algorithm is developed based on support vector regression (SVR) to estimate horizonal water vapor transport integrated through the depth of the atmosphere ((Theta)) over the global ocean from observations of surface wind-stress vector by QuikSCAT, cloud drift wind vector derived from the Multi-angle Imaging SpectroRadiometer (MISR) and geostationary satellites, and precipitable water from the Special Sensor Microwave/Imager (SSM/I). The statistical relation is established between the input parameters (the surface wind stress, the 850 mb wind, the precipitable water, time and location) and the target data ((Theta) calculated from rawinsondes and reanalysis of numerical weather prediction model). The results are validated with independent daily rawinsonde observations, monthly mean reanalysis data, and through regional water balance. This study clearly demonstrates the improvement of (Theta) derived from satellite data using SVR over previous data sets based on linear regression and neural network. The SVR methodology reduces both mean bias and standard deviation comparedwith rawinsonde observations. It agrees better with observations from synoptic to seasonal time scales, and compare more favorably with the reanalysis data on seasonal variations. Only the SVR result can achieve the water balance over South America. The rationale of the advantage by SVR method and the impact of adding the upper level wind will also be discussed.

  9. The Age in Swimming of Champions in World Championships (1994–2013) and Olympic Games (1992–2012): A Cross-Sectional Data Analysis

    PubMed Central

    Knechtle, Beat; Bragazzi, Nicola Luigi; König, Stefan; Nikolaidis, Pantelis Theodoros; Wild, Stefanie; Rosemann, Thomas; Rüst, Christoph Alexander

    2016-01-01

    (1) Background: We investigated the age of swimming champions in all strokes and race distances in World Championships (1994–2013) and Olympic Games (1992–2012); (2) Methods: Changes in age and swimming performance across calendar years for 412 Olympic and world champions were analysed using linear, non-linear, multi-level regression analyses and MultiLayer Perceptron (MLP); (3) Results: The age of peak swimming performance remained stable in most of all race distances for world champions and in all race distances for Olympic champions. Longer (i.e., 200 m and more) race distances were completed by younger (~20 years old for women and ~22 years old for men) champions than shorter (i.e., 50 m and 100 m) race distances (~22 years old for women and ~24 years old for men). There was a sex difference in the age of champions of ~2 years with a mean age of ~21 and ~23 years for women and men, respectively. Swimming performance improved in most race distances for world and Olympic champions with a larger trend of increase in Olympic champions; (4) Conclusion: Swimmers at younger ages (<20 years) may benefit from training and competing in longer race distances (i.e., 200 m and longer) before they change to shorter distances (i.e., 50 m and 100 m) when they become older (>22 years). PMID:29910265

  10. Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach

    NASA Astrophysics Data System (ADS)

    Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew

    2017-05-01

    This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.

  11. Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.

  12. Multi-Parameter Linear Least-Squares Fitting to Poisson Data One Count at a Time

    NASA Technical Reports Server (NTRS)

    Wheaton, W.; Dunklee, A.; Jacobson, A.; Ling, J.; Mahoney, W.; Radocinski, R.

    1993-01-01

    A standard problem in gamma-ray astronomy data analysis is the decomposition of a set of observed counts, described by Poisson statistics, according to a given multi-component linear model, with underlying physical count rates or fluxes which are to be estimated from the data.

  13. Liver fat contents, abdominal adiposity and insulin resistance in non-diabetic prevalent hemodialysis patients.

    PubMed

    Chen, Hung-Yuan; Lin, Chien-Chu; Chiu, Yen-Ling; Hsu, Shih-Ping; Pai, Mei-Fen; Yang, Ju-Yeh; Wu, Hon-Yen; Peng, Yu-Sen

    2014-01-01

    The liver fat contents and abdominal adiposity correlate well with insulin resistance (IR) in the general population. However, the relationship between liver fat content, abdominal adiposity and IR in non-diabetic hemodialysis (HD) patients remains unclear. This study aimed to clarify the associations among these factors. This is a cross-sectional, observational study. All patients received abdominal ultrasound for liver fat content. Abdominal adiposity was quantified with the conicity index (Ci) and waist circumference (WC). We checked the homeostasis model assessment for insulin resistance index (HOMA-IR) for IR. A total of 112 patients (60 women) were analyzed. Subjects with higher liver fat contents and WC had higher IR indices. But Ci did not correlate with IR indices. In both the multi-variable linear regression model and the logistic regression model, only higher liver fat content predicted a severe IR status. Liver fat contents have a remarkable correlation with IR; however, abdominal adiposity, measured either by Ci or WC, dose not independently correlate with IR in non-diabetic prevalent HD patients. © 2014 S. Karger AG, Basel.

  14. Comparison of conceptually based and regression rainfall-runoff models, Denver Metropolitan area, Colorado, and potential applications in urban areas

    USGS Publications Warehouse

    Lindner-Lunsford, J. B.; Ellis, S.R.

    1987-01-01

    Multievent, conceptually based models and a single-event, multiple linear-regression model for estimating storm-runoff quantity and quality from urban areas were calibrated and verified for four small (57 to 167 acres) basins in the Denver metropolitan area, Colorado. The basins represented different land-use types - light commercial, single-family housing, and multi-family housing. Both types of models were calibrated using the same data set for each basin. A comparison was made between the storm-runoff volume, peak flow, and storm-runoff loads of seven water quality constituents simulated by each of the models by use of identical verification data sets. The models studied were the U.S. Geological Survey 's Distributed Routing Rainfall-Runoff Model-Version II (DR3M-II) (a runoff-quantity model designed for urban areas), and a multievent urban runoff quality model (DR3M-QUAL). Water quality constituents modeled were chemical oxygen demand, total suspended solids, total nitrogen, total phosphorus, total lead, total manganese, and total zinc. (USGS)

  15. A 5-year scientometric analysis of research centers affiliated to Tehran University of Medical Sciences.

    PubMed

    Yazdani, Kamran; Rahimi-Movaghar, Afarin; Nedjat, Saharnaz; Ghalichi, Leila; Khalili, Malahat

    2015-01-01

    Since Tehran University of Medical Sciences (TUMS) has the oldest and highest number of research centers among all Iranian medical universities, this study was conducted to evaluate scientific output of research centers affiliated to Tehran University of Medical Sciences (TUMS) using scientometric indices and the affecting factors. Moreover, a number of scientometric indicators were introduced. This cross-sectional study was performed to evaluate a 5-year scientific performance of research centers of TUMS. Data were collected through questionnaires, annual evaluation reports of the Ministry of Health, and also from Scopus database. We used appropriate measures of central tendency and variation for descriptive analyses. Moreover, uni-and multi-variable linear regression were used to evaluate the effect of independent factors on the scientific output of the centers. The medians of the numbers of papers and books during a 5-year period were 150.5 and 2.5 respectively. The median of the "articles per researcher" was 19.1. Based on multiple linear regression, younger age centers (p=0.001), having a separate budget line (p=0.016), and number of research personnel (p<0.001) had a direct significant correlation with the number of articles while real properties had a reverse significant correlation with it (p=0.004). The results can help policy makers and research managers to allocate sufficient resources to improve current situation of the centers. Newly adopted and effective scientometric indices are is suggested to be used to evaluate scientific outputs and functions of these centers.

  16. Consensus for linear multi-agent system with intermittent information transmissions using the time-scale theory

    NASA Astrophysics Data System (ADS)

    Taousser, Fatima; Defoort, Michael; Djemai, Mohamed

    2016-01-01

    This paper investigates the consensus problem for linear multi-agent system with fixed communication topology in the presence of intermittent communication using the time-scale theory. Since each agent can only obtain relative local information intermittently, the proposed consensus algorithm is based on a discontinuous local interaction rule. The interaction among agents happens at a disjoint set of continuous-time intervals. The closed-loop multi-agent system can be represented using mixed linear continuous-time and linear discrete-time models due to intermittent information transmissions. The time-scale theory provides a powerful tool to combine continuous-time and discrete-time cases and study the consensus protocol under a unified framework. Using this theory, some conditions are derived to achieve exponential consensus under intermittent information transmissions. Simulations are performed to validate the theoretical results.

  17. Scoring and staging systems using cox linear regression modeling and recursive partitioning.

    PubMed

    Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H

    2006-01-01

    Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.

  18. Comparison of Linear and Non-linear Regression Analysis to Determine Pulmonary Pressure in Hyperthyroidism.

    PubMed

    Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan

    2017-01-01

    This study aimed at assessing the incidence of pulmonary hypertension (PH) at newly diagnosed hyperthyroid patients and at finding a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by using an echocardiographical method and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension the statistical method of comparing the values of arithmetical means is used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by linear or non-linear function. By applying the linear regression method described by a first-degree equation the line of regression (linear model) has been determined; by applying the non-linear regression method described by a second degree equation, a parabola-type curve of regression (non-linear or polynomial model) has been determined. We made the comparison and the validation of these two models by calculating the determination coefficient (criterion 1), the comparison of residuals (criterion 2), application of AIC criterion (criterion 3) and use of F-test (criterion 4). From the H-group, 47% have pulmonary hypertension completely reversible when obtaining euthyroidism. The factors causing pulmonary hypertension were identified: previously known- level of free thyroxin, pulmonary vascular resistance, cardiac output; new factors identified in this study- pretreatment period, age, systolic blood pressure. According to the four criteria and to the clinical judgment, we consider that the polynomial model (graphically parabola- type) is better than the linear one. The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second degree where the parabola is its graphical representation.

  19. Harmonic regression based multi-temporal cloud filtering algorithm for Landsat 8

    NASA Astrophysics Data System (ADS)

    Joshi, P.

    2015-12-01

    Landsat data archive though rich is seen to have missing dates and periods owing to the weather irregularities and inconsistent coverage. The satellite images are further subject to cloud cover effects resulting in erroneous analysis and observations of ground features. In earlier studies the change detection algorithm using statistical control charts on harmonic residuals of multi-temporal Landsat 5 data have been shown to detect few prominent remnant clouds [Brooks, Evan B., et al, 2014]. So, in this work we build on this harmonic regression approach to detect and filter clouds using a multi-temporal series of Landsat 8 images. Firstly, we compute the harmonic coefficients using the fitting models on annual training data. This time series of residuals is further subjected to Shewhart X-bar control charts which signal the deviations of cloud points from the fitted multi-temporal fourier curve. For the process with standard deviation σ we found the second and third order harmonic regression with a x-bar chart control limit [Lσ] ranging between [0.5σ < Lσ < σ] as most efficient in detecting clouds. By implementing second order harmonic regression with successive x-bar chart control limits of L and 0.5 L on the NDVI, NDSI and haze optimized transformation (HOT), and utilizing the seasonal physical properties of these parameters, we have designed a novel multi-temporal algorithm for filtering clouds from Landsat 8 images. The method is applied to Virginia and Alabama in Landsat8 UTM zones 17 and 16 respectively. Our algorithm efficiently filters all types of cloud cover with an overall accuracy greater than 90%. As a result of the multi-temporal operation and the ability to recreate the multi-temporal database of images using only the coefficients of the fourier regression, our algorithm is largely storage and time efficient. The results show a good potential for this multi-temporal approach for cloud detection as a timely and targeted solution for the Landsat 8 research community, catering to the need for innovative processing solutions in the infant stage of the satellite.

  20. Robust Face Recognition via Multi-Scale Patch-Based Matrix Regression.

    PubMed

    Gao, Guangwei; Yang, Jian; Jing, Xiaoyuan; Huang, Pu; Hua, Juliang; Yue, Dong

    2016-01-01

    In many real-world applications such as smart card solutions, law enforcement, surveillance and access control, the limited training sample size is the most fundamental problem. By making use of the low-rank structural information of the reconstructed error image, the so-called nuclear norm-based matrix regression has been demonstrated to be effective for robust face recognition with continuous occlusions. However, the recognition performance of nuclear norm-based matrix regression degrades greatly in the face of the small sample size problem. An alternative solution to tackle this problem is performing matrix regression on each patch and then integrating the outputs from all patches. However, it is difficult to set an optimal patch size across different databases. To fully utilize the complementary information from different patch scales for the final decision, we propose a multi-scale patch-based matrix regression scheme based on which the ensemble of multi-scale outputs can be achieved optimally. Extensive experiments on benchmark face databases validate the effectiveness and robustness of our method, which outperforms several state-of-the-art patch-based face recognition algorithms.

  1. Correcting for population structure and kinship using the linear mixed model: theory and extensions.

    PubMed

    Hoffman, Gabriel E

    2013-01-01

    Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.

  2. Cortical Decoding of Individual Finger and Wrist Kinematics for an Upper-Limb Neuroprosthesis

    PubMed Central

    Aggarwal, Vikram; Tenore, Francesco; Acharya, Soumyadipta; Schieber, Marc H.; Thakor, Nitish V.

    2010-01-01

    Previous research has shown that neuronal activity can be used to continuously decode the kinematics of gross movements involving arm and hand trajectory. However, decoding the kinematics of fine motor movements, such as the manipulation of individual fingers, has not been demonstrated. In this study, single unit activities were recorded from task-related neurons in M1 of two trained rhesus monkey as they performed individuated movements of the fingers and wrist. The primates’ hand was placed in a manipulandum, and strain gauges at the tips of each finger were used to track the digit’s position. Both linear and non-linear filters were designed to simultaneously predict kinematics of each digit and the wrist, and their performance compared using mean squared error and correlation coefficients. All models had high decoding accuracy, but the feedforward ANN (R=0.76–0.86, MSE=0.04–0.05) and Kalman filter (R=0.68–0.86, MSE=0.04–0.07) performed better than a simple linear regression filter (0.58–0.81, 0.05–0.07). These results suggest that individual finger and wrist kinematics can be decoded with high accuracy, and be used to control a multi-fingered prosthetic hand in real-time. PMID:19964645

  3. A simple method for HPLC retention time prediction: linear calibration using two reference substances.

    PubMed

    Sun, Lei; Jin, Hong-Yu; Tian, Run-Tao; Wang, Ming-Juan; Liu, Li-Na; Ye, Liu-Ping; Zuo, Tian-Tian; Ma, Shuang-Cheng

    2017-01-01

    Analysis of related substances in pharmaceutical chemicals and multi-components in traditional Chinese medicines needs bulk of reference substances to identify the chromatographic peaks accurately. But the reference substances are costly. Thus, the relative retention (RR) method has been widely adopted in pharmacopoeias and literatures for characterizing HPLC behaviors of those reference substances unavailable. The problem is it is difficult to reproduce the RR on different columns due to the error between measured retention time (t R ) and predicted t R in some cases. Therefore, it is useful to develop an alternative and simple method for prediction of t R accurately. In the present study, based on the thermodynamic theory of HPLC, a method named linear calibration using two reference substances (LCTRS) was proposed. The method includes three steps, procedure of two points prediction, procedure of validation by multiple points regression and sequential matching. The t R of compounds on a HPLC column can be calculated by standard retention time and linear relationship. The method was validated in two medicines on 30 columns. It was demonstrated that, LCTRS method is simple, but more accurate and more robust on different HPLC columns than RR method. Hence quality standards using LCTRS method are easy to reproduce in different laboratories with lower cost of reference substances.

  4. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    EPA Science Inventory

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  5. A simplified competition data analysis for radioligand specific activity determination.

    PubMed

    Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A

    1990-01-01

    Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superposable. The non-linear regression method is considered to be the most adequate procedure to calculate the specific activity, but if its software is not available, the other described methods are also suitable.

  6. Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions

    PubMed Central

    Fernandes, Bruno J. T.; Roque, Alexandre

    2018-01-01

    Height and weight are measurements explored to tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, it can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations which coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than that obtained with conventional linear regressions. In all the cases, the predictions are non-sensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366

  7. Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

    NASA Astrophysics Data System (ADS)

    Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

    2009-08-01

    In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.

  8. Improving Prediction Accuracy for WSN Data Reduction by Applying Multivariate Spatio-Temporal Correlation

    PubMed Central

    Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman

    2011-01-01

    This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626

  9. Thyra Abstract Interface Package

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bartlett, Roscoe A.

    2005-09-01

    Thrya primarily defines a set of abstract C++ class interfaces needed for the development of abstract numerical atgorithms (ANAs) such as iterative linear solvers, transient solvers all the way up to optimization. At the foundation of these interfaces are abstract C++ classes for vectors, vector spaces, linear operators and multi-vectors. Also included in the Thyra package is C++ code for creating concrete vector, vector space, linear operator, and multi-vector subclasses as well as other utilities to aid in the development of ANAs. Currently, very general and efficient concrete subclass implementations exist for serial and SPMD in-core vectors and multi-vectors. Codemore » also currently exists for testing objects and providing composite objects such as product vectors.« less

  10. Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients

    NASA Astrophysics Data System (ADS)

    Gorgees, HazimMansoor; Mahdi, FatimahAssim

    2018-05-01

    This article concerns with comparing the performance of different types of ordinary ridge regression estimators that have been already proposed to estimate the regression parameters when the near exact linear relationships among the explanatory variables is presented. For this situations we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than other methods since it has smaller mean square error (MSE) than the other stated methods.

  11. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    PubMed

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  12. Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.

    PubMed

    Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong

    2017-01-01

    This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier-linear regression classification. Our method selects one axial slice from 3D brain image, and employed pseudo Zernike moment with maximum order of 15 to extract 256 features from each image. Finally, linear regression classification was harnessed as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  13. Re-Mediating Classroom Activity with a Non-Linear, Multi-Display Presentation Tool

    ERIC Educational Resources Information Center

    Bligh, Brett; Coyle, Do

    2013-01-01

    This paper uses an Activity Theory framework to evaluate the use of a novel, multi-screen, non-linear presentation tool. The Thunder tool allows presenters to manipulate and annotate multiple digital slides and to concurrently display a selection of juxtaposed resources across a wall-sized projection area. Conventional, single screen presentation…

  14. Detailed emission profiles for on-road vehicles derived from ambient measurements during a windless traffic episode in Baltimore using a multi-model approach

    NASA Astrophysics Data System (ADS)

    Ke, Haohao; Ondov, John M.; Rogge, Wolfgang F.

    2013-12-01

    Composite chemical profiles of motor vehicle emissions were extracted from ambient measurements at a near-road site in Baltimore during a windless traffic episode in November, 2002, using four independent approaches, i.e., simple peak analysis, windless model-based linear regression, PMF, and UNMIX. Although the profiles are in general agreement, the windless-model-based profile treatment more effectively removes interference from non-traffic sources and is deemed to be more accurate for many species. In addition to abundances of routine pollutants (e.g., NOx, CO, PM2.5, EC, OC, sulfate, and nitrate), 11 particle-bound metals and 51 individual traffic-related organic compounds (including n-alkanes, PAHs, oxy-PAHs, hopanes, alkylcyclohexanes, and others) were included in the modeling.

  15. Mechanisms and kinetics models for ultrasonic waste activated sludge disintegration.

    PubMed

    Wang, Fen; Wang, Yong; Ji, Min

    2005-08-31

    Ultrasonic energy can be applied as pre-treatment to disintegrate sludge flocs and disrupt bacterial cells' walls, and the hydrolysis can be improved, so that the rate of sludge digestion and methane production is improved. In this paper, by adding NaHCO3 to mask the oxidizing effect of OH, the mechanisms of disintegration are investigated. In addition, kinetics models for ultrasonic sludge disintegration are established by applying multi-variable linear regression method. It has been found that hydro-mechanical shear forces predominantly responsible for the disintegration, and the contribution of oxidizing effect of OH increases with the amount of the ultrasonic density and ultrasonic intensity. It has also been inferred from the kinetics model which dependent variable is SCOD+ that both sludge pH and sludge concentration significantly affect the disintegration.

  16. PM10 source apportionment in Milan (Italy) using time-resolved data.

    PubMed

    Bernardoni, Vera; Vecchi, Roberta; Valli, Gianluigi; Piazzalunga, Andrea; Fermo, Paola

    2011-10-15

    In this work Positive Matrix Factorization (PMF) was applied to 4-hour resolved PM10 data collected in Milan (Italy) during summer and winter 2006. PM10 characterisation included elements (Mg-Pb), main inorganic ions (NH(4)(+), NO(3)(-), SO(4)(2-)), levoglucosan and its isomers (mannosan and galactosan), and organic and elemental carbon (OC and EC). PMF resolved seven factors that were assigned to construction works, re-suspended dust, secondary sulphate, traffic, industry, secondary nitrate, and wood burning. Multi Linear Regression was applied to obtain the PM10 source apportionment. The 4-hour temporal resolution allowed the estimation of the factor contributions during peculiar episodes, which would have not been detected with the traditional 24-hour sampling strategy. Copyright © 2011 Elsevier B.V. All rights reserved.

  17. A simple bias correction in linear regression for quantitative trait association under two-tail extreme selection.

    PubMed

    Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C

    2011-09-01

    Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias.

  18. A road map for multi-way calibration models.

    PubMed

    Escandar, Graciela M; Olivieri, Alejandro C

    2017-08-07

    A large number of experimental applications of multi-way calibration are known, and a variety of chemometric models are available for the processing of multi-way data. While the main focus has been directed towards three-way data, due to the availability of various instrumental matrix measurements, a growing number of reports are being produced on order signals of increasing complexity. The purpose of this review is to present a general scheme for selecting the appropriate data processing model, according to the properties exhibited by the multi-way data. In spite of the complexity of the multi-way instrumental measurements, simple criteria can be proposed for model selection, based on the presence and number of the so-called multi-linearity breaking modes (instrumental modes that break the low-rank multi-linearity of the multi-way arrays), and also on the existence of mutually dependent instrumental modes. Recent literature reports on multi-way calibration are reviewed, with emphasis on the models that were selected for data processing.

  19. A Common Mechanism for Resistance to Oxime Reactivation of Acetylcholinesterase Inhibited by Organophosphorus Compounds

    DTIC Science & Technology

    2013-01-01

    application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal

  20. Effect of linear and non-linear blade modelling techniques on simulated fatigue and extreme loads using Bladed

    NASA Astrophysics Data System (ADS)

    Beardsell, Alec; Collier, William; Han, Tao

    2016-09-01

    There is a trend in the wind industry towards ever larger and more flexible turbine blades. Blade tip deflections in modern blades now commonly exceed 10% of blade length. Historically, the dynamic response of wind turbine blades has been analysed using linear models of blade deflection which include the assumption of small deflections. For modern flexible blades, this assumption is becoming less valid. In order to continue to simulate dynamic turbine performance accurately, routine use of non-linear models of blade deflection may be required. This can be achieved by representing the blade as a connected series of individual flexible linear bodies - referred to in this paper as the multi-part approach. In this paper, Bladed is used to compare load predictions using single-part and multi-part blade models for several turbines. The study examines the impact on fatigue and extreme loads and blade deflection through reduced sets of load calculations based on IEC 61400-1 ed. 3. Damage equivalent load changes of up to 16% and extreme load changes of up to 29% are observed at some turbine load locations. It is found that there is no general pattern in the loading differences observed between single-part and multi-part blade models. Rather, changes in fatigue and extreme loads with a multi-part blade model depend on the characteristics of the individual turbine and blade. Key underlying causes of damage equivalent load change are identified as differences in edgewise- torsional coupling between the multi-part and single-part models, and increased edgewise rotor mode damping in the multi-part model. Similarly, a causal link is identified between torsional blade dynamics and changes in ultimate load results.

  1. Describing three-class task performance: three-class linear discriminant analysis and three-class ROC analysis

    NASA Astrophysics Data System (ADS)

    He, Xin; Frey, Eric C.

    2007-03-01

    Binary ROC analysis has solid decision-theoretic foundations and a close relationship to linear discriminant analysis (LDA). In particular, for the case of Gaussian equal covariance input data, the area under the ROC curve (AUC) value has a direct relationship to the Hotelling trace. Many attempts have been made to extend binary classification methods to multi-class. For example, Fukunaga extended binary LDA to obtain multi-class LDA, which uses the multi-class Hotelling trace as a figure-of-merit, and we have previously developed a three-class ROC analysis method. This work explores the relationship between conventional multi-class LDA and three-class ROC analysis. First, we developed a linear observer, the three-class Hotelling observer (3-HO). For Gaussian equal covariance data, the 3- HO provides equivalent performance to the three-class ideal observer and, under less strict conditions, maximizes the signal to noise ratio for classification of all pairs of the three classes simultaneously. The 3-HO templates are not the eigenvectors obtained from multi-class LDA. Second, we show that the three-class Hotelling trace, which is the figureof- merit in the conventional three-class extension of LDA, has significant limitations. Third, we demonstrate that, under certain conditions, there is a linear relationship between the eigenvectors obtained from multi-class LDA and 3-HO templates. We conclude that the 3-HO based on decision theory has advantages both in its decision theoretic background and in the usefulness of its figure-of-merit. Additionally, there exists the possibility of interpreting the two linear features extracted by the conventional extension of LDA from a decision theoretic point of view.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jing, Yaqi; Meng, Qinghao, E-mail: qh-meng@tju.edu.cn; Qi, Peifeng

    An electronic nose (e-nose) was designed to classify Chinese liquors of the same aroma style. A new method of feature reduction which combined feature selection with feature extraction was proposed. Feature selection method used 8 feature-selection algorithms based on information theory and reduced the dimension of the feature space to 41. Kernel entropy component analysis was introduced into the e-nose system as a feature extraction method and the dimension of feature space was reduced to 12. Classification of Chinese liquors was performed by using back propagation artificial neural network (BP-ANN), linear discrimination analysis (LDA), and a multi-linear classifier. The classificationmore » rate of the multi-linear classifier was 97.22%, which was higher than LDA and BP-ANN. Finally the classification of Chinese liquors according to their raw materials and geographical origins was performed using the proposed multi-linear classifier and classification rate was 98.75% and 100%, respectively.« less

  3. mPLR-Loc: an adaptive decision multi-label classifier based on penalized logistic regression for protein subcellular localization prediction.

    PubMed

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2015-03-15

    Proteins located in appropriate cellular compartments are of paramount importance to exert their biological functions. Prediction of protein subcellular localization by computational methods is required in the post-genomic era. Recent studies have been focusing on predicting not only single-location proteins but also multi-location proteins. However, most of the existing predictors are far from effective for tackling the challenges of multi-label proteins. This article proposes an efficient multi-label predictor, namely mPLR-Loc, based on penalized logistic regression and adaptive decisions for predicting both single- and multi-location proteins. Specifically, for each query protein, mPLR-Loc exploits the information from the Gene Ontology (GO) database by using its accession number (AC) or the ACs of its homologs obtained via BLAST. The frequencies of GO occurrences are used to construct feature vectors, which are then classified by an adaptive decision-based multi-label penalized logistic regression classifier. Experimental results based on two recent stringent benchmark datasets (virus and plant) show that mPLR-Loc remarkably outperforms existing state-of-the-art multi-label predictors. In addition to being able to rapidly and accurately predict subcellular localization of single- and multi-label proteins, mPLR-Loc can also provide probabilistic confidence scores for the prediction decisions. For readers' convenience, the mPLR-Loc server is available online (http://bioinfo.eie.polyu.edu.hk/mPLRLocServer). Copyright © 2014 Elsevier Inc. All rights reserved.

  4. Specialization Agreements in the Council for Mutual Economic Assistance

    DTIC Science & Technology

    1988-02-01

    proportions to stabilize variance (S. Weisberg, Applied Linear Regression , 2nd ed., John Wiley & Sons, New York, 1985, p. 134). If the dependent...27, 1986, p. 3. Weisberg, S., Applied Linear Regression , 2nd ed., John Wiley & Sons, New York, 1985, p. 134. Wiles, P. J., Communist International

  5. Radio Propagation Prediction Software for Complex Mixed Path Physical Channels

    DTIC Science & Technology

    2006-08-14

    63 4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz 69 4.4.7. Projected Scaling to...4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz In order to construct a comprehensive numerical algorithm capable of

  6. INTRODUCTION TO A COMBINED MULTIPLE LINEAR REGRESSION AND ARMA MODELING APPROACH FOR BEACH BACTERIA PREDICTION

    EPA Science Inventory

    Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...

  7. Data Transformations for Inference with Linear Regression: Clarifications and Recommendations

    ERIC Educational Resources Information Center

    Pek, Jolynn; Wong, Octavia; Wong, C. M.

    2017-01-01

    Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…

  8. USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES

    EPA Science Inventory

    The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...

  9. Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Camilleri, Liberato; Cefai, Carmel

    2013-01-01

    Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…

  10. Simple and multiple linear regression: sample size considerations.

    PubMed

    Hanley, James A

    2016-11-01

    The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.

  11. Entanglement of Multi-qudit States Constructed by Linearly Independent Coherent States: Balanced Case

    NASA Astrophysics Data System (ADS)

    Najarbashi, G.; Mirzaei, S.

    2016-03-01

    Multi-mode entangled coherent states are important resources for linear optics quantum computation and teleportation. Here we introduce the generalized balanced N-mode coherent states which recast in the multi-qudit case. The necessary and sufficient condition for bi-separability of such balanced N-mode coherent states is found. We particularly focus on pure and mixed multi-qubit and multi-qutrit like states and examine the degree of bipartite as well as tripartite entanglement using the concurrence measure. Unlike the N-qubit case, it is shown that there are qutrit states violating monogamy inequality. Using parity, displacement operator and beam splitters, we will propose a scheme for generating balanced N-mode entangled coherent states for even number of terms in superposition.

  12. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression

    PubMed Central

    Jiang, Feng; Han, Ji-zhong

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods. PMID:29623088

  13. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression.

    PubMed

    Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.

  14. A comparative study of linear and nonlinear MIMO feedback configurations

    NASA Technical Reports Server (NTRS)

    Desoer, C. A.; Lin, C. A.

    1984-01-01

    In this paper, a comparison is conducted of several feedback configurations which have appeared in the literature (e.g. unity-feedback, model-reference, etc.). The linear time-invariant multi-input multi-output case is considered. For each configuration, the stability conditions are specified, the relation between achievable I/O maps and the achievable disturbance-to-output maps is examined, and the effect of various subsystem perturbations on the system performance is studied. In terms of these considerations, it is demonstrated that one of the configurations considered is better than all the others. The results are then extended to the nonlinear multi-input multi-output case.

  15. An implemented method of asymmetric transmission for arbitrary polarization base in multi-layered chiral meta-surface

    NASA Astrophysics Data System (ADS)

    Xiao, Zhong-yin; Zou, Huan-ling; Xu, Kai-Kai; Tang, Jing-yao

    2018-03-01

    Asymmetric transmission of linearly or circularly polarized waves is a well-established property not only for three-layered chiral structures but for multi-layered ones. Here we show a method which can simultaneously implement asymmetric transmission for arbitrary base vector polarized wave in multi-layered chiral meta-surface. We systematically study the implemented method based on a multi-layered chiral structure consisting of a y-shape, a half gammadion and an S-shape in the terahertz gap. A numerical simulation was carried out, followed by an explanation of the asymmetric transmission mechanism in these structures proposed in this work. The simulated results indicate that the multi-layered chiral structure can realize a maximum asymmetric transmission of 0.89 and 0.28 for circularly and linearly polarized waves, respectively, which exhibit magnitude improvement over previous chiral metamaterials. Specifically, the maximum asymmetric transmitted coefficient of the multi-layered chiral structure is insensitivity to the incident angles from 0° to 45° for circularly polarized components. Additionally, we also study the influence of structural parameters on the asymmetric transmission effect for both linearly and circularly polarized waves in detail.

  16. Statistical linearization for multi-input/multi-output nonlinearities

    NASA Technical Reports Server (NTRS)

    Lin, Ching-An; Cheng, Victor H. L.

    1991-01-01

    Formulas are derived for the computation of the random input-describing functions for MIMO nonlinearities; these straightforward and rigorous derivations are based on the optimal mean square linear approximation. The computations involve evaluations of multiple integrals. It is shown that, for certain classes of nonlinearities, multiple-integral evaluations are obviated and the computations are significantly simplified.

  17. Analysis of Binary Adherence Data in the Setting of Polypharmacy: A Comparison of Different Approaches

    PubMed Central

    Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.

    2009-01-01

    Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358

  18. Genetic Programming Transforms in Linear Regression Situations

    NASA Astrophysics Data System (ADS)

    Castillo, Flor; Kordon, Arthur; Villa, Carlos

    The chapter summarizes the use of Genetic Programming (GP) inMultiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transforms selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.

  19. Naval Research Logistics Quarterly. Volume 28. Number 3,

    DTIC Science & Technology

    1981-09-01

    denotes component-wise maximum. f has antone (isotone) differences on C x D if for cl < c2 and d, < d2, NAVAL RESEARCH LOGISTICS QUARTERLY VOL. 28...or negative correlations and linear or nonlinear regressions. Given are the mo- ments to order two and, for special cases, (he regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions

  20. Automating approximate Bayesian computation by local linear regression.

    PubMed

    Thornton, Kevin R

    2009-07-07

    In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular.Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, the ABCreg simplifies implementing ABC based on local-linear regression.

  1. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    NASA Astrophysics Data System (ADS)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  2. Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.

    PubMed

    Haoliang Yuan; Yuan Yan Tang

    2017-04-01

    Classification of the pixels in hyperspectral image (HSI) is an important task and has been popularly applied in many practical applications. Its major challenge is the high-dimensional small-sized problem. To deal with this problem, lots of subspace learning (SL) methods are developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Comparing with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, which is formed by original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas demonstrate that our proposed methods outperform many SL methods.

  3. Simple linear and multivariate regression models.

    PubMed

    Rodríguez del Águila, M M; Benítez-Parejo, N

    2011-01-01

    In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.

  4. Distributed collaborative probabilistic design of multi-failure structure with fluid-structure interaction using fuzzy neural network of regression

    NASA Astrophysics Data System (ADS)

    Song, Lu-Kai; Wen, Jie; Fei, Cheng-Wei; Bai, Guang-Chen

    2018-05-01

    To improve the computing efficiency and precision of probabilistic design for multi-failure structure, a distributed collaborative probabilistic design method-based fuzzy neural network of regression (FR) (called as DCFRM) is proposed with the integration of distributed collaborative response surface method and fuzzy neural network regression model. The mathematical model of DCFRM is established and the probabilistic design idea with DCFRM is introduced. The probabilistic analysis of turbine blisk involving multi-failure modes (deformation failure, stress failure and strain failure) was investigated by considering fluid-structure interaction with the proposed method. The distribution characteristics, reliability degree, and sensitivity degree of each failure mode and overall failure mode on turbine blisk are obtained, which provides a useful reference for improving the performance and reliability of aeroengine. Through the comparison of methods shows that the DCFRM reshapes the probability of probabilistic analysis for multi-failure structure and improves the computing efficiency while keeping acceptable computational precision. Moreover, the proposed method offers a useful insight for reliability-based design optimization of multi-failure structure and thereby also enriches the theory and method of mechanical reliability design.

  5. Variance approach for multi-objective linear programming with fuzzy random of objective function coefficients

    NASA Astrophysics Data System (ADS)

    Indarsih, Indrati, Ch. Rini

    2016-02-01

    In this paper, we define variance of the fuzzy random variables through alpha level. We have a theorem that can be used to know that the variance of fuzzy random variables is a fuzzy number. We have a multi-objective linear programming (MOLP) with fuzzy random of objective function coefficients. We will solve the problem by variance approach. The approach transform the MOLP with fuzzy random of objective function coefficients into MOLP with fuzzy of objective function coefficients. By weighted methods, we have linear programming with fuzzy coefficients and we solve by simplex method for fuzzy linear programming.

  6. Optimization of isotherm models for pesticide sorption on biopolymer-nanoclay composite by error analysis.

    PubMed

    Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M

    2017-04-01

    A carboxy methyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt % dimethyl dialkyl (C 14 -C 18 ) amine (DMDA)) composite was prepared by solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-Ray diffraction spectroscopy (XRD) and scanning electron microscopy (SEM). The composite was utilized for its pesticide sorption efficiency for atrazine, imidacloprid and thiamethoxam. The sorption data was fitted into Langmuir and Freundlich isotherms using linear and non linear methods. The linear regression method suggested best fitting of sorption data into Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by non linear regression method. The non linear error analysis suggested that the sorption data fitted well into Langmuir model rather than in Freundlich model. The maximum sorption capacity, Q 0 (μg/g) was given by imidacloprid (2000) followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the degree of determination of linear regression alone cannot be used for comparing the best fitting of Langmuir and Freundlich models and non-linear error analysis needs to be done to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Prenatal Phthalate, Perfluoroalkyl Acid, and Organochlorine Exposures and Term Birth Weight in Three Birth Cohorts: Multi-Pollutant Models Based on Elastic Net Regression

    PubMed Central

    Lenters, Virissa; Portengen, Lützen; Rignell-Hydbom, Anna; Jönsson, Bo A.G.; Lindh, Christian H.; Piersma, Aldert H.; Toft, Gunnar; Bonde, Jens Peter; Heederik, Dick; Rylander, Lars; Vermeulen, Roel

    2015-01-01

    Background Some legacy and emerging environmental contaminants are suspected risk factors for intrauterine growth restriction. However, the evidence is equivocal, in part due to difficulties in disentangling the effects of mixtures. Objectives We assessed associations between multiple correlated biomarkers of environmental exposure and birth weight. Methods We evaluated a cohort of 1,250 term (≥ 37 weeks gestation) singleton infants, born to 513 mothers from Greenland, 180 from Poland, and 557 from Ukraine, who were recruited during antenatal care visits in 2002‒2004. Secondary metabolites of diethylhexyl and diisononyl phthalates (DEHP, DiNP), eight perfluoroalkyl acids, and organochlorines (PCB-153 and p,p´-DDE) were quantifiable in 72‒100% of maternal serum samples. We assessed associations between exposures and term birth weight, adjusting for co-exposures and covariates, including prepregnancy body mass index. To identify independent associations, we applied the elastic net penalty to linear regression models. Results Two phthalate metabolites (MEHHP, MOiNP), perfluorooctanoic acid (PFOA), and p,p´-DDE were most consistently predictive of term birth weight based on elastic net penalty regression. In an adjusted, unpenalized regression model of the four exposures, 2-SD increases in natural log–transformed MEHHP, PFOA, and p,p´-DDE were associated with lower birth weight: –87 g (95% CI: –137, –340 per 1.70 ng/mL), –43 g (95% CI: –108, 23 per 1.18 ng/mL), and –135 g (95% CI: –192, –78 per 1.82 ng/g lipid), respectively; and MOiNP was associated with higher birth weight (46 g; 95% CI: –5, 97 per 2.22 ng/mL). Conclusions This study suggests that several of the environmental contaminants, belonging to three chemical classes, may be independently associated with impaired fetal growth. These results warrant follow-up in other cohorts. Citation Lenters V, Portengen L, Rignell-Hydbom A, Jönsson BA, Lindh CH, Piersma AH, Toft G, Bonde JP, Heederik D, Rylander L, Vermeulen R. 2016. Prenatal phthalate, perfluoroalkyl acid, and organochlorine exposures and term birth weight in three birth cohorts: multi-pollutant models based on elastic net regression. Environ Health Perspect 124:365–372; http://dx.doi.org/10.1289/ehp.1408933 PMID:26115335

  8. London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure

    PubMed Central

    Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith

    2017-01-01

    Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343

  9. Feedback linearizing control of a MIMO power system

    NASA Astrophysics Data System (ADS)

    Ilyes, Laszlo

    Prior research has demonstrated that either the mechanical or electrical subsystem of a synchronous electric generator may be controlled using single-input single-output (SISO) nonlinear feedback linearization. This research suggests a new approach which applies nonlinear feedback linearization to a multi-input multi-output (MIMO) model of the synchronous electric generator connected to an infinite bus load model. In this way, the electrical and mechanical subsystems may be linearized and simultaneously decoupled through the introduction of a pair of auxiliary inputs. This allows well known, linear, SISO control methods to be effectively applied to the resulting systems. The derivation of the feedback linearizing control law is presented in detail, including a discussion on the use of symbolic math processing as a development tool. The linearizing and decoupling properties of the control law are validated through simulation. And finally, the robustness of the control law is demonstrated.

  10. Using Parametric Cost Models to Estimate Engineering and Installation Costs of Selected Electronic Communications Systems

    DTIC Science & Technology

    1994-09-01

    Institute of Technology, Wright- Patterson AFB OH, January 1994. 4. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 5...Technology, Wright-Patterson AFB OH 5 April 1994. 29. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 30. Office of

  11. An Evaluation of the Automated Cost Estimating Integrated Tools (ACEIT) System

    DTIC Science & Technology

    1989-09-01

    residual and it is described as the residual divided by its standard deviation (13:App A,17). Neter, Wasserman, and Kutner, in Applied Linear Regression Models...others. Applied Linear Regression Models. Homewood IL: Irwin, 1983. 19. Raduchel, William J. "A Professional’s Perspective on User-Friendliness," Byte

  12. A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants

    ERIC Educational Resources Information Center

    Cooper, Paul D.

    2010-01-01

    A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

  13. Conjoint Analysis: A Study of the Effects of Using Person Variables.

    ERIC Educational Resources Information Center

    Fraas, John W.; Newman, Isadore

    Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…

  14. Fitting program for linear regressions according to Mahon (1996)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trappitsch, Reto G.

    2018-01-09

    This program takes the users' Input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple Interface.

  15. How Robust Is Linear Regression with Dummy Variables?

    ERIC Educational Resources Information Center

    Blankmeyer, Eric

    2006-01-01

    Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…

  16. Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method

    ERIC Educational Resources Information Center

    Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Su¨lzle, Detlev

    2018-01-01

    The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…

  17. An Introduction to Graphical and Mathematical Methods for Detecting Heteroscedasticity in Linear Regression.

    ERIC Educational Resources Information Center

    Thompson, Russel L.

    Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of homoscedasticity and types of homoscedasticity are discussed, and methods for correction are…

  18. Common pitfalls in statistical analysis: Linear regression analysis

    PubMed Central

    Aggarwal, Rakesh; Ranganathan, Priya

    2017-01-01

    In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022

  19. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    PubMed

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.

  20. Changes in performance: a 5-year longitudinal study of participants in a multi-source feedback programme.

    PubMed

    Violato, Claudio; Lockyer, Jocelyn M; Fidler, Herta

    2008-10-01

    Multi-source feedback (MSF) enables performance data to be provided to doctors from patients, co-workers and medical colleagues. This study examined the evidence for the validity of MSF instruments for general practice, investigated changes in performance for doctors who participated twice, 5 years apart, and determined the association between change in performance and initial assessment and socio-demographic characteristics. Data for 250 doctors included three datasets per doctor from, respectively, 25 patients, eight co-workers and eight medical colleagues, collected on two occasions. There was high internal consistency (alpha > 0.90) and adequate generalisability (Ep(2) > 0.70). D study results indicate adequate generalisability coefficients for groups of eight assessors (medical colleagues, co-workers) and 25 patient surveys. Confirmatory factor analyses provided evidence for the validity of factors that were theoretically expected, meaningful and cohesive. Comparative fit indices were 0.91 for medical colleague data, 0.87 for co-worker data and 0.81 for patient data. Paired t-test analysis showed significant change between the two assessments from medical colleagues and co-workers, but not between the two patient surveys. Multiple linear regressions explained 2.1% of the variance at time 2 for medical colleagues, 21.4% of the variance for co-workers and 16.35% of the variance for patient assessments, with professionalism a key variable in all regressions. There is evidence for the construct validity of the instruments and for their stability over time. Upward changes in performance will occur, although their effect size is likely to be small to moderate.

  1. Random regression models using different functions to model test-day milk yield of Brazilian Holstein cows.

    PubMed

    Bignardi, A B; El Faro, L; Torres Júnior, R A A; Cardoso, V L; Machado, P F; Albuquerque, L G

    2011-10-31

    We analyzed 152,145 test-day records from 7317 first lactations of Holstein cows recorded from 1995 to 2003. Our objective was to model variations in test-day milk yield during the first lactation of Holstein cows by random regression model (RRM), using various functions in order to obtain adequate and parsimonious models for the estimation of genetic parameters. Test-day milk yields were grouped into weekly classes of days in milk, ranging from 1 to 44 weeks. The contemporary groups were defined as herd-test-day. The analyses were performed using a single-trait RRM, including the direct additive, permanent environmental and residual random effects. In addition, contemporary group and linear and quadratic effects of the age of cow at calving were included as fixed effects. The mean trend of milk yield was modeled with a fourth-order orthogonal Legendre polynomial. The additive genetic and permanent environmental covariance functions were estimated by random regression on two parametric functions, Ali and Schaeffer and Wilmink, and on B-spline functions of days in milk. The covariance components and the genetic parameters were estimated by the restricted maximum likelihood method. Results from RRM parametric and B-spline functions were compared to RRM on Legendre polynomials and with a multi-trait analysis, using the same data set. Heritability estimates presented similar trends during mid-lactation (13 to 31 weeks) and between week 37 and the end of lactation, for all RRM. Heritabilities obtained by multi-trait analysis were of a lower magnitude than those estimated by RRM. The RRMs with a higher number of parameters were more useful to describe the genetic variation of test-day milk yield throughout the lactation. RRM using B-spline and Legendre polynomials as base functions appears to be the most adequate to describe the covariance structure of the data.

  2. Specificity vs. Generalizability: Emergence of Especial Skills in Classical Archery

    PubMed Central

    Czyż, Stanisław H.; Moss, Sarah J.

    2016-01-01

    There is evidence that the recall schema becomes more refined after constant practice. It is also believed that massive amounts of constant practice eventually leads to the emergence of especial skills, i.e., skills that have an advantage in performance over other actions from within the same class of actions. This advantage in performance was noticed when one-criterion practice, e.g., basketball free throws, was compared to non-practiced variations of the skill. However, there is no evidence whether multi-criterion massive amounts of practice would give an advantage to the trained variations of the skill over non-trained, i.e., whether such practice would eventually lead to the development of (multi)-especial skills. The purpose of this study was to determine whether massive amount of practice involving four criterion variations of the skill will give an advantage in performance to the criterions over the class of actions. In two experiments, we analyzed data from female (n = 8) and male classical archers (n = 10), who were required to shoot 30 shots from four accustomed distances, i.e., males at 30, 50, 70, and 90 m and females at 30, 50, 60, and 70 m. The shooting accuracy for the untrained distances (16 distances in men and 14 in women) was used to compile a regression line for distance over shooting accuracy. Regression determined (expected) values were then compared to the shooting accuracy of the trained distances. Data revealed no significant differences between real and expected results at trained distances, except for the 70 m shooting distance in men. The F-test for lack of fit showed that the regression computed for trained and non-trained shooting distances was linear. It can be concluded that especial skills emerge only after very specific practice, i.e., constant practice limited to only one variation of the skill. PMID:27547196

  3. Specificity vs. Generalizability: Emergence of Especial Skills in Classical Archery.

    PubMed

    Czyż, Stanisław H; Moss, Sarah J

    2016-01-01

    There is evidence that the recall schema becomes more refined after constant practice. It is also believed that massive amounts of constant practice eventually leads to the emergence of especial skills, i.e., skills that have an advantage in performance over other actions from within the same class of actions. This advantage in performance was noticed when one-criterion practice, e.g., basketball free throws, was compared to non-practiced variations of the skill. However, there is no evidence whether multi-criterion massive amounts of practice would give an advantage to the trained variations of the skill over non-trained, i.e., whether such practice would eventually lead to the development of (multi)-especial skills. The purpose of this study was to determine whether massive amount of practice involving four criterion variations of the skill will give an advantage in performance to the criterions over the class of actions. In two experiments, we analyzed data from female (n = 8) and male classical archers (n = 10), who were required to shoot 30 shots from four accustomed distances, i.e., males at 30, 50, 70, and 90 m and females at 30, 50, 60, and 70 m. The shooting accuracy for the untrained distances (16 distances in men and 14 in women) was used to compile a regression line for distance over shooting accuracy. Regression determined (expected) values were then compared to the shooting accuracy of the trained distances. Data revealed no significant differences between real and expected results at trained distances, except for the 70 m shooting distance in men. The F-test for lack of fit showed that the regression computed for trained and non-trained shooting distances was linear. It can be concluded that especial skills emerge only after very specific practice, i.e., constant practice limited to only one variation of the skill.

  4. Instrumental intelligent test of food sensory quality as mimic of human panel test combining multiple cross-perception sensors and data fusion.

    PubMed

    Ouyang, Qin; Zhao, Jiewen; Chen, Quansheng

    2014-09-02

    Instrumental test of food quality using perception sensors instead of human panel test is attracting massive attention recently. A novel cross-perception multi-sensors data fusion imitating multiple mammal perception was proposed for the instrumental test in this work. First, three mimic sensors of electronic eye, electronic nose and electronic tongue were used in sequence for data acquisition of rice wine samples. Then all data from the three different sensors were preprocessed and merged. Next, three cross-perception variables i.e., color, aroma and taste, were constructed using principal components analysis (PCA) and multiple linear regression (MLR) which were used as the input of models. MLR, back-propagation artificial neural network (BPANN) and support vector machine (SVM) were comparatively used for modeling, and the instrumental test was achieved for the comprehensive quality of samples. Results showed the proposed cross-perception multi-sensors data fusion presented obvious superiority to the traditional data fusion methodologies, also achieved a high correlation coefficient (>90%) with the human panel test results. This work demonstrated that the instrumental test based on the cross-perception multi-sensors data fusion can actually mimic the human test behavior, therefore is of great significance to ensure the quality of products and decrease the loss of the manufacturers. Copyright © 2014 Elsevier B.V. All rights reserved.

  5. The Research of Regression Method for Forecasting Monthly Electricity Sales Considering Coupled Multi-factor

    NASA Astrophysics Data System (ADS)

    Wang, Jiangbo; Liu, Junhui; Li, Tiantian; Yin, Shuo; He, Xinhui

    2018-01-01

    The monthly electricity sales forecasting is a basic work to ensure the safety of the power system. This paper presented a monthly electricity sales forecasting method which comprehensively considers the coupled multi-factors of temperature, economic growth, electric power replacement and business expansion. The mathematical model is constructed by using regression method. The simulation results show that the proposed method is accurate and effective.

  6. The Multi-Player Performance-Enhancing Drug Game

    PubMed Central

    Haugen, Kjetil K.; Nepusz, Tamás; Petróczi, Andrea

    2013-01-01

    This paper extends classical work on economics of doping into a multi-player game setting. Apart from being among the first papers formally formulating and analysing a multi-player doping situation, we find interesting results related to different types of Nash-equilibria (NE). Based mainly on analytic results, we claim at least two different NE structures linked to the choice of prize functions. Linear prize functions provide NEs characterised by either everyone or nobody taking drugs, while non-linear prize functions lead to qualitatively different NEs with significantly more complex predictive characteristics. PMID:23691018

  7. The Stability and Interfacial Motion of Multi-layer Radial Porous Media and Hele-Shaw Flows

    NASA Astrophysics Data System (ADS)

    Gin, Craig; Daripa, Prabir

    2017-11-01

    In this talk, we will discuss viscous fingering instabilities of multi-layer immiscible porous media flows within the Hele-Shaw model in a radial flow geometry. We study the motion of the interfaces for flows with both constant and variable viscosity fluids. We consider the effects of using a variable injection rate on multi-layer flows. We also present a numerical approach to simulating the interface motion within linear theory using the method of eigenfunction expansion. We compare these results with fully non-linear simulations.

  8. Evaluation of linear regression techniques for atmospheric applications: the importance of appropriate weighting

    NASA Astrophysics Data System (ADS)

    Wu, Cheng; Zhen Yu, Jian

    2018-03-01

    Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous works by Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated and the degree of bias is more pronounced with a low R2 XY dataset. The importance of a properly weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can lead to a bias in both the slope and intercept estimation. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of measurement error in the XY data during the measurement stage. If a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least biases in slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.

  9. Random regression analyses using B-spline functions to model growth of Nellore cattle.

    PubMed

    Boligon, A A; Mercadante, M E Z; Lôbo, R B; Baldi, F; Albuquerque, L G

    2012-02-01

    The objective of this study was to estimate (co)variance components using random regression on B-spline functions to weight records obtained from birth to adulthood. A total of 82 064 weight records of 8145 females obtained from the data bank of the Nellore Breeding Program (PMGRN/Nellore Brazil) which started in 1987, were used. The models included direct additive and maternal genetic effects and animal and maternal permanent environmental effects as random. Contemporary group and dam age at calving (linear and quadratic effect) were included as fixed effects, and orthogonal Legendre polynomials of age (cubic regression) were considered as random covariate. The random effects were modeled using B-spline functions considering linear, quadratic and cubic polynomials for each individual segment. Residual variances were grouped in five age classes. Direct additive genetic and animal permanent environmental effects were modeled using up to seven knots (six segments). A single segment with two knots at the end points of the curve was used for the estimation of maternal genetic and maternal permanent environmental effects. A total of 15 models were studied, with the number of parameters ranging from 17 to 81. The models that used B-splines were compared with multi-trait analyses with nine weight traits and to a random regression model that used orthogonal Legendre polynomials. A model fitting quadratic B-splines, with four knots or three segments for direct additive genetic effect and animal permanent environmental effect and two knots for maternal additive genetic effect and maternal permanent environmental effect, was the most appropriate and parsimonious model to describe the covariance structure of the data. Selection for higher weight, such as at young ages, should be performed taking into account an increase in mature cow weight. Particularly, this is important in most of Nellore beef cattle production systems, where the cow herd is maintained on range conditions. There is limited modification of the growth curve of Nellore cattle with respect to the aim of selecting them for rapid growth at young ages while maintaining constant adult weight.

  10. A novel simple QSAR model for the prediction of anti-HIV activity using multiple linear regression analysis.

    PubMed

    Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga

    2006-08-01

    A quantitative-structure activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R (2) (CV) = 0.8160; S (PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.

  11. Wavelet regression model in forecasting crude oil price

    NASA Astrophysics Data System (ADS)

    Hamid, Mohd Helmie; Shabri, Ani

    2017-05-01

    This study presents the performance of wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. WMLR model was developed by integrating the discrete wavelet transform (DWT) and multiple linear regression (MLR) model. The original time series was decomposed to sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of WMLR model were also compared with regular multiple linear regression (MLR), Autoregressive Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting technique tested in this study.

  12. Multishaker modal testing

    NASA Technical Reports Server (NTRS)

    Craig, Roy R., Jr.

    1987-01-01

    The major accomplishments of this research are: (1) the refinement and documentation of a multi-input, multi-output modal parameter estimation algorithm which is applicable to general linear, time-invariant dynamic systems; (2) the development and testing of an unsymmetric block-Lanzcos algorithm for reduced-order modeling of linear systems with arbitrary damping; and (3) the development of a control-structure-interaction (CSI) test facility.

  13. Multi-exponential analysis of magnitude MR images using a quantitative multispectral edge-preserving filter.

    PubMed

    Bonny, Jean Marie; Boespflug-Tanguly, Odile; Zanca, Michel; Renou, Jean Pierre

    2003-03-01

    A solution for discrete multi-exponential analysis of T(2) relaxation decay curves obtained in current multi-echo imaging protocol conditions is described. We propose a preprocessing step to improve the signal-to-noise ratio and thus lower the signal-to-noise ratio threshold from which a high percentage of true multi-exponential detection is detected. It consists of a multispectral nonlinear edge-preserving filter that takes into account the signal-dependent Rician distribution of noise affecting magnitude MR images. Discrete multi-exponential decomposition, which requires no a priori knowledge, is performed by a non-linear least-squares procedure initialized with estimates obtained from a total least-squares linear prediction algorithm. This approach was validated and optimized experimentally on simulated data sets of normal human brains.

  14. Partitioning sources of variation in vertebrate species richness

    USGS Publications Warehouse

    Boone, R.B.; Krohn, W.B.

    2000-01-01

    Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km2) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that Could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%). Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.

  15. Local hyperspectral data multisharpening based on linear/linear-quadratic nonnegative matrix factorization by integrating lidar data

    NASA Astrophysics Data System (ADS)

    Benhalouche, Fatima Zohra; Karoui, Moussa Sofiane; Deville, Yannick; Ouamri, Abdelaziz

    2015-10-01

    In this paper, a new Spectral-Unmixing-based approach, using Nonnegative Matrix Factorization (NMF), is proposed to locally multi-sharpen hyperspectral data by integrating a Digital Surface Model (DSM) obtained from LIDAR data. In this new approach, the nature of the local mixing model is detected by using the local variance of the object elevations. The hyper/multispectral images are explored using small zones. In each zone, the variance of the object elevations is calculated from the DSM data in this zone. This variance is compared to a threshold value and the adequate linear/linearquadratic spectral unmixing technique is used in the considered zone to independently unmix hyperspectral and multispectral data, using an adequate linear/linear-quadratic NMF-based approach. The obtained spectral and spatial information thus respectively extracted from the hyper/multispectral images are then recombined in the considered zone, according to the selected mixing model. Experiments based on synthetic hyper/multispectral data are carried out to evaluate the performance of the proposed multi-sharpening approach and literature linear/linear-quadratic approaches used on the whole hyper/multispectral data. In these experiments, real DSM data are used to generate synthetic data containing linear and linear-quadratic mixed pixel zones. The DSM data are also used for locally detecting the nature of the mixing model in the proposed approach. Globally, the proposed approach yields good spatial and spectral fidelities for the multi-sharpened data and significantly outperforms the used literature methods.

  16. RBF kernel based support vector regression to estimate the blood volume and heart rate responses during hemodialysis.

    PubMed

    Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H

    2009-01-01

    This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial bias function (RBF) kernels the non-parametric models of relative blood volume (RBV) change with time as well as percentage change in HR with respect to RBV were obtained. The e-insensitivity based loss function was used for SVR modeling. Selection of the design parameters which includes capacity (C), insensitivity region (e) and the RBF kernel parameter (sigma) was made based on a grid search approach and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training as well as testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).

  17. An alternative method for quantifying coronary artery calcification: the multi-ethnic study of atherosclerosis (MESA).

    PubMed

    Liang, C Jason; Budoff, Matthew J; Kaufman, Joel D; Kronmal, Richard A; Brown, Elizabeth R

    2012-07-02

    Extent of atherosclerosis measured by amount of coronary artery calcium (CAC) in computed tomography (CT) has been traditionally assessed using thresholded scoring methods, such as the Agatston score (AS). These thresholded scores have value in clinical prediction, but important information might exist below the threshold, which would have important advantages for understanding genetic, environmental, and other risk factors in atherosclerosis. We developed a semi-automated threshold-free scoring method, the spatially weighted calcium score (SWCS) for CAC in the Multi-Ethnic Study of Atherosclerosis (MESA). Chest CT scans were obtained from 6814 participants in the Multi-Ethnic Study of Atherosclerosis (MESA). The SWCS and the AS were calculated for each of the scans. Cox proportional hazards models and linear regression models were used to evaluate the associations of the scores with CHD events and CHD risk factors. CHD risk factors were summarized using a linear predictor. Among all participants and participants with AS > 0, the SWCS and AS both showed similar strongly significant associations with CHD events (hazard ratios, 1.23 and 1.19 per doubling of SWCS and AS; 95% CI, 1.16 to 1.30 and 1.14 to 1.26) and CHD risk factors (slopes, 0.178 and 0.164; 95% CI, 0.162 to 0.195 and 0.149 to 0.179). Even among participants with AS = 0, an increase in the SWCS was still significantly associated with established CHD risk factors (slope, 0.181; 95% CI, 0.138 to 0.224). The SWCS appeared to be predictive of CHD events even in participants with AS = 0, though those events were rare as expected. The SWCS provides a valid, continuous measure of CAC suitable for quantifying the extent of atherosclerosis without a threshold, which will be useful for examining novel genetic and environmental risk factors for atherosclerosis.

  18. Co-occurrence of psychotic experiences and common mental health conditions across four racially and ethnically diverse population samples.

    PubMed

    DeVylder, J E; Burnette, D; Yang, L H

    2014-12-01

    Prior research with racially/ethnically homogeneous samples has demonstrated widespread co-occurrence of psychotic experiences (PEs) and common mental health conditions, particularly multi-morbidity, suggesting that psychosis may be related to the overall severity of psychiatric disorder rather than any specific subtype. In this study we aimed to examine whether PEs are associated with the presence of specific disorders or multi-morbidity of co-occurring disorders across four large racially/ethnically diverse samples of adults in the USA. Data were drawn from the National Comorbidity Survey Replication (NCS-R), the National Survey of American Life (NSAL) and separately from the Asian and Latino subsamples of the National Latino and Asian American Study (NLAAS). Logistic regression models were used to examine the relationship between PEs and individual subtypes of DSM-IV disorder, and to test for a linear dose-response relationship between the number of subtypes and PEs. Prevalence of PEs was moderately greater among individuals with each subtype of disorder in each data set [odds ratios (ORs) 1.8-3.8], although associations were only variably significant when controlling for clinical and demographic variables. However, the sum of disorder subtypes was related to odds for PEs in a linear dose-response fashion across all four samples. PEs are related primarily to the extent or severity of psychiatric illness, as indicated by the presence of multiple psychiatric disorders, rather than to any particular subtype of disorder in these data. This relationship applies to the general population and across diverse racial/ethnic groups.

  19. Do factors related to combustion-based sources explain ...

    EPA Pesticide Factsheets

    Introduction: Spatial heterogeneity of effect estimates in associations between PM2.5 and total non-accidental mortality (TNA) in the United States (US), is an issue in epidemiology. This study uses rate ratios generated from the Multi-City/Multi-Pollutant study (1999-2005) for 313 core-based statistical areas (CBSA) and their metropolitan divisions (MD) to examine combustion-based sources of heterogeneity.Methods: For CBSA/MDs, area-specific log rate ratios (betas) were derived from a model adjusting for time, an interaction with age-group, day of week, and natural splines of current temperature, current dew point, and unconstrained temperature at lags 1, 2, and 3. We assessed the heterogeneity in the betas by linear regression with inverse variance weights, using average NO2, SO2, and CO, which may act as a combustion source proxy, and these pollutants’ correlations with PM2.5. Results: We found that weighted mean PM2.5 association (0.96 percent increase in total non-accidental mortality for a 10 µg/m3 increment in PM2.5) increased by 0.26 (95% confidence interval 0.08 , 0.44) for an interquartile change (0.2) in the correlation of SO2 and PM2.5., but betas showed less dependence on the annual averages of SO2 or NO2. Spline analyses suggest departures from linearity, particularly in a model that examined correlations between PM2.5 and CO.Conclusions: We conclude that correlations between SO2 and PM2.5 as an indicator of combustion sources explains some hete

  20. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.

  1. Pan evaporation modeling using six different heuristic computing methods in different climates of China

    NASA Astrophysics Data System (ADS)

    Wang, Lunche; Kisi, Ozgur; Zounemat-Kermani, Mohammad; Li, Hui

    2017-01-01

    Pan evaporation (Ep) plays important roles in agricultural water resources management. One of the basic challenges is modeling Ep using limited climatic parameters because there are a number of factors affecting the evaporation rate. This study investigated the abilities of six different soft computing methods, multi-layer perceptron (MLP), generalized regression neural network (GRNN), fuzzy genetic (FG), least square support vector machine (LSSVM), multivariate adaptive regression spline (MARS), adaptive neuro-fuzzy inference systems with grid partition (ANFIS-GP), and two regression methods, multiple linear regression (MLR) and Stephens and Stewart model (SS) in predicting monthly Ep. Long-term climatic data at various sites crossing a wide range of climates during 1961-2000 are used for model development and validation. The results showed that the models have different accuracies in different climates and the MLP model performed superior to the other models in predicting monthly Ep at most stations using local input combinations (for example, the MAE (mean absolute errors), RMSE (root mean square errors), and determination coefficient (R2) are 0.314 mm/day, 0.405 mm/day and 0.988, respectively for HEB station), while GRNN model performed better in Tibetan Plateau (MAE, RMSE and R2 are 0.459 mm/day, 0.592 mm/day and 0.932, respectively). The accuracies of above models ranked as: MLP, GRNN, LSSVM, FG, ANFIS-GP, MARS and MLR. The overall results indicated that the soft computing techniques generally performed better than the regression methods, but MLR and SS models can be more preferred at some climatic zones instead of complex nonlinear models, for example, the BJ (Beijing), CQ (Chongqing) and HK (Haikou) stations. Therefore, it can be concluded that Ep could be successfully predicted using above models in hydrological modeling studies.

  2. Post-processing through linear regression

    NASA Astrophysics Data System (ADS)

    van Schaeybroeck, B.; Vannitsem, S.

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrarily to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.

  3. Linear regression metamodeling as a tool to summarize and present simulation model results.

    PubMed

    Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M

    2013-10-01

    Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.

  4. Population Coding of Forelimb Joint Kinematics by Peripheral Afferents in Monkeys

    PubMed Central

    Umeda, Tatsuya; Seki, Kazuhiko; Sato, Masa-aki; Nishimura, Yukio; Kawato, Mitsuo; Isa, Tadashi

    2012-01-01

    Various peripheral receptors provide information concerning position and movement to the central nervous system to achieve complex and dexterous movements of forelimbs in primates. The response properties of single afferent receptors to movements at a single joint have been examined in detail, but the population coding of peripheral afferents remains poorly defined. In this study, we obtained multichannel recordings from dorsal root ganglion (DRG) neurons in cervical segments of monkeys. We applied the sparse linear regression (SLiR) algorithm to the recordings, which selects useful input signals to reconstruct movement kinematics. Multichannel recordings of peripheral afferents were performed by inserting multi-electrode arrays into the DRGs of lower cervical segments in two anesthetized monkeys. A total of 112 and 92 units were responsive to the passive joint movements or the skin stimulation with a painting brush in Monkey 1 and Monkey 2, respectively. Using the SLiR algorithm, we reconstructed the temporal changes of joint angle, angular velocity, and acceleration at the elbow, wrist, and finger joints from temporal firing patterns of the DRG neurons. By automatically selecting a subset of recorded units, the SLiR achieved superior generalization performance compared with a regularized linear regression algorithm. The SLiR selected not only putative muscle units that were responsive to only the passive movements, but also a number of putative cutaneous units responsive to the skin stimulation. These results suggested that an ensemble of peripheral primary afferents that contains both putative muscle and cutaneous units encode forelimb joint kinematics of non-human primates. PMID:23112841

  5. A 5-year scientometric analysis of research centers affiliated to Tehran University of Medical Sciences

    PubMed Central

    Yazdani, Kamran; Rahimi-Movaghar, Afarin; Nedjat, Saharnaz; Ghalichi, Leila; Khalili, Malahat

    2015-01-01

    Background: Since Tehran University of Medical Sciences (TUMS) has the oldest and highest number of research centers among all Iranian medical universities, this study was conducted to evaluate scientific output of research centers affiliated to Tehran University of Medical Sciences (TUMS) using scientometric indices and the affecting factors. Moreover, a number of scientometric indicators were introduced. Methods: This cross-sectional study was performed to evaluate a 5-year scientific performance of research centers of TUMS. Data were collected through questionnaires, annual evaluation reports of the Ministry of Health, and also from Scopus database. We used appropriate measures of central tendency and variation for descriptive analyses. Moreover, uni-and multi-variable linear regression were used to evaluate the effect of independent factors on the scientific output of the centers. Results: The medians of the numbers of papers and books during a 5-year period were 150.5 and 2.5 respectively. The median of the "articles per researcher" was 19.1. Based on multiple linear regression, younger age centers (p=0.001), having a separate budget line (p=0.016), and number of research personnel (p<0.001) had a direct significant correlation with the number of articles while real properties had a reverse significant correlation with it (p=0.004). Conclusion: The results can help policy makers and research managers to allocate sufficient resources to improve current situation of the centers. Newly adopted and effective scientometric indices are is suggested to be used to evaluate scientific outputs and functions of these centers. PMID:26157724

  6. Solo and Multi-Offenders Who Commit Stranger Kidnapping: An Assessment of Factors That Correlate With Violent Events.

    PubMed

    Cunningham, Shannon N; Vandiver, Donna M

    2016-03-06

    Research has demonstrated that co-offending dyads and groups often use more violence than individual offenders. Despite the attention given to co-offending by the research community, kidnapping remains understudied. Stranger kidnappings are more likely than non-stranger kidnappings to involve the use of a weapon. Public fear of stranger kidnapping warrants further examination of this specific crime, including differences between those committed by solo and multi-offender groups. The current study uses National Incident-Based Reporting System (NIBRS) data to assess differences in use of violence among 4,912 stranger kidnappings by solo offenders and multi-offender groups using cross-tabulations, ordinal regression, and logistic regression. The results indicate that violent factors are significantly more common in multi-offender incidents, and that multi-offender groups have fewer arrests than solo offenders. The implications of these findings are discussed. © The Author(s) 2016.

  7. Grouped gene selection and multi-classification of acute leukemia via new regularized multinomial regression.

    PubMed

    Li, Juntao; Wang, Yanyan; Jiang, Tao; Xiao, Huimin; Song, Xuekun

    2018-05-09

    Diagnosing acute leukemia is the necessary prerequisite to treating it. Multi-classification on the gene expression data of acute leukemia is help for diagnosing it which contains B-cell acute lymphoblastic leukemia (BALL), T-cell acute lymphoblastic leukemia (TALL) and acute myeloid leukemia (AML). However, selecting cancer-causing genes is a challenging problem in performing multi-classification. In this paper, weighted gene co-expression networks are employed to divide the genes into groups. Based on the dividing groups, a new regularized multinomial regression with overlapping group lasso penalty (MROGL) has been presented to simultaneously perform multi-classification and select gene groups. By implementing this method on three-class acute leukemia data, the grouped genes which work synergistically are identified, and the overlapped genes shared by different groups are also highlighted. Moreover, MROGL outperforms other five methods on multi-classification accuracy. Copyright © 2017. Published by Elsevier B.V.

  8. Structure-function relationships using spectral-domain optical coherence tomography: comparison with scanning laser polarimetry.

    PubMed

    Aptel, Florent; Sayous, Romain; Fortoul, Vincent; Beccat, Sylvain; Denis, Philippe

    2010-12-01

    To evaluate and compare the regional relationships between visual field sensitivity and retinal nerve fiber layer (RNFL) thickness as measured by spectral-domain optical coherence tomography (OCT) and scanning laser polarimetry. Prospective cross-sectional study. One hundred and twenty eyes of 120 patients (40 with healthy eyes, 40 with suspected glaucoma, and 40 with glaucoma) were tested on Cirrus-OCT, GDx VCC, and standard automated perimetry. Raw data on RNFL thickness were extracted for 256 peripapillary sectors of 1.40625 degrees each for the OCT measurement ellipse and 64 peripapillary sectors of 5.625 degrees each for the GDx VCC measurement ellipse. Correlations between peripapillary RNFL thickness in 6 sectors and visual field sensitivity in the 6 corresponding areas were evaluated using linear and logarithmic regression analysis. Receiver operating curve areas were calculated for each instrument. With spectral-domain OCT, the correlations (r(2)) between RNFL thickness and visual field sensitivity ranged from 0.082 (nasal RNFL and corresponding visual field area, linear regression) to 0.726 (supratemporal RNFL and corresponding visual field area, logarithmic regression). By comparison, with GDx-VCC, the correlations ranged from 0.062 (temporal RNFL and corresponding visual field area, linear regression) to 0.362 (supratemporal RNFL and corresponding visual field area, logarithmic regression). In pairwise comparisons, these structure-function correlations were generally stronger with spectral-domain OCT than with GDx VCC and with logarithmic regression than with linear regression. The largest areas under the receiver operating curve were seen for OCT superior thickness (0.963 ± 0.022; P < .001) in eyes with glaucoma and for OCT average thickness (0.888 ± 0.072; P < .001) in eyes with suspected glaucoma. The structure-function relationship was significantly stronger with spectral-domain OCT than with scanning laser polarimetry, and was better expressed logarithmically than linearly. Measurements with these 2 instruments should not be considered to be interchangeable. Copyright © 2010 Elsevier Inc. All rights reserved.

  9. Wavelet analysis for the study of the relations among soil radon anomalies, volcanic and seismic events: the case of Mt. Etna (Italy)

    NASA Astrophysics Data System (ADS)

    Ferrera, Elisabetta; Giammanco, Salvatore; Cannata, Andrea; Montalto, Placido

    2013-04-01

    From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol® probe located on the upper NE flank of Mt. Etna volcano, close either to the Piano Provenzana fault or to the NE-Rift. Seismic and volcanological data have been analyzed together with radon data. We also analyzed air and soil temperature, barometric pressure, snow and rain fall data. In order to find possible correlations among the above parameters, and hence to reveal possible anomalies in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R2 = 0.31). When using 100-days time windows, the R2 values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-days moving averages showed that, similar to multivariate linear regression analysis, the summer period is characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach allows to study the relations among different signals either in time or frequency domain. It confirmed the results of the previous methods, but also allowed to recognize correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitations, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Our work suggests that in order to make an accurate analysis of the relations among distinct signals it is necessary to use different techniques that give complementary analytical information. In particular, the wavelet analysis showed to be very effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.

  10. A Simulation-Based Comparison of Several Stochastic Linear Regression Methods in the Presence of Outliers.

    ERIC Educational Resources Information Center

    Rule, David L.

    Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…

  11. Decision making for best cogeneration power integration into a grid

    NASA Astrophysics Data System (ADS)

    Al Asmar, Joseph; Zakhia, Nadim; Kouta, Raed; Wack, Maxime

    2016-07-01

    Cogeneration systems are known to be efficient power systems for their ability to reduce pollution. Their integration into a grid requires simultaneous consideration of the economic and environmental challenges. Thus, an optimal cogeneration power are adopted to face such challenges. This work presents a novelty in selectinga suitable solution using heuristic optimization method. Its aim is to optimize the cogeneration capacity to be installed according to the economic and environmental concerns. This novelty is based on the sensitivity and data analysis method, namely, Multiple Linear Regression (MLR). This later establishes a compromise between power, economy, and pollution, which leads to find asuitable cogeneration power, and further, to be integrated into a grid. The data exploited were the results of the Genetic Algorithm (GA) multi-objective optimization. Moreover, the impact of the utility's subsidy on the selected power is shown.

  12. Evaluation of globally available precipitation data products as input for water balance models

    NASA Astrophysics Data System (ADS)

    Lebrenz, H.; Bárdossy, A.

    2009-04-01

    Subject of this study is the evaluation of globally available precipitation data products, which are intended to be used as input variables for water balance models in ungauged basins. The selected data sources are a) the Global Precipitation Climatology Centre (GPCC), b) the Global Precipitation Climatology Project (GPCP) and c) the Climate Research Unit (CRU), resulting into twelve globally available data products. The data products imply different data bases, different derivation routines and varying resolutions in time and space. For validation purposes, the ground data from South Africa were screened on homogeneity and consistency by various tests and an outlier detection using multi-linear regression was performed. External Drift Kriging was subsequently applied on the ground data and the resulting precipitation arrays were compared to the different products with respect to quantity and variance.

  13. Age-related differences in health complaints: the Hilo women's health study.

    PubMed

    Sievert, Lynnette Leidy; Morrison, Lynn A; Reza, Angela M; Brown, Daniel E; Kalua, Erin; Tefft, Harold A T

    2007-01-01

    The purpose of this study was to determine the age distribution of health-related complaints and symptom groupings from a random postal survey carried out in the multi-ethnic city of Hilo, Hawaii. Symptom frequencies and factor analyses were compared across three age categories: < 40 (32%), 40-60 (48%), and > 60 years (19%), (n = 1,796). Younger women were most likely to report headaches, menstrual complaints, irritability, and mood swings. Women at midlife were most likely to report fluid retention, trouble sleeping, loss of sexual desire, vasomotor symptoms, and nervous tension. Older women reported the least number of symptoms overall. Using multiple linear regression, menopause status, ethnicity, and alcohol intake were significantly associated with the factor scores for symptoms of menopause, after controlling for age, education, BMI, exercise, smoking habits, and financial comfort.

  14. Unit Cohesion and the Surface Navy: Does Cohesion Affect Performance

    DTIC Science & Technology

    1989-12-01

    v. 68, 1968. Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. Rand Corporation R-2607...Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. SAS User’s Guide: Basics, Version 5 ed

  15. Comparison of Selection Procedures and Validation of Criterion Used in Selection of Significant Control Variates of a Simulation Model

    DTIC Science & Technology

    1990-03-01

    and M.H. Knuter. Applied Linear Regression Models. Homewood IL: Richard D. Erwin Inc., 1983. Pritsker, A. Alan B. Introduction to Simulation and SLAM...Control Variates in Simulation," European Journal of Operational Research, 42: (1989). Neter, J., W. Wasserman, and M.H. Xnuter. Applied Linear Regression Models

  16. Comparing Regression Coefficients between Nested Linear Models for Clustered Data with Generalized Estimating Equations

    ERIC Educational Resources Information Center

    Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer

    2013-01-01

    Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…

  17. Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course

    ERIC Educational Resources Information Center

    Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna

    2010-01-01

    Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…

  18. What Is Wrong with ANOVA and Multiple Regression? Analyzing Sentence Reading Times with Hierarchical Linear Models

    ERIC Educational Resources Information Center

    Richter, Tobias

    2006-01-01

    Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…

  19. Some Applied Research Concerns Using Multiple Linear Regression Analysis.

    ERIC Educational Resources Information Center

    Newman, Isadore; Fraas, John W.

    The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…

  20. Using Simple Linear Regression to Assess the Success of the Montreal Protocol in Reducing Atmospheric Chlorofluorocarbons

    ERIC Educational Resources Information Center

    Nelson, Dean

    2009-01-01

    Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…

  1. Quantum State Tomography via Linear Regression Estimation

    PubMed Central

    Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan

    2013-01-01

    A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d4) where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519

  2. Applications of statistics to medical science, III. Correlation and regression.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.

  3. Unified treatment of microscopic boundary conditions and efficient algorithms for estimating tangent operators of the homogenized behavior in the computational homogenization method

    NASA Astrophysics Data System (ADS)

    Nguyen, Van-Dung; Wu, Ling; Noels, Ludovic

    2017-03-01

    This work provides a unified treatment of arbitrary kinds of microscopic boundary conditions usually considered in the multi-scale computational homogenization method for nonlinear multi-physics problems. An efficient procedure is developed to enforce the multi-point linear constraints arising from the microscopic boundary condition either by the direct constraint elimination or by the Lagrange multiplier elimination methods. The macroscopic tangent operators are computed in an efficient way from a multiple right hand sides linear system whose left hand side matrix is the stiffness matrix of the microscopic linearized system at the converged solution. The number of vectors at the right hand side is equal to the number of the macroscopic kinematic variables used to formulate the microscopic boundary condition. As the resolution of the microscopic linearized system often follows a direct factorization procedure, the computation of the macroscopic tangent operators is then performed using this factorized matrix at a reduced computational time.

  4. A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.

    PubMed

    Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S

    2017-06-01

    The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LET d ) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LET d based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum- and linear LET d based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.

  5. Multi-objective Optimization of Pulsed Gas Metal Arc Welding Process Using Neuro NSGA-II

    NASA Astrophysics Data System (ADS)

    Pal, Kamal; Pal, Surjya K.

    2018-05-01

    Weld quality is a critical issue in fabrication industries where products are custom-designed. Multi-objective optimization results number of solutions in the pareto-optimal front. Mathematical regression model based optimization methods are often found to be inadequate for highly non-linear arc welding processes. Thus, various global evolutionary approaches like artificial neural network, genetic algorithm (GA) have been developed. The present work attempts with elitist non-dominated sorting GA (NSGA-II) for optimization of pulsed gas metal arc welding process using back propagation neural network (BPNN) based weld quality feature models. The primary objective to maintain butt joint weld quality is the maximization of tensile strength with minimum plate distortion. BPNN has been used to compute the fitness of each solution after adequate training, whereas NSGA-II algorithm generates the optimum solutions for two conflicting objectives. Welding experiments have been conducted on low carbon steel using response surface methodology. The pareto-optimal front with three ranked solutions after 20th generations was considered as the best without further improvement. The joint strength as well as transverse shrinkage was found to be drastically improved over the design of experimental results as per validated pareto-optimal solutions obtained.

  6. Relationships between menopausal and mood symptoms and EEG sleep measures in a multi-ethnic sample of middle-aged women: the SWAN sleep study.

    PubMed

    Kravitz, Howard M; Avery, Elizabeth; Sowers, Maryfran; Bromberger, Joyce T; Owens, Jane F; Matthews, Karen A; Hall, Martica; Zheng, Huiyong; Gold, Ellen B; Buysse, Daniel J

    2011-09-01

    Examine associations of vasomotor and mood symptoms with visually scored and computer-generated measures of EEG sleep. Cross-sectional analysis. Community-based in-home polysomnography (PSG). 343 African American, Caucasian, and Chinese women; ages 48-58 years; pre-, peri- or post-menopausal; participating in the Study of Women's Health Across the Nation Sleep Study (SWAN Sleep Study). None. Measures included PSG-assessed sleep duration, continuity, and architecture, delta sleep ratio (DSR) computed from automated counts of delta wave activity, daily diary-assessed vasomotor symptoms (VMS), questionnaires to collect mood (depression, anxiety) symptoms, medication, and lifestyle information, and menopausal status using bleeding criteria. Sleep outcomes were modeled using linear regression. Nocturnal VMS were associated with longer sleep time. Higher anxiety symptom scores were associated with longer sleep latency and lower sleep efficiency, but only in women reporting nocturnal VMS. Contrary to expectations, VMS and mood symptoms were unrelated to either DSR or REM latency. Vasomotor symptoms moderated associations of anxiety with EEG sleep measures of sleep latency and sleep efficiency and was associated with longer sleep duration in this multi-ethnic sample of midlife women.

  7. Application of multi-criteria decision-making to risk prioritisation in tidal energy developments

    NASA Astrophysics Data System (ADS)

    Kolios, Athanasios; Read, George; Ioannou, Anastasia

    2016-01-01

    This paper presents an analytical multi-criterion analysis for the prioritisation of risks for the development of tidal energy projects. After a basic identification of risks throughout the project and relevant stakeholders in the UK, classified through a political, economic, social, technological, legal and environmental analysis, relevant questionnaires provided scores to each risk and corresponding weights for each of the different sectors. Employing an extended technique for order of preference by similarity to ideal solution as well as the weighted sum method based on the data obtained, the risks identified are ranked based on their criticality, drawing attention of the industry in mitigating the ones scoring higher. Both methods were modified to take averages at different stages of the analysis in order to observe the effects on the final risk ranking. A sensitivity analysis of the results was also carried out with regard to the weighting factors given to the perceived expertise of participants, with different results being obtained whether a linear, squared or square root regression is used. Results of the study show that academics and industry have conflicting opinions with regard to the perception of the most critical risks.

  8. Regression of non-linear coupling of noise in LIGO detectors

    NASA Astrophysics Data System (ADS)

    Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.

    2018-03-01

    In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data. The effort to improve their sensitivity has never stopped since then. The goal to achieve design sensitivity is challenging. Environmental and instrumental noise couple to the detector output with different, linear and non-linear, coupling mechanisms. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.

  9. Toward Control of Universal Scaling in Critical Dynamics

    DTIC Science & Technology

    2016-01-27

    program that aims to synergistically combine two powerful and very successful theories for non-linear stochastic dynamics of cooperative multi...RESPONSIBLE PERSON 19b. TELEPHONE NUMBER Uwe Tauber Uwe C. T? uber , Michel Pleimling, Daniel J. Stilwell 611102 c. THIS PAGE The public reporting burden...to synergistically combine two powerful and very successful theories for non-linear stochastic dynamics of cooperative multi-component systems, namely

  10. Reference Models for Multi-Layer Tissue Structures

    DTIC Science & Technology

    2016-09-01

    simulation,  finite   element  analysis 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES 19a. NAME OF RESPONSIBLE PERSON USAMRMC...Physiologically realistic, fully specimen-specific, nonlinear reference models. Tasks. Finite element analysis of non-linear mechanics of cadaver...models. Tasks. Finite element analysis of non-linear mechanics of multi-layer tissue regions of human subjects. Deliverables. Partially subject- and

  11. QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions.

    PubMed

    Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan

    2012-12-01

    A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.

  12. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  13. SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

    PubMed Central

    Zhu, Liping; Huang, Mian; Li, Runze

    2012-01-01

    This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536

  14. Random regression models on Legendre polynomials to estimate genetic parameters for weights from birth to adult age in Canchim cattle.

    PubMed

    Baldi, F; Albuquerque, L G; Alencar, M M

    2010-08-01

    The objective of this work was to estimate covariance functions for direct and maternal genetic effects, animal and maternal permanent environmental effects, and subsequently, to derive relevant genetic parameters for growth traits in Canchim cattle. Data comprised 49,011 weight records on 2435 females from birth to adult age. The model of analysis included fixed effects of contemporary groups (year and month of birth and at weighing) and age of dam as quadratic covariable. Mean trends were taken into account by a cubic regression on orthogonal polynomials of animal age. Residual variances were allowed to vary and were modelled by a step function with 1, 4 or 11 classes based on animal's age. The model fitting four classes of residual variances was the best. A total of 12 random regression models from second to seventh order were used to model direct and maternal genetic effects, animal and maternal permanent environmental effects. The model with direct and maternal genetic effects, animal and maternal permanent environmental effects fitted by quadric, cubic, quintic and linear Legendre polynomials, respectively, was the most adequate to describe the covariance structure of the data. Estimates of direct and maternal heritability obtained by multi-trait (seven traits) and random regression models were very similar. Selection for higher weight at any age, especially after weaning, will produce an increase in mature cow weight. The possibility to modify the growth curve in Canchim cattle to obtain animals with rapid growth at early ages and moderate to low mature cow weight is limited.

  15. Linear FBG Temperature Sensor Interrogation with Fabry-Perot ITU Multi-wavelength Reference.

    PubMed

    Park, Hyoung-Jun; Song, Minho

    2008-10-29

    The equidistantly spaced multi-passbands of a Fabry-Perot ITU filter are used as an efficient multi-wavelength reference for fiber Bragg grating sensor demodulation. To compensate for the nonlinear wavelength tuning effect in the FBG sensor demodulator, a polynomial fitting algorithm was applied to the temporal peaks of the wavelength-scanned ITU filter. The fitted wavelength values are assigned to the peak locations of the FBG sensor reflections, obtaining constant accuracy, regardless of the wavelength scan range and frequency. A linearity error of about 0.18% against a reference thermocouple thermometer was obtained with the suggested method.

  16. Prediction of siRNA potency using sparse logistic regression.

    PubMed

    Hu, Wei; Hu, John

    2014-06-01

    RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.

  17. Lung nodule malignancy prediction using multi-task convolutional neural network

    NASA Astrophysics Data System (ADS)

    Li, Xiuli; Kao, Yueying; Shen, Wei; Li, Xiang; Xie, Guotong

    2017-03-01

    In this paper, we investigated the problem of diagnostic lung nodule malignancy prediction using thoracic Computed Tomography (CT) screening. Unlike most existing studies classify the nodules into two types benign and malignancy, we interpreted the nodule malignancy prediction as a regression problem to predict continuous malignancy level. We proposed a joint multi-task learning algorithm using Convolutional Neural Network (CNN) to capture nodule heterogeneity by extracting discriminative features from alternatingly stacked layers. We trained a CNN regression model to predict the nodule malignancy, and designed a multi-task learning mechanism to simultaneously share knowledge among 9 different nodule characteristics (Subtlety, Calcification, Sphericity, Margin, Lobulation, Spiculation, Texture, Diameter and Malignancy), and improved the final prediction result. Each CNN would generate characteristic-specific feature representations, and then we applied multi-task learning on the features to predict the corresponding likelihood for that characteristic. We evaluated the proposed method on 2620 nodules CT scans from LIDC-IDRI dataset with the 5-fold cross validation strategy. The multitask CNN regression result for regression RMSE and mapped classification ACC were 0.830 and 83.03%, while the results for single task regression RMSE 0.894 and mapped classification ACC 74.9%. Experiments show that the proposed method could predict the lung nodule malignancy likelihood effectively and outperforms the state-of-the-art methods. The learning framework could easily be applied in other anomaly likelihood prediction problem, such as skin cancer and breast cancer. It demonstrated the possibility of our method facilitating the radiologists for nodule staging assessment and individual therapeutic planning.

  18. Predictive and mechanistic multivariate linear regression models for reaction development

    PubMed Central

    Santiago, Celine B.; Guo, Jing-Yao

    2018-01-01

    Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711

  19. Adding a Parameter Increases the Variance of an Estimated Regression Function

    ERIC Educational Resources Information Center

    Withers, Christopher S.; Nadarajah, Saralees

    2011-01-01

    The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…

  20. Using nonlinear quantile regression to estimate the self-thinning boundary curve

    Treesearch

    Quang V. Cao; Thomas J. Dean

    2015-01-01

    The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...

  1. Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.

    PubMed

    Habib, I H I; Hassouna, M E M; Zaki, G A

    2005-03-01

    Typical anti-mucolytic drugs called salbutamol hydrochloride and bromhexine sulfate encountered in tablets were determined simultaneously either by using linear regression at zero-crossing wavelengths of the first derivation of UV-spectra or by application of multiple linear partial least squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.

  2. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    PubMed

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  3. Modeling the frequency of opposing left-turn conflicts at signalized intersections using generalized linear regression models.

    PubMed

    Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei

    2014-01-01

    The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binominal distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.

  4. Impact of Article Language in Multi-Language Medical Journals - a Bibliometric Analysis of Self-Citations and Impact Factor

    PubMed Central

    Diekhoff, Torsten; Schlattmann, Peter; Dewey, Marc

    2013-01-01

    Background In times of globalization there is an increasing use of English in the medical literature. The aim of this study was to analyze the influence of English-language articles in multi-language medical journals on their international recognition – as measured by a lower rate of self-citations and higher impact factor (IF). Methods and Findings We analyzed publications in multi-language journals in 2008 and 2009 using the Web of Science (WoS) of Thomson Reuters (former Institute of Scientific Information) and PubMed as sources of information. The proportion of English-language articles during the period was compared with both the share of self-citations in the year 2010 and the IF with and without self-citations. Multivariable linear regression analysis was performed to analyze these factors as well as the influence of the journals‘ countries of origin and of the other language(s) used in publications besides English. We identified 168 multi-language journals that were listed in WoS as well as in PubMed and met our criteria. We found a significant positive correlation of the share of English articles in 2008 and 2009 with the IF calculated without self-citations (Pearson r=0.56, p = <0.0001), a correlation with the overall IF (Pearson r = 0.47, p = <0.0001) and with the cites to years of IF calculation (Pearson r = 0.34, p = <0.0001), and a weak negative correlation with the share of self-citations (Pearson r = -0.2, p = 0.009). The IF without self-citations also correlated with the journal‘s country of origin – North American journals had a higher IF compared to Middle and South American or European journals. Conclusion Our findings suggest that a larger share of English articles in multi-language medical journals is associated with greater international recognition. Fewer self-citations were found in multi-language journals with a greater share of original articles in English. PMID:24146929

  5. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  6. Numerical analysis of combustion characteristics of hybrid rocket motor with multi-section swirl injection

    NASA Astrophysics Data System (ADS)

    Li, Chengen; Cai, Guobiao; Tian, Hui

    2016-06-01

    This paper is aimed to analyse the combustion characteristics of hybrid rocket motor with multi-section swirl injection by simulating the combustion flow field. Numerical combustion flow field and combustion performance parameters are obtained through three-dimensional numerical simulations based on a steady numerical model proposed in this paper. The hybrid rocket motor adopts 98% hydrogen peroxide and polyethylene as the propellants. Multiple injection sections are set along the axis of the solid fuel grain, and the oxidizer enters the combustion chamber by means of tangential injection via the injector ports in the injection sections. Simulation results indicate that the combustion flow field structure of the hybrid rocket motor could be improved by multi-section swirl injection method. The transformation of the combustion flow field can greatly increase the fuel regression rate and the combustion efficiency. The average fuel regression rate of the motor with multi-section swirl injection is improved by 8.37 times compared with that of the motor with conventional head-end irrotational injection. The combustion efficiency is increased to 95.73%. Besides, the simulation results also indicate that (1) the additional injection sections can increase the fuel regression rate and the combustion efficiency; (2) the upstream offset of the injection sections reduces the combustion efficiency; and (3) the fuel regression rate and the combustion efficiency decrease with the reduction of the number of injector ports in each injection section.

  7. Application and evaluation of ISVR method in QuickBird image fusion

    NASA Astrophysics Data System (ADS)

    Cheng, Bo; Song, Xiaolu

    2014-05-01

    QuickBird satellite images are widely used in many fields, and applications have put forward high requirements for the integration of the spatial information and spectral information of the imagery. A fusion method for high resolution remote sensing images based on ISVR is identified in this study. The core principle of ISVS is taking the advantage of radicalization targeting to remove the effect of different gain and error of satellites' sensors. Transformed from DN to radiance, the multi-spectral image's energy is used to simulate the panchromatic band. The linear regression analysis is carried through the simulation process to find a new synthetically panchromatic image, which is highly linearly correlated to the original panchromatic image. In order to evaluate, test and compare the algorithm results, this paper used ISVR and other two different fusion methods to give a comparative study of the spatial information and spectral information, taking the average gradient and the correlation coefficient as an indicator. Experiments showed that this method could significantly improve the quality of fused image, especially in preserving spectral information, to maximize the spectral information of original multispectral images, while maintaining abundant spatial information.

  8. Quantitative MRI establishes the efficacy of PI3K inhibitor (GDC-0941) multi-treatments in PTEN-deficient mice lymphoma.

    PubMed

    Wullschleger, Stephan; García-Martínez, Juan M; Duce, Suzanne L

    2012-02-01

    To assess the efficacy of multiple treatment of phosphatidylinositol-3-kinase (PI3K) inhibitor on autochthonous tumours in phosphatase and tensin homologue (Pten)-deficient genetically engineered mouse cancer models using a longitudinal magnetic resonance imaging (MRI) protocol. Using 3D MRI, B-cell follicular lymphoma growth was quantified in a Pten(+/-)Lkb1(+/hypo) mouse line, before, during and after repeated treatments with a PI3K inhibitor GDC-0941 (75 mg/kg). Mean pre-treatment linear tumour growth rate was 16.5±12.8 mm(3)/week. Repeated 28-day GDC-0941 administration, with 21 days 'off-treatment', induced average tumour regression of 41±7%. Upon cessation of the second treatment (which was not permanently cytocidal), tumours re-grew with an average linear growth rate of 40.1±15.5 mm(3)/week. There was no evidence of chemoresistance. This protocol can accommodate complex dosing schedules, as well as combine different cancer therapies. It reduces biological variability problems and resulted in a 10-fold reduction in mouse numbers compared with terminal assessment methods. It is ideal for preclinical efficacy studies and for phenotyping molecularly characterized mouse models when investigating gene function.

  9. Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

    NASA Astrophysics Data System (ADS)

    Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang

    2017-01-01

    Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while affects the quality of life dramatically. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions, known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods.

  10. Parameterized data-driven fuzzy model based optimal control of a semi-batch reactor.

    PubMed

    Kamesh, Reddi; Rani, K Yamuna

    2016-09-01

    A parameterized data-driven fuzzy (PDDF) model structure is proposed for semi-batch processes, and its application for optimal control is illustrated. The orthonormally parameterized input trajectories, initial states and process parameters are the inputs to the model, which predicts the output trajectories in terms of Fourier coefficients. Fuzzy rules are formulated based on the signs of a linear data-driven model, while the defuzzification step incorporates a linear regression model to shift the domain from input to output domain. The fuzzy model is employed to formulate an optimal control problem for single rate as well as multi-rate systems. Simulation study on a multivariable semi-batch reactor system reveals that the proposed PDDF modeling approach is capable of capturing the nonlinear and time-varying behavior inherent in the semi-batch system fairly accurately, and the results of operating trajectory optimization using the proposed model are found to be comparable to the results obtained using the exact first principles model, and are also found to be comparable to or better than parameterized data-driven artificial neural network model based optimization results. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.

  11. Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

    PubMed Central

    Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang

    2017-01-01

    Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while affects the quality of life dramatically. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions, known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods. PMID:28120883

  12. Impact of temperature on mortality in Hubei, China: a multi-county time series analysis

    NASA Astrophysics Data System (ADS)

    Zhang, Yunquan; Yu, Chuanhua; Bao, Junzhe; Li, Xudong

    2017-03-01

    We examined the impact of extreme temperatures on mortality in 12 counties across Hubei Province, central China, during 2009-2012. Quasi-Poisson generalized linear regression combined with distributed lag non-linear model was first applied to estimate county-specific relationship between temperature and mortality. A multivariable meta-analysis was then used to pool the estimates of county-specific mortality effects of extreme cold temperature (1st percentile) and hot temperature (99th percentile). An inverse J-shaped relationship was observed between temperature and mortality at the provincial level. Heat effect occurred immediately and persisted for 2-3 days, whereas cold effect was 1-2 days delayed and much longer lasting. Higher mortality risks were observed among females, the elderly aged over 75 years, persons dying outside the hospital and those with high education attainment, especially for cold effects. Our data revealed some slight differences in heat- and cold- related mortality effects on urban and rural residents. These findings may have important implications for developing locally-based preventive and intervention strategies to reduce temperature-related mortality, especially for those susceptible subpopulations. Also, urbanization should be considered as a potential influence factor when evaluating temperature-mortality association in future researches.

  13. Image interpolation via regularized local linear regression.

    PubMed

    Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang

    2011-12-01

    The linear regression model is a very attractive tool to design effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, it is shown that interpolation with OLS may have some undesirable properties from a robustness point of view: even small amounts of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting with the linear regression model where we replace the OLS error norm with the moving least squares (MLS) error norm leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the l(2)-norm as the estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness preserving constraint. The optimal model parameters can be obtained with a closed-form solution by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance with the state-of-the-art interpolation algorithms, especially in image edge structure preservation. © 2011 IEEE

  14. Multi-objective experimental design for (13)C-based metabolic flux analysis.

    PubMed

    Bouvin, Jeroen; Cajot, Simon; D'Huys, Pieter-Jan; Ampofo-Asiama, Jerry; Anné, Jozef; Van Impe, Jan; Geeraerd, Annemie; Bernaerts, Kristel

    2015-10-01

    (13)C-based metabolic flux analysis is an excellent technique to resolve fluxes in the central carbon metabolism but costs can be significant when using specialized tracers. This work presents a framework for cost-effective design of (13)C-tracer experiments, illustrated on two different networks. Linear and non-linear optimal input mixtures are computed for networks for Streptomyces lividans and a carcinoma cell line. If only glucose tracers are considered as labeled substrate for a carcinoma cell line or S. lividans, the best parameter estimation accuracy is obtained by mixtures containing high amounts of 1,2-(13)C2 glucose combined with uniformly labeled glucose. Experimental designs are evaluated based on a linear (D-criterion) and non-linear approach (S-criterion). Both approaches generate almost the same input mixture, however, the linear approach is favored due to its low computational effort. The high amount of 1,2-(13)C2 glucose in the optimal designs coincides with a high experimental cost, which is further enhanced when labeling is introduced in glutamine and aspartate tracers. Multi-objective optimization gives the possibility to assess experimental quality and cost at the same time and can reveal excellent compromise experiments. For example, the combination of 100% 1,2-(13)C2 glucose with 100% position one labeled glutamine and the combination of 100% 1,2-(13)C2 glucose with 100% uniformly labeled glutamine perform equally well for the carcinoma cell line, but the first mixture offers a decrease in cost of $ 120 per ml-scale cell culture experiment. We demonstrated the validity of a multi-objective linear approach to perform optimal experimental designs for the non-linear problem of (13)C-metabolic flux analysis. Tools and a workflow are provided to perform multi-objective design. The effortless calculation of the D-criterion can be exploited to perform high-throughput screening of possible (13)C-tracers, while the illustrated benefit of multi-objective design should stimulate its application within the field of (13)C-based metabolic flux analysis. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. Research in Applied Mathematics Related to Mathematical System Theory.

    DTIC Science & Technology

    1977-06-01

    This report deals with research results obtained in the field of mathematical system theory . Special emphasis was given to the following areas: (1...Linear system theory over a field: parametrization of multi-input, multi-output systems and the geometric structure of classes of systems of...constant dimension. (2) Linear systems over a ring: development of the theory for very general classes of rings. (3) Nonlinear system theory : basic

  16. A comparison of two multi-variable integrator windup protection schemes

    NASA Technical Reports Server (NTRS)

    Mattern, Duane

    1993-01-01

    Two methods are examined for limit and integrator wind-up protection for multi-input, multi-output linear controllers subject to actuator constraints. The methods begin with an existing linear controller that satisfies the specifications for the nominal, small perturbation, linear model of the plant. The controllers are formulated to include an additional contribution to the state derivative calculations. The first method to be examined is the multi-variable version of the single-input, single-output, high gain, Conventional Anti-Windup (CAW) scheme. Except for the actuator limits, the CAW scheme is linear. The second scheme to be examined, denoted the Modified Anti-Windup (MAW) scheme, uses a scalar to modify the magnitude of the controller output vector while maintaining the vector direction. The calculation of the scalar modifier is a nonlinear function of the controller outputs and the actuator limits. In both cases the constrained actuator is tracked. These two integrator windup protection methods are demonstrated on a turbofan engine control system with five measurements, four control variables, and four actuators. The closed-loop responses of the two schemes are compared and contrasted during limit operation. The issue of maintaining the direction of the controller output vector using the Modified Anti-Windup scheme is discussed and the advantages and disadvantages of both of the IWP methods are presented.

  17. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics

    PubMed Central

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075

  18. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    PubMed

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  19. Comparison of various error functions in predicting the optimum isotherm by linear and non-linear regression analysis for the sorption of basic red 9 by activated carbon.

    PubMed

    Kumar, K Vasanth; Porkodi, K; Rocha, F

    2008-01-15

    A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions namely coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), the average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r(2) was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K(2) was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.

  20. Applied Multiple Linear Regression: A General Research Strategy

    ERIC Educational Resources Information Center

    Smith, Brandon B.

    1969-01-01

    Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)

  1. Estimate the contribution of incubation parameters influence egg hatchability using multiple linear regression analysis

    PubMed Central

    Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma

    2016-01-01

    Aim: This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666

  2. Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.

    PubMed

    Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko

    2016-03-01

    In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

    PubMed

    Bennett, Bradley C; Husby, Chad E

    2008-03-28

    Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) to from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.

  4. An Application to the Prediction of LOD Change Based on General Regression Neural Network

    NASA Astrophysics Data System (ADS)

    Zhang, X. H.; Wang, Q. J.; Zhu, J. J.; Zhang, H.

    2011-07-01

    Traditional prediction of the LOD (length of day) change was based on linear models, such as the least square model and the autoregressive technique, etc. Due to the complex non-linear features of the LOD variation, the performances of the linear model predictors are not fully satisfactory. This paper applies a non-linear neural network - general regression neural network (GRNN) model to forecast the LOD change, and the results are analyzed and compared with those obtained with the back propagation neural network and other models. The comparison shows that the performance of the GRNN model in the prediction of the LOD change is efficient and feasible.

  5. Solving a mixture of many random linear equations by tensor decomposition and alternating minimization.

    DOT National Transportation Integrated Search

    2016-09-01

    We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...

  6. Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation

    NASA Astrophysics Data System (ADS)

    Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.

    A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OC non-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC) pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OC non-comb is allowed to obtain a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OC non-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.

  7. Hypothesis testing in functional linear regression models with Neyman's truncation and wavelet thresholding for longitudinal data.

    PubMed

    Yang, Xiaowei; Nie, Kun

    2008-03-15

    Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.

  8. Development of non-linear models predicting daily fine particle concentrations using aerosol optical depth retrievals and ground-based measurements at a municipality in the Brazilian Amazon region

    NASA Astrophysics Data System (ADS)

    Gonçalves, Karen dos Santos; Winkler, Mirko S.; Benchimol-Barbosa, Paulo Roberto; de Hoogh, Kees; Artaxo, Paulo Eduardo; de Souza Hacon, Sandra; Schindler, Christian; Künzli, Nino

    2018-07-01

    Epidemiological studies generally use particulate matter measurements with diameter less 2.5 μm (PM2.5) from monitoring networks. Satellite aerosol optical depth (AOD) data has considerable potential in predicting PM2.5 concentrations, and thus provides an alternative method for producing knowledge regarding the level of pollution and its health impact in areas where no ground PM2.5 measurements are available. This is the case in the Brazilian Amazon rainforest region where forest fires are frequent sources of high pollution. In this study, we applied a non-linear model for predicting PM2.5 concentration from AOD retrievals using interaction terms between average temperature, relative humidity, sine, cosine of date in a period of 365,25 days and the square of the lagged relative residual. Regression performance statistics were tested comparing the goodness of fit and R2 based on results from linear regression and non-linear regression for six different models. The regression results for non-linear prediction showed the best performance, explaining on average 82% of the daily PM2.5 concentrations when considering the whole period studied. In the context of Amazonia, it was the first study predicting PM2.5 concentrations using the latest high-resolution AOD products also in combination with the testing of a non-linear model performance. Our results permitted a reliable prediction considering the AOD-PM2.5 relationship and set the basis for further investigations on air pollution impacts in the complex context of Brazilian Amazon Region.

  9. Applying Regression Analysis to Problems in Institutional Research.

    ERIC Educational Resources Information Center

    Bohannon, Tom R.

    1988-01-01

    Regression analysis is one of the most frequently used statistical techniques in institutional research. Principles of least squares, model building, residual analysis, influence statistics, and multi-collinearity are described and illustrated. (Author/MSE)

  10. Fuzzy multinomial logistic regression analysis: A multi-objective programming approach

    NASA Astrophysics Data System (ADS)

    Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan

    2017-05-01

    Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, specially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus Maximum likelihood (ML) approach. Results show that the new proposed model outperforms ML in cases of small datasets.

  11. A User''s Guide to the Zwikker-Kosten Transmission Line Code (ZKTL)

    NASA Technical Reports Server (NTRS)

    Kelly, J. J.; Abu-Khajeel, H.

    1997-01-01

    This user's guide documents updates to the Zwikker-Kosten Transmission Line Code (ZKTL). This code was developed for analyzing new liner concepts developed to provide increased sound absorption. Contiguous arrays of multi-degree-of-freedom (MDOF) liner elements serve as the model for these liner configurations, and Zwikker and Kosten's theory of sound propagation in channels is used to predict the surface impedance. Transmission matrices for the various liner elements incorporate both analytical and semi-empirical methods. This allows standard matrix techniques to be employed in the code to systematically calculate the composite impedance due to the individual liner elements. The ZKTL code consists of four independent subroutines: 1. Single channel impedance calculation - linear version (SCIC) 2. Single channel impedance calculation - nonlinear version (SCICNL) 3. Multi-channel, multi-segment, multi-layer impedance calculation - linear version (MCMSML) 4. Multi-channel, multi-segment, multi-layer impedance calculation - nonlinear version (MCMSMLNL) Detailed examples, comments, and explanations for each liner impedance computation module are included. Also contained in the guide are depictions of the interactive execution, input files and output files.

  12. Differences in musculoskeletal health due to gender in a rural multiethnic cohort: a Project FRONTIER study.

    PubMed

    Brismée, J M; Yang, S; Lambert, M E; Chyu, M C; Tsai, P; Zhang, Y; Han, J; Hudson, C; Chung, Eunhee; Shen, C L

    2016-04-26

    Very few studies have investigated differences in musculoskeletal health due to gender in a large rural population. The aim of this study is to investigate factors affecting musculoskeletal health in terms of hand grip strength, musculoskeletal discomfort, and gait disturbance in a rural-dwelling, multi-ethnic cohort. Data for 1117 participants (40 years and older, 70% female) of an ongoing rural healthcare study, Project FRONTIER, were analyzed. Subjects with a history of neurological disease, stroke and movement disorder were excluded. Dominant hand grip strength was assessed by dynamometry. Gait disturbance including stiff, spastic, narrow-based, wide-based, unstable or shuffling gait was rated. Musculoskeletal discomfort was assessed by self-reported survey. Data were analyzed by linear, logistic regression and negative binomial regressions as appropriate. Demographic and socioeconomic factors were adjusted in the multiple variable analyses. In both genders, advanced age was a risk factor for weaker hand grip strength; arthritis was positively associated with musculoskeletal discomfort, and fair or poor health was significantly associated with increased risk of gait disturbance. Greater waist circumference was associated with greater musculoskeletal discomfort in males only. In females, advanced age is the risk factor for musculoskeletal discomfort as well as gait disturbance. Females with fair or poor health had weaker hand grip strength. Higher C-reactive protein and HbA1c levels were also positively associated with gait disturbance in females, but not in males. This cross-sectional study demonstrates how gender affects hand grip strength, musculoskeletal discomfort, and gait in a rural-dwelling multi-ethnic cohort. Our results suggest that musculoskeletal health may need to be assessed differently between males and females.

  13. Modeling methodology for a CMOS-MEMS electrostatic comb

    NASA Astrophysics Data System (ADS)

    Iyer, Sitaraman V.; Lakdawala, Hasnain; Mukherjee, Tamal; Fedder, Gary K.

    2002-04-01

    A methodology for combined modeling of capacitance and force 9in a multi-layer electrostatic comb is demonstrated in this paper. Conformal mapping-based analytical methods are limited to 2D symmetric cross-sections and cannot account for charge concentration effects at corners. Vertex capacitance can be more than 30% of the total capacitance in a single-layer 2 micrometers thick comb with 10 micrometers overlap. Furthermore, analytical equations are strictly valid only for perfectly symmetrical finger positions. Fringing and corner effects are likely to be more significant in a multi- layered CMOS-MEMS comb because of the presence of more edges and vertices. Vertical curling of CMOS-MEMS comb fingers may also lead to reduced capacitance and vertical forces. Gyroscopes are particularly sensitive to such undesirable forces, which therefore, need to be well-quantified. In order to address the above issues, a hybrid approach of superposing linear regression models over a set of core analytical models is implemented. Design of experiments is used to obtain data for capacitance and force using a commercial 3D boundary-element solver. Since accurate force values require significantly higher mesh refinement than accurate capacitance, we use numerical derivatives of capacitance values to compute the forces. The model is formulated such that the capacitance and force models use the same regression coefficients. The comb model thus obtained, fits the numerical capacitance data to within +/- 3% and force to within +/- 10%. The model is experimentally verified by measuring capacitance change in a specially designed test structure. The capacitance model matches measurements to within 10%. The comb model is implemented in an Analog Hardware Description Language (ADHL) for use in behavioral simulation of manufacturing variations in a CMOS-MEMS gyroscope.

  14. Labor market integration, immigration experience, and psychological distress in a multi-ethnic sample of immigrants residing in Portugal.

    PubMed

    Teixeira, Ana F; Dias, Sónia F

    2018-01-01

    This study aims at examining how factors relating to immigrants' experience in the host country affect psychological distress (PD). Specifically, we analyzed the association among socio-economic status (SES), integration in the labor market, specific immigration experience characteristics, and PD in a multi-ethnic sample of immigrant individuals residing in Lisbon, Portugal. Using a sample (n = 1375) consisting of all main immigrant groups residing in Portugal's metropolitan area of Lisbon, we estimated multivariable linear regression models of PD regressed on selected sets of socio-economic independent variables. A psychological distress scale was constructed based on five items (feeling physically tired, feeling psychologically tired, feeling happy, feeling full of energy, and feeling lonely). Variables associated with a decrease in PD are being a male (demographic), being satisfied with their income level (SES), living with the core family and having higher number of children (social isolation), planning to remain for longer periods of time in Portugal (migration project), and whether respondents considered themselves to be in good health condition (subjective health status). Study variables negatively associated with immigrants' PD were job insecurity (labor market), and the perception that health professionals were not willing to understand immigrants during a clinical interaction. The study findings emphasized the importance of labor market integration and access to good quality jobs for immigrants' psychological well-being, as well as the existence of family ties in the host country, intention to reside long term in the host country, and high subjective (physical) health. Our research suggests the need to foster cross-national studies of immigrant populations in order to understand the social mechanisms that transverse all migrant groups and contribute to lower psychological well-being.

  15. Stratification for the propensity score compared with linear regression techniques to assess the effect of treatment or exposure.

    PubMed

    Senn, Stephen; Graf, Erika; Caputo, Angelika

    2007-12-30

    Stratifying and matching by the propensity score are increasingly popular approaches to deal with confounding in medical studies investigating effects of a treatment or exposure. A more traditional alternative technique is the direct adjustment for confounding in regression models. This paper discusses fundamental differences between the two approaches, with a focus on linear regression and propensity score stratification, and identifies points to be considered for an adequate comparison. The treatment estimators are examined for unbiasedness and efficiency. This is illustrated in an application to real data and supplemented by an investigation on properties of the estimators for a range of underlying linear models. We demonstrate that in specific circumstances the propensity score estimator is identical to the effect estimated from a full linear model, even if it is built on coarser covariate strata than the linear model. As a consequence the coarsening property of the propensity score-adjustment for a one-dimensional confounder instead of a high-dimensional covariate-may be viewed as a way to implement a pre-specified, richly parametrized linear model. We conclude that the propensity score estimator inherits the potential for overfitting and that care should be taken to restrict covariates to those relevant for outcome. Copyright (c) 2007 John Wiley & Sons, Ltd.

  16. [Research on the methods for multi-class kernel CSP-based feature extraction].

    PubMed

    Wang, Jinjia; Zhang, Lingzhi; Hu, Bei

    2012-04-01

    To relax the presumption of strictly linear patterns in the common spatial patterns (CSP), we studied the kernel CSP (KCSP). A new multi-class KCSP (MKCSP) approach was proposed in this paper, which combines the kernel approach with multi-class CSP technique. In this approach, we used kernel spatial patterns for each class against all others, and extracted signal components specific to one condition from EEG data sets of multiple conditions. Then we performed classification using the Logistic linear classifier. Brain computer interface (BCI) competition III_3a was used in the experiment. Through the experiment, it can be proved that this approach could decompose the raw EEG singles into spatial patterns extracted from multi-class of single trial EEG, and could obtain good classification results.

  17. Linear FBG Temperature Sensor Interrogation with Fabry-Perot ITU Multi-wavelength Reference

    PubMed Central

    Park, Hyoung-Jun; Song, Minho

    2008-01-01

    The equidistantly spaced multi-passbands of a Fabry-Perot ITU filter are used as an efficient multi-wavelength reference for fiber Bragg grating sensor demodulation. To compensate for the nonlinear wavelength tuning effect in the FBG sensor demodulator, a polynomial fitting algorithm was applied to the temporal peaks of the wavelength-scanned ITU filter. The fitted wavelength values are assigned to the peak locations of the FBG sensor reflections, obtaining constant accuracy, regardless of the wavelength scan range and frequency. A linearity error of about 0.18% against a reference thermocouple thermometer was obtained with the suggested method. PMID:27873898

  18. Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.

    PubMed

    Trninić, Marko; Jeličić, Mario; Papić, Vladan

    2015-07-01

    In kinesiology, medicine, biology and psychology, in which research focus is on dynamical self-organized systems, complex connections exist between variables. Non-linear nature of complex systems has been discussed and explained by the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models, and by (b) including all basketball athletes in the same sample of participants regardless of their playing position. In this paper the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players divided in three groups according to their playing time (8 minutes and more per game) and playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear (general model) and non-linear (general model) regression models were calculated simultaneously and separately for each group. The conclusion is viable: non-linear regressions are frequently superior to linear correlations when interpreting actual association logic among research variables.

  19. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

  20. Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

    PubMed

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A

    2013-01-01

    Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.

  1. Analysis and prediction of flow from local source in a river basin using a Neuro-fuzzy modeling tool.

    PubMed

    Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi

    2007-10-01

    Traditionally, the multiple linear regression technique has been one of the most widely used models in simulating hydrological time series. However, when the nonlinear phenomenon is significant, the multiple linear will fail to develop an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating the nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for the purpose of predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to provide the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, a multiple linear regression analysis that was being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flow while the low and medium flow magnitudes were estimated closer to the observed data. The comparison of the prediction accuracy of the neuro-fuzzy and linear regression methods indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model was able to improve the root mean square error (RMSE) and mean absolute percentage error (MAPE) values of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling of flow dynamics in the study area.

  2. An hourly PM10 diagnosis model for the Bilbao metropolitan area using a linear regression methodology.

    PubMed

    González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O

    2013-07-01

    There is extensive evidence of the negative impacts on health linked to the rise of the regional background of particulate matter (PM) 10 levels. These levels are often increased over urban areas becoming one of the main air pollution concerns. This is the case on the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7-year historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative-log [NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity-and qualitative-working days/weekends, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation. Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms following the Sawa's Bayesian Information Criteria (INT-BIC). Each type of model is calculated selecting two different periods: the training (it consists of 6 years) and the testing dataset (it consists of 1 year). The results of each type of model show that the INT-BIC-based model (R(2) = 0.42) is the best. Results were R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to the state-of-the art methodology. The related error calculated for longer time intervals (monthly or seasonal means) diminished significantly (R of 0.75-0.80 for monthly means and R of 0.80 to 0.98 at seasonally means) with respect to shorter periods.

  3. Visual field progression in glaucoma: estimating the overall significance of deterioration with permutation analyses of pointwise linear regression (PoPLR).

    PubMed

    O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H

    2012-10-01

    To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.

  4. A Model Comparison for Count Data with a Positively Skewed Distribution with an Application to the Number of University Mathematics Courses Completed

    ERIC Educational Resources Information Center

    Liou, Pey-Yan

    2009-01-01

    The current study examines three regression models: OLS (ordinary least square) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…

  5. Use of AMMI and linear regression models to analyze genotype-environment interaction in durum wheat.

    PubMed

    Nachit, M M; Nachit, G; Ketata, H; Gauch, H G; Zobel, R W

    1992-03-01

    The joint durum wheat (Triticum turgidum L var 'durum') breeding program of the International Maize and Wheat Improvement Center (CIMMYT) and the International Center for Agricultural Research in the Dry Areas (ICARDA) for the Mediterranean region employs extensive multilocation testing. Multilocation testing produces significant genotype-environment (GE) interaction that reduces the accuracy for estimating yield and selecting appropriate germ plasm. The sum of squares (SS) of GE interaction was partitioned by linear regression techniques into joint, genotypic, and environmental regressions, and by Additive Main effects and the Multiplicative Interactions (AMMI) model into five significant Interaction Principal Component Axes (IPCA). The AMMI model was more effective in partitioning the interaction SS than the linear regression technique. The SS contained in the AMMI model was 6 times higher than the SS for all three regressions. Postdictive assessment recommended the use of the first five IPCA axes, while predictive assessment AMMI1 (main effects plus IPCA1). After elimination of random variation, AMMI1 estimates for genotypic yields within sites were more precise than unadjusted means. This increased precision was equivalent to increasing the number of replications by a factor of 3.7.

  6. FIRE: an SPSS program for variable selection in multiple linear regression analysis via the relative importance of predictors.

    PubMed

    Lorenzo-Seva, Urbano; Ferrando, Pere J

    2011-03-01

    We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

  7. Linear regression based on Minimum Covariance Determinant (MCD) and TELBS methods on the productivity of phytoplankton

    NASA Astrophysics Data System (ADS)

    Gusriani, N.; Firdaniza

    2018-03-01

    The existence of outliers on multiple linear regression analysis causes the Gaussian assumption to be unfulfilled. If the Least Square method is forcedly used on these data, it will produce a model that cannot represent most data. For that, we need a robust regression method against outliers. This paper will compare the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contains outliers. Based on the robust determinant coefficient value, MCD method produces a better model compared to TELBS method.

  8. Multi-metals column adsorption of lead(II), cadmium(II) and manganese(II) onto natural bentonite clay.

    PubMed

    Alexander, Jock Asanja; Surajudeen, Abdulsalam; Aliyu, El-Nafaty Usman; Omeiza, Aroke Umar; Zaini, Muhammad Abbas Ahmad

    2017-10-01

    The present work was aimed at evaluating the multi-metals column adsorption of lead(II), cadmium(II) and manganese(II) ions onto natural bentonite. The bentonite clay adsorbent was characterized for physical and chemical properties using X-ray diffraction, X-ray fluorescence, Brunauer-Emmett-Teller surface area and cation exchange capacity. The column performance was evaluated using adsorbent bed height of 5.0 cm, with varying influent concentrations (10 mg/L and 50 mg/L) and flow rates (1.4 mL/min and 2.4 mL/min). The result shows that the breakthrough time for all metal ions ranged from 50 to 480 minutes. The maximum adsorption capacity was obtained at initial concentration of 10 mg/L and flow rate of 1.4 mL/min, with 2.22 mg/g of lead(II), 1.71 mg/g of cadmium(II) and 0.37 mg/g of manganese(II). The order of metal ions removal by natural bentonite is lead(II) > cadmium(II) > manganese(II). The sorption performance and the dynamic behaviour of the column were predicted using Adams-Bohart, Thomas, and Yoon-Nelson models. The linear regression analysis demonstrated that the Thomas and Yoon-Nelson models fitted well with the column adsorption data for all metal ions. The natural bentonite was effective for the treatment of wastewater laden with multi-metals, and the process parameters obtained from this work can be used at the industrial scale.

  9. Frequency Selection for Multi-frequency Acoustic Measurement of Suspended Sediment

    NASA Astrophysics Data System (ADS)

    Chen, X.; HO, H.; Fu, X.

    2017-12-01

    Multi-frequency acoustic measurement of suspended sediment has found successful applications in marine and fluvial environments. Difficult challenges remain in regard to improving its effectiveness and efficiency when applied to high concentrations and wide size distributions in rivers. We performed a multi-frequency acoustic scattering experiment in a cylindrical tank with a suspension of natural sands. The sands range from 50 to 600 μm in diameter with a lognormal size distribution. The bulk concentration of suspended sediment varied from 1.0 to 12.0 g/L. We found that the commonly used linear relationship between the intensity of acoustic backscatter and suspended sediment concentration holds only at sufficiently low concentrations, for instance below 3.0 g/L. It fails at a critical value of concentration that depends on measurement frequency and the distance between the transducer and the target point. Instead, an exponential relationship was found to work satisfactorily throughout the entire range of concentration. The coefficient and exponent of the exponential function changed, however, with the measuring frequency and distance. Considering the increased complexity of inverting the concentration values when an exponential relationship prevails, we further analyzed the relationship between measurement error and measuring frequency. It was also found that the inversion error may be effectively controlled within 5% if the frequency is properly set. Compared with concentration, grain size was found to heavily affect the selection of optimum frequency. A regression relationship for optimum frequency versus grain size was developed based on the experimental results.

  10. SAE for the prediction of road traffic status from taxicab operating data and bus smart card data

    NASA Astrophysics Data System (ADS)

    Zhengfeng, Huang; Pengjun, Zheng; Wenjun, Xu; Gang, Ren

    Road traffic status is significant for trip decision and traffic management, and thus should be predicted accurately. A contribution is that we consider multi-modal data for traffic status prediction than only using single source data. With the substantial data from Ningbo Passenger Transport Management Sector (NPTMS), we wished to determine whether it was possible to develop Stacked Autoencoders (SAEs) for accurately predicting road traffic status from taxicab operating data and bus smart card data. We show that SAE performed better than linear regression model and Back Propagation (BP) neural network for determining the relationship between road traffic status and those factors. In a 26-month data experiment using SAE, we show that it is possible to develop highly accurate predictions (91% test accuracy) of road traffic status from daily taxicab operating data and bus smart card data.

  11. A parameter estimation subroutine package

    NASA Technical Reports Server (NTRS)

    Bierman, G. J.; Nead, M. W.

    1978-01-01

    Linear least squares estimation and regression analyses continue to play a major role in orbit determination and related areas. In this report we document a library of FORTRAN subroutines that have been developed to facilitate analyses of a variety of estimation problems. Our purpose is to present an easy to use, multi-purpose set of algorithms that are reasonably efficient and which use a minimal amount of computer storage. Subroutine inputs, outputs, usage and listings are given along with examples of how these routines can be used. The following outline indicates the scope of this report: Section (1) introduction with reference to background material; Section (2) examples and applications; Section (3) subroutine directory summary; Section (4) the subroutine directory user description with input, output, and usage explained; and Section (5) subroutine FORTRAN listings. The routines are compact and efficient and are far superior to the normal equation and Kalman filter data processing algorithms that are often used for least squares analyses.

  12. Social networks and health-related quality of life: a population based study among older adults.

    PubMed

    Gallegos-Carrillo, Katia; Mudgal, Jyoti; Sánchez-García, Sergio; Wagner, Fernando A; Gallo, Joseph J; Salmerón, Jorge; García-Peña, Carmen

    2009-01-01

    To examine the relationship between components of social networks and health-related quality of life (HRQL) in older adults with and without depressive symptoms. Comparative cross-sectional study with data from the cohort study 'Integral Study of Depression', carried out in Mexico City during 2004. The sample was selected through a multi-stage probability design. HRQL was measured with the SF-36. Geriatric Depression Scale (GDS) and the Short Anxiety Screening Test (SAST) determined depressive symptoms and anxiety. T-test and multiple linear regressions were conducted. Older adults with depressive symptoms had the lowest scores in all HRQL scales. A larger network of close relatives and friends was associated with better HRQL on several scales. Living alone did not significantly affect HRQL level, in either the study or comparison group. A positive association between some components of social networks and good HRQL exists even in older adults with depressive symptoms.

  13. The standard calibration instrument automation system for the atomic absorption spectrophotometer. Part 3: Program documentation

    NASA Astrophysics Data System (ADS)

    Ryan, D. P.; Roth, G. S.

    1982-04-01

    Complete documentation of the 15 programs and 11 data files of the EPA Atomic Absorption Instrument Automation System is presented. The system incorporates the following major features: (1) multipoint calibration using first, second, or third degree regression or linear interpolation, (2) timely quality control assessments for spiked samples, duplicates, laboratory control standards, reagent blanks, and instrument check standards, (3) reagent blank subtraction, and (4) plotting of calibration curves and raw data peaks. The programs of this system are written in Data General Extended BASIC, Revision 4.3, as enhanced for multi-user, real-time data acquisition. They run in a Data General Nova 840 minicomputer under the operating system RDOS, Revision 6.2. There is a functional description, a symbol definitions table, a functional flowchart, a program listing, and a symbol cross reference table for each program. The structure of every data file is also detailed.

  14. A multi-data source surveillance system to detect a bioterrorism attack during the G8 Summit in Scotland.

    PubMed

    Meyer, N; McMenamin, J; Robertson, C; Donaghy, M; Allardice, G; Cooper, D

    2008-07-01

    In 18 weeks, Health Protection Scotland (HPS) deployed a syndromic surveillance system to early-detect natural or intentional disease outbreaks during the G8 Summit 2005 at Gleneagles, Scotland. The system integrated clinical and non-clinical datasets. Clinical datasets included Accident & Emergency (A&E) syndromes, and General Practice (GPs) codes grouped into syndromes. Non-clinical data included telephone calls to a nurse helpline, laboratory test orders, and hotel staff absenteeism. A cumulative sum-based detection algorithm and a log-linear regression model identified signals in the data. The system had a fax-based track for real-time identification of unusual presentations. Ninety-five signals were triggered by the detection algorithms and four forms were faxed to HPS. Thirteen signals were investigated. The system successfully complemented a traditional surveillance system in identifying a small cluster of gastroenteritis among the police force and triggered interventions to prevent further cases.

  15. Use of anomolous thermal imaging effects for multi-mode systems control during crystal growth

    NASA Technical Reports Server (NTRS)

    Wargo, Michael J.

    1989-01-01

    Real time image processing techniques, combined with multitasking computational capabilities are used to establish thermal imaging as a multimode sensor for systems control during crystal growth. Whereas certain regions of the high temperature scene are presently unusable for quantitative determination of temperature, the anomalous information thus obtained is found to serve as a potentially low noise source of other important systems control output. Using this approach, the light emission/reflection characteristics of the crystal, meniscus and melt system are used to infer the crystal diameter and a linear regression algorithm is employed to determine the local diameter trend. This data is utilized as input for closed loop control of crystal shape. No performance penalty in thermal imaging speed is paid for this added functionality. Approach to secondary (diameter) sensor design and systems control structure is discussed. Preliminary experimental results are presented.

  16. In-line moisture monitoring in fluidized bed granulation using a novel multi-resonance microwave sensor.

    PubMed

    Peters, Johanna; Bartscher, Kathrin; Döscher, Claas; Taute, Wolfgang; Höft, Michael; Knöchel, Reinhard; Breitkreutz, Jörg

    2017-08-01

    Microwave resonance technology (MRT) is known as a process analytical technology (PAT) tool for moisture measurements in fluid-bed granulation. It offers a great potential for wet granulation processes even where the suitability of near-infrared (NIR) spectroscopy is limited, e.g. colored granules, large variations in bulk density. However, previous sensor systems operating around a single resonance frequency showed limitations above approx. 7.5% granule moisture. This paper describes the application of a novel sensor working with four resonance frequencies. In-line data of all four resonance frequencies were collected and further processed. Based on calculation of density-independent microwave moisture values multiple linear regression (MLR) models using Karl-Fischer titration (KF) as well as loss on drying (LOD) as reference methods were build. Rapid, reliable in-process moisture control (RMSEP≤0.5%) even at higher moisture contents was achieved. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Simultaneous determination of 12 chemical constituents in the traditional Chinese Medicinal Prescription Xiao-Yao-San-Jia-Wei by HPLC coupled with photodiode array detection.

    PubMed

    Zhang, Hongmin; Chen, Shiwei; Qin, Feng; Huang, Xi; Ren, Ping; Gu, Xinqi

    2008-12-15

    An HPLC-photodiode array (PDA) detection method was established for the simultaneous determination of 12 components in Xiao-Yao-San-Jia-Wei (XYSJW): geniposide, puerarin, paeoniflorin, ferulic acid, liquiritin, hesperidin, naringin, paeonol, daidzein, glycyrrhizic acid, honokiol, and magnolol. These were separated in less than 70 min using a Waters Symmetry Shield RP 18 column with gradient elution using (A) acetonitrile, (B) water, and (C) acetic acid at a flow rate of 1 ml/min, and with a PDA detector. All calibration curves showed good linear regression (r(2)>0.9992) within the test ranges. The method was validated for specificity, accuracy, precision, and limits of detection. The proposed method enables in a single run the simultaneous identification and determination for quality control of 12 multi-structural components of XYSJW forming the basis of its therapeutic effect.

  18. Analysis of factors that inhibiting implementation of Information Security Management System (ISMS) based on ISO 27001

    NASA Astrophysics Data System (ADS)

    Tatiara, R.; Fajar, A. N.; Siregar, B.; Gunawan, W.

    2018-03-01

    The purpose of this research is to determine multi factors that inhibiting the implementation of the ISMS based on ISO 2700. It is also to propose a follow-up recommendation on the factors that inhibit the implementation of the ISMS. Data collection is derived from questionnaires to 182 respondents from users in data center operation (DCO) at bca, Indonesian telecommunication international (telin), and data centre division at Indonesian Ministry of Health. We analysing data collection with multiple linear regression analysis and paired t-test. The results are multiple factors which inhibiting the implementation of the ISMS from the three organizations which has implement and operate the ISMS, ISMS documentation management, and continual improvement. From this research, we concluded that the processes of implementation in ISMS is the necessity of the role of all parties in succeeding the implementation of the ISMS continuously.

  19. Exposure to tobacco and nicotine product advertising: Associations with perceived prevalence of use among college students.

    PubMed

    Kreitzberg, Daniel S; Herrera, Ana Laura; Loukas, Alexandra; Pasch, Keryn E

    2018-03-22

    The purpose of this study was to examine the relationship between exposure to tobacco marketing and perceptions of peer tobacco use among college students. Participants were 5,767 undergraduate students from 19 colleges/universities in the State of Texas. Students completed an online survey, in the spring of 2016, that assessed past 30 day exposure to e-cigarette, cigar, smokeless tobacco, and traditional cigarette advertising across multiple marketing channels, past 30 day use of each product, and perceived prevalence of peer use. Multi-level linear regression models were run to examine the associations between exposure to tobacco advertising and perceptions of peer tobacco use controlling for age, gender, race/ethnicity, use and school. Greater exposure to advertising was associated with greater perceived prevalence of peer use. Given the normative effects of advertising on perceived peer tobacco use, college tobacco initiatives should include descriptive norms education to counteract inaccurate perceptions.

  20. Orthogonal Projection in Teaching Regression and Financial Mathematics

    ERIC Educational Resources Information Center

    Kachapova, Farida; Kachapov, Ilias

    2010-01-01

    Two improvements in teaching linear regression are suggested. The first is to include the population regression model at the beginning of the topic. The second is to use a geometric approach: to interpret the regression estimate as an orthogonal projection and the estimation error as the distance (which is minimized by the projection). Linear…

  1. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  2. A single blood test adjusted for different liver fibrosis targets improves fibrosis staging and especially cirrhosis diagnosis.

    PubMed

    Calès, Paul; Boursier, Jérôme; Oberti, Frédéric; Moal, Valérie; Fouchard Hubert, Isabelle; Bertrais, Sandrine; Hunault, Gilles; Rousselet, Marie Christine

    2018-04-01

    Fibrosis blood tests are usually developed using significant fibrosis, which is a unique diagnostic target; however, these tests are employed for other diagnostic targets, such as cirrhosis. We aimed to improve fibrosis staging accuracy by simultaneously targeting biomarkers for several diagnostic targets. A total of 3,809 patients were included, comprising 1,012 individuals with chronic hepatitis C (CHC) into a derivation population and 2,797 individuals into validation populations of different etiologies (CHC, chronic hepatitis B, human immunodeficiency virus/CHC, nonalcoholic fatty liver disease, alcohol) using Metavir fibrosis stages as reference. FibroMeter biomarkers were targeted for different fibrosis-stage combinations into classical scores by logistic regression. Independent scores were combined into a single score reflecting Metavir stages by linear regression and called Multi-FibroMeter Version Second Generation (V2G). The primary objective was to combine the advantages of a test targeted for significant fibrosis (FibroMeter V2G ) with those of a test targeted for cirrhosis (CirrhoMeter V2G ). In the derivation CHC population, we first compared Multi-FibroMeter V2G to FibroMeter V2G and observed significant increases in the cirrhosis area under the receiver operating characteristic curve (AUROC), Obuchowski index (reflecting all fibrosis-stage AUROCs), and classification metric (six classes expressed as a correctly classified percentage) and a nonsignificant increase in significant fibrosis AUROC. Thereafter, we compared it to CirroMeter V2G and observed a nonsignificant increase in the cirrhosis AUROC. In all 3,809 patients, respective accuracies for Multi-FibroMeter V2G and FibroMeter V2G were the following: cirrhosis AUROC, 0.906 versus 0.878 ( P < 0.001; versus CirroMeter V2G , 0.897, P = 0.014); Obuchowski index, 0.795 versus 0.791 ( P = 0.059); classification, 86.0% versus 82.1% ( P < 0.001); significant fibrosis AUROC, 0.833 versus 0.832 ( P = 0.366). Multi-FibroMeter V2G had the highest correlation with the area of portoseptal fibrosis and the highest reproducibility over time. Correct classification rates of Multi-FibroMeter with hyaluronate (V2G, 86.0%) or without (V3G, 86.1%) did not differ ( P = 0.938). Conclusion: Multitargeting biomarkers significantly improves fibrosis staging and especially cirrhosis diagnosis compared to classical single-targeted blood tests. ( Hepatology Communications 2018;2:455-466).

  3. [Study of Cervical Exfoliated Cell's DNA Quantitative Analysis Based on Multi-Spectral Imaging Technology].

    PubMed

    Wu, Zheng; Zeng, Li-bo; Wu, Qiong-shui

    2016-02-01

    The conventional cervical cancer screening methods mainly include TBS (the bethesda system) classification method and cellular DNA quantitative analysis, however, by using multiple staining method in one cell slide, which is staining the cytoplasm with Papanicolaou reagent and the nucleus with Feulgen reagent, the study of achieving both two methods in the cervical cancer screening at the same time is still blank. Because the difficulty of this multiple staining method is that the absorbance of the non-DNA material may interfere with the absorbance of DNA, so that this paper has set up a multi-spectral imaging system, and established an absorbance unmixing model by using multiple linear regression method based on absorbance's linear superposition character, and successfully stripped out the absorbance of DNA to run the DNA quantitative analysis, and achieved the perfect combination of those two kinds of conventional screening method. Through a series of experiment we have proved that between the absorbance of DNA which is calculated by the absorbance unmixxing model and the absorbance of DNA which is measured there is no significant difference in statistics when the test level is 1%, also the result of actual application has shown that there is no intersection between the confidence interval of the DNA index of the tetraploid cells which are screened by using this paper's analysis method when the confidence level is 99% and the DNA index's judging interval of cancer cells, so that the accuracy and feasibility of the quantitative DNA analysis with multiple staining method expounded by this paper have been verified, therefore this analytical method has a broad application prospect and considerable market potential in early diagnosis of cervical cancer and other cancers.

  4. Genetics-based control of a mimo boiler-turbine plant

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dimeo, R.M.; Lee, K.Y.

    1994-12-31

    A genetic algorithm is used to develop an optimal controller for a non-linear, multi-input/multi-output boiler-turbine plant. The algorithm is used to train a control system for the plant over a wide operating range in an effort to obtain better performance. The results of the genetic algorithm`s controller designed from the linearized plant model at a nominal operating point. Because the genetic algorithm is well-suited to solving traditionally difficult optimization problems it is found that the algorithm is capable of developing the controller based on input/output information only. This controller achieves a performance comparable to the standard linear quadratic regulator.

  5. Electronic and spectroscopic characterizations of SNP isomers

    NASA Astrophysics Data System (ADS)

    Trabelsi, Tarek; Al Mogren, Muneerah Mogren; Hochlaf, Majdi; Francisco, Joseph S.

    2018-02-01

    High-level ab initio electronic structure calculations were performed to characterize SNP isomers. In addition to the known linear SNP, cyc-PSN, and linear SPN isomers, we identified a fourth isomer, linear PSN, which is located ˜2.4 eV above the linear SNP isomer. The low-lying singlet and triplet electronic states of the linear SNP and SPN isomers were investigated using a multi-reference configuration interaction method and large basis set. Several bound electronic states were identified. However, their upper rovibrational levels were predicted to pre-dissociate, leading to S + PN, P + NS products, and multi-step pathways were discovered. For the ground states, a set of spectroscopic parameters were derived using standard and explicitly correlated coupled-cluster methods in conjunction with augmented correlation-consistent basis sets extrapolated to the complete basis set limit. We also considered scalar and core-valence effects. For linear isomers, the rovibrational spectra were deduced after generation of their 3D-potential energy surfaces along the stretching and bending coordinates and variational treatments of the nuclear motions.

  6. Two-year clinical trial of a universal adhesive in total-etch and self-etch mode in non-carious cervical lesions☆

    PubMed Central

    Lawson, Nathaniel C.; Robles, Augusto; Fu, Chin-Chuan; Lin, Chee Paul; Sawlani, Kanchan; Burgess, John O.

    2016-01-01

    Objectives To compare the clinical performance of Scotchbond™ Universal Adhesive used in self- and total-etch modes and two-bottle Scotchbond™ Multi-purpose Adhesive in total-etch mode for Class 5 non-carious cervical lesions (NCCLs). Methods 37 adults were recruited with 3 or 6 NCCLs (>1.5 mm deep). Teeth were isolated, and a short cervical bevel was prepared. Teeth were restored randomly with Scotchbond Universal total-etch, Scotchbond Universal self-etch or Scotchbond Multi-purpose followed with a composite resin. Restorations were evaluated at baseline, 6, 12 and 24 months for marginal adaptation, marginal discoloration, secondary caries, and sensitivity to cold using modified USPHS Criteria. Patients and evaluators were blinded. Logistic and linear regression models using a generalized estimating equation were applied to evaluate the effects of time and adhesive material on clinical assessment outcomes over the 24 month follow-up period. Kaplan–Meier method was used to compare the retention between adhesive materials. Results Clinical performance of all adhesive materials deteriorated over time for marginal adaptation, and discoloration (p <0.0001). Both Scotchbond Universal self-etch and Scotchbond Multi-purpose materials were more than three times as likely to contribute to less satisfying performance in marginal discoloration over time than Scotchbond Universal total-etch. The retention rates up to 24 months were 87.6%, 94.9% and 100% for Scotchbond Multi-purpose and Scotchbond Universal self-etch and total-etch, respectively. Conclusions Scotchbond Universal in self- and total- etch modes performed similar to or better than Scotchbond Multipurpose, respectively. Clinical significance 24 month evaluation of a universal adhesive indicates acceptable clinical performance, particularly in a total-etch mode. PMID:26231300

  7. Analysis of Learning Curve Fitting Techniques.

    DTIC Science & Technology

    1987-09-01

    1986. 15. Neter, John and others. Applied Linear Regression Models. Homewood IL: Irwin, 19-33. 16. SAS User’s Guide: Basics, Version 5 Edition. SAS... Linear Regression Techniques (15:23-52). Random errors are assumed to be normally distributed when using -# ordinary least-squares, according to Johnston...lot estimated by the improvement curve formula. For a more detailed explanation of the ordinary least-squares technique, see Neter, et. al., Applied

  8. On vertical profile of ozone at Syowa

    NASA Technical Reports Server (NTRS)

    Chubachi, Shigeru

    1994-01-01

    The difference in the vertical ozone profile at Syowa between 1966-1981 and 1982-1988 is shown. The month-height cross section of the slope of the linear regressions between ozone partial pressure and 100-mb temperature is also shown. The vertically integrated values of the slopes are in close agreement with the slopes calculated by linear regression of Dobson total ozone on 100-mb temperature in the period of 1982-1988.

  9. Binding affinity toward human prion protein of some anti-prion compounds - Assessment based on QSAR modeling, molecular docking and non-parametric ranking.

    PubMed

    Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija

    2018-01-01

    The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrP C ) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Classification of sodium MRI data of cartilage using machine learning.

    PubMed

    Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R

    2015-11-01

    To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interests in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each regions of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interests for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.

  11. Multi-mode sliding mode control for precision linear stage based on fixed or floating stator.

    PubMed

    Fang, Jiwen; Long, Zhili; Wang, Michael Yu; Zhang, Lufan; Dai, Xufei

    2016-02-01

    This paper presents the control performance of a linear motion stage driven by Voice Coil Motor (VCM). Unlike the conventional VCM, the stator of this VCM is regulated, which means it can be adjusted as a floating-stator or fixed-stator. A Multi-Mode Sliding Mode Control (MMSMC), including a conventional Sliding Mode Control (SMC) and an Integral Sliding Mode Control (ISMC), is designed to control the linear motion stage. The control is switched between SMC and IMSC based on the error threshold. To eliminate the chattering, a smooth function is adopted instead of a signum function. The experimental results with the floating stator show that the positioning accuracy and tracking performance of the linear motion stage are improved with the MMSMC approach.

  12. Nonlinear isochrones in murine left ventricular pressure-volume loops: how well does the time-varying elastance concept hold?

    PubMed

    Claessens, T E; Georgakopoulos, D; Afanasyeva, M; Vermeersch, S J; Millar, H D; Stergiopulos, N; Westerhof, N; Verdonck, P R; Segers, P

    2006-04-01

    The linear time-varying elastance theory is frequently used to describe the change in ventricular stiffness during the cardiac cycle. The concept assumes that all isochrones (i.e., curves that connect pressure-volume data occurring at the same time) are linear and have a common volume intercept. Of specific interest is the steepest isochrone, the end-systolic pressure-volume relationship (ESPVR), of which the slope serves as an index for cardiac contractile function. Pressure-volume measurements, achieved with a combined pressure-conductance catheter in the left ventricle of 13 open-chest anesthetized mice, showed a marked curvilinearity of the isochrones. We therefore analyzed the shape of the isochrones by using six regression algorithms (two linear, two quadratic, and two logarithmic, each with a fixed or time-varying intercept) and discussed the consequences for the elastance concept. Our main observations were 1) the volume intercept varies considerably with time; 2) isochrones are equally well described by using quadratic or logarithmic regression; 3) linear regression with a fixed intercept shows poor correlation (R(2) < 0.75) during isovolumic relaxation and early filling; and 4) logarithmic regression is superior in estimating the fixed volume intercept of the ESPVR. In conclusion, the linear time-varying elastance fails to provide a sufficiently robust model to account for changes in pressure and volume during the cardiac cycle in the mouse ventricle. A new framework accounting for the nonlinear shape of the isochrones needs to be developed.

  13. Does Nonlinear Modeling Play a Role in Plasmid Bioprocess Monitoring Using Fourier Transform Infrared Spectra?

    PubMed

    Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M

    2017-06-01

    The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performances compared to the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version could not be found. The use of nonlinear methods, however, shall be judged regarding the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods herein studied are able to successfully monitor the variables under study.

  14. Physical activity in Iranian older adults who experienced fall during the past 12 months.

    PubMed

    Salehi, Leili; Shokrvash, Behjat; Jamshidi, Ensiyeh; Montazeri, Ali

    2014-10-31

    Physical activity may have several benefits for elderly people. However, the risk of falling might prevent this population from showing interest in physical activity. This research was aimed to explore facilitators and barriers to physical activity in older persons who have experienced at least one fall in the past 12 months. This cross sectional study was conducted in 2010-2011, in Tehran, Iran. Using a multistage sampling method a group of elderly people entered into the study. A multi-section questionnaire was used to collect data on demographic information, physical activity level, and different determinants that might influence physical activity. Several statistical tests including linear regression were used to analyze the data. In all, 180 old people from 40 elderly centers (49 men and 131 women) took part in the study. The mean age of participants was 65.9 ± 6.1 years. The result indicated that most participants experienced two or more falls during the last year (54.5%). Those who had more falls significantly scored lower on the Physical Activity Scale for Elderly (p < 0.0001). 'Keeping in touch with friends' was the most important advantage cited by participants for performing physical activity. The results obtained from linear regression analysis showed that 'perceived benefits' was the only significant factor that associated with physical activity (β = 1.03, p < 0.001). The findings suggest that perceived benefits could facilitate physical activity among elderly regardless of number of falls, self-reported health and daily living activities. However, we observed inverse association between number of falls and physical activity. Indeed the findings suggest that we should reinforce benefits exist when designing programs to increase physical activity among elderly population.

  15. Temporal trend of green space coverage in China and its relationship with urbanization over the last two decades.

    PubMed

    Zhao, Juanjuan; Chen, Shengbin; Jiang, Bo; Ren, Yin; Wang, Hua; Vause, Jonathan; Yu, Haidong

    2013-01-01

    Irrespective of which side is taken in the densification-sprawl debate, insights into the relationship between urban green space coverage and urbanization have been recognized as essential for guiding sustainable urban development. However, knowledge of the relationships between socio-economic variables of urbanization and long-term green space change is still limited. In this paper, using simple regression, hierarchical partitioning and multi-regression, the temporal trend in green space coverage and its relationship with urbanization were investigated using data from 286 cities between 1989 and 2009, covering all provinces in mainland China with the exception of Tibet. We found that: [1] average green space coverage of cities investigated increased steadily from 17.0% in 1989 to 37.3% in 2009; [2] cities with higher recent green space coverage also had relatively higher green space coverage historically; [3] cities in the same region exhibited similar long-term trends in green space coverage; [4] eight of the nine variables characterizing urbanization showed a significant positive linear relationship with green space coverage, with 'per capita GDP' having the highest independent contribution (24.2%); [5] among the climatic and geographic factors investigated, only mean elevation showed a significant effect; and [6] using the seven largest contributing individual factors, a linear model to predict variance in green space coverage was constructed. Here, we demonstrated that green space coverage in built-up areas tended to reflect the effects of urbanization rather than those of climatic or geographic factors. Quantification of the urbanization effects and the characteristics of green space development in China may provide a valuable reference for research into the processes of urban sprawl and its relationship with green space change. Copyright © 2012 Elsevier B.V. All rights reserved.

  16. Assessment of Various Remote Sensing Technologies in Biomass and Nitrogen Content Estimation Using AN Agricultural Test Field

    NASA Astrophysics Data System (ADS)

    Näsi, R.; Viljanen, N.; Kaivosoja, J.; Hakala, T.; Pandžić, M.; Markelin, L.; Honkavaara, E.

    2017-10-01

    Multispectral and hyperspectral imaging is usually acquired by satellite and aircraft platforms. Recently, miniaturized hyperspectral 2D frame cameras have showed great potential to precise agriculture estimations and they are feasible to combine with lightweight platforms, such as drones. Drone platform is a flexible tool for remote sensing applications with environment and agriculture. The assessment and comparison of different platforms such as satellite, aircraft and drones with different sensors, such as hyperspectral and RGB cameras is an important task in order to understand the potential of the data provided by these equipment and to select the most appropriate according to the user applications and requirements. In this context, open and permanent test fields are very significant and helpful experimental environment, since they provide a comparative data for different platforms, sensors and users, allowing multi-temporal analyses as well. Objective of this work was to investigate the feasibility of an open permanent test field in context of precision agriculture. Satellite (Sentinel-2), aircraft and drones with hyperspectral and RGB cameras were assessed in this study to estimate biomass, using linear regression models and in-situ samples. Spectral data and 3D information were used and compared in different combinations to investigate the quality of the models. The biomass estimation accuracies using linear regression models were better than 90 % for the drone based datasets. The results showed that the use of spectral and 3D features together improved the estimation model. However, estimation of nitrogen content was less accurate with the evaluated remote sensing sensors. The open and permanent test field showed to be suitable to provide an accurate and reliable reference data for the commercial users and farmers.

  17. Impact of preoperative calculation of nephron volume loss on future of partial nephrectomy techniques; planning a strategic roadmap for improving functional preservation and securing oncological safety.

    PubMed

    Rha, Koon H; Abdel Raheem, Ali; Park, Sung Y; Kim, Kwang H; Kim, Hyung J; Koo, Kyo C; Choi, Young D; Jung, Byung H; Lee, Sang K; Lee, Won K; Krishnan, Jayram; Shin, Tae Y; Cho, Jin-Seon

    2017-11-01

    To assess the correlation of the resected and ischaemic volume (RAIV), which is a preoperatively calculated volume of nephron loss, with the amount of postoperative renal function (PRF) decline after minimally invasive partial nephrectomy (PN) in a multi-institutional dataset. We identified 348 patients from March 2005 to December 2013 at six institutions. Data on all cases of laparoscopic (n = 85) and robot-assisted PN (n = 263) performed were retrospectively gathered. Univariable and multivariable linear regression analyses were used to identify the associations between various time points of PRF and the RAIV, as a continuous variable. The mean (sd) RAIV was 24.2 (29.2) cm 3 . The mean preoperative estimated glomerular filtration rate (eGFR) and the eGFRs at postoperative day 1, 6 and 36 months after PN were 91.0 and 76.8, 80.2 and 87.7 mL/min/1.73 m 2 , respectively. In multivariable linear regression analysis, the amount of decline in PRF at follow-up was significantly correlated with the RAIV (β 0.261, 0.165, 0.260 at postoperative day 1, 6 and 36 months after PN, respectively). This study has the limitation of its retrospective nature. Preoperatively calculated RAIV significantly correlates with the amount of decline in PRF during long-term follow-up. The RAIV could lead our research to the level of prediction of the amount of PRF decline after PN and thus would be appropriate for assessing the technical advantages of emerging techniques. © 2017 The Authors BJU International © 2017 BJU International Published by John Wiley & Sons Ltd.

  18. VoxelStats: A MATLAB Package for Multi-Modal Voxel-Wise Brain Image Analysis.

    PubMed

    Mathotaarachchi, Sulantha; Wang, Seqian; Shin, Monica; Pascoal, Tharick A; Benedet, Andrea L; Kang, Min Su; Beaudry, Thomas; Fonov, Vladimir S; Gauthier, Serge; Labbe, Aurélie; Rosa-Neto, Pedro

    2016-01-01

    In healthy individuals, behavioral outcomes are highly associated with the variability on brain regional structure or neurochemical phenotypes. Similarly, in the context of neurodegenerative conditions, neuroimaging reveals that cognitive decline is linked to the magnitude of atrophy, neurochemical declines, or concentrations of abnormal protein aggregates across brain regions. However, modeling the effects of multiple regional abnormalities as determinants of cognitive decline at the voxel level remains largely unexplored by multimodal imaging research, given the high computational cost of estimating regression models for every single voxel from various imaging modalities. VoxelStats is a voxel-wise computational framework to overcome these computational limitations and to perform statistical operations on multiple scalar variables and imaging modalities at the voxel level. VoxelStats package has been developed in Matlab(®) and supports imaging formats such as Nifti-1, ANALYZE, and MINC v2. Prebuilt functions in VoxelStats enable the user to perform voxel-wise general and generalized linear models and mixed effect models with multiple volumetric covariates. Importantly, VoxelStats can recognize scalar values or image volumes as response variables and can accommodate volumetric statistical covariates as well as their interaction effects with other variables. Furthermore, this package includes built-in functionality to perform voxel-wise receiver operating characteristic analysis and paired and unpaired group contrast analysis. Validation of VoxelStats was conducted by comparing the linear regression functionality with existing toolboxes such as glim_image and RMINC. The validation results were identical to existing methods and the additional functionality was demonstrated by generating feature case assessments (t-statistics, odds ratio, and true positive rate maps). In summary, VoxelStats expands the current methods for multimodal imaging analysis by allowing the estimation of advanced regional association metrics at the voxel level.

  19. Multiscale characterization and prediction of monsoon rainfall in India using Hilbert-Huang transform and time-dependent intrinsic correlation analysis

    NASA Astrophysics Data System (ADS)

    Adarsh, S.; Reddy, M. Janga

    2017-07-01

    In this paper, the Hilbert-Huang transform (HHT) approach is used for the multiscale characterization of All India Summer Monsoon Rainfall (AISMR) time series and monsoon rainfall time series from five homogeneous regions in India. The study employs the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) for multiscale decomposition of monsoon rainfall in India and uses the Normalized Hilbert Transform and Direct Quadrature (NHT-DQ) scheme for the time-frequency characterization. The cross-correlation analysis between orthogonal modes of All India monthly monsoon rainfall time series and that of five climate indices such as Quasi Biennial Oscillation (QBO), El Niño Southern Oscillation (ENSO), Sunspot Number (SN), Atlantic Multi Decadal Oscillation (AMO), and Equatorial Indian Ocean Oscillation (EQUINOO) in the time domain showed that the links of different climate indices with monsoon rainfall are expressed well only for few low-frequency modes and for the trend component. Furthermore, this paper investigated the hydro-climatic teleconnection of ISMR in multiple time scales using the HHT-based running correlation analysis technique called time-dependent intrinsic correlation (TDIC). The results showed that both the strength and nature of association between different climate indices and ISMR vary with time scale. Stemming from this finding, a methodology employing Multivariate extension of EMD and Stepwise Linear Regression (MEMD-SLR) is proposed for prediction of monsoon rainfall in India. The proposed MEMD-SLR method clearly exhibited superior performance over the IMD operational forecast, M5 Model Tree (MT), and multiple linear regression methods in ISMR predictions and displayed excellent predictive skill during 1989-2012 including the four extreme events that have occurred during this period.

  20. Associations of learning style with cultural values and demographics in nursing students in Iran and Malaysia

    PubMed Central

    Abdollahimohammad, Abdolghani; Ja’afar, Rogayah

    2015-01-01

    Purpose: The goal of the current study was to identify associations between the learning style of nursing students and their cultural values and demographic characteristics. Methods: A non-probability purposive sampling method was used to gather data from two populations. All 156 participants were female, Muslim, and full-time degree students. Data were collected from April to June 2010 using two reliable and validated questionnaires: the Learning Style Scales and the Values Survey Module 2008 (VSM 08). A simple linear regression was run for each predictor before conducting multiple linear regression analysis. The forward selection method was used for variable selection. P-values ≤0.05 and ≤0.1 were considered to indicate significance and marginal significance, respectively. Moreover, multi-group confirmatory factor analysis was performed to determine the invariance of the Farsi and English versions of the VSM 08. Results: The perceptive learning style was found to have a significant negative relationship with the power distance and monumentalism indices of the VSM 08. Moreover, a significant negative association was observed between the solitary learning style and the power distance index. However, no significant association was found between the analytic, competitive, and imaginative learning styles and cultural values (P>0.05). Likewise, no significant associations were observed between learning style, including the perceptive, solitary, analytic, competitive, and imaginative learning styles, and year of study or age (P>0.05). Conclusion: Students who reported low values on the power distance and monumentalism indices are more likely to prefer perceptive and solitary learning styles. Within each group of students in our study sample from the same school the year of study and age did not show any significant associations with learning style. PMID:26268831

Top