Sample records for linear regression yields

  1. A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

    NASA Astrophysics Data System (ADS)

    Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

    2018-04-01

    In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research involved the relationship between 20 topsoil variables, analyzed prior to planting, and paddy yields at standard fertilizer rates. The data came from multi-location rice trials carried out by MARDI in the major paddy granaries of Peninsular Malaysia from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model alone and in combination with the fuzzy c-means method. Tests of normality and multicollinearity indicated that the data are normally distributed, without multicollinearity among the independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. A comparison of the two methods indicates that the hybrid of multiple linear regression and fuzzy c-means outperforms the multiple linear regression model alone, with a lower mean square error.

  2. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
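A minimal illustration of the major-axis idea: the orthogonal-regression line is the first principal axis of the centered point cloud, obtainable from an eigendecomposition. The data below are synthetic, with noise of equal variance in both coordinates (the case where the major axis is a consistent estimator):

```python
import numpy as np

rng = np.random.default_rng(1)

# Points scattered around the line y = 2x, with noise in BOTH coordinates.
t = rng.uniform(0.0, 10.0, 500)
x = t + rng.normal(0.0, 0.5, 500)
y = 2.0 * t + rng.normal(0.0, 0.5, 500)

# Major axis = first principal axis of the centered cloud; it minimizes the
# sum of squared ORTHOGONAL projections of the points onto the fitted line.
X = np.column_stack([x - x.mean(), y - y.mean()])
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
vx, vy = eigvecs[:, np.argmax(eigvals)]     # direction of largest variance
slope_ma = vy / vx
intercept_ma = y.mean() - slope_ma * x.mean()

# Ordinary least squares (vertical distances) for comparison.
slope_ols = np.polyfit(x, y, 1)[0]
```

Because OLS treats x as error-free, its slope is attenuated toward zero here, while the major-axis slope stays near the true value of 2.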

  3. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  4. Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis

    NASA Astrophysics Data System (ADS)

    Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried

    2018-03-01

    This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used were for a period of 36 years, between 1980 and 2015. Similar to the observed decrease (P < 0.001) in rice yield, pan evaporation, solar radiation, and wind speed declined significantly. Eight principal components exhibited an eigenvalue > 1 and explained 83.1% of the total variance of the predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data, while linear regression explained about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of PC1 are closely coupled to those of rice yield during the 1986-1993 and 2006-2013 periods, thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during the monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain-filling stages in the study area. The outcome is expected to provide a more in-depth, region-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations, as well as information that may help optimize planting dates for improved radiation use efficiency in the study area.

  5. Growth and yield in Eucalyptus globulus

    Treesearch

    James A. Rinehart; Richard B. Standiford

    1983-01-01

    A study of the major Eucalyptus globulus stands throughout California conducted by Woodbridge Metcalf in 1924 provides a complete and accurate data set for generating variable site-density yield models. Two models were developed using linear regression techniques. Model I depicts a linear relationship between age and yield best used for stands between five and fifteen...

  6. On the use and misuse of scalar scores of confounders in design and analysis of observational studies.

    PubMed

    Pfeiffer, R M; Riedl, R

    2015-08-15

    We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS), are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for the DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on the DRS and analyzing them using conditional logistic regression yields unbiased estimates of the exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

  7. Comparison Between Linear and Non-parametric Regression Models for Genome-Enabled Prediction in Wheat

    PubMed Central

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-01-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
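As a rough sketch of the "linear on marker effects" baseline, ridge regression (closely related to Bayesian ridge regression with fixed hyperparameters) has a closed-form solution even when markers far outnumber lines. The marker matrix, effect sizes, and penalty below are simulated and illustrative, not the CIMMYT DArT data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated markers: 300 "lines" x 1000 biallelic markers coded 0/1, 20 causal.
n, p = 300, 1000
M = rng.integers(0, 2, size=(n, p)).astype(float)
beta_true = np.zeros(p)
beta_true[:20] = rng.normal(0.0, 0.5, 20)
y = M @ beta_true + rng.normal(0.0, 1.0, n)

# Ridge regression (closed form): beta = (M'M + lam*I)^-1 M'y.
# With p >> n the penalty lam makes the linear system well-posed.
Mc = M - M.mean(axis=0)            # center markers
yc = y - y.mean()
lam = 10.0
beta_hat = np.linalg.solve(Mc.T @ Mc + lam * np.eye(p), Mc.T @ yc)

# In-sample correlation between predicted and observed phenotype.
r = np.corrcoef(Mc @ beta_hat, yc)[0, 1]
```

Proper genomic prediction accuracy would of course be assessed by cross-validation across environments, as the study does; this only shows the shape of the linear specification.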

  8. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    PubMed

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  9. What Is Wrong with ANOVA and Multiple Regression? Analyzing Sentence Reading Times with Hierarchical Linear Models

    ERIC Educational Resources Information Center

    Richter, Tobias

    2006-01-01

    Most reading time studies using naturalistic texts yield data sets characterized by a multilevel structure: Sentences (sentence level) are nested within persons (person level). In contrast to analysis of variance and multiple regression techniques, hierarchical linear models take the multilevel structure of reading time data into account. They…

  10. Estimating the impact of mineral aerosols on crop yields in food insecure regions using statistical crop models

    NASA Astrophysics Data System (ADS)

    Hoffman, A.; Forest, C. E.; Kemanian, A.

    2016-12-01

    A significant number of food-insecure nations exist in regions of the world where dust plays a large role in the climate system. While the impacts of common climate variables (e.g., temperature, precipitation, ozone, and carbon dioxide) on crop yields are relatively well understood, the impact of mineral aerosols on yields has not yet been thoroughly investigated. This research aims to develop the data and tools to advance our understanding of mineral aerosol impacts on crop yields. Suspended dust affects crop yields by altering the amount and type of radiation reaching the plant and by modifying local temperature and precipitation, while dust events (i.e., dust storms) affect crop yields by depleting the soil of nutrients or by defoliation via particle abrasion. The impact of dust on yields is modeled statistically because we are uncertain which impacts will dominate the response on the national and regional scales considered in this study. Multiple linear regression is used in a number of large-scale statistical crop modeling studies to estimate yield responses to various climate variables. In alignment with previous work, we develop linear crop models but build upon this simple method of regression with machine-learning techniques (e.g., random forests) to identify important statistical predictors and isolate how dust affects yields on the scales of interest. To perform this analysis, we develop a crop-climate dataset for maize, soybean, groundnut, sorghum, rice, and wheat for the regions of West Africa, East Africa, South Africa, and the Sahel. Random forest regression models consistently model historic crop yields better than the linear models. In several instances, the random forest models accurately capture the temperature and precipitation threshold behavior in crops. Additionally, improving agricultural technology has caused a well-documented positive trend that dominates time series of global and regional yields. This trend is often removed before regression with traditional crop models, but likely at the cost of removing climate information. Our random forest models consistently discover the positive trend without removing any additional data. The application of random forests as a statistical crop model provides insight into understanding the impact of dust on yields in marginal food-producing regions.

  11. Plateletpheresis efficiency and mathematical correction of software-derived platelet yield prediction: A linear regression and ROC modeling approach.

    PubMed

    Jaime-Pérez, José Carlos; Jiménez-Castillo, Raúl Alberto; Vázquez-Hernández, Karina Elizabeth; Salazar-Riojas, Rosario; Méndez-Ramírez, Nereida; Gómez-Almaguer, David

    2017-10-01

    Advances in automated cell separators have improved the efficiency of plateletpheresis and the possibility of obtaining double products (DP). We assessed cell processor accuracy of predicted platelet (PLT) yields with the goal of better predicting DP collections. This retrospective proof-of-concept study included 302 plateletpheresis procedures performed on a Trima Accel v6.0 at the apheresis unit of a hematology department. Donor variables, software-predicted yield, and actual PLT yield were statistically evaluated. The software prediction was optimized by linear regression analysis, and its optimal cut-off for obtaining a DP was assessed by receiver operating characteristic (ROC) curve modeling. Three hundred and two plateletpheresis procedures were performed; on 271 (89.7%) occasions the donors were men, and on 31 (10.3%) women. Pre-donation PLT count had the best direct correlation with actual PLT yield (r = 0.486, P < .001). Means of software-derived values differed significantly from actual PLT yield, 4.72 × 10¹¹ vs. 6.12 × 10¹¹, respectively (P < .001). The following equation was developed to adjust these values: actual PLT yield = 0.221 + (1.254 × theoretical platelet yield). The ROC curve model showed an optimal apheresis device software prediction cut-off of 4.65 × 10¹¹ to obtain a DP, with a sensitivity of 82.2%, a specificity of 93.3%, and an area under the curve (AUC) of 0.909. Trima Accel v6.0 software consistently underestimated PLT yields. A simple correction derived from linear regression analysis accurately corrected this underestimation, and ROC analysis identified a precise cut-off to reliably predict a DP. © 2016 Wiley Periodicals, Inc.
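The two analysis steps in this abstract, a linear correction of the software's predicted yield and a ROC-style cut-off search, are easy to reproduce on simulated data. The numbers below are generated to mimic the published correction equation; the 6.0 × 10¹¹ double-product definition and the use of Youden's J are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated device data: software underestimates the true PLT yield roughly
# as in the paper's correction, actual = 0.221 + 1.254 * predicted.
predicted = rng.uniform(3.0, 7.0, 302)                    # software yield, x10^11
actual = 0.221 + 1.254 * predicted + rng.normal(0.0, 0.3, 302)

# Recover the correction by simple linear regression (least squares).
slope, intercept = np.polyfit(predicted, actual, 1)

# ROC-style cut-off: pick the predicted-yield threshold that best separates
# double products (here defined as actual >= 6.0 x10^11) by Youden's J.
is_dp = actual >= 6.0
best_j, best_t = -1.0, None
for t in np.linspace(3.0, 7.0, 401):
    call_dp = predicted >= t
    tp = np.sum(call_dp & is_dp)
    fn = np.sum(~call_dp & is_dp)
    fp = np.sum(call_dp & ~is_dp)
    tn = np.sum(~call_dp & ~is_dp)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    if sens + spec - 1.0 > best_j:
        best_j, best_t = sens + spec - 1.0, t
```

The recovered slope and intercept land near the simulated 1.254 and 0.221, and the optimal threshold falls near the predicted yield at which the corrected value crosses the DP definition.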

  12. Use of AMMI and linear regression models to analyze genotype-environment interaction in durum wheat.

    PubMed

    Nachit, M M; Nachit, G; Ketata, H; Gauch, H G; Zobel, R W

    1992-03-01

    The joint durum wheat (Triticum turgidum L. var. 'durum') breeding program of the International Maize and Wheat Improvement Center (CIMMYT) and the International Center for Agricultural Research in the Dry Areas (ICARDA) for the Mediterranean region employs extensive multilocation testing. Multilocation testing produces significant genotype-environment (GE) interaction that reduces the accuracy of estimating yield and selecting appropriate germplasm. The sum of squares (SS) of the GE interaction was partitioned by linear regression techniques into joint, genotypic, and environmental regressions, and by the Additive Main effects and Multiplicative Interaction (AMMI) model into five significant Interaction Principal Component Axes (IPCA). The AMMI model was more effective in partitioning the interaction SS than the linear regression technique: the SS captured by the AMMI model was 6 times higher than the SS for all three regressions combined. Postdictive assessment recommended the use of the first five IPCA axes, while predictive assessment recommended AMMI1 (main effects plus IPCA1). After elimination of random variation, AMMI1 estimates of genotypic yields within sites were more precise than unadjusted means. This increased precision was equivalent to increasing the number of replications by a factor of 3.7.
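At its core, AMMI fits the additive main effects and then applies a singular value decomposition to the doubly-centered genotype-by-environment table; the leading singular vectors are the IPCA axes. A toy version on simulated yields (table sizes and effect magnitudes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

g, e = 20, 8                         # genotypes x environments, illustrative sizes
mu = 5.0
gen = rng.normal(0.0, 1.0, g)        # genotype main effects
env = rng.normal(0.0, 1.0, e)        # environment main effects
u = rng.normal(0.0, 1.0, g)
v = rng.normal(0.0, 1.0, e)
# Yield table with one strong multiplicative GE interaction term plus noise.
Y = mu + gen[:, None] + env[None, :] + 1.5 * np.outer(u, v) + rng.normal(0.0, 0.1, (g, e))

# AMMI step 1: remove additive main effects by double-centering the table.
GE = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0, keepdims=True) + Y.mean()

# AMMI step 2: SVD of the residual interaction; singular vectors = IPCA axes.
U, s, Vt = np.linalg.svd(GE, full_matrices=False)

# Share of the interaction SS captured by IPCA1.
share_ipca1 = s[0] ** 2 / np.sum(s ** 2)
```

With one dominant multiplicative term, IPCA1 absorbs most of the interaction sum of squares, which is the situation in which AMMI1 (main effects plus IPCA1) predicts well.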

  13. Spatial Assessment of Model Errors from Four Regression Techniques

    Treesearch

    Lianjun Zhang; Jeffrey H. Gove

    2005-01-01

    Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...

  14. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
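A compact worked example of the multiple-regression machinery the article describes: two predictors plus an interaction term, solved by least squares, with classical confidence intervals for the coefficients. All data are simulated:

```python
import numpy as np

rng = np.random.default_rng(5)

# Two predictors plus their interaction, as in multivariate clinical models.
n = 200
x1 = rng.normal(0.0, 1.0, n)
x2 = rng.normal(0.0, 1.0, n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + 0.5 * x1 * x2 + rng.normal(0.0, 0.2, n)

# Design matrix with intercept and interaction column; the least-squares
# solution is unique whenever X has full column rank.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Approximate 95% confidence intervals from the classical OLS formula.
resid = y - X @ coef
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
ci = np.column_stack([coef - 1.96 * se, coef + 1.96 * se])
```

The coefficient estimates recover the simulated values (1, 2, -3, 0.5), and the interval widths shrink with the residual variance, illustrating the "unique solutions and exact confidence intervals" the abstract refers to.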

  15. A comparison of two adaptive multivariate analysis methods (PLSR and ANN) for winter wheat yield forecasting using Landsat-8 OLI images

    NASA Astrophysics Data System (ADS)

    Chen, Pengfei; Jing, Qi

    2017-02-01

    An assumption that a non-linear method is more reasonable than a linear method when canopy reflectance is used to establish a yield prediction model was proposed and tested in this study. For this purpose, partial least squares regression (PLSR) and artificial neural networks (ANN), representing linear and non-linear analysis methods respectively, were applied and compared for wheat yield prediction. Multi-period Landsat-8 OLI images were collected at two different wheat growth stages, and a field campaign was conducted to obtain grain yields at selected sampling sites in 2014. The field data were divided into a calibration database and a testing database. Using the calibration data, a cross-validation concept was introduced for the PLSR and ANN model construction to prevent over-fitting. All models were tested using the test data. The ANN yield-prediction model produced R², RMSE, and RMSE% values of 0.61, 979 kg ha⁻¹, and 10.38%, respectively, in the testing phase, performing better than the PLSR yield-prediction model, which produced R², RMSE, and RMSE% values of 0.39, 1211 kg ha⁻¹, and 12.84%, respectively. The non-linear method was therefore suggested as the better method for yield prediction.

  16. Phytotoxicity and accumulation of chromium in carrot plants and the derivation of soil thresholds for Chinese soils.

    PubMed

    Ding, Changfeng; Li, Xiaogang; Zhang, Taolin; Ma, Yibing; Wang, Xingxiang

    2014-10-01

    Soil environmental quality standards for heavy metals in farmland should be established considering both their effects on crop yield and their accumulation in the edible part. A greenhouse experiment was conducted to investigate the effects of chromium (Cr) on biomass production and Cr accumulation in carrot plants grown in a wide range of soils. The results revealed that carrot yield significantly decreased in 18 of the 20 soils when Cr was added at the soil environmental quality standard of China. The Cr content of carrots grown in the five soils with pH > 8.0 exceeded the maximum allowable level (0.5 mg kg⁻¹) according to the Chinese General Standard for Contaminants in Foods. The relationship between carrot Cr concentration and soil pH was well fitted (R² = 0.70, P < 0.0001) by a linear-linear segmented regression model. The addition of Cr to soil thus influenced carrot yield before it affected food quality. The major soil factors controlling Cr phytotoxicity, and prediction models for it, were further identified and developed using path analysis and stepwise multiple linear regression analysis. Soil Cr thresholds for phytotoxicity that also ensure food safety were then derived on the condition of a 10 percent yield reduction. Copyright © 2014 Elsevier Inc. All rights reserved.
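The linear-linear segmented regression used for the pH relationship can be fitted by profiling the breakpoint: for each candidate breakpoint, fit a continuous broken-stick model by least squares and keep the fit with the smallest error. The pH-Cr data below are simulated with an assumed break at pH 8, not the paper's measurements:

```python
import numpy as np

rng = np.random.default_rng(6)

# Simulated carrot Cr concentration vs soil pH with a slope change at pH 8.
ph = rng.uniform(5.0, 9.5, 150)
cr = np.where(ph < 8.0, 0.05 * ph, 0.05 * 8.0 + 0.6 * (ph - 8.0)) + rng.normal(0.0, 0.03, 150)

# Linear-linear segmented regression: for each candidate breakpoint bp, fit a
# continuous broken line y = a + b*x + c*max(x - bp, 0) and keep the best SSE.
def fit_segmented(x, y, candidates):
    best = (np.inf, None, None)
    for bp in candidates:
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - bp, 0.0)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = np.sum((y - X @ coef) ** 2)
        if sse < best[0]:
            best = (sse, bp, coef)
    return best

sse, bp_hat, coef = fit_segmented(ph, cr, np.linspace(6.0, 9.0, 121))
```

Here `coef[1]` is the slope below the breakpoint and `coef[1] + coef[2]` the slope above it; the profiled breakpoint lands near the simulated pH 8.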

  17. Ranking contributing areas of salt and selenium in the Lower Gunnison River Basin, Colorado, using multiple linear regression models

    USGS Publications Warehouse

    Linard, Joshua I.

    2013-01-01

    Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated that the models could be improved by including more detailed geospatial data and by using a more rigorous development method. Evaluating all possible combinations of geospatial variables produced four multiple linear regression models that could estimate irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.

  18. Food Crops Response to Climate Change

    NASA Astrophysics Data System (ADS)

    Butler, E.; Huybers, P.

    2009-12-01

    Projections of future climate show a warming world and heterogeneous changes in precipitation. Generally, warming temperatures indicate a decrease in crop yields where crops are currently grown. However, a warmer climate will also open up new areas at high latitudes for crop production. Thus, there is a question whether the warmer climate, with decreased yields but potentially increased growing area, will produce a net increase or decrease in overall food crop production. We explore this question through a multiple linear regression model linking temperature and precipitation to crop yield over the United States. Prior studies have emphasized temporal regressions, which indicate uniformly decreased yields but neglect the potentially increased area opened up for crop production. This study complements the prior work by exploring this spatial variation. The United States was chosen as the training region for the model because good crop data are available over the same time frame as climate data, and presumably the yield from crops in the United States is optimized with respect to potential yield. We study corn, soybeans, sorghum, hard red winter wheat, and soft red winter wheat using monthly averages of temperature and precipitation from NCEP reanalysis and yearly yield data from the National Agriculture Statistics Service for 1948-2008. The use of monthly averaged temperature and precipitation, which neglects extreme events that can have a significant impact on crops, limits this study, as does the exclusive use of United States agricultural data. The GFDL 2.1 model under a 720 ppm CO2 scenario provides temperature and precipitation fields for 2040-2100, which are used to explore how the spatial regions available for crop production will change under these new conditions.

  19. Modeling maximum daily temperature using a varying coefficient regression model

    Treesearch

    Han Li; Xinwei Deng; Dong-Yum Kim; Eric P. Smith

    2014-01-01

    Relationships between stream water and air temperatures are often modeled using linear or nonlinear regression methods. Despite a strong relationship between water and air temperatures and a variety of models that are effective for data summarized on a weekly basis, such models did not yield consistently good predictions for summaries such as daily maximum temperature...

  20. Evaluation of the CEAS model for barley yields in North Dakota and Minnesota

    NASA Technical Reports Server (NTRS)

    Barnett, T. L. (Principal Investigator)

    1981-01-01

    The CEAS yield model is based upon multiple regression analysis at the CRD and state levels. For the historical time series, yield is regressed on a set of variables derived from monthly mean temperature and monthly precipitation. Technological trend is represented by piecewise linear and/or quadratic functions of year. Indicators of yield reliability obtained from a ten-year bootstrap test (1970-79) demonstrated that biases are small and that performance, as indicated by the root mean square errors, is acceptable for the intended application; however, model response for individual years, particularly unusual years, is not very reliable and shows some large errors. The model is objective, adequate, timely, simple, and not costly. It considers scientific knowledge on a broad scale but not in detail, and does not provide a good current measure of modeled yield reliability.

  1. [Regional scale remote sensing-based yield estimation of winter wheat by using MODIS-NDVI data: a case study of Jining City in Shandong Province].

    PubMed

    Ren, Jianqiang; Chen, Zhongxin; Tang, Huajun

    2006-12-01

    Taking Jining City of Shandong Province, one of the most important winter wheat production regions in the Huanghuaihai Plain, as an example, the winter wheat yield was estimated by using the 250 m MODIS-NDVI data smoothed by a Savitzky-Golay filter. The NDVI values between 0.20 and 0.80 were selected, and the sum of NDVI values for each county was calculated to build its relation with winter wheat yield. By using the stepwise regression method, the linear regression model between NDVI and winter wheat yield was established, with the precision validated by the ground survey data. The results showed that the relative error of predicted yield was between -3.6% and 3.9%, suggesting that the method was relatively accurate and feasible.

  2. Random regression models using Legendre polynomials or linear splines for test-day milk yield of dairy Gyr (Bos indicus) cattle.

    PubMed

    Pereira, R J; Bignardi, A B; El Faro, L; Verneque, R S; Vercesi Filho, A E; Albuquerque, L G

    2013-01-01

    Studies investigating the use of random regression models for genetic evaluation of milk production in Zebu cattle are scarce. In this study, 59,744 test-day milk yield records from 7,810 first lactations of purebred dairy Gyr (Bos indicus) and crossbred (dairy Gyr × Holstein) cows were used to compare random regression models in which additive genetic and permanent environmental effects were modeled using orthogonal Legendre polynomials or linear spline functions. Residual variances were modeled considering 1, 5, or 10 classes of days in milk. Five classes fitted the changes in residual variances over the lactation adequately and were used for model comparison. The model that fitted linear spline functions with 6 knots provided the lowest sum of residual variances across lactation. On the other hand, according to the deviance information criterion (DIC) and the Bayesian information criterion (BIC), a model using third-order and fourth-order Legendre polynomials for additive genetic and permanent environmental effects, respectively, provided the best fit. However, the high rank correlation (0.998) between this model and that applying third-order Legendre polynomials for both additive genetic and permanent environmental effects indicates that, in practice, the same bulls would be selected by both models. The latter model, which is less parameterized, is a parsimonious option for fitting dairy Gyr test-day milk yield records. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
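The two curve bases compared in this abstract are easy to construct: a standardized days-in-milk scale, a Legendre polynomial basis via NumPy, and a linear spline basis with a handful of knots. The lactation curve below is simulated, and this sketch fits fixed curves only, not the paper's random-regression genetic effects:

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(7)

# Simulated test-day milk yields over a 305-day lactation (rise then decline).
dim = rng.uniform(5.0, 305.0, 400)                       # days in milk
y = 20.0 + 8.0 * np.exp(-((dim - 60.0) / 90.0) ** 2) - 0.02 * dim + rng.normal(0.0, 0.8, 400)

# Standardize days in milk to [-1, 1] and build a third-order Legendre basis.
t = 2.0 * (dim - dim.min()) / (dim.max() - dim.min()) - 1.0
X = legendre.legvander(t, 3)                             # columns P0..P3
coef_l, *_ = np.linalg.lstsq(X, y, rcond=None)
r2_leg = 1.0 - np.sum((y - X @ coef_l) ** 2) / np.sum((y - y.mean()) ** 2)

# Linear spline with 6 knots over the same standardized scale, for comparison.
knots = np.linspace(-1.0, 1.0, 6)
S = np.column_stack([np.ones_like(t), t] + [np.maximum(t - k, 0.0) for k in knots[1:-1]])
coef_s, *_ = np.linalg.lstsq(S, y, rcond=None)
r2_spl = 1.0 - np.sum((y - S @ coef_s) ** 2) / np.sum((y - y.mean()) ** 2)
```

Both bases track a smooth lactation curve well; in the random regression setting the choice matters mainly through the number of parameters and the covariance structure they induce, which is what the DIC/BIC comparison in the abstract weighs.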

  3. A comparison of radiometric correction techniques in the evaluation of the relationship between LST and NDVI in Landsat imagery.

    PubMed

    Tan, Kok Chooi; Lim, Hwee San; Matjafri, Mohd Zubir; Abdullah, Khiruddin

    2012-06-01

    Atmospheric corrections for multi-temporal optical satellite images are necessary, especially in change detection analyses such as normalized difference vegetation index (NDVI) rationing. Abrupt change detection analysis using remote-sensing techniques requires radiometric congruity and atmospheric correction to monitor terrestrial surfaces over time. Two atmospheric correction methods were used for this study: relative radiometric normalization and the simplified method for atmospheric correction (SMAC) in the solar spectrum. A multi-temporal data set consisting of two sets of Landsat images of Penang Island, Malaysia, from the period between 1991 and 2002, was used to compare NDVI maps generated using the proposed atmospheric correction methods. Land surface temperature (LST) was retrieved using ATCOR3_T in PCI Geomatica 10.1 image processing software. Linear regression analysis was used to analyze the relationship between NDVI and LST. This study reveals that both of the proposed atmospheric correction methods yielded high accuracy, as shown by the linear correlation coefficients. To check the accuracy of the equation obtained through linear regression analysis for each satellite image, 20 points were randomly chosen. The results showed that the SMAC method yielded a constant error in predicting the NDVI value from the equation derived by linear regression analysis. The average errors from both proposed atmospheric correction methods were less than 10%.

  4. Simple agrometeorological models for estimating Guineagrass yield in Southeast Brazil.

    PubMed

    Pezzopane, José Ricardo Macedo; da Cruz, Pedro Gomes; Santos, Patricia Menezes; Bosi, Cristiam; de Araujo, Leandro Coelho

    2014-09-01

    The objective of this work was to develop and evaluate agrometeorological models to simulate the production of Guineagrass. For this purpose, we used forage yield from 54 growing periods between December 2004-January 2007 and April 2010-March 2012 in irrigated and non-irrigated pastures in São Carlos, São Paulo state, Brazil (latitude 21°57'42″ S, longitude 47°50'28″ W, altitude 860 m). Initially, we performed linear regressions between the agrometeorological variables and the average dry matter accumulation rate under irrigated conditions. We then determined the effect of soil water availability on relative forage yield for irrigated and non-irrigated pastures by means of segmented linear regression between water balance and relative production variables (dry matter accumulation rates with and without irrigation). The models generated were evaluated with independent data from 21 growing periods without irrigation at the same location: eight growing periods in 2000 and 13 growing periods between December 2004-January 2007 and April 2010-March 2012. The results show the satisfactory predictive capacity of the agrometeorological models under irrigated conditions based on univariate regression (mean temperature, minimum temperature, potential evapotranspiration, or degree-days) or multivariate regression. The response of irrigation on production was well correlated with the climatological water balance variables (ratio between actual and potential evapotranspiration or between actual and maximum soil water storage). The models that performed best for estimating Guineagrass yield without irrigation were based on minimum temperature corrected by relative soil water storage, determined as the ratio between the actual soil water storage and the soil water holding capacity.

  5. Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

    NASA Technical Reports Server (NTRS)

    McKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

    2005-01-01

    Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of the regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield the best performance and avoid model discontinuity over day/night data boundaries.

  6. SU-F-T-130: [18F]-FDG Uptake Dose Response in Lung Correlates Linearly with Proton Therapy Dose

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, D; Titt, U; Mirkovic, D

    2016-06-15

    Purpose: Analysis of clinical outcomes in lung cancer patients treated with protons using 18F-FDG uptake in lung as a measure of dose response. Methods: A test case lung cancer patient was selected in an unbiased way. The test patient’s treatment planning and post treatment positron emission tomography (PET) were collected from the picture archiving and communication system at the UT M.D. Anderson Cancer Center. Average computerized tomography scan was registered with post PET/CT through both rigid and deformable registrations for a selected region of interest (ROI) via VelocityAI imaging informatics software. For the voxels in the ROI, a system that extracts the Standard Uptake Value (SUV) from PET was developed, and the corresponding relative biological effectiveness (RBE) weighted (both variable and constant) dose was computed using the Monte Carlo (MC) methods. The treatment planning system (TPS) dose was also obtained. Using histogram analysis, the voxel average normalized SUV vs. 3 different doses was obtained and a linear regression fit was performed. Results: From the registration process, there were some regions that showed significant artifacts near the diaphragm and heart region, which yielded poor r-squared values when the linear regression fit was performed on normalized SUV vs. dose. Excluding these values, TPS fit yielded a mean r-squared value of 0.79 (range 0.61–0.95), constant RBE fit yielded 0.79 (range 0.52–0.94), and variable RBE fit yielded 0.80 (range 0.52–0.94). Conclusion: A system that extracts SUV from PET to correlate between normalized SUV and various dose calculations was developed. A linear relation between normalized SUV and all three different doses was found.

  7. Post-processing through linear regression

    NASA Astrophysics Data System (ADS)

    van Schaeybroeck, B.; Vannitsem, S.

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast, and multicollinearity. The regression schemes under consideration include the ordinary least-squares (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-squares method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial-condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times, the regression schemes (EVMOS, TDTR), which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
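
The simplest scheme in this family, ordinary least-squares post-processing, can be sketched as follows. The forecast/observation pairs below are synthetic (not the Lorenz '63 experiment from the abstract): regress observations on raw forecasts over a training period, then apply the fitted line to correct later forecasts.

```python
import numpy as np

# OLS post-processing sketch on synthetic data: the "model" forecast has a
# systematic amplitude bias and offset that the regression learns to remove.
rng = np.random.default_rng(0)
truth = rng.normal(10.0, 3.0, size=500)                    # observations
forecast = 1.4 * truth - 2.0 + rng.normal(0.0, 0.5, 500)   # biased forecasts

# Fit observation = a * forecast + b on the first 400 pairs (training period).
a, b = np.polyfit(forecast[:400], truth[:400], deg=1)

# Compare raw and corrected RMSE on the held-out 100 pairs.
raw_rmse = np.sqrt(np.mean((forecast[400:] - truth[400:]) ** 2))
cor_rmse = np.sqrt(np.mean((a * forecast[400:] + b - truth[400:]) ** 2))
print(round(raw_rmse, 2), round(cor_rmse, 2))
```

This is the OLS baseline the abstract compares against; the variability-preserving schemes (EVMOS, TDTR) modify the fitting criterion rather than this overall workflow.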

  8. Transfer Student Success: Educationally Purposeful Activities Predictive of Undergraduate GPA

    ERIC Educational Resources Information Center

    Fauria, Renee M.; Fuller, Matthew B.

    2015-01-01

    Researchers evaluated the effects of Educationally Purposeful Activities (EPAs) on transfer and nontransfer students' cumulative GPAs. Hierarchical, linear, and multiple regression models yielded seven statistically significant educationally purposeful items that influenced undergraduate student GPAs. Statistically significant positive EPAs for…

  9. Optimized multiple linear mappings for single image super-resolution

    NASA Astrophysics Data System (ADS)

    Zhang, Kaibing; Li, Jie; Xiong, Zenggang; Liu, Xiuping; Gao, Xinbo

    2017-12-01

    Learning piecewise linear regression has been recognized as an effective way for example learning-based single image super-resolution (SR) in the literature. In this paper, we employ an expectation-maximization (EM) algorithm to further improve the SR performance of our previous multiple linear mappings (MLM) based SR method. In the training stage, the proposed method starts with a set of linear regressors obtained by the MLM-based method, and then jointly optimizes the clustering results and the low- and high-resolution subdictionary pairs for regression functions by using the metric of the reconstruction errors. In the test stage, we select the optimal regressor for SR reconstruction by accumulating the reconstruction errors of m-nearest neighbors in the training set. Thorough experiments carried out on six publicly available datasets demonstrate that the proposed SR method can yield high-quality images with finer details and sharper edges in terms of both quantitative and perceptual image quality assessments.

  10. Adaptive local linear regression with application to printer color management.

    PubMed

    Gupta, Maya R; Garcia, Eric K; Chin, Erika

    2008-06-01

    Local learning methods, such as local linear regression and nearest neighbor classifiers, base estimates on nearby training samples, or neighbors. Usually, the number of neighbors used in estimation is fixed to be a global "optimal" value, chosen by cross validation. This paper proposes adapting the number of neighbors used for estimation to the local geometry of the data, without the need for cross validation. The term enclosing neighborhood is introduced to describe a set of neighbors whose convex hull contains the test point when possible. It is proven that enclosing neighborhoods yield bounded estimation variance under some assumptions. Three such enclosing neighborhood definitions are presented: natural neighbors, natural neighbors inclusive, and enclosing k-NN. The effectiveness of these neighborhood definitions with local linear regression is tested for estimating lookup tables for color management. Significant improvements in error metrics are shown, indicating that enclosing neighborhoods may be a promising adaptive neighborhood definition for other local learning tasks as well, depending on the density of training samples.
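
For contrast with the adaptive neighborhoods proposed above, the fixed-k baseline can be sketched in a few lines; the data, the choice k = 15, and the helper name are illustrative assumptions, not from the paper.

```python
import numpy as np

def local_linear_predict(x_train, y_train, x0, k=15):
    """Fixed-k local linear regression for 1-D data (illustrative baseline;
    the paper's enclosing neighborhoods adapt the neighbor set instead)."""
    # Select the k training samples nearest to the query point x0.
    idx = np.argsort(np.abs(x_train - x0))[:k]
    # Fit a line to those neighbors only, then evaluate it at x0.
    a, b = np.polyfit(x_train[idx], y_train[idx], deg=1)
    return a * x0 + b

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y = np.sin(x) + rng.normal(0.0, 0.05, 200)   # noisy nonlinear relationship
pred = local_linear_predict(x, y, 3.0)       # local estimate near sin(3.0)
```

Cross validation would normally be used to pick k here; replacing `idx` with an enclosing neighborhood is what removes that step in the paper's approach.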

  11. A Systematic Review and Meta-Regression Analysis of Lung Cancer Risk and Inorganic Arsenic in Drinking Water.

    PubMed

    Lamm, Steven H; Ferdosi, Hamid; Dissen, Elisabeth K; Li, Ji; Ahn, Jaeil

    2015-12-07

    High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1-1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100-150 µg/L arsenic.

  12. A Systematic Review and Meta-Regression Analysis of Lung Cancer Risk and Inorganic Arsenic in Drinking Water

    PubMed Central

    Lamm, Steven H.; Ferdosi, Hamid; Dissen, Elisabeth K.; Li, Ji; Ahn, Jaeil

    2015-01-01

    High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1–1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100–150 µg/L arsenic. PMID:26690190

  13. Modelling Fourier regression for time series data - a case study: modelling inflation in foods sector in Indonesia

    NASA Astrophysics Data System (ADS)

    Prahutama, Alan; Suparti; Wahyu Utami, Tiani

    2018-03-01

    Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression is very strict in its assumptions, whereas nonparametric regression requires no model assumptions. Time series data are observations of a variable recorded over time, so if time series data are to be modeled by regression, the response and predictor variables must be determined first. The response variable is the observation at time t (yt), while the predictor variables are the significant lags. In nonparametric regression modeling, one developing approach is the Fourier series approach. One advantage of nonparametric regression using a Fourier series is its ability to handle data with a periodic (trigonometric) pattern. Modeling with a Fourier series requires the parameter K, which can be chosen by the Generalized Cross Validation method. In inflation modeling for the transportation sector, communication and financial services, the Fourier series yields an optimal K of 120 parameters with an R-square of 99%, whereas multiple linear regression yields an R-square of 90%.
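
A truncated Fourier series regression of this kind can be fit by ordinary least squares on sine/cosine basis columns. The sketch below uses a synthetic periodic series and a small fixed K rather than the inflation data and the GCV-selected K = 120.

```python
import numpy as np

# Fit y(t) with an intercept plus K sine/cosine harmonics by least squares.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 4.0 * np.pi, 200)
y = 2.0 + np.sin(t) + 0.5 * np.cos(3.0 * t) + rng.normal(0.0, 0.1, 200)

K = 4  # number of harmonics; in practice chosen by Generalized Cross Validation
X = np.column_stack(
    [np.ones_like(t)]
    + [f(k * t) for k in range(1, K + 1) for f in (np.sin, np.cos)]
)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - np.mean(y)) ** 2)
```

Because the true signal here is itself a short Fourier series, a small K already gives a high R-square; on real inflation data, K trades off fit against overfitting, which is why GCV is used to select it.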

  14. Predicting Reactive Intermediate Quantum Yields from Dissolved Organic Matter Photolysis Using Optical Properties and Antioxidant Capacity.

    PubMed

    Mckay, Garrett; Huang, Wenxi; Romera-Castillo, Cristina; Crouch, Jenna E; Rosario-Ortiz, Fernando L; Jaffé, Rudolf

    2017-05-16

    The antioxidant capacity and formation of photochemically produced reactive intermediates (RI) were studied for water samples collected from the Florida Everglades with different spatial (marsh versus estuarine) and temporal (wet versus dry season) characteristics. Measured RI included triplet excited states of dissolved organic matter (³DOM*), singlet oxygen (¹O₂), and the hydroxyl radical (•OH). Single and multiple linear regression modeling were performed using a broad range of extrinsic (to predict RI formation rates, R_RI) and intrinsic (to predict RI quantum yields, Φ_RI) parameters. Multiple linear regression models consistently led to better predictions of R_RI and Φ_RI for our data set but poor prediction of Φ_RI for a previously published data set [1], probably because the predictors are intercorrelated (Pearson's r > 0.5). Single linear regression models were built with data compiled from previously published studies (n ≈ 120) in which E2:E3, S, and Φ_RI values were measured, which revealed a high degree of similarity between RI-optical property relationships across DOM samples of diverse sources. This study reveals that •OH formation is, in general, decoupled from ³DOM* and ¹O₂ formation, providing supporting evidence that ³DOM* is not a •OH precursor. Finally, Φ_RI for ¹O₂ and ³DOM* correlated negatively with antioxidant activity (a surrogate for electron donating capacity) for the collected samples, which is consistent with intramolecular oxidation of DOM moieties by ³DOM*.

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carpenter, Daniel; Westover, Tyler; Howe, Daniel

    We report here on an experimental study to produce refinery-ready fuel blendstocks via catalytic hydrodeoxygenation (upgrading) of pyrolysis oil using several biomass feedstocks and various blends. Blends were tested along with the pure materials to determine the effect of blending on product yields and qualities. Within experimental error, oil yields from fast pyrolysis and upgrading are shown to be linear functions of the blend components. Switchgrass exhibited lower fast pyrolysis and upgrading yields than the woody samples, which included clean pine, oriented strand board (OSB), and a mix of pinon and juniper (PJ). The notable exception was PJ, for which the poor upgrading yield of 18% was likely associated with the very high viscosity of the PJ fast pyrolysis oil (947 cP). The highest fast pyrolysis yield (54% dry basis) was obtained from clean pine, while the highest upgrading yield (50%) was obtained from a blend of 80% clean pine and 20% OSB (CP8OSB2). For switchgrass, reducing the fast pyrolysis temperature to 450 degrees C resulted in a significant increase in the pyrolysis oil yield and reduced hydrogen consumption during hydrotreating, but did not directly affect the hydrotreating oil yield. The water content of fast pyrolysis oils was also observed to increase linearly with the summed content of potassium and sodium, ranging from 21% for clean pine to 37% for switchgrass. Multiple linear regression models demonstrate that fast pyrolysis yield is strongly dependent upon the contents of lignin and volatile matter as well as the sum of potassium and sodium.

  16. Approximate Probabilistic Methods for Survivability/Vulnerability Analysis of Strategic Structures.

    DTIC Science & Technology

    1978-07-15

    weapon yield, in kilotons; K = energy coupling factor; C = coefficient determined from linear regression; a, b = exponents determined from linear... In the case of the applied pressure, according to Perret and Bass (1975), the variabilities in the exponents a and b of Eq. 32...

  17. Use of multivariate linear regression and support vector regression to predict functional outcome after surgery for cervical spondylotic myelopathy.

    PubMed

    Hoffman, Haydn; Lee, Sunghoon I; Garst, Jordan H; Lu, Derek S; Li, Charles H; Nagasawa, Daniel T; Ghalehsari, Nima; Jahanforouz, Nima; Razaghy, Mehrdad; Espinal, Marie; Ghavamrezaii, Amir; Paak, Brian H; Wu, Irene; Sarrafzadeh, Majid; Lu, Daniel C

    2015-09-01

    This study introduces the use of multivariate linear regression (MLR) and support vector regression (SVR) models to predict postoperative outcomes in a cohort of patients who underwent surgery for cervical spondylotic myelopathy (CSM). Currently, predicting outcomes after surgery for CSM remains a challenge. We recruited patients who had a diagnosis of CSM and required decompressive surgery with or without fusion. Fine motor function was tested preoperatively and postoperatively with a handgrip-based tracking device that has been previously validated, yielding mean absolute accuracy (MAA) results for two tracking tasks (sinusoidal and step). All patients completed Oswestry disability index (ODI) and modified Japanese Orthopaedic Association questionnaires preoperatively and postoperatively. Preoperative data was utilized in MLR and SVR models to predict postoperative ODI. Predictions were compared to the actual ODI scores with the coefficient of determination (R²) and mean absolute difference (MAD). From this, 20 patients met the inclusion criteria and completed follow-up at least 3 months after surgery. With the MLR model, a combination of the preoperative ODI score, preoperative MAA (step function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.452; MAD = 0.0887; p = 1.17 × 10⁻³). With the SVR model, a combination of preoperative ODI score, preoperative MAA (sinusoidal function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.932; MAD = 0.0283; p = 5.73 × 10⁻¹²). The SVR model was more accurate than the MLR model. The SVR can be used preoperatively in risk/benefit analysis and the decision to operate. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. Ozone and sulfur dioxide effects on three tall fescue cultivars

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Flagler, R.B.; Youngner, V.B.

    Although many reports have been published concerning differential susceptibility of various crops and/or cultivars to air pollutants, most have used foliar injury instead of the marketable yield as the factor that determined susceptibility for the crop. In an examination of screening in terms of marketable yield, three cultivars of tall fescue (Festuca arundinacea Schreb.), 'Alta,' 'Fawn,' and 'Kentucky 31,' were exposed to 0-0.40 ppm O₃ or 0-0.50 ppm SO₂ for 6 h/d, once a week, for 7 and 9 weeks, respectively. Experimental design was a randomized complete block with three replications. Statistical analysis was by standard analysis of variance and regression techniques. Three variables were analyzed: top dry weight (yield), tiller number, and weight per tiller. Ozone had a significant effect on all three variables. Significant linear decreases in yield and weight per tiller occurred with increasing O₃ concentrations. Linear regressions of these variables on O₃ concentration produced significantly different regression coefficients. The coefficient for Kentucky 31 was significantly greater than for Alta or Fawn, which did not differ from each other. This indicated that Kentucky 31 was more susceptible to O₃ than either of the other cultivars. Percent reductions in dry weight for the three cultivars at the highest O₃ level were 35, 44, and 53%, respectively, for Fawn, Alta, and Kentucky 31. For weight per tiller, Kentucky 31 had a higher percent reduction than the other cultivars (59 vs. 46 and 44%). Tiller number was generally increased by O₃, but this variable was not useful for determining differential susceptibility to the pollutant. Sulfur dioxide treatments produced no significant effects on any of the variables analyzed.

  19. LAS bioconcentration is isomer specific

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tolls, J.; Haller, M.; Graaf, I. de

    1995-12-31

    The authors measured parent compound specific bioconcentration data for linear alkylbenzene sulfonates in Pimephales promelas. They did so by using cold, custom synthesized sulfophenyl alkanes. They observed that, within homologous series of isomers, the uptake rate constants (k₁) and the bioconcentration factor (BCF) increase with increasing number of carbon atoms in the alkyl chain (n(C-atoms)). In contrast, the elimination rate constant k₂ appears to be independent of the alkyl chain length. Regressions of log BCF vs. n(C-atoms) yielded different slopes for the homologous groups of the 5- and the 2-sulfophenyl alkane isomers. Regression of all log BCF data vs. log 1/CMC yielded a good description of the data. However, when regressing the data for both homologous series separately, again very different slopes are obtained. The results therefore indicate that hydrophobicity-bioconcentration relationships may be different for different homologous groups of sulfophenyl alkanes.

  20. Catalytic hydroprocessing of fast pyrolysis oils: Impact of biomass feedstock on process efficiency

    DOE PAGES

    Carpenter, Daniel; Westover, Tyler; Howe, Daniel; ...

    2016-12-01

    We report here on an experimental study to produce refinery-ready fuel blendstocks via catalytic hydrodeoxygenation (upgrading) of pyrolysis oil using several biomass feedstocks and various blends. Blends were tested along with the pure materials to determine the effect of blending on product yields and qualities. Within experimental error, oil yields from fast pyrolysis and upgrading are shown to be linear functions of the blend components. Switchgrass exhibited lower fast pyrolysis and upgrading yields than the woody samples, which included clean pine, oriented strand board (OSB), and a mix of pinon and juniper (PJ). The notable exception was PJ, for which the poor upgrading yield of 18% was likely associated with the very high viscosity of the PJ fast pyrolysis oil (947 cP). The highest fast pyrolysis yield (54% dry basis) was obtained from clean pine, while the highest upgrading yield (50%) was obtained from a blend of 80% clean pine and 20% OSB (CP8OSB2). For switchgrass, reducing the fast pyrolysis temperature to 450 degrees C resulted in a significant increase in the pyrolysis oil yield and reduced hydrogen consumption during hydrotreating, but did not directly affect the hydrotreating oil yield. The water content of fast pyrolysis oils was also observed to increase linearly with the summed content of potassium and sodium, ranging from 21% for clean pine to 37% for switchgrass. Multiple linear regression models demonstrate that fast pyrolysis yield is strongly dependent upon the contents of lignin and volatile matter as well as the sum of potassium and sodium.

  1. Application of third molar development and eruption models in estimating dental age in Malay sub-adults.

    PubMed

    Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc

    2015-08-01

    The third molar development (TMD) has been widely utilized as one of the radiographic methods for dental age estimation. By using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation in the individual method models and the combined model (TMD and TME) based on the classic regressions of multiple linear and principal component analysis. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and Olze were employed to stage the TMD and TME, respectively. The data were divided to develop three respective models based on the two regressions of multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between the models. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² yielded an increment in the linear regressions of the combined model as compared to the individual models. An overall decrease in RMSE was detected in the combined model as compared to TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, a low value of adjusted R² and a high RMSE were exhibited in the combined model, except in males. Dental age estimation is better predicted using the combined model in multiple linear regression models. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  2. Simulation of relationship between river discharge and sediment yield in the semi-arid river watersheds

    NASA Astrophysics Data System (ADS)

    Khaleghi, Mohammad Reza; Varvani, Javad

    2018-02-01

    The complex and variable nature of river sediment yield causes many problems in estimating the long-term sediment yield and the sediment input into reservoirs. Sediment Rating Curves (SRCs) are generally used to estimate the suspended sediment load of rivers and drainage watersheds. Since the regression equations of the SRCs are obtained by logarithmic retransformation and have few independent variables, they overestimate or underestimate the true sediment load of the rivers. To evaluate the bias correction factors in the Kalshor and Kashafroud watersheds, seven hydrometric stations of this region with suitable upstream watersheds and spatial distribution were selected. Investigation of the accuracy index (ratio of estimated sediment yield to observed sediment yield) and the precision index of different bias correction factors of FAO, Quasi-Maximum Likelihood Estimator (QMLE), Smearing, and Minimum-Variance Unbiased Estimator (MVUE) with the LSD test showed that the FAO coefficient increases the estimation error at all of the stations. Application of MVUE in linear and mean load rating curves had no statistically meaningful effect. QMLE and smearing factors increased the estimation error in the mean load rating curve, but had no effect on the linear rating curve estimation.

  3. Accounting for the decrease of photosystem photochemical efficiency with increasing irradiance to estimate quantum yield of leaf photosynthesis.

    PubMed

    Yin, Xinyou; Belay, Daniel W; van der Putten, Peter E L; Struik, Paul C

    2014-12-01

    Maximum quantum yield for leaf CO2 assimilation under limiting light conditions (ΦCO2LL) is commonly estimated as the slope of the linear regression of net photosynthetic rate against absorbed irradiance over a range of low-irradiance conditions. Methodological errors associated with this estimation have often been attributed either to light absorptance by non-photosynthetic pigments or to some data points being beyond the linear range of the irradiance response, both causing an underestimation of ΦCO2LL. We demonstrate here that a decrease in photosystem (PS) photochemical efficiency with increasing irradiance, even at very low levels, is another source of error that causes a systematic underestimation of ΦCO2LL. A model method accounting for this error was developed, and was used to estimate ΦCO2LL from simultaneous measurements of gas exchange and chlorophyll fluorescence on leaves using various combinations of species, CO2, O2, or leaf temperature levels. The conventional linear regression method underestimated ΦCO2LL by ca. 10-15%. Differences in the estimated ΦCO2LL among measurement conditions were generally accounted for by different levels of photorespiration as described by the Farquhar-von Caemmerer-Berry model. However, our data revealed that the temperature dependence of PSII photochemical efficiency under low light was an additional factor that should be accounted for in the model.
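
The conventional slope estimate discussed above amounts to a single linear regression. In the sketch below the values (a quantum yield of 0.08 and a dark respiration of 1.0) are illustrative assumptions, and the response is made exactly linear, so the regression recovers the quantum yield without the efficiency-decline error the study analyzes.

```python
import numpy as np

# Conventional estimate: quantum yield = slope of net photosynthesis (A)
# regressed on absorbed irradiance (I) over a low-light range.
phi_true = 0.08   # assumed quantum yield, mol CO2 per mol photons
rd = 1.0          # assumed dark respiration rate

irradiance = np.linspace(20.0, 120.0, 6)   # low-light levels, umol m-2 s-1
a_net = phi_true * irradiance - rd         # idealized, strictly linear response

slope, intercept = np.polyfit(irradiance, a_net, deg=1)
# On real leaves, declining photosystem efficiency with irradiance would bias
# this slope low by ca. 10-15%, which is the error the model method corrects.
```

The study's point is that even these "low" irradiance levels are not low enough for the linear assumption to hold exactly on real leaves.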

  4. Quantum algorithm for linear regression

    NASA Astrophysics Data System (ADS)

    Wang, Guoming

    2017-07-01

    We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Differently from previous algorithms which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in the classical form. So by running it once, one completely determines the fitted model and then can use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time poly(log₂(N), d, κ, 1/ε), where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ε is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.

  5. Quantifying the sensitivity of feedstock properties and process conditions on hydrochar yield, carbon content, and energy content.

    PubMed

    Li, Liang; Wang, Yiying; Xu, Jiting; Flora, Joseph R V; Hoque, Shamia; Berge, Nicole D

    2018-08-01

    Hydrothermal carbonization (HTC) is a wet, low temperature thermal conversion process that continues to gain attention for the generation of hydrochar. The importance of specific process conditions and feedstock properties on hydrochar characteristics is not well understood. To evaluate this, linear and non-linear models were developed to describe hydrochar characteristics based on data collected from HTC-related literature. A Sobol analysis was subsequently conducted to identify parameters that most influence hydrochar characteristics. Results from this analysis indicate that for each investigated hydrochar property, the model fit and predictive capability associated with the random forest models is superior to both the linear and regression tree models. Based on results from the Sobol analysis, the feedstock properties and process conditions most influential on hydrochar yield, carbon content, and energy content were identified. In addition, a variational process parameter sensitivity analysis was conducted to determine how feedstock property importance changes with process conditions. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Bayesian Correction for Misclassification in Multilevel Count Data Models.

    PubMed

    Nelson, Tyler; Song, Joon Jin; Chin, Yoo-Mi; Stamey, James D

    2018-01-01

    Covariate misclassification is well known to yield biased estimates in single-level regression models. The impact on hierarchical count models has been less studied. A fully Bayesian approach to modeling both the misclassified covariate and the hierarchical response is proposed. Models with a single diagnostic test and with multiple diagnostic tests are considered. Simulation studies show the ability of the proposed model to appropriately account for the misclassification by reducing bias and improving the performance of interval estimators. A real-data example further demonstrates the consequences of ignoring the misclassification: the naive model indicated that witnessing spousal abuse between one's parents had a significant positive effect on a woman's number of children, whereas once the misclassification was accounted for, the relationship became negative but not significant. Ignoring misclassification in standard linear and generalized linear models is well known to lead to biased results. We provide an approach that extends misclassification modeling to the important area of hierarchical generalized linear models.
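    The bias the record corrects for is easy to reproduce. A minimal sketch with synthetic data, simplified to a single-level linear model (the paper's setting is hierarchical count data with a Bayesian correction, which this does not implement):

```python
# Simulate a binary covariate observed with imperfect sensitivity and
# specificity, and show that the naive regression slope is attenuated.
# All numbers here are made up for illustration.
import random

random.seed(1)
n, beta0, beta1 = 100_000, 1.0, 2.0
sens, spec = 0.8, 0.9          # P(observe 1 | true 1), P(observe 0 | true 0)

x_true = [1 if random.random() < 0.5 else 0 for _ in range(n)]
y = [beta0 + beta1 * x + random.gauss(0, 1) for x in x_true]
x_obs = [
    (1 if random.random() < sens else 0) if x == 1
    else (1 if random.random() > spec else 0)
    for x in x_true
]

def slope(x, y):
    # Simple-regression slope: cov(x, y) / var(x).
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

b_true, b_naive = slope(x_true, y), slope(x_obs, y)
# b_true is near 2.0; b_naive is biased toward zero (about 1.4 here).
```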

  7. Estimating VO[subscript 2]max Using a Personalized Step Test

    ERIC Educational Resources Information Center

    Webb, Carrie; Vehrs, Pat R.; George, James D.; Hager, Ronald

    2014-01-01

    The purpose of this study was to develop a step test with a personalized step rate and step height to predict cardiorespiratory fitness in 80 college-aged males and females using the self-reported perceived functional ability scale and data collected during the step test. Multiple linear regression analysis yielded a model (R = 0.90, SEE = 3.43…

  8. Wheat yield dynamics: a structural econometric analysis.

    PubMed

    Sahin, Afsin; Akdi, Yilmaz; Arslan, Fahrettin

    2007-10-15

    In this study we first explore the wheat situation in Turkey, a small open economy, and in the member countries of the European Union (EU). We observe that increasing wheat yield is fundamental to obtaining a comparative advantage among countries by depressing domestic prices. The changing structure of support schemes in Turkey also makes it necessary to raise its wheat yield level. For this purpose, we use the available data to determine the dynamics of wheat yield by ordinary least squares regression. To find out whether there is a linear relationship among these series, we check whether each series is integrated of the same order. We conclude that fertilizer usage and precipitation level are substantial inputs for producing high wheat yield; furthermore, in our model, fertilizer usage affects wheat yield more than precipitation level.

  9. Evaluation of the Williams-type model for barley yields in North Dakota and Minnesota

    NASA Technical Reports Server (NTRS)

    Barnett, T. L. (Principal Investigator)

    1981-01-01

    The Williams-type yield model is based on multiple regression analysis of historical time series data at CRD level pooled to regional level (groups of similar CRDs). Basic variables considered in the analysis include USDA yield, monthly mean temperature, monthly precipitation, soil texture and topographic information, and variables derived from these. Technologic trend is represented by piecewise linear and/or quadratic functions of year. Indicators of yield reliability obtained from a ten-year bootstrap test (1970-1979) demonstrate that biases are small and performance based on root mean square appears to be acceptable for the intended AgRISTARS large-area applications. The model is objective, adequate, timely, simple, and not costly. It considers scientific knowledge on a broad scale but not in detail, and does not provide a good current measure of modeled yield reliability.

  10. Modeling the relationships between quality and biochemical composition of fatty liver in mule ducks.

    PubMed

    Theron, L; Cullere, M; Bouillier-Oudot, M; Manse, H; Dalle Zotte, A; Molette, C; Fernandez, X; Vitezica, Z G

    2012-09-01

    The fatty liver of mule ducks (i.e., French "foie gras") is the most valuable product in duck production systems. Its quality is measured by the technological yield (TY), the complement of the fat loss during cooking. The purpose of this study was to determine whether biochemical measures of fatty liver could be used to accurately predict TY. Ninety-one male mule ducks were bred, overfed, and slaughtered under commercial conditions. Fatty liver weight (FLW) and biochemical variables, such as DM, lipid (LIP), and protein content (PROT), were collected. To evaluate evidence for nonlinear fat loss during cooking, we compared regression models describing linear and nonlinear relations between biochemical measures and TY. We detected a significant (P = 0.02) linear relation between DM and TY. Our results indicate that LIP and PROT follow a different (linear) pattern than DM, and that LIP and PROT are nonexclusive contributing factors to TY. Other components, such as carbohydrates, beyond those measured in this study could contribute to DM. Stepwise regression for TY was performed, and the traditional model with FLW was tested. The results showed that the weight of the liver is of limited value in the determination of fat loss during cooking (R(2) = 0.14). The most accurate TY prediction equation included DM (in linear and quadratic terms), FLW, and PROT (R(2) = 0.43). Biochemical measures of the fatty liver were more accurate predictors of TY than FLW. The model is useful in commercial conditions because DM, PROT, and FLW are noninvasive measures.
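    Why adding a quadratic term can raise R(2) can be sketched generically (synthetic curved data below, not the fatty-liver measurements):

```python
# Points generated from an exactly quadratic relation are fit
# imperfectly by a straight line; the quadratic fit captures them
# fully. Coefficients are invented for illustration.

xs = [x / 10.0 for x in range(21)]                  # 0.0 .. 2.0
ys = [1.0 + 0.2 * x + 1.5 * x * x for x in xs]      # exactly quadratic

def r_squared_linear(x, y):
    # R^2 of the simple straight-line fit equals the squared
    # correlation between x and y.
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

r2_lin = r_squared_linear(xs, ys)
# The least-squares quadratic through exactly-quadratic data is the
# generating polynomial itself (R^2 = 1); the line falls short of 1.
```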

  11. Carcass yield and meat quality in broilers fed with canola meal.

    PubMed

    Gopinger, E; Xavier, E G; Lemes, J S; Moraes, P O; Elias, M C; Roll, V F B

    2014-01-01

    1. This study evaluated the effects of canola meal in broiler diets on carcass yield, carcass composition, and instrumental and sensory analyses of meat. 2. A total of 320 one-day-old Cobb broilers were used in a 35-d experiment using a completely randomised design with 5 concentrations of canola meal (0, 10, 20, 30 and 40%) as a dietary substitute for soya bean meal. 3. Polynomial regression at 5% significance was used to evaluate the effects of canola meal content. The following variables were measured: carcass yield, chemical composition of meat, and instrumental and sensorial analyses. 4. The results showed that carcass yield exhibited a quadratic effect, increasing up to 18% canola meal based on the weight of the leg, and a quadratic increase at concentrations up to 8.4% canola meal based on the weight of the chest. The yield of the chest exhibited a linear behaviour. 5. The chemical composition of leg meat, instrumental analysis of breast meat and sensory characteristics of the breast meat were not significantly affected by the inclusion of canola meal. The chemical composition of the breast meat exhibited a linear increase in dry matter and ether extract and a linear decrease in ash content. 6. In conclusion, soya bean meal can be substituted with canola meal at concentrations up to 20% of the total diet without affecting carcass yield, composition of meat or the instrumental or sensory characteristics of the meat of broilers.

  12. The comparison of robust partial least squares regression with robust principal component regression on a real

    NASA Astrophysics Data System (ADS)

    Polat, Esra; Gunay, Suleyman

    2013-10-01

    One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and increases their variance. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed instead. SIMPLS is the leading PLSR algorithm because of its speed and efficiency, and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; the dependent variables are then regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to demonstrate the use of RPCR and RSIMPLS on an econometric data set, comparing the two methods on an inflation model of Turkey. The methods are compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-Validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.
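    For orientation, classical (non-robust) principal component regression, which RPCR robustifies, can be sketched in a few lines. This is a minimal sketch on synthetic collinear data with a single component; the robust PCA and robust regression steps of RPCR are not implemented here:

```python
# Minimal classical principal component regression: extract the first
# principal direction of two nearly collinear predictors and regress
# the response on its score. Synthetic data for illustration only.
import random

random.seed(2)
n = 2000
# Two nearly collinear predictors -> multicollinearity for plain OLS.
z = [random.gauss(0, 1) for _ in range(n)]
X = [[zi + random.gauss(0, 0.05), zi + random.gauss(0, 0.05)] for zi in z]
y = [x[0] + x[1] + random.gauss(0, 0.1) for x in X]

# Center the columns and the response.
means = [sum(col) / n for col in zip(*X)]
Xc = [[v - m for v, m in zip(row, means)] for row in X]
ymean = sum(y) / n
yc = [v - ymean for v in y]

# First principal direction via power iteration on X^T X (2x2 here).
C = [[sum(r[i] * r[j] for r in Xc) for j in range(2)] for i in range(2)]
v = [1.0, 0.0]
for _ in range(100):
    w = [C[0][0] * v[0] + C[0][1] * v[1], C[1][0] * v[0] + C[1][1] * v[1]]
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    v = [w[0] / norm, w[1] / norm]

# Score each row on the first PC, then simple regression of y on it.
t = [r[0] * v[0] + r[1] * v[1] for r in Xc]
gamma = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
# Map back to the original predictors: beta = gamma * v, near [1, 1].
beta = [gamma * v[0], gamma * v[1]]
```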

  13. Parametric correlation functions to model the structure of permanent environmental (co)variances in milk yield random regression models.

    PubMed

    Bignardi, A B; El Faro, L; Cardoso, V L; Machado, P F; Albuquerque, L G

    2009-09-01

    The objective of the present study was to estimate milk yield genetic parameters applying random regression models and parametric correlation functions combined with a variance function to model animal permanent environmental effects. A total of 152,145 test-day milk yields from 7,317 first lactations of Holstein cows belonging to herds located in the southeastern region of Brazil were analyzed. Test-day milk yields were divided into 44 weekly classes of days in milk. Contemporary groups were defined by herd-test-day comprising a total of 2,539 classes. The model included direct additive genetic, permanent environmental, and residual random effects. The following fixed effects were considered: contemporary group, age of cow at calving (linear and quadratic regressions), and the population average lactation curve modeled by fourth-order orthogonal Legendre polynomial. Additive genetic effects were modeled by random regression on orthogonal Legendre polynomials of days in milk, whereas permanent environmental effects were estimated using a stationary or nonstationary parametric correlation function combined with a variance function of different orders. The structure of residual variances was modeled using a step function containing 6 variance classes. The genetic parameter estimates obtained with the model using a stationary correlation function associated with a variance function to model permanent environmental effects were similar to those obtained with models employing orthogonal Legendre polynomials for the same effect. A model using a sixth-order polynomial for additive effects and a stationary parametric correlation function associated with a seventh-order variance function to model permanent environmental effects would be sufficient for data fitting.

  14. [Winter wheat yield gap between field blocks based on comparative performance analysis].

    PubMed

    Chen, Jian; Wang, Zhong-Yi; Li, Liang-Tao; Zhang, Ke-Feng; Yu, Zhen-Rong

    2008-09-01

    Based on two-year household survey data, the yield gap of winter wheat between field blocks in Quzhou County of Hebei Province, China in 2003-2004 was studied through comparative performance analysis (CPA). The results showed a considerable yield gap (from 4.2 to 7.9 t x hm(-2)) between field blocks, with a variation coefficient of 0.14. Through stepwise forward linear multiple regression, it was found that the yield model with 8 selected variables could explain 63% of the variability of winter wheat yield. Among the variables selected, soil salinity, soil fertility, and irrigation water quality were the most important limiting factors, accounting for 52% of the total yield gap. Crop variety was another important limiting factor, accounting for 14%; planting date, fertilizer type, disease and pest damage, and water stress accounted for 7%, 14%, 10%, and 3%, respectively. Therefore, besides soil and climate conditions, management practices accounted for the majority of yield variability in Quzhou County, suggesting that the yield gap could be reduced significantly through optimum field management.

  15. Gastrointestinal Spatiotemporal mRNA Expression of Ghrelin vs Growth Hormone Receptor and New Growth Yield Machine Learning Model Based on Perturbation Theory.

    PubMed

    Ran, Tao; Liu, Yong; Li, Hengzhi; Tang, Shaoxun; He, Zhixiong; Munteanu, Cristian R; González-Díaz, Humberto; Tan, Zhiliang; Zhou, Chuanshe

    2016-07-27

    The management of ruminant growth yield has economic importance. The current work presents a study of the spatiotemporal dynamic expression of Ghrelin and GHR at mRNA levels throughout the gastrointestinal tract (GIT) of kid goats under housing and grazing systems. The experiments show that the feeding system and age affected the expression of either Ghrelin or GHR with different mechanisms. Furthermore, the experimental data are used to build new Machine Learning models based on the Perturbation Theory, which can predict the effects of perturbations of Ghrelin and GHR mRNA expression on the growth yield. The models consider eight longitudinal GIT segments (rumen, abomasum, duodenum, jejunum, ileum, cecum, colon and rectum), seven time points (0, 7, 14, 28, 42, 56 and 70 d) and two feeding systems (Supplemental and Grazing feeding) as perturbations from the expected values of the growth yield. The best regression model was obtained using Random Forest, with the coefficient of determination R(2) of 0.781 for the test subset. The current results indicate that the non-linear regression model can accurately predict the growth yield and the key nodes during gastrointestinal development, which is helpful to optimize the feeding management strategies in ruminant production system.

  17. A new graphic plot analysis for determination of neuroreceptor binding in positron emission tomography studies.

    PubMed

    Ito, Hiroshi; Yokoi, Takashi; Ikoma, Yoko; Shidahara, Miho; Seki, Chie; Naganawa, Mika; Takahashi, Hidehiko; Takano, Harumasa; Kimura, Yuichi; Ichise, Masanori; Suhara, Tetsuya

    2010-01-01

    In positron emission tomography (PET) studies with radioligands for neuroreceptors, tracer kinetics have been described by the standard two-tissue compartment model that includes the compartments of nondisplaceable binding and specific binding to receptors. In the present study, we have developed a new graphic plot analysis to determine the total distribution volume (V(T)) and nondisplaceable distribution volume (V(ND)) independently, and therefore the binding potential (BP(ND)). In this plot, Y(t) is the ratio of brain tissue activity to time-integrated arterial input function, and X(t) is the ratio of time-integrated brain tissue activity to time-integrated arterial input function. The x-intercept of the linear regression of the plots for the early phase represents V(ND), and the x-intercept of the linear regression of the plots for the delayed phase after the equilibrium time represents V(T). BP(ND) can be calculated by BP(ND) = V(T)/V(ND) - 1. Dynamic PET scanning with measurement of arterial input function was performed on six healthy men after intravenous rapid bolus injection of [(11)C]FLB457. The plot yielded a curve in regions with specific binding, while it yielded a straight line through all plot data in regions with no specific binding. V(ND), V(T), and BP(ND) values calculated by the present method were in good agreement with those obtained by the conventional non-linear least-squares fitting procedure. This method can be used to distinguish graphically whether the radioligand binding includes specific binding or not.
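    The plot's readout can be sketched with hypothetical numbers: fit a straight line to each phase of the plot, take its x-intercept, and combine the two intercepts into BP(ND) = V(T)/V(ND) - 1. The data points below are invented for illustration, not measured PET data:

```python
# Compute the two x-intercepts of the graphic plot and the binding
# potential BP_ND = V_T / V_ND - 1. Plot points are hypothetical.

def x_intercept(xs, ys):
    # Least-squares line y = a + b*x; it crosses y = 0 at x = -a/b.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return -a / b

# Hypothetical (X(t), Y(t)) points on the two straight segments.
early   = ([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])    # x-intercept 1.0 -> V_ND
delayed = ([8.0, 10.0, 12.0], [1.0, 1.5, 2.0])  # x-intercept 4.0 -> V_T

v_nd = x_intercept(*early)
v_t = x_intercept(*delayed)
bp_nd = v_t / v_nd - 1.0   # 3.0 with these hypothetical numbers
```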

  18. Assessment of energy crops alternative to maize for biogas production in the Greater Region.

    PubMed

    Mayer, Frédéric; Gerin, Patrick A; Noo, Anaïs; Lemaigre, Sébastien; Stilmant, Didier; Schmit, Thomas; Leclech, Nathael; Ruelle, Luc; Gennen, Jerome; von Francken-Welz, Herbert; Foucart, Guy; Flammang, Jos; Weyland, Marc; Delfosse, Philippe

    2014-08-01

    The biomethane yield of various energy crops, selected among potential alternatives to maize in the Greater Region, was assessed. The biomass yield, the volatile solids (VS) content and the biochemical methane potential (BMP) were measured to calculate the biomethane yield per hectare of all plant species. For all species, the dry matter biomass yield and the VS content were the main factors that influence, respectively, the biomethane yield and the BMP. Both values were predicted with good accuracy by linear regressions using the biomass yield and the VS as independent variable. The perennial crop miscanthus appeared to be the most promising alternative to maize when harvested as green matter in autumn and ensiled. Miscanthus reached a biomethane yield of 5.5 ± 1 × 10(3)m(3)ha(-1) during the second year after the establishment, as compared to 5.3 ± 1 × 10(3)m(3)ha(-1) for maize under similar crop conditions. Copyright © 2014. Published by Elsevier Ltd.

  19. Blackleg (Leptosphaeria maculans) Severity and Yield Loss in Canola in Alberta, Canada

    PubMed Central

    Hwang, Sheau-Fang; Strelkov, Stephen E.; Peng, Gary; Ahmed, Hafiz; Zhou, Qixing; Turnbull, George

    2016-01-01

    Blackleg, caused by Leptosphaeria maculans, is an important disease of oilseed rape (Brassica napus L.) in Canada and throughout the world. Severe epidemics of blackleg can result in significant yield losses. Understanding disease-yield relationships is a prerequisite for measuring the agronomic efficacy and economic benefits of control methods. Field experiments were conducted in 2013, 2014, and 2015 to determine the relationship between blackleg disease severity and yield in a susceptible cultivar and in moderately resistant to resistant canola hybrids. Disease severity was lower, and seed yield was 120%–128% greater, in the moderately resistant to resistant hybrids compared with the susceptible cultivar. Regression analysis showed that pod number and seed yield declined linearly as blackleg severity increased. Seed yield per plant decreased by 1.8 g for each unit increase in disease severity, corresponding to a decline in yield of 17.2% for each unit increase in disease severity. Pyraclostrobin fungicide reduced disease severity in all site-years and increased yield. These results show that the reduction of blackleg in canola crops substantially improves yields. PMID:27447676

  20. [Optimization of cultivation conditions in se-enriched Spirulina platensis].

    PubMed

    Huang, Zhi; Zheng, Wen-Jie; Guo, Bao-Jiang

    2002-05-01

    Orthogonal combination design was adopted to examine Spirulina platensis (S. platensis) yield and the influence of four factors (Se content, Se-adding method, S content and NaHCO3 content) on algal growth. The results showed that Se content, Se-adding method and NaHCO3 content were the key cultivation factors for Se-enriched S. platensis, with the optimal combination being Se at 300 mg/L, the Se dose divided equally among three additions, and NaHCO3 at 16.8 g/L. By linear regression analysis, algal yield was strongly correlated with OD560 and floating rate, and the effects of the four factors on OD560 and floating rate paralleled their effects on yield. In conclusion, OD560 and floating rate could serve as indicators of yield formation.

  1. Metabolic control analysis using transient metabolite concentrations. Determination of metabolite concentration control coefficients.

    PubMed Central

    Delgado, J; Liao, J C

    1992-01-01

    The methodology previously developed for determining the Flux Control Coefficients [Delgado & Liao (1992) Biochem. J. 282, 919-927] is extended to the calculation of metabolite Concentration Control Coefficients. It is shown that the transient metabolite concentrations are related by a few algebraic equations, attributed to mass balance, stoichiometric constraints, quasi-equilibrium or quasi-steady states, and kinetic regulations. The coefficients in these relations can be estimated using linear regression, and can be used to calculate the Control Coefficients. The theoretical basis and two examples are discussed. Although the methodology is derived based on the linear approximation of enzyme kinetics, it yields reasonably good estimates of the Control Coefficients for systems with non-linear kinetics. PMID:1497632

  2. Clinical laboratory investigation of the Sanofi ACCESS CK-MB procedure and comparison to electrophoresis and Abbott IMx.

    PubMed

    Mao, G D; Adeli, K; Eisenbrey, A B; Artiss, J D

    1996-07-01

    This evaluation was undertaken to verify the application protocol for the CK-MB assay on the ACCESS Immunoassay Analyzer (Sanofi Diagnostics Pasteur, Chaska, MN). The results show that the ACCESS CK-MB assay total imprecision was 6.8% to 9.1%. Analytical linearity of the ACCESS CK-MB assay was excellent in the range of < 1-214 micrograms/L. A comparison of the ACCESS CK-MB assay with the IMx (Abbott Laboratories, Abbott Park, IL) method shows good correlation r = 0.990 (n = 108). Linear regression analysis yielded Y = 1.36X-0.3, Sx/y = 7.2. ACCESS CK-MB values also correlated well with CK-MB by electrophoresis with r = 0.968 (n = 132). The linear regression equation for this comparison was Y = 1.08X + 1.4, Sx/y = 14.1. The expected non-myocardial infarction range of CK-MB determined by the ACCESS system was 1.3-9.4 micrograms/L (mean = 4.0, n = 58). The ACCESS CK-MB assay would appear to be rapid, precise and clinically useful.

  3. Estimating and testing interactions when explanatory variables are subject to non-classical measurement error.

    PubMed

    Murad, Havi; Kipnis, Victor; Freedman, Laurence S

    2016-10-01

    Assessing interactions in linear regression models when covariates have measurement error (ME) is complex. We previously described regression calibration (RC) methods that yield consistent estimators and standard errors for interaction coefficients of normally distributed covariates having classical ME. Here we extend normal-based RC (NBRC) and linear RC (LRC) methods to a non-classical ME model, and describe more efficient versions that combine estimates from the main study and internal sub-study. We apply these methods to data from the Observing Protein and Energy Nutrition (OPEN) study. Using simulations we show that (i) for normally distributed covariates, efficient NBRC and LRC were nearly unbiased and performed well with sub-study size ≥200; (ii) efficient NBRC had lower MSE than efficient LRC; (iii) the naïve test for a single interaction had type I error probability close to the nominal significance level, whereas efficient NBRC and LRC were slightly anti-conservative but more powerful; (iv) for markedly non-normal covariates, efficient LRC yielded less biased estimators with smaller variance than efficient NBRC. Our simulations suggest that it is preferable to use (i) efficient NBRC for estimating and testing interaction effects of normally distributed covariates and (ii) efficient LRC for estimating and testing interactions of markedly non-normal covariates. © The Author(s) 2013.
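    A stripped-down regression-calibration sketch on synthetic data (single covariate with classical measurement error; the record's contribution, interactions under non-classical error, is beyond this illustration): fit E[X|W] in an internal sub-study, then regress the outcome on the calibrated covariate in the main study.

```python
# Regression calibration, minimal form: the main study observes only
# the error-prone W = X + U and the outcome y; an internal sub-study
# observes both X and W and supplies the calibration model E[X|W].
# All parameters and sample sizes are made up for illustration.
import random

random.seed(3)

def slope_intercept(x, y):
    # Simple-regression fit y = a + b*x.
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return b, my - b * mx

n_main, n_sub, beta = 50_000, 1000, 2.0
# Main study: true X unseen.
x_main = [random.gauss(0, 1) for _ in range(n_main)]
w_main = [x + random.gauss(0, 1) for x in x_main]
y_main = [beta * x + random.gauss(0, 1) for x in x_main]
# Internal sub-study: both X and W observed; fit the calibration model.
x_sub = [random.gauss(0, 1) for _ in range(n_sub)]
w_sub = [x + random.gauss(0, 1) for x in x_sub]
g1, g0 = slope_intercept(w_sub, x_sub)

b_naive, _ = slope_intercept(w_main, y_main)        # attenuated, ~beta/2
x_hat = [g0 + g1 * w for w in w_main]               # calibrated covariate
b_rc, _ = slope_intercept(x_hat, y_main)            # ~beta
```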

  4. Echocardiographic Linear Dimensions for Assessment of Right Ventricular Chamber Volume as Demonstrated by Cardiac Magnetic Resonance.

    PubMed

    Kim, Jiwon; Srinivasan, Aparna; Seoane, Tania; Di Franco, Antonino; Peskin, Charles S; McQueen, David M; Paul, Tracy K; Feher, Attila; Geevarghese, Alexi; Rozenstrauch, Meenakshi; Devereux, Richard B; Weinsaft, Jonathan W

    2016-09-01

    Echocardiography-derived linear dimensions offer straightforward indices of right ventricular (RV) structure but have not been systematically compared with RV volumes on cardiac magnetic resonance (CMR). Echocardiography and CMR were interpreted among patients with coronary artery disease imaged via prospective (90%) and retrospective (10%) registries. For echocardiography, American Society of Echocardiography-recommended RV dimensions were measured in apical four-chamber (basal RV width, mid RV width, and RV length), parasternal long-axis (proximal RV outflow tract [RVOT]), and short-axis (distal RVOT) views. For CMR, RV end-diastolic volume and RV end-systolic volume were quantified using border planimetry. Two hundred seventy-two patients underwent echocardiography and CMR within a narrow interval (0.4 ± 1.0 days); complete acquisition of all American Society of Echocardiography-recommended dimensions was feasible in 98%. All echocardiographic dimensions differed between patients with and those without RV dilation on CMR (P < .05). Basal RV width (r = 0.70), proximal RVOT width (r = 0.68), and RV length (r = 0.61) yielded the highest correlations with RV end-diastolic volume on CMR; end-systolic dimensions yielded similar correlations (r = 0.68, r = 0.66, and r = 0.65, respectively). In multivariate regression, basal RV width (regression coefficient = 1.96 per mm; 95% CI, 1.22-2.70; P < .001), RV length (regression coefficient = 0.97; 95% CI, 0.56-1.37; P < .001), and proximal RVOT width (regression coefficient = 2.62; 95% CI, 1.79-3.44; P < .001) were independently associated with CMR RV end-diastolic volume (r = 0.80). RV end-systolic volume was similarly associated with echocardiographic dimensions (basal RV width: 1.59 per mm [95% CI, 1.06-2.13], P < .001; RV length: 1.00 [95% CI, 0.66-1.34], P < .001; proximal RVOT width: 1.80 [95% CI, 1.22-2.39], P < .001) (r = 0.79). RV linear dimensions provide readily obtainable markers of RV chamber size. Proximal RVOT and basal width are independently associated with CMR volumes, supporting the use of multiple linear dimensions when assessing RV size on echocardiography. Copyright © 2016 American Society of Echocardiography. Published by Elsevier Inc. All rights reserved.

  5. Classical and Bayesian Seismic Yield Estimation: The 1998 Indian and Pakistani Tests

    NASA Astrophysics Data System (ADS)

    Shumway, R. H.

    2001-10-01

    The nuclear tests in May, 1998, in India and Pakistan have stimulated a renewed interest in yield estimation, based on limited data from uncalibrated test sites. We study here the problem of estimating yields using classical and Bayesian methods developed by Shumway (1992), utilizing calibration data from the Semipalatinsk test site and measured magnitudes for the 1998 Indian and Pakistani tests given by Murphy (1998). Calibration is done using multivariate classical or Bayesian linear regression, depending on the availability of measured magnitude-yield data and prior information. Confidence intervals for the classical approach are derived applying an extension of Fieller's method suggested by Brown (1982). In the case where prior information is available, the posterior predictive magnitude densities are inverted to give posterior intervals for yield. Intervals obtained using the joint distribution of magnitudes are comparable to the single-magnitude estimates produced by Murphy (1998) and reinforce the conclusion that the announced yields of the Indian and Pakistani tests were too high.

  7. Consequences of kriging and land use regression for PM2.5 predictions in epidemiologic analyses: Insights into spatial variability using high-resolution satellite data

    PubMed Central

    Alexeeff, Stacey E.; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A.

    2016-01-01

    Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1km x 1km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R2 yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with greater than 0.9 out-of-sample R2 yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the standard errors. Land use regression models performed better in chronic effects simulations. These results can help researchers when interpreting health effect estimates in these types of studies. PMID:24896768

  8. Lead in teeth from lead-dosed goats: Microdistribution and relationship to the cumulative lead dose

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bellis, David J.; Hetter, Katherine M.; Jones, Joseph

    2008-01-15

Teeth are commonly used as a biomarker of long-term lead exposure. There appear to be few data, however, on the content or distribution of lead in teeth where data on specific lead intake (dose) are also available. This study describes the analysis of a convenience sample of teeth from animals that were dosed with lead for other purposes, i.e., a proficiency testing program for blood lead. Lead concentration of whole teeth obtained from 23 animals, as determined by atomic absorption spectrometry, varied from 0.6 to 80 µg g⁻¹. Linear regression of whole tooth lead (µg g⁻¹) on the cumulative lead dose received by the animal (g) yielded a slope of 1.2, with r² = 0.647 (p < 0.0001). Laser ablation inductively coupled plasma mass spectrometry was employed to determine lead content at micrometer-scale spatial resolution in the teeth of seven goats representing the dosing range. Highly localized concentrations of lead, ranging from about 10 to 2000 µg g⁻¹, were found in circumpulpal dentine. Linear regression of circumpulpal lead (µg g⁻¹) on cumulative lead dose (g) yielded a slope of 23 with r² = 0.961 (p = 0.0001). The data indicated that whole tooth lead, and especially circumpulpal lead, of dosed goats increased linearly with cumulative lead exposure. These data suggest that circumpulpal dentine is a better biomarker of cumulative lead exposure than is whole tooth lead, at least for lead-dosed goats.

  9. Comparison of CEAS and Williams-type models for spring wheat yields in North Dakota and Minnesota

    NASA Technical Reports Server (NTRS)

    Barnett, T. L. (Principal Investigator)

    1982-01-01

    The CEAS and Williams-type yield models are both based on multiple regression analysis of historical time series data at CRD level. The CEAS model develops a separate relation for each CRD; the Williams-type model pools CRD data to regional level (groups of similar CRDs). Basic variables considered in the analyses are USDA yield, monthly mean temperature, monthly precipitation, and variables derived from these. The Williams-type model also used soil texture and topographic information. Technological trend is represented in both by piecewise linear functions of year. Indicators of yield reliability obtained from a ten-year bootstrap test of each model (1970-1979) demonstrate that the models are very similar in performance in all respects. Both models are about equally objective, adequate, timely, simple, and inexpensive. Both consider scientific knowledge on a broad scale but not in detail. Neither provides a good current measure of modeled yield reliability. The CEAS model is considered very slightly preferable for AgRISTARS applications.

  10. Geographical variation of cerebrovascular disease in New York State: the correlation with income

    PubMed Central

    Han, Daikwon; Carrow, Shannon S; Rogerson, Peter A; Munschauer, Frederick E

    2005-01-01

Background Income is known to be associated with cerebrovascular disease; however, little is known about the more detailed relationship between cerebrovascular disease and income. We examined the hypothesis that the geographical distribution of cerebrovascular disease in New York State may be predicted by a nonlinear model using income as a surrogate socioeconomic risk factor. Results We used spatial clustering methods to identify areas with high and low prevalence of cerebrovascular disease at the ZIP code level after smoothing rates and correcting for edge effects; geographic locations of high and low clusters of cerebrovascular disease in New York State were identified with and without income adjustment. To examine effects of income, we calculated the excess number of cases using a non-linear regression with cerebrovascular disease rates taken as the dependent variable and income and income squared taken as independent variables. The resulting regression equation was: excess rate = 32.075 − 1.22 × 10⁻⁴(income) + 8.068 × 10⁻¹⁰(income²), and both the income and income squared variables were significant at the 0.01 level. When income was included as a covariate in the non-linear regression, the number and size of clusters of high cerebrovascular disease prevalence decreased. Some 87 ZIP codes exceeded the critical value of the local statistic, yielding a relative risk of 1.2. The majority of low cerebrovascular disease prevalence geographic clusters disappeared when the non-linear income effect was included. For linear regression, the excess rate of cerebrovascular disease falls with income; each $10,000 increase in median income of each ZIP code resulted in an average reduction of 3.83 observed cases. The significant nonlinear effect indicates a lessening of this income effect with increasing income.
Conclusion Income is a non-linear predictor of excess cerebrovascular disease rates, with both low and high observed cerebrovascular disease rate areas associated with higher income. Income alone explains a significant amount of the geographical variance in cerebrovascular disease across New York State since both high and low clusters of cerebrovascular disease dissipate or disappear with income adjustment. Geographical modeling, including non-linear effects of income, may allow for better identification of other non-traditional risk factors. PMID:16242043
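The quadratic income model in this record can be reproduced in miniature. The sketch below, with invented ZIP-code-like data (none of the published New York State coefficients are reused), fits excess rate = b0 + b1·income + b2·income² by ordinary least squares and locates the turning point where the income effect reverses:

```python
import numpy as np

# Hypothetical data: income in dollars, excess cerebrovascular disease rate.
# The generating coefficients are illustrative, not the published values.
rng = np.random.default_rng(0)
income = rng.uniform(20_000, 120_000, size=200)
excess = 32.0 - 1.2e-4 * income + 8.0e-10 * income**2 + rng.normal(0, 0.5, 200)

# Ordinary least squares for excess = b0 + b1*income + b2*income^2.
X = np.column_stack([np.ones_like(income), income, income**2])
b0, b1, b2 = np.linalg.lstsq(X, excess, rcond=None)[0]

# A negative linear and positive quadratic term mean the income effect
# lessens with rising income; the fitted curve turns at -b1 / (2*b2).
turning_point = -b1 / (2.0 * b2)
```

With b1 negative and b2 positive, the fitted parabola mirrors the abstract's finding that both low- and high-income areas can show elevated rates.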

  11. Estimating standard errors in feature network models.

    PubMed

    Frank, Laurence E; Heiser, Willem J

    2007-05-01

    Feature network models are graphical structures that represent proximity data in a discrete space while using the same formalism that is the basis of least squares methods employed in multidimensional scaling. Existing methods to derive a network model from empirical data only give the best-fitting network and yield no standard errors for the parameter estimates. The additivity properties of networks make it possible to consider the model as a univariate (multiple) linear regression problem with positivity restrictions on the parameters. In the present study, both theoretical and empirical standard errors are obtained for the constrained regression parameters of a network model with known features. The performance of both types of standard error is evaluated using Monte Carlo techniques.
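Regression with positivity restrictions on the parameters, as used here for network feature weights, is the nonnegative least squares problem. A minimal sketch using `scipy.optimize.nnls` on a hypothetical feature design follows; this is the generic constrained fit, not the authors' standard-error machinery:

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical design: rows are object pairs, columns are binary features
# indicating whether a feature distinguishes the pair.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(60, 5)).astype(float)
true_w = np.array([0.5, 1.0, 0.0, 2.0, 0.3])     # nonnegative feature weights
y = X @ true_w + rng.normal(0.0, 0.05, size=60)  # noisy dissimilarities

# Nonnegative least squares enforces the positivity restriction directly.
w_hat, residual_norm = nnls(X, y)
```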

  12. Advanced statistics: linear regression, part I: simple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
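For a single predictor, the method of least squares reviewed in this article reduces to two closed-form expressions. A sketch with made-up data points (the article's clinical datasets are not reproduced here):

```python
import numpy as np

# Made-up observations, roughly following y = 2x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.9])

# Method of least squares for y = a + b*x:
# slope b = cov(x, y) / var(x); intercept a = mean(y) - b * mean(x).
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

# With an intercept in the model, least-squares residuals sum to zero.
residuals = y - (a + b * x)
```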

  13. Efficient Determination of Free Energy Landscapes in Multiple Dimensions from Biased Umbrella Sampling Simulations Using Linear Regression.

    PubMed

    Meng, Yilin; Roux, Benoît

    2015-08-11

    The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost.

  14. Efficient Determination of Free Energy Landscapes in Multiple Dimensions from Biased Umbrella Sampling Simulations Using Linear Regression

    PubMed Central

    2015-01-01

    The weighted histogram analysis method (WHAM) is a standard protocol for postprocessing the information from biased umbrella sampling simulations to construct the potential of mean force with respect to a set of order parameters. By virtue of the WHAM equations, the unbiased density of state is determined by satisfying a self-consistent condition through an iterative procedure. While the method works very effectively when the number of order parameters is small, its computational cost grows rapidly in higher dimension. Here, we present a simple and efficient alternative strategy, which avoids solving the self-consistent WHAM equations iteratively. An efficient multivariate linear regression framework is utilized to link the biased probability densities of individual umbrella windows and yield an unbiased global free energy landscape in the space of order parameters. It is demonstrated with practical examples that free energy landscapes that are comparable in accuracy to WHAM can be generated at a small fraction of the cost. PMID:26574437

  15. Relationships among ultrasonic and mechanical properties of cancellous bone in human calcaneus in vitro.

    PubMed

    Wear, Keith A; Nagaraja, Srinidhi; Dreher, Maureen L; Sadoughi, Saghi; Zhu, Shan; Keaveny, Tony M

    2017-10-01

Clinical bone sonometers applied at the calcaneus measure broadband ultrasound attenuation and speed of sound. However, the relation of ultrasound measurements to bone strength is not well characterized. Addressing this issue, we assessed the extent to which ultrasonic measurements convey in vitro mechanical properties in 25 human calcaneal cancellous bone specimens (approximately 2 × 4 × 2 cm). Normalized broadband ultrasound attenuation, speed of sound, and broadband ultrasound backscatter were measured with 500 kHz transducers. To assess mechanical properties, non-linear finite element analysis, based on micro-computed tomography images (34-micron cubic voxels), was used to estimate apparent elastic modulus, overall specimen stiffness, and apparent yield stress, with models typically having approximately 25-30 million elements. We found that ultrasound parameters were correlated with mechanical properties with R = 0.70-0.82 (p < 0.001). Multiple regression analysis indicated that ultrasound measurements provide additional information regarding mechanical properties beyond that provided by bone quantity alone (p ≤ 0.05). Adding ultrasound variables to linear regression models based on bone quantity improved adjusted squared correlation coefficients from 0.65 to 0.77 (stiffness), 0.76 to 0.81 (apparent modulus), and 0.67 to 0.73 (yield stress). These results indicate that ultrasound can provide complementary (to bone quantity) information regarding mechanical behavior of cancellous bone. Published by Elsevier Inc.
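The model comparison above hinges on the adjusted squared correlation coefficient, which discounts R² for the number of predictors. A sketch of that statistic on invented specimen-like data (the study's measurements are not given in the abstract, so the numbers below are hypothetical):

```python
import numpy as np

def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R^2: penalizes R^2 for the number of predictors used."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

# Invented data: n = 25 specimens, one bone-quantity predictor of stiffness.
rng = np.random.default_rng(2)
bone_quantity = rng.normal(size=25)
stiffness = 2.0 * bone_quantity + rng.normal(0.0, 1.0, size=25)

# One-predictor least-squares fit and its adjusted R^2.
slope, intercept = np.polyfit(bone_quantity, stiffness, 1)
adj = adjusted_r2(stiffness, slope * bone_quantity + intercept, n_predictors=1)
```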

  16. Epidemiology and impact of Fasciola hepatica exposure in high-yielding dairy herds

    PubMed Central

    Howell, Alison; Baylis, Matthew; Smith, Rob; Pinchbeck, Gina; Williams, Diana

    2015-01-01

    The liver fluke Fasciola hepatica is a trematode parasite with a worldwide distribution and is the cause of important production losses in the dairy industry. The aim of this observational study was to assess the prevalence of exposure to F. hepatica in a group of high yielding dairy herds, to determine the risk factors and investigate their associations with production and fertility parameters. Bulk milk tank samples from 606 herds that supply a single retailer with liquid milk were tested with an antibody ELISA for F. hepatica. Multivariable linear regression was used to investigate the effect of farm management and environmental risk factors on F. hepatica exposure. Higher rainfall, grazing boggy pasture, presence of beef cattle on farm, access to a stream or pond and smaller herd size were associated with an increased risk of exposure. Univariable regression was used to look for associations between fluke exposure and production-related variables including milk yield, composition, somatic cell count and calving index. Although causation cannot be assumed, a significant (p < 0.001) negative association was seen between F. hepatica exposure and estimated milk yield at the herd level, representing a 15% decrease in yield for an increase in F. hepatica exposure from the 25th to the 75th percentile. This remained significant when fertility, farm management and environmental factors were controlled for. No associations were found between F. hepatica exposure and any of the other production, disease or fertility variables. PMID:26093971

  17. Multitrait, Random Regression, or Simple Repeatability Model in High-Throughput Phenotyping Data Improve Genomic Prediction for Wheat Grain Yield.

    PubMed

    Sun, Jin; Rutkoski, Jessica E; Poland, Jesse A; Crossa, José; Jannink, Jean-Luc; Sorrells, Mark E

    2017-07-01

High-throughput phenotyping (HTP) platforms can be used to measure traits that are genetically correlated with wheat (Triticum aestivum L.) grain yield across time. Incorporating such secondary traits in the multivariate pedigree and genomic prediction models would be desirable to improve indirect selection for grain yield. In this study, we evaluated three statistical models, simple repeatability (SR), multitrait (MT), and random regression (RR), for the longitudinal data of secondary traits and compared the impact of the proposed models for secondary traits on their predictive abilities for grain yield. Grain yield and secondary traits, canopy temperature (CT) and normalized difference vegetation index (NDVI), were collected in five diverse environments for 557 wheat lines with available pedigree and genomic information. A two-stage analysis was applied for pedigree and genomic selection (GS). First, secondary traits were fitted by SR, MT, or RR models, separately, within each environment. Then, best linear unbiased predictions (BLUPs) of secondary traits from the above models were used in the multivariate prediction models to compare predictive abilities for grain yield. Predictive ability was substantially improved by 70%, on average, from multivariate pedigree and genomic models when including secondary traits in both training and test populations. Additionally, (i) predictive abilities slightly varied for MT, RR, or SR models in this data set, (ii) results indicated that including BLUPs of secondary traits from the MT model was the best in severe drought, and (iii) the RR model was slightly better than SR and MT models under drought environment. Copyright © 2017 Crop Science Society of America.

  18. Multiple-trait random regression models for the estimation of genetic parameters for milk, fat, and protein yield in buffaloes.

    PubMed

    Borquis, Rusbel Raul Aspilcueta; Neto, Francisco Ribeiro de Araujo; Baldi, Fernando; Hurtado-Lugo, Naudin; de Camargo, Gregório M F; Muñoz-Berrocal, Milthon; Tonhati, Humberto

    2013-09-01

In this study, genetic parameters for test-day milk, fat, and protein yield were estimated for the first lactation. The data analyzed consisted of 1,433 first lactations of Murrah buffaloes, daughters of 113 sires from 12 herds in the state of São Paulo, Brazil, with calvings from 1985 to 2007. Ten monthly classes of days in lactation were considered for the test-day yields. The (co)variance components for the 3 traits were estimated using random regression analyses by Bayesian inference, applying an animal model via Gibbs sampling. The contemporary groups were defined as herd-year-month of the test day. In the model, the random effects were additive genetic, permanent environment, and residual. The fixed effects were contemporary group and number of milkings (1 or 2), the linear and quadratic effects of the covariable age of the buffalo at calving, as well as the mean lactation curve of the population, which was modeled by orthogonal Legendre polynomials of fourth order. The random effects for the traits studied were modeled by Legendre polynomials of third and fourth order for additive genetic and permanent environment, respectively; the residual variances were modeled considering 4 residual classes. The heritability estimates for the traits were moderate (from 0.21-0.38), with higher estimates in the intermediate lactation phase. The genetic correlation estimates within and among the traits varied from 0.05 to 0.99. The results indicate that selection for any test-day trait will result in an indirect genetic gain for milk, fat, and protein yield in all periods of the lactation curve. The accuracy associated with estimated breeding values obtained using multi-trait random regression was slightly higher (around 8%) compared with single-trait random regression. This difference may be due to the greater amount of information available per animal. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  19. Improving mass-univariate analysis of neuroimaging data by modelling important unknown covariates: Application to Epigenome-Wide Association Studies.

    PubMed

    Guillaume, Bryan; Wang, Changqing; Poh, Joann; Shen, Mo Jun; Ong, Mei Lyn; Tan, Pei Fang; Karnani, Neerja; Meaney, Michael; Qiu, Anqi

    2018-06-01

    Statistical inference on neuroimaging data is often conducted using a mass-univariate model, equivalent to fitting a linear model at every voxel with a known set of covariates. Due to the large number of linear models, it is challenging to check if the selection of covariates is appropriate and to modify this selection adequately. The use of standard diagnostics, such as residual plotting, is clearly not practical for neuroimaging data. However, the selection of covariates is crucial for linear regression to ensure valid statistical inference. In particular, the mean model of regression needs to be reasonably well specified. Unfortunately, this issue is often overlooked in the field of neuroimaging. This study aims to adopt the existing Confounder Adjusted Testing and Estimation (CATE) approach and to extend it for use with neuroimaging data. We propose a modification of CATE that can yield valid statistical inferences using Principal Component Analysis (PCA) estimators instead of Maximum Likelihood (ML) estimators. We then propose a non-parametric hypothesis testing procedure that can improve upon parametric testing. Monte Carlo simulations show that the modification of CATE allows for more accurate modelling of neuroimaging data and can in turn yield a better control of False Positive Rate (FPR) and Family-Wise Error Rate (FWER). We demonstrate its application to an Epigenome-Wide Association Study (EWAS) on neonatal brain imaging and umbilical cord DNA methylation data obtained as part of a longitudinal cohort study. Software for this CATE study is freely available at http://www.bioeng.nus.edu.sg/cfa/Imaging_Genetics2.html. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.

  20. Random Forests for Global and Regional Crop Yield Predictions.

    PubMed

    Jeong, Jig Han; Resop, Jonathan P; Mueller, Nathaniel D; Fleisher, David H; Yun, Kyungdahm; Butler, Ethan E; Timlin, Dennis J; Shim, Kyo-Moon; Gerber, James S; Reddy, Vangimalla R; Kim, Soo-Hyung

    2016-01-01

    Accurate predictions of crop yield are critical for developing effective agricultural and food policies at the regional and global scales. We evaluated a machine-learning method, Random Forests (RF), for its ability to predict crop yield responses to climate and biophysical variables at global and regional scales in wheat, maize, and potato in comparison with multiple linear regressions (MLR) serving as a benchmark. We used crop yield data from various sources and regions for model training and testing: 1) gridded global wheat grain yield, 2) maize grain yield from US counties over thirty years, and 3) potato tuber and maize silage yield from the northeastern seaboard region. RF was found highly capable of predicting crop yields and outperformed MLR benchmarks in all performance statistics that were compared. For example, the root mean square errors (RMSE) ranged between 6 and 14% of the average observed yield with RF models in all test cases whereas these values ranged from 14% to 49% for MLR models. Our results show that RF is an effective and versatile machine-learning method for crop yield predictions at regional and global scales for its high accuracy and precision, ease of use, and utility in data analysis. RF may result in a loss of accuracy when predicting the extreme ends or responses beyond the boundaries of the training data.
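The paper's headline statistic, RMSE expressed as a percentage of the average observed yield, is easy to state precisely. A sketch with invented yields (not the wheat, maize, or potato data):

```python
import numpy as np

def rmse_percent_of_mean(observed, predicted):
    """Root mean square error as a percentage of the mean observed yield."""
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return 100.0 * rmse / np.mean(observed)

# Invented test-set yields (t/ha) and predictions from two hypothetical models.
observed = np.array([3.0, 4.5, 5.0, 6.5, 7.0])
rf_pred = np.array([3.2, 4.4, 5.3, 6.2, 7.1])   # small errors (RF-like)
mlr_pred = np.array([4.0, 3.5, 6.0, 5.5, 8.0])  # larger errors (MLR-like)

rf_score = rmse_percent_of_mean(observed, rf_pred)
mlr_score = rmse_percent_of_mean(observed, mlr_pred)
```

The lower the percentage, the better; the paper reports 6-14% for Random Forests against 14-49% for the multiple linear regression benchmark.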

  1. Use of empirical likelihood to calibrate auxiliary information in partly linear monotone regression models.

    PubMed

    Chen, Baojiang; Qin, Jing

    2014-05-10

In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. The response may depend on a covariate not directly but through some function of it. If one has no knowledge of this functional form but expects it to be monotonically increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work, as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for the isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
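The PAVA step that this paper builds on is short enough to sketch directly. This is a generic uniform-weight pool-adjacent-violators implementation for a nondecreasing fit, not the authors' empirical-likelihood extension:

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least-squares nondecreasing fit to y."""
    means, counts = [], []
    for v in map(float, y):
        means.append(v)
        counts.append(1)
        # Merge adjacent blocks while the monotonicity constraint is violated.
        while len(means) > 1 and means[-2] > means[-1]:
            total = counts[-2] + counts[-1]
            merged = (means[-2] * counts[-2] + means[-1] * counts[-1]) / total
            means[-2:] = [merged]
            counts[-2:] = [total]
    return np.repeat(means, counts)

fit = pava([1.0, 3.0, 2.0, 4.0, 5.0, 4.5])
```

Here the two violating pairs (3, 2) and (5, 4.5) are pooled to their block means, leaving a nondecreasing step function.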

  2. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.

    2015-05-01

The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data is calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py), and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models, using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05, and all techniques except kNN produced statistically indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
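The model-comparison step described here, pairing per-sample prediction errors and applying a Wilcoxon signed-rank test, can be sketched with `scipy.stats.wilcoxon`. The error vectors below are synthetic stand-ins for the PRESS-based comparisons, not LIBS results:

```python
import numpy as np
from scipy.stats import wilcoxon

# Synthetic per-sample squared prediction errors for two regression models;
# model B is constructed to be consistently worse than model A.
rng = np.random.default_rng(3)
errors_a = rng.gamma(2.0, 1.0, size=100)
errors_b = errors_a + rng.normal(0.5, 0.2, size=100)

# Paired, non-normally-distributed errors: the signed-rank test compares
# the two models without assuming normality of the differences.
stat, p_value = wilcoxon(errors_a, errors_b)
```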

  3. Estimating yields of salt- and water-stressed forages with remote sensing in the visible and near infrared.

    PubMed

    Poss, J A; Russell, W B; Grieve, C M

    2006-01-01

In arid irrigated regions, the proportion of crop production under deficit irrigation with poorer quality water is increasing as demand for fresh water soars and efforts to prevent saline water table development occur. Remote sensing technology to quantify salinity and water stress effects on forage yield can be an important tool to address yield loss potential when deficit irrigating with poor water quality. Two important forages, alfalfa (Medicago sativa L.) and tall wheatgrass (Agropyron elongatum L.), were grown in a volumetric lysimeter facility where rootzone salinity and water content were varied and monitored. Ground-based hyperspectral canopy reflectance in the visible and near infrared (NIR) was related to forage yields from a broad range of salinity and water stress conditions. Canopy reflectance spectra were obtained in the 350- to 1000-nm region from two viewing angles (nadir view, 45 degrees from nadir). Nadir-view vegetation indices (VI) were not as strongly correlated with leaf area index changes attributed to water and salinity stress treatments for both alfalfa and wheatgrass. From a list of 71 VIs, two were selected for a multiple linear-regression model that estimated yield under varying salinity and water stress conditions. With data obtained during the second harvest of a three-harvest 100-d growing period, regression coefficients for each crop were developed and then used with the model to estimate fresh weights for preceding and succeeding harvests during the same 100-d interval. The model accounted for 72% of the variation in yields in wheatgrass and 94% in yields of alfalfa within the same salinity and water stress treatment period. The model successfully predicted yield in three out of four cases when applied to the first and third harvest yields. Correlations between indices and yield increased as canopy development progressed. Growth reductions attributed to simultaneous salinity and water stress were well characterized, but corrections for effects of varying tissue nitrogen (N) and very low leaf area index (LAI) are necessary.

  4. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.

  5. The impact of furfural concentrations and substrate-to-biomass ratios on biological hydrogen production from synthetic lignocellulosic hydrolysate using mesophilic anaerobic digester sludge.

    PubMed

    Akobi, Chinaza; Hafez, Hisham; Nakhla, George

    2016-12-01

This study evaluated the impact of furfural (a furan derivative) on hydrogen production rates and yields at initial substrate-to-microorganism ratios (S°/X°) of 4, 2, 1, and 0.5 g COD/g VSS and furfural concentrations of 4, 2, 1, and 0.5 g/L. Fermentation studies were carried out in batches using synthetic lignocellulosic hydrolysate as substrate and mesophilic anaerobic digester sludge as seed. Contrary to other literature studies where furfural was inhibitory, this study showed that furfural concentrations of up to 1 g/L enhanced hydrogen production, with yields as high as 19% above the control (batch without furfural). Plots of hydrogen yields against g furfural/g sugars and hydrogen yields versus g furfural/g biomass showed a negative linear correlation, indicating that these parameters influence biohydrogen production. Regression analysis indicated that the initial g furfural/g sugars ratio exerted a greater effect on the degree of inhibition of hydrogen production than the final g furfural/g VSS ratio. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary

    NASA Astrophysics Data System (ADS)

    Gillis, Nicolas; Luce, Robert

    2018-01-01

    A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.

  7. Seeking maximum linearity of transfer functions

    NASA Astrophysics Data System (ADS)

    Silva, Filipi N.; Comin, Cesar H.; Costa, Luciano da F.

    2016-12-01

Linearity is an important and frequently sought property in electronics and instrumentation. Here, we report a method capable of, given a transfer function (theoretical or derived from some real system), identifying its most linear region of operation with a fixed width. This methodology, which is based on least squares regression and systematic consideration of all possible regions, has been illustrated with respect to both an analytical case (a sigmoid transfer function) and a simple situation involving experimental data from a low-power, one-stage class A transistor current amplifier. Such an approach, which has also been applied to transfer functions derived from an experimentally obtained characteristic surface, yielded further contributions such as the estimation of local constants of the device, as opposed to the typically considered average values. The reported method and results pave the way to several further applications in other types of devices and systems, intelligent control operation, and other areas such as identifying regions of power-law behavior.
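The search procedure described, least-squares fits over every fixed-width region, can be sketched generically. The implementation below is ours, not the authors'; the sigmoid mirrors their analytical example, and the most linear window should land at the inflection point:

```python
import numpy as np

def most_linear_window(x, y, width):
    """Start index of the fixed-width window where y(x) is most linear,
    judged by the R^2 of a least-squares line fitted within the window."""
    best_i, best_r2 = 0, -np.inf
    for i in range(len(x) - width + 1):
        xs, ys = x[i:i + width], y[i:i + width]
        slope, intercept = np.polyfit(xs, ys, 1)
        r2 = 1.0 - np.var(ys - (slope * xs + intercept)) / np.var(ys)
        if r2 > best_r2:
            best_i, best_r2 = i, r2
    return best_i

# Sigmoid transfer function: most linear near its inflection at x = 0.
x = np.linspace(-6.0, 6.0, 121)
y = 1.0 / (1.0 + np.exp(-x))
start = most_linear_window(x, y, width=21)
center = x[start + 10]   # midpoint of the winning window
```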

  8. Random regression models using different functions to model test-day milk yield of Brazilian Holstein cows.

    PubMed

    Bignardi, A B; El Faro, L; Torres Júnior, R A A; Cardoso, V L; Machado, P F; Albuquerque, L G

    2011-10-31

    We analyzed 152,145 test-day records from 7317 first lactations of Holstein cows recorded from 1995 to 2003. Our objective was to model variations in test-day milk yield during the first lactation of Holstein cows by random regression model (RRM), using various functions in order to obtain adequate and parsimonious models for the estimation of genetic parameters. Test-day milk yields were grouped into weekly classes of days in milk, ranging from 1 to 44 weeks. The contemporary groups were defined as herd-test-day. The analyses were performed using a single-trait RRM, including the direct additive, permanent environmental and residual random effects. In addition, contemporary group and linear and quadratic effects of the age of cow at calving were included as fixed effects. The mean trend of milk yield was modeled with a fourth-order orthogonal Legendre polynomial. The additive genetic and permanent environmental covariance functions were estimated by random regression on two parametric functions, Ali and Schaeffer and Wilmink, and on B-spline functions of days in milk. The covariance components and the genetic parameters were estimated by the restricted maximum likelihood method. Results from RRM parametric and B-spline functions were compared to RRM on Legendre polynomials and with a multi-trait analysis, using the same data set. Heritability estimates presented similar trends during mid-lactation (13 to 31 weeks) and between week 37 and the end of lactation, for all RRM. Heritabilities obtained by multi-trait analysis were of a lower magnitude than those estimated by RRM. The RRMs with a higher number of parameters were more useful to describe the genetic variation of test-day milk yield throughout the lactation. RRM using B-spline and Legendre polynomials as base functions appears to be the most adequate to describe the covariance structure of the data.
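    Of the parametric curves mentioned above, Wilmink's function is linear in its coefficients once the exponential rate constant is fixed (0.05 per day is a common choice), so a mean test-day curve can be fitted by ordinary least squares. A minimal synthetic sketch, not the study's random regression analysis; the coefficients are hypothetical:

```python
import numpy as np

# Wilmink's curve y(t) = a + b*t + c*exp(-k*t) is linear in (a, b, c)
# once the rate k is fixed, so it can be fitted by ordinary least squares.
t = np.arange(7.0, 305.0, 7.0)                 # days in milk, weekly
true = np.array([30.0, -0.04, -12.0])          # assumed a, b, c
X = np.column_stack([np.ones_like(t), t, np.exp(-0.05 * t)])
y = X @ true                                   # noise-free test-day means
coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # recovers a, b, c
```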

  9. Bivariate least squares linear regression: Towards a unified analytic formalism. I. Functional models

    NASA Astrophysics Data System (ADS)

    Caimmi, R.

    2011-08-01

    Concerning bivariate least squares linear regression, the classical approach pursued for functional models in earlier attempts (York, 1966, 1969) is reviewed using a new formalism in terms of deviation (matrix) traces which, for unweighted data, reduce to the usual quantities leaving aside an unessential (but dimensional) multiplicative factor. Within the framework of classical error models, the dependent variable relates to the independent variable according to the usual additive model. The classes of linear models considered are regression lines in the general case of correlated errors in X and in Y for weighted data, and in the opposite limiting situations of (i) uncorrelated errors in X and in Y, and (ii) completely correlated errors in X and in Y. The special case of (C) generalized orthogonal regression is considered in detail together with well known subcases, namely: (Y) errors in X negligible (ideally null) with respect to errors in Y; (X) errors in Y negligible (ideally null) with respect to errors in X; (O) genuine orthogonal regression; (R) reduced major-axis regression. In the limit of unweighted data, the results determined for functional models are compared with their counterparts for extreme structural models, i.e. where the instrumental scatter is negligible (ideally null) with respect to the intrinsic scatter (Isobe et al., 1990; Feigelson and Babu, 1992). While regression line slope and intercept estimators for functional and structural models necessarily coincide, the contrary holds for the related variance estimators even if the residuals obey a Gaussian distribution, with the exception of Y models. An example of astronomical application is considered, concerning the [O/H]-[Fe/H] empirical relations deduced from five samples related to different stars and/or different methods of oxygen abundance determination. For selected samples and assigned methods, different regression models yield consistent results within the errors (±σ) for both heteroscedastic and homoscedastic data. Conversely, samples related to different methods produce discrepant results, due to the presence of (still undetected) systematic errors, which implies that no definitive statement can be made at present. A comparison is also made between different expressions of the regression line slope and intercept variance estimators; the fractional discrepancies do not exceed a few percent, growing to about 20% for data with large dispersion. An extension of the formalism to structural models is left to a forthcoming paper.
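    Three of the regression variants catalogued above can be compared numerically on synthetic data; the true slope (2.0) and noise level below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
y = 2.0 * x + rng.normal(0.0, 0.5, 500)        # errors in Y only

# (Y) ordinary least squares: errors confined to Y
b_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# (R) reduced major axis: signed ratio of standard deviations
b_rma = np.sign(np.cov(x, y)[0, 1]) * np.std(y, ddof=1) / np.std(x, ddof=1)

# (O) orthogonal (total least squares): the normal vector of the
# best-fit line is the last right singular vector of the centered data
A = np.column_stack([x - x.mean(), y - y.mean()])
_, _, Vt = np.linalg.svd(A, full_matrices=False)
b_tls = -Vt[-1, 0] / Vt[-1, 1]
```

    With errors in Y only, OLS is the appropriate estimator; RMA and TLS differ slightly because they attribute scatter to both axes.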

  10. Experimental paleotemperature equation for planktonic foraminifera

    NASA Astrophysics Data System (ADS)

    Erez, Jonathan; Luz, Boaz

    1983-06-01

    Small live individuals of Globigerinoides sacculifer which were cultured in the laboratory reached maturity and produced gametes. Fifty to ninety percent of their skeleton weight was deposited under controlled water temperature (14° to 30°C) and water isotopic composition, and a correction was made to account for the isotopic composition of the original skeleton using control groups. Comparison of the actual growth temperatures with the calculated temperatures based on paleotemperature equations for inorganic CaCO3 indicates that the foraminifera precipitate their CaCO3 in isotopic equilibrium. Comparison with equations developed for biogenic calcite gives a similarly good fit. Linear regression with Craig's (1965) equation yields: t = -0.07 + 1.01 t̂ (r = 0.95), where t is the actual growth temperature and t̂ is the calculated paleotemperature. The intercept and the slope of this linear equation show that the familiar paleotemperature equation, developed originally for mollusc carbonate, is equally applicable to the planktonic foraminifer G. sacculifer. Second-order regression of the culture temperature on the delta difference (δ18Oc - δ18Ow) yields a correlation coefficient of r = 0.95: t̂ = 17.0 - 4.52(δ18Oc - δ18Ow) + 0.03(δ18Oc - δ18Ow)², where t̂, δ18Oc and δ18Ow are the estimated temperature, the isotopic composition of the shell carbonate, and that of the sea water, respectively. A possible cause of the nonequilibrium isotopic compositions reported earlier for living planktonic foraminifera is improper combustion of the organic matter.
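    The second-order paleotemperature equation quoted in this record can be written directly as a function (coefficients as printed above):

```python
def paleotemperature(d18o_c, d18o_w):
    # Second-order paleotemperature equation from the record above:
    # t_hat = 17.0 - 4.52*(d18Oc - d18Ow) + 0.03*(d18Oc - d18Ow)**2
    d = d18o_c - d18o_w
    return 17.0 - 4.52 * d + 0.03 * d * d
```

    For example, a delta difference of zero gives the 17.0°C intercept, and each per-mil increase in the difference lowers the estimated temperature by roughly 4.5°C near that point.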

  11. Performance and carcass yield of crossbred dairy steers fed diets with different levels of concentrate.

    PubMed

    da Silva, Gabriel Santana; Chaves Véras, Antônia Sherlanea; de Andrade Ferreira, Marcelo; Moreira Dutra, Wilson; Menezes Wanderley Neves, Maria Luciana; Oliveira Souza, Evaristo Jorge; Ramos de Carvalho, Francisco Fernando; de Lima, Dorgival Morais

    2015-10-01

    The objective of this study was to evaluate the influence of diets with increasing concentrate levels (170, 340, 510 and 680 g/kg of total dry matter) on dry matter intake, digestibility, performance and carcass characteristics of 25 Holstein-Zebu crossbred dairy steers in a feedlot. A completely randomized design was used, and data were submitted to analysis of variance and regression. The dry matter intake and digestibility coefficients of all nutrients increased linearly. The total weight gain and average daily gain increased by 1.16 kg and 9.90 g, respectively, for each 10 g/kg increase in concentrate. The empty body weight, hot carcass weight and cold carcass weight responded linearly to increasing concentrate. The hot carcass yield and cold carcass yield, gains in empty body weight and carcass gain were also influenced, as were the efficiencies of carcass deposition and the carcass deposition rate. It is concluded that increasing concentrate levels in feedlot diets increase the intake and digestibility of dry matter and other nutrients, improving the feed efficiency, performance and physical characteristics of the carcass. Furthermore, and importantly for the climate change debate, evidence from the literature indicates that enteric methane production would be reduced at concentrate levels such as those used here.

  12. Parameterizing sorption isotherms using a hybrid global-local fitting procedure.

    PubMed

    Matott, L Shawn; Singh, Anshuman; Rabideau, Alan J

    2017-05-01

    Predictive modeling of the transport and remediation of groundwater contaminants requires an accurate description of the sorption process, which is usually provided by fitting an isotherm model to site-specific laboratory data. Commonly used calibration procedures, listed in order of increasing sophistication, include: trial-and-error, linearization, non-linear regression, global search, and hybrid global-local search. Given the considerable variability in fitting procedures applied in published isotherm studies, we investigated the importance of algorithm selection through a series of numerical experiments involving 13 previously published sorption datasets. These datasets, considered representative of state-of-the-art for isotherm experiments, had been previously analyzed using trial-and-error, linearization, or non-linear regression methods. The isotherm expressions were re-fit using a 3-stage hybrid global-local search procedure (i.e. global search using particle swarm optimization followed by Powell's derivative free local search method and Gauss-Marquardt-Levenberg non-linear regression). The re-fitted expressions were then compared to previously published fits in terms of the optimized weighted sum of squared residuals (WSSR) fitness function, the final estimated parameters, and the influence on contaminant transport predictions - where easily computed concentration-dependent contaminant retardation factors served as a surrogate measure of likely transport behavior. Results suggest that many of the previously published calibrated isotherm parameter sets were local minima. In some cases, the updated hybrid global-local search yielded order-of-magnitude reductions in the fitness function. In particular, of the candidate isotherms, the Polanyi-type models were most likely to benefit from the use of the hybrid fitting procedure. 
In some cases, improvements in fitness function were associated with slight (<10%) changes in parameter values, but in other cases significant (>50%) changes in parameter values were noted. Despite these differences, the influence of isotherm misspecification on contaminant transport predictions was quite variable and difficult to predict from inspection of the isotherms. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Detection of Powdery Mildew in Two Winter Wheat Plant Densities and Prediction of Grain Yield Using Canopy Hyperspectral Reflectance

    PubMed Central

    Cao, Xueren; Luo, Yong; Zhou, Yilin; Fan, Jieru; Xu, Xiangming; West, Jonathan S.; Duan, Xiayu; Cheng, Dengfa

    2015-01-01

    To determine the influence of plant density and powdery mildew infection on winter wheat and to predict grain yield, hyperspectral canopy reflectance of winter wheat was measured for two plant densities at Feekes growth stage (GS) 10.5.3, 10.5.4, and 11.1 in the 2009–2010 and 2010–2011 seasons. Reflectance in near infrared (NIR) regions was significantly correlated with disease index at GS 10.5.3, 10.5.4, and 11.1 at two plant densities in both seasons. For the two plant densities, the area of the red edge peak (Σdr 680–760 nm), difference vegetation index (DVI), and triangular vegetation index (TVI) were significantly negatively correlated with disease index at three GSs in two seasons. Compared with the other parameters, Σdr 680–760 nm was the most sensitive for detecting powdery mildew. Linear regression models relating mildew severity to Σdr 680–760 nm were constructed at three GSs in two seasons for the two plant densities, demonstrating no significant difference in the slope estimates between the two plant densities at three GSs. Σdr 680–760 nm was correlated with grain yield at three GSs in two seasons. The accuracies of partial least square regression (PLSR) models were consistently higher than those of models based on Σdr 680–760 nm for disease index and grain yield. PLSR can, therefore, provide more accurate estimation of the disease index of wheat powdery mildew and of grain yield using canopy reflectance. PMID:25815468

  14. Total suspended solids concentrations and yields for water-quality monitoring stations in Gwinnett County, Georgia, 1996-2009

    USGS Publications Warehouse

    Landers, Mark N.

    2013-01-01

    The U.S. Geological Survey, in cooperation with the Gwinnett County Department of Water Resources, established a water-quality monitoring program during late 1996 to collect comprehensive, consistent, high-quality data for use by watershed managers. As of 2009, continuous streamflow and water-quality data as well as discrete water-quality samples were being collected for 14 watershed monitoring stations in Gwinnett County. This report provides statistical summaries of total suspended solids (TSS) concentrations for 730 stormflow and 710 base-flow water-quality samples collected between 1996 and 2009 for 14 watershed monitoring stations in Gwinnett County. Annual yields of TSS were estimated for each of the 14 watersheds using methods described in previous studies. TSS yield was estimated using linear, ordinary least-squares regression of TSS and explanatory variables of discharge, turbidity, season, date, and flow condition. The error of prediction for estimated yields ranged from 1 to 42 percent for the stations in this report; however, the actual overall uncertainty of the estimated yields cannot be less than that of the observed yields (± 15 to 20 percent). These watershed yields provide a basis for evaluation of how watershed characteristics, climate, and watershed management practices affect suspended sediment yield.
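    An OLS yield model of the kind described above can be sketched as a log-linear rating relation with seasonal harmonic terms; all variables and coefficients below are synthetic illustrations, not Gwinnett County values:

```python
import numpy as np

# Hypothetical sketch: regress log(TSS load) on log(discharge) plus
# seasonal harmonic terms by ordinary least squares.
rng = np.random.default_rng(1)
n = 200
q = rng.lognormal(2.0, 0.8, n)                  # discharge
doy = rng.uniform(0.0, 365.0, n)                # day of year (season)
log_load = (0.5 + 1.4 * np.log(q)
            + 0.2 * np.sin(2 * np.pi * doy / 365)
            + rng.normal(0.0, 0.1, n))          # synthetic "observations"
X = np.column_stack([np.ones(n), np.log(q),
                     np.sin(2 * np.pi * doy / 365),
                     np.cos(2 * np.pi * doy / 365)])
beta, *_ = np.linalg.lstsq(X, log_load, rcond=None)
```

    Annual yields would then be obtained by summing the back-transformed predicted loads over a year, with a bias correction for the log transform.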

  15. Pseudo-second order models for the adsorption of safranin onto activated carbon: comparison of linear and non-linear regression methods.

    PubMed

    Kumar, K Vasanth

    2007-04-02

    Kinetic experiments were carried out for the sorption of safranin onto activated carbon particles. The kinetic data were fitted to the pseudo-second order models of Ho, of Sobkowsk and Czerwinski, of Blanchard et al. and of Ritchie by linear and non-linear regression methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowsk and Czerwinski and the Ritchie pseudo-second order models were the same. Non-linear regression analysis showed that Blanchard et al. and Ho expressed similar ideas on the pseudo-second order model but with different assumptions. The best fit of the experimental data to Ho's pseudo-second order expression by both linear and non-linear regression showed that it was a better kinetic expression than the other pseudo-second order kinetic expressions.
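    Ho's pseudo-second order model and its standard linearization can be demonstrated on noise-free synthetic data, where the straight-line fit recovers the parameters exactly; the qe and k values are illustrative, not fitted to safranin data:

```python
import numpy as np

# Ho's model q(t) = qe**2*k*t / (1 + qe*k*t) has the linearized form
# t/q = 1/(k*qe**2) + t/qe, so qe and k can be read off a line fit.
qe_true, k_true = 50.0, 0.002
t = np.linspace(1.0, 120.0, 40)
q = qe_true**2 * k_true * t / (1.0 + qe_true * k_true * t)

slope, intercept = np.polyfit(t, t / q, 1)
qe = 1.0 / slope                   # recovers 50.0
k = 1.0 / (intercept * qe**2)      # recovers 0.002
```

    With noisy data the linearization distorts the error structure (dividing t by q), which is why the record above finds non-linear regression preferable for parameter estimation.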

  16. Genomic prediction based on data from three layer lines using non-linear regression models.

    PubMed

    Huang, Heyun; Windig, Jack J; Vereijken, Addie; Calus, Mario P L

    2014-11-06

    Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. 
This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.
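    A minimal stand-in for the non-linear RBF kernel models discussed above is kernel ridge regression; the inputs, kernel bandwidth, and penalty below are assumptions for illustration, not the study's GBLUP setup or genomic data:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))                    # stand-in "genotypes"
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=100) # non-linear phenotype

def rbf_kernel(A, B, gamma=0.5):
    # Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

lam = 1.0                                        # ridge penalty
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
pred = K @ alpha                                 # fitted values
```

    Replacing the RBF kernel with the linear kernel X @ X.T turns the same code into linear ridge regression, which is the kind of comparison the study makes.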

  17. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    PubMed

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  18. Genetic evaluation of lactation persistency for five breeds of dairy cattle.

    PubMed

    Cole, J B; Null, D J

    2009-05-01

    Cows with high lactation persistency tend to produce less milk than expected at the beginning of lactation and more than expected at the end. Best prediction of lactation persistency is calculated as a function of trait-specific standard lactation curves and linear regressions of test-day deviations on days in milk. Because the regression coefficients are deviations from a tipping point selected to make yield and lactation persistency phenotypically uncorrelated, it should be possible to use 305-d actual yield and lactation persistency to predict yield for lactations with later endpoints. The objectives of this study were to calculate (co)variance components and breeding values for best predictions of lactation persistency of milk (PM), fat (PF), protein (PP), and somatic cell score (PSCS) in breeds other than Holstein, and to demonstrate the calculation of prediction equations for 400-d actual milk yield. Data included lactations from Ayrshire, Brown Swiss, Guernsey (GU), Jersey (JE), and Milking Shorthorn (MS) cows calving since 1997. The number of sires evaluated ranged from 86 (MS) to 3,192 (JE), and mean sire estimated breeding value for PM ranged from 0.001 (Ayrshire) to 0.10 (Brown Swiss); mean estimated breeding value for PSCS ranged from -0.01 (MS) to -0.043 (JE). Heritabilities were generally highest for PM (0.09 to 0.15) and lowest for PSCS (0.03 to 0.06), with PF and PP having intermediate values (0.07 to 0.13). Repeatabilities varied considerably between breeds, ranging from 0.08 (PSCS in GU, JE, and MS) to 0.28 (PM in GU). Genetic correlations of PM, PF, and PP with PSCS were moderate and favorable (negative), indicating that increasing lactation persistency of yield traits is associated with decreases in lactation persistency of SCS, as expected. Genetic correlations among yield and lactation persistency were low to moderate and ranged from -0.55 (PP in GU) to 0.40 (PP in MS). 
Prediction equations for 400-d milk yield were calculated for each breed by regression of both 305-d yield and 305-d yield and lactation persistency on 400-d yield. Goodness-of-fit was very good for both models, but the addition of lactation persistency to the model significantly improved fit in all cases. Routine genetic evaluations for lactation persistency, as well as the development of prediction equations for several lactation end-points, may provide producers with tools to better manage their herds.

  19. Linear regression crash prediction models : issues and proposed solutions.

    DOT National Transportation Integrated Search

    2010-05-01

    The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novel data aggregation to satisfy linear regression assumptions; namely error structure normality ...

  20. Comparison between Linear and Nonlinear Regression in a Laboratory Heat Transfer Experiment

    ERIC Educational Resources Information Center

    Gonçalves, Carine Messias; Schwaab, Marcio; Pinto, José Carlos

    2013-01-01

    In order to interpret laboratory experimental data, undergraduate students are accustomed to performing linear regression through linearized versions of nonlinear models. However, the use of linearized models can lead to statistically biased parameter estimates. Even so, it is not an easy task to introduce nonlinear regression and show for the students…

  1. Climatically driven yield variability of major crops in Khakassia (South Siberia)

    NASA Astrophysics Data System (ADS)

    Babushkina, Elena A.; Belokopytova, Liliana V.; Zhirnova, Dina F.; Shah, Santosh K.; Kostyakova, Tatiana V.

    2018-06-01

    We investigated the variability of the yield of the three main crops in the Khakassia Republic: spring wheat, spring barley, and oats. In terms of yield values, variability characteristics, and climatic response, the agricultural territory of Khakassia can be divided into three zones: (1) the Northern Zone, where crop yield has a high positive response to the amount of precipitation in May-July and a moderately negative one to the temperatures of the same period; (2) the Central Zone, where crop yield depends mainly on temperatures; and (3) the Southern Zone, where climate has the least expressed impact on yield. The dominant pattern in crop yield is caused by water stress during periods of high temperatures and low moisture supply, with heat stress as an additional factor. Differences between zones are due to combinations of the latitudinal temperature gradient, the altitudinal precipitation gradient, and the presence of a well-developed hydrological network and irrigation system as moisture sources in the Central Zone. More detailed analysis shows differences in the climatic sensitivity of crops during the phases of their vegetative growth and grain development and, to a lesser extent, during the harvesting period. Multifactor linear regression models were constructed to estimate the climate- and autocorrelation-induced variability of crop yield. These models predict a possible yield decrease of at least 2-11% in the next decade due to increasing regional summer temperatures.
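    A multifactor regression with a lagged-yield (autocorrelation) term, in the spirit of the models above, can be sketched on synthetic data; all coefficients and climate values below are hypothetical, not Khakassian data:

```python
import numpy as np

# Synthetic yield series driven by summer temperature, May-July
# precipitation, and last year's yield (autocorrelation term).
rng = np.random.default_rng(3)
n = 40
temp = rng.normal(18.0, 2.0, n)       # hypothetical summer temperature
prec = rng.normal(150.0, 30.0, n)     # hypothetical precipitation
y = np.empty(n)
y[0] = 12.0
for i in range(1, n):
    y[i] = (5.0 - 0.3 * temp[i] + 0.05 * prec[i]
            + 0.4 * y[i - 1] + rng.normal(0.0, 0.5))

# OLS fit of yield on intercept, temperature, precipitation, lagged yield
X = np.column_stack([np.ones(n - 1), temp[1:], prec[1:], y[:-1]])
beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
# expected signs: negative for temperature, positive for precipitation
```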

  2. Climatically driven yield variability of major crops in Khakassia (South Siberia)

    NASA Astrophysics Data System (ADS)

    Babushkina, Elena A.; Belokopytova, Liliana V.; Zhirnova, Dina F.; Shah, Santosh K.; Kostyakova, Tatiana V.

    2017-12-01

    We investigated the variability of the yield of the three main crops in the Khakassia Republic: spring wheat, spring barley, and oats. In terms of yield values, variability characteristics, and climatic response, the agricultural territory of Khakassia can be divided into three zones: (1) the Northern Zone, where crop yield has a high positive response to the amount of precipitation in May-July and a moderately negative one to the temperatures of the same period; (2) the Central Zone, where crop yield depends mainly on temperatures; and (3) the Southern Zone, where climate has the least expressed impact on yield. The dominant pattern in crop yield is caused by water stress during periods of high temperatures and low moisture supply, with heat stress as an additional factor. Differences between zones are due to combinations of the latitudinal temperature gradient, the altitudinal precipitation gradient, and the presence of a well-developed hydrological network and irrigation system as moisture sources in the Central Zone. More detailed analysis shows differences in the climatic sensitivity of crops during the phases of their vegetative growth and grain development and, to a lesser extent, during the harvesting period. Multifactor linear regression models were constructed to estimate the climate- and autocorrelation-induced variability of crop yield. These models predict a possible yield decrease of at least 2-11% in the next decade due to increasing regional summer temperatures.

  3. An overall strategy based on regression models to estimate relative survival and model the effects of prognostic factors in cancer survival studies.

    PubMed

    Remontet, L; Bossard, N; Belot, A; Estève, J

    2007-05-10

    Relative survival provides a measure of the proportion of patients dying from the disease under study without requiring the knowledge of the cause of death. We propose an overall strategy based on regression models to estimate the relative survival and model the effects of potential prognostic factors. The baseline hazard was modelled until 10 years follow-up using parametric continuous functions. Six models including cubic regression splines were considered and the Akaike Information Criterion was used to select the final model. This approach yielded smooth and reliable estimates of mortality hazard and allowed us to deal with sparse data taking into account all the available information. Splines were also used to model simultaneously non-linear effects of continuous covariates and time-dependent hazard ratios. This led to a graphical representation of the hazard ratio that can be useful for clinical interpretation. Estimates of these models were obtained by likelihood maximization. We showed that these estimates could be also obtained using standard algorithms for Poisson regression. Copyright 2006 John Wiley & Sons, Ltd.

  4. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  5. A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age.

    PubMed

    Wilke, Marko

    2018-02-01

    This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.

  6. Advantage of multiple spot urine collections for estimating daily sodium excretion: comparison with two 24-h urine collections as reference.

    PubMed

    Uechi, Ken; Asakura, Keiko; Ri, Yui; Masayasu, Shizuko; Sasaki, Satoshi

    2016-02-01

    Several estimation methods for 24-h sodium excretion using spot urine samples have been reported, but accurate estimation at the individual level remains difficult. We aimed to clarify the most accurate method of estimating 24-h sodium excretion with different numbers of available spot urine samples. A total of 370 participants from throughout Japan collected multiple 24-h urine and spot urine samples independently. Participants were allocated randomly into a development and a validation dataset. Two estimation methods were established in the development dataset using the two 24-h sodium excretion samples as reference: the 'simple mean method' estimated by multiplying the sodium-creatinine ratio by predicted 24-h creatinine excretion, whereas the 'regression method' employed linear regression analysis. The accuracy of the two methods was examined by comparing the estimated means and concordance correlation coefficients (CCC) in the validation dataset. Mean sodium excretion by the simple mean method with three spot urine samples was closest to that by 24-h collection (difference: -1.62 mmol/day). CCC with the simple mean method increased with an increased number of spot urine samples at 0.20, 0.31, and 0.42 using one, two, and three samples, respectively. This method with three spot urine samples yielded higher CCC than the regression method (0.40). When only one spot urine sample was available for each study participant, CCC was higher with the regression method (0.36). The simple mean method with three spot urine samples yielded the most accurate estimates of sodium excretion. When only one spot urine sample was available, the regression method was preferable.
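    The 'simple mean method' described above reduces to a one-line computation; the sample values and the predicted creatinine excretion below are placeholders, not the study's data:

```python
import numpy as np

def simple_mean_estimate(spot_na, spot_cr, predicted_cr_24h):
    # 'Simple mean method' sketch: each spot sample's sodium-creatinine
    # ratio is multiplied by the predicted 24-h creatinine excretion,
    # and the products are averaged over the available spot samples.
    ratios = np.asarray(spot_na, dtype=float) / np.asarray(spot_cr, dtype=float)
    return float(np.mean(ratios * predicted_cr_24h))

# three hypothetical spot samples, predicted 24-h creatinine set to 1.0
est = simple_mean_estimate([100.0, 120.0, 110.0], [1.0, 1.0, 1.0], 1.0)
```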

  7. Transmission of linear regression patterns between time series: From relationship in time series to complex networks

    NASA Astrophysics Data System (ADS)

    Gao, Xiangyun; An, Haizhong; Fang, Wei; Huang, Xuan; Li, Huajiao; Zhong, Weiqiong; Ding, Yinghui

    2014-07-01

The linear regression parameters between two time series can be different under different lengths of observation period. If we study the whole period by the sliding window of a short period, the change of the linear regression parameters is a process of dynamic transmission over time. We present a simple and efficient computational scheme, a linear regression pattern transmission algorithm, which transforms linear regression patterns into directed and weighted networks. The linear regression patterns (nodes) are defined by the combination of intervals of the linear regression parameters and the results of the significance testing under different sizes of the sliding window. The transmissions between adjacent patterns are defined as edges, and the weights of the edges are the frequencies of the transmissions. The major patterns, the distance, and the medium in the process of the transmission can be captured. The statistical results of weighted out-degree and betweenness centrality are mapped on timelines, which shows the features of the distribution of the results. Many measurements in different areas that involve two related time series variables could take advantage of this algorithm to characterize the dynamic relationships between the time series from a new perspective.
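The core idea, sliding-window fits whose discretized parameters become nodes and whose transitions become weighted directed edges, can be sketched as follows. The slope binning and window settings here are illustrative assumptions, not the authors' choices:

```python
import numpy as np
from collections import Counter

def pattern_network(x, y, window=20, step=1, slope_bins=(-np.inf, 0.0, np.inf)):
    """Fit y on x in sliding windows, discretize each window's slope
    into a pattern label, and count transitions between the labels of
    adjacent windows as weighted directed edges (a simplified sketch
    of the transmission algorithm described above)."""
    labels = []
    for start in range(0, len(x) - window + 1, step):
        slope, _ = np.polyfit(x[start:start + window], y[start:start + window], 1)
        labels.append(int(np.digitize(slope, slope_bins[1:-1])))
    # edge weights = frequency of transitions between adjacent patterns
    edges = Counter(zip(labels[:-1], labels[1:]))
    return labels, dict(edges)
```

A full implementation would also bin the intercept and the significance-test result into the pattern definition; only the slope is used here for brevity.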

  9. Echocardiographic Linear Dimensions for Assessment of Right Ventricular Chamber Volume as Demonstrated by Cardiac Magnetic Resonance

    PubMed Central

    Kim, Jiwon; Srinivasan, Aparna; Garcia, Tania S.; Franco, Antonino Di; Peskin, Charles S.; McQueen, David M.; Paul, Tracy K.; Feher, Attila; Geevarghese, Alexi; Rozenstrauch, Meenakshi; Devereux, Richard B.; Weinsaft, Jonathan W.

    2016-01-01

Background Echo-derived linear dimensions offer straightforward indices of right ventricular (RV) structure but have not been systematically compared to RV volumes on cardiac magnetic resonance (CMR). Methods Echo and CMR were interpreted among CAD patients imaged via prospective (90%) or retrospective (10%) registries. For echo, American Society of Echocardiography (ASE) recommended RV dimensions were measured in apical 4-chamber (basal RV width, mid RV width, RV length), parasternal long (proximal RV outflow tract [pRVOT]) and short axis (distal RVOT) views. For CMR, RV end-diastolic (RV-EDV) and end-systolic (RV-ESV) volumes were quantified via border planimetry. Results 272 patients underwent echo and CMR within a narrow interval (0.4±1.0 days); complete acquisition of all ASE dimensions was feasible in 98%. All echo dimensions differed between patients with and without RV dilation on CMR (p<0.05). Basal RV width (r=0.70), pRVOT width (r=0.68), and RV length (r=0.61) yielded the highest correlations with RV-EDV on CMR; end-systolic dimensions yielded similar correlations (r=0.68, 0.66, 0.65 respectively). In multivariable regression, basal RV width (regression coefficient 1.96 per mm [CI 1.22–2.70], p<0.001), RV length (0.97 [0.56–1.37], p<0.001) and pRVOT width (2.62 [1.79–3.44], p<0.001) were independently associated with CMR RV-EDV [r=0.80]. RV-ESV was similarly associated with echo dimensions (basal RV width: 1.59 per mm [CI 1.06–2.13], p<0.001; RV length: 1.00 [0.66–1.34], p<0.001; pRVOT width: 1.80 [1.22–2.39], p<0.001) [r=0.79]. Conclusions RV linear dimensions provide readily obtainable markers of RV chamber size. Proximal RVOT and basal width are independently associated with CMR volumes, supporting use of multiple linear dimensions when assessing RV size on echo. PMID:27297619

  10. Comparison of random regression models with Legendre polynomials and linear splines for production traits and somatic cell score of Canadian Holstein cows.

    PubMed

    Bohmanova, J; Miglior, F; Jamrozik, J; Misztal, I; Sullivan, P G

    2008-09-01

    A random regression model with both random and fixed regressions fitted by Legendre polynomials of order 4 was compared with 3 alternative models fitting linear splines with 4, 5, or 6 knots. The effects common for all models were a herd-test-date effect, fixed regressions on days in milk (DIM) nested within region-age-season of calving class, and random regressions for additive genetic and permanent environmental effects. Data were test-day milk, fat and protein yields, and SCS recorded from 5 to 365 DIM during the first 3 lactations of Canadian Holstein cows. A random sample of 50 herds consisting of 96,756 test-day records was generated to estimate variance components within a Bayesian framework via Gibbs sampling. Two sets of genetic evaluations were subsequently carried out to investigate performance of the 4 models. Models were compared by graphical inspection of variance functions, goodness of fit, error of prediction of breeding values, and stability of estimated breeding values. Models with splines gave lower estimates of variances at extremes of lactations than the model with Legendre polynomials. Differences among models in goodness of fit measured by percentages of squared bias, correlations between predicted and observed records, and residual variances were small. The deviance information criterion favored the spline model with 6 knots. Smaller error of prediction and higher stability of estimated breeding values were achieved by using spline models with 5 and 6 knots compared with the model with Legendre polynomials. In general, the spline model with 6 knots had the best overall performance based upon the considered model comparison criteria.
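The two kinds of covariables being compared, Legendre polynomials and linear splines over days in milk (DIM), can be illustrated with a small sketch. The knot placement and the omission of the Legendre normalization constants are simplifications for illustration, not the paper's exact parameterization:

```python
import numpy as np

def legendre_basis(dim, order=4, dim_min=5, dim_max=365):
    """Legendre covariables of order 0..order evaluated at days in
    milk standardized to [-1, 1], as used for random regression
    curves (normalization constants omitted for simplicity)."""
    t = 2.0 * (np.asarray(dim, float) - dim_min) / (dim_max - dim_min) - 1.0
    return np.stack([np.polynomial.legendre.legval(t, np.eye(order + 1)[k])
                     for k in range(order + 1)], axis=-1)

def linear_spline_basis(dim, knots=(5, 50, 150, 250, 365)):
    """Linear-spline covariables: piecewise-linear 'hat' weights on
    the two knots bracketing each DIM (weights sum to 1)."""
    dim = np.asarray(dim, float)
    B = np.zeros(dim.shape + (len(knots),))
    for i, d in np.ndenumerate(dim):
        j = np.searchsorted(knots, d, side='right') - 1
        j = min(max(j, 0), len(knots) - 2)
        w = (d - knots[j]) / (knots[j + 1] - knots[j])
        B[i + (j,)], B[i + (j + 1,)] = 1 - w, w
    return B
```

The spline covariables are zero except at the two bracketing knots, which is why spline models behave better at the extremes of lactation: an observation at DIM 10 cannot pull the curve at DIM 300.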

  11. Array of Hall Effect Sensors for Linear Positioning of a Magnet Independently of Its Strength Variation. A Case Study: Monitoring Milk Yield during Milking in Goats

    PubMed Central

    García-Diego, Fernando-Juan; Sánchez-Quinche, Angel; Merello, Paloma; Beltrán, Pedro; Peris, Cristófol

    2013-01-01

In this study we propose an electronic system for linear positioning of a magnet independent of its modulus, which could vary because of aging, differences in the fabrication process, etc. The system comprises a linear array of 24 Hall Effect sensors of proportional response. The data from all sensors undergo a pretreatment (normalization) by row (position), making them independent of temporal variation in the magnet's field strength. We analyze the particular case of individual milk flow during the milking of goats. The multiple regression analysis allowed us to calibrate the electronic system with a percentage of explanation R2 = 99.96%. In our case, the uncertainty in the linear position of the magnet is 0.51 mm, which represents 0.019 L of goat milk. The on-farm test compared the results obtained by direct reading of the volume with those obtained by the proposed calibrated electronic system, achieving a percentage of explanation of 99.05%.


  12. Advances in simultaneous atmospheric profile and cloud parameter regression based retrieval from high-spectral resolution radiance measurements

    NASA Astrophysics Data System (ADS)

    Weisz, Elisabeth; Smith, William L.; Smith, Nadia

    2013-06-01

    The dual-regression (DR) method retrieves information about the Earth surface and vertical atmospheric conditions from measurements made by any high-spectral resolution infrared sounder in space. The retrieved information includes temperature and atmospheric gases (such as water vapor, ozone, and carbon species) as well as surface and cloud top parameters. The algorithm was designed to produce a high-quality product with low latency and has been demonstrated to yield accurate results in real-time environments. The speed of the retrieval is achieved through linear regression, while accuracy is achieved through a series of classification schemes and decision-making steps. These steps are necessary to account for the nonlinearity of hyperspectral retrievals. In this work, we detail the key steps that have been developed in the DR method to advance accuracy in the retrieval of nonlinear parameters, specifically cloud top pressure. The steps and their impact on retrieval results are discussed in-depth and illustrated through relevant case studies. In addition to discussing and demonstrating advances made in addressing nonlinearity in a linear geophysical retrieval method, advances toward multi-instrument geophysical analysis by applying the DR to three different operational sounders in polar orbit are also noted. For any area on the globe, the DR method achieves consistent accuracy and precision, making it potentially very valuable to both the meteorological and environmental user communities.

  13. Stratospheric Ozone Trends and Variability as Seen by SCIAMACHY from 2002 to 2012

    NASA Technical Reports Server (NTRS)

    Gebhardt, C.; Rozanov, A.; Hommel, R.; Weber, M.; Bovensmann, H.; Burrows, J. P.; Degenstein, D.; Froidevaux, L.; Thompson, A. M.

    2014-01-01

Vertical profiles of the rate of linear change (trend) in the altitude range 15-50 km are determined from decadal O3 time series obtained from SCIAMACHY/ENVISAT measurements in limb-viewing geometry. The trends are calculated by using a multivariate linear regression. Seasonal variations, the quasi-biennial oscillation, signatures of the solar cycle and the El Niño-Southern Oscillation are accounted for in the regression. The time range of trend calculation is August 2002-April 2012. The analysis focuses on the zonal bands of 20 deg N - 20 deg S (tropics), 60 - 50 deg N, and 50 - 60 deg S (midlatitudes). In the tropics, positive trends of up to 5% per decade between 20 and 30 km and negative trends of up to 10% per decade between 30 and 38 km are identified. Positive O3 trends of around 5% per decade are found in the upper stratosphere in the tropics and at midlatitudes. Comparisons between SCIAMACHY and EOS MLS show reasonable agreement both in the tropics and at midlatitudes for most altitudes. In the tropics, measurements from OSIRIS/Odin and SHADOZ are also analysed. These yield rates of linear change of O3 similar to those from SCIAMACHY. However, the trends from SCIAMACHY near 34 km in the tropics are larger than those from MLS and OSIRIS by a factor of around two.
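A multivariate trend regression of this kind can be sketched generically: a design matrix with a constant, a linear trend, seasonal harmonics, and optional proxy regressors (QBO, solar flux, ENSO indices), solved by least squares. This is an illustrative simplification, not the SCIAMACHY processing code:

```python
import numpy as np

def trend_percent_per_decade(t_years, o3, proxies=()):
    """Regress an O3 time series on a linear trend, annual and
    semi-annual harmonics, and optional proxy index time series;
    return the linear-change rate as percent of the series mean
    per decade."""
    t = np.asarray(t_years, float)
    cols = [np.ones_like(t), t]
    for period in (1.0, 0.5):            # annual and semi-annual cycles
        cols += [np.sin(2 * np.pi * t / period), np.cos(2 * np.pi * t / period)]
    cols += [np.asarray(p, float) for p in proxies]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, np.asarray(o3, float), rcond=None)
    return 10.0 * beta[1] / np.mean(o3) * 100.0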

  14. Least median of squares and iteratively re-weighted least squares as robust linear regression methods for fluorimetric determination of α-lipoic acid in capsules in ideal and non-ideal cases of linearity.

    PubMed

    Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F

    2018-06-01

    This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
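Both robust fitting approaches can be sketched compactly. The LMS search below uses random point pairs (exact LMS is combinatorial), and the IRLS uses Huber weights with a median-absolute-deviation scale; the tuning constants are conventional defaults, and the paper's exact weight function may differ:

```python
import numpy as np

def lms_line(x, y, n_trials=500, seed=0):
    """Least median of squares: try lines through random point pairs
    and keep the one minimizing the median squared residual
    (a simple randomized approximation to exact LMS)."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    best, best_med = (0.0, 0.0), np.inf
    for _ in range(n_trials):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        med = np.median((y - (a + b * x)) ** 2)
        if med < best_med:
            best, best_med = (a, b), med
    return best

def irls_line(x, y, k=1.345, n_iter=50):
    """Iteratively re-weighted least squares with Huber weights and
    a median-absolute-deviation scale estimate."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS start
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745            # robust scale
        s = s if s > 0 else 1.0
        w = np.minimum(1.0, k * s / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return tuple(beta)
```

Either method leaves a calibration line nearly untouched by gross outliers that would drag an ordinary least-squares fit, which is the behavior the abstract exploits for the non-ideal linearity condition.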

  15. Revisiting tests for neglected nonlinearity using artificial neural networks.

    PubMed

    Cho, Jin Seo; Ishida, Isao; White, Halbert

    2011-05-01

    Tests for regression neglected nonlinearity based on artificial neural networks (ANNs) have so far been studied by separately analyzing the two ways in which the null of regression linearity can hold. This implies that the asymptotic behavior of general ANN-based tests for neglected nonlinearity is still an open question. Here we analyze a convenient ANN-based quasi-likelihood ratio statistic for testing neglected nonlinearity, paying careful attention to both components of the null. We derive the asymptotic null distribution under each component separately and analyze their interaction. Somewhat remarkably, it turns out that the previously known asymptotic null distribution for the type 1 case still applies, but under somewhat stronger conditions than previously recognized. We present Monte Carlo experiments corroborating our theoretical results and showing that standard methods can yield misleading inference when our new, stronger regularity conditions are violated.

  16. Synthesis, spectral studies and antimicrobial activities of some 2-naphthyl pyrazoline derivatives

    NASA Astrophysics Data System (ADS)

    Sakthinathan, S. P.; Vanangamudi, G.; Thirunarayanan, G.

A series of 2-naphthyl pyrazolines were synthesized by the cyclization of 2-naphthyl chalcones and phenylhydrazine hydrochloride in the presence of sodium acetate. The yields of pyrazoline derivatives are more than 80%. The synthesized pyrazolines were characterized by their physical constants, IR, 1H, 13C and MS spectra. From the IR and NMR spectra, the C=N stretching frequencies (cm-1), the pyrazoline ring proton chemical shifts δ (ppm) of Hb and Hc, and the carbon chemical shifts δ (ppm) of C=N are correlated with Hammett substituent constants, F and R, and Swain-Lupton's parameters using single and multi-regression analyses. From the results of linear regression analysis, the effect of substituents on the group frequencies has been predicted. The antimicrobial activities of all synthesized pyrazolines have been studied.

  17. Relationship between compatibilizer and yield strength of PLA/PP Blend

    NASA Astrophysics Data System (ADS)

    Jariyakulsith, Pattanun; Puajindanetr, Somchai

    2018-01-01

The aim of this research is to study the relationship between compatibilizer and yield strength of polylactic acid (PLA) and polypropylene (PP) blends. The PLA is blended with PP (PLA/PP) at ratios of 70/30, 50/50 and 30/70. In addition, (1) polypropylene grafted with maleic anhydride (PP-g-MAH) as a compatibilizer at 0.3 and 0.7 parts per hundred of PLA/PP resin (phr) and (2) dicumyl peroxide (DCP) as an initiator at 0.03 and 0.07 phr are added to each composition. Yield strength is characterized to study the interaction between compatibilizer, initiator and yield strength using a multilevel full factorial experimental design. The results show that (1) the yield strength of the PLA/PP blend is increased after addition of the compatibilizer, because adding PP-g-MAH and DCP improves the compatibility between PLA and PP; and (2) there is an interaction between PP-g-MAH and DCP that affects the final properties of the PLA/PP blend. The highest yield strength of 27.68 MPa is obtained at the 70/30 blend ratio using 0.3 phr of PP-g-MAH and 0.03 phr of DCP. A linear regression model is fitted and satisfies the assumption of normally distributed residuals.

  18. Standardization and validation of the body weight adjustment regression equations in Olympic weightlifting.

    PubMed

    Kauhanen, Heikki; Komi, Paavo V; Häkkinen, Keijo

    2002-02-01

    The problems in comparing the performances of Olympic weightlifters arise from the fact that the relationship between body weight and weightlifting results is not linear. In the present study, this relationship was examined by using a nonparametric curve fitting technique of robust locally weighted regression (LOWESS) on relatively large data sets of the weightlifting results made in top international competitions. Power function formulas were derived from the fitted LOWESS values to represent the relationship between the 2 variables in a way that directly compares the snatch, clean-and-jerk, and total weightlifting results of a given athlete with those of the world-class weightlifters (golden standards). A residual analysis of several other parametric models derived from the initial results showed that they all experience inconsistencies, yielding either underestimation or overestimation of certain body weights. In addition, the existing handicapping formulas commonly used in normalizing the performances of Olympic weightlifters did not yield satisfactory results when applied to the present data. It was concluded that the devised formulas may provide objective means for the evaluation of the performances of male weightlifters, regardless of their body weights, ages, or performance levels.
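Fitting a power-function formula of the form result = a · body_weight^b, as derived from the smoothed LOWESS values, reduces to linear regression in log-log space. The sketch below, including the choice of reference body weight for the handicapping score, is illustrative and is not the study's published formula:

```python
import numpy as np

def fit_power_function(body_weight, result):
    """Fit result = a * body_weight**b by linear regression in
    log-log space (the power-function form used to represent the
    smoothed relationship between body weight and lifting result)."""
    lx, ly = np.log(body_weight), np.log(result)
    b, log_a = np.polyfit(lx, ly, 1)
    return np.exp(log_a), b

def handicap_score(lift, body_weight, a, b, reference_weight=105.0):
    """Normalize a lift by the fitted curve: the score is the lift an
    athlete of the (hypothetical) reference body weight would need
    for the same relative performance."""
    expected = a * body_weight ** b
    return lift / expected * (a * reference_weight ** b)
```

Residuals of the log-log fit make the over/underestimation at certain body weights, noted above for the parametric models, easy to inspect.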

  19. Digital Image Restoration Under a Regression Model - The Unconstrained, Linear Equality and Inequality Constrained Approaches

    DTIC Science & Technology

    1974-01-01

DIGITAL IMAGE RESTORATION UNDER A REGRESSION MODEL - THE UNCONSTRAINED, LINEAR EQUALITY AND INEQUALITY CONSTRAINED APPROACHES. January 1974. Nelson Delfino d'Avila Mascarenhas. Report 520. ... a two-dimensional form adequately describes the linear model. A discretization is performed by using quadrature methods. By trans...

  20. Element enrichment factor calculation using grain-size distribution and functional data regression.

    PubMed

    Sierra, C; Ordóñez, C; Saavedra, A; Gallego, J R

    2015-01-01

    In environmental geochemistry studies it is common practice to normalize element concentrations in order to remove the effect of grain size. Linear regression with respect to a particular grain size or conservative element is a widely used method of normalization. In this paper, the utility of functional linear regression, in which the grain-size curve is the independent variable and the concentration of pollutant the dependent variable, is analyzed and applied to detrital sediment. After implementing functional linear regression and classical linear regression models to normalize and calculate enrichment factors, we concluded that the former regression technique has some advantages over the latter. First, functional linear regression directly considers the grain-size distribution of the samples as the explanatory variable. Second, as the regression coefficients are not constant values but functions depending on the grain size, it is easier to comprehend the relationship between grain size and pollutant concentration. Third, regularization can be introduced into the model in order to establish equilibrium between reliability of the data and smoothness of the solutions. Copyright © 2014 Elsevier Ltd. All rights reserved.
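Scalar-on-function regression of the kind described, with the grain-size curve as the functional predictor, can be sketched by expanding the coefficient function in a small basis and adding a ridge penalty for the regularization mentioned above. The basis choice and penalty form below are illustrative assumptions:

```python
import numpy as np

def functional_linear_regression(curves, y, n_basis=4, lam=1e-3):
    """Model concentration y_i as an intercept plus the integral of a
    coefficient function beta(s) against the grain-size curve X_i(s),
    with beta expanded in a polynomial basis and ridge penalty lam.
    Curves are sampled on a common grid of m grain-size points."""
    X = np.asarray(curves, float)              # shape (n, m)
    n, m = X.shape
    s = np.linspace(0.0, 1.0, m)
    B = np.vander(s, n_basis, increasing=True) # basis values, (m, n_basis)
    Z = X @ B / m                              # approximate integrals, (n, n_basis)
    Z = np.column_stack([np.ones(n), Z])       # intercept column
    A = Z.T @ Z + lam * np.eye(n_basis + 1)
    coef = np.linalg.solve(A, Z.T @ np.asarray(y, float))
    return coef[0], B @ coef[1:]               # intercept, beta(s) on the grid
```

The returned beta(s) is itself a function of grain size, which is the interpretability advantage the abstract highlights over a single normalization coefficient.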

  1. [Comparative evaluation of the sensitivity of Acinetobacter to colistin, using the prediffusion and minimum inhibitory concentration methods: detection of heteroresistant isolates].

    PubMed

    Herrera, Melina E; Mobilia, Liliana N; Posse, Graciela R

    2011-01-01

The objective of this study is to perform a comparative evaluation of the prediffusion and minimum inhibitory concentration (MIC) methods for the detection of sensitivity to colistin, and to detect Acinetobacter baumannii-calcoaceticus complex (ABC) heteroresistant isolates to colistin. We studied 75 isolates of ABC recovered from clinically significant samples obtained from various centers. Sensitivity to colistin was determined by prediffusion as well as by MIC. All the isolates were sensitive to colistin, with MIC = 2 µg/ml. The results were analyzed by dispersion graph and linear regression analysis, revealing that the prediffusion method did not correlate with the MIC values for isolates sensitive to colistin (r² = 0.2017). Detection of heteroresistance to colistin was determined by efficiency of plating for all the isolates with the same initial MICs of 2, 1, and 0.5 µg/ml, which identified 14 isolates with increased MICs, greater than 8-fold in some cases. When the sensitivity of these resistant colonies was determined by prediffusion, the resulting dispersion graph and linear regression analysis yielded an r² = 0.604, which revealed a correlation between the methodologies used.

  2. Who Will Win?: Predicting the Presidential Election Using Linear Regression

    ERIC Educational Resources Information Center

    Lamb, John H.

    2007-01-01

    This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
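The paper-and-pencil computation from the derived normal equations can be checked with a few lines of Python; the function name is ours, and the data in the usage note are illustrative, not the election data from the activity:

```python
def least_squares_line(x, y):
    """Least-squares slope and intercept from the closed-form normal
    equations: slope = (n*Sxy - Sx*Sy) / (n*Sxx - Sx**2)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept
```

For the points (0, 1), (1, 3), (2, 5), (3, 7) this returns slope 2 and intercept 1, matching what a TI-83 Plus or a spreadsheet regression reports for the same data.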

  3. A mathematical framework for yield (vs. rate) optimization in constraint-based modeling and applications in metabolic engineering.

    PubMed

    Klamt, Steffen; Müller, Stefan; Regensburger, Georg; Zanghellini, Jürgen

    2018-05-01

    The optimization of metabolic rates (as linear objective functions) represents the methodical core of flux-balance analysis techniques which have become a standard tool for the study of genome-scale metabolic models. Besides (growth and synthesis) rates, metabolic yields are key parameters for the characterization of biochemical transformation processes, especially in the context of biotechnological applications. However, yields are ratios of rates, and hence the optimization of yields (as nonlinear objective functions) under arbitrary linear constraints is not possible with current flux-balance analysis techniques. Despite the fundamental importance of yields in constraint-based modeling, a comprehensive mathematical framework for yield optimization is still missing. We present a mathematical theory that allows one to systematically compute and analyze yield-optimal solutions of metabolic models under arbitrary linear constraints. In particular, we formulate yield optimization as a linear-fractional program. For practical computations, we transform the linear-fractional yield optimization problem to a (higher-dimensional) linear problem. Its solutions determine the solutions of the original problem and can be used to predict yield-optimal flux distributions in genome-scale metabolic models. For the theoretical analysis, we consider the linear-fractional problem directly. Most importantly, we show that the yield-optimal solution set (like the rate-optimal solution set) is determined by (yield-optimal) elementary flux vectors of the underlying metabolic model. However, yield- and rate-optimal solutions may differ from each other, and hence optimal (biomass or product) yields are not necessarily obtained at solutions with optimal (growth or synthesis) rates. Moreover, we discuss phase planes/production envelopes and yield spaces, in particular, we prove that yield spaces are convex and provide algorithms for their computation. 
We illustrate our findings by a small example and demonstrate their relevance for metabolic engineering with realistic models of E. coli. We develop a comprehensive mathematical framework for yield optimization in metabolic models. Our theory is particularly useful for the study and rational modification of cell factories designed under given yield and/or rate requirements. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
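The linear-fractional-to-linear transformation described above can be demonstrated on a toy model via a Charnes-Cooper-type substitution w = v/(dᵀv), t = 1/(dᵀv). The network, bounds, and numbers below are invented for illustration and are not from the paper; note that in this toy case the yield-optimal flux differs from the rate-optimal one, as the abstract emphasizes:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: substrate uptake v_s feeds two pathways,
# v_1 (1 P per S, capacity 2) and v_2 (0.5 P per S), with product
# export v_p.  Maximize the yield v_p / v_s subject to steady state
# S v = 0 and 0 <= v <= ub.
S = np.array([[1.0, -1.0, -1.0, 0.0],    # internal substrate balance
              [0.0, 1.0, 0.5, -1.0]])    # product balance
ub = np.array([10.0, 2.0, 10.0, 20.0])   # upper flux bounds
c = np.array([0.0, 0.0, 0.0, 1.0])       # numerator: product rate
d = np.array([1.0, 0.0, 0.0, 0.0])       # denominator: substrate uptake

n = S.shape[1]
A_eq = np.vstack([np.hstack([S, np.zeros((2, 1))]),   # S w = 0
                  np.append(d, 0.0)])                 # d^T w = 1
b_eq = np.array([0.0, 0.0, 1.0])
A_ub = np.hstack([np.eye(n), -ub[:, None]])           # w <= ub * t
res = linprog(-np.append(c, 0.0), A_ub=A_ub, b_ub=np.zeros(n),
              A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (n + 1))
optimal_yield = -res.fun
flux = res.x[:n] / res.x[n]              # a yield-optimal flux distribution
```

Here maximizing the product *rate* would route flux through the low-yield pathway (yield 0.6), whereas the yield-optimal solution uses only the high-yield pathway (yield 1.0) at a lower rate.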

  4. The microcomputer scientific software series 2: general linear model--regression.

    Treesearch

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kwon, Deukwoo; Little, Mark P.; Miller, Donald L.

Purpose: To determine more accurate regression formulas for estimating peak skin dose (PSD) from reference air kerma (RAK) or kerma-area product (KAP). Methods: After grouping of the data from 21 procedures into 13 clinically similar groups, assessments were made of optimal clustering using the Bayesian information criterion to obtain the optimal linear regressions of (log-transformed) PSD vs RAK, PSD vs KAP, and PSD vs RAK and KAP. Results: Three clusters of clinical groups were optimal in regression of PSD vs RAK, seven clusters of clinical groups were optimal in regression of PSD vs KAP, and six clusters of clinical groups were optimal in regression of PSD vs RAK and KAP. Prediction of PSD using both RAK and KAP is significantly better than prediction of PSD with either RAK or KAP alone. The regression of PSD vs RAK provided better predictions of PSD than the regression of PSD vs KAP. The partial-pooling (clustered) method yields smaller mean squared errors compared with the complete-pooling method. Conclusion: PSD distributions for interventional radiology procedures are log-normal. Estimates of PSD derived from RAK and KAP jointly are most accurate, followed closely by estimates derived from RAK alone. Estimates of PSD derived from KAP alone are the least accurate. Using a stochastic search approach, it is possible to cluster together certain dissimilar types of procedures to minimize the total error sum of squares.
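A regression of log-transformed PSD on log(RAK) and log(KAP), consistent with the log-normal PSD distributions noted above, can be sketched as follows. This is a generic illustration, not the published cluster-specific formulas:

```python
import numpy as np

def fit_log_psd_model(rak, kap, psd):
    """OLS of log(PSD) on log(RAK) and log(KAP); returns coefficients
    (intercept, b_rak, b_kap) of the log-linear model."""
    X = np.column_stack([np.ones(len(rak)), np.log(rak), np.log(kap)])
    beta, *_ = np.linalg.lstsq(X, np.log(psd), rcond=None)
    return beta

def predict_psd(beta, rak, kap):
    """Back-transform the log-linear prediction to the dose scale."""
    return np.exp(beta[0] + beta[1] * np.log(rak) + beta[2] * np.log(kap))
```

Dropping the log(KAP) column gives the RAK-only model; refitting within procedure clusters reproduces the partial-pooling idea.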

  6. Does the high–tech industry consistently reduce CO{sub 2} emissions? Results from nonparametric additive regression model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Bin; Research Center of Applied Statistics, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013; Lin, Boqiang, E-mail: bqlin@xmu.edu.cn

China is currently the world's largest carbon dioxide (CO{sub 2}) emitter. Moreover, total energy consumption and CO{sub 2} emissions in China will continue to increase due to the rapid growth of industrialization and urbanization. Therefore, vigorously developing the high–tech industry becomes an inevitable choice to reduce CO{sub 2} emissions at the moment or in the future. However, ignoring the existing nonlinear links between economic variables, most scholars use traditional linear models to explore the impact of the high–tech industry on CO{sub 2} emissions from an aggregate perspective. Few studies have focused on nonlinear relationships and regional differences in China. Based on panel data for 1998–2014, this study uses the nonparametric additive regression model to explore the nonlinear effect of the high–tech industry from a regional perspective. The estimated results show that the residual sums of squares (SSR) of the nonparametric additive regression model in the eastern, central and western regions are 0.693, 0.054 and 0.085 respectively, much smaller than those of the traditional linear regression model (3.158, 4.227 and 7.196). This verifies that the nonparametric additive regression model has a better fitting effect. Specifically, the high–tech industry produces an inverted “U–shaped” nonlinear impact on CO{sub 2} emissions in the eastern region, but a positive “U–shaped” nonlinear effect in the central and western regions. Therefore, the nonlinear impact of the high–tech industry on CO{sub 2} emissions in the three regions should be given adequate attention in developing effective abatement policies. - Highlights: • The nonlinear effect of the high–tech industry on CO{sub 2} emissions was investigated. • The high–tech industry yields an inverted “U–shaped” effect in the eastern region. • The high–tech industry has a positive “U–shaped” nonlinear effect in other regions. 
• The linear impact of the high–tech industry in the eastern region is the strongest.

  7. Modelling drought-related yield losses in Iberia using remote sensing and multiscalar indices

    NASA Astrophysics Data System (ADS)

    Ribeiro, Andreia F. S.; Russo, Ana; Gouveia, Célia M.; Páscoa, Patrícia

    2018-04-01

The response of two rainfed winter cereal yields (wheat and barley) to drought conditions in the Iberian Peninsula (IP) was investigated for a long period (1986-2012). Drought hazard was evaluated based on the multiscalar Standardized Precipitation Evapotranspiration Index (SPEI) and three remote sensing indices, namely the Vegetation Condition Index (VCI), the Temperature Condition Index (TCI), and the Vegetation Health Index (VHI). A correlation analysis between the yield and the drought indicators was conducted, and multiple linear regression (MLR) and artificial neural network (ANN) models were established to estimate yield at the regional level. The correlation values suggested that yield declines with moisture depletion (low values of VCI) during early spring and with excessively high temperatures (low values of TCI) close to the harvest time. Generally, all drought indicators displayed the greatest influence during the plant stages in which the crop is photosynthetically more active (spring and summer), rather than during the earlier stages of the plant's life cycle (autumn/winter). Our results suggested that SPEI is more relevant in the southern sector of the IP, while remote sensing indices are rather good in estimating cereal yield in the northern sector of the IP. The strength of the statistical relationships found by the MLR and ANN methods is quite similar, with some improvements found by the ANN. A large number of true positives (hits) for the occurrence of yield losses was obtained, with hit rate (HR) values higher than 69%.

  8. Predictors of mother and child DNA yields in buccal cell samples collected in pediatric cancer epidemiologic studies: a report from the Children's Oncology group.

    PubMed

    Poynter, Jenny N; Ross, Julie A; Hooten, Anthony J; Langer, Erica; Blommer, Crystal; Spector, Logan G

    2013-08-12

    Collection of high-quality DNA is essential for molecular epidemiology studies. Methods have been evaluated for optimal DNA collection in studies of adults; however, DNA collection in young children poses additional challenges. Here, we have evaluated predictors of DNA quantity in buccal cells collected for population-based studies of infant leukemia (N = 489 mothers and 392 children) and hepatoblastoma (HB; N = 446 mothers and 412 children) conducted through the Children's Oncology Group. DNA samples were collected by mail using mouthwash (for mothers and some children) and buccal brush (for children) collection kits and quantified using quantitative real-time PCR. Multivariable linear regression models were used to identify predictors of DNA yield. Median DNA yield was higher for mothers in both studies compared with their children (14 μg vs. <1 μg). Significant predictors of DNA yield in children included case-control status (β = -0.69, 50% reduction, P = 0.01 for case vs. control children), brush collection type, and season of sample collection. Demographic factors were not strong predictors of DNA yield in mothers or children in this analysis. The association with seasonality suggests that conditions during transport may influence DNA yield. The low yields observed in most children in these studies highlight the importance of developing alternative methods for DNA collection in younger age groups.

  9. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    PubMed

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value
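
    The CAT test statistic itself is simple enough to sketch. Below is an illustrative Python implementation of the standard score-based Cochran-Armitage trend test with a normal approximation; this is not the authors' code, and the example counts are made up:

```python
import math

def cochran_armitage(successes, totals, scores=None):
    """Two-sided Cochran-Armitage test for a linear trend in proportions
    across ordered groups; scores default to 0, 1, 2, ..."""
    if scores is None:
        scores = list(range(len(totals)))
    n = sum(totals)
    p_bar = sum(successes) / n
    # Trend statistic: score-weighted deviations from the pooled rate.
    t = sum(s * (x - m * p_bar) for s, x, m in zip(scores, successes, totals))
    var = p_bar * (1 - p_bar) * (
        sum(m * s * s for s, m in zip(scores, totals))
        - sum(m * s for s, m in zip(scores, totals)) ** 2 / n
    )
    z = t / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2))  # (Z, two-sided p-value)

z, p = cochran_armitage([10, 20, 30], [100, 100, 100])  # rising trend: z ≈ 3.54
```

    Rates of 10%, 20% and 30% across three equally sized ordered groups give a clearly significant trend, while equal rates give z = 0.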

  10. [Ultrasonic measurements of fetal thalamus, caudate nucleus and lenticular nucleus in prenatal diagnosis].

    PubMed

    Yang, Ruiqi; Wang, Fei; Zhang, Jialing; Zhu, Chonglei; Fan, Limei

    2015-05-19

    To establish the reference values of thalamus, caudate nucleus and lenticular nucleus diameters through the fetal thalamic transverse section. A total of 265 fetuses at our hospital were randomly selected from November 2012 to August 2014, and the transverse and length diameters of the thalamus, caudate nucleus and lenticular nucleus were measured. SPSS 19.0 statistical software was used to calculate the regression curves of the fetal diameters against gestational week. P < 0.05 was considered statistically significant. The linear regression equation of fetal thalamic length diameter and gestational week was Y = 0.051X + 0.201, R = 0.876; of thalamic transverse diameter and gestational week, Y = 0.031X + 0.229, R = 0.817; of the length diameter of the head of the caudate nucleus and gestational age, Y = 0.033X + 0.101, R = 0.722; of the transverse diameter of the head of the caudate nucleus and gestational week, Y = 0.025X - 0.046, R = 0.711; of lentiform nucleus length diameter and gestational week, Y = 0.046X + 0.229, R = 0.765; and of lentiform nucleus diameter and gestational week, Y = 0.025X - 0.05, R = 0.772. Ultrasonic measurement of the diameters of the fetal thalamus, caudate nucleus and lenticular nucleus through the thalamic transverse section is simple and convenient. The measurements increase with gestational week, and there is a linear regression relationship between them.

  11. Local Linear Regression for Data with AR Errors.

    PubMed

    Li, Runze; Li, Yan

    2009-07-01

    In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using the local linear regression method and profile least squares techniques. We further propose the SCAD-penalized profile least squares method to determine the order of the auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure and to compare it with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with a working-independence error structure. We illustrate the proposed methodology by an analysis of a real data set.
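
    The local linear estimator at the core of the paper can be sketched as a kernel-weighted least squares fit. The sketch below is the naive working-independence version (it ignores the AR error structure and SCAD penalty the authors add), with a Gaussian kernel and made-up data:

```python
import numpy as np

def local_linear(x, y, x0, bandwidth):
    """Local linear regression estimate of m(x0): weighted least squares of
    y on (1, x - x0) with Gaussian kernel weights; the intercept is m(x0)."""
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    X = np.column_stack([np.ones_like(x), x - x0])
    XtW = X.T * w                        # apply weights without forming diag(w)
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta[0]

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 200)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)
fit = np.array([local_linear(x, y, x0, bandwidth=0.4) for x0 in x])
```

    Because the local design includes a linear term, the estimator reproduces an exactly linear trend for any bandwidth, which is the usual motivation for preferring it over a local constant (Nadaraya-Watson) fit.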

  12. A computational approach to compare regression modelling strategies in prediction research.

    PubMed

    Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H

    2016-08-25

    It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9% to 94.9%, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
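
    The Brier score used to assess the final models is just the mean squared difference between predicted probabilities and binary outcomes. A minimal sketch with illustrative values (not the study's data):

```python
import numpy as np

def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1
    outcomes; lower is better, 0 is perfect."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return float(np.mean((p_pred - y_true) ** 2))

# An uninformative model predicting 0.5 everywhere scores 0.25.
uninformative = brier_score([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
```

    A perfectly calibrated and perfectly discriminating model scores 0, so strategies can be ranked by how far below 0.25 they land on external data.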

  13. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, the residual standard error, R2, residuals… In the second example, devoted to data on the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
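
    The quantities the exercise reads off lm (coefficient estimates, R2, residuals) can be reproduced from the closed-form OLS formulas. A minimal Python sketch with illustrative data, not Galton's:

```python
import numpy as np

def simple_ols(x, y):
    """Closed-form simple linear regression: returns (slope, intercept, R^2),
    matching what R's lm() reports for a one-predictor model."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)   # Sxy / Sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    r2 = 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    return slope, intercept, r2
```

    On exactly linear data such as x = (0, 1, 2, 3), y = (1, 3, 5, 7) this returns slope 2, intercept 1 and R^2 = 1, which is a useful sanity check before moving to real data.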

  14. Morse Code, Scrabble, and the Alphabet

    ERIC Educational Resources Information Center

    Richardson, Mary; Gabrosek, John; Reischman, Diann; Curtiss, Phyliss

    2004-01-01

    In this paper we describe an interactive activity that illustrates simple linear regression. Students collect data and analyze it using simple linear regression techniques taught in an introductory applied statistics course. The activity is extended to illustrate checks for regression assumptions and regression diagnostics taught in an…

  15. Lunisolar tidal force and its relationship to chlorophyll fluorescence in Arabidopsis thaliana.

    PubMed

    Fisahn, Joachim; Klingelé, Emile; Barlow, Peter

    2015-01-01

    The yield of chlorophyll fluorescence Ft was measured in leaves of Arabidopsis thaliana over periods of several days under conditions of continuous illumination (LL) without the application of saturating light pulses. After linearization of the time series of the chlorophyll fluorescence yield (ΔFt), oscillations became apparent with periodicities in the circatidal range. Alignments of these linearized time series ΔFt with the lunisolar tidal acceleration revealed high degrees of synchrony and phase congruence. Similar congruence with the lunisolar tide was obtained with the linearized quantum yield of PSII (ΔФII), recorded after application of saturating light pulses. These findings strongly suggest that there is an exogenous timekeeper which is a stimulus for the oscillations detected in both the linearized yield of chlorophyll fluorescence (ΔFt) and the linearized quantum yield of PSII (ΔФII).

  16. Lunisolar tidal force and its relationship to chlorophyll fluorescence in Arabidopsis thaliana

    PubMed Central

    Fisahn, Joachim; Klingelé, Emile; Barlow, Peter

    2015-01-01

    The yield of chlorophyll fluorescence Ft was measured in leaves of Arabidopsis thaliana over periods of several days under conditions of continuous illumination (LL) without the application of saturating light pulses. After linearization of the time series of the chlorophyll fluorescence yield (ΔFt), oscillations became apparent with periodicities in the circatidal range. Alignments of these linearized time series ΔFt with the lunisolar tidal acceleration revealed high degrees of synchrony and phase congruence. Similar congruence with the lunisolar tide was obtained with the linearized quantum yield of PSII (ΔФII), recorded after application of saturating light pulses. These findings strongly suggest that there is an exogenous timekeeper which is a stimulus for the oscillations detected in both the linearized yield of chlorophyll fluorescence (ΔFt) and the linearized quantum yield of PSII (ΔФII). PMID:26376108

  17. An investigation to improve the Menhaden fishery prediction and detection model through the application of ERTS-A data

    NASA Technical Reports Server (NTRS)

    Maughan, P. M. (Principal Investigator)

    1973-01-01

    The author has identified the following significant results. Linear regression of Secchi disc visibility against number of sets yielded significant results in a number of instances. The variability seen in the slopes of the regression lines is due to the nonuniformity of sample size: the longer the period sampled, the larger the total number of attempts. Further, there is no reason to expect either the influence of transparency or of other variables to remain constant throughout the season. However, the fact that the data for the entire season, variable as they are, were significant at the 5% level suggests their potential utility for predictive modeling. Thus, this regression equation will be considered representative and will be utilized for the first numerical model. Secchi disc visibility was also regressed against number of sets for the three-day period September 27-29, 1972 to determine whether surface truth data supported the strong relationship between ERTS-1-identified turbidity and fishing effort previously discussed. A strongly negative correlation was found. These relationships lend additional credence to the hypothesis that ERTS imagery, when utilized as a source of visibility (turbidity) data, may be useful as a predictive tool.

  18. a Comparison Between Two Ols-Based Approaches to Estimating Urban Multifractal Parameters

    NASA Astrophysics Data System (ADS)

    Huang, Lin-Shan; Chen, Yan-Guang

    Multifractal theory provides a new spatial analytical tool for urban studies, but many basic problems remain to be solved. Among the pending issues, the most significant is how to obtain proper multifractal dimension spectra. If an algorithm is improperly used, the parameter spectra will be abnormal. This paper is devoted to investigating two ordinary least squares (OLS)-based approaches for estimating urban multifractal parameters. Using an empirical study and comparative analysis, we demonstrate how to utilize the adequate linear regression to calculate multifractal parameters. The OLS regression analysis has two different approaches: in one, the intercept is fixed to zero, and in the other, the intercept is unconstrained. The results of the comparative study show that the zero-intercept regression yields proper multifractal parameter spectra within a certain scale range of moment order, while the common regression method often leads to abnormal multifractal parameter values. A conclusion can be reached that fixing the intercept to zero is the more advisable regression method for multifractal parameter estimation, and that the shapes of the spectral curves and the value ranges of the fractal parameters can be employed to diagnose urban problems. This research may help scientists understand multifractal models and apply a more reasonable technique to multifractal parameter calculations.
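
    The two OLS variants being compared differ only in whether the intercept is estimated. A generic numpy sketch (illustrative data, not the urban measurements) shows how forcing the line through the origin changes the fitted slope whenever the true intercept is nonzero:

```python
import numpy as np

def ols_free(x, y):
    """OLS line y = slope * x + intercept, intercept unconstrained."""
    A = np.column_stack([x, np.ones_like(x)])
    slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]
    return slope, intercept

def ols_through_origin(x, y):
    """OLS with the intercept fixed to zero: slope = <x, y> / <x, x>."""
    return float(x @ y) / float(x @ x)

x = np.array([1.0, 2.0, 3.0, 4.0])
# When the data really pass through the origin, the two slopes agree...
slope0 = ols_through_origin(x, 3.0 * x)            # 3.0 exactly
# ...but an unmodelled intercept biases the constrained slope (11/3 vs 3).
biased = ols_through_origin(x, 3.0 * x + 2.0)
```

    This is the crux of the comparison: whether the zero-intercept constraint reflects the true scaling relation or distorts the estimated dimensions.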

  19. Reversed inverse regression for the univariate linear calibration and its statistical properties derived using a new methodology

    NASA Astrophysics Data System (ADS)

    Kang, Pilsang; Koo, Changhoi; Roh, Hokyu

    2017-11-01

    Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
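
    The two existing approaches the abstract mentions can be sketched for the univariate case (the paper's "reversed inverse regression" itself is not reproduced here): classical calibration fits y on x and inverts the fitted line, while inverse regression fits x directly on y. Illustrative code, assuming noise-free straight-line data:

```python
import numpy as np

def fit_line(u, v):
    """Least-squares fit v = a + b * u; returns (a, b)."""
    b = np.cov(u, v, bias=True)[0, 1] / np.var(u)
    return v.mean() - b * u.mean(), b

def classical_estimate(x, y, y0):
    """Classical calibration: fit the line y on x, then invert it at y0."""
    a, b = fit_line(x, y)
    return (y0 - a) / b

def inverse_estimate(x, y, y0):
    """Inverse calibration: regress x directly on y and predict at y0."""
    a, b = fit_line(y, x)
    return a + b * y0

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0                    # noise-free instrument response
x_hat = classical_estimate(x, y, 7.0)   # both recover x = 3 exactly here
```

    On noisy calibration data the two estimators no longer coincide, which is precisely the discrepancy in statistical properties that motivates the paper.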

  20. A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.

    PubMed

    Ferrari, Alberto; Comelli, Mario

    2016-12-01

    In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and the sample size is small. A number of more advanced methods are available, but they are often technically challenging, and a comparative assessment of their performance in behavioral setups has not been performed. We studied the performance of some methods applicable to the analysis of proportions, namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating the power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers, and we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of its assumptions when used to model proportion data. We conclude by providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
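
    The reason plain binomial-based models can fail on such data is between-subject overdispersion. A small simulation sketch (hypothetical parameters, not the paper's scenarios) shows how a beta-binomial mechanism inflates the variance well beyond the binomial value while leaving the mean unchanged:

```python
import numpy as np

rng = np.random.default_rng(42)
n_trials, n_subjects = 20, 5000

# Plain binomial: every subject shares the same success probability 0.3.
binom = rng.binomial(n_trials, 0.3, size=n_subjects)

# Beta-binomial: each subject first draws its own probability from
# Beta(3, 7) (mean 0.3), adding between-subject variability.
p_i = rng.beta(3.0, 7.0, size=n_subjects)
betabin = rng.binomial(n_trials, p_i)

# Same mean (about 6 successes out of 20), but a far larger variance for
# the beta-binomial sample: roughly 11.5 versus the binomial 4.2.
```

    A model that assumes pure binomial variation will understate the standard errors on such data, which is consistent with the anticonservative behavior the authors report for Poisson regression.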

  1. Analyzing prospective teachers' images of scientists using positive, negative and stereotypical images of scientists

    NASA Astrophysics Data System (ADS)

    Subramaniam, Karthigeyan; Esprívalo Harrell, Pamela; Wojnowski, David

    2013-04-01

    Background and purpose: This study details the use of a conceptual framework to analyze prospective teachers' images of scientists to reveal their context-specific conceptions of scientists. The conceptual framework consists of context-specific conceptions related to positive, stereotypical and negative images of scientists as detailed in the literature on the images, role and work of scientists. Sample, design and method: One hundred and ninety-six drawings of scientists, generated by prospective teachers, were analyzed using the Draw-A-Scientist-Test Checklist (DAST-C), a binary linear regression and the conceptual framework. Results: The binary linear regression analysis revealed a statistically significant difference for two DAST-C elements: ethnicity differences with regard to drawing a scientist who was Caucasian, and gender differences for indications of danger. Analysis using the conceptual framework helped to categorize the same drawings into positive, stereotypical, negative and composite images of a scientist. Conclusions: The conceptual framework revealed that drawings were focused on the physical appearance of the scientist, and to a lesser extent on the equipment, location and science-related practices that provided the context of a scientist's role and work. Implications for teacher educators include the need to provide tools, like the conceptual framework used in this study, to help prospective teachers confront and engage with their multidimensional perspectives of scientists in light of current trends in perceiving and valuing scientists. In addition, teacher educators should use the conceptual framework, which yields qualitative perspectives on the drawings, together with the DAST-C, which yields a quantitative measure, to help prospective teachers gain a holistic outlook on their drawings of scientists.

  2. Seasonal and temporal patterns of NDMA formation potentials in surface waters.

    PubMed

    Uzun, Habibullah; Kim, Daekyun; Karanfil, Tanju

    2015-02-01

    The seasonal and temporal patterns of N-nitrosodimethylamine (NDMA) formation potentials (FPs) were examined with water samples collected monthly over a 21-month period in 12 surface waters. This long-term study allowed monitoring of the patterns of NDMA FPs under dynamic weather conditions (e.g., rainy and dry periods) covering several seasons. Anthropogenically impacted waters, identified by high sucralose levels (>100 ng/L), had higher NDMA FPs than sources with limited impact (<100 ng/L). In most sources, NDMA FP showed more variability in spring months, while seasonal mean values remained relatively consistent. The study also showed that watershed characteristics played an important role in the seasonal and temporal patterns. In the two dam-controlled river systems (SW A and G), the NDMA FP levels at the downstream sampling locations were controlled by the NDMA levels in the dams, independent of either the increases in discharge rates due to water releases from the dams prior to or during heavy rain events or the intermittent high NDMA FP levels observed upstream of the dams. The large reservoirs and impoundments on the rivers examined in this study appeared to serve as equalization basins for NDMA precursors. On the other hand, in a river without an upstream reservoir (SW E), the NDMA levels were influenced by the ratio of an upstream wastewater treatment plant (WWTP) effluent discharge to the river discharge rate; the impact of the WWTP effluent decreased during high river flow periods due to rain events. Linear regression with the independent variables DOC, DON, and sucralose yielded poor correlations with NDMA FP (R(2) < 0.27). Multiple linear regression using DOC and log [sucralose] yielded a better correlation with NDMA FP (R(2) = 0.53). Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparative applicability of orthogonal projections to latent structures (OPLS) statistical modeling vs traditional linear regression in investigating the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single-vessel involvement, as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  4. Simple method for quick estimation of aquifer hydrogeological parameters

    NASA Astrophysics Data System (ADS)

    Ma, C.; Li, Y. Y.

    2017-08-01

    The development of simple and accurate methods to determine aquifer hydrogeological parameters is of importance for groundwater resources assessment and management. Addressing the problem of estimating aquifer parameters from unsteady pumping test data, a fitting function for the Theis well function was proposed using a fitting optimization method, and a unitary linear regression equation was then established. The aquifer parameters can be obtained by solving the coefficients of the regression equation. The application of the proposed method was illustrated using two published data sets. Error statistics and analysis of the pumping drawdown showed that the method proposed in this paper yields quick and accurate estimates of the aquifer parameters, and that it can reliably identify the aquifer parameters from long-distance observed drawdowns and early drawdowns. It is hoped that the proposed method will be helpful for practicing hydrogeologists and hydrologists.

  5. Climate change and maize yield in Iowa

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Hong; Twine, Tracy E.; Girvetz, Evan

    Climate is changing across the world, including the major maize-growing state of Iowa in the USA. To maintain crop yields, farmers will need a suite of adaptation strategies, and choice of strategy will depend on how the local to regional climate is expected to change. Here we predict how maize yield might change through the 21st century as compared with late 20th century yields across Iowa, USA, a region representing ideal climate and soils for maize production that contributes substantially to the global maize economy. To account for climate model uncertainty, we drive a dynamic ecosystem model with output from six climate models and two future climate forcing scenarios. Despite a wide range in the predicted amount of warming and change to summer precipitation, all simulations predict a decrease in maize yields from the late 20th century to the middle and late 21st century, ranging from 15% to 50%. Linear regression of all models predicts a 6% state-averaged yield decrease for every 1°C increase in warm season average air temperature. When the influence of moisture stress on crop growth is removed from the model, yield decreases either remain the same or are reduced, depending on predicted changes in warm season precipitation. Lastly, our results suggest that even if maize were to receive all the water it needed, under the strongest climate forcing scenario yields will decline by 10-20% by the end of the 21st century.

  6. Climate change and maize yield in Iowa

    DOE PAGES

    Xu, Hong; Twine, Tracy E.; Girvetz, Evan

    2016-05-24

    Climate is changing across the world, including the major maize-growing state of Iowa in the USA. To maintain crop yields, farmers will need a suite of adaptation strategies, and choice of strategy will depend on how the local to regional climate is expected to change. Here we predict how maize yield might change through the 21st century as compared with late 20th century yields across Iowa, USA, a region representing ideal climate and soils for maize production that contributes substantially to the global maize economy. To account for climate model uncertainty, we drive a dynamic ecosystem model with output from six climate models and two future climate forcing scenarios. Despite a wide range in the predicted amount of warming and change to summer precipitation, all simulations predict a decrease in maize yields from the late 20th century to the middle and late 21st century, ranging from 15% to 50%. Linear regression of all models predicts a 6% state-averaged yield decrease for every 1°C increase in warm season average air temperature. When the influence of moisture stress on crop growth is removed from the model, yield decreases either remain the same or are reduced, depending on predicted changes in warm season precipitation. Lastly, our results suggest that even if maize were to receive all the water it needed, under the strongest climate forcing scenario yields will decline by 10-20% by the end of the 21st century.

  7. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life studies play an important role in health care, especially for chronic diseases, in clinical judgment and in the allocation of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normally distributed the results can be misleading. The aim of this study was to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results to linear regression. A cross-sectional study was conducted on 119 breast cancer patients admitted and treated in the chemotherapy ward of Namazi Hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors, and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for global health status for the breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis, and dyspnea was significant only for the first quartile. Emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about the predictors of quality of life in breast cancer patients.
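
    Quantile regression replaces squared error with the pinball (quantile) loss, whose minimizing constant is the corresponding sample quantile. A small numerical sketch with simulated skewed scores (not the QLQ-C30 data) makes the point:

```python
import numpy as np

def pinball_loss(c, y, q):
    """Mean quantile (pinball) loss of the constant prediction c at level q."""
    r = y - c
    return np.mean(np.maximum(q * r, (q - 1.0) * r))

rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=10_000)   # skewed, non-normal outcome

# Minimize the loss over a grid of constant predictions for each level q:
grid = np.linspace(y.min(), y.max(), 2000)
c_star = {q: grid[np.argmin([pinball_loss(c, y, q) for c in grid])]
          for q in (0.25, 0.5, 0.75)}
# c_star[q] lands on the sample q-quantile, which is why quantile regression
# characterizes conditional quantiles rather than the conditional mean.
```

    With covariates, the same loss is minimized over regression coefficients instead of a constant, giving a separate fitted line per quartile, as in the study's first- and third-quartile results.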

  8. Interpretation of commonly used statistical regression models.

    PubMed

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
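
    For a single binary exposure, the logistic regression coefficient is exactly the log odds ratio of the 2x2 table, which is the kind of coefficient interpretation the article walks through. A sketch with hypothetical counts:

```python
import math

# Hypothetical 2x2 table (rows: exposed / unexposed; columns: case / non-case).
a, b = 40, 60    # exposed:   40 cases, 60 non-cases
c, d = 20, 80    # unexposed: 20 cases, 80 non-cases

odds_ratio = (a * d) / (b * c)     # 2.67: exposure raises the odds of disease
beta = math.log(odds_ratio)        # the logistic slope for this exposure
alpha = math.log(c / d)            # log-odds of disease in the unexposed

def fitted_risk(exposed):
    """Risk from the saturated logistic model logit(p) = alpha + beta * x."""
    eta = alpha + beta * exposed
    return 1.0 / (1.0 + math.exp(-eta))
# fitted_risk(0) = 0.2 and fitted_risk(1) = 0.4, the observed proportions.
```

    Exponentiating a logistic coefficient therefore gives an odds ratio, not a risk ratio, which is the usual pitfall when reporting such models.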

  9. An Investigation of Widespread Ozone Damage to the Soybean Crop in the Upper Midwest Determined From Ground-Based and Satellite Measurements

    NASA Technical Reports Server (NTRS)

    Fishman, Jack; Creilson, John K.; Parker, Peter A.; Ainsworth, Elizabeth A.; Vining, G. Geoffrey; Szarka, John; Booker, Fitzgerald L.; Xu, Xiaojing

    2010-01-01

    Elevated concentrations of ground-level ozone (O3) are frequently measured over farmland regions in many parts of the world. While numerous experimental studies show that O3 can significantly decrease crop productivity, independent verifications of yield losses at current ambient O3 concentrations in rural locations are sparse. In this study, soybean crop yield data during a 5-year period over the Midwest of the United States were combined with ground and satellite O3 measurements to provide evidence that yield losses on the order of 10% could be estimated through the use of a multiple linear regression model. Yield loss trends based on both conventional ground-based instrumentation and satellite-derived tropospheric O3 measurements were statistically significant and were consistent with results obtained from open-top chamber experiments and an open-air experimental facility (SoyFACE, Soybean Free Air Concentration Enrichment) in central Illinois. Our analysis suggests that such losses are a relatively new phenomenon due to the increase in background tropospheric O3 levels over recent decades. Extrapolation of these findings supports previous studies that estimate the global economic loss to the farming community of more than $10 billion annually.

  10. Quality by Design approach to spray drying processing of crystalline nanosuspensions.

    PubMed

    Kumar, Sumit; Gokhale, Rajeev; Burgess, Diane J

    2014-04-10

    Quality by Design (QbD) principles were explored to understand the spray drying process for the conversion of liquid nanosuspensions into solid nano-crystalline dry powders, using indomethacin as a model drug. The effects of the critical process variables (inlet temperature, flow rate and aspiration rate) on the critical quality attributes (CQAs) particle size, moisture content, percent yield and crystallinity were investigated employing a full factorial design. A central cubic design was employed to generate the response surface for particle size and percent yield. Multiple linear regression analysis and ANOVA were employed to identify and estimate the effects of the critical parameters, establish their relationship with the CQAs, create the design space and model the spray drying process. Inlet temperature was identified as the only significant factor (p value <0.05) affecting dry powder particle size: higher inlet temperatures caused drug surface melting and hence aggregation of the dried nano-crystalline powders. Aspiration and flow rates were identified as significant factors affecting yield (p value <0.05); higher yields were obtained at higher aspiration and lower flow rates. All formulations had less than 3% (w/w) moisture content, and formulations dried at higher inlet temperatures had lower moisture than those dried at lower inlet temperatures. Published by Elsevier B.V.
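
    A full factorial screen of three process variables and the main-effect estimates it yields can be sketched in a few lines; the coded design and response values below are illustrative, not the study's measurements:

```python
import itertools
import numpy as np

# 2^3 full factorial in coded (-1/+1) units for three process variables
# (think inlet temperature, flow rate, aspiration rate; labels illustrative).
design = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))

# Synthetic response: only the first factor truly moves particle size.
particle_size = 5.0 + 1.2 * design[:, 0]

# Main effect of factor j: mean response at +1 minus mean response at -1.
effects = [particle_size[design[:, j] > 0].mean()
           - particle_size[design[:, j] < 0].mean()
           for j in range(3)]
# effects ≈ [2.4, 0.0, 0.0]: the screen flags factor 0 and clears the rest.
```

    In a real QbD workflow these main effects would be tested with ANOVA, as in the study, and significant factors carried forward into the response-surface design.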

  11. Ultrasonic-assisted extraction and in-vitro antioxidant activity of polysaccharide from Hibiscus leaf.

    PubMed

    Afshari, Kasra; Samavati, Vahid; Shahidi, Seyed-Ahmad

    2015-03-01

    The effects of ultrasonic power, extraction time, extraction temperature, and the water-to-raw material ratio on extraction yield of crude polysaccharide from the leaf of Hibiscus rosa-sinensis (HRLP) were optimized by statistical analysis using response surface methodology. The response surface methodology (RSM) was used to optimize HRLP extraction yield by implementing the Box-Behnken design (BBD). The experimental data obtained were fitted to a second-order polynomial equation using multiple regression analysis and also analyzed by appropriate statistical methods (ANOVA). Analysis of the results showed that the linear and quadratic terms of these four variables had significant effects. The optimal conditions for the highest extraction yield of HRLP were: ultrasonic power, 93.59 W; extraction time, 25.71 min; extraction temperature, 93.18°C; and the water to raw material ratio, 24.3 mL/g. Under these conditions, the experimental yield was 9.66±0.18%, which is in close agreement with the value predicted by the model (9.526%). The results demonstrated that HRLP had strong scavenging activities in vitro on DPPH and hydroxyl radicals. Copyright © 2014 Elsevier B.V. All rights reserved.
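    The regression step described above can be sketched for a single factor (the full BBD uses four): fit a second-order polynomial by least squares and locate its stationary point, which is how RSM identifies an optimum. The yield curve below is hypothetical, constructed to peak near the reported 93.59 W, not the HRLP data.

```python
import numpy as np

def fit_quadratic(x, y):
    """Second-order polynomial fit, y = b0 + b1*x + b2*x^2, by least squares."""
    A = np.c_[np.ones_like(x), x, x ** 2]   # design matrix: 1, x, x^2
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # (b0, b1, b2)

def optimum(coef):
    """Stationary point of the fitted parabola: x* = -b1 / (2*b2)."""
    b0, b1, b2 = coef
    return -b1 / (2 * b2)

# Hypothetical yield response peaking at 93 W (illustrative values only)
x = np.array([60.0, 75.0, 90.0, 105.0, 120.0])
y = 9.6 - 0.002 * (x - 93.0) ** 2
coef = fit_quadratic(x, y)
x_opt = optimum(coef)
```

With more factors the same least-squares machinery applies; the design matrix simply gains linear, quadratic, and interaction columns for each variable.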

  12. Effect of Calf Gender on Milk Yield and Fatty Acid Content in Holstein Dairy Cows

    PubMed Central

    Ehrlich, James L.; Grove-White, Dai H.

    2017-01-01

    The scale of sexed semen use to avoid the birth of unwanted bull calves in the UK dairy industry depends on several economic factors. It has been suggested in other studies that calf gender may affect milk yield in Holsteins, something that would affect the economics of sexed semen use. The present study used a large milk recording data set to evaluate the effect of calf gender (both calf born and calf in utero) on both milk yield and saturated fat content. Linear regression was used to model data for first lactation and second lactation separately. Results showed that giving birth to a heifer calf conferred a 1% milk yield advantage in first lactation heifers, whilst giving birth to a bull calf conferred a 0.5% advantage in second lactation. Heifer calves were also associated with a 0.66 kg reduction in saturated fatty acid content of milk in first lactation, but there was no significant difference between the genders in second lactation. No relationship was found between calf gender and milk mono- or polyunsaturated fatty acid content. The observed effects of calf gender on both yield and saturated fatty acid content were considered minor when compared to nutritional and genetic influences. PMID:28068399

  13. Effect of Calf Gender on Milk Yield and Fatty Acid Content in Holstein Dairy Cows.

    PubMed

    Gillespie, Amy V; Ehrlich, James L; Grove-White, Dai H

    2017-01-01

    The scale of sexed semen use to avoid the birth of unwanted bull calves in the UK dairy industry depends on several economic factors. It has been suggested in other studies that calf gender may affect milk yield in Holsteins, something that would affect the economics of sexed semen use. The present study used a large milk recording data set to evaluate the effect of calf gender (both calf born and calf in utero) on both milk yield and saturated fat content. Linear regression was used to model data for first lactation and second lactation separately. Results showed that giving birth to a heifer calf conferred a 1% milk yield advantage in first lactation heifers, whilst giving birth to a bull calf conferred a 0.5% advantage in second lactation. Heifer calves were also associated with a 0.66 kg reduction in saturated fatty acid content of milk in first lactation, but there was no significant difference between the genders in second lactation. No relationship was found between calf gender and milk mono- or polyunsaturated fatty acid content. The observed effects of calf gender on both yield and saturated fatty acid content were considered minor when compared to nutritional and genetic influences.

  14. Hydrothermal carbonization of Opuntia ficus-indica cladodes: Role of process parameters on hydrochar properties.

    PubMed

    Volpe, Maurizio; Goldfarb, Jillian L; Fiori, Luca

    2018-01-01

    Opuntia ficus-indica cladodes are a potential source of solid biofuel from marginal, dry land. Experiments assessed the effects of temperature (180-250°C), reaction time (0.5-3 h) and biomass to water ratio (B/W; 0.07-0.30) on chars produced via hydrothermal carbonization. Multivariate linear regression demonstrated that the three process parameters are critically important to hydrochar solid yield, while B/W drives energy yield. Heating value increased together with temperature and reaction time and was maximized at intermediate B/W (0.14-0.20). Microscopy shows evidence of secondary char formed at higher temperatures and B/W ratios. X-ray diffraction, thermogravimetric data, microscopy and inductively coupled plasma mass spectrometry suggest that calcium oxalate in the raw biomass remains in the hydrochar; at higher temperatures, the mineral decomposes into CO2 and may catalyze char/tar decomposition. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Use of probabilistic weights to enhance linear regression myoelectric control

    NASA Astrophysics Data System (ADS)

    Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.

    2015-12-01

    Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
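    A minimal sketch of the weighting idea described above, assuming a one-dimensional EMG feature, hypothetical Gaussian class models, and equal priors (the study used multidimensional feature distributions per DOF): the linear-regression output is scaled by the posterior probability that the user intends movement, suppressing output when the feature looks like rest.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate normal density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def weighted_output(feature, w, b, classes):
    """Scale the regression output (w*feature + b) by the posterior
    probability that the user intends movement, i.e. 1 - P(rest | feature).
    `classes` maps a class name to hypothetical (mean, std) parameters."""
    likes = {name: gaussian_pdf(feature, mu, s) for name, (mu, s) in classes.items()}
    total = sum(likes.values())
    p_move = 1.0 - likes["rest"] / total
    return (w * feature + b) * p_move

# Hypothetical class models: rest near zero amplitude, movement classes further out
classes = {"rest": (0.0, 0.2), "flex": (1.0, 0.3), "extend": (-1.0, 0.3)}
out_rest = weighted_output(0.05, 1.0, 0.0, classes)  # near-rest EMG: output suppressed
out_move = weighted_output(1.0, 1.0, 0.0, classes)   # clear flexion: output passes through
```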

  16. Atoms-in-molecules study of the genetically encoded amino acids. III. Bond and atomic properties and their correlations with experiment including mutation-induced changes in protein stability and genetic coding.

    PubMed

    Matta, Chérif F; Bader, Richard F W

    2003-08-15

    This article presents a study of the molecular charge distributions of the genetically encoded amino acids (AA), one that builds on the previous determination of their equilibrium geometries and the demonstrated transferability of their common geometrical parameters. The properties of the charge distributions are characterized and given quantitative expression in terms of the bond and atomic properties determined within the quantum theory of atoms-in-molecules (QTAIM) that defines atoms and bonds in terms of the observable charge density. The properties so defined are demonstrated to be remarkably transferable, a reflection of the underlying transferability of the charge distributions of the main chain and other groups common to the AA. The use of the atomic properties in obtaining an understanding of the biological functions of the AA, whether free or bound in a polypeptide, is demonstrated by the excellent statistical correlations they yield with experimental physicochemical properties. A property of the AA side chains of particular importance is the charge separation index (CSI), a quantity previously defined as the sum of the magnitudes of the atomic charges and which measures the degree of separation of positive and negative charges in the side chain of interest. The CSI values provide a correlation with the measured free energies of transfer of capped side chain analogues, from the vapor phase to aqueous solution, yielding a linear regression equation with r2 = 0.94. The atomic volume is defined by the van der Waals isodensity surface and it, together with the CSI, which accounts for the electrostriction of the solvent, yield a linear regression (r2 = 0.98) with the measured partial molar volumes of the AAs. 
The changes in free energies of transfer from octanol to water upon interchanging 153 pairs of AAs and from cyclohexane to water upon interchanging 190 pairs of AAs were modeled using only three calculated parameters (representing electrostatic and volume contributions) yielding linear regressions with r2 values of 0.78 and 0.89, respectively. These results are a prelude to the single-site mutation-induced changes in the stabilities of two typical proteins: ubiquitin and staphylococcal nuclease. Strong quadratic correlations (r2 approximately 0.9) were obtained between ΔCSI upon mutation and each of the two terms ΔΔH and TΔΔS taken from recent and accurate differential scanning calorimetry experiments on ubiquitin. When the two terms are summed to yield ΔΔG, the quadratic terms nearly cancel, and the result is a simple linear fit between ΔΔG and ΔCSI with r2 = 0.88. As another example, the change in the stability of staphylococcal nuclease upon mutation has been fitted linearly (r2 = 0.83) to the sum of a ΔCSI term and a term representing the change in the van der Waals volume of the side chains upon mutation. The suggested correlation of the polarity of the side chain with the second letter of the AA triplet genetic codon is given concrete expression in a classification of the side chains in terms of their CSI values and their group dipole moments. For example, all amino acids with a pyrimidine base as their second letter in mRNA possess side-chain CSI < or = 2.8 (with the exception of Cys), whereas all those with CSI > 2.8 possess a purine base. The article concludes with two proposals for measuring and predicting molecular complementarity: van der Waals complementarity expressed in terms of the van der Waals isodensity surface and Lewis complementarity expressed in terms of the local charge concentrations and depletions defined by the topology of the Laplacian of the electron density. 
A display of the experimentally accessible Laplacian distribution for a folded protein would offer a clear picture of the operation of the "stereochemical code" proposed as the determinant in the folding process. Copyright 2003 Wiley-Liss, Inc.

  17. Simplified large African carnivore density estimators from track indices.

    PubMed

    Winterbach, Christiaan W; Ferreira, Sam M; Funston, Paul J; Somers, Michael J

    2016-01-01

    The range, population size and trend of large carnivores are important parameters to assess their status globally and to plan conservation strategies. One can use linear models to assess population size and trends of large carnivores from track-based surveys on suitable substrates. The conventional approach of a linear model with intercept may not intercept at zero, but may fit the data better than a linear model through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept to model large African carnivore densities and track indices. We did simple linear regression with intercept analysis and simple linear regression through the origin and used the confidence interval for β in the linear model y = αx + β, Standard Error of Estimate, Mean Squares Residual and Akaike Information Criteria to evaluate the models. The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for β and the null hypothesis that β = 0 could not be rejected. All models showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. Akaike Information Criteria showed that linear models through the origin were better and that none of the linear models with intercept had substantial support. Our results showed that linear regression through the origin is justified over the more typical linear regression with intercept for all models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas. 
The formula observed track density = 3.26 × carnivore density can be used to estimate densities of large African carnivores using track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km2 or higher. To improve the current models, we need independent data to validate the models and data to test for a non-linear relationship between track indices and true density at low densities.
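    The two fits compared in this study can be sketched as follows. The data here are hypothetical values constructed from the reported 3.26 ratio, not the survey data; the through-origin slope has the closed form α = Σxy / Σx².

```python
def fit_through_origin(x, y):
    """Regression through the origin, y = alpha*x: alpha = sum(x*y)/sum(x^2)."""
    alpha = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    sse = sum((yi - alpha * xi) ** 2 for xi, yi in zip(x, y))
    return alpha, sse

def fit_with_intercept(x, y):
    """Ordinary least squares, y = alpha*x + beta."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    alpha = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    beta = my - alpha * mx
    sse = sum((yi - (alpha * xi + beta)) ** 2 for xi, yi in zip(x, y))
    return alpha, beta, sse

# Hypothetical densities following the reported ratio exactly
carnivore_density = [0.3, 0.5, 1.0, 1.5]                # carnivores/100 km2
track_density = [3.26 * d for d in carnivore_density]   # tracks/100 km2
alpha_o, sse_o = fit_through_origin(carnivore_density, track_density)
alpha_i, beta, sse_i = fit_with_intercept(carnivore_density, track_density)
```

On real survey data one would compare the two models with the confidence interval for β, the standard errors, and AIC, as the authors did.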

  18. [From clinical judgment to linear regression model].

    PubMed

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and is equivalent to the "Y" value when "X" equals 0, and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R2) indicates the importance of independent variables in the outcome.
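    The quantities named above (intercept a, slope b, and R2) can be computed directly from their least-squares definitions; the data here are illustrative, not clinical:

```python
def fit_line(x, y):
    """Least-squares fit of Y = a + b*x; returns (a, b, r2)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope: change in Y per unit change in X
    a = my - b * mx               # intercept: Y when X equals 0
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot      # coefficient of determination
    return a, b, r2

# Perfectly linear toy data following Y = 2 + 3X
a, b, r2 = fit_line([0, 1, 2, 3], [2, 5, 8, 11])
```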

  19. Extending local canonical correlation analysis to handle general linear contrasts for FMRI data.

    PubMed

    Jin, Mingwu; Nandy, Rajesh; Curran, Tim; Cordes, Dietmar

    2012-01-01

    Local canonical correlation analysis (CCA) is a multivariate method that has been proposed to more accurately determine activation patterns in fMRI data. In its conventional formulation, CCA has several drawbacks that limit its usefulness in fMRI. A major drawback is that, unlike the general linear model (GLM), a test of general linear contrasts of the temporal regressors has not been incorporated into the CCA formalism. To overcome this drawback, a novel directional test statistic was derived using the equivalence of multivariate multiple regression (MVMR) and CCA. This extension will allow CCA to be used for inference of general linear contrasts in more complicated fMRI designs without reparameterization of the design matrix and without reestimating the CCA solutions for each particular contrast of interest. With the proper constraints on the spatial coefficients of CCA, this test statistic can yield a more powerful test on the inference of evoked brain regional activations from noisy fMRI data than the conventional t-test in the GLM. The quantitative results from simulated and pseudoreal data and activation maps from fMRI data were used to demonstrate the advantage of this novel test statistic.

  20. Extending Local Canonical Correlation Analysis to Handle General Linear Contrasts for fMRI Data

    PubMed Central

    Jin, Mingwu; Nandy, Rajesh; Curran, Tim; Cordes, Dietmar

    2012-01-01

    Local canonical correlation analysis (CCA) is a multivariate method that has been proposed to more accurately determine activation patterns in fMRI data. In its conventional formulation, CCA has several drawbacks that limit its usefulness in fMRI. A major drawback is that, unlike the general linear model (GLM), a test of general linear contrasts of the temporal regressors has not been incorporated into the CCA formalism. To overcome this drawback, a novel directional test statistic was derived using the equivalence of multivariate multiple regression (MVMR) and CCA. This extension will allow CCA to be used for inference of general linear contrasts in more complicated fMRI designs without reparameterization of the design matrix and without reestimating the CCA solutions for each particular contrast of interest. With the proper constraints on the spatial coefficients of CCA, this test statistic can yield a more powerful test on the inference of evoked brain regional activations from noisy fMRI data than the conventional t-test in the GLM. The quantitative results from simulated and pseudoreal data and activation maps from fMRI data were used to demonstrate the advantage of this novel test statistic. PMID:22461786

  1. A model based on feature objects aided strategy to evaluate the methane generation from food waste by anaerobic digestion.

    PubMed

    Yu, Meijuan; Zhao, Mingxing; Huang, Zhenxing; Xi, Kezhong; Shi, Wansheng; Ruan, Wenquan

    2018-02-01

    A model based on feature objects (FOs) aided strategy was used to evaluate the methane generation from food waste by anaerobic digestion. The kinetics of feature objects was tested by the modified Gompertz model and the first-order kinetic model, and the first-order kinetic hydrolysis constants were used to estimate the reaction rate of homemade and actual food waste. The results showed that the methane yields of four feature objects were significantly different. The anaerobic digestion of homemade food waste and actual food waste had various methane yields and kinetic constants due to the different contents of FOs in food waste. Combining the kinetic equations with the multiple linear regression equation could well express the methane yield of food waste, as the R2 of food waste was more than 0.9. The predictive methane yields of the two actual food wastes were 528.22 mL g-1 TS and 545.29 mL g-1 TS with the model, while the experimental values were 527.47 mL g-1 TS and 522.1 mL g-1 TS, respectively. The relative errors between the experimental cumulative methane yields and the predicted cumulative methane yields were both less than 5%. Copyright © 2017 Elsevier Ltd. All rights reserved.
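    The two kinetic forms named above can be written down directly. The parameter values below are hypothetical placeholders, not fitted constants from the study:

```python
import math

def first_order(t, B0, k):
    """First-order kinetic model: cumulative methane B(t) = B0*(1 - exp(-k*t)),
    where B0 is the ultimate methane yield and k the hydrolysis constant."""
    return B0 * (1 - math.exp(-k * t))

def gompertz(t, B0, Rmax, lam):
    """Modified Gompertz model:
    B(t) = B0*exp(-exp(Rmax*e/B0*(lam - t) + 1)),
    with Rmax the maximum production rate and lam the lag phase."""
    return B0 * math.exp(-math.exp(Rmax * math.e / B0 * (lam - t) + 1))

# Hypothetical parameters: B0 = 500 mL/g TS, k = 0.2/d, Rmax = 50 mL/(g TS d), lag = 2 d
b_early = first_order(5, 500, 0.2)
b_late = first_order(100, 500, 0.2)     # approaches the ultimate yield B0
g_early = gompertz(5, 500, 50, 2)
g_late = gompertz(100, 500, 50, 2)      # also approaches B0
```

Both curves are fit to batch test data by nonlinear least squares; the abstract's combined model then regresses food-waste yield on the FO composition.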

  2. Fourier transform infrared reflectance spectra of latent fingerprints: a biometric gauge for the age of an individual.

    PubMed

    Hemmila, April; McGill, Jim; Ritter, David

    2008-03-01

    To determine whether changes linear with age can be found in fingerprint infrared spectra, a partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.

  3. Linearity versus Nonlinearity of Offspring-Parent Regression: An Experimental Study of Drosophila Melanogaster

    PubMed Central

    Gimelfarb, A.; Willis, J. H.

    1994-01-01

    An experiment was conducted to investigate the offspring-parent regression for three quantitative traits (weight, abdominal bristles and wing length) in Drosophila melanogaster. Linear and polynomial models were fitted for the regressions of a character in offspring on both parents. It is demonstrated that responses by the characters to selection predicted by the nonlinear regressions may differ substantially from those predicted by the linear regressions. This is true even, and especially, if selection is weak. The realized heritability for a character under selection is shown to be determined not only by the offspring-parent regression but also by the distribution of the character and by the form and strength of selection. PMID:7828818

  4. Estuarine Sediment Deposition during Wetland Restoration: A GIS and Remote Sensing Modeling Approach

    NASA Technical Reports Server (NTRS)

    Newcomer, Michelle; Kuss, Amber; Kentron, Tyler; Remar, Alex; Choksi, Vivek; Skiles, J. W.

    2011-01-01

    Restoration of the industrial salt flats in the San Francisco Bay, California is an ongoing wetland rehabilitation project. Remote sensing maps of suspended sediment concentration, and other GIS predictor variables were used to model sediment deposition within these recently restored ponds. Suspended sediment concentrations were calibrated to reflectance values from Landsat TM 5 and ASTER using three statistical techniques -- linear regression, multivariate regression, and an Artificial Neural Network (ANN), to map suspended sediment concentrations. Multivariate and ANN regressions using ASTER proved to be the most accurate methods, yielding r2 values of 0.88 and 0.87, respectively. Predictor variables such as sediment grain size and tidal frequency were used in the Marsh Sedimentation (MARSED) model for predicting deposition rates for three years. MARSED results for a fully restored pond show a root mean square deviation (RMSD) of 66.8 mm (<1) between modeled and field observations. This model was further applied to a pond breached in November 2010 and indicated that the recently breached pond will reach equilibrium levels after 60 months of tidal inundation.

  5. Reconstruction of missing daily streamflow data using dynamic regression models

    NASA Astrophysics Data System (ADS)

    Tencaliec, Patricia; Favre, Anne-Catherine; Prieur, Clémentine; Mathevet, Thibault

    2015-12-01

    River discharge is one of the most important quantities in hydrology. It provides fundamental records for water resources management and climate change monitoring. Even very short data-gaps in these records can markedly change analysis outputs. Therefore, reconstructing missing data in incomplete data sets is an important step for environmental modeling, engineering, and research applications, and it presents a great challenge. The objective of this paper is to introduce an effective technique for reconstructing missing daily discharge data when one has access to only daily streamflow data. The proposed procedure uses a combination of regression and autoregressive integrated moving average (ARIMA) models, called a dynamic regression model. This model uses the linear relationship between neighbor and correlated stations and then adjusts the residual term by fitting an ARIMA structure. Application of the model to eight daily streamflow series for the Durance river watershed showed that the model yields reliable estimates for the missing data in the time series. Simulation studies were also conducted to evaluate the performance of the procedure.
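    A much-simplified sketch of the two-stage idea, assuming a single neighbor station and an AR(1) residual model in place of the full ARIMA structure (toy data, not the Durance records):

```python
def dynamic_regression_fit(x, y):
    """Stage 1: OLS of target series y on neighbor series x.
    Stage 2: AR(1) coefficient for the regression residuals
    (a simplification of the ARIMA adjustment in the paper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # AR(1) coefficient via lag-1 autocovariance of the residuals
    num = sum(resid[t] * resid[t - 1] for t in range(1, n))
    den = sum(r * r for r in resid[:-1])
    phi = num / den if den else 0.0
    return a, b, phi

def reconstruct(a, b, phi, x_t, prev_resid):
    """Estimate a missing y_t from the neighbor value and the last residual."""
    return a + b * x_t + phi * prev_resid

# Toy series with an exact linear relationship (residuals are all zero)
a, b, phi = dynamic_regression_fit([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
y_hat = reconstruct(a, b, phi, 6, 0.0)
```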

  6. Linear and nonlinear regression techniques for simultaneous and proportional myoelectric control.

    PubMed

    Hahne, J M; Biessmann, F; Jiang, N; Rehbaum, H; Farina, D; Meinecke, F C; Muller, K-R; Parra, L C

    2014-03-01

    In recent years the number of active controllable joints in electrically powered hand-prostheses has increased significantly. However, the control strategies for these devices in current clinical use are inadequate as they require separate and sequential control of each degree-of-freedom (DoF). In this study we systematically compare linear and nonlinear regression techniques for an independent, simultaneous and proportional myoelectric control of wrist movements with two DoF. These techniques include linear regression, mixture of linear experts (ME), multilayer-perceptron, and kernel ridge regression (KRR). They are investigated offline with electromyographic signals acquired from ten able-bodied subjects and one person with congenital upper limb deficiency. The control accuracy is reported as a function of the number of electrodes and the amount and diversity of training data, providing guidance for the requirements in clinical practice. The results showed that KRR, a nonparametric statistical learning method, outperformed the other methods. However, simple transformations in the feature space could linearize the problem, so that linear models could achieve performance similar to KRR at much lower computational costs. Especially ME, a physiologically inspired extension of linear regression, represents a promising candidate for the next generation of prosthetic devices.
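    A toy contrast between the two model families compared above, ordinary linear regression and kernel ridge regression, on a deliberately nonlinear target rather than EMG data (the RBF width gamma and ridge penalty lam are arbitrary choices for this sketch):

```python
import numpy as np

def linreg_predict(X, y, Xq):
    """Ordinary least squares with a bias column."""
    A = np.c_[X, np.ones(len(X))]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.c_[Xq, np.ones(len(Xq))] @ w

def krr_predict(X, y, Xq, gamma=1.0, lam=1e-3):
    """Kernel ridge regression with an RBF kernel: alpha = (K + lam*I)^-1 y."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    K = rbf(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return rbf(Xq, X) @ alpha

# Nonlinear target y = x^2: a straight line cannot fit it, the kernel model can
X = np.linspace(-1, 1, 21)[:, None]
y = X[:, 0] ** 2
err_lin = np.mean((linreg_predict(X, y, X) - y) ** 2)
err_krr = np.mean((krr_predict(X, y, X) - y) ** 2)
```

This mirrors the abstract's observation: a nonlinear feature transformation (here, adding an x² column to the linear model) would close the gap at far lower cost than the kernel method.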

  7. An Expert System for the Evaluation of Cost Models

    DTIC Science & Technology

    1990-09-01

    contrast to the condition of equal error variance, called homoscedasticity. (Reference: Applied Linear Regression Models by John Neter - page 423...normal. (Reference: Applied Linear Regression Models by John Neter - page 125)...over time. Error terms correlated over time are said to be autocorrelated or serially correlated. (REFERENCE: Applied Linear Regression Models by John

  8. Interrelationships of somatic cell count, mastitis, and milk yield in a low somatic cell count herd.

    PubMed

    Deluyker, H A; Gay, J M; Weaver, L D

    1993-11-01

    In a high yielding low SCC herd, changes in milk yield associated with SCC and occurrence of clinical mastitis and differences in SCC with parity, clinical mastitis, and DIM were investigated. Milk yield data were obtained at every milking, and SCC was measured once every 48 h in 117 cows during the first 119 d postpartum. Effects of SCC and clinical mastitis on cumulative milk yield in the first 119 d postpartum were evaluated with least squares linear regression. Repeated measures ANOVA was used to detect changes in SCC. The SCC was highest at lactation onset, and cows with clinical mastitis had significantly higher SCC. During the 10 d prior to onset of clinical mastitis, SCC was higher in affected cows than in matched unaffected controls and surged just prior to diagnosis. During the 10-d period following a mastitis treatment, SCC differences between treated and control cows remained significant but became smaller with time and returned to the premastitis differences. Occurrence of clinical mastitis was associated with 5% milk yield loss. Cows with mean SCC > 245,000 cells/ml over the 119 d showed 6.2% yield loss compared with cows with SCC < or = 90,000 cells/ml. Cows with clinical mastitis had higher SCC prior to and following the end of treatment for mastitis than did controls. Clinical mastitis and SCC were associated with significant yield loss. Milk yield loss attributed to clinical mastitis was greater than that associated with elevated SCC (> 245,000 cells/ml) because a greater percentage of cows (26%) had clinical mastitis than elevated SCC (12.5%).

  9. Phenotypic effects of subclinical paratuberculosis (Johne's disease) in dairy cattle.

    PubMed

    Pritchard, Tracey C; Coffey, Mike P; Bond, Karen S; Hutchings, Mike R; Wall, Eileen

    2017-01-01

    The effect of subclinical paratuberculosis (or Johne's disease) risk status on performance, health, and fertility was studied in 58,096 UK Holstein-Friesian cows with 156,837 lactations across lactations 1 to 3. Low-, medium-, and high-risk group categories were allocated to cows determined by a minimum of 4 ELISA milk tests taken at any time during their lactating life. Lactation curves of daily milk, protein, and fat yields and protein and fat percentage, together with loge-transformed somatic cell count, were estimated using a random regression model to quantify differences between risk groups. The effect of subclinical paratuberculosis risk groups on fertility, lactation-average somatic cell count, and mastitis was analyzed using linear regression fitting risk group as a fixed effect. Milk yield losses associated with high-risk cows compared with low-risk cows in lactations 1, 2, and 3 for mean daily yield were 0.34, 1.05, and 1.61 kg; likewise, accumulated 305-d yields were 103, 316, and 485 kg, respectively. The total loss was 904 kg over the first 3 lactations. Protein and fat yield losses associated with high-risk cows were significant, but primarily a feature of decreasing milk yield. Similar trends were observed for both test-day and lactation-average somatic cell count measures with higher somatic cell counts from medium- and high-risk cows compared with low-risk cows, and differences were in almost all cases significant. Likewise, mastitis incidence was significantly higher in high-risk cows compared with low-risk cows in lactations 2 and 3. In contrast, the few significant differences between risk groups among fertility traits were inconsistent, with no clear trend. These results are expected to be conservative, as some animals that were considered negative may become positive after the timeframe of this study, particularly if the animal was tested when relatively young. 
However, the magnitude of milk yield losses together with higher somatic cell counts and an increase in mastitis incidence should motivate farmers to implement the appropriate control measures to reduce the spread of the disease. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  10. Gas detection by correlation spectroscopy employing a multimode diode laser.

    PubMed

    Lou, Xiutao; Somesfalean, Gabriel; Zhang, Zhiguo

    2008-05-01

    A gas sensor based on the gas-correlation technique has been developed using a multimode diode laser (MDL) in a dual-beam detection scheme. Measurement of CO(2) mixed with CO as an interfering gas is successfully demonstrated using a 1570 nm tunable MDL. Despite overlapping absorption spectra and occasional mode hops, the interfering signals can be effectively excluded by a statistical procedure including correlation analysis and outlier identification. The gas concentration is retrieved from several pair-correlated signals by a linear-regression scheme, yielding a reliable and accurate measurement. This demonstrates the utility of the unsophisticated MDLs as novel light sources for gas detection applications.

  11. Predicting apparent singlet oxygen quantum yields of dissolved black carbon and humic substances using spectroscopic indices.

    PubMed

    Du, Ziyan; He, Yingsheng; Fan, Jianing; Fu, Heyun; Zheng, Shourong; Xu, Zhaoyi; Qu, Xiaolei; Kong, Ao; Zhu, Dongqiang

    2018-03-01

    Dissolved black carbon (DBC) is ubiquitous in aquatic systems, being an important subgroup of the dissolved organic matter (DOM) pool. Nevertheless, its aquatic photoactivity remains largely unknown. In this study, a range of spectroscopic indices of DBC and humic substance (HS) samples were determined using UV-Vis spectroscopy, fluorescence spectroscopy, and proton nuclear magnetic resonance. DBC can be readily differentiated from HS using spectroscopic indices: it has lower average molecular weight, but higher aromaticity and lignin content. The apparent singlet oxygen quantum yield (Φ¹O₂) of DBC under simulated sunlight varies from 3.46% to 6.13%, significantly higher than that of HS (1.26%-3.57%), suggesting that DBC is the more photoactive component in the DOM pool. Despite drastically different formation processes and structural properties, the Φ¹O₂ of DBC and HS can be well predicted by the same simple linear regression models using optical indices, including the spectral slope coefficient (S275-295) and the absorbance ratio (E2/E3), which are proxies for the abundance of singlet oxygen sensitizers and for the significance of intramolecular charge transfer interactions, respectively. The regression models can potentially be used to assess the photoactivity of DOM at large scales with in situ water spectrophotometry or satellite remote sensing. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Predawn respiration rates during flowering are highly predictive of yield response in Gossypium hirsutum when yield variability is water-induced.

    PubMed

    Snider, John L; Chastain, Daryl R; Meeks, Calvin D; Collins, Guy D; Sorensen, Ronald B; Byrd, Seth A; Perry, Calvin D

    2015-07-01

    Respiratory carbon evolution by leaves under abiotic stress is implicated as a major limitation to crop productivity; however, respiration rates of fully expanded leaves are positively associated with plant growth rates. Given the substantial sensitivity of plant growth to drought, it was hypothesized that predawn respiration rates (RPD) would be (1) more sensitive to drought than photosynthetic processes and (2) highly predictive of water-induced yield variability in Gossypium hirsutum. Two studies (at Tifton and Camilla Georgia) addressed these hypotheses. At Tifton, drought was imposed beginning at the onset of flowering (first flower) and continuing for three weeks (peak bloom) followed by a recovery period, and predawn water potential (ΨPD), RPD, net photosynthesis (AN) and maximum quantum yield of photosystem II (Fv/Fm) were measured throughout the study period. At Camilla, plants were exposed to five different irrigation regimes throughout the growing season, and average ΨPD and RPD were determined between first flower and peak bloom for all treatments. For both sites, fiber yield was assessed at crop maturity. The relationships between ΨPD, RPD and yield were assessed via non-linear regression. It was concluded for field-grown G. hirsutum that (1) RPD is exceptionally sensitive to progressive drought (more so than AN or Fv/Fm) and (2) average RPD from first flower to peak bloom is highly predictive of water-induced yield variability. Copyright © 2015 Elsevier GmbH. All rights reserved.

  13. Predictors of mother and child DNA yields in buccal cell samples collected in pediatric cancer epidemiologic studies: a report from the Children’s Oncology group

    PubMed Central

    2013-01-01

    Background Collection of high-quality DNA is essential for molecular epidemiology studies. Methods have been evaluated for optimal DNA collection in studies of adults; however, DNA collection in young children poses additional challenges. Here, we have evaluated predictors of DNA quantity in buccal cells collected for population-based studies of infant leukemia (N = 489 mothers and 392 children) and hepatoblastoma (HB; N = 446 mothers and 412 children) conducted through the Children’s Oncology Group. DNA samples were collected by mail using mouthwash (for mothers and some children) and buccal brush (for children) collection kits and quantified using quantitative real-time PCR. Multivariable linear regression models were used to identify predictors of DNA yield. Results Median DNA yield was higher for mothers in both studies compared with their children (14 μg vs. <1 μg). Significant predictors of DNA yield in children included case–control status (β = −0.69, 50% reduction, P = 0.01 for case vs. control children), brush collection type, and season of sample collection. Demographic factors were not strong predictors of DNA yield in mothers or children in this analysis. Conclusions The association with seasonality suggests that conditions during transport may influence DNA yield. The low yields observed in most children in these studies highlight the importance of developing alternative methods for DNA collection in younger age groups. PMID:23937514

  14. Compound Identification Using Penalized Linear Regression on Metabolomics

    PubMed Central

    Liu, Ruiqi; Wu, Dongfeng; Zhang, Xiang; Kim, Seongho

    2014-01-01

    Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. Because the number of compounds in the reference library is much larger than the range of mass-to-charge ratio (m/z) values, the data are high dimensional and suffer from singularity. For this reason, penalized linear regressions such as ridge regression and the lasso are used instead of ordinary least squares regression. Furthermore, two-step approaches using the dot product and Pearson's correlation along with the penalized linear regression are proposed in this study. PMID:27212894
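
    The singularity problem described above, and the ridge remedy, can be illustrated with a minimal numpy sketch. The library matrix, the query spectrum, and the regularization strength are all synthetic assumptions, not the study's data:

```python
import numpy as np

# Hedged sketch with synthetic spectra: each column of the library matrix A is
# one reference compound's spectrum over the m/z bins. When the library holds
# more compounds than bins, A.T @ A is singular and ordinary least squares
# fails; ridge regression adds lam * I, which restores invertibility.
rng = np.random.default_rng(0)
n_bins, n_compounds = 150, 200
A = rng.standard_normal((n_bins, n_compounds))

# Query spectrum: a mixture of compounds 3 and 17 plus a little noise.
y = 1.0 * A[:, 3] + 0.5 * A[:, 17] + 0.01 * rng.standard_normal(n_bins)

lam = 0.1
beta = np.linalg.solve(A.T @ A + lam * np.eye(n_compounds), A.T @ y)

# The true contributors should carry the largest coefficients.
print(sorted(np.argsort(beta)[-2:]))
```

    The lasso variant would replace the closed-form solve with an L1-penalized fit, trading this one-line solution for sparsity in the recovered coefficients.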

  15. Age-specific changes in the regulation of LH-dependent testosterone secretion: assessing responsiveness to varying endogenous gonadotropin output in normal men.

    PubMed

    Liu, Peter Y; Takahashi, Paul Y; Roebuck, Pamela D; Iranmanesh, Ali; Veldhuis, Johannes D

    2005-09-01

    Pulsatile and thus total testosterone (Te) secretion declines in older men, albeit for unknown reasons. Analytical models forecast that aging may reduce the capability of endogenous luteinizing hormone (LH) pulses to stimulate Leydig cell steroidogenesis. This notion has been difficult to test experimentally. The present study used graded doses of a selective gonadotropin releasing hormone (GnRH)-receptor antagonist to yield four distinct strata of pulsatile LH release in each of 18 healthy men ages 23-72 yr. Deconvolution analysis was applied to frequently sampled LH and Te concentration time series to quantitate pulsatile Te secretion over a 16-h interval. Log-linear regression was used to relate pulsatile LH secretion to attendant pulsatile Te secretion (LH-Te drive) across the four stepwise interventions in each subject. Linear regression of the 18 individual estimates of LH-Te feedforward dose-response slopes on age disclosed a strongly negative relationship (r = -0.721, P < 0.001). Accordingly, the present data support the thesis that aging in healthy men attenuates amplitude-dependent LH drive of burst-like Te secretion. The experimental strategy of graded suppression of neuroglandular outflow may have utility in estimating dose-response adaptations in other endocrine systems.

  16. Plasma amino acid profile associated with fatty liver disease and co-occurrence of metabolic risk factors.

    PubMed

    Yamakado, Minoru; Tanaka, Takayuki; Nagao, Kenji; Imaizumi, Akira; Komatsu, Michiharu; Daimon, Takashi; Miyano, Hiroshi; Tani, Mizuki; Toda, Akiko; Yamamoto, Hiroshi; Horimoto, Katsuhisa; Ishizaka, Yuko

    2017-11-03

    Fatty liver disease (FLD) increases the risk of diabetes, cardiovascular disease, and steatohepatitis, which leads to fibrosis, cirrhosis, and hepatocellular carcinoma. Thus, early detection of FLD is necessary. We aimed to find a quantitative and feasible model for discriminating FLD based on plasma free amino acid (PFAA) profiles. We constructed models of the relationship between PFAA levels in 2,000 generally healthy Japanese subjects and the diagnosis of FLD by abdominal ultrasound scan, using multiple logistic regression analysis with variable selection. The performance of these models for FLD discrimination was validated using an independent data set of 2,160 subjects. The generated PFAA-based model was able to identify FLD patients. The area under the receiver operating characteristic curve for the model was 0.83, which was higher than those of other existing liver function-associated markers, which ranged from 0.53 to 0.80. The value of the linear discriminant in the model yielded an adjusted odds ratio (with 95% confidence interval) for a 1 standard deviation increase of 2.63 (2.14-3.25) in the multiple logistic regression analysis with known liver function-associated covariates. Interestingly, the linear discriminant values were significantly associated with the progression of FLD, and patients with nonalcoholic steatohepatitis also exhibited higher values.

  17. Patterns of shading tolerance determined from experimental ...

    EPA Pesticide Factsheets

    An extensive review of the experimental literature on seagrass shading evaluated the relationship between experimental light reductions, duration of experiment and seagrass response metrics to determine whether there were consistent statistical patterns. There were highly significant linear relationships of both percent biomass and percent shoot density reduction versus percent light reduction (versus controls), although unexplained variation in the data was high. Duration of exposure affected the extent of response for both metrics, but was more clearly a factor in biomass response. Both biomass and shoot density showed linear responses to duration of light reduction for treatments 60%. Unexplained variation was again high, and greater for shoot density than biomass. With few exceptions, regressions of both biomass and shoot density on light reduction for individual species and for genera were statistically significant, but also tended to show high degrees of variability in data. Multivariate regressions that included both percent light reduction and duration of reduction as predictor variables increased the percentage of variation explained in almost every case. Analysis of response data by seagrass life history category (Colonizing, Opportunistic, Persistent) did not yield clearly separate response relationships in most cases. Biomass tended to show somewhat less variation in response to light reduction than shoot density, and of the two, may be the prefe

  18. Control Variate Selection for Multiresponse Simulation.

    DTIC Science & Technology

    1987-05-01

    Neter, J., W. Wasserman, and M. H. Kutner, Applied Linear Regression Models, Richard D. Irwin, Inc., Homewood, Illinois, 1983. Neuts, Marcel F., Probability, Allyn and Bacon, 1982. Aspects of Multivariate Statistical Theory, John Wiley and Sons, New York, New York, 1982.

  19. An Investigation of the Fit of Linear Regression Models to Data from an SAT[R] Validity Study. Research Report 2011-3

    ERIC Educational Resources Information Center

    Kobrin, Jennifer L.; Sinharay, Sandip; Haberman, Shelby J.; Chajewski, Michael

    2011-01-01

    This study examined the adequacy of a multiple linear regression model for predicting first-year college grade point average (FYGPA) using SAT[R] scores and high school grade point average (HSGPA). A variety of techniques, both graphical and statistical, were used to examine if it is possible to improve on the linear regression model. The results…

  20. High correlations between MRI brain volume measurements based on NeuroQuant® and FreeSurfer.

    PubMed

    Ross, David E; Ochs, Alfred L; Tate, David F; Tokac, Umit; Seabaugh, John; Abildskov, Tracy J; Bigler, Erin D

    2018-05-30

    NeuroQuant ® (NQ) and FreeSurfer (FS) are commonly used computer-automated programs for measuring MRI brain volume. Previously they were reported to have high intermethod reliabilities but often large intermethod effect size differences. We hypothesized that linear transformations could be used to reduce the large effect sizes. This study was an extension of our previously reported study. We performed NQ and FS brain volume measurements on 60 subjects (including normal controls, patients with traumatic brain injury, and patients with Alzheimer's disease). We used two statistical approaches in parallel to develop methods for transforming FS volumes into NQ volumes: traditional linear regression, and Bayesian linear regression. For both methods, we used regression analyses to develop linear transformations of the FS volumes to make them more similar to the NQ volumes. The FS-to-NQ transformations based on traditional linear regression resulted in effect sizes which were small to moderate. The transformations based on Bayesian linear regression resulted in all effect sizes being trivially small. To our knowledge, this is the first report describing a method for transforming FS to NQ data so as to achieve high reliability and low effect size differences. Machine learning methods like Bayesian regression may be more useful than traditional methods. Copyright © 2018 Elsevier B.V. All rights reserved.
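
    The core of the traditional-regression approach in the abstract — mapping one method's volumes onto another's with a fitted linear transformation, which collapses the between-method effect size — can be sketched as follows. The volumes, offsets, and noise levels are synthetic assumptions, not NQ or FS output:

```python
import numpy as np

# Hedged sketch with synthetic volumes: two methods that differ by a constant
# and a proportional offset can be reconciled by regressing one method's
# volumes on the other's and applying the fitted linear transformation,
# after which the paired effect size between methods collapses.
rng = np.random.default_rng(1)
true_vol = rng.uniform(3.0, 8.0, 60)                        # e.g. volumes in mL
fs = true_vol + 0.2 * rng.standard_normal(60)               # "FS-like" measure
nq = 0.85 * true_vol - 0.8 + 0.2 * rng.standard_normal(60)  # "NQ-like" measure

# Fit nq ≈ slope * fs + intercept by ordinary least squares.
X = np.column_stack([fs, np.ones_like(fs)])
(slope, intercept), *_ = np.linalg.lstsq(X, nq, rcond=None)
fs_to_nq = slope * fs + intercept

def cohens_d(a, b):
    """Paired Cohen's d: mean difference over SD of the differences."""
    diff = a - b
    return diff.mean() / diff.std(ddof=1)

print(round(abs(cohens_d(fs, nq)), 2), round(abs(cohens_d(fs_to_nq, nq)), 2))
```

    The Bayesian variant in the abstract would replace the least-squares fit with a posterior over slope and intercept, shrinking the transformation toward a prior when data are sparse.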

  1. Quantile Regression in the Study of Developmental Sciences

    PubMed Central

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596
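
    The contrast the abstract draws can be seen in a minimal numpy-only sketch: quantile regression minimizes the pinball (check) loss instead of squared error, so different quantiles of a heteroscedastic outcome get different slopes. The data are synthetic, and the coarse grid search stands in for a proper quantile-regression solver:

```python
import numpy as np

# Synthetic heteroscedastic data: the outcome's spread grows with x, so the
# 0.1 and 0.9 conditional quantiles have different slopes even though the
# conditional mean is a single line.
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 400)
y = 2.0 * x + (0.2 + 0.3 * x) * rng.standard_normal(400)

def pinball(residual, q):
    # Check loss: q * r for r >= 0, (q - 1) * r for r < 0 (both nonnegative).
    return np.where(residual >= 0, q * residual, (q - 1) * residual).sum()

def fit_quantile(x, y, q, slopes, intercepts):
    # Brute-force grid search over (slope, intercept) for the q-th quantile.
    best = min(((pinball(y - (a * x + b), q), a, b)
                for a in slopes for b in intercepts))
    return best[1], best[2]

slopes = np.arange(0.0, 4.01, 0.05)
intercepts = np.arange(-3.0, 3.01, 0.1)
slope_lo, _ = fit_quantile(x, y, 0.1, slopes, intercepts)
slope_hi, _ = fit_quantile(x, y, 0.9, slopes, intercepts)
print(slope_lo, slope_hi)  # the 0.9-quantile line is steeper than the 0.1 line
```

    A mean (least-squares) regression on these data would report a single slope near 2 and miss the fanning-out of the distribution that quantile regression exposes.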

  2. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
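
    The Bland-Altman computation at the heart of the abstract reduces to a few lines: agreement between two assays is summarized by the mean difference (bias, capturing constant error) and the 95% limits of agreement. The variant-allele-fraction data below are synthetic, not NGS output:

```python
import numpy as np

# Hedged sketch: two assays measuring the same quantity, where assay B has a
# constant bias of +0.02. Bland-Altman analysis plots the pairwise difference
# against the pairwise mean and reports bias ± 1.96 SD as limits of agreement.
rng = np.random.default_rng(3)
truth = rng.uniform(0.05, 0.5, 40)                       # e.g. allele fractions
assay_a = truth + 0.01 * rng.standard_normal(40)
assay_b = truth + 0.02 + 0.01 * rng.standard_normal(40)  # constant error

mean_pair = (assay_a + assay_b) / 2   # x-axis of the Bland-Altman plot
diff = assay_a - assay_b              # y-axis of the Bland-Altman plot
bias = diff.mean()
sd = diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)
print(f"bias={bias:.4f}, 95% limits of agreement={loa[0]:.4f} to {loa[1]:.4f}")
```

    A proportional error would instead appear as a trend of `diff` against `mean_pair`, which is where the Deming and simple linear regression steps of the abstract come in.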

  3. [Nitrogen status diagnosis and yield prediction of spring maize after green manure incorporation by using a digital camera].

    PubMed

    Bai, Jin-Shun; Cao, Wei-Dong; Xiong, Jing; Zeng, Nao-Hua; Shimizu, Katshyoshi; Rui, Yu-Kui

    2013-12-01

    In order to explore the feasibility of using image processing technology to diagnose nitrogen status and to predict maize yield, a field experiment with different nitrogen rates and green manure incorporation was conducted. Maize canopy digital images over a range of growth stages were captured by digital camera. Maize nitrogen status and the relationships between image color indices derived from the digital camera at different growth stages and maize nitrogen status indicators were analyzed. These image color indices at different growth stages were also regressed against maize grain yield at maturity. The results showed that plant nitrogen status of maize was improved by green manure application. The leaf chlorophyll content (SPAD value), aboveground biomass and nitrogen uptake for green manure treatments at different maize growth stages were all higher than those for chemical fertilization treatments. The correlations between spectral indices and plant nitrogen indicators for maize under green manure application were weaker than those under chemical fertilization, and the correlation coefficients for green manure application varied with maize growth stage. The best spectral indices for diagnosis of plant nitrogen status after green manure incorporation were the normalized blue value (B/(R+G+B)) at the 12-leaf (V12) stage and the normalized red value (R/(R+G+B)) at the grain-filling (R4) stage. The coefficients of determination based on linear regression were 0.45 for B/(R+G+B) at the V12 stage and 0.46 for R/(R+G+B) at the R4 stage when used as predictors of maize yield response to nitrogen after green manure incorporation. 
    Our findings suggest that the digital image technique could be a potential tool for in-season prediction of nitrogen status and grain yield of maize after green manure incorporation, provided that suitable growth stages and spectral indices for diagnosis are selected.
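
    The index-then-regress pipeline above is simple to reproduce: the normalized color indices are ratios of per-plot channel means, and yield is regressed on the index by ordinary least squares. The canopy colors, nitrogen levels, and yields below are invented for illustration, not the study's measurements:

```python
import numpy as np

# Hedged sketch with synthetic canopy colors: higher-N (greener) canopies are
# constructed to have a lower normalized red value R/(R+G+B), and yield is
# then regressed on that index with OLS, reporting R^2.
rng = np.random.default_rng(4)
n_plots = 24
nitrogen = rng.uniform(0, 1, n_plots)  # relative N status, synthetic
r = 80 - 30 * nitrogen + 2 * rng.standard_normal(n_plots)  # mean red channel
g = 120 + 20 * nitrogen + 2 * rng.standard_normal(n_plots) # mean green channel
b = 60 + 2 * rng.standard_normal(n_plots)                  # mean blue channel
norm_red = r / (r + g + b)

yield_t_ha = 6.0 + 3.0 * nitrogen + 0.3 * rng.standard_normal(n_plots)

# OLS of yield on the normalized red index; report the coefficient of
# determination, the quantity the abstract cites (0.45-0.46).
X = np.column_stack([norm_red, np.ones(n_plots)])
coef, *_ = np.linalg.lstsq(X, yield_t_ha, rcond=None)
pred = X @ coef
r2 = 1 - ((yield_t_ha - pred) ** 2).sum() / ((yield_t_ha - yield_t_ha.mean()) ** 2).sum()
print(round(r2, 2))
```

    With real field images, the channel means would come from segmented canopy pixels rather than whole frames, which is one reason the study's R² values are more modest than this clean synthetic case.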

  4. A SEMIPARAMETRIC BAYESIAN MODEL FOR CIRCULAR-LINEAR REGRESSION

    EPA Science Inventory

    We present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is usefu...

  5. The effects of heat stress in Italian Holstein dairy cattle.

    PubMed

    Bernabucci, U; Biffani, S; Buggiotti, L; Vitali, A; Lacetera, N; Nardone, A

    2014-01-01

    The data set for this study comprised 1,488,474 test-day records for milk, fat, and protein yields and fat and protein percentages from 191,012 first-, second-, and third-parity Holstein cows from 484 farms. Data were collected from 2001 through 2007 and merged with meteorological data from 35 weather stations. A linear model (M1) was used to estimate the effects of the temperature-humidity index (THI) on production traits. Least squares means from M1 were used to detect the THI thresholds for milk production in all parities by using a 2-phase linear regression procedure (M2). A multiple-trait repeatability test-day model (M3) was used to estimate variance components for all traits, and a dummy regression variable (t) was defined to estimate the production decline caused by heat stress. Additionally, the estimated variance components and M3 were used to estimate traditional and heat-tolerance breeding values (estimated breeding values, EBV) for milk yield and protein percentage at parity 1. An analysis of data (M2) indicated that the daily THI at which milk production started to decline for the 3 parities and traits ranged from 65 to 76. These THI values can be reached by different temperature/humidity combinations, with temperatures ranging from 21 to 36°C and relative humidity from 5 to 95%. The highest negative effect of THI was observed 4 d before test day over the 3 parities for all traits. The negative effect of THI on production traits indicates that first-parity cows are less sensitive to heat stress than multiparous cows. Over the parities, the general additive genetic variance decreased for protein content and increased for milk yield and fat and protein yield. Additive genetic variance for heat tolerance showed an increase from the first to third parity for milk, protein, and fat yield, and for protein percentage. Genetic correlations between general and heat stress effects were all unfavorable (from -0.24 to -0.56). 
Three EBV per trait were calculated for each cow and bull (traditional EBV, traditional EBV estimated with the inclusion of THI covariate effect, and heat tolerance EBV) and the rankings of EBV for 283 bulls born after 1985 with at least 50 daughters were compared. When THI was included in the model, the ranking for 17 and 32 bulls changed for milk yield and protein percentage, respectively. The heat tolerance genetic component is not negligible, suggesting that heat tolerance selection should be included in the selection objectives. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
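
    One simple way to implement a 2-phase linear regression of the kind used to locate the THI thresholds is a broken-stick fit: a flat plateau up to an unknown breakpoint, a linear decline beyond it, with the breakpoint chosen by scanning candidates for the lowest residual sum of squares. The test-day means below are synthetic, not the study's data:

```python
import numpy as np

# Hedged sketch: milk yield is flat until a THI threshold (here 68 by
# construction) and declines linearly afterwards. The model
# y = a + b * max(THI - bp, 0) is fitted for each candidate breakpoint bp,
# and the bp with the smallest RSS is the estimated threshold.
rng = np.random.default_rng(5)
thi = np.arange(40, 85)
true_threshold = 68
milk = 30.0 - 0.25 * np.clip(thi - true_threshold, 0, None) \
       + 0.15 * rng.standard_normal(thi.size)

def fit_broken_stick(thi, y, bp):
    X = np.column_stack([np.ones_like(thi, dtype=float),
                         np.clip(thi - bp, 0, None)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = ((y - X @ coef) ** 2).sum()
    return rss, coef

candidates = range(55, 80)
rss_bp = [(fit_broken_stick(thi, milk, bp)[0], bp) for bp in candidates]
best_bp = min(rss_bp)[1]
print(best_bp)  # estimated THI threshold
```

    With clean data the scan recovers the constructed threshold closely; on real least-squares means the RSS profile is flatter near the optimum, which is why the study reports a range of thresholds (65 to 76) across parities and traits.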

  6. Pseudo second order kinetics and pseudo isotherms for malachite green onto activated carbon: comparison of linear and non-linear regression methods.

    PubMed

    Kumar, K Vasanth; Sivanesan, S

    2006-08-25

    Pseudo second order kinetic expressions of Ho, Sobkowski and Czerwinski, Blanchard et al. and Ritchie were fitted to the experimental kinetic data of malachite green onto activated carbon by non-linear and linear methods. The non-linear method was found to be a better way of obtaining the parameters involved in the second order rate kinetic expressions. Both linear and non-linear regression showed that the Sobkowski and Czerwinski and Ritchie pseudo second order models were the same. Non-linear regression analysis showed that both Blanchard et al. and Ho have similar ideas on the pseudo second order model but with different assumptions. The best fit of experimental data in Ho's pseudo second order expression by linear and non-linear regression methods showed that the Ho pseudo second order model was a better kinetic expression when compared to other pseudo second order kinetic expressions. The amount of dye adsorbed at equilibrium, q(e), was predicted from Ho's pseudo second order expression and fitted to the Langmuir, Freundlich and Redlich-Peterson expressions by both linear and non-linear methods to obtain the pseudo isotherms. The best fitting pseudo isotherms were found to be the Langmuir and Redlich-Peterson isotherms. Redlich-Peterson is a special case of Langmuir when the constant g equals unity.
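
    Ho's pseudo second order model q_t = k·qe²·t / (1 + k·qe·t) has the linear form t/q_t = 1/(k·qe²) + t/qe, so qe and k can be read off the slope and intercept of t/q_t versus t. The sketch below uses synthetic uptake data and shows only the linearized route; as the abstract notes, non-linear fitting of the same expression is generally preferable:

```python
import numpy as np

# Hedged sketch: generate uptake data from Ho's pseudo second order model with
# 1% multiplicative noise, then recover qe and k from the linearized form
# t/q_t = 1/(k*qe^2) + (1/qe)*t fitted by ordinary least squares (polyfit).
rng = np.random.default_rng(6)
qe_true, k_true = 50.0, 0.002          # mg/g and g/(mg*min), synthetic values
t = np.arange(5, 185, 5, dtype=float)  # contact time, min
qt = qe_true**2 * k_true * t / (1 + qe_true * k_true * t)
qt = qt * (1 + 0.01 * rng.standard_normal(t.size))

slope, intercept = np.polyfit(t, t / qt, 1)
qe_fit = 1 / slope                     # slope = 1/qe
k_fit = 1 / (intercept * qe_fit**2)    # intercept = 1/(k*qe^2)
print(round(qe_fit, 1), round(k_fit, 5))
```

    The linearization distorts the error structure (noise in q_t is not constant in t/q_t), which is exactly why the paper finds non-linear regression the better way to obtain the parameters.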

  7. A slope-ratio precision-fed rooster assay for determination of relative metabolizable energy values for fats and oils.

    PubMed

    Aardsma, M P; Parsons, C M

    2017-01-01

    The precision-fed rooster assay (PFRA) frequently yields TME n values for fats and oils in excess of their gross energies. Six experiments were conducted to determine if the PFRA could be combined with a slope-ratio type assay to yield more useful lipid TME n values. In experiment (EXP) 1, refined corn oil (RCO) was fed to conventional and cecectomized roosters at zero, 5, 10, 15, and 20% of a ground corn diet. In EXP 2 through 6, lipids were fed to conventional roosters at zero, 5, and 10% in a ground corn diet. Palomys (a novel lipid), high stearidonic acid soybean oil (SDASO), 2 animal-vegetable blends (AV1, AV2), a vegetable-based oil blend (VB), and corn oil from an ethanol plant (DDGSCO) were evaluated and compared to refined soybean oil (RSO) or RCO as the reference lipid. Multiple linear regression of diet TME n on supplemental lipid level generated regression coefficients that were used to calculate relative bioavailability values (RBV). In EXP 1, RCO was a suitable reference material, as TME n linearly increased up to 20% RCO inclusion. There were some minor differences in TME n of RCO between conventional and cecectomized bird types. In EXP 2, Palomys was found to have a lower (P < 0.05) RBV (87%) than RCO. In EXP 3, there were no significant differences between SDASO and RSO. In EXP 4, the RBV of AV2 (79%) was lower (P < 0.05) than RCO, while the RBV of AV1 was not different from RCO. The RBV of DDGSCO (116%) was higher (P < 0.05) than RCO in EXP 5. The RBV of VB (84%) was lower (P < 0.001) than RCO in EXP 6; however, this may be an underestimation for low levels of VB, as there was an interaction (P < 0.01) between lipid type and lipid supplementation level. These results indicate that the precision-fed slope-ratio rooster assay can detect differences among lipids and yields practically useful lipid TME n values. © 2016 Poultry Science Association Inc.
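
    The slope-ratio calculation the abstract describes reduces to two regressions and one division: diet energy is regressed on inclusion level for the test and reference lipids, and RBV is the ratio of the slopes. The diet energies and slopes below are invented for illustration, not the study's measurements:

```python
import numpy as np

# Hedged sketch with synthetic diet TMEn values: the reference and test lipids
# share the zero-inclusion basal diet; each lipid's energy contribution is the
# slope of diet energy on inclusion level, and RBV is the slope ratio.
rng = np.random.default_rng(7)
levels = np.array([0.0, 0.05, 0.10] * 10)  # proportion of diet, 10 birds/level
ref_diet = 3300 + 8800 * levels + 30 * rng.standard_normal(levels.size)
test_diet = 3300 + 7000 * levels + 30 * rng.standard_normal(levels.size)

slope_ref = np.polyfit(levels, ref_diet, 1)[0]
slope_test = np.polyfit(levels, test_diet, 1)[0]
rbv = slope_test / slope_ref
print(round(100 * rbv, 1))  # RBV as a percentage of the reference lipid
```

    Expressing the test lipid relative to a reference fed in the same assay is what keeps the slope-ratio RBV below the gross-energy ceiling that plain PFRA TMEn values can exceed.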

  8. Comparison of Neural Network and Linear Regression Models in Statistically Predicting Mental and Physical Health Status of Breast Cancer Survivors

    DTIC Science & Technology

    2015-07-15

    Long-term effects on cancer survivors' quality of life of physical training versus physical training combined with cognitive-behavioral therapy.

  9. Prediction of the Main Engine Power of a New Container Ship at the Preliminary Design Stage

    NASA Astrophysics Data System (ADS)

    Cepowski, Tomasz

    2017-06-01

    The paper presents mathematical relationships that allow the main engine power of new container ships to be estimated, based on data concerning vessels built in 2005-2015. The presented approximations estimate the engine power from the length between perpendiculars and the number of containers the ship will carry. The approximations were developed using simple linear regression and multivariate linear regression analysis. The presented relations have practical application for estimating the container ship engine power needed in preliminary parametric design of the ship. The results show that multiple linear regression predicts the main engine power of a container ship more accurately than simple linear regression.
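
    The multivariate approach can be sketched as a regression P ≈ b0 + b1·LBP + b2·TEU. The fleet below is synthetic (invented lengths, capacities, and powers), so the fitted coefficients are illustrative only, not the paper's approximation formulas:

```python
import numpy as np

# Hedged sketch with synthetic ships: fit main engine power (kW) on length
# between perpendiculars (LBP, m) and container capacity (TEU) by ordinary
# least squares, and report the coefficient of determination.
rng = np.random.default_rng(8)
n_ships = 40
lbp = rng.uniform(150, 380, n_ships)
teu = 12 * lbp + rng.uniform(-800, 800, n_ships)   # rough capacity vs length
power = -5000 + 60 * lbp + 3.0 * teu + 1500 * rng.standard_normal(n_ships)

X = np.column_stack([np.ones(n_ships), lbp, teu])
coef, *_ = np.linalg.lstsq(X, power, rcond=None)
pred = X @ coef
r2 = 1 - ((power - pred) ** 2).sum() / ((power - power.mean()) ** 2).sum()
print(np.round(coef, 1), round(r2, 3))
```

    A simple linear regression would use only one of the two predictors; because LBP and TEU each carry independent information about required power, the two-predictor fit explains more variance, mirroring the paper's conclusion.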

  10. Estimation of Standard Error of Regression Effects in Latent Regression Models Using Binder's Linearization. Research Report. ETS RR-07-09

    ERIC Educational Resources Information Center

    Li, Deping; Oranje, Andreas

    2007-01-01

    Two versions of a general method for approximating standard error of regression effect estimates within an IRT-based latent regression model are compared. The general method is based on Binder's (1983) approach, accounting for complex samples and finite populations by Taylor series linearization. In contrast, the current National Assessment of…

  11. Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets.

    PubMed

    Marchese Robinson, Richard L; Palczewska, Anna; Palczewski, Jan; Kidley, Nathan

    2017-08-28

    The ability to interpret the predictions made by quantitative structure-activity relationships (QSARs) offers a number of advantages. While QSARs built using nonlinear modeling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modeling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting nonlinear QSAR models in general and Random Forest in particular. In the current work, we compare the performance of Random Forest to those of two widely used linear modeling approaches: linear Support Vector Machines (SVMs) (or Support Vector Regression (SVR)) and partial least-squares (PLS). We compare their performance in terms of their predictivity as well as the chemical interpretability of the predictions using novel scoring schemes for assessing heat map images of substructural contributions. We critically assess different approaches for interpreting Random Forest models as well as for obtaining predictions from the forest. We assess the models on a large number of widely employed public-domain benchmark data sets corresponding to regression and binary classification problems of relevance to hit identification and toxicology. We conclude that Random Forest typically yields comparable or possibly better predictive performance than the linear modeling approaches and that its predictions may also be interpreted in a chemically and biologically meaningful way. In contrast to earlier work looking at interpretation of nonlinear QSAR models, we directly compare two methodologically distinct approaches for interpreting Random Forest models. The approaches for interpreting Random Forest assessed in our article were implemented using open-source programs that we have made available to the community. 
These programs are the rfFC package ( https://r-forge.r-project.org/R/?group_id=1725 ) for the R statistical programming language and the Python program HeatMapWrapper [ https://doi.org/10.5281/zenodo.495163 ] for heat map generation.

  12. Age-related energy values of bakery meal for broiler chickens determined using the regression method.

    PubMed

    Stefanello, C; Vieira, S L; Xue, P; Ajuwon, K M; Adeola, O

    2016-07-01

    A study was conducted to determine the ileal digestible energy (IDE), ME, and MEn contents of bakery meal using the regression method and to evaluate whether the energy values are age-dependent in broiler chickens from zero to 21 d post hatching. Seven hundred and eighty male Ross 708 chicks were fed 3 experimental diets in which bakery meal was incorporated into a corn-soybean meal-based reference diet at zero, 100, or 200 g/kg by replacing the energy-yielding ingredients. A 3 × 3 factorial arrangement of 3 ages (1, 2, or 3 wk) and 3 dietary bakery meal levels was used; birds were fed the same experimental diets at each of the 3 ages. Birds were grouped by weight into 10 replicates per treatment in a randomized complete block design. Apparent ileal digestibility and total tract retention of DM, N, and energy were calculated. Expression of mucin (MUC2), sodium-dependent phosphate transporter (NaPi-IIb), solute carrier family 7 (cationic amino acid transporter, Y(+) system, SLC7A2), glucose (GLUT2), and sodium-glucose linked transporter (SGLT1) genes was measured at each age in the jejunum by real-time PCR. Addition of bakery meal to the reference diet resulted in a linear decrease in retention of DM, N, and energy, and a quadratic reduction (P < 0.05) in N retention and ME. There was a linear increase in retention of DM, N, and energy as bird age increased from 1 to 3 wk. Dietary bakery meal did not affect jejunal gene expression. Expression of genes encoding MUC2, NaPi-IIb, and SLC7A2 linearly increased (P < 0.05) with age. Regression-derived MEn of bakery meal linearly increased (P < 0.05) with the age of birds, with values of 2,710, 2,820, and 2,923 kcal/kg DM for 1, 2, and 3 wk, respectively. Based on these results, utilization of energy and nitrogen in the basal diet decreased when bakery meal was included and increased with age of broiler chickens. © 2016 Poultry Science Association Inc.

  13. Development of LACIE CCEA-1 weather/wheat yield models. [regression analysis

    NASA Technical Reports Server (NTRS)

    Strommen, N. D.; Sakamoto, C. M.; Leduc, S. K.; Umberger, D. E. (Principal Investigator)

    1979-01-01

The advantages and disadvantages of the causal (phenological, dynamic, physiological), statistical regression, and analog approaches to modeling grain yield are examined. Given LACIE's primary goal of estimating wheat production for the large areas of eight major wheat-growing regions, the statistical regression approach of correlating historical yield and climate data offered the Center for Climatic and Environmental Assessment the greatest potential return within the constraints of time and data sources. The basic equation for the first-generation wheat-yield model is given. Topics discussed include truncation, trend variable, selection of weather variables, episodic events, strata selection, operational data flow, weighting, and model results.

  14. Regression assumptions in clinical psychology research practice-a systematic review of common misconceptions.

    PubMed

    Ernst, Anja F; Albers, Casper J

    2017-01-01

Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. They lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated the employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongly held to be a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA recommendations. This paper appeals for heightened awareness of, and increased transparency in, the reporting of statistical assumption checking.
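The record's central point, that the normality assumption concerns the errors and not the variables, can be illustrated with a minimal synthetic sketch (all data and names below are hypothetical, not from the reviewed papers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: x is strongly skewed, but the *errors* are normal,
# so the normality assumption of linear regression is satisfied.
x = rng.exponential(scale=2.0, size=500)          # skewed predictor
y = 1.5 + 0.8 * x + rng.normal(0.0, 0.3, 500)     # normal errors

# Fit ordinary least squares and inspect the residuals, not the variables.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# Sample skewness: residuals should be near-symmetric around zero even
# though x (and hence y) is strongly right-skewed.
def skewness(v):
    v = v - v.mean()
    return (v ** 3).mean() / (v ** 2).mean() ** 1.5

print(f"skew of x: {skewness(x):+.2f}")
print(f"skew of residuals: {skewness(residuals):+.2f}")
```

Testing the variables for normality here would wrongly flag the model, while the residuals pass.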

  15. Regression assumptions in clinical psychology research practice—a systematic review of common misconceptions

    PubMed Central

    Ernst, Anja F.

    2017-01-01

Misconceptions about the assumptions behind the standard linear regression model are widespread and dangerous. They lead to using linear regression when inappropriate, and to employing alternative procedures with less statistical power when unnecessary. Our systematic literature review investigated the employment and reporting of assumption checks in twelve clinical psychology journals. Findings indicate that normality of the variables themselves, rather than of the errors, was wrongly held to be a necessary assumption in 4% of papers that use regression. Furthermore, 92% of all papers using linear regression were unclear about their assumption checks, violating APA recommendations. This paper appeals for heightened awareness of, and increased transparency in, the reporting of statistical assumption checking. PMID:28533971

  16. Estimating linear temporal trends from aggregated environmental monitoring data

    USGS Publications Warehouse

    Erickson, Richard A.; Gray, Brian R.; Eager, Eric A.

    2017-01-01

Trend estimates are often used as part of environmental monitoring programs. These trends inform managers (e.g., are desired species increasing or undesired species decreasing?). Data collected from environmental monitoring programs are often aggregated (i.e., averaged), which confounds sampling and process variation. State-space models allow sampling and process variation to be separated. We used simulated time series to compare linear trend estimates from three state-space models, a simple linear regression model, and an autoregressive model. We also compared the performance of these five models in estimating trends from a long-term monitoring program. Specifically, we estimated trends for two species of fish and four species of aquatic vegetation from the Upper Mississippi River system. We found that the simple linear regression performed best of all the models considered because it was best able to recover parameters and converged consistently. Conversely, the simple linear regression did the worst job of estimating populations in a given year. The state-space models did not estimate trends well, but estimated population sizes best when they converged. Overall, a simple linear regression performed better than the more complex autoregressive and state-space models when used to analyze aggregated environmental monitoring data.
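The simplest estimator compared in this record, a linear regression of aggregated values on time, can be sketched with synthetic data (the trend, noise level, and years are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical aggregated monitoring series: a true linear trend plus
# noise (aggregation mixes process and sampling variation).
years = np.arange(2000, 2020)
true_trend = 0.5
series = 10.0 + true_trend * (years - years[0]) + rng.normal(0, 1.0, years.size)

# Simple linear regression of the aggregated values on time.
slope, intercept = np.polyfit(years - years[0], series, 1)
print(f"estimated trend: {slope:.2f} per year (true {true_trend})")
```

A state-space model would additionally separate the two variance components, at the cost of the convergence issues the abstract describes.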

  17. Comparing The Effectiveness of a90/95 Calculations (Preprint)

    DTIC Science & Technology

    2006-09-01

Nachtsheim, John Neter, William Li, Applied Linear Statistical Models, 5th ed., McGraw-Hill/Irwin, 2005 5. Mood, Graybill and Boes, Introduction...curves is based on methods that are only valid for ordinary linear regression. Requirements for a valid Ordinary Least-Squares Regression Model There...linear. For example is a linear model; is not. 2. Uniform variance (homoscedasticity

  18. Correlation and simple linear regression.

    PubMed

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
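The two coefficients compared in this tutorial record can be sketched directly (synthetic data; Spearman's rho is simply Pearson's r computed on ranks):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical predictor/outcome pair with a monotone but non-linear link.
x = rng.uniform(0, 3, 200)
y = np.exp(x) + rng.normal(0, 0.5, 200)

def pearson(a, b):
    a, b = a - a.mean(), b - b.mean()
    return (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())

def spearman(a, b):
    # Spearman's rho is Pearson's r computed on the ranks (no ties here).
    rank = lambda v: v.argsort().argsort().astype(float)
    return pearson(rank(a), rank(b))

r = pearson(x, y)
rho = spearman(x, y)
slope, intercept = np.polyfit(x, y, 1)   # simple linear regression fit
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}, slope = {slope:.2f}")
```

For a monotone non-linear relationship like this, the rank-based rho typically captures the association better than the linear r.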

  19. Assessing disease stress and modeling yield losses in alfalfa

    NASA Astrophysics Data System (ADS)

    Guan, Jie

Alfalfa is the most important forage crop in the U.S. and worldwide. Fungal foliar diseases are believed to cause significant yield losses in alfalfa, yet little quantitative information exists regarding the amount of crop loss. Different fungicides and application frequencies were used as tools to generate a range of foliar disease intensities in Ames and Nashua, IA. Visual disease assessments (disease incidence, disease severity, and percentage defoliation) were obtained weekly for each alfalfa growth cycle (two to three growing cycles per season). Remote sensing assessments were performed using a hand-held, multispectral radiometer to measure the amount and quality of sunlight reflected from alfalfa canopies. Factors such as incident radiation, sun angle, sensor height, and leaf wetness were all found to significantly affect the percentage of sunlight reflected from alfalfa canopies. The precision of the visual and remote sensing assessment methods was quantified. Precision was defined as the intra-rater repeatability and inter-rater reliability of the assessment methods. F-tests, slopes, intercepts, and coefficients of determination (R²) were used to compare the assessment methods for precision. Results showed that among the three visual disease assessment methods (disease incidence, disease severity, and percentage defoliation), percentage defoliation had the highest intra-rater repeatability and inter-rater reliability. The remote sensing assessment method had better precision than the percentage defoliation assessment method, based upon higher intra-rater repeatability and inter-rater reliability. Significant linear relationships between canopy reflectance (810 nm), percentage defoliation, and yield were detected using linear regression, and percentage reflectance (810 nm) assessments were found to have a stronger relationship with yield than percentage defoliation assessments. There were also significant linear relationships between percentage defoliation, dry weight, percentage reflectance (810 nm), and green leaf area index (GLAI). Percentage reflectance (810 nm) assessments had a stronger relationship with dry weight and green leaf area index than percentage defoliation assessments. Our research demonstrates that percentage reflectance measurements can be used to nondestructively assess green leaf area index, which is a direct measure of plant health and an indirect measure of productivity. It also demonstrates that remote sensing is superior to visual assessment for assessing alfalfa stress and for modeling yield and GLAI in the alfalfa foliar disease pathosystem.

  20. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    PubMed

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
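The core Bland-Altman computation (bias and 95% limits of agreement) is straightforward; the sketch below uses synthetic paired measurements with a deliberate constant error, not data from the study:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical paired measurements from two assays with a constant bias.
# Bland-Altman examines the differences against the means of the pairs.
truth = rng.uniform(5, 50, 100)
assay_a = truth + rng.normal(0, 1.0, 100)
assay_b = truth + 2.0 + rng.normal(0, 1.0, 100)   # constant error of +2

diffs = assay_b - assay_a
means = (assay_a + assay_b) / 2                   # plotted on x in practice

bias = diffs.mean()                               # estimated constant error
loa = (bias - 1.96 * diffs.std(ddof=1),           # 95% limits of agreement
       bias + 1.96 * diffs.std(ddof=1))
print(f"bias = {bias:.2f}, limits of agreement = ({loa[0]:.2f}, {loa[1]:.2f})")
```

A proportional error would instead show up as a trend of `diffs` against `means`, which is where Deming or simple linear regression complements the plot, as the abstract notes.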

  1. How many stakes are required to measure the mass balance of a glacier?

    USGS Publications Warehouse

    Fountain, A.G.; Vecchia, A.

    1999-01-01

Glacier mass balance is estimated for South Cascade Glacier and Maclure Glacier using a one-dimensional regression of mass balance with altitude as an alternative to the traditional approach of contouring mass balance values. One attractive feature of regression is that it can be applied to sparse data sets where contouring is not possible, and it can provide an objective error for the resulting estimate. Regression methods yielded mass balance values equivalent to contouring methods. The effect of the number of mass balance measurements on the final value for the glacier showed that sample sizes as small as five stakes provided reasonable estimates, although the error estimates were greater than for larger sample sizes. Different spatial patterns of measurement locations showed no appreciable influence on the final value as long as different surface altitudes were intermittently sampled over the altitude range of the glacier. Two different regression equations were examined, a quadratic and a piecewise linear spline, and comparison of the results showed little sensitivity to the type of equation. These results point to the dominant effect of the gradient of mass balance with altitude on alpine glaciers compared to transverse variations. The number of mass balance measurements required to determine the glacier balance appears to be scale invariant for small glaciers, and five to ten stakes are sufficient.
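The balance-versus-altitude regression can be sketched with five hypothetical stakes (all altitudes, balances, and band areas below are invented; the quadratic form is one of the two equations the record compares):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stake data: mass balance (m w.e.) increasing with altitude.
altitude = np.array([1600., 1700., 1800., 1900., 2000.])   # five stakes
balance = -4.0 + 0.006 * (altitude - 1600) + rng.normal(0, 0.1, 5)

# Quadratic regression of balance on altitude.
coeffs = np.polyfit(altitude, balance, 2)
fitted = np.polyval(coeffs, altitude)

# Glacier-wide balance: area-weighted mean of the fitted curve over
# hypothetical altitude-band area fractions.
band_area = np.array([0.1, 0.3, 0.3, 0.2, 0.1])
glacier_balance = (fitted * band_area).sum() / band_area.sum()
print(f"glacier-wide balance ~ {glacier_balance:.2f} m w.e.")
```

The residuals of such a fit give the objective error estimate that contouring cannot provide.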

  2. Satellite remote sensing of fine particulate air pollutants over Indian mega cities

    NASA Astrophysics Data System (ADS)

    Sreekanth, V.; Mahesh, B.; Niranjan, K.

    2017-11-01

In the backdrop of the need for high spatio-temporal resolution data on PM2.5 mass concentrations for health and epidemiological studies over India, empirical relations between Aerosol Optical Depth (AOD) and PM2.5 mass concentrations are established over five Indian mega cities. These relations can then be used to predict surface PM2.5 mass concentrations from high-resolution columnar AOD datasets. The current study utilizes multi-city public domain PM2.5 data (from the US Consulate and Embassy air monitoring program) and MODIS AOD, spanning almost four years. PM2.5 is found to be positively correlated with AOD. Station-wise linear regression analysis shows spatially varying regression coefficients. A similar analysis repeated after eliminating data from seasons prone to elevated aerosol loading improved the correlation coefficient. The impact of day-to-day variability in local meteorological conditions on the AOD-PM2.5 relationship was explored by performing a multiple regression analysis. A cross-validation approach for the multiple regression analysis, with three years of data as the training dataset and one year as the validation dataset, yielded an R value of ∼0.63. The study concludes by discussing factors that could further improve the relationship.
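The multiple-regression-with-validation idea can be sketched as follows; the data, coefficients, and the choice of relative humidity as a meteorological covariate are all hypothetical, not from the study:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical AOD / meteorology / PM2.5 data: surface PM2.5 depends on
# columnar AOD plus a meteorological covariate (relative humidity here).
n = 400
aod = rng.uniform(0.1, 1.2, n)
rh = rng.uniform(30, 90, n)                       # relative humidity, %
pm25 = 20 + 60 * aod - 0.2 * rh + rng.normal(0, 8, n)

# Multiple linear regression via least squares, with a simple
# train/validation split mimicking the cross-validation idea.
X = np.column_stack([np.ones(n), aod, rh])
train, valid = slice(0, 300), slice(300, n)
beta, *_ = np.linalg.lstsq(X[train], pm25[train], rcond=None)

pred = X[valid] @ beta
r = np.corrcoef(pred, pm25[valid])[0, 1]          # validation correlation
print(f"validation correlation R = {r:.2f}")
```

In practice the training/validation split would follow calendar years, as in the record.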

  3. Selection of assessment methods for evaluating banana weevil Cosmopolites sordidus (Coleoptera: Curculionidae) damage on highland cooking banana (Musa spp., genome group AAA-EA).

    PubMed

    Gold, C S; Ragama, P E; Coe, R; Rukazambuga, N D T M

    2005-04-01

    Cosmopolites sordidus (Germar) is an important pest on bananas and plantains. Population build-up is slow and damage becomes increasingly important in successive crop cycles (ratoons). Yield loss results from plant loss, mat disappearance and reduced bunch size. Damage assessment requires destructive sampling and is most often done on corms of recently harvested plants. A wide range of damage assessment methods exist and there are no agreed protocols. It is critical to know what types of damage best reflect C. sordidus pest status through their relationships with yield loss. Multiple damage assessment parameters (i.e. for the corm periphery, cortex and central cylinder) were employed in two yield loss trials and a cultivar-screening trial in Uganda. Damage to the central cylinder had a greater effect on plant size and yield loss than damage to the cortex or corm periphery. In some cases, a combined assessment of damage to the central cylinder and cortex showed a better relationship with yield loss than an assessment of the central cylinder alone. Correlation, logistic and linear regression analyses showed weak to modest correlations between damage to the corm periphery and damage to the central cylinder. Thus, damage to the corm periphery is not a strong predictor of the more important damage to the central cylinder. Therefore, C. sordidus damage assessment should target the central cylinder and cortex.

  4. On using summary statistics from an external calibration sample to correct for covariate measurement error.

    PubMed

    Guo, Ying; Little, Roderick J; McConnell, Daniel S

    2012-01-01

    Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobserved, but a variable W that measures X with error is observed. Information about measurement error is provided in an external calibration sample where data on X and W (but not Y and Z) are recorded. We describe a method that uses summary statistics from the calibration sample to create multiple imputations of the missing values of X in the regression sample, so that the regression coefficients of Y on X and Z and associated standard errors can be estimated using simple multiple imputation combining rules, yielding valid statistical inferences under the assumption of a multivariate normal distribution. The proposed method is shown by simulation to provide better inferences than existing methods, namely the naive method, classical calibration, and regression calibration, particularly for correction for bias and achieving nominal confidence levels. We also illustrate our method with an example using linear regression to examine the relation between serum reproductive hormone concentrations and bone mineral density loss in midlife women in the Michigan Bone Health and Metabolism Study. Existing methods fail to adjust appropriately for bias due to measurement error in the regression setting, particularly when measurement error is substantial. The proposed method corrects this deficiency.
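The attenuation problem this record addresses, and the simplest classical correction (regression calibration via a reliability ratio), can be sketched with synthetic data; this illustrates the bias being corrected, not the authors' multiple-imputation method:

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical classical measurement error: W = X + U attenuates the
# naive regression slope toward zero.
n = 2000
x = rng.normal(0, 1, n)            # true covariate (unobserved in practice)
w = x + rng.normal(0, 0.8, n)      # error-prone measurement
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)

naive_slope = np.polyfit(w, y, 1)[0]

# Reliability ratio lambda = var(X) / var(W); in a real analysis this
# comes from an external calibration sample where X and W are both
# observed (here, for illustration, from the same simulated sample).
lam = np.var(x) / np.var(w)
corrected_slope = naive_slope / lam
print(f"naive {naive_slope:.2f}, corrected {corrected_slope:.2f} (true 2.0)")
```

The multiple-imputation approach of the record goes further by also propagating the calibration uncertainty into the standard errors.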

  5. U.S. Army Armament Research, Development and Engineering Center Grain Evaluation Software to Numerically Predict Linear Burn Regression for Solid Propellant Grain Geometries

    DTIC Science & Technology

    2017-10-01

ENGINEERING CENTER GRAIN EVALUATION SOFTWARE TO NUMERICALLY PREDICT LINEAR BURN REGRESSION FOR SOLID PROPELLANT GRAIN GEOMETRIES Brian...author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other documentation...

  6. Temperature in lowland Danish streams: contemporary patterns, empirical models and future scenarios

    NASA Astrophysics Data System (ADS)

    Lagergaard Pedersen, Niels; Sand-Jensen, Kaj

    2007-01-01

Continuous temperature measurements at 11 stream sites in small lowland streams of North Zealand, Denmark over a year showed much higher summer temperatures and lower winter temperatures along the course of the stream with artificial lakes than in the stream without lakes. The influence of lakes was even more prominent in the comparisons of colder lake inlets and warmer outlets and led to the decline of cold-water and oxygen-demanding brown trout. Seasonal and daily temperature variations were, as anticipated, dampened by forest cover, groundwater input, input from sewage plants and high downstream discharges. Seasonal variations in daily water temperature could be predicted with high accuracy at all sites by a linear air-water regression model (r²: 0.903-0.947). The predictions improved in all instances (r²: 0.927-0.964) with a non-linear logistic regression, according to which water temperatures do not fall below freezing and increase less steeply than air temperatures at high temperatures because of enhanced heat loss from the stream by evaporation and back radiation. The predictions improved slightly (r²: 0.933-0.969) with a multiple regression model which, in addition to air temperature as the main predictor, included solar radiation at un-shaded sites, relative humidity, precipitation and discharge. Application of the non-linear logistic model for a warming scenario of 4-5 °C higher air temperatures in Denmark in 2070-2100 yielded predictions of temperatures rising 1.6-3.0 °C during winter and summer and 4.4-6.0 °C during spring in un-shaded streams with low groundwater input. Groundwater-fed springs are expected to follow the increase of mean air temperatures for the region. Great caution should be exercised in these temperature projections because global and regional climate scenarios remain open to discussion.
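A logistic air-water temperature regression of the kind described can be sketched with synthetic data; the functional form below (an S-curve bounded near freezing and at a warm asymptote) and all parameter values are illustrative, not the study's fitted model:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(6)

# Logistic air-water relation: water temperature flattens near freezing
# and at high air temperatures, unlike a straight line.
def logistic(ta, mu, alpha, gamma, beta):
    return mu + (alpha - mu) / (1 + np.exp(gamma * (beta - ta)))

# Hypothetical daily air/water temperature pairs.
air = rng.uniform(-10, 30, 300)
water = logistic(air, 0.5, 24.0, 0.18, 13.0) + rng.normal(0, 0.7, 300)

# Non-linear least-squares fit of the four logistic parameters.
params, _ = curve_fit(logistic, air, water, p0=[0.0, 25.0, 0.2, 12.0])
mu, alpha, gamma, beta = params
print(f"fitted: mu={mu:.1f}, alpha={alpha:.1f}, gamma={gamma:.2f}, beta={beta:.1f}")
```

Applying a warming scenario then amounts to evaluating the fitted curve at shifted air temperatures.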

  7. Combined chamber-tower approach: Using eddy covariance measurements to cross-validate carbon fluxes modeled from manual chamber campaigns

    NASA Astrophysics Data System (ADS)

    Brümmer, C.; Moffat, A. M.; Huth, V.; Augustin, J.; Herbst, M.; Kutsch, W. L.

    2016-12-01

Manual carbon dioxide flux measurements with closed chambers at scheduled campaigns are a versatile method to study management effects at small scales in multiple-plot experiments. The eddy covariance technique has the advantage of quasi-continuous measurements but requires large homogeneous areas of a few hectares. To evaluate the uncertainties associated with interpolating from individual campaigns to the whole vegetation period, we installed both techniques at an agricultural site in Northern Germany. The presented comparison covers two cropping seasons, winter oilseed rape in 2012/13 and winter wheat in 2013/14. Modeling half-hourly carbon fluxes from campaigns is commonly performed based on non-linear regressions for the light response and respiration. The daily averages of net CO2 modeled from chamber data deviated from eddy covariance measurements in the range of ±5 g C m⁻² day⁻¹. To understand the observed differences and to disentangle the effects, we performed four additional setups (expert versus default settings of the algorithm based on non-linear regressions, purely empirical modeling with artificial neural networks versus non-linear regressions, cross-validating using eddy covariance measurements as campaign fluxes, weekly versus monthly scheduling of campaigns) to model the half-hourly carbon fluxes for the whole vegetation period. The good agreement of the seasonal course of net CO2 at plot and field scale for our agricultural site demonstrates that both techniques are robust and yield consistent results at seasonal time scale even for a managed ecosystem with high temporal dynamics in the fluxes. This allows combining the respective advantages of factorial experiments at plot scale with dense time series data at field scale. Furthermore, the information from the quasi-continuous eddy covariance measurements can be used to derive vegetation proxies to support the interpolation of carbon fluxes in-between the manual chamber campaigns.

  8. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  9. A Constrained Linear Estimator for Multiple Regression

    ERIC Educational Resources Information Center

    Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.

    2010-01-01

    "Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…

  10. On the design of classifiers for crop inventories

    NASA Technical Reports Server (NTRS)

    Heydorn, R. P.; Takacs, H. C.

    1986-01-01

Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.

  11. Estimates of genetic parameters and eigenvector indices for milk production of Holstein cows.

    PubMed

    Savegnago, R P; Rosa, G J M; Valente, B D; Herrera, L G G; Carneiro, R L R; Sesana, R C; El Faro, L; Munari, D P

    2013-01-01

The objectives of the present study were to estimate genetic parameters of monthly test-day milk yield (TDMY) of the first lactation of Brazilian Holstein cows using random regression (RR), and to compare the genetic gains for milk production and persistency, derived from RR models, using eigenvector indices and selection indices that did not consider eigenvectors. The data set contained monthly TDMY of 3,543 first lactations of Brazilian Holstein cows calving between 1994 and 2011. The RR model included the fixed effect of the contemporary group (herd-month-year of test days), the covariate calving age (linear and quadratic effects), and a fourth-order regression on Legendre orthogonal polynomials of days in milk (DIM) to model the population-based mean curve. Additive genetic and nongenetic animal effects were fit as RR with 4 classes of residual variance random effect. Eigenvector indices based on the additive genetic RR covariance matrix were used to evaluate the genetic gains of milk yield and persistency compared with the traditional selection index (a selection index based on breeding values of milk yield until 305 DIM). The heritability estimates for monthly TDMY ranged from 0.12 ± 0.04 to 0.31 ± 0.04. The estimated correlations of additive genetic and nongenetic animal effects were close to 1 at adjacent monthly TDMY, with a tendency to diminish as the time between DIM classes increased. The first eigenvector was related to increased genetic response in milk yield, and the second eigenvector was related to increased genetic gains in persistency, although it also decreased the genetic gains for total milk yield. Therefore, using this eigenvector to improve persistency will not change the shape of the genetic curve pattern. If the breeding goal is to improve milk production and persistency, complete sequential eigenvector indices (selection indices composed of all eigenvectors) could be used with higher economic values for persistency. However, if the breeding goal is to improve only milk yield, the traditional selection index is indicated. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  12. Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

    PubMed Central

    Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

    2012-01-01

Purpose To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p<0.001). Univariate mixture model fits of FDGpre improved R² from 0.17 to 0.52. Neither baseline FLT PET nor Cu-ATSM PET uptake contributed statistically significant multivariate regression coefficients. Conclusions Spatially resolved regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

  13. Linear regression analysis of survival data with missing censoring indicators.

    PubMed

    Wang, Qihua; Dinse, Gregg E

    2011-04-01

    Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.

  14. An Analysis of COLA (Cost of Living Adjustment) Allocation within the United States Coast Guard.

    DTIC Science & Technology

    1983-09-01

books Applied Linear Regression [Ref. 39], and Statistical Methods in Research and Production [Ref. 40], or any other book on regression. In the event...Indexes, Master's Thesis, Air Force Institute of Technology, Wright-Patterson AFB, 1976. 39. Weisberg, Sanford, Applied Linear Regression, Wiley, 1980. 40

  15. Testing hypotheses for differences between linear regression lines

    Treesearch

    Stanley J. Zarnoch

    2009-01-01

    Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...

  16. Graphical Description of Johnson-Neyman Outcomes for Linear and Quadratic Regression Surfaces.

    ERIC Educational Resources Information Center

    Schafer, William D.; Wang, Yuh-Yin

    A modification of the usual graphical representation of heterogeneous regressions is described that can aid in interpreting significant regions for linear or quadratic surfaces. The standard Johnson-Neyman graph is a bivariate plot with the criterion variable on the ordinate and the predictor variable on the abscissa. Regression surfaces are drawn…

  17. Teaching the Concept of Breakdown Point in Simple Linear Regression.

    ERIC Educational Resources Information Center

    Chan, Wai-Sum

    2001-01-01

    Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…

  18. Solvent-free synthesis, spectral correlations and antimicrobial activities of some aryl E 2-propen-1-ones

    NASA Astrophysics Data System (ADS)

    Sathiyamoorthi, K.; Mala, V.; Sakthinathan, S. P.; Kamalakkannan, D.; Suresh, R.; Vanangamudi, G.; Thirunarayanan, G.

    2013-08-01

    A total of 38 aryl E 2-propen-1-ones, including nine substituted styryl 4-iodophenyl ketones, have been synthesised using solvent-free SiO2-H3PO4-catalyzed aldol condensation between the respective methyl ketones and substituted benzaldehydes under microwave irradiation. The yields of the ketones are more than 80%. The synthesised chalcones were characterized by their analytical, physical and spectroscopic data. The spectral frequencies of the synthesised substituted styryl 4-iodophenyl ketones have been correlated with Hammett substituent constants and F and R parameters using single and multi-linear regression analysis. The antimicrobial activities of the 4-iodophenyl chalcones have been studied using the Bauer-Kirby method.

  19. Models of subjective response to in-flight motion data

    NASA Technical Reports Server (NTRS)

    Rudrapatna, A. N.; Jacobson, I. D.

    1973-01-01

    Mathematical relationships between subjective comfort and environmental variables in an air transportation system are investigated. As a first step in model building, only the motion variables are incorporated and sensitivities are obtained using stepwise multiple regression analysis. The data for these models have been collected from commercial passenger flights. Two models are considered. In the first, subjective comfort is assumed to depend on rms values of the six-degrees-of-freedom accelerations. The second assumes a Rustenburg type human response function in obtaining frequency weighted rms accelerations, which are used in a linear model. The form of the human response function is examined and the results yield a human response weighting function for different degrees of freedom.

  20. Predicting RNA Duplex Dimerization Free-Energy Changes upon Mutations Using Molecular Dynamics Simulations.

    PubMed

    Sakuraba, Shun; Asai, Kiyoshi; Kameda, Tomoshi

    2015-11-05

    The dimerization free energies of RNA-RNA duplexes are fundamental values that represent the structural stability of RNA complexes. We report a comparative analysis of RNA-RNA duplex dimerization free-energy changes upon mutations, estimated from a molecular dynamics simulation and experiments. A linear regression for nine pairs of double-stranded RNA sequences, six base pairs each, yielded a mean absolute deviation of 0.55 kcal/mol and an R(2) value of 0.97, indicating quantitative agreement between simulations and experimental data. The observed accuracy indicates that the molecular dynamics simulation with the current molecular force field is capable of estimating the thermodynamic properties of RNA molecules.
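    Fit statistics of the kind reported here (mean absolute deviation and R(2) of predicted versus experimental values) come from a simple least-squares comparison. A generic helper, with made-up numbers rather than the study's data:

```python
import numpy as np

def regression_fit_stats(x, y):
    # least-squares line, mean absolute deviation of the fit, and R^2
    slope, intercept = np.polyfit(x, y, 1)
    yhat = intercept + slope * x
    mad = np.mean(np.abs(y - yhat))
    r2 = 1 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)
    return slope, intercept, mad, r2

# hypothetical simulated-vs-experimental free-energy changes (kcal/mol)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
slope, intercept, mad, r2 = regression_fit_stats(x, y)
```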

  1. Estimating monotonic rates from biological data using local linear regression.

    PubMed

    Olito, Colin; White, Craig R; Marshall, Dustin J; Barneche, Diego R

    2017-03-01

    Accessing many fundamental questions in biology begins with empirical estimation of simple monotonic rates of underlying biological processes. Across a variety of disciplines, ranging from physiology to biogeochemistry, these rates are routinely estimated from non-linear and noisy time series data using linear regression and ad hoc manual truncation of non-linearities. Here, we introduce the R package LoLinR, a flexible toolkit to implement local linear regression techniques to objectively and reproducibly estimate monotonic biological rates from non-linear time series data, and demonstrate possible applications using metabolic rate data. LoLinR provides methods to easily and reliably estimate monotonic rates from time series data in a way that is statistically robust, facilitates reproducible research and is applicable to a wide variety of research disciplines in the biological sciences. © 2017. Published by The Company of Biologists Ltd.
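    The core idea of local linear regression for rate estimation can be sketched with a toy criterion: fit an ordinary least-squares line to every contiguous window of the time series and keep the window that looks most linear. LoLinR's actual model selection uses weighted metrics; the tolerance-and-steepest-slope rule below is a simplified stand-in:

```python
import numpy as np

def local_linear_rate(t, y, width, tol=1e-8):
    # among all contiguous windows of `width` points whose linear fit
    # is essentially exact (RSS below tol), return the steepest slope;
    # a toy stand-in for LoLinR's weighted model-selection metrics
    best = 0.0
    for i in range(len(t) - width + 1):
        tw, yw = t[i:i + width], y[i:i + width]
        b, a = np.polyfit(tw, yw, 1)
        rss = np.sum((yw - (a + b * tw)) ** 2)
        if rss < tol and abs(b) > abs(best):
            best = b
    return best

# toy oxygen-uptake trace: linear phase (slope 2) that then saturates
t = np.arange(10.0)
y = np.where(t <= 5, 2.0 * t, 10.0)
rate = local_linear_rate(t, y, width=4)   # slope from the linear phase
```

    Windows spanning the kink fit poorly and are excluded, so the returned slope comes from the genuinely linear phase rather than from an ad hoc manual truncation.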

  2. Locally linear regression for pose-invariant face recognition.

    PubMed

    Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen

    2007-07-01

    The variation of facial appearance due to viewpoint (pose) degrades face recognition systems considerably, and is one of the bottlenecks in face recognition. One possible solution is to generate a virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple but efficient novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show the distinct advantage of the proposed method over the Eigen light-field method.

  3. Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs

    Treesearch

    Andrew F. Howard; Daniel A. Yaussy

    1986-01-01

    A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...

  4. Use of collagen hydrolysate as a complex nitrogen source for the synthesis of penicillin by Penicillium chrysogenum.

    PubMed

    Leonhartsberger, S; Lafferty, R M; Korneti, L

    1993-09-01

    Optimal conditions for both biomass formation and penicillin synthesis by a strain of Penicillium chrysogenum were determined when using a collagen-derived nitrogen source. Preliminary investigations were carried out in shaken flask cultures employing a planned experimental program termed the Graeco-Latin square technique (Auden et al., 1967). It was initially determined that up to 30% of a conventional complex nitrogen source such as cottonseed meal could be replaced by the collagen-derived nitrogen source without decreasing the productivity with respect to the penicillin yield. In the pilot scale experiments using a 30 l stirred tank type of bioreactor, higher penicillin yields were obtained when 70% of the conventional complex nitrogen source in the form of cottonseed meal was replaced by the collagen hydrolysate. Furthermore, the maximum rate of penicillin synthesis continued over a longer period when using collagen hydrolysate as a complex nitrogen source. Penicillin synthesis rates were determined using linear regression.

  5. Accuracy of genomic selection in European maize elite breeding populations.

    PubMed

    Zhao, Yusheng; Gowda, Manje; Liu, Wenxin; Würschum, Tobias; Maurer, Hans P; Longin, Friedrich H; Ranc, Nicolas; Reif, Jochen C

    2012-03-01

    Genomic selection is a promising breeding strategy for rapid improvement of complex traits. The objective of our study was to investigate the prediction accuracy of genomic breeding values through cross validation. The study was based on experimental data of six segregating populations from a half-diallel mating design with 788 testcross progenies from an elite maize breeding program. The plants were intensively phenotyped in multi-location field trials and fingerprinted with 960 SNP markers. We used random regression best linear unbiased prediction in combination with fivefold cross validation. The prediction accuracy across populations was higher for grain moisture (0.90) than for grain yield (0.58). The accuracy of genomic selection realized for grain yield corresponds to the precision of phenotyping at unreplicated field trials in 3-4 locations. As up to three generations per year are feasible in maize, selection gain per unit time is high and, consequently, genomic selection holds great promise for maize breeding programs.

  6. Random regression models using different functions to model milk flow in dairy cows.

    PubMed

    Laureano, M M M; Bignardi, A B; El Faro, L; Cardoso, V L; Tonhati, H; Albuquerque, L G

    2014-09-12

    We analyzed 75,555 test-day milk flow records from 2175 primiparous Holstein cows that calved between 1997 and 2005. Milk flow was obtained by dividing the mean milk yield (kg) of the 3 daily milkings by the total milking time (min) and was expressed as kg/min. Milk flow was grouped into 43 weekly classes. The analyses were performed using single-trait random regression models that included direct additive genetic, permanent environmental, and residual random effects. In addition, the contemporary group and linear and quadratic effects of cow age at calving were included as fixed effects. A fourth-order orthogonal Legendre polynomial of days in milk was used to model the mean trend in milk flow. The additive genetic and permanent environmental covariance functions were estimated using random regression on Legendre polynomials and B-spline functions of days in milk. The model using a third-order Legendre polynomial for additive genetic effects and a sixth-order polynomial for permanent environmental effects, which contained 7 residual classes, proved to be the most adequate to describe variations in milk flow, and was also the most parsimonious. The heritability of milk flow estimated by the most parsimonious model was of moderate to high magnitude.
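    Random regression models of this kind expand days in milk on an orthogonal Legendre basis after rescaling to [-1, 1]. A sketch of building one design-matrix row using standard (unnormalized) Legendre polynomials; animal-breeding software often multiplies in an additional normalization constant, which is omitted here:

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(dim, dim_min, dim_max, order):
    # rescale days in milk (DIM) to [-1, 1] and evaluate P_0..P_order
    # there, giving one design-matrix row of a random regression model
    x = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    # coefficient vector [0,...,0,1] selects the k-th Legendre polynomial
    return np.array([legendre.legval(x, [0.0] * k + [1.0])
                     for k in range(order + 1)])

# hypothetical lactation window of 5 to 305 days in milk
row = legendre_covariates(dim=150, dim_min=5, dim_max=305, order=3)
```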

  7. Effect of Malmquist bias on correlation studies with IRAS data base

    NASA Technical Reports Server (NTRS)

    Verter, Frances

    1993-01-01

    The relationships between galaxy properties in the sample of Trinchieri et al. (1989) are reexamined with corrections for Malmquist bias. The linear correlations are tested and linear regressions are fit for log-log plots of L(FIR), L(H-alpha), and L(B), as well as ratios of these quantities. The linear correlations are corrected for Malmquist bias using the method of Verter (1988), in which each galaxy observation is weighted by the inverse of its sampling volume. The linear regressions are corrected for Malmquist bias by a new method introduced here, in which each galaxy observation is weighted by its sampling volume. The results of the correlations and regressions are significantly changed in the anticipated sense: the corrected correlation confidences are lower and the corrected slopes of the linear regressions are lower. The elimination of Malmquist bias removes the nonlinear rise in luminosity that has caused some authors to hypothesize additional components of FIR emission.

  8. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
© 2013, The International Biometric Society.

  9. The extraction process optimization of antioxidant polysaccharides from Marshmallow (Althaea officinalis L.) roots.

    PubMed

    Pakrokh Ghavi, Peyman

    2015-04-01

    Response surface methodology (RSM) with a central composite rotatable design (CCRD) based on five levels was employed to model and optimize four experimental operating conditions of extraction temperature (10-90 °C) and time (6-30 h), particle size (6-24 mm) and water-to-solid (W/S, 10-50) ratio, obtaining polysaccharides from Althaea officinalis roots with high yield and antioxidant activity. For each response, a second-order polynomial model with high R(2) values (> 0.966) was developed using multiple linear regression analysis. Results showed that the most significant (P < 0.05) extraction conditions affecting the yield and antioxidant activity of extracted polysaccharides were the main effect of extraction temperature and the interaction effect of particle size and W/S ratio. The optimum conditions to maximize yield (10.80%) and antioxidant activity (84.09%) for polysaccharide extraction from A. officinalis roots were an extraction temperature of 60.90 °C, extraction time of 12.01 h, particle size of 12.0 mm and W/S ratio of 40.0. The experimental values were found to be in agreement with those predicted, indicating the models' suitability for optimizing the polysaccharide extraction conditions. Copyright © 2015 Elsevier B.V. All rights reserved.
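    The second-order polynomial fitted in RSM contains linear, quadratic, and interaction terms, all estimated by ordinary multiple linear regression on an expanded design matrix. A generic sketch with two hypothetical factors (the study above used four):

```python
import numpy as np

def fit_second_order(x1, x2, y):
    # full second-order response surface:
    # y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2
    X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# toy response built from known coefficients, then recovered by the fit
rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 30)
x2 = rng.uniform(-1, 1, 30)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x1**2 + 0.25 * x1 * x2
beta = fit_second_order(x1, x2, y)
```

    Optimizing the fitted surface (e.g., by setting its gradient to zero) then yields the stationary operating conditions that RSM reports.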

  10. A primer for biomedical scientists on how to execute model II linear regression analysis.

    PubMed

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
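    The OLP (ordinary least products, also known as geometric mean or reduced major axis) point estimates have a simple closed form: the slope is the sign of the covariance times the ratio of standard deviations, and the line passes through the means. A minimal sketch of the point estimates only; as the record stresses, valid 95% CIs need bootstrapping or dedicated software such as smatr:

```python
import math

def olp_regression(x, y):
    # ordinary least products (reduced major axis) fit:
    # slope = sign(cov(x, y)) * sd(y) / sd(x), line through the means
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope, my - slope * mx

# toy Model II data: both variables measured with error
slope, intercept = olp_regression([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
```

    Unlike ordinary least squares, this fit is symmetric in x and y up to inversion, which is why it is appropriate when the x-values are themselves subject to error.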

  11. Relative efficiency of joint-model and full-conditional-specification multiple imputation when conditional models are compatible: The general location model.

    PubMed

    Seaman, Shaun R; Hughes, Rachael A

    2018-06-01

    Estimating the parameters of a regression model of interest is complicated by missing data on the variables in that model. Multiple imputation is commonly used to handle these missing data. Joint model multiple imputation and full-conditional specification multiple imputation are known to yield imputed data with the same asymptotic distribution when the conditional models of full-conditional specification are compatible with that joint model. We show that this asymptotic equivalence of imputation distributions does not imply that joint model multiple imputation and full-conditional specification multiple imputation will also yield asymptotically equally efficient inference about the parameters of the model of interest, nor that they will be equally robust to misspecification of the joint model. When the conditional models used by full-conditional specification multiple imputation are linear, logistic and multinomial regressions, these are compatible with a restricted general location joint model. We show that multiple imputation using the restricted general location joint model can be substantially more asymptotically efficient than full-conditional specification multiple imputation, but this typically requires very strong associations between variables. When associations are weaker, the efficiency gain is small. Moreover, full-conditional specification multiple imputation is shown to be potentially much more robust than joint model multiple imputation using the restricted general location model to misspecification of that model when there is substantial missingness in the outcome variable.

  12. Comparison of robustness to outliers between robust poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study.

    PubMed

    Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P

    2014-06-26

    To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination was low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
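    The robust Poisson approach fits a Poisson working model to the binary outcome by iteratively reweighted least squares (IRLS) and pairs it with a sandwich ("robust") variance estimator. A bare-bones sketch for a single binary exposure, for illustration only; in practice one would use an established GLM routine with a robust covariance option:

```python
import numpy as np

def robust_poisson_rr(exposed, event):
    # Poisson working model log E[y] = b0 + b1*x fitted by IRLS;
    # returns the relative risk exp(b1) and the sandwich (robust)
    # standard error of b1 on the log scale
    X = np.column_stack([np.ones_like(exposed, dtype=float),
                         exposed.astype(float)])
    y = event.astype(float)
    beta = np.zeros(2)
    for _ in range(50):                       # Newton/IRLS iterations
        mu = np.exp(X @ beta)
        z = X @ beta + (y - mu) / mu          # working response
        XtWX = X.T @ (mu[:, None] * X)
        beta = np.linalg.solve(XtWX, X.T @ (mu * z))
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (mu[:, None] * X))
    meat = X.T @ (((y - mu) ** 2)[:, None] * X)
    se_logrr = np.sqrt((bread @ meat @ bread)[1, 1])
    return np.exp(beta[1]), se_logrr

# 2x2 toy data: 40/100 events among exposed, 20/100 among unexposed
exposed = np.repeat([1.0, 0.0], 100)
event = np.concatenate([np.ones(40), np.zeros(60),
                        np.ones(20), np.zeros(80)])
rr, se = robust_poisson_rr(exposed, event)    # rr approaches 40/20 = 2
```

    The sandwich step is what corrects the Poisson model's misspecified variance for a binary outcome; without it the naive standard errors are too large.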

  13. Analyzing Multilevel Data: Comparing Findings from Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2013-01-01

    This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…

  14. Relationships of concentrations of certain blood constituents with milk yield and age of cows in dairy herds.

    PubMed

    Kitchenham, B A; Rowlands, G J; Shorbagi, H

    1975-05-01

    Regression analyses were performed on data from 48 Compton metabolic profile tests relating the concentrations of certain constituents in the blood of dairy cows to their milk yield, age and stage of lactation. The common partial regression coefficients for milk yield, age and stage of lactation were estimated for each blood constituent. The relationships of greatest statistical significance were between the concentrations of inorganic phosphate and globulin and age, and the concentration of albumin and milk yield.

  15. Analyzing Multilevel Data: An Empirical Comparison of Parameter Estimates of Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2011-01-01

    Hierarchical linear models (HLM) solve the problems associated with the unit of analysis problem such as misestimated standard errors, heterogeneity of regression and aggregation bias by modeling all levels of interest simultaneously. Hierarchical linear modeling resolves the problem of misestimated standard errors by incorporating a unique random…

  16. Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis

    ERIC Educational Resources Information Center

    Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

    2006-01-01

    Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…

  17. Predictability of depression severity based on posterior alpha oscillations.

    PubMed

    Jiang, H; Popov, T; Jylänki, P; Bi, K; Yao, Z; Lu, Q; Jensen, O; van Gerven, M A J

    2016-04-01

    We aimed to integrate neural data and an advanced machine learning technique to predict individual major depressive disorder (MDD) patient severity. MEG data was acquired from 22 MDD patients and 22 healthy controls (HC) resting awake with eyes closed. Individual power spectra were calculated by a Fourier transform. Sources were reconstructed via beamforming technique. Bayesian linear regression was applied to predict depression severity based on the spatial distribution of oscillatory power. In MDD patients, decreased theta (4-8 Hz) and alpha (8-14 Hz) power was observed in fronto-central and posterior areas respectively, whereas increased beta (14-30 Hz) power was observed in fronto-central regions. In particular, posterior alpha power was negatively related to depression severity. The Bayesian linear regression model showed significant depression severity prediction performance based on the spatial distribution of both alpha (r=0.68, p=0.0005) and beta power (r=0.56, p=0.007) respectively. Our findings point to a specific alteration of oscillatory brain activity in MDD patients during rest as characterized from MEG data in terms of spectral and spatial distribution. The proposed model yielded a quantitative and objective estimation for the depression severity, which in turn has a potential for diagnosis and monitoring of the recovery process. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  18. Application of database methods to the prediction of B3LYP-optimized polyhedral water cluster geometries and electronic energies

    NASA Astrophysics Data System (ADS)

    Anick, David J.

    2003-12-01

    A method is described for a rapid prediction of B3LYP-optimized geometries for polyhedral water clusters (PWCs). Starting with a database of 121 B3LYP-optimized PWCs containing 2277 H-bonds, linear regressions yield formulas correlating O-O distances, O-O-O angles, and H-O-H orientation parameters, with local and global cluster descriptors. The formulas predict O-O distances with a rms error of 0.85 pm to 1.29 pm and predict O-O-O angles with a rms error of 0.6° to 2.2°. An algorithm is given which uses the O-O and O-O-O formulas to determine coordinates for the oxygen nuclei of a PWC. The H-O-H formulas then determine positions for two H's at each O. For 15 test clusters, the gap between the electronic energy of the predicted geometry and the true B3LYP optimum ranges from 0.11 to 0.54 kcal/mol or 4 to 18 cal/mol per H-bond. Linear regression also identifies 14 parameters that strongly correlate with PWC electronic energy. These descriptors include the number of H-bonds in which both oxygens carry a non-H-bonding H, the number of quadrilateral faces, the number of symmetric angles in 5- and in 6-sided faces, and the square of the cluster's estimated dipole moment.

  19. Assessing response of sediment load variation to climate change and human activities with six different approaches.

    PubMed

    Zhao, Guangju; Mu, Xingmin; Jiao, Juying; Gao, Peng; Sun, Wenyi; Li, Erhui; Wei, Yanhong; Huang, Jiacong

    2018-05-23

    Understanding the relative contributions of climate change and human activities to variations in sediment load is of great importance for regional soil and river basin management. Many studies have investigated spatial-temporal variation of sediment load within the Loess Plateau; however, contradictory findings exist among the methods used. This study systematically reviewed six quantitative methods: simple linear regression, double mass curve, sediment identity factor analysis, the dam-sedimentation based method, the Sediment Delivery Distributed (SEDD) model, and the Soil Water Assessment Tool (SWAT) model. The calculation procedures and merits of each method were systematically explained. A case study in the Huangfuchuan watershed on the northern Loess Plateau was undertaken. The results showed that sediment load had been reduced by 70.5% during the changing period from 1990 to 2012 compared to that of the baseline period from 1955 to 1989. Human activities accounted for an average of 93.6 ± 4.1% of the total decline in sediment load, whereas climate change contributed 6.4 ± 4.1%. Five methods produced similar estimates, but the linear regression yielded relatively different results. The results of this study provide a good reference for assessing the effects of climate change and human activities on sediment load variation using different methods. Copyright © 2018. Published by Elsevier B.V.

  20. Worldwide Trends of Urinary Stone Disease Treatment Over the Last Two Decades: A Systematic Review.

    PubMed

    Geraghty, Robert M; Jones, Patrick; Somani, Bhaskar K

    2017-06-01

    Numerous studies have reported on regional or national trends of stone disease treatment. However, no article has yet examined the global trends of intervention for stone disease. A systematic review of articles from 1996 to September 2016 for all English language articles reporting on trends of surgical treatment of stone disease was performed. Authors were contacted in the case of data not being clear. If the authors did not reply, data were estimated from graphs or tables. Results were analyzed using SPSS version 21, and trends were analyzed using linear regression. Our systematic review yielded 120 articles, of which 8 were included in the initial review. This reflected outcomes from six countries with available data: United Kingdom, United States, New Zealand, Australia, Canada, and Brazil. Overall, ureteroscopy (URS) had a 251.8% increase in the total number of treatments performed, with its share of total treatments increasing by 17%. While the share of total treatments for percutaneous nephrolithotomy (PCNL) remained static, the shares for extracorporeal shockwave lithotripsy and open surgery fell by 14.5% and 12%, respectively. Linear regression showed a significant year-on-year increase in the total number of URS treatments (p < 0.001). In the last two decades, the share of total treatment for urolithiasis across the published literature has increased for URS, remained stable for PCNL, and decreased for lithotripsy and open surgery.

  1. Circulating fibrinogen but not D-dimer level is associated with vital exhaustion in school teachers.

    PubMed

    Kudielka, Brigitte M; Bellingrath, Silja; von Känel, Roland

    2008-07-01

    Meta-analyses have established elevated fibrinogen and D-dimer levels in the circulation as biological risk factors for the development and progression of coronary artery disease (CAD). Here, we investigated whether vital exhaustion (VE), a known psychosocial risk factor for CAD, is associated with fibrinogen and D-dimer levels in a sample of apparently healthy school teachers. The teaching profession has been proposed as a potentially high stressful occupation due to enhanced psychosocial stress at the workplace. Plasma fibrinogen and D-dimer levels were measured in 150 middle-aged male and female teachers derived from the first year of the Trier-Teacher-Stress-Study. Log-transformed levels were analyzed using linear regression. Results yielded a significant association between VE and fibrinogen (p = 0.02) but not D-dimer controlling for relevant covariates. Further investigation of possible interaction effects resulted in a significant association between fibrinogen and the interaction term "VE x gender" (p = 0.05). In a secondary analysis, we reran linear regression models for males and females separately. Gender-specific results revealed that the association between fibrinogen and VE remained significant in males but not females. In sum, the present data support the notion that fibrinogen levels are positively related to VE. Elevated fibrinogen might be one biological pathway by which chronic work stress may impact on teachers' cardiovascular health in the long run.

  2. Classical Testing in Functional Linear Models.

    PubMed

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.

  3. Classical Testing in Functional Linear Models

    PubMed Central

    Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab

    2016-01-01

    We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155

  4. Direct and regression methods do not give different estimates of digestible and metabolizable energy of wheat for pigs.

    PubMed

    Bolarinwa, O A; Adeola, O

    2012-12-01

    Digestible and metabolizable energy contents of feed ingredients for pigs can be determined by direct or indirect methods. There are situations when only the indirect approach is suitable and the regression method is a robust indirect approach. This study was conducted to compare the direct and regression methods for determining the energy value of wheat for pigs. Twenty-four barrows with an average initial BW of 31 kg were assigned to 4 diets in a randomized complete block design. The 4 diets consisted of 969 g wheat/kg plus minerals and vitamins (sole wheat) for the direct method, corn (Zea mays)-soybean (Glycine max) meal reference diet (RD), RD + 300 g wheat/kg, and RD + 600 g wheat/kg. The 3 corn-soybean meal diets were used for the regression method and wheat replaced the energy-yielding ingredients, corn and soybean meal, so that the same ratio of corn and soybean meal across the experimental diets was maintained. The wheat used was analyzed to contain 883 g DM, 15.2 g N, and 3.94 Mcal GE/kg. Each diet was fed to 6 barrows in individual metabolism crates for a 5-d acclimation followed by a 5-d total but separate collection of feces and urine. The DE and ME for the sole wheat diet were 3.83 and 3.77 Mcal/kg DM, respectively. Because the sole wheat diet contained 969 g wheat/kg, these translate to 3.95 Mcal DE/kg DM and 3.89 Mcal ME/kg DM. The RD used for the regression approach yielded 4.00 Mcal DE and 3.91 Mcal ME/kg DM diet. Increasing levels of wheat in the RD linearly reduced (P < 0.05) DE and ME to 3.88 and 3.79 Mcal/kg DM diet, respectively. The regressions of wheat contribution to DE and ME in megacalories against the quantity of wheat DM intake in kilograms generated 3.96 Mcal DE and 3.88 Mcal ME/kg DM. In conclusion, values obtained for the DE and ME of wheat using the direct method (3.95 and 3.89 Mcal/kg DM) did not differ (0.78 < P < 0.89) from those obtained using the regression method (3.96 and 3.88 Mcal/kg DM).
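
    The regression method described above can be sketched with made-up numbers: the wheat contribution to DE (in Mcal) is regressed against wheat DM intake (in kg), and the slope estimates the DE concentration of wheat. All data values below are hypothetical, chosen only to reproduce a slope near the reported 3.96 Mcal DE/kg DM.

```python
import numpy as np

# Hypothetical data: wheat DM intake (kg) and the wheat contribution to
# digestible energy (Mcal) for pigs fed the RD, RD + 300, and RD + 600 g
# wheat/kg diets (two pigs per diet in this toy example).
intake = np.array([0.0, 0.0, 0.45, 0.47, 0.90, 0.93])
de_contrib = np.array([0.0, 0.0, 1.78, 1.86, 3.56, 3.69])

# Regress the DE contribution on intake with an intercept;
# the slope estimates Mcal DE per kg wheat DM.
X = np.column_stack([intake, np.ones_like(intake)])
slope, intercept = np.linalg.lstsq(X, de_contrib, rcond=None)[0]
print(round(slope, 2))  # ≈ 3.96 Mcal DE/kg DM
```

The same computation with ME contributions in place of DE contributions yields the ME estimate.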

  5. Estimates of Flow Duration, Mean Flow, and Peak-Discharge Frequency Values for Kansas Stream Locations

    USGS Publications Warehouse

    Perry, Charles A.; Wolock, David M.; Artman, Joshua C.

    2004-01-01

    Streamflow statistics of flow duration and peak-discharge frequency were estimated for 4,771 individual locations on streams listed on the 1999 Kansas Surface Water Register. These statistics included the flow-duration values of 90, 75, 50, 25, and 10 percent, as well as the mean flow value. Peak-discharge frequency values were estimated for the 2-, 5-, 10-, 25-, 50-, and 100-year floods. Least-squares multiple regression techniques were used, along with Tobit analyses, to develop equations for estimating flow-duration values of 90, 75, 50, 25, and 10 percent and the mean flow for uncontrolled flow stream locations. The contributing-drainage areas of 149 U.S. Geological Survey streamflow-gaging stations in Kansas and parts of surrounding States that had flow uncontrolled by Federal reservoirs and used in the regression analyses ranged from 2.06 to 12,004 square miles. Logarithmic transformations of climatic and basin data were performed to yield the best linear relation for developing equations to compute flow durations and mean flow. In the regression analyses, the significant climatic and basin characteristics, in order of importance, were contributing-drainage area, mean annual precipitation, mean basin permeability, and mean basin slope. The analyses yielded a model standard error of prediction range of 0.43 logarithmic units for the 90-percent duration analysis to 0.15 logarithmic units for the 10-percent duration analysis. The model standard error of prediction was 0.14 logarithmic units for the mean flow. Regression equations used to estimate peak-discharge frequency values were obtained from a previous report, and estimates for the 2-, 5-, 10-, 25-, 50-, and 100-year floods were determined for this report. The regression equations and an interpolation procedure were used to compute flow durations, mean flow, and estimates of peak-discharge frequency for locations along uncontrolled flow streams on the 1999 Kansas Surface Water Register. 
Flow durations, mean flow, and peak-discharge frequency values determined at available gaging stations were used to interpolate the regression-estimated flows for the stream locations where available. Streamflow statistics for locations that had uncontrolled flow were interpolated using data from gaging stations weighted according to the drainage area and the bias between the regression-estimated and gaged flow information. On controlled reaches of Kansas streams, the streamflow statistics were interpolated between gaging stations using only gaged data weighted by drainage area.
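
    The logarithmic-transformation step can be illustrated with a toy log-log regression; the basin values, exponents, and coefficient below are invented and are not the report's equations.

```python
import numpy as np

# Toy model: mean flow (cfs) as a power function of contributing-drainage
# area (mi^2) and mean annual precipitation (in/yr), noiseless for clarity.
area = np.array([5.0, 40.0, 120.0, 800.0, 4000.0])
precip = np.array([18.0, 22.0, 25.0, 30.0, 35.0])
mean_flow = 0.05 * area**0.95 * precip**1.4

# Taking log10 of both sides turns the power law into a linear relation,
# so ordinary least squares recovers the coefficient and exponents.
X = np.column_stack([np.ones(5), np.log10(area), np.log10(precip)])
coef, *_ = np.linalg.lstsq(X, np.log10(mean_flow), rcond=None)
print(coef.round(2))  # recovers log10(0.05) ≈ -1.30, then 0.95 and 1.4
```

Estimates for an ungaged site follow by plugging its (logged) basin characteristics into the fitted equation and back-transforming.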

  6. Comparison of two-concentration with multi-concentration linear regressions: Retrospective data analysis of multiple regulated LC-MS bioanalytical projects.

    PubMed

    Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi

    2013-09-01

    Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given of how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and significantly saves time and cost as well. Copyright © 2013 Elsevier B.V. All rights reserved.
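
    The two-concentration versus multi-concentration comparison can be sketched as follows, with synthetic responses. Note this toy uses unweighted regression, whereas regulated bioanalysis commonly applies 1/x or 1/x² weighting; the calibration levels and noise model are assumptions, not the paper's data.

```python
import numpy as np

# Hypothetical 8-level calibration: nominal concentrations (ng/mL) and
# instrument responses following a linear model with small noise.
conc = np.array([1, 2, 5, 10, 50, 100, 500, 1000], dtype=float)
rng = np.random.default_rng(0)
resp = 0.02 * conc + 0.001 + rng.normal(0, 0.0005, conc.size)

def back_calc(fit_conc, fit_resp, qc_resp):
    # Unweighted least-squares line response = slope*conc + intercept,
    # inverted to back-calculate a QC sample's concentration.
    slope, intercept = np.polyfit(fit_conc, fit_resp, 1)
    return (qc_resp - intercept) / slope

qc_resp = 0.02 * 250 + 0.001                   # a noiseless mid-range QC sample
multi = back_calc(conc, resp, qc_resp)                  # all eight levels
two = back_calc(conc[[0, -1]], resp[[0, -1]], qc_resp)  # lowest + highest only
print(abs(multi - two) / multi * 100)  # % difference, typically well under 1%
```

With a well-behaved linear response, the two back-calculated concentrations agree closely, which is the paper's central observation.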

  7. A Linear Regression and Markov Chain Model for the Arabian Horse Registry

    DTIC Science & Technology

    1993-04-01

    This report (T-4367) presents a linear regression and Markov chain model developed for the Arabian Horse Registry, which needed to forecast its future registration of purebred Arabian horses. A linear regression model was utilized to produce the forecast; the accompanying survey instrument asked owners about equine tax deductions and their current horse activities.

  8. Short communication: Principal components and factor analytic models for test-day milk yield in Brazilian Holstein cattle.

    PubMed

    Bignardi, A B; El Faro, L; Rosa, G J M; Cardoso, V L; Machado, P F; Albuquerque, L G

    2012-04-01

    A total of 46,089 individual monthly test-day (TD) milk yields (10 test-days) from 7,331 complete first lactations of Holstein cattle were analyzed. A standard multivariate analysis (MV), reduced rank analyses fitting the first 2, 3, and 4 genetic principal components (PC2, PC3, PC4), and analyses that fitted a factor analytic structure considering 2, 3, and 4 factors (FAS2, FAS3, FAS4), were carried out. The models included the random animal genetic effect and fixed effects of the contemporary groups (herd-year-month of test-day), age of cow (linear and quadratic effects), and days in milk (linear effect). The residual covariance matrix was assumed to have full rank. Moreover, 2 random regression models were applied. Variance components were estimated by the restricted maximum likelihood method. The heritability estimates ranged from 0.11 to 0.24. The genetic correlation estimates between TD obtained with the PC2 model were higher than those obtained with the MV model, approaching unity for adjacent test-days at the end of lactation. The results indicate that for the data considered in this study, only 2 principal components are required to summarize the bulk of genetic variation among the 10 traits. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  9. No association of smoke-free ordinances with profits from bingo and charitable games in Massachusetts.

    PubMed

    Glantz, S A; Wilson-Loots, R

    2003-12-01

    Because it is widely played, claims that smoking restrictions will adversely affect bingo games are used as an argument against these policies. We used publicly available data from Massachusetts to assess the impact of 100% smoke-free ordinances on profits from bingo and other gambling sponsored by charitable organisations between 1985 and 2001. We conducted two analyses: (1) a general linear model implementation of a time series analysis with net profits (adjusted to 2001 dollars) as the dependent variable, and community (as a fixed effect), year, lagged net profits, and the length of time the ordinance had been in force as the independent variables; (2) multiple linear regression of total state profits against time, lagged profits, and the percentage of the entire state population in communities that allow charitable gaming but prohibit smoking. The general linear model analysis of data from individual communities showed that, while adjusted profits fell over time, this effect was not related to the presence of an ordinance. The analysis in terms of the fraction of the population living in communities with ordinances yielded the same result. Policymakers can implement smoke-free policies without concern that these policies will affect charitable gaming.

  10. An improved multiple linear regression and data analysis computer program package

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
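
    The core regression quantities a package like NEWRAP reports (coefficients, residual mean square, t-statistics, coefficient of determination) can be computed in double precision with a few lines of linear algebra. The data below are synthetic and the sketch covers only the basic regression output, not the plotting or canonical-reduction features.

```python
import numpy as np

# Minimal double-precision multiple linear regression: two regressors
# plus an intercept, with synthetic data.
rng = np.random.default_rng(42)
n = 40
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n), rng.uniform(0, 5, n)])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(0, 0.3, n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
df_resid = n - X.shape[1]
mse = resid @ resid / df_resid                 # residual mean square (ANOVA)
cov = mse * np.linalg.inv(X.T @ X)             # covariance of the estimates
t_stats = beta / np.sqrt(np.diag(cov))         # t-statistic per coefficient
ss_total = np.sum((y - y.mean()) ** 2)
r2 = 1 - resid @ resid / ss_total              # coefficient of determination
print(beta.round(2), r2 > 0.9)
```

Residual-versus-variable plots, the other major NEWRAP feature, would simply plot `resid` against each column of `X`.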

  11. Duration-dependent effects of clinically relevant oral alendronate doses on cortical bone toughness in beagle dogs

    PubMed Central

    Burr, David B.; Liu, Ziyue; Allen, Matthew R.

    2014-01-01

    Bisphosphonates (BPs) have been shown to significantly reduce bone toughness in vertebrae within one year when given at clinical doses to dogs. Although BPs also reduce toughness in cortical bone when given at high doses, their effect on cortical bone material properties when given at clinical doses is less clear. In part, this may be due to the use of small sample sizes that were powered to demonstrate differences in bone mineral density rather than bone’s material properties. Our lab has conducted several studies in which dogs were treated with alendronate at a clinically relevant dose. The goal of this study was to examine these published and unpublished data collectively to determine whether there is a significant time-dependent effect of alendronate on toughness of cortical bone. This analysis seemed particularly relevant given the recent occurrence of atypical femoral fractures in humans. Differences in the toughness of ribs taken from dogs derived from five separate experiments were measured. The dogs were orally administered saline (CON, 1 ml/kg/day) or alendronate (ALN) at a clinical dose (0.2 mg/kg/day). Treatment duration ranged from 3 months to 3 years. Groups were compared using ANOVA, and time trends analyzed with linear regression analysis. Linear regressions of the percent difference in toughness between CON and ALN at each time point revealed a significant reduction in toughness with longer exposure to ALN. The downward trend was primarily driven by a downward trend in post-yield toughness, whereas toughness in the pre-yield region was not changed relative to CON. These data suggest that a longer duration of treatment with clinical doses of ALN results in deterioration of cortical bone toughness in a time-dependent manner. As the duration of treatment is lengthened, the cortical bone exhibits increasingly brittle behavior. 
This may be important in assessing the role that long-term BP treatments play in the risk of atypical fractures of femoral cortical bone in humans. PMID:25445446

  12. Folded concave penalized sparse linear regression: sparsity, statistical performance, and algorithmic theory for local solutions.

    PubMed

    Liu, Hongcheng; Yao, Tao; Li, Runze; Ye, Yinyu

    2017-11-01

    This paper concerns the folded concave penalized sparse linear regression (FCPSLR), a class of popular sparse recovery methods. Although FCPSLR yields desirable recovery performance when solved globally, computing a global solution is NP-complete. Despite some existing statistical performance analyses on local minimizers or on specific FCPSLR-based learning algorithms, it remains an open question whether local solutions that are known to admit fully polynomial-time approximation schemes (FPTAS) may already be sufficient to ensure the statistical performance, and whether that statistical performance can be non-contingent on the specific designs of computing procedures. To address the questions, this paper presents the following threefold results: (i) Any local solution (stationary point) is a sparse estimator, under some conditions on the parameters of the folded concave penalties. (ii) Perhaps more importantly, any local solution satisfying a significant subspace second-order necessary condition (S3ONC), which is weaker than the second-order KKT condition, yields a bounded error in approximating the true parameter with high probability. In addition, if the minimal signal strength is sufficient, the S3ONC solution likely recovers the oracle solution. This result also explicates that the goal of improving the statistical performance is consistent with the optimization criteria of minimizing the suboptimality gap in solving the non-convex programming formulation of FCPSLR. (iii) We apply (ii) to the special case of FCPSLR with minimax concave penalty (MCP) and show that under the restricted eigenvalue condition, any S3ONC solution with a better objective value than the Lasso solution entails the strong oracle property. In addition, such a solution generates a model error (ME) comparable to the optimal but exponential-time sparse estimator given a sufficient sample size, while the worst-case ME is comparable to the Lasso in general. Furthermore, computing a solution guaranteed to satisfy the S3ONC admits an FPTAS.
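
    A minimal sketch of FCPSLR with the MCP penalty, solved by cyclic coordinate descent (any fixed point is a stationary point, i.e. a local solution in the paper's sense). The data, penalty parameters, and iteration count below are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def mcp_threshold(z, lam, gamma):
    # One-dimensional MCP solution for a standardized coordinate:
    # hard zero below lam, rescaled soft-threshold up to gamma*lam,
    # and the unbiased (unpenalized) estimate beyond gamma*lam.
    if abs(z) <= gamma * lam:
        return np.sign(z) * max(abs(z) - lam, 0.0) / (1.0 - 1.0 / gamma)
    return z

def mcp_coordinate_descent(X, y, lam, gamma=3.0, n_iter=200):
    # Cyclic coordinate descent for (1/2n)||y - Xb||^2 + MCP penalty.
    # Columns of X are assumed standardized to mean 0, variance 1.
    n, p = X.shape
    b = np.zeros(p)
    r = y.copy()                            # current residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            zj = b[j] + X[:, j] @ r / n     # univariate update target
            bj = mcp_threshold(zj, lam, gamma)
            r += X[:, j] * (b[j] - bj)      # keep residual in sync
            b[j] = bj
    return b

rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.normal(size=(n, p))
X = (X - X.mean(0)) / X.std(0)
beta_true = np.zeros(p); beta_true[:2] = [3.0, -2.0]
y = X @ beta_true + rng.normal(0, 0.5, n)

b = mcp_coordinate_descent(X, y, lam=0.2)
supp = np.nonzero(np.abs(b) > 1e-8)[0]
print(supp)                                 # recovered sparse support
```

Because the strong coefficients exceed the gamma*lam knot, the MCP estimate is nearly unbiased for them while zeroing the noise coordinates, which is the practical appeal of folded concave penalties over the Lasso.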

  13. CO2 flux determination by closed-chamber methods can be seriously biased by inappropriate application of linear regression

    NASA Astrophysics Data System (ADS)

    Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.

    2007-07-01

    Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach was justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test if the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and if the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. The flux measurements were performed using transparent chambers on vegetated surfaces and opaque chambers on bare peat surfaces. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration evolution c(t) in the chamber headspace and for estimation of the initial CO2 fluxes at closure time for the majority of experiments. 
CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes and even lower for longer closure times. The degree of underestimation increased with increasing CO2 flux strength and is dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
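
    The contrast between linear and exponential flux estimation can be reproduced with a toy closed-chamber model c(t) = c_s + (c0 - c_s)·exp(-kt), whose slope at t = 0 is the flux sought; all parameter values below are invented, and this is a generic saturating-exponential sketch rather than the authors' exact model. Since the model is linear in its amplitude parameters for fixed k, a simple profile search over k avoids a nonlinear solver.

```python
import numpy as np

t = np.linspace(0, 2, 25)                      # 2-minute closure, ppm readings
c_s, c0, kappa = 500.0, 380.0, 0.8             # hypothetical parameters
c = c_s + (c0 - c_s) * np.exp(-kappa * t)      # saturating concentration rise
c += np.random.default_rng(3).normal(0, 0.3, t.size)

# Profile fit: for each candidate k the model c = cs + A*exp(-k t) is linear
# in (cs, A), so solve that by least squares and keep the best k.
best = None
for k in np.linspace(0.05, 3.0, 300):
    M = np.column_stack([np.ones_like(t), np.exp(-k * t)])
    coef, *_ = np.linalg.lstsq(M, c, rcond=None)
    sse = np.sum((c - M @ coef) ** 2)
    if best is None or sse < best[0]:
        best = (sse, k, coef)
_, k_hat, (cs_hat, A_hat) = best

flux_exp = -k_hat * A_hat                      # initial dc/dt at closure time
flux_lin = np.polyfit(t, c, 1)[0]              # naive linear-regression slope
print(flux_lin < flux_exp)                     # linear regression underestimates
```

Even over this short two-minute closure, the linear slope is roughly half the exponential model's initial-flux estimate, mirroring the underestimation the abstract describes.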

  14. A meta-analysis of lasalocid effects on rumen measures, beef and dairy performance, and carcass traits in cattle.

    PubMed

    Golder, H M; Lean, I J

    2016-01-01

    The effects of lasalocid on rumen measures, beef and dairy performance, and carcass traits were evaluated using meta-analysis. Meta-regression was used to investigate sources of heterogeneity. Ten studies (20 comparisons) were used in the meta-analysis on rumen measures. Lasalocid increased total VFA and ammonia concentrations by 6.46 and 1.44 mM, respectively. Lasalocid increased propionate and decreased acetate and butyrate molar percentage (M%) by 4.62, 3.18, and 0.83%, respectively. Valerate M% and pH were not affected. Meta-regression found butyrate M% linearly increased with duration of lasalocid supplementation (DUR; P = 0.017). When >200 mg/d was fed, propionate and valerate M% were higher and acetate M% was lower (P = 0.042, P = 0.017, and P = 0.005, respectively). Beef performance was assessed using 31 studies (67 comparisons). Lasalocid increased ADG by 40 g/d, improved feed-to-gain ratio (F:G) by 410 g/kg, and improved feed efficiency (FE; a combined measure of G:F and the inverse of F:G). Lasalocid did not affect DMI, but heterogeneity in DMI was influenced by DUR (P = 0.004) and the linear effect of entry BW (P = 0.011). The combination of ≤100 vs. >100 d DUR and entry BW ≤275 vs. >275 kg showed that cattle ≤275 kg at entry fed lasalocid for >100 d had the lowest DMI. Heterogeneity of ADG was influenced by the linear effect of entry BW (P = 0.028) but not DUR. Combining entry BW ≤275 vs. >275 kg and DUR showed that cattle entering at >275 kg fed ≤100 d had the highest ADG. The FE (P = 0.025) and F:G (P = 0.015) linearly improved with dose, and entry BW >275 kg improved F:G (P = 0.038). Fourteen studies (25 comparisons) were used to assess carcass traits. Lasalocid increased HCW by 4.73 kg but not dressing percentage, mean fat cover, or marbling score. Heterogeneity of carcass traits was low and not affected by DUR or dose. Seven studies (11 comparisons) were used to assess dairy performance, but the study power was relatively low and the evidence base is limited. Lasalocid decreased DMI in total mixed ration-fed cows by 0.89 kg/d but had no effect on milk yield, milk components, or component yields. Dose linearly decreased DMI (P = 0.049). The DUR did not affect heterogeneity of dairy measures. This work showed that lasalocid improved ADG, HCW, FE, and F:G for beef production. These findings may reflect improved energy efficiency from increased propionate M% and decreased acetate and butyrate M%. Large dairy studies are required for further evaluation of effects of lasalocid on dairy performance.

  15. Interference and economic threshold level of little seed canary grass in wheat under different sowing times.

    PubMed

    Hussain, Saddam; Khaliq, Abdul; Matloob, Amar; Fahad, Shah; Tanveer, Asif

    2015-01-01

    Little seed canary grass (LCG) is a pernicious weed of wheat crop causing enormous yield losses. Information on the interference and economic threshold (ET) level of LCG is of prime significance to rationalize the use of herbicide for its effective management in wheat fields. The present study was conducted to quantify interference and ET density of LCG in mid-sown (20 November) and late-sown (10 December) wheat. The experiment was laid out in a randomized split-plot design with three replicates, with sowing dates as the main plots and LCG densities (10, 20, 30, and 40 plants m(-2)) as the subplots. Plots with natural weed infestations, one including and one excluding LCG, were maintained to compare its interference in pure stands at the designated densities. A season-long weed-free treatment was also run. Results indicated that the composite stand of weeds including LCG and the density of 40 LCG plants m(-2) were the most competitive with wheat, especially when the crop was sown late in the season. Maximum weed dry biomass was attained by the composite stand of weeds including LCG, followed by 40 LCG plants m(-2), under both sowing dates. Significant variations in wheat growth and yield were observed under the influence of different LCG densities as well as sowing dates. Presence of 40 LCG plants m(-2) reduced wheat yield by 28 and 34% in mid- and late-sown wheat crop, respectively. These losses were much greater than those for infestation of all weeds excluding LCG. A linear regression model was effective in simulating wheat yield losses over a wide range of LCG densities, and the regression equations showed good fit to observed data. The ET levels of LCG were 6-7 and 2.2-3.3 plants m(-2) in mid- and late-sown wheat crop, respectively. Herbicide should be applied when LCG density exceeds these levels under the respective sowing dates.
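
    The ET calculation can be sketched numerically: regress yield on weed density, convert the slope to a monetary loss per plant, and find the density at which the avoided loss equals the control cost. The yields, grain price, and herbicide cost below are invented, not the study's values.

```python
import numpy as np

# Hypothetical data: wheat yield (t/ha) at increasing LCG densities (plants/m2).
density = np.array([0, 10, 20, 30, 40], dtype=float)
yield_t = np.array([4.20, 3.85, 3.50, 3.20, 2.90])

slope, intercept = np.polyfit(density, yield_t, 1)   # t/ha lost per plant/m2
loss_per_plant = -slope

# Economic threshold: density at which the value of the avoided yield loss
# equals the cost of control (assumed prices, purely illustrative).
grain_price = 250.0      # $/t
control_cost = 40.0      # $/ha
et = control_cost / (loss_per_plant * grain_price)
print(round(et, 1))      # ≈ 4.9 plants/m2 under these assumed prices
```

Running the same calculation separately with mid- and late-sown regression slopes is what produces sowing-date-specific thresholds like those reported.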

  16. The Impact of Age on Quality Measure Adherence in Colon Cancer

    PubMed Central

    Steele, Scott R.; Chen, Steven L.; Stojadinovic, Alexander; Nissan, Aviram; Zhu, Kangmin; Peoples, George E.; Bilchik, Anton

    2012-01-01

    BACKGROUND Recently, lymph node yield (LNY) has been endorsed as a quality measure of colon cancer (CC) resection adequacy. It is unclear whether this measure is relevant to all ages. We hypothesized that total LNY is negatively correlated with increasing age and overall survival (OS). STUDY DESIGN The Surveillance, Epidemiology and End Results (SEER) database was queried for all non-metastatic CC patients diagnosed from 1992–2004 (n=101,767), grouped by age (<40, 41–45, 46–50, and in 5-year increments until 86+ years). Proportions of patients meeting the 12 LNY minimum criterion were determined in each age group, and analyzed with multivariate linear regression adjusting for demographics and AJCC 6th Edition stage. Overall survival comparisons in each age category were based on the guideline of 12 LNY. RESULTS Mean LNY decreased with increasing age (18.7 vs. 11.4 nodes/patient, youngest vs. oldest group, P<0.001). The proportion of patients meeting the 12 LNY criterion also declined with each incremental age group (61.9% vs. 35.2% compliance, youngest vs. oldest, P<0.001). Multivariate regression demonstrated a negative effect of each additional year in age on log(LNY), with a coefficient of −0.003 (95% CI −0.003 to −0.002). When stratified by age and nodal yield using the 12 LNY criterion, OS was lower for all age groups in Stage II CC with <12 LNY, and for each age group over 60 years with <12 LNY for Stage III CC (P<0.05). CONCLUSIONS Every attempt to adhere to proper oncological principles should be made at the time of CC resection regardless of age. The prognostic significance of the 12 LN minimum criterion should be applied even to elderly CC patients. PMID:21601492

  17. Partial least squares analysis of rocket propulsion fuel data using diaphragm valve-based comprehensive two-dimensional gas chromatography coupled with flame ionization detection.

    PubMed

    Freye, Chris E; Fitz, Brian D; Billingsley, Matthew C; Synovec, Robert E

    2016-06-01

    The chemical composition and several physical properties of RP-1 fuels were studied using comprehensive two-dimensional (2D) gas chromatography (GC×GC) coupled with flame ionization detection (FID). A "reversed column" GC×GC configuration was implemented with a RTX-wax column as the first dimension ((1)D) and a RTX-1 as the second dimension ((2)D). Modulation was achieved using a high temperature diaphragm valve mounted directly in the oven. Using leave-one-out cross-validation (LOOCV), the summed GC×GC-FID signal of three compound-class selective 2D regions (alkanes, cycloalkanes, and aromatics) was regressed against previously measured ASTM derived values for these compound classes, yielding root mean square errors of cross validation (RMSECV) of 0.855, 0.734, and 0.530 mass%, respectively. For comparison, using partial least squares (PLS) analysis with LOOCV, the GC×GC-FID signal of the entire 2D separations was regressed against the same ASTM values, yielding RMSECV values of 1.52, 2.76, and 0.945 mass% for the three compound classes (alkanes, cycloalkanes, and aromatics), respectively. Additionally, a more detailed PLS analysis was undertaken of the compound classes (n-alkanes, iso-alkanes, mono-, di-, and tri-cycloalkanes, and aromatics) and of physical properties previously determined by ASTM methods (such as net heat of combustion, hydrogen content, density, kinematic viscosity, sustained boiling temperature and vapor rise temperature). Results from these PLS studies using the relatively simple to use and inexpensive GC×GC-FID instrumental platform are compared to previously reported results using the GC×GC-TOFMS instrumental platform. Copyright © 2016 Elsevier B.V. All rights reserved.
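
    The PLS-with-LOOCV workflow can be sketched from scratch. The single-response NIPALS implementation and the synthetic "chromatogram" matrix below are illustrative assumptions only (real work would use a vetted chemometrics package); RMSECV is computed exactly as named, as the root mean square of leave-one-out prediction errors.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    # Minimal PLS1 (NIPALS) for a single response; X and y assumed centered.
    W, P, Q = [], [], []
    Xr, yr = X.copy(), y.copy()
    for _ in range(n_comp):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector
        t = Xr @ w                      # score vector
        tt = t @ t
        p = Xr.T @ t / tt               # X loading
        q = (yr @ t) / tt               # y loading
        Xr -= np.outer(t, p)            # deflate
        yr -= t * q
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    return W @ np.linalg.solve(P.T @ W, Q)   # regression vector

def rmsecv(X, y, n_comp):
    # Leave-one-out cross-validation error for a given number of components.
    errs = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        Xt, yt = X[mask], y[mask]
        xm, ym = Xt.mean(0), yt.mean()
        b = pls1_fit(Xt - xm, yt - ym, n_comp)
        errs.append(y[i] - (ym + (X[i] - xm) @ b))
    return np.sqrt(np.mean(np.square(errs)))

# Synthetic "chromatogram" matrix: 30 samples x 80 signal channels whose
# intensities depend on two latent composition factors plus noise.
rng = np.random.default_rng(7)
scores = rng.normal(size=(30, 2))
loadings = rng.normal(size=(2, 80))
X = scores @ loadings + rng.normal(0, 0.1, (30, 80))
y = scores @ np.array([1.5, -1.0]) + rng.normal(0, 0.1, 30)
print(round(rmsecv(X, y, 2), 3))   # small relative to the spread of y
```

Scanning `rmsecv` over the number of components is the usual way to pick the model size before reporting a final RMSECV.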

  18. An evaluation of supervised classifiers for indirectly detecting salt-affected areas at irrigation scheme level

    NASA Astrophysics Data System (ADS)

    Muller, Sybrand Jacobus; van Niekerk, Adriaan

    2016-07-01

    Soil salinity often leads to reduced crop yield and quality and can render soils barren. Irrigated areas are particularly at risk due to intensive cultivation and secondary salinization caused by waterlogging. Regular monitoring of salt accumulation in irrigation schemes is needed to keep its negative effects under control. The dynamic spatial and temporal characteristics of remote sensing can provide a cost-effective solution for monitoring salt accumulation at irrigation scheme level. This study evaluated a range of pan-fused SPOT-5 derived features (spectral bands, vegetation indices, image textures and image transformations) for classifying salt-affected areas in two distinctly different irrigation schemes in South Africa, namely Vaalharts and Breede River. The relationship between the input features and electrical conductivity measurements was investigated using regression modelling (stepwise linear regression, partial least squares regression, curve fit regression modelling) and supervised classification (maximum likelihood, nearest neighbour, decision tree analysis, support vector machine and random forests). Classification and regression trees and random forest were used to select the most important features for differentiating salt-affected and unaffected areas. The results showed that the regression analyses produced weak models (R squared < 0.4). Better results were achieved using the supervised classifiers, but the algorithms tend to over-estimate salt-affected areas. A key finding was that none of the feature sets or classification algorithms stood out as being superior for monitoring salt accumulation at irrigation scheme level. This was attributed to the large variations in the spectral responses of different crop types at different growing stages, coupled with their individual tolerances to saline conditions.

  19. Physician burnout, work engagement and the quality of patient care.

    PubMed

    Loerbroks, A; Glaser, J; Vu-Eickmann, P; Angerer, P

    2017-07-01

    Research suggests that burnout in physicians is associated with poorer patient care, but evidence is inconclusive. More recently, the concept of work engagement has emerged (i.e. the beneficial counterpart of burnout) and has been associated with better care. Evidence remains markedly sparse however. To examine the associations of burnout and work engagement with physicians' self-perceived quality of care. We drew on cross-sectional data from physicians in Germany. We used a six-item version of the Maslach Burnout Inventory measuring exhaustion and depersonalization. We employed the nine-item Utrecht Work Engagement Scale to assess work engagement and its subcomponents: vigour, dedication and absorption. We measured physicians' own perceptions of their quality of care by a six-item instrument covering practices and attitudes. We used continuous and categorized dependent and independent variables in linear and logistic regression analyses. There were 416 participants. In multivariable linear regression analyses, increasing burnout total scores were associated with poorer perceived quality of care [unstandardized regression coefficient (b) = 0.45, 95% confidence interval (CI) 0.37, 0.54]. This association was stronger for depersonalization (b = 0.37, 95% CI 0.29, 0.44) than for exhaustion (b = 0.26, 95% CI 0.18, 0.33). Increasing work engagement was associated with higher perceived quality care (b for the total score = -0.20, 95% CI -0.28, -0.11). This was confirmed for each subcomponent with stronger associations for vigour (b = -0.21, 95% CI -0.29, -0.13) and dedication (b = -0.16, 95% CI -0.24, -0.09) than for absorption (b = -0.12, 95% CI -0.20, -0.04). Logistic regression analyses yielded comparable results. Physician burnout was associated with self-perceived poorer patient care, while work engagement related to self-reported better care. Studies are needed to corroborate these findings, particularly for work engagement. © The Author 2017. 

  20. Biostatistics Series Module 6: Correlation and Linear Regression.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If the normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ), may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population; a small P value (conventionally < 0.05) suggests that the observed correlation is unlikely to have arisen by chance. A 95% confidence interval of the correlation coefficient can also be calculated to give an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of a linear relationship between the two variables; a scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
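    The quantities described above (Pearson's r, Spearman's rho, the least-squares line y = a + bx, and r2) can be computed with SciPy; the paired observations below are hypothetical, for illustration only:

```python
import numpy as np
from scipy import stats

# Paired observations (hypothetical data for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1])

# Pearson's r (assumes both variables are ~normal) and its P value
r, p_r = stats.pearsonr(x, y)

# Spearman's rho as the rank-based alternative
rho, p_rho = stats.spearmanr(x, y)

# Least-squares line y = a + b*x; rvalue**2 is the coefficient of determination
res = stats.linregress(x, y)
print(f"r={r:.3f}, rho={rho:.3f}, a={res.intercept:.2f}, "
      f"b={res.slope:.2f}, r2={res.rvalue**2:.3f}")
```

    A scatter plot of x against y should always precede such an analysis, as the abstract notes, to check that the relationship really is linear.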

  1. Biostatistics Series Module 6: Correlation and Linear Regression

    PubMed Central

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If the normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ), may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population; a small P value (conventionally < 0.05) suggests that the observed correlation is unlikely to have arisen by chance. A 95% confidence interval of the correlation coefficient can also be calculated to give an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of a linear relationship between the two variables; a scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175

  2. Using the Coefficient of Determination "R"[superscript 2] to Test the Significance of Multiple Linear Regression

    ERIC Educational Resources Information Center

    Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

    2013-01-01

    This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
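    Under the classical normal-error model, the overall significance test based on R2 is equivalent to the usual F-test, since F is a monotone function of R2: F = (R2/k) / ((1 - R2)/(n - k - 1)) with k predictors and n observations. A minimal sketch with generic values (not taken from the article):

```python
from scipy import stats

def r2_f_test(r2, n, k):
    """Overall F-test for multiple linear regression with k predictors
    and n observations, expressed through the coefficient of determination."""
    df1, df2 = k, n - k - 1
    f = (r2 / df1) / ((1.0 - r2) / df2)
    p = stats.f.sf(f, df1, df2)  # upper-tail probability of F(df1, df2)
    return f, p

f, p = r2_f_test(r2=0.40, n=30, k=3)
print(f"F={f:.2f}, p={p:.4f}")
```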

  3. Comparison of the Chiron Quantiplex branched DNA (bDNA) assay and the Abbott Genostics solution hybridization assay for quantification of hepatitis B viral DNA.

    PubMed

    Kapke, G E; Watson, G; Sheffler, S; Hunt, D; Frederick, C

    1997-01-01

    Several assays for quantification of DNA have been developed and are currently used in research and clinical laboratories. However, comparison of assay results has been difficult owing to the use of different standards and units of measurement, as well as differences between assays in dynamic range and quantification limits. Although a few studies have compared results generated by different assays, there has been no consensus on conversion factors, and thorough analysis has been precluded by small sample sizes and the limited dynamic range studied. In this study, we have compared the Chiron branched DNA (bDNA) and Abbott liquid hybridization assays for quantification of hepatitis B virus (HBV) DNA in clinical specimens and have derived conversion factors to facilitate comparison of assay results. Additivity and variance stabilizing (AVAS) regression, a form of non-linear regression analysis, was performed on assay results for specimens from HBV clinical trials. Our results show that there is a strong linear relationship (R2 = 0.96) between log Chiron and log Abbott assay results. Conversion factors derived from regression analyses were found to be non-constant and ranged from 6-40. Analysis of paired assay results below and above each assay's limit of quantification (LOQ) indicated that a significantly (P < 0.01) larger proportion of observations were below the Abbott assay LOQ but above the Chiron assay LOQ, indicating that the Chiron assay is significantly more sensitive than the Abbott assay. Testing of replicate specimens showed that the Chiron assay consistently yielded lower percent coefficients of variation (%CVs) than the Abbott assay, indicating that the Chiron assay provides superior precision.

  4. Progression-free survival as a surrogate endpoint for overall survival in glioblastoma: a literature-based meta-analysis from 91 trials

    PubMed Central

    Han, Kelong; Ren, Melanie; Wick, Wolfgang; Abrey, Lauren; Das, Asha; Jin, Jin; Reardon, David A.

    2014-01-01

    Background The aim of this study was to determine correlations between progression-free survival (PFS) and the objective response rate (ORR) with overall survival (OS) in glioblastoma and to evaluate their potential use as surrogates for OS. Method Published glioblastoma trials reporting OS and ORR and/or PFS with sufficient detail were included in correlative analyses using weighted linear regression. Results Of 274 published unique glioblastoma trials, 91 were included. PFS and OS hazard ratios were strongly correlated; R2 = 0.92 (95% confidence interval [CI], 0.71–0.99). Linear regression determined that a 10% PFS risk reduction would yield an 8.1% ± 0.8% OS risk reduction. R2 between median PFS and median OS was 0.70 (95% CI, 0.59–0.79), with a higher value in trials using Response Assessment in Neuro-Oncology (RANO; R2 = 0.96, n = 8) versus Macdonald criteria (R2 = 0.70; n = 83). No significant differences were demonstrated between temozolomide- and bevacizumab-containing regimens (P = .10) or between trials using RANO and Macdonald criteria (P = .49). The regression line slope between median PFS and OS was significantly higher in newly diagnosed versus recurrent disease (0.58 vs 0.35, P = .04). R2 for 6-month PFS with 1-year OS and median OS were 0.60 (95% CI, 0.37–0.77) and 0.64 (95% CI, 0.42–0.77), respectively. Objective response rate and OS were poorly correlated (R2 = 0.22). Conclusion In glioblastoma, PFS and OS are strongly correlated, indicating that PFS may be an appropriate surrogate for OS. Compared with OS, PFS offers earlier assessment and higher statistical power at the time of analysis. PMID:24335699
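    A trial-level weighted linear regression of the kind used in such surrogacy analyses can be sketched as follows; the log hazard ratios and patient counts below are hypothetical illustrations, not the paper's 91-trial dataset:

```python
import numpy as np

# Hypothetical per-trial data: log hazard ratios for PFS and OS,
# weighted by trial size (all values illustrative, not from the paper)
log_hr_pfs = np.array([-0.35, -0.22, -0.10, 0.05, 0.18, 0.30])
log_hr_os = np.array([-0.28, -0.19, -0.07, 0.03, 0.16, 0.25])
n_patients = np.array([120, 300, 80, 450, 200, 150])

# Weighted least squares: beta = (X'WX)^-1 X'W y
X = np.column_stack([np.ones_like(log_hr_pfs), log_hr_pfs])
W = np.diag(n_patients.astype(float))
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ log_hr_os)

# Weighted R^2 from residuals around the weighted mean
resid = log_hr_os - X @ beta
ybar = np.average(log_hr_os, weights=n_patients)
r2 = 1 - (n_patients * resid**2).sum() / (n_patients * (log_hr_os - ybar)**2).sum()
print(f"slope={beta[1]:.3f}, r2={r2:.3f}")
```

    The slope here plays the role of the paper's "10% PFS risk reduction yields an 8.1% OS risk reduction" translation between endpoints.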

  5. Estimating Dbh of Trees Employing Multiple Linear Regression of the best Lidar-Derived Parameter Combination Automated in Python in a Natural Broadleaf Forest in the Philippines

    NASA Astrophysics Data System (ADS)

    Ibanez, C. A. G.; Carcellar, B. G., III; Paringit, E. C.; Argamosa, R. J. L.; Faelga, R. A. G.; Posilero, M. A. V.; Zaragosa, G. P.; Dimayacyac, N. A.

    2016-06-01

    Diameter-at-breast-height (DBH) estimation is a prerequisite in various allometric equations estimating important forestry indices like stem volume, basal area, biomass and carbon stock. LiDAR technology provides a means of directly obtaining different forest parameters, except DBH, from the behavior and characteristics of the point cloud, which is unique to different forest classes. An extensive tree inventory was done on a two-hectare established sample plot in Mt. Makiling, Laguna for a natural growth forest. Coordinates, height, and canopy cover were measured, and species were identified for comparison with LiDAR derivatives. Multiple linear regression was used to obtain LiDAR-derived DBH by integrating field-derived DBH and 27 LiDAR-derived parameters at 20 m, 10 m, and 5 m grid resolutions. To find the best combination of parameters for DBH estimation, all possible combinations of parameters were generated and automated using Python scripts, with regression-related libraries such as NumPy, SciPy, and scikit-learn. The combination that yields the highest r-squared (coefficient of determination) and the lowest AIC (Akaike's Information Criterion) and BIC (Bayesian Information Criterion) was determined to be the best equation. The best equation used 11 parameters at the 10 m grid size, with an r-squared of 0.604, an AIC of 154.04, and a BIC of 175.08. The combination of parameters may differ among forest classes in further studies. Additional statistical tests, such as the Kaiser-Meyer-Olkin (KMO) coefficient and Bartlett's Test for Sphericity (BTS), can be supplemented to help determine the correlation among parameters.
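    The exhaustive search described above (every parameter combination, scored by AIC) can be sketched in the same Python/NumPy setting the authors mention; the predictors and response below are synthetic stand-ins for the LiDAR parameters and field DBH:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 5 candidate predictors, of which columns 0 and 2
# actually drive the response (illustrative, not real LiDAR metrics)
n = 100
X = rng.normal(size=(n, 5))
dbh = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=0.5, size=n)

def fit_aic(cols):
    """OLS fit on the chosen columns; Gaussian AIC = n*ln(RSS/n) + 2k."""
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, dbh, rcond=None)
    rss = ((dbh - A @ beta) ** 2).sum()
    k = A.shape[1] + 1  # coefficients plus the error variance
    return n * np.log(rss / n) + 2 * k

# Try every non-empty combination of predictors, keep the lowest AIC
best = min((fit_aic(c), c) for r in range(1, 6)
           for c in itertools.combinations(range(5), r))
print("best AIC %.1f with columns %s" % best)
```

    With 27 real parameters the same loop covers 2^27 - 1 combinations, which is why the authors automated it rather than fitting models by hand.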

  6. Testing portable luminescence reader signals against late Pleistocene to modern OSL ages of coastal and desert dunefield sand in Israel

    NASA Astrophysics Data System (ADS)

    Roskin, Joel; Sivan, Dorit; Bookman, Revital; Porat, Naomi; López, Gloria I.

    2017-04-01

    Rapid assessment of luminescence signals of poly-mineral samples by a pulsed-photon portable OSL reader (PPSL) is useful for interpreting sedimentary sections during fieldwork, can assist with targeted field sampling for later full OSL dating, and helps prioritize laboratory work. This study investigates PPSL signal intensities in order to assess their usefulness in obtaining relative OSL ages from linear regressions created by interpolating newly generated PPSL values of samples with existing OSL ages from two extensive Nilotic-sourced dunefields. Eighteen OSL-dated sand samples from two quartz-dominated sand systems in Israel were studied: (1) the Mediterranean littoral-sourced coastal dunefields that formed since the middle Holocene; and (2) the inland north-western Negev desert dunefield that rapidly formed between the Last Glacial Maximum and the Holocene. Samples from three coastal dune profiles were also measured. Results show that the PPSL signals differ by several orders of magnitude between modern and late Pleistocene sediments. The coastal and desert sands have different OSL age to PPSL signal ratios. Coastal sands show better correlations between PPSL values and OSL ages. However, using regression curves for each dunefield to interpolate ages is less useful than expected, as samples with different ages exhibit similar PPSL signals. The coastal dune profiles yielded low luminescence signal values depicting a modern profile chronology. This study demonstrates that a rapid assessment of the relative OSL ages across different and extensive dunefields is useful and may be achieved. However, the OSL ages obtained by linear regression are only a very rough age estimate. The reasons for not obtaining more reliable ages need to be better understood, as several variables can affect the PPSL signal, such as mineral provenance, intrinsic grain properties, micro-dosimetry and moisture content.

  7. Quantification of trace metals in infant formula premixes using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Cama-Moncunill, Raquel; Casado-Gavalda, Maria P.; Cama-Moncunill, Xavier; Markiewicz-Keszycka, Maria; Dixit, Yash; Cullen, Patrick J.; Sullivan, Carl

    2017-09-01

    Infant formula is a human milk substitute generally based upon fortified cow milk components. In order to mimic the composition of breast milk, trace elements such as copper, iron and zinc are usually added in a single operation using a premix. The correct addition of premixes must be verified to ensure that the target levels in infant formulae are achieved. In this study, a laser-induced breakdown spectroscopy (LIBS) system was assessed as a fast validation tool for trace element premixes. LIBS is a promising emission spectroscopic technique for elemental analysis, which offers real-time analyses, little to no sample preparation and ease of use. LIBS was employed for copper and iron determinations of premix samples ranging approximately from 0 to 120 mg/kg for Cu and 0 to 1640 mg/kg for Fe. LIBS spectra are affected by several parameters, hindering subsequent quantitative analyses. This work aimed at testing three matrix-matched calibration approaches (simple linear regression, multiple linear regression and partial least squares (PLS) regression) as means for precision and accuracy enhancement of LIBS quantitative analysis. All calibration models were first developed using a training set and then validated with an independent test set. PLS yielded the best results. For instance, the PLS model for copper provided a coefficient of determination (R2) of 0.995 and a root mean square error of prediction (RMSEP) of 14 mg/kg. Furthermore, LIBS was employed to penetrate through the samples by repetitively measuring the same spot. Consequently, LIBS spectra can be obtained as a function of sample layers. This information was used to explore whether measuring deeper into the sample could reduce possible surface-contaminant effects and provide better quantifications.

  8. Effects of land use on water quality and transport of selected constituents in streams in Mecklenburg County, North Carolina, 1994–98

    USGS Publications Warehouse

    Ferrell, Gloria M.

    2001-01-01

    Transport rates for total solids, total nitrogen, total phosphorus, biochemical oxygen demand, chromium, copper, lead, nickel, and zinc during 1994–98 were computed for six stormwater-monitoring sites in Mecklenburg County, North Carolina. These six stormwater-monitoring sites were operated by the Mecklenburg County Department of Environmental Protection, in cooperation with the City of Charlotte, and are located near the mouths of major streams. Constituent transport at the six study sites generally was dominated by nonpoint sources, except for nitrogen and phosphorus at two sites located downstream from the outfalls of major municipal wastewater-treatment plants.To relate land use to constituent transport, regression equations to predict constituent yield were developed by using water-quality data from a previous study of nine stormwater-monitoring sites on small streams in Mecklenburg County. The drainage basins of these nine stormwater sites have relatively homogeneous land-use characteristics compared to the six study sites. Mean annual construction activity, based on building permit files, was estimated for all stormwater-monitoring sites and included as an explanatory variable in the regression equations. These regression equations were used to predict constituent yield for the six study sites. Predicted yields generally were in agreement with computed yields. In addition, yields were predicted by using regression equations derived from a national urban water-quality database. Yields predicted from the regional regression equations generally were about an order of magnitude lower than computed yields.Regression analysis indicated that construction activity was a major contributor to transport of the constituents evaluated in this study except for total nitrogen and biochemical oxygen demand. Transport of total nitrogen and biochemical oxygen demand was dominated by point-source contributions. 
The two study basins that had the largest amounts of construction activity also had the highest total solids yields (1,300 and 1,500 tons per square mile per year). The highest total phosphorus yields (3.2 and 1.7 tons per square mile per year) attributable to nonpoint sources also occurred in these basins. Concentrations of chromium, copper, lead, nickel, and zinc were positively correlated with total solids concentrations at most of the study sites (Pearson product-moment correlation >0.50). The site having the highest median concentrations of chromium, copper, and nickel also was the site having the highest computed yield for total solids.

  9. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

    PubMed

    Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

    2009-11-01

    G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
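    Power for the slope test in bivariate linear regression, one of the cases listed above, can also be estimated by Monte Carlo simulation; this sketch illustrates the idea and is not G*Power's analytic algorithm:

```python
import numpy as np
from scipy import stats

def slope_power(n, slope, sigma, alpha=0.05, n_sim=2000, seed=0):
    """Monte Carlo power of the t-test on the slope in bivariate linear
    regression: the fraction of simulated datasets with p < alpha."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sim):
        x = rng.uniform(0, 1, n)
        y = slope * x + rng.normal(scale=sigma, size=n)
        if stats.linregress(x, y).pvalue < alpha:
            hits += 1
    return hits / n_sim

print(f"power ~ {slope_power(n=30, slope=1.0, sigma=0.8):.2f}")
```

    Varying n in this loop reproduces the familiar trade-off an a priori power analysis explores: the sample size needed to reach, say, 80% power for a given effect size.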

  10. Guidelines and Procedures for Computing Time-Series Suspended-Sediment Concentrations and Loads from In-Stream Turbidity-Sensor and Streamflow Data

    USGS Publications Warehouse

    Rasmussen, Patrick P.; Gray, John R.; Glysson, G. Douglas; Ziegler, Andrew C.

    2009-01-01

    In-stream continuous turbidity and streamflow data, calibrated with measured suspended-sediment concentration data, can be used to compute a time series of suspended-sediment concentration and load at a stream site. Development of a simple linear (ordinary least squares) regression model for computing suspended-sediment concentrations from instantaneous turbidity data is the first step in the computation process. If the model standard percentage error (MSPE) of the simple linear regression model meets a minimum criterion, this model should be used to compute a time series of suspended-sediment concentrations. Otherwise, a multiple linear regression model using paired instantaneous turbidity and streamflow data is developed and compared to the simple regression model. If the inclusion of the streamflow variable proves to be statistically significant and the uncertainty associated with the multiple regression model results in an improvement over that for the simple linear model, the turbidity-streamflow multiple linear regression model should be used to compute a suspended-sediment concentration time series. The computed concentration time series is subsequently used with its paired streamflow time series to compute suspended-sediment loads by standard U.S. Geological Survey techniques. Once an acceptable regression model is developed, it can be used to compute suspended-sediment concentration beyond the period of record used in model development with proper ongoing collection and analysis of calibration samples. Regression models to compute suspended-sediment concentrations are generally site specific and should never be considered static, but they represent a set period in a continually dynamic system in which additional data will help verify any change in sediment load, type, and source.
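    The model-selection step described above, comparing a simple turbidity-only regression against a turbidity-streamflow multiple regression, can be sketched as follows; the calibration data are synthetic, not USGS measurements:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical calibration samples: turbidity, streamflow, and measured
# suspended-sediment concentration (SSC), in arbitrary units
n = 60
turb = rng.uniform(10, 400, n)
flow = rng.uniform(1, 50, n)
ssc = 1.5 * turb + 4.0 * flow + rng.normal(scale=20, size=n)

def fit(predictors, y):
    """OLS via lstsq; returns coefficients and residual standard error."""
    A = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    s = np.sqrt(resid @ resid / (len(y) - A.shape[1]))
    return beta, s

_, s_simple = fit([turb], ssc)        # SSC ~ turbidity
_, s_multi = fit([turb, flow], ssc)   # SSC ~ turbidity + streamflow
print(f"residual SE: simple={s_simple:.1f}, multiple={s_multi:.1f}")
```

    When the streamflow term meaningfully reduces the residual standard error (and is statistically significant), the multiple regression is preferred, mirroring the decision rule in the guidelines.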

  11. Factor regression for interpreting genotype-environment interaction in bread-wheat trials.

    PubMed

    Baril, C P

    1992-05-01

    The French INRA wheat (Triticum aestivum L. em Thell.) breeding program is based on multilocation trials to produce high-yielding, adapted lines for a wide range of environments. Differential genotypic responses to variable environmental conditions limit the accuracy of yield estimation. Factor regression was used to partition the genotype-environment (GE) interaction into four biologically interpretable terms. Yield data were analyzed from 34 wheat genotypes grown in four environments using 12 auxiliary agronomic traits as genotypic and environmental covariates. Most of the GE interaction (91%) was explained by the combination of only three traits: 1,000-kernel weight, lodging susceptibility and spike length. These traits are easily measured in breeding programs; therefore, the factor regression model can provide a convenient and useful method for predicting yield.

  12. CO2 flux determination by closed-chamber methods can be seriously biased by inappropriate application of linear regression

    NASA Astrophysics Data System (ADS)

    Kutzbach, L.; Schneider, J.; Sachs, T.; Giebels, M.; Nykänen, H.; Shurpali, N. J.; Martikainen, P. J.; Alm, J.; Wilmking, M.

    2007-11-01

    Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation by a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during the closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test whether the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times and whether the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration evolution c(t) over time in the chamber headspace and for estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model, which is considered to be caused by violations of the underlying model assumptions. 
Especially the effects of turbulence and pressure disturbances by the chamber deployment are suspected to have caused unexplainable curvatures. CO2 flux estimates by linear regression can be as low as 40% of the flux estimates of exponential regression for closure times of only two minutes. The degree of underestimation increased with increasing CO2 flux strength and was dependent on soil and vegetation conditions which can disturb not only the quantitative but also the qualitative evaluation of CO2 flux dynamics. The underestimation effect by linear regression was observed to be different for CO2 uptake and release situations which can lead to stronger bias in the daily, seasonal and annual CO2 balances than in the individual fluxes. To avoid serious bias of CO2 flux estimates based on closed chamber experiments, we suggest further tests using published datasets and recommend the use of nonlinear regression models for future closed chamber studies.
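    The linear-versus-exponential contrast can be reproduced on simulated chamber data. The saturating form below, c(t) = c_eq - (c_eq - c0)·exp(-kt), follows the general shape implied by diffusion theory; the parameter values are illustrative, not the authors' fitted model:

```python
import numpy as np
from scipy.optimize import curve_fit

# Simulated chamber headspace CO2 concentration (ppm) over a 2-minute
# closure, saturating toward an equilibrium value (illustrative parameters)
t = np.linspace(0, 120, 25)                     # seconds since closure
c_true = 600 - (600 - 400) * np.exp(-0.01 * t)  # ppm
rng = np.random.default_rng(3)
c_obs = c_true + rng.normal(scale=1.0, size=t.size)

def chamber(t, c_eq, c0, k):
    """Saturating concentration evolution in the chamber headspace."""
    return c_eq - (c_eq - c0) * np.exp(-k * t)

popt, _ = curve_fit(chamber, t, c_obs, p0=(650.0, 400.0, 0.005))
c_eq, c0, k = popt
flux_exp = k * (c_eq - c0)             # dc/dt at closure time, ppm/s

flux_lin = np.polyfit(t, c_obs, 1)[0]  # linear-regression slope, ppm/s
print(f"initial flux: exponential={flux_exp:.2f}, linear={flux_lin:.2f} ppm/s")
```

    Even over this short closure, the linear slope sits well below the exponential model's initial flux, the underestimation effect the study reports.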

  13. Organic Wheat Farming Improves Grain Zinc Concentration

    PubMed Central

    Helfenstein, Julian; Müller, Isabel; Grüter, Roman; Bhullar, Gurbir; Mandloi, Lokendra; Papritz, Andreas; Siegrist, Michael; Schulin, Rainer; Frossard, Emmanuel

    2016-01-01

    Zinc (Zn) nutrition is of key relevance in India, as a large fraction of the population suffers from Zn malnutrition and many soils contain little plant available Zn. In this study we compared organic and conventional wheat cropping systems with respect to DTPA (diethylene triamine pentaacetic acid)-extractable Zn as a proxy for plant available Zn, yield, and grain Zn concentration. We analyzed soil and wheat grain samples from 30 organic and 30 conventional farms in Madhya Pradesh (central India), and conducted farmer interviews to elucidate sociological and management variables. Total and DTPA-extractable soil Zn concentrations and grain yield (3400 kg ha-1) did not differ between the two farming systems, but grain Zn concentrations were higher on organic than on conventional farms (32 vs 28 mg kg-1; t = -2.2, p = 0.03). Furthermore, multiple linear regression analyses revealed that (a) total soil zinc and sulfur concentrations were the best predictors of DTPA-extractable soil Zn, (b) Olsen phosphate taken as a proxy for available soil phosphorus, exchangeable soil potassium, harvest date, training of farmers in nutrient management, and soil silt content were the best predictors of yield, and (c) yield, Olsen phosphate, grain nitrogen, farmyard manure availability, and the type of cropping system were the best predictors of grain Zn concentration. Results suggested that organic wheat contained more Zn despite the same yield level, owing to higher nutrient efficiency. Higher nutrient efficiency was also seen in organic wheat for P, N and S. The study thus suggests that appropriate farm management can lead to competitive yield and improved Zn concentration in wheat grains on organic farms. PMID:27537548

  14. Biochemical methane potential, biodegradability, alkali treatment and influence of chemical composition on methane yield of yard wastes.

    PubMed

    Gunaseelan, Victor Nallathambi

    2016-03-01

    In this study, the biochemical CH4 potential, rate, biodegradability, NaOH treatment and the influence of chemical composition on CH4 yield of yard wastes generated from seven trees were examined. All the plant parts were sampled for their chemical composition and subjected to the biochemical CH4 potential assay. The component parts exhibited significant variation in biochemical CH4 potential, which was reflected in their ultimate CH4 yields that ranged from 109 to 382 ml g-1 volatile solids added and their rate constants that ranged from 0.042 to 0.173 d-1. The biodegradability of the yard wastes ranged from 0.26 to 0.86. Variation in the biochemical CH4 potential of the yard wastes could be attributed to variation in the chemical composition of the different fractions. In the Thespesia yellow withered leaf, Tamarindus fruit pericarp and Albizia pod husk, NaOH treatment enhanced the ultimate CH4 yields by 17%, 77% and 63%, respectively, and biodegradability by 15%, 77% and 61%, respectively, compared with the untreated samples. The effectiveness of NaOH treatment varied for different yard wastes, depending on the amount of acid detergent fibre. Gliricidia petals, Prosopis leaf, inflorescence and immature pod, Tamarindus seeds, Albizia seeds, Cassia seeds and Delonix seeds exhibited CH4 yields higher than 300 ml g-1 volatile solids added. Multiple linear regression models for predicting the ultimate CH4 yield and biodegradability of yard wastes were designed from the results of this work. © The Author(s) 2016.

  15. Organic Wheat Farming Improves Grain Zinc Concentration.

    PubMed

    Helfenstein, Julian; Müller, Isabel; Grüter, Roman; Bhullar, Gurbir; Mandloi, Lokendra; Papritz, Andreas; Siegrist, Michael; Schulin, Rainer; Frossard, Emmanuel

    2016-01-01

    Zinc (Zn) nutrition is of key relevance in India, as a large fraction of the population suffers from Zn malnutrition and many soils contain little plant available Zn. In this study we compared organic and conventional wheat cropping systems with respect to DTPA (diethylene triamine pentaacetic acid)-extractable Zn as a proxy for plant available Zn, yield, and grain Zn concentration. We analyzed soil and wheat grain samples from 30 organic and 30 conventional farms in Madhya Pradesh (central India), and conducted farmer interviews to elucidate sociological and management variables. Total and DTPA-extractable soil Zn concentrations and grain yield (3400 kg ha-1) did not differ between the two farming systems, but grain Zn concentrations were higher on organic than on conventional farms (32 vs 28 mg kg-1; t = -2.2, p = 0.03). Furthermore, multiple linear regression analyses revealed that (a) total soil zinc and sulfur concentrations were the best predictors of DTPA-extractable soil Zn, (b) Olsen phosphate taken as a proxy for available soil phosphorus, exchangeable soil potassium, harvest date, training of farmers in nutrient management, and soil silt content were the best predictors of yield, and (c) yield, Olsen phosphate, grain nitrogen, farmyard manure availability, and the type of cropping system were the best predictors of grain Zn concentration. Results suggested that organic wheat contained more Zn despite the same yield level, owing to higher nutrient efficiency. Higher nutrient efficiency was also seen in organic wheat for P, N and S. The study thus suggests that appropriate farm management can lead to competitive yield and improved Zn concentration in wheat grains on organic farms.

  16. The role of climatic variables in winter cereal yields: a retrospective analysis.

    PubMed

    Luo, Qunying; Wen, Li

    2015-02-01

    This study examined the effects of observed climate, including [CO2], on winter cereal [winter wheat (Triticum aestivum), barley (Hordeum vulgare) and oat (Avena sativa)] yields by adopting robust statistical analysis/modelling approaches (i.e. autoregressive fractionally integrated moving average and generalised additive models) based on long time series of historical climate data and cereal yield data at three locations (Moree, Dubbo and Wagga Wagga) in New South Wales, Australia. Research results show that (1) growing season rainfall was significantly, positively and non-linearly correlated with crop yield at all locations considered; (2) [CO2] was significantly, positively and non-linearly correlated with crop yields in all cases except wheat and barley yields at Wagga Wagga; (3) growing season maximum temperature was significantly, negatively and non-linearly correlated with crop yields at Dubbo and Moree (except for barley); and (4) radiation was only significantly correlated with oat yield at Wagga Wagga. This information will help to identify appropriate management adaptation options for dealing with the risks and taking the opportunities of climate change.

  17. Retrieving relevant factors with exploratory SEM and principal-covariate regression: A comparison.

    PubMed

    Vervloet, Marlies; Van den Noortgate, Wim; Ceulemans, Eva

    2018-02-12

    Behavioral researchers often linearly regress a criterion on multiple predictors, aiming to gain insight into the relations between the criterion and predictors. Obtaining this insight from the ordinary least squares (OLS) regression solution may be troublesome, because OLS regression weights show only the effect of a predictor on top of the effects of other predictors. Moreover, when the number of predictors grows larger, it becomes likely that the predictors will be highly collinear, which makes the regression weights' estimates unstable (i.e., the "bouncing beta" problem). Among other procedures, dimension-reduction-based methods have been proposed for dealing with these problems. These methods yield insight into the data by reducing the predictors to a smaller number of summarizing variables and regressing the criterion on these summarizing variables. Two promising methods are principal-covariate regression (PCovR) and exploratory structural equation modeling (ESEM). Both simultaneously optimize reduction and prediction, but they are based on different frameworks. The resulting solutions have not yet been compared; it is thus unclear what the strengths and weaknesses are of both methods. In this article, we focus on the extents to which PCovR and ESEM are able to extract the factors that truly underlie the predictor scores and can predict a single criterion. The results of two simulation studies showed that for a typical behavioral dataset, ESEM (using the BIC for model selection) in this regard is successful more often than PCovR. Yet, in 93% of the datasets PCovR performed equally well, and in the case of 48 predictors, 100 observations, and large differences in the strengths of the factors, PCovR even outperformed ESEM.
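PCovR itself optimizes reduction and prediction jointly; as a minimal baseline for the reduce-then-regress idea the abstract describes, plain principal-component regression can be sketched like this (the synthetic two-factor data is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two latent factors generate 12 collinear predictors (illustrative data).
n, p, k = 300, 12, 2
F = rng.normal(size=(n, k))                  # true factor scores
loadings = rng.normal(size=(k, p))
X = F @ loadings + 0.1 * rng.normal(size=(n, p))
y = F @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=n)

# Step 1: reduce the predictors to k summarizing components via SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T = Xc @ Vt[:k].T                            # component scores

# Step 2: regress the criterion on the component scores, not the raw predictors.
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), T]), y, rcond=None)
pred = np.column_stack([np.ones(n), T]) @ coef
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
```

Because the criterion is driven by the same factors that generate the predictors, regressing on the two components sidesteps the "bouncing beta" instability of regressing on 12 collinear predictors.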

  18. Comparison between a Weibull proportional hazards model and a linear model for predicting the genetic merit of US Jersey sires for daughter longevity.

    PubMed

    Caraviello, D Z; Weigel, K A; Gianola, D

    2004-05-01

    Predicted transmitting abilities (PTA) of US Jersey sires for daughter longevity were calculated using a Weibull proportional hazards sire model and compared with predictions from a conventional linear animal model. Culling data from 268,008 Jersey cows with first calving from 1981 to 2000 were used. The proportional hazards model included time-dependent effects of herd-year-season contemporary group and parity by stage of lactation interaction, as well as time-independent effects of sire and age at first calving. Sire variances and parameters of the Weibull distribution were estimated, providing heritability estimates of 4.7% on the log scale and 18.0% on the original scale. The PTA of each sire was expressed as the expected risk of culling relative to daughters of an average sire. Risk ratios (RR) ranged from 0.7 to 1.3, indicating that the risk of culling for daughters of the best sires was 30% lower than for daughters of average sires and nearly 50% lower than for daughters of the poorest sires. Sire PTA from the proportional hazards model were compared with PTA from a linear model similar to that used for routine national genetic evaluation of length of productive life (PL) using cross-validation in independent samples of herds. Models were compared using logistic regression of daughters' stayability to second, third, fourth, or fifth lactation on their sires' PTA values, with alternative approaches for weighting the contribution of each sire. Models were also compared using logistic regression of daughters' stayability to 36, 48, 60, 72, and 84 mo of life. The proportional hazards model generally yielded more accurate predictions according to these criteria, but differences in predictive ability between methods were smaller when using a Kullback-Leibler distance than with other approaches.
Results of this study suggest that survival analysis methodology may provide more accurate predictions of genetic merit for longevity than conventional linear models.

  19. Computation of nonlinear least squares estimator and maximum likelihood using principles in matrix calculus

    NASA Astrophysics Data System (ADS)

    Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.

    2017-11-01

    This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), Maximum Likelihood Estimator (MLE) and linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE; the present research paper, however, introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute the MLE and NLSE. Anh [2] derived the NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to get a linear pseudo model for a nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for the nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. In 2006, David Pollard et al. used empirical process techniques to study the asymptotics of the least-squares estimator (LSE) for fitting nonlinear regression functions. In Jae Myung [13] provided a good conceptual introduction to maximum likelihood estimation in his work "Tutorial on maximum likelihood estimation".
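The paper's matrix-calculus derivation is not reproduced here, but the standard iterative computation of an NLSE can be sketched with a damped Gauss-Newton loop; the exponential-decay model, data, and starting values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative nonlinear model (not from the paper): y = a * exp(-b * x) + noise
x = np.linspace(0.0, 4.0, 50)
a_true, b_true = 2.5, 0.8
y = a_true * np.exp(-b_true * x) + 0.01 * rng.normal(size=x.size)

def residuals(theta):
    a, b = theta
    return y - a * np.exp(-b * x)

def jacobian(theta):
    # derivative of the residual r = y - a*exp(-b*x) with respect to (a, b)
    a, b = theta
    e = np.exp(-b * x)
    return np.column_stack([-e, a * x * e])

theta = np.array([1.0, 0.3])                       # rough starting guess
for _ in range(100):
    J, r = jacobian(theta), residuals(theta)
    step = np.linalg.solve(J.T @ J, -J.T @ r)      # Gauss-Newton direction
    t = 1.0
    while t > 1e-8 and np.sum(residuals(theta + t * step) ** 2) > np.sum(r ** 2):
        t *= 0.5                                   # backtrack to guarantee descent
    theta = theta + t * step
```

The Gauss-Newton step solves the linearized least-squares problem at each iterate; the backtracking line search keeps the sum of squared residuals monotonically decreasing.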

  20. A method for fitting regression splines with varying polynomial order in the linear mixed model.

    PubMed

    Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W

    2006-02-15

    The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
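A minimal sketch of the implicit-constraint idea: in a truncated power basis, continuity (and smoothness up to one order below the polynomial degree) at the knots is automatic, so a constrained piecewise fit reduces to ordinary least squares on the basis. The knot locations and test signal below are illustrative assumptions, not the paper's data.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Cubic truncated power basis: global polynomial terms plus one truncated
    term per knot.  Continuity up to degree-1 at each knot is built into the
    basis itself, so no explicit side conditions are needed."""
    cols = [x ** d for d in range(degree + 1)]
    cols += [np.where(x > k, (x - k) ** degree, 0.0) for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y = np.sin(x) + 0.1 * rng.normal(size=x.size)      # smooth illustrative signal

B = truncated_power_basis(x, knots=[2.5, 5.0, 7.5])
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
fit = B @ beta
rmse = np.sqrt(np.mean((y - fit) ** 2))
```

In a mixed-model setting the same basis columns would be split between the fixed-effects and random-effects design matrices; here a plain least-squares fit suffices to show the reparameterization.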

  1. Statistical Methods in Ai: Rare Event Learning Using Associative Rules and Higher-Order Statistics

    NASA Astrophysics Data System (ADS)

    Iyer, V.; Shetty, S.; Iyengar, S. S.

    2015-07-01

    Rare event learning has not been actively researched until lately due to the unavailability of algorithms which deal with big samples. The research addresses spatio-temporal streams from multi-resolution sensors to find actionable items from a perspective of real-time algorithms. This computing framework is independent of the number of input samples, application domain, and labelled or label-less streams. A sampling overlap algorithm such as Brooks-Iyengar is used for dealing with noisy sensor streams. We extend the existing noise pre-processing algorithms using Data-Cleaning trees. Pre-processing using an ensemble of trees with bagging and multi-target regression showed robustness to random noise and missing data. As spatio-temporal streams are highly statistically correlated, we prove that temporal window based sampling from sensor data streams converges after n samples using Hoeffding bounds, which can be used for fast prediction of new samples in real-time. The Data-Cleaning tree model uses a nonparametric node splitting technique, which can be learned in an iterative way that scales linearly in memory consumption for any size input stream. The improved task based ensemble extraction is compared with non-linear computation models using various SVM kernels for speed and accuracy. We show using empirical datasets that the explicit rule learning computation is linear in time and is only dependent on the number of leaves present in the tree ensemble. The use of unpruned trees (t) in our proposed ensemble always yields a minimum number (m) of leaves, keeping pre-processing computation to n × t log m compared to N^2 for the Gram matrix. We also show that the task based feature induction yields higher Quality of Data (QoD) in the feature space compared to kernel methods using the Gram matrix.
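The Hoeffding-bound argument can be made concrete: for observations bounded in a known range, Hoeffding's inequality gives the number of samples after which the empirical mean of a window is within ε of the true mean with confidence 1 − δ. A minimal sketch (the ε and δ values are illustrative):

```python
import math

def hoeffding_sample_size(eps, delta, value_range=1.0):
    """Smallest n with 2*exp(-2*n*eps^2 / range^2) <= delta, i.e. the
    empirical mean of n observations bounded in an interval of width `range`
    is within eps of the true mean with probability at least 1 - delta."""
    return math.ceil(value_range ** 2 * math.log(2.0 / delta) / (2.0 * eps ** 2))

# Samples needed for a 0.05 accuracy guarantee at 99% confidence:
n = hoeffding_sample_size(eps=0.05, delta=0.01)
```

The bound depends only on ε, δ and the value range, not on the stream length, which is what makes it usable for fixed-size temporal windows over unbounded streams.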

  2. GIS Tools to Estimate Average Annual Daily Traffic

    DOT National Transportation Integrated Search

    2012-06-01

    This project presents five tools that were created for a geographical information system to estimate Average Annual Daily Traffic using linear regression. Three of the tools can be used to prepare spatial data for linear regression. One tool can be...

  3. Multivariate classification of the infrared spectra of cell and tissue samples

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Haaland, D.M.; Jones, H.D.; Thomas, E.V.

    1997-03-01

    Infrared microspectroscopy of biopsied canine lymph cells and tissue was performed to investigate the possibility of using IR spectra coupled with multivariate classification methods to classify the samples as normal, hyperplastic, or neoplastic (malignant). IR spectra were obtained in transmission mode through BaF2 windows and in reflection mode from samples prepared on gold-coated microscope slides. Cytology and histopathology samples were prepared by a variety of methods to identify the optimal methods of sample preparation. Cytospinning procedures that yielded a monolayer of cells on the BaF2 windows produced a limited set of IR transmission spectra. These transmission spectra were converted to absorbance and formed the basis for a classification rule that yielded 100% correct classification in a cross-validated context. Classifications of normal, hyperplastic, and neoplastic cell sample spectra were achieved by using both partial least-squares (PLS) and principal component regression (PCR) classification methods. Linear discriminant analysis applied to principal components obtained from the spectral data yielded a small number of misclassifications. PLS weight loading vectors yield valuable qualitative insight into the molecular changes that are responsible for the success of the infrared classification. These successful classification results show promise for assisting pathologists in the diagnosis of cell types and offer future potential for in vivo IR detection of some types of cancer. © 1997 Society for Applied Spectroscopy

  4. Evaluation of weather-based rice yield models in India.

    PubMed

    Sudharsan, D; Adinarayana, J; Reddy, D Raji; Sreenivas, G; Ninomiya, S; Hirafuji, M; Kiura, T; Tanaka, K; Desai, U B; Merchant, S N

    2013-01-01

    The objective of this study was to compare two different rice simulation models--standalone (Decision Support System for Agrotechnology Transfer [DSSAT]) and web based (SImulation Model for RIce-Weather relations [SIMRIW])--with agrometeorological data and agronomic parameters for estimation of rice crop production in southern semi-arid tropics of India. Studies were carried out on the BPT5204 rice variety to evaluate two crop simulation models. Long-term experiments were conducted in a research farm of Acharya N G Ranga Agricultural University (ANGRAU), Hyderabad, India. Initially, the results were obtained using 4 years (1994-1997) of data with weather parameters from a local weather station to evaluate DSSAT simulated results with observed values. Linear regression models used for the purpose showed a close relationship between DSSAT and observed yield. Subsequently, yield comparisons were also carried out with SIMRIW and DSSAT, and validated with actual observed values. As the correlation coefficients of the SIMRIW simulated values were within acceptable limits, further rice experiments in monsoon (Kharif) and post-monsoon (Rabi) agricultural seasons (2009, 2010 and 2011) were carried out with a location-specific distributed sensor network system. These proximal systems help to simulate dry weight, leaf area index and potential yield by the Java based SIMRIW on a daily/weekly/monthly/seasonal basis. These dynamic parameters are useful to the farming community for necessary decision making in a ubiquitous manner. However, SIMRIW requires fine tuning for better results/decision making.

  5. Variable sensitivity of US maize yield to high temperatures across developmental stages

    NASA Astrophysics Data System (ADS)

    Butler, E. E.; Huybers, P. J.

    2013-12-01

    The sensitivity of maize to high temperatures has been widely demonstrated. Furthermore, field work has indicated that reproductive development stages are particularly sensitive to stress, but this relationship has not been quantified across a wide geographic region. Here, the relationship between maize yield and temperature variations is examined as a function of developmental stage. US state-level data from the National Agriculture Statistics Service provide dates for six growing stages: planting, silking, doughing, dented, mature, and harvested. Temperatures that correspond to each developmental stage are then inferred from a network of weather station observations interpolated to the county level, and a multiple linear regression technique is employed to estimate the sensitivity of county yield outcomes to variations in growing-degree days and an analogous measure of high temperatures referred to as killing-degree days. Uncertainties in the transition times between county-level growth stages are accounted for. Results indicate that the silking and dented stages are generally the most sensitive to killing degree days, with silking the most sensitive stage in the US South and dented the most sensitive in the US North. These variable patterns of sensitivity aid in interpreting which weather events are of greatest significance to maize yields and provide some insight into how shifts in planting time or changes in developmental timing would influence the risks associated with exposure to high temperatures.
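A minimal sketch of the two temperature measures used in the regression: growing-degree days accumulate heat between a base and an upper threshold, while killing-degree days accumulate heat above that threshold. The 9 °C / 29 °C thresholds are common choices for maize but are assumptions here, not values from the paper.

```python
import numpy as np

def degree_days(tmax, tmin, base=9.0, upper=29.0):
    """Growing-degree days (GDD) accumulate daily mean temperature between a
    base and an upper threshold; killing-degree days (KDD) accumulate heat in
    excess of the upper threshold.  Thresholds (9 C / 29 C) are illustrative."""
    tmean = (np.asarray(tmax, float) + np.asarray(tmin, float)) / 2.0
    gdd = np.clip(tmean, base, upper) - base
    kdd = np.maximum(np.asarray(tmax, float) - upper, 0.0)
    return gdd.sum(), kdd.sum()

# Three illustrative days, the last two with heat above the 29 C threshold:
gdd, kdd = degree_days(tmax=[25, 31, 35], tmin=[15, 19, 23])
```

Per developmental stage, yield would then be regressed on the stage-specific GDD and KDD sums, so the KDD coefficient isolates the damage from high-temperature exposure.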

  6. Estimating extent of mortality associated with the Douglas-fir beetle in the Central and Northern Rockies

    Treesearch

    Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini

    1999-01-01

    Data collected from Douglas-fir stands infested by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...

  7. [Prediction model of health workforce and beds in county hospitals of Hunan by multiple linear regression].

    PubMed

    Ling, Ru; Liu, Jiawang

    2011-12-01

    To construct a prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan with stratified random sampling using uniform questionnaires, and performed multiple linear regression analysis with 20 indicators selected by literature review. Independent variables in the multiple linear regression model on medical personnel in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the population aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in the county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory power and fit, and may be used for short- and mid-term forecasting.

  8. A new approach to correct the QT interval for changes in heart rate using a nonparametric regression model in beagle dogs.

    PubMed

    Watanabe, Hiroyuki; Miyazaki, Hiroyasu

    2006-01-01

    Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correction models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric methods. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
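For reference, the two fixed-exponent parametric corrections that the nonparametric model was compared against can be written directly; the QT/RR values below are illustrative, not the study's dog data.

```python
def bazett_qtc(qt_ms, rr_s):
    """Bazett's correction: QTc = QT / sqrt(RR), with QT in ms and RR in s."""
    return qt_ms / rr_s ** 0.5

def fridericia_qtc(qt_ms, rr_s):
    """Fridericia's correction: QTc = QT / RR**(1/3)."""
    return qt_ms / rr_s ** (1.0 / 3.0)

# At RR = 1 s (60 beats/min) both corrections leave QT unchanged.
q1 = bazett_qtc(400.0, 1.0)
q2 = fridericia_qtc(400.0, 1.0)
# At a faster heart rate (RR = 0.5 s) Bazett corrects more aggressively.
q3 = bazett_qtc(300.0, 0.5)
q4 = fridericia_qtc(300.0, 0.5)
```

Because both formulas impose a single fixed exponent across all heart rates, they tend to over- or under-correct at the extremes, which is the motivation for the loess-based alternative.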

  9. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    PubMed

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.

  10. Morphodynamic data assimilation used to understand changing coasts

    USGS Publications Warehouse

    Plant, Nathaniel G.; Long, Joseph W.

    2015-01-01

    Morphodynamic data assimilation blends observations with model predictions and comes in many forms, including linear regression, Kalman filter, brute-force parameter estimation, variational assimilation, and Bayesian analysis. Importantly, data assimilation can be used to identify sources of prediction errors that lead to improved fundamental understanding. Overall, models incorporating data assimilation yield better information to the people who must make decisions impacting safety and wellbeing in coastal regions that experience hazards due to storms, sea-level rise, and erosion. We present examples of data assimilation associated with morphologic change. We conclude that enough morphodynamic predictive capability is available now to be useful to people, and that we will increase our understanding and the level of detail of our predictions through assimilation of observations and numerical-statistical models.
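Of the assimilation forms listed, the Kalman filter update is the simplest to sketch for a single morphologic state such as shoreline position; the numbers below are illustrative assumptions.

```python
def kalman_update(x_pred, p_pred, z, r):
    """Blend a model prediction x_pred (variance p_pred) with an observation z
    (variance r): the gain weights whichever source is more certain."""
    k = p_pred / (p_pred + r)          # Kalman gain in [0, 1]
    x = x_pred + k * (z - x_pred)      # updated state estimate
    p = (1.0 - k) * p_pred             # updated (reduced) variance
    return x, p, k

# Model predicts shoreline at 50.0 m (variance 4.0); a survey observes 48.0 m
# (variance 1.0).  The update pulls toward the more precise observation.
x, p, k = kalman_update(50.0, 4.0, 48.0, 1.0)
```

The reduced posterior variance is the quantitative sense in which assimilation "yields better information" than either the model or the observations alone.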

  11. An extended Kalman-Bucy filter for atmospheric temperature profile retrieval with a passive microwave sounder

    NASA Technical Reports Server (NTRS)

    Ledsham, W. H.; Staelin, D. H.

    1978-01-01

    An extended Kalman-Bucy filter has been implemented for atmospheric temperature profile retrievals from observations made using the Scanned Microwave Spectrometer (SCAMS) instrument carried on the Nimbus 6 satellite. This filter has the advantage that it requires neither stationary statistics in the underlying processes nor a linear mapping from the variables to be estimated to the observed variables. This extended Kalman-Bucy filter has yielded significant performance improvement relative to multiple regression retrieval methods. A multi-spot extended Kalman-Bucy filter has also been developed in which the temperature profiles at a number of scan angles in a scanning instrument are retrieved simultaneously. These multi-spot retrievals are shown to outperform the single-spot Kalman retrievals.

  12. Yield and yield gaps in central U.S. corn production systems

    USDA-ARS?s Scientific Manuscript database

    The magnitude of yield gaps (YG) (potential yield – farmer yield) provides some indication of the prospects for increasing crop yield. Quantile regression analysis was applied to county maize (Zea mays L.) yields (1972 – 2011) from Kentucky, Iowa and Nebraska (irrigated) (total of 115 counties) to e...

  13. Yield gaps and yield relationships in US soybean production systems

    USDA-ARS?s Scientific Manuscript database

    The magnitude of yield gaps (YG) (potential yield – farmer yield) provides some indication of the prospects for increasing crop yield to meet the food demands of future populations. Quantile regression analysis was applied to county soybean [Glycine max (L.) Merrill] yields (1971 – 2011) from Kentuc...

  14. Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines.

    PubMed

    Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William

    2016-01-01

    Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and accounts for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines with linear piecewise splines, with varying numbers of knots and knot positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p < 0.001) when using a linear mixed-effect model with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercepts (p < 0.001) and slopes (p < 0.001) of the individual growth trajectories. We also identified important serial correlation within the structure of the data (ρ = 0.66; 95 % CI 0.64 to 0.68; p < 0.001), which we modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. 
The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19,598, respectively). While the regression parameters are more complex to interpret in the former, we argue that inference for any problem depends more on the estimated curve or differences in curves rather than the coefficients. Moreover, use of cubic regression splines provides biological meaningful growth velocity and acceleration curves despite increased complexity in coefficient interpretation. Through this stepwise approach, we provide a set of tools to model longitudinal childhood data for non-statisticians using linear mixed-effect models.

  15. Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach

    NASA Astrophysics Data System (ADS)

    Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew

    2017-05-01

    This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem, and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with CLR fitted in the maximum likelihood framework via the expectation-maximization algorithm, with multiple linear regression, with artificial neural networks, and with support vector machines for regression. The results demonstrate that the proposed algorithm outperforms the other methods in most locations.
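The CLR idea can be sketched with a naive alternating algorithm; the paper formulates CLR as a nonsmooth optimization problem with an incremental solver, so this simplified version and its synthetic two-regime data are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two hidden regimes with different linear responses (illustrative data,
# standing in for stations or months with distinct rainfall regimes).
n = 300
x = rng.uniform(0.0, 1.0, n)
z = rng.integers(0, 2, n)                        # true (unobserved) cluster
y = np.where(z == 0, 2.0 + 3.0 * x, 8.0 - 4.0 * x) + 0.05 * rng.normal(size=n)

# Alternating CLR sketch: assign each point to the line that fits it best,
# refit each line on its own points, and repeat until stable.
A = np.column_stack([np.ones(n), x])
coefs = np.array([[2.5, 2.0], [7.5, -3.0]])      # crude initial lines
for _ in range(20):
    resid = np.stack([np.abs(y - A @ c) for c in coefs])   # (2, n)
    labels = resid.argmin(axis=0)
    for j in range(len(coefs)):
        mask = labels == j
        if mask.sum() >= 2:
            coefs[j], *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)

rmse = np.sqrt(np.mean(np.min(np.stack(
    [np.abs(y - A @ c) for c in coefs]), axis=0) ** 2))
```

A single global regression would average the two regimes away; letting clustering and regression alternate recovers both lines, which is the advantage CLR exploits for heterogeneous rainfall records.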

  16. Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
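One standard metric for flagging near-linear dependencies between regression model terms is the variance inflation factor; a minimal sketch follows (the candidate terms below are illustrative stand-ins, not balance calibration data).

```python
import numpy as np

rng = np.random.default_rng(5)

# Candidate regression-model terms: x, x^2, and a term nearly collinear
# with x (illustrative stand-ins for strain-gage balance load terms).
n = 100
x = rng.uniform(-1.0, 1.0, n)
terms = np.column_stack([x, x ** 2, x + 1e-4 * rng.normal(size=n)])

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on
    the remaining columns; large values flag near-linear dependencies."""
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    vifs = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ beta
        r2 = 1.0 - resid.var() / X[:, j].var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

vifs = variance_inflation_factors(terms)
```

The first and third terms produce very large VIFs (each is nearly a linear function of the other), while the quadratic term does not; an analyst would drop or recombine one of the offending terms before fitting the calibration model.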

  17. Urinary trichloroacetic acid levels and semen quality: A hospital-based cross-sectional study in Wuhan, China

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xie, Shao-Hua; The Ministry of Education Key Laboratory of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan; Li, Yu-Feng

    Toxicological studies indicate an association between exposure to disinfection by-products (DBPs) and impaired male reproductive health in animals. However, epidemiological evidence in humans is still limited. We conducted a hospital-based cross-sectional study to investigate the effect of exposure to DBPs on semen quality in humans. Between May 2008 and July 2008, we recruited 418 male partners in sub-fertile couples seeking infertility medical instruction or assisted reproduction services from the Tongji Hospital in Wuhan, China. Major semen parameters analyzed included sperm concentration, motility, and morphology. Exposure to DBPs was estimated by their urinary creatinine-adjusted trichloroacetic (TCAA) concentrations that were measured with the gas chromatography/electron capture detection method. We used linear regression to assess the relationship between exposure to DBPs and semen quality. According to the World Health Organization criteria (<20 million/mL for sperm concentration and <50% motile for sperm motility) and threshold value recommended by Guzick (<9% for sperm morphology), there were 265 men with all parameters at or above the reference values, 33 men below the reference sperm concentration, 151 men below the reference sperm motility, and 6 men below the reference sperm morphology. The mean (median) urinary creatinine-adjusted TCAA concentration was 9.2 (5.1) µg/g creatinine. Linear regression analyses indicated no significant association of sperm concentration, sperm count, and sperm morphology with urinary TCAA levels. Compared with those in the lowest quartile of creatinine-adjusted urinary TCAA concentrations, subjects in the second and third quartiles had a decrease of 5.1% (95% CI: 0.6%, 9.7%) and 4.7% (95% CI: 0.2%, 9.2%) in percent motility, respectively. However, these associations were not significant after adjustment for age, abstinence time, and smoking status. 
The present study provides suggestive but inconclusive evidence of the relationship between decreased sperm motility and increased urinary TCAA levels. The effect of exposure to DBPs on human male reproductive health in Chinese populations still warrants further investigations. Research highlights: No association between DBPs exposure and semen quality was found. Effects of DBPs exposure on male reproductive health need further investigations. Intra-individual variability of urinary TCAA should be considered in the future.

  18. Evidence for a weakening strength of temperature-corn yield relation in the United States during 1980–2010

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leng, Guoyong

    Temperature is known to be correlated with crop yields, causing reduction of crop yield with climate warming without adaptations or CO2 fertilization effects. The historical temperature-crop yield relation has often been used for informing future changes. This relationship, however, may change over time following alternations in other environmental factors. Results show that the strength of the relationship between the interannual variability of growing season temperature and corn yield (RGST_CY) has declined in the United States between 1980 and 2010 with a loss in the statistical significance. The regression slope which represents the anomalies in corn yield that occur in association with 1 degree temperature anomaly has decreased significantly from -6.9%/K of the first half period to -2.4%/K~-3.5%/K of the second half period. This implies that projected corn yield reduction will be overestimated by a factor of 2 in a given warming scenario, if the corn-temperature relation is derived from the earlier historical period. Changes in RGST_CY are mainly observed in Midwest Corn Belt and central High Plains, and are well reproduced by 11 process-based crop models. In Midwest rain-fed systems, the decrease of negative temperature effects coincides with an increase in water availability by precipitation. In irrigated areas where water stress is minimized, the decline of beneficial temperature effects is significantly related to the increase in extreme hot days. The results indicate that an extrapolation of historical yield response to temperature may bias the assessment of agriculture vulnerability to climate change. Efforts to reduce climate impacts on agriculture should pay attention not only to climate change, but also to changes in climate-crop yield relations. 
There are some caveats that should be acknowledged as the analysis is restricted to the changes in the linear relation between growing season mean temperature and corn yield for the specific study period.« less
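
The core computation described above, an OLS slope of corn-yield anomalies on growing-season temperature anomalies fitted separately to each half of the period, can be sketched as follows. The anomaly series below are synthetic, constructed only to mimic the reported weakening; the actual study uses observed US yield records.

```python
import numpy as np

rng = np.random.default_rng(0)

def yield_temp_slope(temp_anom, yield_anom):
    """OLS slope of yield anomalies (%) on temperature anomalies (K)."""
    slope, _intercept = np.polyfit(temp_anom, yield_anom, 1)
    return slope

t1 = rng.normal(0.0, 1.0, 16)                # 1980-1995 temperature anomalies
y1 = -6.9 * t1 + rng.normal(0.0, 1.0, 16)    # steep early-period response
t2 = rng.normal(0.0, 1.0, 15)                # 1996-2010 temperature anomalies
y2 = -2.9 * t2 + rng.normal(0.0, 1.0, 15)    # weaker late-period response

early = yield_temp_slope(t1, y1)
late = yield_temp_slope(t2, y2)
print(f"early slope: {early:.1f} %/K, late slope: {late:.1f} %/K")
```

The weakening relation appears as the late-period slope being closer to zero than the early-period one, which is exactly the comparison behind the "overestimated by a factor of 2" claim.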

  19. Evidence for a weakening strength of temperature-corn yield relation in the United States during 1980-2010.

    PubMed

    Leng, Guoyong

    2017-12-15

    Temperature is known to be correlated with crop yields, implying reductions in crop yield with climate warming in the absence of adaptations or CO2 fertilization effects. The historical temperature-crop yield relation has often been used for informing future changes. This relationship, however, may change over time following alterations in other environmental factors. Results show that the strength of the relationship between the interannual variability of growing season temperature and corn yield (RGST_CY) has declined in the United States between 1980 and 2010 with a loss of statistical significance. The regression slope, which represents the anomaly in corn yield that occurs in association with a 1 K temperature anomaly, has decreased significantly from -6.9%/K in the first half of the period to -2.4%/K to -3.5%/K in the second half. This implies that projected corn yield reduction will be overestimated by a factor of 2 in a given warming scenario if the corn-temperature relation is derived from the earlier historical period. Changes in RGST_CY are mainly observed in the Midwest Corn Belt and central High Plains, but are partly reproduced by 11 process-based crop models. In Midwest rain-fed systems, the decrease of negative temperature effects coincides with an increase in water availability from precipitation. In irrigated areas where water stress is minimized, the decline of beneficial temperature effects is significantly related to the increase in extreme hot days. The results indicate that an extrapolation of historical yield response to temperature may bias the assessment of agricultural vulnerability to climate change. Efforts to reduce climate impacts on agriculture should pay attention not only to climate change, but also to changes in climate-crop yield relations.
There are some caveats that should be acknowledged, as the analysis is restricted to changes in the linear relation between growing season mean temperature and corn yield for the specific study period. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. The genetic relationship between commencement of luteal activity and calving interval, body condition score, production, and linear type traits in Holstein-Friesian dairy cattle.

    PubMed

    Royal, M D; Pryce, J E; Woolliams, J A; Flint, A P F

    2002-11-01

    The decline of fertility in the UK dairy herd and the unfavorable genetic correlation (r(a)) between fertility and milk yield have necessitated the broadening of breeding goals to include fertility. The coefficient of genetic variation present in fertility is of similar magnitude to that present in production traits; however, traditional measurements of fertility (such as calving interval, days open, and nonreturn rate) have low heritability (h2 < 0.05), and recording is often poor, hindering identification of genetically superior animals. An alternative approach is to use endocrine measurements of fertility such as the interval to commencement of luteal activity postpartum (CLA), which has a higher h2 (0.16 to 0.23) and is free from management bias. Although CLA has favorable phenotypic correlations with traditional measures of fertility, if it is to be used in a selection index, the genetic correlation (r(a)) of this trait with fertility and other components of the index must be estimated. The aim of the analyses reported here was to obtain information on the r(a) between lnCLA and calving interval (CI), average body condition score (BCS; scale one to nine, an indicator of energy balance estimated from records taken at different months of lactation), production, and a number of linear type traits. Genetic models were fitted using ASREML, and r(a) were inferred from genetic regression of lnCLA on sire-predicted transmitting abilities (PTA) for the trait concerned by multiplying the regression coefficient (b) by the ratio of the genetic standard deviations. The inferred r(a) between lnCLA and CI and average BCS were 0.36 and -0.84, respectively. Genetic correlations between lnCLA and milk, fat and protein yields were all positive and ranged between 0.33 and 0.69. Genetic correlations between lnCLA and linear type traits reflecting body structure ranged from -0.25 to 0.15, and those with udder characteristics ranged from -0.16 to 0.05. 
Thus, incorporation of endocrine parameters of fertility, such as CLA, into a fertility index may offer the potential to improve the accuracy of breeding value prediction for fertility, allowing producers to make more informed selection decisions.
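
The inference recipe described above, regressing lnCLA on sire PTA and scaling the slope by the ratio of genetic standard deviations, can be illustrated on synthetic data. The SD values are invented, the target correlation of 0.36 echoes the reported lnCLA-CI estimate, and the direction of the SD ratio is an assumption of this sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
r_a, sd_pta, sd_lncla = 0.36, 10.0, 0.5    # illustrative genetic parameters

# Bivariate normal (PTA, lnCLA) genetic values with correlation r_a
cov = np.array([[sd_pta**2, r_a * sd_pta * sd_lncla],
                [r_a * sd_pta * sd_lncla, sd_lncla**2]])
pta, lncla = rng.multivariate_normal([0.0, 0.0], cov, n).T

b = np.polyfit(pta, lncla, 1)[0]   # regression coefficient of lnCLA on PTA
r_hat = b * sd_pta / sd_lncla      # rescale by the ratio of genetic SDs
```

Since b = r_a * sd_lncla / sd_pta under this model, the rescaling recovers the genetic correlation up to sampling error.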

  1. Genetic parameters for test-day yield of milk, fat and protein in buffaloes estimated by random regression models.

    PubMed

    Aspilcueta-Borquis, Rúsbel R; Araujo Neto, Francisco R; Baldi, Fernando; Santos, Daniel J A; Albuquerque, Lucia G; Tonhati, Humberto

    2012-08-01

    The test-day yields of milk, fat and protein were analysed from 1433 first lactations of buffaloes of the Murrah breed, daughters of 113 sires from 12 herds in the state of São Paulo, Brazil, born between 1985 and 2007. For the test-day yields, 10 monthly classes of lactation days were considered. The contemporary groups were defined as the herd-year-month of the test day. Random additive genetic, permanent environmental and residual effects were included in the model. The fixed effects considered were the contemporary group, number of milkings (1 or 2 milkings), linear and quadratic effects of the covariable cow age at calving and the mean lactation curve of the population (modelled by third-order Legendre orthogonal polynomials). The random additive genetic and permanent environmental effects were estimated by means of regression on third- to sixth-order Legendre orthogonal polynomials. The residual variances were modelled with a homogenous structure and various heterogeneous classes. According to the likelihood-ratio test, the best model for milk and fat production was that with four residual variance classes, while a third-order Legendre polynomial was best for the additive genetic effect for milk and fat yield, a fourth-order polynomial was best for the permanent environmental effect for milk production and a fifth-order polynomial was best for fat production. For protein yield, the best model was that with three residual variance classes and third- and fourth-order Legendre polynomials were best for the additive genetic and permanent environmental effects, respectively. The heritability estimates for the characteristics analysed were moderate, varying from 0·16±0·05 to 0·29±0·05 for milk yield, 0·20±0·05 to 0·30±0·08 for fat yield and 0·18±0·06 to 0·27±0·08 for protein yield. 
The estimates of the genetic correlations between the tests varied from 0·18±0·120 to 0·99±0·002; from 0·44±0·080 to 0·99±0·004; and from 0·41±0·080 to 0·99±0·004, for milk, fat and protein production, respectively, indicating that whatever the selection criterion used, indirect genetic gains can be expected throughout the lactation curve.

  2. Evaluation of trends in wheat yield models

    NASA Technical Reports Server (NTRS)

    Ferguson, M. C.

    1982-01-01

    Trend terms in models for wheat yield in the U.S. Great Plains for the years 1932 to 1976 are evaluated. The subset of meteorological variables yielding the largest adjusted R(2) is selected using the method of leaps and bounds. Latent root regression is used to eliminate multicollinearities, and generalized ridge regression is used to introduce bias to provide stability in the data matrix. The regression model used provides for two trends in each of two models: a dependent model in which the trend line is piece-wise continuous, and an independent model in which the trend line is discontinuous at the year of the slope change. It was found that the trend lines best describing the wheat yields consisted of combinations of increasing, decreasing, and constant trend: four combinations for the dependent model and seven for the independent model.
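
The two trend specifications can be sketched with hinge and jump regressors: the "dependent" model bends but stays continuous at the slope-change year, while the "independent" model also allows a level shift there. The change year and the yield series below are hypothetical:

```python
import numpy as np

years = np.arange(1932, 1977)
t = (years - years[0]).astype(float)
t0 = 25.0                                    # hypothetical slope-change year

hinge = np.maximum(t - t0, 0.0)
X_dep = np.column_stack([np.ones_like(t), t, hinge])        # continuous bend
X_ind = np.column_stack([X_dep, (t > t0).astype(float)])    # bend plus jump

rng = np.random.default_rng(2)
y = 10 + 0.3 * t - 0.2 * hinge + rng.normal(0, 0.5, t.size)  # synthetic yield

beta_dep, *_ = np.linalg.lstsq(X_dep, y, rcond=None)
beta_ind, *_ = np.linalg.lstsq(X_ind, y, rcond=None)
```

In beta_dep, the second coefficient is the pre-change slope and the hinge coefficient is the change in slope, so "increasing then constant" or "increasing then decreasing" trends fall out of the signs of these two terms.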

  3. Scoring and staging systems using cox linear regression modeling and recursive partitioning.

    PubMed

    Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H

    2006-01-01

    Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.

  4. Comparison of Linear and Non-linear Regression Analysis to Determine Pulmonary Pressure in Hyperthyroidism.

    PubMed

    Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan

    2017-01-01

    This study aimed to assess the incidence of pulmonary hypertension (PH) in newly diagnosed hyperthyroid patients and to find a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by echocardiography and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). In order to identify the factors causing pulmonary hypertension, the statistical method of comparing the values of arithmetical means was used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by a linear or non-linear function. By applying the linear regression method described by a first-degree equation, the line of regression (linear model) was determined; by applying the non-linear regression method described by a second-degree equation, a parabola-type curve of regression (non-linear or polynomial model) was determined. We compared and validated these two models by calculating the determination coefficient (criterion 1), comparing the residuals (criterion 2), applying the AIC criterion (criterion 3) and using the F-test (criterion 4). In the H-group, 47% had pulmonary hypertension, completely reversible on reaching euthyroidism. The factors causing pulmonary hypertension were identified as: previously known - level of free thyroxin, pulmonary vascular resistance, cardiac output; newly identified in this study - pretreatment period, age, systolic blood pressure. According to the four criteria and to clinical judgment, we consider the polynomial model (graphically parabola-type) better than the linear one.
The better model showing the functional relation between pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a second-degree polynomial equation whose graphical representation is a parabola.
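
A minimal version of this model comparison can be written with first- and second-degree polynomial fits, scored by the determination coefficient R^2 (criterion 1), AIC (criterion 3), and an F statistic for the added quadratic term (criterion 4); the residual comparison (criterion 2) is implicit in the RSS values. The data are synthetic, with a genuinely curved relation so the parabola should win:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 60)                         # synthetic predictor
paps = 20 + 1.5 * x + 0.4 * x**2 + rng.normal(0, 2, x.size)

def fit_stats(deg):
    """R^2, AIC and RSS for a polynomial fit of the given degree."""
    coef = np.polyfit(x, paps, deg)
    rss = ((paps - np.polyval(coef, x)) ** 2).sum()
    tss = ((paps - paps.mean()) ** 2).sum()
    n, k = x.size, deg + 1
    return 1 - rss / tss, n * np.log(rss / n) + 2 * k, rss

r2_lin, aic_lin, rss_lin = fit_stats(1)
r2_quad, aic_quad, rss_quad = fit_stats(2)

# F-test for the single added quadratic term
F = (rss_lin - rss_quad) / (rss_quad / (x.size - 3))
```

A higher R^2, a lower AIC, and a large F all point the same way here, mirroring the paper's conclusion that the polynomial model outperforms the linear one when the underlying relation is curved.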

  5. SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS

    EPA Science Inventory

    As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

  6. A simplified competition data analysis for radioligand specific activity determination.

    PubMed

    Venturino, A; Rivera, E S; Bergoc, R M; Caro, R A

    1990-01-01

    Non-linear regression and two-step linear fit methods were developed to determine the actual specific activity of 125I-ovine prolactin by radioreceptor self-displacement analysis. The experimental results obtained by the different methods are superposable. The non-linear regression method is considered to be the most adequate procedure to calculate the specific activity, but if its software is not available, the other described methods are also suitable.

  7. Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions

    PubMed Central

    Fernandes, Bruno J. T.; Roque, Alexandre

    2018-01-01

    Height and weight are measurements used in tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and these factors may prevent accurate measurement; in such cases, height and weight can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations whose coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than those obtained with conventional linear regressions. In all cases, the predictions are insensitive to ethnicity, and to gender if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care. PMID:29651366

  8. Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

    NASA Astrophysics Data System (ADS)

    Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

    2009-08-01

    In this study, two techniques for modeling electricity consumption of the Jordanian industrial sector are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as a function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables, with a significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, a comparison based on the square root of the average squared error suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.

  9. Improving Prediction Accuracy for WSN Data Reduction by Applying Multivariate Spatio-Temporal Correlation

    PubMed Central

    Carvalho, Carlos; Gomes, Danielo G.; Agoulmine, Nazim; de Souza, José Neuman

    2011-01-01

    This paper proposes a method based on multivariate spatial and temporal correlation to improve prediction accuracy in data reduction for Wireless Sensor Networks (WSN). Prediction of data not sent to the sink node is a technique used to save energy in WSNs by reducing the amount of data traffic. However, it may not be very accurate. Simulations were made involving simple linear regression and multiple linear regression functions to assess the performance of the proposed method. The results show a higher correlation between gathered inputs when compared to time, which is an independent variable widely used for prediction and forecasting. Prediction accuracy is lower when simple linear regression is used, whereas multiple linear regression is the most accurate one. In addition to that, our proposal outperforms some current solutions by about 50% in humidity prediction and 21% in light prediction. To the best of our knowledge, we believe that we are probably the first to address prediction based on multivariate correlation for WSN data reduction. PMID:22346626

  10. Quantifying ruminal nitrogen metabolism using the omasal sampling technique in cattle--a meta-analysis.

    PubMed

    Broderick, G A; Huhtanen, P; Ahvenjärvi, S; Reynal, S M; Shingfield, K J

    2010-07-01

    Mixed model analysis of data from 32 studies (122 diets) was used to evaluate the precision and accuracy of the omasal sampling technique for quantifying ruminal-N metabolism and to assess the relationships between nonammonia-N flow at the omasal canal and milk protein yield. Data were derived from experiments in cattle fed North American diets (n=36) based on alfalfa silage, corn silage, and corn grain and Northern European diets (n=86) composed of grass silage and barley-based concentrates. In all studies, digesta flow was quantified using a triple-marker approach. Linear regressions were used to predict microbial-N flow to the omasum from intake of dry matter (DM), organic matter (OM), or total digestible nutrients. Efficiency of microbial-N synthesis increased with DM intake, and there were trends for increased efficiency with elevated dietary concentrations of crude protein (CP) and rumen-degraded protein (RDP), but these effects were small. Regression of omasal rumen-undegraded protein (RUP) flow on CP intake indicated that, on average, 32% of dietary CP escaped and 68% was degraded in the rumen. The slope from regression of observed omasal flows of RUP on flows predicted by the National Research Council (2001) model indicated that NRC predicted greater RUP supply. Measured microbial-N flow was, on average, 26% greater than that predicted by the NRC model. Zero ruminal N-balance (omasal CP flow=CP intake) was obtained at dietary CP and RDP concentrations of 147 and 106 g/kg of DM, corresponding to ruminal ammonia-N and milk urea N concentrations of 7.1 and 8.3 mg/100 mL, respectively. Milk protein yield was positively related to the efficiency of microbial-N synthesis and measured RUP concentration. Improved efficiency of microbial-N synthesis and reduced ruminal CP degradability were positively associated with efficiency of capture of dietary N as milk N. 
In conclusion, the results of this study indicate that the omasal sampling technique yields valuable estimates of RDP, RUP, and ruminal microbial protein supply in cattle. Copyright (c) 2010 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  11. Active sensing: An innovative tool for evaluating grain yield and nitrogen use efficiency of multiple wheat genotypes

    NASA Astrophysics Data System (ADS)

    Naser, Mohammed Abdulridha

    Precision agricultural practices have significantly contributed to the improvement of crop productivity and profitability. Remote sensing based indices, such as the Normalized Difference Vegetation Index (NDVI), have been used to obtain crop information. NDVI is used to monitor crop development and to provide rapid and nondestructive estimates of plant biomass, nitrogen (N) content and grain yield. Remote sensing tools are helping improve nitrogen use efficiency (NUE) through nitrogen management and could also be useful for selecting high-NUE genotypes. The objectives of this study were: (i) to determine if active sensor based NDVI readings can differentiate wheat genotypes, (ii) to determine if NDVI readings can be used to classify wheat genotypes into grain yield productivity classes, (iii) to identify and quantify the main sources of variation in NUE across wheat genotypes, and (iv) to determine if NDVI could characterize variability in NUE across wheat genotypes. This study was conducted in northeastern Colorado for two years, 2010 and 2011. The NDVI readings were taken weekly during the winter wheat growing season from March to late June in 2010 and 2011, and NUE was calculated as partial factor productivity and as partial nitrogen balance at the end of the season. For objectives i and ii, the correlation between NDVI and grain yield was determined using Pearson's product-moment correlation coefficient (r), and linear regression analysis was used to explain the relationship between NDVI and grain yield. The K-means clustering algorithm was used to classify mean NDVI and mean grain yield into three classes. For objectives iii and iv, the parameters related to NUE were also calculated to measure their relative importance in genotypic variation of NUE, and power regression analysis between NDVI and NUE was used to characterize the relationship between NDVI and NUE. 
The results indicate a more consistent association between grain yield and NDVI, and between NDVI and NUE, later in the season (after anthesis and during the mid-grain-filling stage) under dryland conditions, and a poor association in wheat grown under irrigated conditions. The results suggest that below saturation of NDVI values (about 0.9), i.e. prior to full canopy closure and after the beginning of senescence, or most of the season under dryland conditions, NDVI could assess grain yield and NUE. The results also indicate that nitrogen uptake efficiency was the main source of variation in NUE among genotypes grown in site-years with lower yield. Overall, results from this study demonstrate that NDVI readings successfully classified wheat genotypes into grain yield classes across dryland and irrigated conditions and characterized variability in NUE across wheat genotypes.

  12. Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients

    NASA Astrophysics Data System (ADS)

    Gorgees, HazimMansoor; Mahdi, FatimahAssim

    2018-05-01

    This article compares the performance of different types of ordinary ridge regression estimators that have been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation, we employ data obtained from the tagi gas filling company during the period 2008-2010. The main result is that the method based on the condition number performs better than the other stated methods, since it has the smallest mean square error (MSE).
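
The estimator all of these methods share is the ordinary ridge solution b(k) = (X'X + kI)^(-1) X'y; the compared methods differ only in how the ridge constant k is chosen (e.g. from the condition number). A sketch on synthetic near-collinear data:

```python
import numpy as np

def ridge(X, y, k):
    """Ordinary ridge estimator (X'X + kI)^-1 X'y; k = 0 gives OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 0.01, n)        # near-exact linear relationship
X = np.column_stack([x1, x2])
y = X @ np.array([1.0, 1.0]) + rng.normal(0, 0.1, n)

b_ols = ridge(X, y, 0.0)                # unstable when X'X is ill-conditioned
b_ridge = ridge(X, y, 1.0)              # biased but stabilised coefficients
```

Ridge shrinks the coefficient vector relative to OLS, trading a little bias for a large variance reduction along the ill-conditioned direction; a well-chosen k minimises the resulting MSE.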

  13. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    PubMed

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
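
Kernel-based regularized least squares is compact enough to sketch directly. This generic RBF-kernel version on a synthetic one-dimensional surface is not the paper's UFP model or covariates, but it shows why KRLS can explain spatial variance that a straight line cannot:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.uniform(-2.0, 2.0, (80, 1))
y = np.sin(2.0 * X[:, 0]) + rng.normal(0, 0.1, 80)    # nonlinear signal

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between two sets of points."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

gamma, lam = 1.0, 0.1
K = rbf_kernel(X, X, gamma)
alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)   # KRLS coefficients
pred_krls = K @ alpha

# A straight-line fit for comparison
pred_lin = np.polyval(np.polyfit(X[:, 0], y, 1), X[:, 0])

def r2(pred):
    return 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

Because KRLS learns the functional form from the data, its in-sample R^2 exceeds the linear fit whenever the true covariate effects are nonlinear, which parallels the 79% vs. 62% variance explained reported above.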

  14. Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.

    PubMed

    Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong

    2017-01-01

    This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier, linear regression classification. Our method selects one axial slice from the 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is harnessed as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  15. A simple bias correction in linear regression for quantitative trait association under two-tail extreme selection.

    PubMed

    Kwan, Johnny S H; Kung, Annie W C; Sham, Pak C

    2011-09-01

    Selective genotyping can increase power in quantitative trait association. One example of selective genotyping is two-tail extreme selection, but simple linear regression analysis gives a biased genetic effect estimate. Here, we present a simple correction for the bias.

  16. Chemical performance of multi-environment trials in lens (Lens culinaris M.).

    PubMed

    Karadavut, Ufuk; Palta, Cetin

    2010-01-15

    Genotype-environment (GE) interaction has been a major factor in determining stable lens (Lens culinaris (Medik.) Merr.) cultivars for chemical composition in Turkey. Utilization of the lines depends on their agronomic traits and the stability of their chemical composition in diverse environments. The objectives of this study were: (i) to evaluate the influence of year and location on the chemical composition of lens genotypes; and (ii) to determine which cultivar is the most stable. Genotypes were evaluated over 3 years (2005, 2006 and 2007) at four locations in Turkey. Year had the largest impact on all protein contents. GE interaction was analyzed by using linear regression techniques. Stability was estimated using the Eberhart and Russell method. 'Kişlik Kirmizi51' was the most stable cultivar for grain yield. The highest protein was obtained from 'Kişlik Kirmizi51' (4.6%) across environments. According to the stability analysis, 'Firat 87' had the most stable chemical composition. This genotype had a regression coefficient around unity (b(i) = 1) and deviations from regression around zero (delta(ij) = 0). Chemical composition was affected by year in this study. Temperature might have an effect on protein, oil, carbohydrate, fibre and ash. Firat 87 could be recommended for favourable environments. Copyright (c) 2009 Society of Chemical Industry.

  17. Random regression analyses using B-splines to model growth of Australian Angus cattle

    PubMed Central

    Meyer, Karin

    2005-01-01

    Regression on the basis function of B-splines has been advocated as an alternative to orthogonal polynomials in random regression analyses. Basic theory of splines in mixed model analyses is reviewed, and estimates from analyses of weights of Australian Angus cattle from birth to 820 days of age are presented. Data comprised 84 533 records on 20 731 animals in 43 herds, with a high proportion of animals with 4 or more weights recorded. Changes in weights with age were modelled through B-splines of age at recording. A total of thirteen analyses, considering different combinations of linear, quadratic and cubic B-splines and up to six knots, were carried out. Results showed good agreement for all ages with many records, but fluctuated where data were sparse. On the whole, analyses using B-splines appeared more robust against "end-of-range" problems and yielded more consistent and accurate estimates of the first eigenfunctions than previous, polynomial analyses. A model fitting quadratic B-splines, with knots at 0, 200, 400, 600 and 821 days and a total of 91 covariance components, appeared to be a good compromise between the level of detail of the model, the number of parameters to be estimated, plausibility of results, and fit, measured as residual mean square error. PMID:16093011
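
The B-spline basis underlying such random regression models can be built with the Cox-de Boor recursion. The sketch below evaluates a quadratic basis on the preferred model's knots (0, 200, 400, 600 and 821 days), with clamped (repeated) end knots so the basis is well defined at the boundaries; it is a generic illustration, not the paper's mixed-model fit:

```python
import numpy as np

def bspline_basis(x, t, k):
    """All degree-k B-spline basis functions on knot vector t at points x,
    via the Cox-de Boor recursion (terms with coincident knots are zero)."""
    x = np.asarray(x, float)
    B = [((x >= t[i]) & (x < t[i + 1])).astype(float)
         for i in range(len(t) - 1)]                  # degree-0 indicators
    for d in range(1, k + 1):
        Bn = []
        for i in range(len(t) - d - 1):
            den1, den2 = t[i + d] - t[i], t[i + d + 1] - t[i + 1]
            a = (x - t[i]) / den1 * B[i] if den1 > 0 else 0.0
            b = (t[i + d + 1] - x) / den2 * B[i + 1] if den2 > 0 else 0.0
            Bn.append(a + b)
        B = Bn
    return np.column_stack(B)

# Quadratic basis with the preferred model's interior knots; end knots
# repeated k times (clamped)
knots = np.array([0, 0, 0, 200, 400, 600, 821, 821, 821], float)
age = np.linspace(0, 820, 50)
Z = bspline_basis(age, knots, 2)        # one column per basis function
```

The columns of Z are the covariables that multiply the random regression coefficients; they are non-negative, locally supported, and sum to one at every age, which is the source of the "end-of-range" robustness noted above.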

  18. Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis

    PubMed Central

    Gianola, Daniel; Fariello, Maria I.; Naya, Hugo; Schön, Chris-Carolin

    2016-01-01

    Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. PMID:27520956

  19. Statistically extracted fundamental watershed variables for estimating the loads of total nitrogen in small streams

    USGS Publications Warehouse

    Kronholm, Scott C.; Capel, Paul D.; Terziotti, Silvia

    2016-01-01

    Accurate estimation of total nitrogen loads is essential for evaluating conditions in the aquatic environment. Extrapolation of estimates beyond measured streams will greatly expand our understanding of total nitrogen loading to streams. Recursive partitioning and random forest regression were used to assess 85 geospatial, environmental, and watershed variables across 636 small (<585 km²) watersheds to determine which variables are fundamentally important to the estimation of annual loads of total nitrogen. Initial analysis led to the splitting of watersheds into three groups based on predominant land use (agricultural, developed, and undeveloped). Nitrogen application, agricultural and developed land area, and impervious or developed land in the 100-m stream buffer were the variables most commonly extracted by both recursive partitioning and random forest regression. A series of multiple linear regression equations utilizing the extracted variables was created and applied to the watersheds. As few as three variables explained as much as 76% of the variability in total nitrogen loads for watersheds with predominantly agricultural land use. Catchment-scale national maps were generated to visualize the total nitrogen loads and yields across the USA. The estimates provided by these models can inform water managers and help identify areas where more in-depth monitoring may be beneficial.
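
    As an illustration of the screen-then-regress workflow described above, the sketch below ranks synthetic predictors by the variance reduction of their best single split (a minimal stand-in for recursive partitioning and random-forest importance, not the study's actual procedure; all data and variable counts are invented) and then fits a multiple linear regression on the top-ranked variables:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic watershed predictors (columns); the response depends on x0 and x1 only.
X = rng.normal(size=(300, 5))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

def best_split_gain(x, y):
    """Variance reduction of the best single split on predictor x (CART-style)."""
    order = np.argsort(x)
    ys = y[order]
    n = len(ys)
    total_ss = ((ys - ys.mean()) ** 2).sum()
    best = 0.0
    for i in range(10, n - 10):  # avoid tiny child nodes
        left, right = ys[:i], ys[i:]
        ss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        best = max(best, total_ss - ss)
    return best

# Screen predictors by single-split variance reduction and keep the top 2 ...
gains = np.array([best_split_gain(X[:, j], y) for j in range(X.shape[1])])
top = np.argsort(gains)[::-1][:2]

# ... then fit an ordinary multiple linear regression on the extracted variables.
A = np.column_stack([np.ones(len(y)), X[:, top]])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
r2 = 1 - ((y - A @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```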

  20. Nutritional characteristics of camelina meal for 3-week-old broiler chickens.

    PubMed

    Pekel, A Y; Kim, J I; Chapple, C; Adeola, O

    2015-03-01

    Limited information on the nutritional characteristics of camelina meal restricts its use in broiler chicken diets. The objectives of this study were to determine the ileal digestible energy (IDE), ME, and MEn contents of 2 different camelina meal samples (CM1 and CM2) for 3-wk-old broiler chickens using the regression method and to determine glucosinolate compounds in the camelina meal samples. CM1 and CM2 were incorporated into a corn-soybean meal-based reference diet at 3 levels (0, 100, or 200 g/kg) by replacing the energy-yielding ingredients. These 5 diets (reference diet, and 100 and 200 g/kg camelina meal from each of CM1 and CM2) were fed to 320 male Ross 708 broilers from d 21 to 28 post hatching with 8 birds per cage and 8 replicates per treatment in a randomized complete block design. Excreta were collected twice daily from d 25 to 28, and jejunal digesta and ileal digesta from the Meckel's diverticulum to approximately 2 cm proximal to the ileocecal junction were collected on d 28. The total glucosinolate contents of CM1 and CM2 were 24.2 and 22.7 nmol/mg, respectively. Jejunal digesta viscosity increased linearly (P<0.001) from 2.2 to 4.1 cP with increasing dietary camelina meal levels. There were linear effects (P<0.001) of CM1 and CM2 substitution on final weight, weight gain, feed intake, and G:F. The inclusion of CM1 and CM2 linearly decreased (P<0.001) ileal digestibility of DM, energy, and IDE. The supplementation of CM1 and CM2 linearly decreased (P<0.001) the retention of DM, nitrogen, and energy, as well as ME and MEn. By regressing the CM1- and CM2-associated IDE intake in kilocalories against kilograms of CM1 and CM2 intake, the IDE regression equation was Y = -10 + 1,429×CM1 + 2,125×CM2 (r² = 0.55), which indicates that IDE values were 1,429 kcal/kg of DM for CM1 and 2,125 kcal/kg of DM for CM2. The ME regression was Y = 5 + 882×CM1 + 925×CM2 (r² = 0.54), which implies ME values of 882 kcal/kg of DM for CM1 and 925 kcal/kg of DM for CM2.
The MEn regression was Y = 2 + 795×CM1 + 844×CM2 (r² = 0.52), which implies MEn values of 795 kcal/kg of DM for CM1 and 844 kcal/kg of DM for CM2. Based on these results, utilization of energy and nitrogen in camelina meal by broiler chickens is low; the high jejunal digesta viscosity and the total glucosinolate content of camelina meal may have contributed to this poor utilization. © 2015 Poultry Science Association Inc.
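
    The regression method for energy values used above can be sketched as follows: the ingredient-associated digestible-energy intake is regressed on ingredient intake, and the fitted slope estimates the ingredient's energy value in kcal/kg. All numbers below are hypothetical, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-bird data: kg of test-ingredient DM intake and the
# ingredient-associated digestible-energy intake (kcal) over the period.
true_ide = 1500.0                           # assumed energy value, kcal/kg DM
cm_intake = rng.uniform(0.0, 0.2, size=40)  # kg DM consumed
ide_intake = true_ide * cm_intake + rng.normal(scale=20.0, size=40)

# Regress ingredient-associated energy intake on ingredient intake;
# the fitted slope recovers the ingredient's energy value (kcal/kg DM).
A = np.column_stack([np.ones_like(cm_intake), cm_intake])
intercept, slope = np.linalg.lstsq(A, ide_intake, rcond=None)[0]
```

With two test ingredients, as in the abstract, the design matrix simply gains a second intake column and the two slopes are read off jointly.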

  1. Osmium isotope and highly siderophile element systematics of lunar impact melt breccias: Implications for the late accretion history of the Moon and Earth

    USGS Publications Warehouse

    Puchtel, I.S.; Walker, R.J.; James, O.B.; Kring, D.A.

    2008-01-01

    To characterize the compositions of materials accreted to the Earth-Moon system between about 4.5 and 3.8 Ga, we have determined Os isotopic compositions and some highly siderophile element (HSE: Re, Os, Ir, Ru, Pt, and Pd) abundances in 48 subsamples of six lunar breccias. These are: Apollo 17 poikilitic melt breccias 72395 and 76215; Apollo 17 aphanitic melt breccias 73215 and 73255; Apollo 14 polymict breccia 14321; and lunar meteorite NWA482, a crystallized impact melt. Plots of Ir versus other HSE define excellent linear correlations, indicating that all data sets likely represent dominantly two-component mixtures of a low-HSE target, presumably endogenous component, and a high-HSE, presumably exogenous component. Linear regressions of these trends yield intercepts that are statistically indistinguishable from zero for all HSE, except for Ru and Pd in two samples. The slopes of the linear regressions are insensitive to target rock contributions of Ru and Pd of the magnitude observed; thus, the trendline slopes approximate the elemental ratios present in the impactor components contributed to these rocks. The 187Os/188Os and regression-derived elemental ratios for the Apollo 17 aphanitic melt breccias and the lunar meteorite indicate that the impactor components in these samples have close affinities to chondritic meteorites. The HSE in the Apollo 17 aphanitic melt breccias, however, might partially or entirely reflect the HSE characteristics of HSE-rich granulitic breccia clasts that were incorporated in the impact melt at the time of its creation. In this case, the HSE characteristics of these rocks may reflect those of an impactor that predated the impact event that led to the creation of the melt breccias. The impactor components in the Apollo 17 poikilitic melt breccias and in the Apollo 14 breccia have higher 187Os/188Os, Pt/Ir, and Ru/Ir and lower Os/Ir than most chondrites. 
These compositions suggest that these impactors were chemically distinct from known chondrite types and possibly represent a type of primitive material not currently delivered to Earth as meteorites. © 2008 Elsevier Ltd.
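
    The zero-intercept check behind the two-component mixing argument can be illustrated as below: regress one HSE abundance on Ir, compare the intercept with its standard error, and read the slope as the impactor elemental ratio. The data and the Pt/Ir ratio are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical two-component mixing: HSE budgets dominated by an exogenous
# (impactor) component, so Pt vs Ir falls on a line through ~0 whose slope
# is the impactor Pt/Ir ratio.
ir = rng.uniform(1.0, 20.0, size=25)   # ng/g, say
pt_ir_ratio = 2.1                      # assumed impactor ratio
pt = pt_ir_ratio * ir + rng.normal(scale=0.5, size=25)

A = np.column_stack([np.ones_like(ir), ir])
coef, *_ = np.linalg.lstsq(A, pt, rcond=None)
intercept, slope = coef
resid = pt - A @ coef
s2 = (resid ** 2).sum() / (len(ir) - 2)   # residual variance
cov = s2 * np.linalg.inv(A.T @ A)         # coefficient covariance matrix
intercept_se = np.sqrt(cov[0, 0])

# An intercept statistically indistinguishable from zero supports reading the
# slope as the impactor-component elemental ratio.
zero_intercept = abs(intercept) < 2 * intercept_se
```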

  2. A Common Mechanism for Resistance to Oxime Reactivation of Acetylcholinesterase Inhibited by Organophosphorus Compounds

    DTIC Science & Technology

    2013-01-01

    application of the Hammett equation with the constants rph in the chemistry of organophosphorus compounds, Russ. Chem. Rev. 38 (1969) 795–811. [13...of oximes and OP compounds and the ability of oximes to reactivate OP- inhibited AChE. Multiple linear regression equations were analyzed using...phosphonate pairs, 21 oxime/ phosphoramidate pairs and 12 oxime/phosphate pairs. The best linear regression equation resulting from multiple regression anal

  3. Comparison of a New Cobinamide-Based Method to a Standard Laboratory Method for Measuring Cyanide in Human Blood

    PubMed Central

    Swezey, Robert; Shinn, Walter; Green, Carol; Drover, David R.; Hammer, Gregory B.; Schulman, Scott R.; Zajicek, Anne; Jett, David A.; Boss, Gerry R.

    2013-01-01

    Most hospital laboratories do not measure blood cyanide concentrations, and samples must be sent to reference laboratories. A simple method is needed for measuring cyanide in hospitals. The authors previously developed a method to quantify cyanide based on the high binding affinity of the vitamin B12 analog, cobinamide, for cyanide and a major spectral change observed for cyanide-bound cobinamide. This method is now validated in human blood, with a mean inter-assay accuracy of 99.1%, precision of 8.75%, and a lower limit of quantification of 3.27 µM cyanide. The method was applied to blood samples from children treated with sodium nitroprusside, and it yielded measurable results in 88 of 172 samples (51%), whereas the reference laboratory yielded results in only 19 samples (11%). In all 19 samples, the cobinamide-based method also yielded measurable results. The two methods showed reasonable agreement when analyzed by linear regression, but not when analyzed by the standard error of the estimate or by a paired t-test. Differences in results between the two methods may reflect the fact that samples were assayed at different times and on different sample types. The cobinamide-based method is applicable to human blood and can be used in hospital laboratories and emergency rooms. PMID:23653045

  4. A Pilot Investigation of the Relationship between Climate Variability and Milk Compounds under the Bootstrap Technique

    PubMed Central

    Marami Milani, Mohammad Reza; Hense, Andreas; Rahmani, Elham; Ploeger, Angelika

    2015-01-01

    This study analyzes the linear relationship between climate variables and milk components in Iran by applying bootstrapping to include and assess the uncertainty. The climate parameters, Temperature Humidity Index (THI) and Equivalent Temperature Index (ETI), are computed from the NASA-Modern Era Retrospective-Analysis for Research and Applications (NASA-MERRA) reanalysis (2002–2010). Milk data for fat, protein (measured on a fresh matter basis), and milk yield are taken from 936,227 milk records for the same period, using cows fed on natural pasture from April to September. Confidence intervals for the regression model are calculated using the bootstrap technique, applied to the original time series to generate statistically equivalent surrogate samples. As a result, despite the short record and the associated uncertainties, an interesting pattern in the relationships between milk components and the climate parameters is visible. During spring, only a weak dependence of milk yield on climate variations is apparent, while fat and protein concentrations show reasonable correlations. In summer, milk yield shows a similar level of relationship with ETI, but not with temperature and THI. We suggest this methodology for studies of the impacts of climate change on agriculture, environment, and food when only short-term data are available. PMID:28231215
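
    The bootstrap procedure for regression confidence intervals can be sketched as follows, with invented climate-index and milk-fat numbers (a pairs bootstrap with percentile intervals; the study's surrogate-series construction may differ):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical monthly series: a climate index (e.g., THI) and a milk trait.
thi = rng.uniform(55, 75, size=54)   # e.g., 9 seasons x 6 months
fat = 4.2 - 0.02 * thi + rng.normal(scale=0.15, size=54)

def slope(x, y):
    return np.polyfit(x, y, 1)[0]

# Nonparametric pairs bootstrap: resample (x, y) pairs with replacement,
# refit the regression, and take a percentile confidence interval.
n = len(thi)
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, size=n)
    boot.append(slope(thi[idx], fat[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
```

A slope interval that excludes zero indicates a climate–trait relationship that survives the resampling uncertainty.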

  5. Relationship Between Crop Losses and Initial Population Densities of Meloidogyne arenaria in Winter-Grown Oriental Melon in Korea

    PubMed Central

    Kim, D.G.; Ferris, H.

    2002-01-01

    To determine the economic threshold level, oriental melon (Cucumis melo L. cv. Geumssaragi-euncheon) grafted on Shintozoa (Cucurbita maxima × Cu. moschata) was planted in plots (2 × 3 m) under a plastic film in February with a range of initial population densities (Pi) of Meloidogyne arenaria. The relationships of early, late, and total yield to Pi measured in September and January were adequately described by both linear regression and the Seinhorst damage model. Initial nematode densities in September in excess of 14 second-stage juveniles (J2)/100 cm³ soil caused losses in total yields that exceeded the economic threshold and indicate the need for fosthiazate nematicide treatment at current costs. Differences in yield-loss relationships to Pi between early- and late-season harvests enhance the resolution of the management decision and suggest approaches for optimizing returns. Determination of population levels for advisory purposes can be based on assay samples taken several months before planting, which allows time for implementation of management procedures. We introduce (i) an amendment of the economic threshold definition to reflect efficacy of the nematode management procedure under consideration, and (ii) the concept of profit limit as the nematode population at which net returns from the system will become negative. PMID:19265907
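
    The Seinhorst damage model mentioned above has the closed form y = m + (1 − m)·z^(Pi − T) for Pi > T and y = 1 otherwise, where m is the minimum relative yield, T the tolerance limit, and z < 1. A sketch with illustrative parameter values (not the fitted values from the study):

```python
import numpy as np

# Seinhorst damage model: relative yield as a function of initial nematode
# density Pi. Parameters here are illustrative only.
def seinhorst(pi, m=0.2, t=14.0, z=0.999):
    """m: minimum relative yield, t: tolerance limit (J2/100 cm^3 soil), z < 1."""
    pi = np.asarray(pi, dtype=float)
    y = m + (1 - m) * z ** (pi - t)
    return np.where(pi <= t, 1.0, y)

# Economic-threshold style question: lowest Pi whose predicted yield loss
# (1 - relative yield) exceeds, say, 10%.
pi_grid = np.arange(0, 5000, 1.0)
loss = 1 - seinhorst(pi_grid)
threshold_pi = pi_grid[np.argmax(loss > 0.10)]
```

In practice the threshold would also fold in crop value and treatment cost, as the abstract's amended economic-threshold definition suggests.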

  6. The potential for using canopy spectral reflectance as an indirect selection tool for yield improvement in winter wheat

    NASA Astrophysics Data System (ADS)

    Prasad, Bishwajit

    Scope and methods of study. Complementing breeding effort by deploying alternative methods of identifying higher yielding genotypes in a wheat breeding program is important for obtaining greater genetic gains. Spectral reflectance indices (SRI) are one of the many indirect selection tools that have been reported to be associated with different physiological process of wheat. A total of five experiments (a set of 25 released cultivars from winter wheat breeding programs of the U.S. Great Plains and four populations of randomly derived recombinant inbred lines having 25 entries in each population) were conducted in two years under Great Plains winter wheat rainfed environments at Oklahoma State University research farms. Grain yield was measured in each experiment and biomass was measured in three experiments at three growth stages (booting, heading, and grainfilling). Canopy spectral reflectance was measured at three growth stages and eleven SRI were calculated. Correlation (phenotypic and genetic) between grain yield and SRI, biomass and SRI, heritability (broad sense) of the SRI and yield, response to selection and correlated response, relative selection efficiency of the SRI, and efficiency in selecting the higher yielding genotypes by the SRI were assessed. Findings and conclusions. The genetic correlation coefficients revealed that the water based near infrared indices (WI and NWI) were strongly associated with grain yield and biomass production. The regression analysis detected a linear relationship between the water based indices with grain yield and biomass. The two newly developed indices (NWI-3 and NWI-4) gave higher broad sense heritability than grain yield, higher direct response to selection compared to grain yield, correlated response equal to or higher than direct response for grain yield, relative selection efficiency greater than one, and higher efficiency in selecting higher yielding genotypes. 
Based on the overall genetic analysis required to establish any trait as an efficient indirect selection tool, the water based SRI (especially NWI-3 and NWI-4) have the potential to complement the classical breeding effort for selecting genotypes with higher yield potential in a winter wheat breeding program.

  7. Understanding Preprocedure Patient Flow in IR.

    PubMed

    Zafar, Abdul Mueed; Suri, Rajeev; Nguyen, Tran Khanh; Petrash, Carson Cope; Fazal, Zanira

    2016-08-01

    To quantify preprocedural patient flow in interventional radiology (IR) and to identify potential contributors to preprocedural delays. An administrative dataset was used to compute time intervals required for various preprocedural patient-flow processes. These time intervals were compared across on-time/delayed cases and inpatient/outpatient cases by Mann-Whitney U test. Spearman ρ was used to assess any correlation of the rank of a procedure on a given day and the procedure duration to the preprocedure time. A linear-regression model of preprocedure time was used to further explore potential contributing factors. Any identified reason(s) for delay were collated. P < .05 was considered statistically significant. Of the total 1,091 cases, 65.8% (n = 718) were delayed. Significantly more outpatient cases started late compared with inpatient cases (81.4% vs 45.0%; P < .001, χ² test). The multivariate linear regression model showed outpatient status, length of delay in arrival, and longer procedure times to be significantly associated with longer preprocedure times. Late arrival of patients (65.9%), unavailability of physicians (18.4%), and unavailability of procedure room (13.0%) were the three most frequently identified reasons for delay. The delay was multifactorial in 29.6% of cases (n = 213). Objective measurement of preprocedural IR patient flow demonstrated considerable waste and highlighted high-yield areas of possible improvement. A data-driven approach may aid efficient delivery of IR care. Copyright © 2016 SIR. Published by Elsevier Inc. All rights reserved.
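
    The nonparametric comparisons described above can be sketched with SciPy on invented preprocedure-time data (the counts and distributions below are hypothetical, not the study's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Hypothetical preprocedure times (minutes): outpatients vs inpatients.
outpt = rng.gamma(shape=4, scale=20, size=120)  # tends to run longer
inpt = rng.gamma(shape=4, scale=12, size=80)

# Mann-Whitney U test for a shift between the two groups.
u_stat, u_p = stats.mannwhitneyu(outpt, inpt, alternative="two-sided")

# Delayed vs on-time counts by patient class, compared with a chi-square test.
table = np.array([[98, 22],   # outpatient: delayed, on time (hypothetical)
                  [36, 44]])  # inpatient:  delayed, on time
chi2, chi_p, dof, expected = stats.chi2_contingency(table)
```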

  8. Characterizing the performance of the Conway-Maxwell Poisson generalized linear model.

    PubMed

    Francis, Royce A; Geedipally, Srinivas Reddy; Guikema, Seth D; Dhavala, Soma Sekhar; Lord, Dominique; LaRocca, Sarah

    2012-01-01

    Count data are pervasive in many areas of risk analysis; deaths, adverse health outcomes, infrastructure system failures, and traffic accidents are all recorded as count events, for example. Risk analysts often wish to estimate the probability distribution for the number of discrete events as part of doing a risk assessment. Traditional count data regression models of the type often used in risk assessment for this problem suffer from limitations due to the assumed variance structure. A more flexible model based on the Conway-Maxwell Poisson (COM-Poisson) distribution was recently proposed, a model that has the potential to overcome the limitations of the traditional model. However, the statistical performance of this new model has not yet been fully characterized. This article assesses the performance of a maximum likelihood estimation method for fitting the COM-Poisson generalized linear model (GLM). The objectives of this article are to (1) characterize the parameter estimation accuracy of the MLE implementation of the COM-Poisson GLM, and (2) estimate the prediction accuracy of the COM-Poisson GLM using simulated data sets. The results of the study indicate that the COM-Poisson GLM is flexible enough to model under-, equi-, and overdispersed data sets with different sample mean values. The results also show that the COM-Poisson GLM yields accurate parameter estimates. The COM-Poisson GLM provides a promising and flexible approach for performing count data regression. © 2011 Society for Risk Analysis.
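
    A minimal sketch of the COM-Poisson pmf and a maximum likelihood fit, assuming a truncated-series normalizing constant (the parameterization and data are illustrative; the article's GLM additionally links covariates to the parameters):

```python
import numpy as np
from math import lgamma
from scipy.optimize import minimize

LOGFACT = np.array([lgamma(j + 1) for j in range(200)])  # log(j!) lookup table

def com_poisson_logpmf(k, lam, nu):
    """log P(Y=k) for COM-Poisson: P(k) proportional to lam^k / (k!)^nu.
    nu < 1 gives overdispersion, nu > 1 underdispersion, nu = 1 is Poisson."""
    terms = np.arange(200) * np.log(lam) - nu * LOGFACT   # truncated series for Z
    logz = np.logaddexp.reduce(terms)
    return k * np.log(lam) - nu * LOGFACT[k] - logz

def neg_loglik(params, counts):
    lam, nu = np.exp(params)   # optimize on the log scale to keep both positive
    return -np.sum(com_poisson_logpmf(counts, lam, nu))

rng = np.random.default_rng(5)
counts = rng.poisson(3.0, size=200)   # equidispersed data: expect nu near 1

fit = minimize(neg_loglik, x0=[np.log(2.0), 0.0], args=(counts,),
               method="Nelder-Mead")
lam_hat, nu_hat = np.exp(fit.x)
```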

  9. Sufficient Forecasting Using Factor Models

    PubMed Central

    Fan, Jianqing; Xue, Lingzhou; Yao, Jiawei

    2017-01-01

    We consider forecasting a single time series when there is a large number of predictors and a possible nonlinear effect. The dimensionality is first reduced via a high-dimensional (approximate) factor model implemented by principal component analysis. Using the extracted factors, we develop a novel forecasting method called the sufficient forecasting, which provides a set of sufficient predictive indices, inferred from high-dimensional predictors, to deliver additional predictive power. The projected principal component analysis is employed to enhance the accuracy of inferred factors when a semi-parametric (approximate) factor model is assumed. Our method is also applicable to cross-sectional sufficient regression using extracted factors. The connection between the sufficient forecasting and the deep learning architecture is explicitly stated. The sufficient forecasting correctly estimates projection indices of the underlying factors even in the presence of a nonparametric forecasting function. The proposed method extends the sufficient dimension reduction to high-dimensional regimes by condensing the cross-sectional information through factor models. We derive asymptotic properties for the estimate of the central subspace spanned by these projection directions as well as the estimates of the sufficient predictive indices. We further show that the natural method of running multiple regression of target on estimated factors yields a linear estimate that actually falls into this central subspace. Our method and theory allow the number of predictors to be larger than the number of observations. We finally demonstrate that the sufficient forecasting improves upon the linear forecasting in both simulation studies and an empirical study of forecasting macroeconomic variables. PMID:29731537
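
    The linear special case discussed above, extracting factors by principal components and regressing the target on them, can be sketched on synthetic factor-model data (dimensions and coefficients are invented):

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical factor model: p = 50 predictors driven by 2 latent factors.
T, p = 200, 50
F = rng.normal(size=(T, 2))          # latent factors
Lam = rng.normal(size=(2, p))        # factor loadings
X = F @ Lam + 0.3 * rng.normal(size=(T, p))
y = 1.5 * F[:, 0] - 1.0 * F[:, 1] + 0.2 * rng.normal(size=T)

# Step 1: extract factors as principal components of the predictor panel.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
F_hat = U[:, :2] * s[:2]             # first two PCs estimate the factor space

# Step 2: regress the target on the estimated factors (the linear special case
# of the sufficient forecasting; the PCs recover the factors up to rotation,
# which the regression absorbs).
A = np.column_stack([np.ones(T), F_hat])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
r2 = 1 - ((y - A @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```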

  10. Calculating stage duration statistics in multistage diseases.

    PubMed

    Komarova, Natalia L; Thalhauser, Craig J

    2011-01-01

    Many human diseases are characterized by multiple stages of progression. While the typical sequence of disease progression can be identified, there may be large individual variations among patients. Identifying mean stage durations and their variations is critical for statistical hypothesis testing needed to determine if treatment is having a significant effect on the progression, or if a new therapy is showing a delay of progression through a multistage disease. In this paper we focus on two methods for extracting stage duration statistics from longitudinal datasets: an extension of the linear regression technique, and a counting algorithm. Both are non-iterative, non-parametric and computationally cheap methods, which makes them invaluable tools for studying the epidemiology of diseases, with a goal of identifying different patterns of progression by using bioinformatics methodologies. Here we show that the regression method performs well for calculating the mean stage durations under a wide variety of assumptions; however, its generalization to variance calculations fails under realistic assumptions about the data collection procedure. On the other hand, the counting method yields reliable estimations for both means and variances of stage durations. Applications to Alzheimer disease progression are discussed.
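
    One simple counting-style estimator (an illustration under regular visit spacing, not necessarily the authors' exact algorithm) tallies the visits a patient spends in each stage and scales by the visit interval:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical longitudinal data: each patient is seen every dt years and the
# current disease stage (1, 2, or 3) is recorded. True mean stage durations:
true_mean = {1: 2.0, 2: 3.0}
dt = 0.25

def simulate_patient():
    d1 = rng.gamma(4.0, true_mean[1] / 4.0)   # time spent in stage 1
    d2 = rng.gamma(4.0, true_mean[2] / 4.0)   # time spent in stage 2
    times = np.arange(0.0, d1 + d2 + 5.0, dt)
    return np.where(times < d1, 1, np.where(times < d1 + d2, 2, 3))

visits = [simulate_patient() for _ in range(500)]

# Counting estimator: time in a stage ~ dt x number of visits recorded in that
# stage, averaged over patients who were observed leaving the stage.
def mean_duration(visits, stage):
    durs = [dt * np.sum(v == stage) for v in visits if v[-1] > stage]
    return float(np.mean(durs))

est1 = mean_duration(visits, 1)
est2 = mean_duration(visits, 2)
```

The per-patient duration lists also give a direct variance estimate, which is the property the abstract highlights for the counting method.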

  11. Harmonization of the Bayer ADVIA Centaur and Abbott AxSYM automated B-type natriuretic peptide assay in patients on hemodialysis.

    PubMed

    Barak, Mira; Weinberger, Ronit; Marcusohn, Jerom; Froom, Paul

    2005-01-01

    There are two fully automated high-throughput clinical instruments for brain natriuretic peptide (BNP) assays: the Bayer ADVIA Centaur assay and the Abbott AxSYM assay. Although both recommend a cut-off value of 100 pg/mL, we are unaware of previous studies that have compared the unadjusted results of the two methods, as required for proper evaluation of patients undergoing this test on different platforms. From 43 hemodialysis patients, 80 paired samples were collected by venipuncture into plastic evacuated tubes containing EDTA. The Bayer assay yielded lower values than the Abbott assay: linear regression forced through the origin gave Bayer = 0.53 × Abbott (95% confidence interval, 0.50-0.56), with an r² of 0.954. The reverse regression was Abbott = 1.79 × Bayer (95% CI, 1.69-1.89). The cut-off values for abnormal BNP results analyzed on the Abbott system are not identical to those on the Bayer system, and this needs to be taken into account when comparing studies on the clinical utility of these systems.
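
    Regression forced through the origin, as used in this comparison, has the closed-form slope Σxy/Σx². The sketch below uses invented paired measurements with the abstract's 0.53 proportionality:

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical paired BNP measurements on two platforms with a proportional
# (no-intercept) relationship, as in the abstract.
abbott = rng.uniform(20, 2000, size=80)                       # pg/mL
bayer = 0.53 * abbott * np.exp(rng.normal(scale=0.05, size=80))

# Least squares forced through the origin: slope = sum(xy) / sum(x^2).
slope = np.sum(abbott * bayer) / np.sum(abbott ** 2)

# Converting a decision threshold across platforms with the fitted slope.
cutoff_abbott = 100.0
cutoff_bayer = slope * cutoff_abbott
```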

  12. Specialization Agreements in the Council for Mutual Economic Assistance

    DTIC Science & Technology

    1988-02-01

    proportions to stabilize variance (S. Weisberg, Applied Linear Regression , 2nd ed., John Wiley & Sons, New York, 1985, p. 134). If the dependent...27, 1986, p. 3. Weisberg, S., Applied Linear Regression , 2nd ed., John Wiley & Sons, New York, 1985, p. 134. Wiles, P. J., Communist International

  13. Radio Propagation Prediction Software for Complex Mixed Path Physical Channels

    DTIC Science & Technology

    2006-08-14

    63 4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz 69 4.4.7. Projected Scaling to...4.4.6. Applied Linear Regression Analysis in the Frequency Range 1-50 MHz In order to construct a comprehensive numerical algorithm capable of

  14. INTRODUCTION TO A COMBINED MULTIPLE LINEAR REGRESSION AND ARMA MODELING APPROACH FOR BEACH BACTERIA PREDICTION

    EPA Science Inventory

    Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...

  15. Data Transformations for Inference with Linear Regression: Clarifications and Recommendations

    ERIC Educational Resources Information Center

    Pek, Jolynn; Wong, Octavia; Wong, C. M.

    2017-01-01

    Data transformations have been promoted as a popular and easy-to-implement remedy to address the assumption of normally distributed errors (in the population) in linear regression. However, the application of data transformations introduces non-ignorable complexities which should be fully appreciated before their implementation. This paper adds to…

  16. USING LINEAR AND POLYNOMIAL MODELS TO EXAMINE THE ENVIRONMENTAL STABILITY OF VIRUSES

    EPA Science Inventory

    The article presents the development of model equations for describing the fate of viral infectivity in environmental samples. Most of the models were based upon the use of a two-step linear regression approach. The first step employs regression of log base 10 transformed viral t...

  17. Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Camilleri, Liberato; Cefai, Carmel

    2013-01-01

    Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…

  18. Optimization of pressurized liquid extraction of inositols from pine nuts (Pinus pinea L.).

    PubMed

    Ruiz-Aceituno, L; Rodríguez-Sánchez, S; Sanz, J; Sanz, M L; Ramos, L

    2014-06-15

    Pressurized liquid extraction (PLE) has been used for the first time to extract bioactive inositols from pine nuts. The influence of extraction time, temperature, and number of extraction cycles on the yield and composition of the extract was studied. A quadratic linear model fitted by stepwise multiple linear regression was used to evaluate possible trends in the process. Under optimised PLE conditions (50°C, 18 min, 3 cycles of 1.5 mL water each) at 10 MPa, a noticeable reduction in extraction time and solvent volume, compared with solid-liquid extraction (SLE; room temperature, 2 h, 2 cycles of 5 mL water each), was achieved; 5.7 mg/g inositols were extracted by PLE, whereas yields of only 3.7 mg/g were obtained by SLE. Subsequent incubation of PLE extracts with Saccharomyces cerevisiae (37°C, 5 h) allowed the removal of other co-extracted low molecular weight carbohydrates which may interfere with the bioactivity of inositols. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. A highly fluorescent hydrophilic ionic liquid as a potential probe for the sensing of biomacromolecules.

    PubMed

    Chen, Xu-Wei; Liu, Jia-Wei; Wang, Jian-Hua

    2011-02-17

    With respect to conventional imidazolium ionic liquids, which generally exhibit very weak fluorescence with quantum yields at extremely low levels of 0.005-0.02, the symmetrical hydrophilic ionic liquid 1,3-dibutylimidazolium chloride (BBimCl) was found to be highly fluorescent, with λ(em) at 388 nm when excited at λ(ex) < 340 nm. The very high quantum yield of BBimCl in aqueous medium, derived to be 0.523 when excited at 315 nm, was attributed to its symmetrical plane-conjugating structure. In the presence of hemoglobin, the fluorescence of BBimCl could be significantly quenched, resulting from the coordinating interaction between the iron atom in the heme group of hemoglobin and the cationic imidazolium moiety. This feature of the present hydrophilic ionic liquid makes it a promising fluorescence probe candidate for the sensitive sensing of hemoglobin. A linear response was observed from 3 × 10⁻⁷ to 5 × 10⁻⁶ mol L⁻¹ for hemoglobin, and a detection limit of 7.3 × 10⁻⁸ mol L⁻¹ was derived.

  20. An automated real-time free phenytoin assay to replace the obsolete Abbott TDx method.

    PubMed

    Williams, Christopher; Jones, Richard; Akl, Pascale; Blick, Kenneth

    2014-01-01

    Phenytoin is a commonly used anticonvulsant that is highly protein bound with a narrow therapeutic range. The unbound fraction, free phenytoin (FP), is responsible for pharmacologic effects; therefore, it is essential to measure both FP and total serum phenytoin levels. Historically, the Abbott TDx method has been widely used for the measurement of FP and was the method used in our laboratory. However, the FP TDx assay was recently discontinued by the manufacturer, so we had to develop an alternative methodology. We evaluated the Beckman-Coulter DxC800 based FP method for linearity, analytical sensitivity, and precision. The analytical measurement range of the method was 0.41 to 5.30 µg/mL. Within-run and between-run precision studies yielded CVs of 3.8% and 5.5%, respectively. The method compared favorably with the TDx method, yielding the following regression equation: DxC800 = 0.9 × TDx + 0.10; r² = 0.97 (n = 97). The new FP assay appears to be an acceptable alternative to the TDx method.

  1. Hydrogel keratophakia: a microkeratome dissection in the monkey model.

    PubMed Central

    Beekhuis, W H; McCarey, B E; Waring, G O; van Rij, G

    1986-01-01

    High water content intracorneal implants were fabricated from Vistamarc hydrogel (Vistakon, Inc.) at 58%, 68%, and 72% water content and a range of powers from +7.25 to +17.00 dioptres. The Barraquer microkeratome technique was used to implant the lens at 59.0 ± 9% (± SD) depth in the corneas of 14 rhesus monkey eyes. The contralateral eye served as a control. Three eyes were lost to the study because of complications. The remaining 11 animals were followed up for 51 ± 2 weeks, with the refractive yield being 118 ± 34% and the keratometric yield being 92 ± 30%. The measured and theoretically expected refractive changes have a linear regression line correlation coefficient of 0.74, whereas the respective keratometric data had a correlation coefficient of 0.04. The measured refraction became stable within 2 to 3 dioptres after 20 postoperative weeks. The hydrogels were well tolerated within the corneal tissue. There was a minimum of interface problems except along the edge of the implant. Implants with abruptly cut edges versus a fine wedge tended to have more light scattering collagen at the implant margin. PMID:3954976

  2. Simple and multiple linear regression: sample size considerations.

    PubMed

    Hanley, James A

    2016-11-01

    The suggested "two subjects per variable" (2SPV) rule of thumb in the Austin and Steyerberg article is a chance to bring out some long-established and quite intuitive sample size considerations for both simple and multiple linear regression. This article distinguishes two of the major uses of regression models that imply very different sample size considerations, neither served well by the 2SPV rule. The first is etiological research, which contrasts mean Y levels at differing "exposure" (X) values and thus tends to focus on a single regression coefficient, possibly adjusted for confounders. The second research genre guides clinical practice. It addresses Y levels for individuals with different covariate patterns or "profiles." It focuses on the profile-specific (mean) Y levels themselves, estimating them via linear compounds of regression coefficients and covariates. By drawing on long-established closed-form variance formulae that lie beneath the standard errors in multiple regression, and by rearranging them for heuristic purposes, one arrives at quite intuitive sample size considerations for both research genres. Copyright © 2016 Elsevier Inc. All rights reserved.
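
    The closed-form variance reasoning referred to above can be sketched with the standard approximation SE(β̂x) ≈ σ/√(n·var(x)·(1 − R²x)), where σ is the residual SD and R²x is the squared multiple correlation of x with the other covariates (a textbook formula used here for illustration, not the article's exact derivation):

```python
import numpy as np

# Standard error of a single regression coefficient, and the sample size
# needed to reach a target precision for it. The 1/(1 - R2_x) term is the
# variance inflation due to collinearity with the other covariates.
def se_beta(sigma, n, var_x, r2_x_others=0.0):
    return sigma / np.sqrt(n * var_x * (1.0 - r2_x_others))

def n_for_target_se(target_se, sigma, var_x, r2_x_others=0.0):
    """Smallest n giving SE(beta_x) <= target_se under the approximation."""
    return int(np.ceil((sigma / target_se) ** 2 / (var_x * (1.0 - r2_x_others))))

# Illustrative numbers: unit residual SD, unit-variance exposure, moderate
# collinearity with confounders (R2_x = 0.25), target SE of 0.05.
n = n_for_target_se(target_se=0.05, sigma=1.0, var_x=1.0, r2_x_others=0.25)
```

Note how the implied n scales with 1/target_se², not with the number of covariates per se, which is the article's point against the 2SPV rule.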

  3. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression

    PubMed Central

    Jiang, Feng; Han, Ji-zhong

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods. PMID:29623088

  4. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression.

    PubMed

    Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot effectively evaluate the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted into the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid the underfitting or overfitting problems that occur in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.

  5. Single-Grain (U-Th)/He Ages of Phosphates from St. Severin Chondrite

    NASA Astrophysics Data System (ADS)

    Min, K. K.; Reiners, P. W.; Shuster, D. L.

    2010-12-01

    Thermal evolution of chondrites provides valuable information on the heat budget, internal structure, and dimensions of their parent bodies that existed before disruption. The St. Severin LL6 ordinary chondrite is known to have experienced relatively slow cooling compared to H chondrites. The timings of primary cooling and subsequent thermal metamorphism were constrained by the U/Pb (4.55 Ga), Sm/Nd (4.55 Ga), Rb/Sr (4.51 Ga) and K/Ar (4.4 Ga) systems. However, the cooling history after the thermal metamorphism in the low-temperature range (<200 °C) is poorly understood. To constrain the low-T thermal history of this meteorite, we performed (1) single-grain (U-Th)/He dating of five chlorapatite and fourteen merrillite aggregates from St. Severin, (2) examination of textural and chemical features of the phosphate aggregates using a scanning electron microscope (SEM), and (3) proton irradiation followed by 4He and 3He diffusion experiments on single grains of chlorapatite and merrillite from the Guarena meteorite, for general characterization of He diffusivity in these major U-Th reservoirs in meteorites. The α-recoil-uncorrected ages from St. Severin are distributed over a wide range, from 333 ± 6 Ma to 4620 ± 1307 Ma. The probability density plot of these data shows a typical younging-skewed age distribution with a prominent peak at ~4.3 Ga. The weighted mean of the nine oldest samples is 4.284 ± 0.130 Ga, which is consistent with the peak of the probability plot. The linear dimensions of the phosphates are generally in the range of ~50 µm to 200 µm. The α-recoil correction factor (FT) based on the morphology of the phosphates yields improbably old ages (>4.6 Ga), suggesting that within the sample aggregates significant amounts of the α particles ejected from phosphates were implanted into adjacent phases, and therefore that this correction may not be appropriate in this case. The minimum FT value of 0.95 is calculated based on the peak (U-Th)/He age and 40Ar/39Ar data, which provide the upper limit of the α-recoil-corrected (U-Th)/He ages. From these data, we conclude that St. Severin cooled through the closure temperatures of chlorapatite and merrillite during ~4.3-4.4 Ga. The radiogenic 4He and proton-induced 3He diffusion experiments yield two well-defined linear trends in the Arrhenius plot for the chlorapatite (r = 43 µm) and merrillite (r = 59 µm) grains. Linear regression of the 3He data for chlorapatite yields Ea = 128.1 ± 2.4 kJ/mol and ln(Do/a2) = 11.6 ± 0.5 ln(s-1), generally consistent with terrestrial Durango apatite and meteoritic Acapulco apatite. Linear regression of the merrillite data yields Ea = 135.1 ± 2.5 kJ/mol and ln(Do/a2) = 5.73 ± 0.37 ln(s-1). The new data indicate that the diffusive retentivity of He within merrillite is significantly higher than that of chlorapatite, which has implications for quantitative interpretation of He ages measured in meteoritic phosphates.
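
The Arrhenius regression used to extract Ea and ln(Do/a2) is ordinary least squares on transformed variables. A minimal sketch, using synthetic noise-free data built from the chlorapatite values quoted above; the temperature range is illustrative, not the actual step-heating schedule.

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)

def arrhenius_fit(T_kelvin, ln_D_a2):
    """Regress ln(D/a^2) on 1/T; slope = -Ea/R, intercept = ln(D0/a^2)."""
    slope, intercept = np.polyfit(1.0 / T_kelvin, ln_D_a2, 1)
    return -slope * R, intercept   # (Ea in J/mol, ln(D0/a^2))

# Synthetic step-heating data built from the chlorapatite-like values in the
# abstract: Ea ~ 128.1 kJ/mol and ln(D0/a^2) ~ 11.6 ln(s^-1).
T = np.linspace(500.0, 800.0, 12)
ln_D = 11.6 - 128100.0 / (R * T)
Ea, ln_D0 = arrhenius_fit(T, ln_D)
```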

  6. Comparison of statistical models for analyzing wheat yield time series.

    PubMed

    Michel, Lucie; Makowski, David

    2013-01-01

    The world's population is predicted to exceed nine billion by 2050 and there is increasing concern about the capability of agriculture to feed such a large population. Foresight studies on food security are frequently based on crop yield trends estimated from yield time series provided by national and regional statistical agencies. Various types of statistical models have been proposed for the analysis of yield time series, but the predictive performances of these models have not yet been evaluated in detail. In this study, we present eight statistical models for analyzing yield time series and compare their ability to predict wheat yield at the national and regional scales, using data provided by the Food and Agriculture Organization of the United Nations and by the French Ministry of Agriculture. The Holt-Winters and dynamic linear models performed equally well, giving the most accurate predictions of wheat yield. However, dynamic linear models have two advantages over Holt-Winters models: they can be used to reconstruct past yield trends retrospectively and to analyze uncertainty. The results obtained with dynamic linear models indicated a stagnation of wheat yields in many countries, but the estimated rate of increase of wheat yield remained above 0.06 t ha⁻¹ year⁻¹ in several countries in Europe, Asia, Africa and America, and the estimated values were highly uncertain for several major wheat producing countries. The rate of yield increase differed considerably between French regions, suggesting that efforts to identify the main causes of yield stagnation should focus on a subnational scale.

  7. No association of smoke-free ordinances with profits from bingo and charitable games in Massachusetts

    PubMed Central

    Glantz, S; Wilson-Loots, R

    2003-01-01

    Background: Because bingo is widely played, claims that smoking restrictions will adversely affect bingo games are used as an argument against these policies. We used publicly available data from Massachusetts to assess the impact of 100% smoke-free ordinances on profits from bingo and other gambling sponsored by charitable organisations between 1985 and 2001. Methods: We conducted two analyses: (1) a general linear model implementation of a time series analysis with net profits (adjusted to 2001 dollars) as the dependent variable, and community (as a fixed effect), year, lagged net profits, and the length of time the ordinance had been in force as the independent variables; (2) multiple linear regression of total state profits against time, lagged profits, and the percentage of the entire state population in communities that allow charitable gaming but prohibit smoking. Results: The general linear model analysis of data from individual communities showed that, while adjusted profits fell over time, this effect was not related to the presence of an ordinance. The analysis in terms of the fraction of the population living in communities with ordinances yielded the same result. Conclusion: Policymakers can implement smoke-free policies without concern that these policies will affect charitable gaming. PMID:14660778

  8. Intramuscular Pressure Measurement During Locomotion in Humans

    NASA Technical Reports Server (NTRS)

    Ballard, Ricard E.

    1996-01-01

    To assess the usefulness of intramuscular pressure (IMP) measurement for studying muscle function during gait, IMP was recorded in the soleus and tibialis anterior muscles of ten volunteers during treadmill walking and running, using transducer-tipped catheters. Soleus IMP exhibited single peaks during the late-stance phase of walking (181 +/- 69 mmHg, mean +/- S.E.) and running (269 +/- 95 mmHg). Tibialis anterior IMP showed a biphasic response, with the largest peak (90 +/- 15 mmHg during walking and 151 +/- 25 mmHg during running) occurring shortly after heel strike. IMP magnitude increased with gait speed in both muscles. Linear regression of soleus IMP against ankle joint torque obtained by a dynamometer in two subjects produced linear relationships (r = 0.97). Application of these relationships to IMP data yielded estimated peak soleus moment contributions of 0.95-1.65 Nm/kg during walking and 1.43-2.70 Nm/kg during running. IMP results from local muscle tissue deformations caused by muscle force development and thus provides a direct, practical index of muscle function during locomotion in humans.

  9. Leg intramuscular pressures during locomotion in humans

    NASA Technical Reports Server (NTRS)

    Ballard, R. E.; Watenpaugh, D. E.; Breit, G. A.; Murthy, G.; Holley, D. C.; Hargens, A. R.

    1998-01-01

    To assess the usefulness of intramuscular pressure (IMP) measurement for studying muscle function during gait, IMP was recorded in the soleus and tibialis anterior muscles of 10 volunteers during treadmill walking and running by using transducer-tipped catheters. Soleus IMP exhibited single peaks during late-stance phase of walking [181 +/- 69 (SE) mmHg] and running (269 +/- 95 mmHg). Tibialis anterior IMP showed a biphasic response, with the largest peak (90 +/- 15 mmHg during walking and 151 +/- 25 mmHg during running) occurring shortly after heel strike. IMP magnitude increased with gait speed in both muscles. Linear regression of soleus IMP against ankle joint torque obtained by a dynamometer produced linear relationships (n = 2, r = 0.97 for both). Application of these relationships to IMP data yielded estimated peak soleus moment contributions of 0.95-1.65 N . m/kg during walking, and 1.43-2.70 N . m/kg during running. Phasic elevations of IMP during exercise are probably generated by local muscle tissue deformations due to muscle force development. Thus profiles of IMP provide a direct, reproducible index of muscle function during locomotion in humans.

  10. Spectrophotometric evaluation of stability constants of 1:1 weak complexes from continuous variation data.

    PubMed

    Sayago, Ana; Asuero, Agustin G

    2006-09-14

    A bilogarithmic hyperbolic cosine method for the spectrophotometric evaluation of stability constants of 1:1 weak complexes from continuous variation data has been devised and applied to literature data. A weighting scheme, however, is necessary in order to take into account the transformation for linearization. The method may be considered a useful alternative to methods in which one variable is involved on both sides of the basic equation (i.e. Heller and Schwarzenbach, Likussar and Adsul and Ramanathan). Classical least squares leads in those instances to biased and approximate stability constants and limiting absorbance values. The advantages of the proposed method are: it gives a clear indication of the existence of only one complex in solution, it is flexible enough to allow for weighting of measurements, and the computation procedure yields the best value of logbeta11 and its limit of error. The agreement between the values obtained by applying the weighted hyperbolic cosine method and the non-linear regression (NLR) method is good, with the mean quadratic error at a minimum in both cases.
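
The weighting scheme needed after a linearizing transformation boils down to weighted least squares with inverse-variance weights. A generic sketch, not the bilogarithmic hyperbolic cosine method itself; the heteroscedastic toy data and noise model are assumptions.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: minimise sum_i w_i (y_i - X_i beta)^2 via the
    normal equations (X' W X) beta = X' W y."""
    XtW = X.T * w                      # X^T W (broadcast weights over columns)
    return np.linalg.solve(XtW @ X, XtW @ y)

rng = np.random.default_rng(2)
x = np.linspace(1, 10, 200)
X = np.column_stack([np.ones_like(x), x])
sigma = 0.05 * x                       # noise grows with x, as after a transform
y = 2.0 + 0.7 * x + rng.normal(0, sigma)
beta = wls(X, y, 1.0 / sigma ** 2)     # inverse-variance weights
```

Unweighted least squares on such transformed data gives more influence to the noisiest points, which is the source of the bias the abstract warns about.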

  11. Environmentally Dependent Density-Distance Relationship of Dispersing Culex tarsalis in a Southern California Desert Region.

    PubMed

    Antonić, Oleg; Sudarić-Bogojević, Mirta; Lothrop, Hugh; Merdić, Enrih

    2014-09-01

    The direct inclusion of environmental factors into an empirical model describing a density-distance relationship (DDR) is demonstrated on dispersal data obtained in a capture-mark-release-recapture (CMRR) experiment with Culex tarsalis conducted around the community of Mecca, CA. Empirical parameters of the standard (environmentally independent) DDR were expressed as linear functions of environmental variables: the relative orientation (azimuthal deviation from north) of the release point (relative to each recapture point) and the proportions of habitat types surrounding each recapture point. The resulting regression model (R2 = 0.5373, after optimization on the best subset of linear terms) suggests that the spatial density of recaptured individuals after 12 days of the CMRR experiment depended significantly on 1) distance from the release point, 2) orientation of recapture points in relation to the release point (preferring dispersal toward the south, probably due to wind drift and the position of periodically flooded habitats suitable for the species' egg clutches), and 3) the habitat spectrum surrounding recapture points (increasing and decreasing population density in desert and urban environments, respectively).

  12. Bivariate categorical data analysis using normal linear conditional multinomial probability model.

    PubMed

    Sun, Bingrui; Sutradhar, Brajendra

    2015-02-10

    Bivariate multinomial data such as the left and right eyes retinopathy status data are analyzed either by using a joint bivariate probability model or by exploiting certain odds ratio-based association models. However, the joint bivariate probability model yields marginal probabilities, which are complicated functions of marginal and association parameters for both variables, and the odds ratio-based association model treats the odds ratios involved in the joint probabilities as 'working' parameters, which are consequently estimated through certain arbitrary 'working' regression models. Also, this latter odds ratio-based model does not provide any easy interpretation of the correlations between two categorical variables. On the basis of pre-specified marginal probabilities, in this paper, we develop a bivariate normal-type linear conditional multinomial probability model to understand the correlations between two categorical variables. The parameters involved in the model are consistently estimated using the optimal likelihood and generalized quasi-likelihood approaches. The proposed model and the inferences are illustrated through an intensive simulation study as well as an analysis of the well-known Wisconsin Diabetic Retinopathy status data. Copyright © 2014 John Wiley & Sons, Ltd.

  13. Riemannian multi-manifold modeling and clustering in brain networks

    NASA Astrophysics Data System (ADS)

    Slavakis, Konstantinos; Salsabilian, Shiva; Wack, David S.; Muldoon, Sarah F.; Baidoo-Williams, Henry E.; Vettel, Jean M.; Cieslak, Matthew; Grafton, Scott T.

    2017-08-01

    This paper introduces Riemannian multi-manifold modeling in the context of brain-network analytics: brain-network time series yield features which are modeled as points lying in or close to a union of a finite number of submanifolds within a known Riemannian manifold. Distinguishing disparate time series thus amounts to clustering multiple Riemannian submanifolds. To this end, two feature-generation schemes for brain-network time series are put forth. The first is motivated by Granger-causality arguments and uses an auto-regressive moving average model to map low-rank linear vector subspaces, spanned by column vectors of appropriately defined observability matrices, to points in the Grassmann manifold. The second utilizes (non-linear) dependencies among network nodes by introducing kernel-based partial correlations to generate points in the manifold of positive-definite matrices. Based on recently developed research on clustering Riemannian submanifolds, an algorithm is provided for distinguishing time series based on their Riemannian-geometry properties. Numerical tests on time series, synthetically generated from real brain-network structural connectivity matrices, reveal that the proposed scheme outperforms classical and state-of-the-art techniques in clustering brain-network states/structures.

  14. Simultaneous Myocardial Strain and Dark-Blood Perfusion Imaging Using a Displacement-Encoded MRI Pulse Sequence

    PubMed Central

    Le, Yuan; Stein, Ashley; Berry, Colin; Kellman, Peter; Bennett, Eric E.; Taylor, Joni; Lucas, Katherine; Kopace, Rael; Chefd’Hotel, Christophe; Lorenz, Christine H.; Croisille, Pierre; Wen, Han

    2010-01-01

    The purpose of this study is to develop and evaluate a displacement-encoded pulse sequence for simultaneous perfusion and strain imaging. Displacement-encoded images in 2–3 myocardial slices were repeatedly acquired using a single-shot pulse sequence for 3 to 4 minutes, covering a bolus infusion of Gd. The magnitudes of the images were T1-weighted and provided quantitative measures of perfusion, while the phase maps yielded strain measurements. In an acute coronary occlusion swine protocol (n=9), segmental perfusion measurements were validated against a microsphere reference standard with a linear regression (slope 0.986, R2 = 0.765, Bland-Altman standard deviation = 0.15 ml/min/g). In a group of ST-elevation myocardial infarction (STEMI) patients (n=11), the scan success rate was 76%. Short-term contrast washout rate and perfusion were highly correlated (R2=0.72), and the pixel-wise relationship between circumferential strain and perfusion was better described by a sigmoidal Hill curve than by linear functions. This study demonstrates the feasibility of measuring strain and perfusion from a single set of images. PMID:20544714

  15. Analysis of Binary Adherence Data in the Setting of Polypharmacy: A Comparison of Different Approaches

    PubMed Central

    Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.

    2009-01-01

    Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358

  16. Genetic Programming Transforms in Linear Regression Situations

    NASA Astrophysics Data System (ADS)

    Castillo, Flor; Kordon, Arthur; Villa, Carlos

    The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with an optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transform selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.

  17. Naval Research Logistics Quarterly. Volume 28. Number 3,

    DTIC Science & Technology

    1981-09-01

    denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, NAVAL RESEARCH LOGISTICS QUARTERLY VOL. 28...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions

  18. Automating approximate Bayesian computation by local linear regression.

    PubMed

    Thornton, Kevin R

    2009-07-07

    In several biological contexts, parameter inference often relies on computationally intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC uses a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics; it is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone and fully documented. 2. The program will automatically process multiple data sets and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and of testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local linear regression.
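
The regression-adjusted rejection ABC that ABCreg automates can be sketched as follows. The toy model (a normal mean with a uniform prior and the sample mean as the summary statistic) and all tuning constants are illustrative assumptions, not ABCreg's interface.

```python
import numpy as np

rng = np.random.default_rng(3)

def abc_reg(s_obs, n_sims=20000, accept=500):
    """Rejection ABC with a local linear-regression adjustment: accept the
    simulations whose summary statistic is closest to the observed one, then
    shift the accepted parameters along a local-linear fit of theta on s."""
    theta = rng.uniform(0.0, 10.0, n_sims)                  # prior draws
    # Simulate data sets and reduce each to a summary statistic (sample mean).
    s = rng.normal(theta[:, None], 1.0, (n_sims, 25)).mean(axis=1)
    keep = np.argsort(np.abs(s - s_obs))[:accept]           # closest simulations
    t_acc, s_acc = theta[keep], s[keep]
    # Local linear regression of theta on the summary, projected to s_obs:
    b, a = np.polyfit(s_acc, t_acc, 1)
    return t_acc - b * (s_acc - s_obs)                      # adjusted posterior sample

posterior = abc_reg(s_obs=5.0)
```

With a flat prior and this summary, the adjusted sample should concentrate near the observed mean of 5.0.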

  19. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    NASA Astrophysics Data System (ADS)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were cleansed, completed with secondary variables, and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were built for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and the data were screened with different computational approaches, allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and response variables, to search for the best subsets of explanatory variables, and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than their constituents and indicate an opportunity for more accurate and robust TBM performance predictions.
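
The core of a CART regression tree is the exhaustive search for the split that minimises squared error. A single-split ("stump") sketch on toy step-function data, not the tunnelling data set:

```python
import numpy as np

def best_split(x, y):
    """Find the CART-style split x <= c minimising the summed squared error
    of the two resulting leaf means."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_sse, best_c = np.inf, None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue                               # no threshold between ties
        left, right = ys[:i], ys[i:]
        sse = (((left - left.mean()) ** 2).sum()
               + ((right - right.mean()) ** 2).sum())
        if sse < best_sse:
            best_sse, best_c = sse, 0.5 * (xs[i - 1] + xs[i])
    return best_c

# A noisy step function: the recovered threshold should sit near the jump at x = 5.
rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 300)
y = np.where(x <= 5.0, 1.0, 3.0) + rng.normal(0, 0.1, 300)
threshold = best_split(x, y)
```

A full CART tree applies this search recursively over all predictors, which is what lets it capture the non-linear relations mentioned above.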

  20. Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.

    PubMed

    Haoliang Yuan; Yuan Yan Tang

    2017-04-01

    Classification of the pixels in a hyperspectral image (HSI) is an important task and has been widely applied in many practical applications. Its major challenge is the high-dimensional, small-sample-size problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, formed by the original data space and a hidden feature space, to learn a more discriminant linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed methods outperform many SL methods.
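
The RLR framework that motivates SSSLR has a one-line closed form, W = (X'X + lam*I)^(-1) X'Y. A minimal sketch on synthetic data; the regularisation weight lam and the data shapes are illustrative.

```python
import numpy as np

def ridge_fit(X, Y, lam=1.0):
    """Ridge linear regression (RLR): solve (X'X + lam I) W = X'Y,
    the closed form behind many subspace-learning projections."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(5)
X = rng.normal(size=(500, 8))
w_true = np.arange(1.0, 9.0)
y = X @ w_true + rng.normal(0, 0.1, 500)
w_hat = ridge_fit(X, y, lam=1.0)   # close to w_true, mildly shrunk toward 0
```

The lam*I term is what keeps the solve stable in the high-dimensional, small-sample-size regime the abstract describes, where X'X alone would be singular.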

  1. Simple linear and multivariate regression models.

    PubMed

    Rodríguez del Águila, M M; Benítez-Parejo, N

    2011-01-01

    In biomedical research it is common to find problems in which we wish to relate a response variable to one or more variables capable of describing the behaviour of the former variable by means of mathematical models. Regression techniques are used to this effect, in which an equation is determined relating the two variables. While such equations can have different forms, linear equations are the most widely used form and are easy to interpret. The present article describes simple and multiple linear regression models, how they are calculated, and how their applicability assumptions are checked. Illustrative examples are provided, based on the use of the freely accessible R program. Copyright © 2011 SEICAP. Published by Elsevier Espana. All rights reserved.
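
The article works its examples in the R program; the same closed-form computations can be sketched in Python (the toy data are illustrative):

```python
import numpy as np

def simple_ols(x, y):
    """Simple linear regression y = b0 + b1*x via the textbook closed forms,
    plus the coefficient of determination R^2."""
    b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # slope = Sxy / Sxx
    b0 = y.mean() - b1 * x.mean()                          # intercept
    resid = y - (b0 + b1 * x)
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return b0, b1, r2

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b0, b1, r2 = simple_ols(x, y)
```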

  2. Estimates of nitrate loads and yields from groundwater to streams in the Chesapeake Bay watershed based on land use and geology

    USGS Publications Warehouse

    Terziotti, Silvia; Capel, Paul D.; Tesoriero, Anthony J.; Hopple, Jessica A.; Kronholm, Scott C.

    2018-03-07

    The water quality of the Chesapeake Bay may be adversely affected by dissolved nitrate carried in groundwater discharge to streams. To estimate the concentrations, loads, and yields of nitrate from groundwater to streams for the Chesapeake Bay watershed, a regression model was developed based on nitrate concentrations measured at baseflow in 156 small streams with watersheds of less than 500 square miles (mi2). The regression model has three predictive variables: geologic unit, percent developed land, and percent agricultural land. Estimated and measured values within geologic units were closely matched. The coefficient of determination (R2) for the model was 0.6906. The model was used to calculate baseflow nitrate concentrations at over 83,000 National Hydrography Dataset Plus Version 2 catchments, aggregated to 1,966 12-digit hydrologic units in the Chesapeake Bay watershed. The modeled output geospatial data layers provided estimated annual loads and yields of nitrate from groundwater into streams. The spatial distribution of annual nitrate yields from groundwater estimated by this method was compared to the total watershed yields from all sources estimated by a Chesapeake Bay SPAtially Referenced Regressions On Watershed attributes (SPARROW) water-quality model. The comparison showed similar spatial patterns. The regression model for the groundwater contribution had similar but lower yields, suggesting that groundwater is an important source of nitrogen for streams in the Chesapeake Bay watershed.
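
A regression with a categorical predictor (like geologic unit) plus percentage covariates is fitted with indicator columns. A sketch on made-up data; the three units, their baseline effects, and the coefficients are hypothetical, not values from the report.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical data: a categorical "geologic unit" plus two land-use percentages.
n = 400
unit = rng.integers(0, 3, n)               # three hypothetical geologic units
developed = rng.uniform(0, 100, n)         # percent developed land
agriculture = rng.uniform(0, 100, n)       # percent agricultural land
unit_effect = np.array([0.5, 1.5, 3.0])    # made-up baseline nitrate per unit
nitrate = (unit_effect[unit] + 0.02 * developed + 0.04 * agriculture
           + rng.normal(0, 0.2, n))

# Design matrix: one indicator column per unit (no global intercept), then covariates.
X = np.column_stack([(unit == k).astype(float) for k in range(3)]
                    + [developed, agriculture])
coef, *_ = np.linalg.lstsq(X, nitrate, rcond=None)
```

The first three coefficients recover the per-unit baselines and the last two the land-use slopes, mirroring the structure of a geologic-unit regression.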

  3. Optimization of isotherm models for pesticide sorption on biopolymer-nanoclay composite by error analysis.

    PubMed

    Narayanan, Neethu; Gupta, Suman; Gajbhiye, V T; Manjaiah, K M

    2017-04-01

    A carboxymethyl cellulose-nano organoclay (nano montmorillonite modified with 35-45 wt% dimethyl dialkyl (C14-C18) amine (DMDA)) composite was prepared by the solution intercalation method. The prepared composite was characterized by infrared spectroscopy (FTIR), X-ray diffraction (XRD) and scanning electron microscopy (SEM). The composite was evaluated for its sorption efficiency for the pesticides atrazine, imidacloprid and thiamethoxam. The sorption data were fitted to Langmuir and Freundlich isotherms using linear and non-linear methods. The linear regression method suggested that the sorption data fitted best into the Type II Langmuir and Freundlich isotherms. In order to avoid the bias resulting from linearization, seven different error parameters were also analyzed by the non-linear regression method. The non-linear error analysis suggested that the sorption data fitted the Langmuir model better than the Freundlich model. The maximum sorption capacity, Q0 (μg/g), was highest for imidacloprid (2000), followed by thiamethoxam (1667) and atrazine (1429). The study suggests that the coefficient of determination of the linear regression alone cannot be used for comparing the fit of the Langmuir and Freundlich models, and non-linear error analysis is needed to avoid inaccurate results. Copyright © 2017 Elsevier Ltd. All rights reserved.
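
The linear-vs-non-linear fitting issue can be illustrated with a small non-linear Langmuir fit that minimises the raw (untransformed) SSE. Here the capacity q0, which enters the model linearly, is profiled out in closed form over a grid of affinity constants b; the synthetic data are scaled to the imidacloprid capacity above, while b and the noise level are assumptions.

```python
import numpy as np

def langmuir_fit(c, q, b_grid=np.linspace(0.001, 1.0, 2000)):
    """Non-linear least-squares Langmuir fit q = q0*b*c/(1+b*c) by profiling:
    for each candidate affinity b, solve the best q0 in closed form and keep
    the (q0, b) pair with the smallest raw SSE."""
    best_sse, best_q0, best_b = np.inf, None, None
    for b in b_grid:
        f = b * c / (1.0 + b * c)       # basis function for this b
        q0 = (f @ q) / (f @ f)          # closed-form linear solve for q0
        sse = ((q - q0 * f) ** 2).sum()
        if sse < best_sse:
            best_sse, best_q0, best_b = sse, q0, b
    return best_q0, best_b

rng = np.random.default_rng(7)
c = np.linspace(0.5, 50, 15)                          # equilibrium concentration
q = 2000.0 * 0.08 * c / (1 + 0.08 * c) + rng.normal(0, 10.0, c.size)
q0_hat, b_hat = langmuir_fit(c, q)
```

Because the SSE is computed on the untransformed data, this avoids the linearization bias the abstract cautions against.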

  4. Serum levels of the immune activation marker neopterin change with age and gender and are modified by race, BMI, and percentage of body fat.

    PubMed

    Spencer, Monique E; Jain, Alka; Matteini, Amy; Beamer, Brock A; Wang, Nae-Yuh; Leng, Sean X; Punjabi, Naresh M; Walston, Jeremy D; Fedarko, Neal S

    2010-08-01

    Neopterin, a GTP metabolite expressed by macrophages, is a marker of immune activation. We hypothesize that levels of this serum marker change with donor age, reflecting increased chronic immune activation in normal aging. In addition to age, we assessed gender, race, body mass index (BMI), and percentage of body fat (%fat) as potential covariates. Serum was obtained from 426 healthy participants whose age ranged from 18 to 87 years. Anthropometric measures included %fat and BMI. Neopterin concentrations were measured by competitive ELISA. The paired associations between neopterin and age, BMI, or %fat were analyzed by Spearman's correlation or by linear regression of log-transformed neopterin, whereas overall associations were modeled by multiple regression of log-transformed neopterin as a function of age, gender, race, BMI, %fat, and interaction terms. Across all participants, neopterin exhibited a positive association with age, BMI, and %fat. Multiple regression modeling of neopterin in women and men as a function of age, BMI, and race revealed that each covariate contributed significantly to neopterin values and that optimal modeling required an interaction term between race and BMI. The covariate %fat was highly correlated with BMI and could be substituted for BMI to yield similar regression coefficients. The association of age and gender with neopterin levels and their modification by race, BMI, or %fat reflect the biology underlying chronic immune activation and perhaps gender differences in disease incidence, morbidity, and mortality.

  5. London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure

    PubMed Central

    Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith

    2017-01-01

    Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results Differences in findings across the regression models were small and unimportant. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regressions. Consequently, robust standard errors were used for linear regression, and a partial proportional odds ordinal logistic regression model was attempted; the latter could be fitted only for the grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when it is analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with the grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
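    The recommended analysis, linear regression with robust standard errors, can be sketched without a statistics package by computing a heteroscedasticity-consistent (HC1) sandwich covariance alongside the classical one (the 0-12 outcome and single predictor below are synthetic stand-ins for LMUP data, not the study's variables):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0, 1, n)                     # a hypothetical predictor
# Heteroscedastic errors: variance grows with x, violating the OLS assumption
y = np.clip(3 + 6 * x + (0.5 + 2 * x) * rng.standard_normal(n), 0, 12)

X = np.column_stack([np.ones(n), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# Classical OLS standard errors (assume constant error variance)
se_classic = np.sqrt(np.diag(XtX_inv) * (resid @ resid) / (n - 2))

# Robust (HC1) sandwich standard errors, valid under heteroscedasticity
meat = X.T @ (X * (resid ** 2)[:, None])
cov_robust = XtX_inv @ meat @ XtX_inv * n / (n - 2)
se_robust = np.sqrt(np.diag(cov_robust))
print(beta.round(3), se_classic.round(4), se_robust.round(4))
```

    The point estimates are the same under both approaches; only the standard errors (and hence confidence intervals) change when the constant-variance assumption is dropped.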

  6. A two-scale scattering model with application to the JONSWAP '75 aircraft microwave scatterometer experiment

    NASA Technical Reports Server (NTRS)

    Wentz, F. J.

    1977-01-01

    The general problem of bistatic scattering from a two-scale surface was evaluated. The treatment was entirely two-dimensional and in a vector formulation independent of any particular coordinate system. The two-scale scattering model was then applied to backscattering from the sea surface. In particular, the model was used in conjunction with the JONSWAP 1975 aircraft scatterometer measurements to determine the sea surface's two-scale roughness distributions, namely the probability density of the large-scale surface slope and the capillary wavenumber spectrum. Best fits yield, on average, a 0.7 dB rms difference between the model computations and the vertical-polarization measurements of the normalized radar cross section. Correlations between the distribution parameters and the wind speed were established from linear least-squares regressions.

  7. Determination of teicoplanin concentrations in serum by high-pressure liquid chromatography.

    PubMed Central

    Joos, B; Lüthy, R

    1987-01-01

    An isocratic reversed-phase high-pressure liquid chromatographic method for the determination of six components of the teicoplanin complex in biological fluid was developed. By using fluorescence detection after precolumn derivatization with fluorescamine, the assay is specific and highly sensitive, with reproducibility studies yielding coefficients of variation ranging from 1.5 to 8.5% (at 5 to 80 micrograms/ml). Response was linear from 2.5 to 80 micrograms/ml (r = 0.999); the recovery from spiked human serum was 76%. An external quality control was performed to compare this high-pressure liquid chromatographic method (H) with a standard microbiological assay (M); no significant deviation from slope = 1 and intercept = 0 was found by regression analysis (H = 1.03M - 0.45; n = 15). PMID:2957953

  8. Developing a Study Orientation Questionnaire in Mathematics for primary school students.

    PubMed

    Maree, Jacobus G; Van der Walt, Martha S; Ellis, Suria M

    2009-04-01

    The Study Orientation Questionnaire in Mathematics (Primary) is being developed as a diagnostic measure for South African teachers and counsellors to help primary school students improve their orientation towards the study of mathematics. In this study, participants were primary school students in the North-West Province of South Africa. During the standardisation in 2007, 1,013 students (538 boys: M age = 12.61; SD = 1.53; 555 girls: M age = 11.98; SD = 1.35; 10 missing values) were assessed. Factor analysis yielded three factors. Analysis also showed satisfactory reliability coefficients and item-factor correlations. Step-wise linear regression indicated that three factors (Mathematics anxiety, Study attitude in mathematics, and Study habits in mathematics) contributed significantly (R2 = .194) to predicting achievement in mathematics as measured by the Basic Mathematics Questionnaire (Primary).

  9. Using Parametric Cost Models to Estimate Engineering and Installation Costs of Selected Electronic Communications Systems

    DTIC Science & Technology

    1994-09-01

    Institute of Technology, Wright-Patterson AFB OH, January 1994. 4. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 5...Technology, Wright-Patterson AFB OH 5 April 1994. 29. Neter, John and others. Applied Linear Regression Models. Boston: Irwin, 1989. 30. Office of

  10. An Evaluation of the Automated Cost Estimating Integrated Tools (ACEIT) System

    DTIC Science & Technology

    1989-09-01

    residual and it is described as the residual divided by its standard deviation (13:App A,17). Neter, Wasserman, and Kutner, in Applied Linear Regression Models...others. Applied Linear Regression Models. Homewood IL: Irwin, 1983. 19. Raduchel, William J. "A Professional’s Perspective on User-Friendliness," Byte

  11. A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants

    ERIC Educational Resources Information Center

    Cooper, Paul D.

    2010-01-01

    A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

  12. Conjoint Analysis: A Study of the Effects of Using Person Variables.

    ERIC Educational Resources Information Center

    Fraas, John W.; Newman, Isadore

    Three statistical techniques--conjoint analysis, a multiple linear regression model, and a multiple linear regression model with a surrogate person variable--were used to estimate the relative importance of five university attributes for students in the process of selecting a college. The five attributes include: availability and variety of…

  13. Fitting program for linear regressions according to Mahon (1996)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trappitsch, Reto G.

    2018-01-09

    This program takes the user's input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement-error propagation. This software should facilitate the proper fitting of measurements with a simple interface.
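    Mahon's prescription itself is not shipped with SciPy, but the same errors-in-both-variables problem can be sketched with scipy.odr, which likewise weights each point by its stated x and y uncertainties (the data, true slope, and error sizes below are invented for illustration, and this is a comparable fit, not Mahon's exact algorithm):

```python
import numpy as np
from scipy import odr

rng = np.random.default_rng(3)
true_slope, true_intercept = 0.5, 2.0
x_true = np.linspace(0, 10, 40)
sx = np.full(x_true.size, 0.2)               # known measurement error in x
sy = np.full(x_true.size, 0.3)               # known measurement error in y
x = x_true + sx * rng.standard_normal(x_true.size)
y = true_intercept + true_slope * x_true + sy * rng.standard_normal(x_true.size)

# Straight-line model in scipy.odr's parameter-first convention
linear = odr.Model(lambda beta, x: beta[0] * x + beta[1])
data = odr.RealData(x, y, sx=sx, sy=sy)
fit = odr.ODR(data, linear, beta0=[1.0, 0.0]).run()
slope, intercept = fit.beta
print(round(slope, 3), round(intercept, 3))
```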

  14. How Robust Is Linear Regression with Dummy Variables?

    ERIC Educational Resources Information Center

    Blankmeyer, Eric

    2006-01-01

    Researchers in education and the social sciences make extensive use of linear regression models in which the dependent variable is continuous-valued while the explanatory variables are a combination of continuous-valued regressors and dummy variables. The dummies partition the sample into groups, some of which may contain only a few observations.…

  15. Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method

    ERIC Educational Resources Information Center

    Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Sülzle, Detlev

    2018-01-01

    The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…

  16. An Introduction to Graphical and Mathematical Methods for Detecting Heteroscedasticity in Linear Regression.

    ERIC Educational Resources Information Center

    Thompson, Russel L.

    Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources and types of heteroscedasticity are discussed, and methods for correction are…

  17. On the null distribution of Bayes factors in linear regression

    USDA-ARS?s Scientific Manuscript database

    We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...

  18. Common pitfalls in statistical analysis: Linear regression analysis

    PubMed Central

    Aggarwal, Rakesh; Ranganathan, Priya

    2017-01-01

    In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022

  19. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    PubMed

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of the support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas, and a number of efficient SC algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms are more efficient than the Newton linear programming algorithm, an efficient solver for the l1-norm SVR. The algorithms are then used to design radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, orthogonal matching pursuit, is two orders of magnitude faster than a well-known RBF network design algorithm, the orthogonal least squares algorithm.
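    Orthogonal matching pursuit, the SC algorithm the abstract highlights, reduces to a short greedy loop when used for sparse linear regression; a minimal sketch on synthetic data (the dimensions, support, and coefficients are arbitrary, and this omits the stopping-rule refinements of production implementations):

```python
import numpy as np

def omp(X, y, n_nonzero):
    """Orthogonal matching pursuit: greedy sparse linear regression.

    At each step, pick the column most correlated with the residual,
    then refit least squares on the selected support.
    """
    n, p = X.shape
    support, resid = [], y.copy()
    coef = np.zeros(p)
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(X.T @ resid)))
        if j not in support:
            support.append(j)
        beta_s, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        resid = y - X[:, support] @ beta_s
    coef[support] = beta_s
    return coef

rng = np.random.default_rng(4)
n, p = 100, 30
X = rng.standard_normal((n, p))
X /= np.linalg.norm(X, axis=0)               # unit-norm columns, as OMP assumes
beta_true = np.zeros(p)
beta_true[[3, 11, 20]] = [5.0, -4.0, 3.0]
y = X @ beta_true + 0.01 * rng.standard_normal(n)

coef = omp(X, y, n_nonzero=3)
print(np.flatnonzero(coef))
```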

  20. Evaluation of weather-based rice yield models in India

    NASA Astrophysics Data System (ADS)

    Sudharsan, D.; Adinarayana, J.; Reddy, D. Raji; Sreenivas, G.; Ninomiya, S.; Hirafuji, M.; Kiura, T.; Tanaka, K.; Desai, U. B.; Merchant, S. N.

    2013-01-01

    The objective of this study was to compare two different rice simulation models—standalone (Decision Support System for Agrotechnology Transfer [DSSAT]) and web-based (SImulation Model for RIce-Weather relations [SIMRIW])—with agrometeorological data and agronomic parameters for estimation of rice crop production in the southern semi-arid tropics of India. Studies were carried out on the BPT5204 rice variety to evaluate the two crop simulation models. Long-term experiments were conducted in a research farm of Acharya N G Ranga Agricultural University (ANGRAU), Hyderabad, India. Initially, results were obtained using 4 years (1994-1997) of data, with weather parameters from a local weather station, to evaluate DSSAT simulated results against observed values. Linear regression models used for the purpose showed a close relationship between DSSAT and observed yield. Subsequently, yield comparisons were also carried out with SIMRIW and DSSAT, and validated with actual observed values. As the correlation coefficients of the SIMRIW simulations were within acceptable limits, further rice experiments in the monsoon (Kharif) and post-monsoon (Rabi) agricultural seasons (2009, 2010 and 2011) were carried out with a location-specific distributed sensor network system. These proximal systems help to simulate dry weight, leaf area index and potential yield with the Java-based SIMRIW on a daily/weekly/monthly/seasonal basis. These dynamic parameters are useful to the farming community for necessary decision making in a ubiquitous manner. However, SIMRIW requires fine tuning for better results/decision making.

  1. Factors influencing platelet clumping during peripheral blood hematopoietic stem cell collection

    PubMed Central

    Mathur, Gagan; Bell, Sarah L.; Collins, Laura; Nelson, Gail A.; Knudson, C. Michael; Schlueter, Annette J.

    2018-01-01

    BACKGROUND Platelet clumping is a common occurrence during peripheral blood hematopoietic stem cell (HSC) collection using the Spectra Optia mononuclear cell (MNC) protocol. If clumping persists, it may prevent continuation of the collection and interfere with proper MNC separation. This study is the first to report the incidence of clumping, identify precollection factors associated with platelet clumping, and describe the degree to which platelet clumping interferes with HSC product yield. STUDY DESIGN AND METHODS In total, 258 HSC collections performed on 116 patients using the Optia MNC protocol were reviewed. Collections utilized heparin in anticoagulant citrate dextrose to facilitate large-volume leukapheresis. Linear and logistic regression models were utilized to determine which precollection factors were predictive of platelet clumping and whether clumping was associated with product yield or collection efficiency. RESULTS Platelet clumping was observed in 63% of collections. Multivariable analysis revealed that a lower white blood cell count was an independent predictor of clumping occurrence. Chemotherapy mobilization and a lower peripheral blood CD34+ cell count were predictors of the degree of clumping. Procedures with clumping had higher collection efficiency but lower blood volume processed on average, resulting in no difference in collection yields. Citrate toxicity did not correlate with clumping. CONCLUSION Although platelet clumping is a common technical problem seen during HSC collection, the total CD34+ cell-collection yields were not affected by clumping. WBC count, mobilization approach, and peripheral blood CD34+ cell count can help predict clumping and potentially drive interventions to proactively manage clumping. PMID:28150319

  2. Microwave pretreatment of switchgrass for bioethanol production

    NASA Astrophysics Data System (ADS)

    Keshwani, Deepak Radhakrishin

    Lignocellulosic materials are promising alternative feedstocks for bioethanol production. These materials include agricultural residues, cellulosic waste such as newsprint and office paper, logging residues, and herbaceous and woody crops. However, the recalcitrant nature of lignocellulosic biomass necessitates a pretreatment step to improve the yield of fermentable sugars. The overall goal of this dissertation is to expand the current state of knowledge on microwave-based pretreatment of lignocellulosic biomass. Existing research on bioenergy and value-added applications of switchgrass is reviewed in Chapter 2. Switchgrass is an herbaceous energy crop native to North America and has high biomass productivity, potentially low requirements for agricultural inputs and positive environmental impacts. Based on results from test plots, yields in excess of 20 Mg/ha have been reported. Environmental benefits associated with switchgrass include the potential for carbon sequestration, nutrient recovery from run-off, soil remediation and provision of habitats for grassland birds. Published research on pretreatment of switchgrass reported glucose yields ranging from 70-90% and xylose yields ranging from 70-100% after hydrolysis and ethanol yields ranging from 72-92% after fermentation. Other potential value-added uses of switchgrass include gasification, bio-oil production, newsprint production and fiber reinforcement in thermoplastic composites. Research on microwave-based pretreatment of switchgrass and coastal bermudagrass is presented in Chapter 3. Pretreatments were carried out by immersing the biomass in dilute chemical reagents and exposing the slurry to microwave radiation at 250 watts for residence times ranging from 5 to 20 minutes. Preliminary experiments identified alkalis as suitable chemical reagents for microwave-based pretreatment. An evaluation of different alkalis identified sodium hydroxide as the most effective alkali reagent. 
Under optimum pretreatment conditions, 82% glucose and 63% xylose yields were achieved for switchgrass, and 87% glucose and 59% xylose yields were achieved for coastal bermudagrass following enzymatic hydrolysis of the pretreated biomass. The optimum enzyme loadings were 15 FPU/g and 20 CBU/g for switchgrass and 10 FPU/g and 20 CBU/g for coastal bermudagrass. Dielectric properties for dilute sodium hydroxide solutions were measured and compared to solid loss, lignin reduction and reducing sugar levels in hydrolyzates. Results indicate that the dielectric loss tangent of alkali solutions is a potential indicator of the severity of microwave-based pretreatments. Modeling of pretreatment processes can be a valuable tool in process simulations of bioethanol production from lignocellulosic biomass. Chapter 4 discusses three different approaches that were used to model delignification and carbohydrate loss during microwave-based pretreatment of switchgrass: statistical linear regression modeling, kinetic modeling using a time-dependent rate coefficient, and a Mamdani-type fuzzy inference system. The dielectric loss tangent of the alkali reagent and pretreatment time were used as predictors in all models. The statistical linear regression model for delignification gave comparable root mean square error (RMSE) values for training and testing data and predictions were approximately within 1% of experimental values. The kinetic model for delignification and xylan loss gave comparable RMSE values for training and testing data sets and predictions were approximately within 2% of experimental values. The kinetic model for cellulose loss was not as effective and predictions were only within 5-7% of experimental values. The time-dependent rate coefficients of the kinetic models calculated from experimental data were consistent with the heterogeneity (or lack thereof) of individual biomass components. 
The Mamdani-type fuzzy inference system was shown to be an effective means to model pretreatment processes and gave the most accurate predictions (<3%) for cellulose loss.

  3. Evaluation of linear regression techniques for atmospheric applications: the importance of appropriate weighting

    NASA Astrophysics Data System (ADS)

    Wu, Cheng; Zhen Yu, Jian

    2018-03-01

    Linear regression techniques are widely used in atmospheric science, but they are often improperly applied due to a lack of consideration or inappropriate handling of measurement uncertainty. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending the previous work of Chu and Saylor. The five techniques are ordinary least squares (OLS), Deming regression (DR), orthogonal distance regression (ODR), weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of nonlinear measurement uncertainties, (b) inclusion of a linear measurement uncertainty, and (c) inclusion of WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercept by WODR and YR is overestimated, and the degree of bias is more pronounced with a low-R2 XY dataset. The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, and it is found that an improper λ in DR can bias both the slope and intercept estimates. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of the measurement error in the XY data during the measurement stage. If the a priori error in one of the variables is unknown, or the measurement error described cannot be trusted, DR, WODR and YR can provide the least-biased slope and intercept among all tested regression techniques. For these reasons, DR, WODR and YR are recommended for atmospheric studies when both X and Y data have measurement errors. An Igor Pro-based program (Scatter Plot) was developed to facilitate the implementation of error-in-variables regressions.
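    Deming regression has a closed form once the error-variance ratio is fixed, which makes the role of the weighting parameter easy to demonstrate. The sketch below uses synthetic data with equal error variances (so the ratio is 1) and also shows the attenuation of the OLS slope that DR corrects:

```python
import numpy as np

def deming(x, y, delta=1.0):
    """Deming regression: slope and intercept when both x and y carry
    measurement error; delta is the ratio Var(err_y) / Var(err_x)."""
    mx, my = x.mean(), y.mean()
    sxx = np.sum((x - mx) ** 2)
    syy = np.sum((y - my) ** 2)
    sxy = np.sum((x - mx) * (y - my))
    slope = (syy - delta * sxx + np.sqrt((syy - delta * sxx) ** 2
             + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx

rng = np.random.default_rng(5)
x_true = np.linspace(0, 20, 200)
x = x_true + rng.standard_normal(x_true.size)              # error variance 1 in x
y = 1.0 + 2.0 * x_true + rng.standard_normal(x_true.size)  # error variance 1 in y

slope_ols = np.polyfit(x, y, 1)[0]           # attenuated toward zero by x-error
slope_dr, intercept_dr = deming(x, y, delta=1.0)
print(round(slope_ols, 3), round(slope_dr, 3))
```

    With error in x, OLS systematically underestimates the slope, while DR with the correct delta recovers it; a wrong delta shifts the DR estimates, which is the sensitivity the paper probes.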

  4. A novel simple QSAR model for the prediction of anti-HIV activity using multiple linear regression analysis.

    PubMed

    Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga

    2006-08-01

    A quantitative structure-activity relationship was obtained by applying multiple linear regression analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio)thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R2(CV) = 0.8160; S(PRESS) = 0.5680) proved to be very accurate in both the training and prediction stages.

  5. Wavelet regression model in forecasting crude oil price

    NASA Astrophysics Data System (ADS)

    Hamid, Mohd Helmie; Shabri, Ani

    2017-05-01

    This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil price forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and the multiple linear regression (MLR) model. The original time series was decomposed into sub-series at different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series was used in this study to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), the autoregressive integrated moving average (ARIMA) model and generalized autoregressive conditional heteroscedasticity (GARCH) using root mean square error (RMSE) and mean absolute error (MAE). Based on the experimental results, the WMLR model performs better than the other forecasting techniques tested in this study.
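    The decompose-then-regress idea can be sketched with a causal Haar-style split standing in for the paper's DWT (the price series below is synthetic, and the two-component split is a deliberate simplification of a multi-scale wavelet decomposition):

```python
import numpy as np

def haar_components(x):
    """Causal Haar-style split: the pairwise mean of (x[t-1], x[t]) is the
    smooth part, the remainder the detail; smooth + detail == x."""
    approx = np.empty_like(x)
    approx[0] = x[0]
    approx[1:] = (x[1:] + x[:-1]) / 2.0
    return approx, x - approx

rng = np.random.default_rng(10)
t = np.arange(600)
# Hypothetical crude-oil-like series: slow trend plus noisy oscillation
price = 50 + 0.02 * t + 3 * np.sin(t / 15) + rng.standard_normal(t.size)

approx, detail = haar_components(price)
n = len(price)

# WMLR-style one-step-ahead regression on the decomposed components
X = np.column_stack([np.ones(n - 1), approx[:-1], detail[:-1]])
y = price[1:]
split = int(0.8 * (n - 1))
beta, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
rmse_wmlr = np.sqrt(np.mean((y[split:] - X[split:] @ beta) ** 2))

# Plain MLR baseline on the raw lagged price
Xr = np.column_stack([np.ones(n - 1), price[:n - 1]])
beta_r, *_ = np.linalg.lstsq(Xr[:split], y[:split], rcond=None)
rmse_mlr = np.sqrt(np.mean((y[split:] - Xr[split:] @ beta_r) ** 2))
print(round(rmse_wmlr, 3), round(rmse_mlr, 3))
```

    The smoothing is kept strictly causal here so that no future values leak into the predictors, a pitfall of naive wavelet forecasting setups.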

  6. Partitioning sources of variation in vertebrate species richness

    USGS Publications Warehouse

    Boone, R.B.; Krohn, W.B.

    2000-01-01

    Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km2) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%).
Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.
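    The partitioning of explained variation among correlated environmental predictors can be sketched with nested linear models: the unique contribution of a predictor set is the drop in R² when it is removed from the full model, and the remainder is shared (the data and coefficients below are synthetic, not the Maine data):

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(6)
n = 300
climate = rng.standard_normal(n)
geomorph = 0.6 * climate + 0.8 * rng.standard_normal(n)   # correlated predictors
richness = 2 * climate + 0.5 * geomorph + rng.standard_normal(n)

r2_full = r_squared(np.column_stack([climate, geomorph]), richness)
r2_climate = r_squared(climate[:, None], richness)
r2_geo = r_squared(geomorph[:, None], richness)

# Unique and shared components of the explained variation
unique_climate = r2_full - r2_geo
unique_geo = r2_full - r2_climate
shared = r2_full - unique_climate - unique_geo
print(round(r2_full, 3), round(unique_climate, 3),
      round(unique_geo, 3), round(shared, 3))
```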

  7. RBF kernel based support vector regression to estimate the blood volume and heart rate responses during hemodialysis.

    PubMed

    Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H

    2009-01-01

    This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial basis function (RBF) kernels, non-parametric models of relative blood volume (RBV) change with time, as well as percentage change in HR with respect to RBV, were obtained. The ε-insensitive loss function was used for SVR modeling. Selection of the design parameters, which include the capacity (C), the insensitivity region (ε) and the RBF kernel parameter (σ), was made based on a grid search approach, and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model based on RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) as well as testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training and testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
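    A full ε-insensitive SVR needs a quadratic-programming solver, so as a hedged stand-in the sketch below uses RBF kernel ridge regression, which shares the RBF-kernel machinery and has a closed-form solution, together with the k-fold AMSE comparison against linear regression that the paper reports (the RBV-like decay curve, σ, and regularization strength are all synthetic choices):

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between row-sample arrays A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kfold_mse(fit_predict, x, y, k=5):
    """Average test MSE over k folds (the paper's AMSE)."""
    idx = np.arange(len(y))
    errs = []
    for f in np.array_split(idx, k):
        train = np.setdiff1d(idx, f)
        pred = fit_predict(x[train], y[train], x[f])
        errs.append(np.mean((y[f] - pred) ** 2))
    return float(np.mean(errs))

def krr(x_tr, y_tr, x_te, sigma=1.0, lam=1e-2):
    """RBF kernel ridge regression, closed-form dual solution."""
    K = rbf_kernel(x_tr, x_tr, sigma)
    alpha = np.linalg.solve(K + lam * np.eye(len(y_tr)), y_tr)
    return rbf_kernel(x_te, x_tr, sigma) @ alpha

def linreg(x_tr, y_tr, x_te):
    X1 = np.column_stack([np.ones(len(y_tr)), x_tr])
    beta, *_ = np.linalg.lstsq(X1, y_tr, rcond=None)
    return np.column_stack([np.ones(len(x_te)), x_te]) @ beta

# Hypothetical RBV-versus-time-like curve: a nonlinear decay with noise
rng = np.random.default_rng(7)
t = rng.uniform(0, 4, (120, 1))
rbv = -8 * (1 - np.exp(-t[:, 0])) + 0.3 * rng.standard_normal(len(t))

amse_krr = kfold_mse(krr, t, rbv)
amse_lin = kfold_mse(linreg, t, rbv)
print(round(amse_krr, 3), round(amse_lin, 3))
```

    As in the paper, the kernel method's cross-validated AMSE beats the linear fit because the underlying response is a saturating curve, not a line.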

  8. The Norwegian Healthier Goats program--modeling lactation curves using a multilevel cubic spline regression model.

    PubMed

    Nagel-Alne, G E; Krontveit, R; Bohlin, J; Valle, P S; Skjerve, E; Sølverød, L S

    2014-07-01

    In 2001, the Norwegian Goat Health Service initiated the Healthier Goats program (HG), with the aim of eradicating caprine arthritis encephalitis, caseous lymphadenitis, and Johne's disease (caprine paratuberculosis) in Norwegian goat herds. The aim of the present study was to explore how control and eradication of the above-mentioned diseases by enrolling in HG affected milk yield by comparison with herds not enrolled in HG. Lactation curves were modeled using a multilevel cubic spline regression model where farm, goat, and lactation were included as random effect parameters. The data material contained 135,446 registrations of daily milk yield from 28,829 lactations in 43 herds. The multilevel cubic spline regression model was applied to 4 categories of data: enrolled early, control early, enrolled late, and control late. For enrolled herds, the early and late notations refer to the situation before and after enrolling in HG; for nonenrolled herds (controls), they refer to development over time, independent of HG. Total milk yield increased in the enrolled herds after eradication: the total milk yields in the fourth lactation were 634.2 and 873.3 kg in enrolled early and enrolled late herds, respectively, and 613.2 and 701.4 kg in the control early and control late herds, respectively. Day of peak yield differed between enrolled and control herds. The day of peak yield came on d 6 of lactation for the control early category for parities 2, 3, and 4, indicating an inability of the goats to further increase their milk yield from the initial level. For enrolled herds, on the other hand, peak yield came between d 49 and 56, indicating a gradual increase in milk yield after kidding. Our results indicate that enrollment in the HG disease eradication program improved the milk yield of dairy goats considerably, and that the multilevel cubic spline regression was a suitable model for exploring effects of disease control and eradication on milk yield. 
Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
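    A cubic spline regression of a lactation-like curve can be sketched with a truncated-power basis and ordinary least squares (the Wood-style curve, knot positions, and noise level below are hypothetical, and the sketch omits the paper's multilevel random effects for farm, goat, and lactation):

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated-power cubic spline basis: 1, x, x^2, x^3, (x - k)^3_+."""
    cols = [np.ones_like(x), x, x ** 2, x ** 3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(8)
day = np.sort(rng.uniform(1, 250, 400))
# Hypothetical lactation curve: rapid rise to a peak near day 50, slow decline
yield_true = 3.0 * (day / 50) * np.exp(1 - day / 50) + 1.0
milk = yield_true + 0.2 * rng.standard_normal(day.size)

knots = [30, 60, 120, 180]                   # interior knots, chosen by eye
B = cubic_spline_basis(day, knots)
beta, *_ = np.linalg.lstsq(B, milk, rcond=None)
fitted = B @ beta

peak_day = day[np.argmax(fitted)]            # day of peak yield, as in the paper
print(round(peak_day, 1))
```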

  9. Comparison of Statistical Models for Analyzing Wheat Yield Time Series

    PubMed Central

    Michel, Lucie; Makowski, David

    2013-01-01

    The world's population is predicted to exceed nine billion by 2050 and there is increasing concern about the capability of agriculture to feed such a large population. Foresight studies on food security are frequently based on crop yield trends estimated from yield time series provided by national and regional statistical agencies. Various types of statistical models have been proposed for the analysis of yield time series, but the predictive performances of these models have not yet been evaluated in detail. In this study, we present eight statistical models for analyzing yield time series and compare their ability to predict wheat yield at the national and regional scales, using data provided by the Food and Agriculture Organization of the United Nations and by the French Ministry of Agriculture. The Holt-Winters and dynamic linear models performed equally well, giving the most accurate predictions of wheat yield. However, dynamic linear models have two advantages over Holt-Winters models: they can be used to reconstruct past yield trends retrospectively and to analyze uncertainty. The results obtained with dynamic linear models indicated a stagnation of wheat yields in many countries, but the estimated rate of increase of wheat yield remained above 0.06 t ha−1 year−1 in several countries in Europe, Asia, Africa and America, and the estimated values were highly uncertain for several major wheat producing countries. The rate of yield increase differed considerably between French regions, suggesting that efforts to identify the main causes of yield stagnation should focus on a subnational scale. PMID:24205280
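    The Holt-Winters side of the comparison, without a seasonal term, is Holt's linear-trend exponential smoothing, which is short enough to write out directly (the wheat-yield series below is synthetic, with a built-in gain of 0.06 t/ha/yr to echo the reported rates; the smoothing constants are arbitrary):

```python
import numpy as np

def holt_linear(y, alpha=0.3, beta=0.1):
    """Holt's linear-trend exponential smoothing; returns in-sample
    one-step-ahead forecasts and the final (level, trend) state."""
    level, trend = y[0], y[1] - y[0]
    forecasts = [level + trend]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        forecasts.append(level + trend)
    return np.array(forecasts[:-1]), (level, trend)

# Hypothetical wheat-yield series: linear gain of ~0.06 t/ha/yr plus noise
rng = np.random.default_rng(9)
years = np.arange(1961, 2011)
yields = 2.0 + 0.06 * (years - years[0]) + 0.15 * rng.standard_normal(years.size)

fc, (level, trend) = holt_linear(yields)
mae = np.mean(np.abs(fc - yields[1:]))       # in-sample one-step accuracy
next_year = level + trend                    # out-of-sample forecast for 2011
print(round(trend, 3), round(next_year, 2), round(mae, 3))
```

    The smoothed trend component is the quantity of interest for the stagnation question: a trend estimate near zero would flag a plateauing yield series.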

  10. Simulation of broadband ground motion including nonlinear soil effects for a magnitude 6.5 earthquake on the Seattle fault, Seattle, Washington

    USGS Publications Warehouse

    Hartzell, S.; Leeds, A.; Frankel, A.; Williams, R.A.; Odum, J.; Stephenson, W.; Silva, W.

    2002-01-01

    The Seattle fault poses a significant seismic hazard to the city of Seattle, Washington. A hybrid, low-frequency, high-frequency method is used to calculate broadband (0-20 Hz) ground-motion time histories for a M 6.5 earthquake on the Seattle fault. High frequencies (>1 Hz) are calculated by a stochastic method that uses a fractal subevent size distribution to give an ω⁻² displacement spectrum. Time histories are calculated for a grid of stations and then corrected for the local site response using a classification scheme based on the surficial geology. Average shear-wave velocity profiles are developed for six surficial geologic units: artificial fill, modified land, Esperance sand, Lawton clay, till, and Tertiary sandstone. These profiles, together with other soil parameters, are used to compare linear, equivalent-linear, and nonlinear predictions of ground motion in the frequency band 0-15 Hz. Linear site-response corrections are found to yield unreasonably large ground motions. Equivalent-linear and nonlinear calculations give peak values similar to those of the 1994 Northridge, California, earthquake and those predicted by regression relationships. Ground-motion variance is estimated for (1) randomization of the velocity profiles, (2) variation in source parameters, and (3) choice of nonlinear model. Within the limits of the models tested, the results are found to be most sensitive to the nonlinear model and soil parameters, notably the overconsolidation ratio.

  11. Remotely sensed rice yield prediction using multi-temporal NDVI data derived from NOAA's-AVHRR.

    PubMed

    Huang, Jingfeng; Wang, Xiuzhen; Li, Xinxing; Tian, Hanqin; Pan, Zhuokun

    2013-01-01

    Grain-yield prediction using remotely sensed data has been intensively studied in wheat and maize, but such information is limited in rice, barley, oats and soybeans. The present study proposes a new framework for rice-yield prediction, which eliminates the influence of the technology development, fertilizer application, and management improvement and can be used for the development and implementation of provincial rice-yield predictions. The technique requires the collection of remotely sensed data over an adequate time frame and a corresponding record of the region's crop yields. Longer normalized-difference-vegetation-index (NDVI) time series are preferable to shorter ones for the purposes of rice-yield prediction because the well-contrasted seasons in a longer time series provide the opportunity to build regression models with a wide application range. A regression analysis of the yield versus the year indicated an annual gain in the rice yield of 50 to 128 kg ha⁻¹. Stepwise regression models for the remotely sensed rice-yield predictions have been developed for five typical rice-growing provinces in China. The prediction models for the remotely sensed rice yield indicated that the influences of the NDVIs on the rice yield were always positive. The association between the predicted and observed rice yields was highly significant without obvious outliers from 1982 to 2004. Independent validation found that the overall relative error is approximately 5.82%, and a majority of the relative errors were less than 5% in 2005 and 2006, depending on the study area. The proposed models can be used in an operational context to predict rice yields at the provincial level in China. The methodologies described in the present paper can be applied to any crop for which a sufficient time series of NDVI data and the corresponding historical yield information are available, as long as the historical yield increases significantly.
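The framework described — removing the technology/management trend by regressing yield on year, then relating the residual to NDVI — can be sketched as a two-step regression. All data and coefficients below are simulated for illustration; only the general shape (an annual gain plus an NDVI signal) follows the abstract.

```python
import numpy as np

# Hypothetical province: yield gains ~80 kg/ha/yr from technology, plus an NDVI signal.
rng = np.random.default_rng(2)
years = np.arange(1982, 2005)
t = (years - 1982).astype(float)
ndvi = 0.6 + rng.normal(0, 0.05, size=years.size)        # peak-season NDVI
yield_kg = 4000 + 80 * t + 5000 * (ndvi - 0.6) + rng.normal(0, 50, years.size)

# Step 1: regress yield on year to estimate (and remove) the technology trend.
A = np.column_stack([np.ones_like(t), t])
(b0, annual_gain), *_ = np.linalg.lstsq(A, yield_kg, rcond=None)
detrended = yield_kg - (b0 + annual_gain * t)

# Step 2: regress the detrended yield on NDVI (stepwise selection would repeat
# this over many candidate NDVI windows; one predictor is shown here).
B = np.column_stack([np.ones_like(ndvi), ndvi])
(c0, ndvi_coef), *_ = np.linalg.lstsq(B, detrended, rcond=None)
```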

  13. Linear regression metamodeling as a tool to summarize and present simulation model results.

    PubMed

    Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M

    2013-10-01

    Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
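The metamodeling recipe in this record — run a probabilistic sensitivity analysis, then regress the outcome on *standardized* input draws so the intercept estimates the base case and the coefficients rank parameter influence — is compact enough to sketch. The two-parameter "cancer model" below is invented for illustration, not the authors' model.

```python
import numpy as np

# Hypothetical PSA: 10,000 draws of two parameters; outcome is net benefit ($).
rng = np.random.default_rng(3)
n = 10_000
p_cure = rng.normal(0.6, 0.05, n)        # probability of cure
cost = rng.normal(20_000, 2_000, n)      # treatment cost
nb = 100_000 * p_cure - cost + rng.normal(0, 500, n)  # simulated net benefit

# Standardize inputs so regression coefficients are comparable sensitivity measures.
Z = np.column_stack([(p_cure - p_cure.mean()) / p_cure.std(),
                     (cost - cost.mean()) / cost.std()])
X = np.column_stack([np.ones(n), Z])
coef, *_ = np.linalg.lstsq(X, nb, rcond=None)
intercept, beta_cure, beta_cost = coef
# intercept ~ base-case net benefit; |beta| ranks parameter influence per SD.
```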

  14. Structure-function relationships using spectral-domain optical coherence tomography: comparison with scanning laser polarimetry.

    PubMed

    Aptel, Florent; Sayous, Romain; Fortoul, Vincent; Beccat, Sylvain; Denis, Philippe

    2010-12-01

    To evaluate and compare the regional relationships between visual field sensitivity and retinal nerve fiber layer (RNFL) thickness as measured by spectral-domain optical coherence tomography (OCT) and scanning laser polarimetry. Prospective cross-sectional study. One hundred and twenty eyes of 120 patients (40 with healthy eyes, 40 with suspected glaucoma, and 40 with glaucoma) were tested on Cirrus-OCT, GDx VCC, and standard automated perimetry. Raw data on RNFL thickness were extracted for 256 peripapillary sectors of 1.40625 degrees each for the OCT measurement ellipse and 64 peripapillary sectors of 5.625 degrees each for the GDx VCC measurement ellipse. Correlations between peripapillary RNFL thickness in 6 sectors and visual field sensitivity in the 6 corresponding areas were evaluated using linear and logarithmic regression analysis. Receiver operating curve areas were calculated for each instrument. With spectral-domain OCT, the correlations (r²) between RNFL thickness and visual field sensitivity ranged from 0.082 (nasal RNFL and corresponding visual field area, linear regression) to 0.726 (supratemporal RNFL and corresponding visual field area, logarithmic regression). By comparison, with GDx-VCC, the correlations ranged from 0.062 (temporal RNFL and corresponding visual field area, linear regression) to 0.362 (supratemporal RNFL and corresponding visual field area, logarithmic regression). In pairwise comparisons, these structure-function correlations were generally stronger with spectral-domain OCT than with GDx VCC and with logarithmic regression than with linear regression. The largest areas under the receiver operating curve were seen for OCT superior thickness (0.963 ± 0.022; P < .001) in eyes with glaucoma and for OCT average thickness (0.888 ± 0.072; P < .001) in eyes with suspected glaucoma.
The structure-function relationship was significantly stronger with spectral-domain OCT than with scanning laser polarimetry, and was better expressed logarithmically than linearly. Measurements with these 2 instruments should not be considered to be interchangeable. Copyright © 2010 Elsevier Inc. All rights reserved.

  15. A Simulation-Based Comparison of Several Stochastic Linear Regression Methods in the Presence of Outliers.

    ERIC Educational Resources Information Center

    Rule, David L.

    Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…

  16. Unit Cohesion and the Surface Navy: Does Cohesion Affect Performance

    DTIC Science & Technology

    1989-12-01

    v. 68, 1968. Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. Rand Corporation R-2607...Neter, J., Wasserman, W., and Kutner, M. H., Applied Linear Regression Models, 2d ed., Boston, MA: Irwin, 1989. SAS User’s Guide: Basics, Version 5 ed

  17. Comparison of Selection Procedures and Validation of Criterion Used in Selection of Significant Control Variates of a Simulation Model

    DTIC Science & Technology

    1990-03-01

    and M.H. Kutner. Applied Linear Regression Models. Homewood IL: Richard D. Irwin Inc., 1983. Pritsker, A. Alan B. Introduction to Simulation and SLAM...Control Variates in Simulation," European Journal of Operational Research, 42: (1989). Neter, J., W. Wasserman, and M.H. Kutner. Applied Linear Regression Models

  18. Comparing Regression Coefficients between Nested Linear Models for Clustered Data with Generalized Estimating Equations

    ERIC Educational Resources Information Center

    Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer

    2013-01-01

    Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…

  19. Calibrated Peer Review for Interpreting Linear Regression Parameters: Results from a Graduate Course

    ERIC Educational Resources Information Center

    Enders, Felicity B.; Jenkins, Sarah; Hoverman, Verna

    2010-01-01

    Biostatistics is traditionally a difficult subject for students to learn. While the mathematical aspects are challenging, it can also be demanding for students to learn the exact language to use to correctly interpret statistical results. In particular, correctly interpreting the parameters from linear regression is both a vital tool and a…

  20. Some Applied Research Concerns Using Multiple Linear Regression Analysis.

    ERIC Educational Resources Information Center

    Newman, Isadore; Fraas, John W.

    The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…

  1. Using Simple Linear Regression to Assess the Success of the Montreal Protocol in Reducing Atmospheric Chlorofluorocarbons

    ERIC Educational Resources Information Center

    Nelson, Dean

    2009-01-01

    Following the Guidelines for Assessment and Instruction in Statistics Education (GAISE) recommendation to use real data, an example is presented in which simple linear regression is used to evaluate the effect of the Montreal Protocol on atmospheric concentration of chlorofluorocarbons. This simple set of data, obtained from a public archive, can…

  2. Quantum State Tomography via Linear Regression Estimation

    PubMed Central

    Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan

    2013-01-01

    A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d⁴), where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519
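The idea of LRE — turning state reconstruction into ordinary least squares — is easiest to see for a single qubit, where ρ = (I + r·σ)/2 and the probability of a +1 outcome along a measurement axis n is p = (1 + n·r)/2, so 2p − 1 is linear in the Bloch vector r. The sketch below is a minimal single-qubit illustration with invented axes and shot counts, not the paper's general d-dimensional construction.

```python
import numpy as np

# Reconstruct a Bloch vector r from +1-outcome frequencies along several axes:
# the linear model is 2*p - 1 = N @ r, solved by least squares.
rng = np.random.default_rng(4)
r_true = np.array([0.3, -0.2, 0.8])           # Bloch vector of the unknown state

axes = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                 [1, 1, 0], [1, 0, 1]], dtype=float)
axes /= np.linalg.norm(axes, axis=1, keepdims=True)

shots = 100_000
p_hat = np.array([rng.binomial(shots, (1 + a @ r_true) / 2) / shots for a in axes])

r_est, *_ = np.linalg.lstsq(axes, 2 * p_hat - 1, rcond=None)
```

Because the estimator is a fixed linear map of measured frequencies, its cost scales polynomially in dimension, which is the source of the speed advantage over iterative maximum-likelihood fitting.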

  3. Applications of statistics to medical science, III. Correlation and regression.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.

  4. A phenomenological biological dose model for proton therapy based on linear energy transfer spectra.

    PubMed

    Rørvik, Eivind; Thörnqvist, Sara; Stokkevåg, Camilla H; Dahle, Tordis J; Fjaera, Lars Fredrik; Ytre-Hauge, Kristian S

    2017-06-01

    The relative biological effectiveness (RBE) of protons varies with the radiation quality, quantified by the linear energy transfer (LET). Most phenomenological models employ a linear dependency of the dose-averaged LET (LETd) to calculate the biological dose. However, several experiments have indicated a possible non-linear trend. Our aim was to investigate if biological dose models including non-linear LET dependencies should be considered, by introducing a LET spectrum based dose model. The RBE-LET relationship was investigated by fitting of polynomials from 1st to 5th degree to a database of 85 data points from aerobic in vitro experiments. We included both unweighted and weighted regression, the latter taking into account experimental uncertainties. Statistical testing was performed to decide whether higher degree polynomials provided better fits to the data as compared to lower degrees. The newly developed models were compared to three published LETd based models for a simulated spread out Bragg peak (SOBP) scenario. The statistical analysis of the weighted regression analysis favored a non-linear RBE-LET relationship, with the quartic polynomial found to best represent the experimental data (P = 0.010). The results of the unweighted regression analysis were on the borderline of statistical significance for non-linear functions (P = 0.053), and with the current database a linear dependency could not be rejected. For the SOBP scenario, the weighted non-linear model estimated a similar mean RBE value (1.14) compared to the three established models (1.13-1.17). The unweighted model calculated a considerably higher RBE value (1.22). The analysis indicated that non-linear models could give a better representation of the RBE-LET relationship. However, this is not decisive, as inclusion of the experimental uncertainties in the regression analysis had a significant impact on the determination and ranking of the models. As differences between the models were observed for the SOBP scenario, both non-linear LET spectrum- and linear LETd based models should be further evaluated in clinically realistic scenarios. © 2017 American Association of Physicists in Medicine.
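The record's central methodological point — that weighting by experimental uncertainty changes which polynomial degree wins — can be sketched with `np.polyfit` and its weight argument. The data below are simulated with heteroscedastic errors; the curve, error model, and point count are illustrative assumptions, not the 85-point database.

```python
import numpy as np

# Hypothetical RBE-LET data: a mildly non-linear curve with larger errors
# at high LET; compare weighted polynomial fits of increasing degree.
rng = np.random.default_rng(5)
let = np.linspace(1, 20, 85)                       # LET values (keV/um)
rbe_true = 1.0 + 0.02 * let + 0.01 * let**2        # non-linear ground truth
sigma = 0.05 + 0.02 * let                          # experimental uncertainties
rbe = rbe_true + rng.normal(0, sigma)

fits = {}
for deg in (1, 2, 3):
    coef = np.polyfit(let, rbe, deg, w=1 / sigma)  # weights ~ 1/uncertainty
    resid = (rbe - np.polyval(coef, let)) / sigma
    fits[deg] = np.sum(resid**2)                   # weighted chi-square
```

In the paper, the degree comparison is done with formal statistical tests rather than raw chi-square; the chi-square values here only show that higher degrees cannot fit worse, which is why a penalizing test is needed.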

  5. Proton radius from electron scattering data

    NASA Astrophysics Data System (ADS)

    Higinbotham, Douglas W.; Kabir, Al Amin; Lin, Vincent; Meekins, David; Norum, Blaine; Sawatzky, Brad

    2016-05-01

    Background: The proton charge radius extracted from recent muonic hydrogen Lamb shift measurements is significantly smaller than that extracted from atomic hydrogen and electron scattering measurements. The discrepancy has become known as the proton radius puzzle. Purpose: In an attempt to understand the discrepancy, we review high-precision electron scattering results from Mainz, Jefferson Lab, Saskatoon, and Stanford. Methods: We make use of stepwise regression techniques using the F test as well as the Akaike information criterion to systematically determine the predictive variables to use for a given set and range of electron scattering data as well as to provide multivariate error estimates. Results: Starting with the precision, low four-momentum transfer (Q²) data from Mainz (1980) and Saskatoon (1974), we find that a stepwise regression of the Maclaurin series using the F test as well as the Akaike information criterion justify using a linear extrapolation which yields a value for the proton radius that is consistent with the result obtained from muonic hydrogen measurements. Applying the same Maclaurin series and statistical criteria to the 2014 Rosenbluth results on GE from Mainz, we again find that the stepwise regression tends to favor a radius consistent with the muonic hydrogen radius but produces results that are extremely sensitive to the range of data included in the fit. Making use of the high-Q² data on GE to select functions which extrapolate to high Q², we find that a Padé (N = M = 1) statistical model works remarkably well, as does a dipole function with a 0.84 fm radius, GE(Q²) = (1 + Q²/0.66 GeV²)⁻².
    Conclusions: Rigorous applications of stepwise regression techniques and multivariate error estimates result in the extraction of a proton charge radius that is consistent with the muonic hydrogen result of 0.84 fm, either from linear extrapolation of the extremely-low-Q² data or by use of the Padé approximant for extrapolation using a larger range of data. Thus, based on a purely statistical analysis of electron scattering data, we conclude that the electron scattering results and the muonic hydrogen results are consistent. It is the atomic hydrogen results that are the outliers.
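The closing numbers of this record check out arithmetically: for a dipole form factor GE(Q²) = (1 + Q²/Λ²)⁻², the charge radius is ⟨r²⟩ = −6 dGE/dQ²|₀ = 12/Λ², so Λ² = 0.66 GeV² gives r ≈ 0.84 fm after converting GeV⁻¹ to fm with ħc.

```python
import numpy as np

# Charge radius implied by the abstract's dipole parameter 0.66 GeV^2:
# <r^2> = 12 / Lambda^2 (natural units), converted to fm via hbar*c.
HBARC = 0.19733          # GeV*fm
lam2 = 0.66              # GeV^2, dipole parameter from the abstract
radius_fm = np.sqrt(12.0 / lam2) * HBARC   # ~0.84 fm, the muonic hydrogen value
```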

  6. Regression of non-linear coupling of noise in LIGO detectors

    NASA Astrophysics Data System (ADS)

    Da Silva Costa, C. F.; Billman, C.; Effler, A.; Klimenko, S.; Cheng, H.-P.

    2018-03-01

    In 2015, after their upgrade, the advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detectors started acquiring data. The effort to improve their sensitivity has never stopped since then. The goal to achieve design sensitivity is challenging. Environmental and instrumental noise couple to the detector output with different, linear and non-linear, coupling mechanisms. The noise regression method we use is based on the Wiener–Kolmogorov filter, which uses witness channels to make noise predictions. We present here how this method helped to determine complex non-linear noise couplings in the output mode cleaner and in the mirror suspension system of the LIGO detector.
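The paper's subject is *non-linear* couplings, but the Wiener-filter core it builds on is linear regression of the detector output on lagged witness channels. The toy below shows only that linear core, with an invented coupling filter and channel lengths; it is a sketch of the technique, not the LIGO pipeline.

```python
import numpy as np

# Toy witness-channel noise regression: the output contains a filtered copy of
# a witness channel; estimate an FIR filter by least squares (a time-domain
# Wiener filter) and subtract the predicted noise.
rng = np.random.default_rng(6)
n, taps = 20_000, 8
witness = rng.normal(size=n)
h_true = np.array([0.5, -0.3, 0.2, 0.1])             # unknown coupling filter
signal = rng.normal(scale=0.1, size=n)               # what we want to keep
output = signal + np.convolve(witness, h_true)[:n]

# Lagged design matrix: column k holds witness delayed by k samples.
X = np.column_stack([np.roll(witness, k) for k in range(taps)])
X[:taps, :] = 0.0                                    # drop wrapped-around samples
h_est, *_ = np.linalg.lstsq(X, output, rcond=None)
cleaned = output - X @ h_est                         # noise-subtracted output
```

For the non-linear couplings the paper targets (e.g. bilinear mixing of two witnesses), the design matrix would be extended with products of witness channels, but the least-squares step is unchanged.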

  7. QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions.

    PubMed

    Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan

    2012-12-01

    A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere polybutadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (log k_w). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.

  8. Linear and nonlinear genetic relationships between type traits and productive life in US dairy goats.

    PubMed

    Castañeda-Bustos, V J; Montaldo, H H; Valencia-Posadas, M; Shepard, L; Pérez-Elizalde, S; Hernández-Mendo, O; Torres-Hernández, G

    2017-02-01

    Linear or nonlinear genetic relationships between productive life and functional productive life at 72 mo, with final score (SCO), stature, strength, dairyness (DAI), teat diameter, rear legs (side view), rump angle, rump width (RUW), fore udder attachment (FUA), rear udder height, rear udder arch, udder depth (UDD), suspensory ligament (SUS), and teat placement, as well as heritabilities and correlations were estimated from multibreed US dairy goat records. Productive life was defined as the total days in production until 72 mo of age (PL72) for goats having the opportunity to express the trait. Functional productive life (FPL72) was analyzed by incorporating first lactation milk yield, fat yield, protein yield, and SCO in the statistical model. Heritabilities and correlations were estimated using linear mixed models with pedigree additive genetic relationships and ASReml software. Nonlinearity of genetic relationships was assessed based on second-degree polynomial (quadratic) regression models, with the breeding values of PL72 or FPL72 as responses and the breeding values for each type trait (linear and quadratic) as predictor variables. Heritability estimates were 0.19, 0.14, 0.18, 0.20, 0.14, 0.07, 0.28, 0.20, 0.15, 0.13, 0.25, 0.18, 0.20, 0.21, 0.21, and 0.32 for PL72, FPL72, SCO, stature, strength, DAI, teat diameter, rear legs, rump angle, RUW, FUA, rear udder height, rear udder arch, UDD, SUS, and teat placement, respectively. The type traits SCO, RUW, and FUA were the most correlated with PL72 and FPL72, so these may be used as selection criteria to increase longevity in dairy goats. An increase in the coefficient of determination >1% for the second degree, compared with that for the linear model for either PL72 or FPL72, was taken as evidence of a nonlinear genetic relationship. 
Using this criterion, PL72 showed maximum values at intermediate scores in DAI, UDD, and RUW, and maximum values at extreme scores in FUA and SUS, whereas FPL72 showed maximum values at intermediate scores in DAI and UDD, and maximum values at extreme scores in FUA, RUW, and SUS. Selecting for increased SCO, RUW, and FUA will lead to an increase of FPL72 in goats. Consideration of nonlinear relationships between DAI, FUA, RUW, SUS, and UDD may help in the design of more efficient breeding programs for dairy goats using conformation traits. The Authors. Published by the Federation of Animal Science Societies and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
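The nonlinearity criterion in this record rests on a second-degree polynomial regression of one set of breeding values on another: a negative quadratic coefficient places the optimum at an intermediate score, at the vertex −b₁/(2b₂). A minimal sketch with invented breeding values:

```python
import numpy as np

# Illustrative quadratic regression of longevity breeding values on a type
# trait score; a negative quadratic term implies an intermediate optimum.
rng = np.random.default_rng(7)
score = rng.uniform(1, 9, 500)                       # type-trait breeding values
pl = 100 + 40 * score - 4 * score**2 + rng.normal(0, 5, 500)

b2, b1, b0 = np.polyfit(score, pl, 2)                # highest degree first
optimum = -b1 / (2 * b2)                             # vertex of the parabola
```

In the study, evidence for nonlinearity additionally required the quadratic model to lift the coefficient of determination by more than one percentage point over the linear fit.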

  9. The linear relationship between cigarette tar and nicotine yields: regulatory implications for smoke constituent ratios.

    PubMed

    St Charles, F K; Cook, C J; Clayton, P M

    2011-02-01

    Cigarette smoke analyte yields are often expressed as ratios relative to tar or nicotine yields, usually to compare different products or to estimate human uptake of smoke in relation to nicotine uptake measurements. The method, however, can lead to distorted interpretations, especially in the case of ratios from ultra-low tar yield cigarettes. In brief, as tar yields decrease below the 5–6 mg per cigarette range, the tar-to-nicotine ratio (TNR) decreases rapidly in a non-linear fashion. If, however, the nicotine yield, rather than the ratio, is plotted versus the tar yield, the non-linearity disappears and a straight line is obtained, with a slight positive intercept for nicotine on the ordinate. Unlike the ratio, the slope appears to depend only on the concentration of the nicotine in the blend and does not appear to vary with smoking parameters such as puff volume, puff interval or length smoked or with cigarette design parameters such as length, circumference or the amount of filtration or filter ventilation. Therefore, such a slope is analogous to the TNR although, unlike that ratio, it is invariant. Even more simply, the concentration of the nicotine in the blend, at least for American blend-style cigarettes, provides a similar index.
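The abstract's argument reduces to simple algebra: if nicotine = a + b·tar with a small positive intercept a, then the tar-to-nicotine ratio tar/(a + b·tar) approaches the constant 1/b at high tar but drops sharply as tar falls below a few mg. The numbers below are illustrative assumptions, not measured yields.

```python
import numpy as np

# Why the ratio distorts at ultra-low tar: a linear line with a positive
# intercept makes TNR = tar / nicotine non-constant only at small tar.
a, b = 0.05, 0.085                 # illustrative intercept (mg) and slope
tar = np.array([1.0, 3.0, 6.0, 10.0, 15.0])
nicotine = a + b * tar
tnr = tar / nicotine               # rises toward the asymptote 1/b
```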

  10. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  11. SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

    PubMed Central

    Zhu, Liping; Huang, Mian; Li, Runze

    2012-01-01

    This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536
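The building block behind this record is the check (pinball) loss: the constant that minimizes it is exactly the τ-th sample quantile, and quantile *regression* replaces the constant with a linear predictor. A minimal numeric sketch of that fact, with simulated data:

```python
import numpy as np

# The tau-pinball loss; its minimizing constant is the tau-th sample quantile.
def pinball(u, tau):
    return np.mean(np.where(u >= 0, tau * u, (tau - 1) * u))

rng = np.random.default_rng(8)
y = rng.normal(10, 2, 5_000)
tau = 0.25
grid = np.linspace(5, 15, 2001)
best = grid[np.argmin([pinball(y - c, tau) for c in grid])]
# best should sit at (approximately) the 25th percentile of y
```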

  12. Mended chiral symmetry and the linear sigma model in one-loop order

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scadron, M.D.

    1992-02-28

    In this paper it is shown that the linear σ-model in one-loop order in the chiral limit recovers the meson masses mπ = 0, mσ = 2m_q (NJL), mρ = √2 gρ fπ (KSRF), along with the couplings gσππ = mσ²/2fπ, gρππ = gρ (VMD universality) and Weinberg's mended chiral symmetry decay width relation Γσ = (9/2)Γρ. The linear σ-model combined quark and meson loops also properly predict the radiative decays π⁰ → 2γ, π → eνγ and δ⁰(983) → 2γ.

  13. Why is it so difficult to determine the yield of indoor cannabis plantations? A case study from the Netherlands.

    PubMed

    Vanhove, Wouter; Maalsté, Nicole; Van Damme, Patrick

    2017-07-01

    Together, the Netherlands and Belgium are the largest indoor cannabis producing countries in Europe. In both countries, legal prosecution procedure of convicted illicit cannabis growers usually includes recovery of the profits gained. However, it is not easy to make a reliable estimation of the latter profits, due to the wide range of factors that determine indoor cannabis yields and eventual selling prices. In the Netherlands, since 2005, a reference model is used that assumes a constant yield (g) per plant for a given indoor cannabis plant density. Later, in 2011, a new model was developed in Belgium for yield estimation of Belgian indoor cannabis plantations that assumes a constant yield per m² of growth surface, provided that a number of growth conditions are met. Indoor cannabis plantations in the Netherlands and Belgium share similar technical characteristics. As a result, for indoor cannabis plantations in both countries, both aforementioned models should produce similar yield estimates. By means of a real-case study from the Netherlands, we show that the reliability of both models is hampered by a number of flaws and unmet preconditions. The Dutch model is based on a regression equation that makes use of ill-defined plant development stages, assumes a linear plant growth, does not discriminate between different plantation size categories and does not include other important yield determining factors (such as fertilization). The Belgian model addresses some of the latter shortcomings, but its applicability is constrained by a number of pre-conditions including plantation size between 50 and 1000 plants; cultivation in individual pots with peat soil; 600 W (electrical power) assimilation lamps; constant temperature between 20°C and 30°C; adequate fertilizer application and plants unaffected by pests and diseases.
Judiciary in both the Netherlands and Belgium require robust indoor cannabis yield models for adequate legal prosecution of illicit indoor cannabis growth operations. To that aim, the current models should be optimized whereas the validity of their application should be examined case by case. Copyright © 2017 Elsevier B.V. All rights reserved.

  14. Turning maneuvers in sharks: Predicting body curvature from axial morphology.

    PubMed

    Porter, Marianne E; Roque, Cassandra M; Long, John H

    2009-08-01

    Given the diversity of vertebral morphologies among fishes, it is tempting to propose causal links between axial morphology and body curvature. We propose that shape and size of the vertebrae, intervertebral joints, and the body will more accurately predict differences in body curvature during swimming rather than a single meristic such as total vertebral number alone. We examined the correlation between morphological features and maximum body curvature seen during routine turns in five species of shark: Triakis semifasciata, Heterodontus francisci, Chiloscyllium plagiosum, Chiloscyllium punctatum, and Hemiscyllium ocellatum. We quantified overall body curvature using three different metrics. From a separate group of size-matched individuals, we measured 16 morphological features from precaudal vertebrae and the body. As predicted, a larger pool of morphological features yielded a more robust prediction of maximal body curvature than vertebral number alone. Stepwise linear regression showed that up to 11 features were significant predictors of the three measures of body curvature, yielding highly significant multiple regressions with r² values of 0.523, 0.537, and 0.584. The second moment of area of the centrum was always the best predictor, followed by either centrum length or transverse height. Ranking as the fifth most important variable in three different models, the body's total length, fineness ratio, and width were the most important non-vertebral morphologies. Without considering the effects of muscle activity, these correlations suggest a dominant role for the vertebral column in providing the passive mechanical properties of the body that control, in part, body curvature during swimming. (c) 2009 Wiley-Liss, Inc.
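    As a rough illustration of the stepwise-selection idea used in this record, the following sketch performs greedy forward selection on synthetic data (an R²-based criterion stands in for the study's significance tests; all names and data are illustrative):

```python
import numpy as np

def forward_stepwise(X, y, max_features=3):
    """Greedy forward selection: at each step add the predictor that
    most increases R^2 of an ordinary least-squares fit."""
    n, p = X.shape
    selected, remaining = [], list(range(p))

    def r2(cols):
        A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

    while remaining and len(selected) < max_features:
        best = max(remaining, key=lambda c: r2(selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected, r2(selected)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(scale=0.1, size=200)
cols, score = forward_stepwise(X, y, max_features=2)
print(sorted(cols), round(score, 3))
```

In practice, stepwise procedures usually add or drop terms based on F-tests or p-values rather than raw R², but the greedy structure is the same.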

  15. Multiple correlation analyses of metabolic and endocrine profiles with fertility in primiparous and multiparous cows.

    PubMed

    Wathes, D C; Bourne, N; Cheng, Z; Mann, G E; Taylor, V J; Coffey, M P

    2007-03-01

    Results from 4 studies were combined (representing a total of 500 lactations) to investigate the relationships between metabolic parameters and fertility in dairy cows. Information was collected on blood metabolic traits and body condition score at 1 to 2 wk prepartum and at 2, 4, and 7 wk postpartum. Fertility traits were days to commencement of luteal activity, days to first service, days to conception, and failure to conceive. Primiparous and multiparous cows were considered separately. Initial linear regression analyses were used to determine relationships among fertility, metabolic, and endocrine traits at each time point. All metabolic and endocrine traits significantly related to fertility were included in stepwise multiple regression analyses alone (model 1), including peak milk yield and interval to commencement of luteal activity (model 2), and with the further addition of dietary group (model 3). In multiparous cows, extended calving to conception intervals were associated prepartum with greater concentrations of leptin and lesser concentrations of nonesterified fatty acids and urea, and postpartum with reduced insulin-like growth factor-I at 2 wk, greater urea at 7 wk, and greater peak milk yield. In primiparous cows, extended calving to conception intervals were associated with more body condition and more urea prepartum, elevated urea postpartum, and more body condition loss by 7 wk. In conclusion, some metabolic measurements were associated with poorer fertility outcomes. Relationships between fertility and metabolic and endocrine traits varied both according to the lactation number of the cow and with the time relative to calving.

  16. Estimation of Relative Economic Weights of Hanwoo Carcass Traits Based on Carcass Market Price

    PubMed Central

    Choy, Yun Ho; Park, Byoung Ho; Choi, Tae Jung; Choi, Jae Gwan; Cho, Kwang Hyun; Lee, Seung Soo; Choi, You Lim; Koh, Kyung Chul; Kim, Hyo Sun

    2012-01-01

    The objective of this study was to estimate economic weights of Hanwoo carcass traits that can be used to build economic selection indexes for selection of seedstocks. Data from carcass measures for determining beef yield and quality grades were collected and provided by the Korean Institute for Animal Products Quality Evaluation (KAPE). Out of 1,556,971 records, 476,430 records collected from 13 abattoirs from 2008 to 2010 after deletion of outlying observations were used to estimate relative economic weights of bid price per kg carcass weight on cold carcass weight (CW), eye muscle area (EMA), backfat thickness (BF) and marbling score (MS) and the phenotypic relationships among component traits. Carcass price tended to increase linearly as yield grade or quality grade increased, whether considered singly or in combination. Partial regression coefficients for MS, EMA, BF, and for CW in original scales were +948.5 won/score, +27.3 won/cm², −95.2 won/mm and +7.3 won/kg when all three sex categories were taken into account. Among the four grade-determining traits, the relative economic weight of MS was the greatest. Variations in partial regression coefficients by sex category were large, but the trends in relative weights for each carcass measure were similar. Relative economic weights of the four traits in integer values when standardized measures were fit into a covariance model were +4:+1:−1:+1 for MS:EMA:BF:CW. Further research is required to account for the cost of production per unit carcass weight or per unit production under different economic situations. PMID:25049531

  17. The role of enzyme and substrate concentration in the evaluation of serum angiotensin converting enzyme (ACE) inhibition by enalaprilat in vitro.

    PubMed

    Weisser, K; Schloos, J

    1991-10-09

    The relationship between serum angiotensin converting enzyme (ACE) activity and concentration of the ACE inhibitor enalaprilat was determined in vitro in the presence of different concentrations (S = 4-200 mM) of the substrate Hip-Gly-Gly. From Henderson plots, a competitive tight-binding relationship between enalaprilat and serum ACE was found yielding a value of approximately 5 nM for serum ACE concentration (Et) and an inhibition constant (Ki) for enalaprilat of approximately 0.1 nM. A plot of reaction velocity (Vi) versus total inhibitor concentration (It) exhibited a non-parallel shift of the inhibition curve to the right with increasing S. This was reflected by apparent Hill coefficients greater than 1 when the commonly used inhibitory sigmoid concentration-effect model (Emax model) was applied to the data. Slopes greater than 1 were obviously due to discrepancies between the free inhibitor concentration (If) present in the assay and It plotted on the abscissa and could, therefore, be indicators of tight-binding conditions. Thus, the sigmoid Emax model leads to an overestimation of Ki. Therefore, a modification of the inhibitory sigmoid Emax model (called "Emax tight model") was applied, which accounts for the depletion of If by binding, refers to It and allows estimation of the parameters Et and IC50f (free concentration of inhibitor when 50% inhibition occurs) using non-linear regression analysis. This model could describe the non-symmetrical shape of the inhibition curves and the results for Ki and Et correlated very well with those derived from the Henderson plots. The latter findings confirm that the degree of ACE inhibition measured in vitro is, in fact, dependent on the concentration of substrate and enzyme present in the assay. This is of importance not only for the correct evaluation of Ki but also for the interpretation of the time course of serum ACE inhibition measured ex vivo. 
The non-linear model has some advantages over the linear Henderson equation: it is directly applicable without conversion of the data and avoids the stochastic dependency of the variables, allowing non-linear regression of all data points contributing with the same weight.
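    The tight-binding competitive inhibition discussed in this record is conventionally described by the Morrison equation. The sketch below, on synthetic data built around the abstract's approximate values (Et ≈ 5 nM, Ki ≈ 0.1 nM), recovers both parameters by non-linear regression; it is a generic illustration, not the record's exact "Emax tight model":

```python
import numpy as np
from scipy.optimize import curve_fit

def morrison(It, Et, Ki):
    """Fractional velocity vi/v0 under tight-binding (Morrison) kinetics,
    as a function of total inhibitor concentration It."""
    b = Et + It + Ki
    return 1 - (b - np.sqrt(b**2 - 4 * Et * It)) / (2 * Et)

rng = np.random.default_rng(1)
It = np.linspace(0, 20, 30)            # total inhibitor, nM (synthetic)
true_Et, true_Ki = 5.0, 0.1            # approximate values from the abstract
v = morrison(It, true_Et, true_Ki) + rng.normal(scale=0.01, size=It.size)

# Non-linear regression recovers enzyme concentration and Ki jointly
(Et_hat, Ki_hat), _ = curve_fit(morrison, It, v, p0=(1.0, 1.0),
                                bounds=(1e-6, np.inf))
print(round(Et_hat, 2), round(Ki_hat, 3))
```

Note how the fit refers to total inhibitor It throughout, which is exactly what distinguishes the tight-binding treatment from a naive sigmoid Emax fit against It as if it were free inhibitor.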

  18. Prediction of siRNA potency using sparse logistic regression.

    PubMed

    Hu, Wei; Hu, John

    2014-06-01

    RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions that matches the prediction accuracy of the LASSO-based linear model. The weights in our new model share the same general trend as those in the previous model, but only 25 of a total of 84 weights are nonzero, a 54% reduction compared to the previous model. In contrast to the LASSO-based linear model, our model suggests that only a few positions are influential on the efficacy of the siRNA, namely the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task that LASSO does not accomplish well because of the high dimensionality involved. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
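    A minimal sketch of sparse (L1-penalized) logistic regression, the technique this record applies to siRNA features; the data here are synthetic binary indicators, with 84 weights chosen only to echo the model size described above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 500, 84                         # 84 weights, as in the model described
X = rng.integers(0, 2, size=(n, p)).astype(float)   # binary position features
w_true = np.zeros(p)
w_true[[0, 1, 40, 82, 83]] = [2.0, -2.0, 1.5, 2.0, -1.5]  # few true effects
y = (X @ w_true + rng.normal(size=n) > 0).astype(int)

# The L1 penalty drives most coefficients exactly to zero
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
nonzero = int(np.sum(clf.coef_ != 0))
print(nonzero, "of", p, "weights are nonzero")
```

Tightening `C` (stronger penalty) prunes more weights; the interpretability gain is that the surviving nonzero positions point to the influential features.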

  19. Relationship between dairy cow genetic merit and profit on commercial spring calving dairy farms.

    PubMed

    Ramsbottom, G; Cromie, A R; Horan, B; Berry, D P

    2012-07-01

    Because not all animal factors influencing profitability can be included in total merit breeding indices for profitability, the association between animal total merit index and true profitability, taking cognisance of all factors associated with costs and revenues, is generally not known. One method to estimate such associations is at the herd level, associating herd average genetic merit with herd profitability. The objective of this study was to primarily relate herd average genetic merit for a range of traits, including the Irish total merit index, with indicators of performance, including profitability, using correlation and multiple regression analyses. Physical, genetic and financial performance data from 1131 Irish seasonal calving pasture-based dairy farms were available following edits; data on some herds were available for more than 1 year of the 3-year study period (2007 to 2009). Herd average economic breeding index (EBI) was associated with reduced herd average phenotypic milk yield but with greater milk composition, resulting in higher milk prices. Moderate positive correlations (0.26 to 0.61) existed between genetic merit for an individual trait and average herd performance for that trait (e.g. genetic merit for milk yield and average per cow milk yield). Following adjustment for year, stocking rate, herd size and quantity of purchased feed in the multiple regression analysis, average herd EBI was positively and linearly associated with net margin per cow and per litre as well as gross revenue output per cow and per litre. The change in net margin per cow per unit change in the total merit index was €1.94 (s.e. = 0.42), which was not different from the expectation of €2. 
This study, based on a large data set of commercial herds with accurate information on profitability and genetic merit, confirms that, after accounting for confounding factors, the change in herd profitability per unit change in herd genetic merit for the total merit index is within expectations.

  20. Genetic analysis of milk production traits of Tunisian Holsteins using random regression test-day model with Legendre polynomials

    PubMed Central

    2018-01-01

    Objective: The objective of this study was to estimate genetic parameters of milk, fat, and protein yields within and across lactations in Tunisian Holsteins using a random regression test-day (TD) model. Methods: A random regression multiple trait multiple lactation TD model was used to estimate genetic parameters in the Tunisian dairy cattle population. Data were TD yields of milk, fat, and protein from the first three lactations. Random regressions were modeled with third-order Legendre polynomials for the additive genetic, and permanent environment effects. Heritabilities, and genetic correlations were estimated by Bayesian techniques using the Gibbs sampler. Results: All variance components tended to be high in the beginning and the end of lactations. Additive genetic variances for milk, fat, and protein yields were the lowest and were the least variable compared to permanent variances. Heritability values tended to increase with parity. Estimates of heritabilities for 305-d yield-traits were low to moderate, 0.14 to 0.2, 0.12 to 0.17, and 0.13 to 0.18 for milk, fat, and protein yields, respectively. Within-parity, genetic correlations among traits were up to 0.74. Genetic correlations among lactations for the yield traits were relatively high and ranged from 0.78±0.01 to 0.82±0.03, between the first and second parities, from 0.73±0.03 to 0.8±0.04 between the first and third parities, and from 0.82±0.02 to 0.84±0.04 between the second and third parities. Conclusion: These results are comparable to previously reported estimates on the same population, indicating that the adoption of a random regression TD model as the official genetic evaluation for production traits in Tunisia, as developed by most Interbull countries, is possible in the Tunisian Holsteins. PMID:28823122
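    The Legendre covariates underlying such random regression test-day models can be sketched as follows (days in milk rescaled to [-1, 1]; note that animal-breeding implementations often use normalized polynomials with an extra sqrt((2k+1)/2) factor, omitted here):

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(dim, dim_min=5, dim_max=305, order=3):
    """Legendre polynomial covariates P_0..P_order for days in milk (dim),
    standardized to [-1, 1] as in random regression test-day models."""
    x = -1 + 2 * (dim - dim_min) / (dim_max - dim_min)
    # each row of the identity matrix selects one polynomial P_k in turn
    return np.column_stack([legendre.legval(x, np.eye(order + 1)[k])
                            for k in range(order + 1)])

Z = legendre_covariates(np.array([5, 155, 305]))
print(Z.round(3))
```

A cow's additive genetic lactation curve is then Z @ a, where a holds that cow's four random regression coefficients.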

  1. Genetic analysis of milk production traits of Tunisian Holsteins using random regression test-day model with Legendre polynomials.

    PubMed

    Ben Zaabza, Hafedh; Ben Gara, Abderrahmen; Rekik, Boulbaba

    2018-05-01

    The objective of this study was to estimate genetic parameters of milk, fat, and protein yields within and across lactations in Tunisian Holsteins using a random regression test-day (TD) model. A random regression multiple trait multiple lactation TD model was used to estimate genetic parameters in the Tunisian dairy cattle population. Data were TD yields of milk, fat, and protein from the first three lactations. Random regressions were modeled with third-order Legendre polynomials for the additive genetic, and permanent environment effects. Heritabilities, and genetic correlations were estimated by Bayesian techniques using the Gibbs sampler. All variance components tended to be high in the beginning and the end of lactations. Additive genetic variances for milk, fat, and protein yields were the lowest and were the least variable compared to permanent variances. Heritability values tended to increase with parity. Estimates of heritabilities for 305-d yield-traits were low to moderate, 0.14 to 0.2, 0.12 to 0.17, and 0.13 to 0.18 for milk, fat, and protein yields, respectively. Within-parity, genetic correlations among traits were up to 0.74. Genetic correlations among lactations for the yield traits were relatively high and ranged from 0.78±0.01 to 0.82±0.03, between the first and second parities, from 0.73±0.03 to 0.8±0.04 between the first and third parities, and from 0.82±0.02 to 0.84±0.04 between the second and third parities. These results are comparable to previously reported estimates on the same population, indicating that the adoption of a random regression TD model as the official genetic evaluation for production traits in Tunisia, as developed by most Interbull countries, is possible in the Tunisian Holsteins.

  2. Broadband linearisation of high-efficiency power amplifiers

    NASA Technical Reports Server (NTRS)

    Kenington, Peter B.; Parsons, Kieran J.; Bennett, David W.

    1993-01-01

    A feedforward-based amplifier linearization technique is presented which is capable of yielding significant improvements in both linearity and power efficiency over conventional amplifier classes (e.g. class-A or class-AB). Theoretical and practical results are presented showing that class-C stages may be used for both the main and error amplifiers, yielding practical efficiencies well in excess of 30 percent, with theoretical efficiencies of much greater than 40 percent being possible. The levels of linearity achieved are sufficient for most satellite systems; if greater linearity is required, the technique may be used in addition to conventional pre-distortion techniques.

  3. Modeling long-term suspended-sediment export from an undisturbed forest catchment

    NASA Astrophysics Data System (ADS)

    Zimmermann, Alexander; Francke, Till; Elsenbeer, Helmut

    2013-04-01

    Most estimates of suspended sediment yields from humid, undisturbed, and geologically stable forest environments fall within a range of 5-30 t km⁻² a⁻¹. These low natural erosion rates in small headwater catchments (≤1 km²) support the common impression that a well-developed forest cover prevents surface erosion. Interestingly, those estimates originate exclusively from areas with prevailing vertical hydrological flow paths. Forest environments dominated by (near-) surface flow paths (overland flow, pipe flow, and return flow) and a fast response to rainfall, however, are not an exceptional phenomenon, yet only very few sediment yields have been estimated for these areas. Not surprisingly, even fewer long-term (≥10 years) records exist. In this contribution we present our latest research, which aims at quantifying long-term suspended-sediment export from an undisturbed rainforest catchment prone to frequent overland flow. A key aspect of our approach is the application of machine-learning techniques (Random Forest, Quantile Regression Forest), which allows not only the handling of non-Gaussian data, non-linear relations between predictors and response, and correlations between predictors, but also the assessment of prediction uncertainty. For the current study we provided the machine-learning algorithms exclusively with information from a high-resolution rainfall time series to reconstruct discharge and suspended sediment dynamics for a 21-year period. The significance of our results is threefold. First, our estimates clearly show that forest cover does not necessarily prevent erosion if wet antecedent conditions and large rainfalls coincide. During these situations, overland flow is widespread and sediment fluxes increase in a non-linear fashion due to the mobilization of new sediment sources. Second, our estimates indicate that annual suspended sediment yields of the undisturbed forest catchment show large fluctuations. 
Depending on the frequency of large events, annual suspended-sediment yield varies between 74-416 t km⁻² a⁻¹. Third, the estimated sediment yields exceed former benchmark values by an order of magnitude and provide evidence that the erosion footprint of undisturbed, forested catchments can be indistinguishable from that of sustainably managed, but hydrologically less responsive areas. Because of the susceptibility to soil loss we argue that any land use should be avoided in natural erosion hotspots.
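    A crude stand-in for the Quantile Regression Forest approach mentioned above: fit an ordinary random forest and take quantiles of the per-tree predictions. A true QRF weights training observations within leaves, so this per-tree spread is only an approximation; the rainfall-sediment relation and all data below are synthetic:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
rain = rng.gamma(2.0, 10.0, size=400)                  # synthetic rainfall predictor
sediment = 0.05 * rain**1.8 * rng.lognormal(0, 0.3, size=400)  # non-linear flux

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(rain.reshape(-1, 1), sediment)

# Approximate a prediction interval from the spread of per-tree predictions
x_new = np.array([[40.0]])
per_tree = np.array([t.predict(x_new)[0] for t in rf.estimators_])
lo, med, hi = np.quantile(per_tree, [0.1, 0.5, 0.9])
print(round(lo, 1), round(med, 1), round(hi, 1))
```

The interval (lo, hi) conveys the prediction uncertainty that a plain point forecast hides, which is the property the authors exploit for sediment reconstruction.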

  4. Disconcordance in Statistical Models of Bisphenol A and Chronic Disease Outcomes in NHANES 2003-08

    PubMed Central

    Casey, Martin F.; Neidell, Matthew

    2013-01-01

    Background: Bisphenol A (BPA), a high production chemical commonly found in plastics, has drawn great attention from researchers due to the substance’s potential toxicity. Using data from three National Health and Nutrition Examination Survey (NHANES) cycles, we explored the consistency and robustness of BPA’s reported effects on coronary heart disease and diabetes. Methods and Findings: We report the use of three different statistical models in the analysis of BPA: (1) logistic regression, (2) log-linear regression, and (3) dose-response logistic regression. In each variation, confounders were added in six blocks to account for demographics, urinary creatinine, source of BPA exposure, healthy behaviours, and phthalate exposure. Results were sensitive to the variations in functional form of our statistical models, but no single model yielded consistent results across NHANES cycles. Reported ORs were also found to be sensitive to inclusion/exclusion criteria. Further, observed effects, which were most pronounced in NHANES 2003-04, could not be explained away by confounding. Conclusions: Limitations in the NHANES data and a poor understanding of the mode of action of BPA have made it difficult to develop informative statistical models. Given the sensitivity of effect estimates to functional form, researchers should report results using multiple specifications with different assumptions about BPA measurement, thus allowing for the identification of potential discrepancies in the data. PMID:24223205

  5. Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis.

    PubMed

    Gianola, Daniel; Fariello, Maria I; Naya, Hugo; Schön, Chris-Carolin

    2016-10-13

    Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. Copyright © 2016 Gianola et al.
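    The generalized least-squares estimator at the heart of this result can be sketched with a toy marker data set (a simple marker-count G rather than the standard allele-frequency-scaled version; all values and the 0.3 marker effect are synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120
M = rng.integers(0, 3, size=(n, 300)).astype(float)   # marker counts (0/1/2)
Mc = M - M.mean(axis=0)
G = Mc @ Mc.T / 300                                   # genomic relationship matrix
V = 0.5 * G + 0.5 * np.eye(n)                         # phenotypic covariance

x = M[:, 0]                                           # marker tested as fixed
y = 0.3 * x + rng.multivariate_normal(np.zeros(n), V) # phenotype with 0.3 effect

# Generalized least squares: beta = (X' V^-1 X)^-1 X' V^-1 y
Vi = np.linalg.inv(V)
X = np.column_stack([np.ones(n), x])
beta = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)
print(beta.round(2))
```

The paper's point concerns whether the tested marker also enters G; this sketch only shows the GLS machinery to which that invariance argument applies.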

  6. Predictive and mechanistic multivariate linear regression models for reaction development

    PubMed Central

    Santiago, Celine B.; Guo, Jing-Yao

    2018-01-01

    Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711

  7. Adding a Parameter Increases the Variance of an Estimated Regression Function

    ERIC Educational Resources Information Center

    Withers, Christopher S.; Nadarajah, Saralees

    2011-01-01

    The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…

  8. Using nonlinear quantile regression to estimate the self-thinning boundary curve

    Treesearch

    Quang V. Cao; Thomas J. Dean

    2015-01-01

    The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...

  9. Simultaneous spectrophotometric determination of salbutamol and bromhexine in tablets.

    PubMed

    Habib, I H I; Hassouna, M E M; Zaki, G A

    2005-03-01

    Typical anti-mucolytic drugs called salbutamol hydrochloride and bromhexine sulfate encountered in tablets were determined simultaneously either by using linear regression at zero-crossing wavelengths of the first derivation of UV-spectra or by application of multiple linear partial least squares regression method. The results obtained by the two proposed mathematical methods were compared with those obtained by the HPLC technique.

  10. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    PubMed

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  11. Modeling the frequency of opposing left-turn conflicts at signalized intersections using generalized linear regression models.

    PubMed

    Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei

    2014-01-01

    The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binomial distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.

  12. Constituent concentrations, loads, and yields to Beaver Lake, Arkansas, water years 1999-2008

    USGS Publications Warehouse

    Bolyard, Susan E.; De Lanois, Jeanne L.; Green, W. Reed

    2010-01-01

    Beaver Lake is a large, deep-storage reservoir used as a drinking-water supply and considered a primary watershed of concern in the State of Arkansas. As such, information is needed to assess water quality, especially nutrient enrichment, nutrient-algal relations, turbidity, and sediment issues within the reservoir system. Water-quality samples were collected at three main inflows to Beaver Lake: the White River near Fayetteville, Richland Creek at Goshen, and War Eagle Creek near Hindsville. Water-quality samples collected over the period represented different flow conditions (from low to high). Constituent concentrations, flow-weighted concentrations, loads, and yields from White River, Richland Creek, and War Eagle Creek to Beaver Lake for water years 1999-2008 were documented for this report. Constituents include total ammonia plus organic nitrogen, dissolved nitrite plus nitrate nitrogen, dissolved orthophosphorus (soluble reactive phosphorus), total phosphorus, total nitrogen, dissolved organic carbon, total organic carbon, and suspended sediment. Linear regression models developed by the computer program S-LOADEST were used to estimate loads for each constituent for the 10-year period at each station. Constituent yields and flow-weighted concentrations for each of the three stations were calculated for the study. Constituent concentrations, loads, and yields varied with time and among the three tributaries contributing to Beaver Lake. These differences can result from differences in precipitation, land use, contributions of nutrients from point sources, and variations in basin size. Load and yield estimates varied yearly during the study period, water years 1999-2008, with the smallest nutrient and sediment loads and yields generally occurring in water year 2006 and the greatest in water year 2008, a year with record amounts of precipitation. 
Flow-weighted concentrations of most constituents were greater at War Eagle Creek near Hindsville than at White River near Fayetteville and Richland Creek at Goshen. Loads and yields of most constituents were greater at the War Eagle Creek and White River stations than at the Richland Creek station.
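    The load-estimation regressions referenced here (S-LOADEST) fit rating curves in log space; below is a simplified sketch with only a flow term and a Duan smearing correction for retransformation bias (the actual program also includes seasonal and time terms and other bias corrections; all data are synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
flow = rng.lognormal(3.0, 1.0, size=250)                  # streamflow samples
load = 0.2 * flow**1.3 * rng.lognormal(0, 0.4, size=250)  # constituent load

# Rating curve: ln(load) = b0 + b1 * ln(flow), fit by ordinary least squares
A = np.column_stack([np.ones(flow.size), np.log(flow)])
(b0, b1), *_ = np.linalg.lstsq(A, np.log(load), rcond=None)

# Duan smearing factor corrects the retransformation bias of exp()
resid = np.log(load) - A @ np.array([b0, b1])
smear = np.mean(np.exp(resid))
est = smear * np.exp(b0) * flow**b1                       # estimated loads
print(round(b1, 2), round(smear, 2))
```

Without the smearing factor, exponentiating the log-space predictions would systematically underestimate the mean load, which matters when daily estimates are summed to annual loads and yields.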

  13. Visual Field Outcomes for the Idiopathic Intracranial Hypertension Treatment Trial (IIHTT).

    PubMed

    Wall, Michael; Johnson, Chris A; Cello, Kimberly E; Zamba, K D; McDermott, Michael P; Keltner, John L

    2016-03-01

    The Idiopathic Intracranial Hypertension Treatment Trial (IIHTT) showed that acetazolamide provided a modest, significant improvement in mean deviation (MD). Here, we further analyze visual field changes over the 6-month study period. Of 165 subjects with mild visual loss in the IIHTT, 125 had perimetry at baseline and 6 months. We evaluated pointwise linear regression of visual sensitivity versus time to classify test locations in the worst MD (study) eye as improving or not; pointwise changes from baseline to month 6 in decibels; and clinical consensus of change from baseline to 6 months. The average study eye had 36 of 52 test locations with improving sensitivity over 6 months using pointwise linear regression, but differences between the acetazolamide and placebo groups were not significant. Pointwise results mostly improved in both treatment groups with the magnitude of the mean change within groups greatest and statistically significant around the blind spot and the nasal area, especially in the acetazolamide group. The consensus classification of visual field change from baseline to 6 months in the study eye yielded percentages (acetazolamide, placebo) of 7.2% and 17.5% worse, 35.1% and 31.7% with no change, and 56.1% and 50.8% improved; group differences were not statistically significant. In the IIHTT, compared to the placebo group, the acetazolamide group had a significant pointwise improvement in visual field function, particularly in the nasal and pericecal areas; the latter is likely due to reduction in blind spot size related to improvement in papilledema. (ClinicalTrials.gov number, NCT01003639.).

  14. Chromatographic behaviour predicts the ability of potential nootropics to permeate the blood-brain barrier.

    PubMed

    Farsa, Oldřich

    2013-01-01

    The log BB parameter is the logarithm of the ratio of a compound's equilibrium concentrations in the brain tissue versus the blood plasma. This parameter is a useful descriptor in assessing the ability of a compound to permeate the blood-brain barrier. The aim of this study was to develop a Hansch-type linear regression QSAR model that correlates the parameter log BB and the retention time of drugs and other organic compounds on a reversed-phase HPLC column containing an embedded amide moiety. The retention time was expressed by the capacity factor log k'. The second aim was to estimate the brain's absorption of 2-(azacycloalkyl)acetamidophenoxyacetic acids, which are analogues of piracetam, nefiracetam, and meclofenoxate. Notably, these acids may be novel nootropics. Two simple regression models that relate log BB and log k' were developed from an assay performed using a reversed-phase HPLC column that contained an embedded amide moiety. Both the quadratic and linear models yielded statistical parameters comparable to previously published models of log BB dependence on various structural characteristics. The models predict that four members of the substituted phenoxyacetic acid series have a strong chance of permeating the barrier and being absorbed in the brain. The results of this study show that a reversed-phase HPLC system containing an embedded amide moiety is a functional in vitro surrogate of the blood-brain barrier. These results suggest that racetam-type nootropic drugs containing a carboxylic moiety could be more poorly absorbed than analogues devoid of the carboxyl group, especially if the compounds penetrate the barrier by a simple diffusion mechanism.

  15. Validity of bioelectrical impedance measurement in predicting fat-free mass of Chinese children and adolescents.

    PubMed

    Wang, Lin; Hui, Stanley Sai-chuen; Wong, Stephen Heung-sang

    2014-11-15

    The current study aimed to examine the validity of various published bioelectrical impedance analysis (BIA) equations in estimating fat-free mass (FFM) among Chinese children and adolescents and to develop a BIA equation for the estimation of FFM appropriate for this population. A total of 255 healthy Chinese children and adolescents aged 9 to 19 years old (127 males and 128 females) from Tianjin, China, participated in the BIA measurement at 50 kHz between the hand and the foot. The criterion measure of FFM was obtained using dual-energy X-ray absorptiometry (DEXA). FFM estimated from 24 published BIA equations was cross-validated against the criterion measure from DEXA. Multiple linear regression was conducted to derive an alternative BIA equation for the studied population. FFM estimated from the 24 published BIA equations yielded high correlations with the directly measured FFM from DEXA. However, none of the 24 equations was statistically equivalent with the DEXA-measured FFM. Using multiple linear regression and cross-validation against DEXA measurement, an alternative prediction equation was determined as follows: FFM (kg)=1.613+0.742×height (cm)²/impedance (Ω)+0.151×body weight (kg); R²=0.95; SEE=2.45 kg; CV=6.5; 93.7% of the residuals of all the participants fell within the 95% limits of agreement. BIA was highly correlated with FFM in Chinese children and adolescents. When the newly developed BIA equation is applied, BIA can provide a practical and valid measurement of body composition in Chinese children and adolescents.
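
The reported prediction equation translates directly into code; a minimal sketch using the published coefficients (the function name is ours):

```python
def predict_ffm(height_cm, impedance_ohm, weight_kg):
    """Fat-free mass (kg) from the study's BIA prediction equation:
    FFM = 1.613 + 0.742 * height^2 / impedance + 0.151 * body weight."""
    return 1.613 + 0.742 * height_cm ** 2 / impedance_ohm + 0.151 * weight_kg
```

For example, for a participant 160 cm tall weighing 50 kg with a 500 Ω impedance reading, the equation predicts about 47.2 kg of FFM.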

  16. Validity of Bioelectrical Impedance Measurement in Predicting Fat-Free Mass of Chinese Children and Adolescents

    PubMed Central

    Wang, Lin; Hui, Stanley Sai-chuen; Wong, Stephen Heung-sang

    2014-01-01

    Background The current study aimed to examine the validity of various published bioelectrical impedance analysis (BIA) equations in estimating fat-free mass (FFM) among Chinese children and adolescents and to develop a BIA equation for the estimation of FFM appropriate for this population. Material/Methods A total of 255 healthy Chinese children and adolescents aged 9 to 19 years old (127 males and 128 females) from Tianjin, China, participated in the BIA measurement at 50 kHz between the hand and the foot. The criterion measure of FFM was obtained using dual-energy X-ray absorptiometry (DEXA). FFM estimated from 24 published BIA equations was cross-validated against the criterion measure from DEXA. Multiple linear regression was conducted to derive an alternative BIA equation for the studied population. Results FFM estimated from the 24 published BIA equations yielded high correlations with the directly measured FFM from DEXA. However, none of the 24 equations was statistically equivalent with the DEXA-measured FFM. Using multiple linear regression and cross-validation against DEXA measurement, an alternative prediction equation was determined as follows: FFM (kg)=1.613+0.742×height (cm)²/impedance (Ω)+0.151×body weight (kg); R²=0.95; SEE=2.45 kg; CV=6.5; 93.7% of the residuals of all the participants fell within the 95% limits of agreement. Conclusions BIA was highly correlated with FFM in Chinese children and adolescents. When the newly developed BIA equation is applied, BIA can provide a practical and valid measurement of body composition in Chinese children and adolescents. PMID:25398209

  17. Do depression treatments reduce suicidal ideation? The effects of CBT, IPT, pharmacotherapy, and placebo on suicidality.

    PubMed

    Weitz, Erica; Hollon, Steven D; Kerkhof, Ad; Cuijpers, Pim

    2014-01-01

    Many well-researched treatments for depression exist. However, there is not yet enough evidence on whether these therapies, designed for the treatment of depression, are also effective for reducing suicidal ideation. This research provides valuable information for researchers, clinicians, and suicide prevention policy makers. Analysis was conducted on the Treatment of Depression Collaborative Research Program (TDCRP) sample, which included CBT, IPT, medication, and placebo treatment groups. Participants were included in the analysis if they reported suicidal ideation on the HRSD or BDI (score of ≥1). Multivariate linear regression indicated that both IPT (b=.41, p<.05) and medication (b=.47, p<.05) yielded a significant reduction in suicide symptoms compared to placebo on the HRSD. Multivariate linear regression indicated that after adjustment for change in depression these treatment effects were no longer significant. Moderate Cohen's d effect sizes for baseline to post-test differences in suicide score by treatment group are reported. These analyses were completed on a single suicide item from each of the measures. Moreover, the TDCRP excluded participants with moderate to severe suicidal ideation. This study demonstrates the specific effectiveness of IPT and medication in reducing suicidal ideation (relative to placebo), albeit largely as a consequence of their more general effects on depression. This adds to the growing body of evidence that depression treatments, specifically IPT and medication, can also reduce suicidal ideation and serves to further our understanding of the complex relationship between depression and suicide. Copyright © 2014 Elsevier B.V. All rights reserved.

  18. Advanced quantitative methods in correlating sarcopenic muscle degeneration with lower extremity function biometrics and comorbidities

    PubMed Central

    Edmunds, Kyle; Gíslason, Magnús; Sigurðsson, Sigurður; Guðnason, Vilmundur; Harris, Tamara; Carraro, Ugo; Gargiulo, Paolo

    2018-01-01

    Sarcopenic muscular degeneration has been consistently identified as an independent risk factor for mortality in aging populations. Recent investigations have realized the quantitative potential of computed tomography (CT) image analysis to describe skeletal muscle volume and composition; however, the optimum approach to assessing these data remains debated. Current literature reports average Hounsfield unit (HU) values and/or segmented soft tissue cross-sectional areas to investigate muscle quality. However, standardized methods for CT analyses and their utility as a comorbidity index remain undefined, and no existing studies compare these methods to the assessment of entire radiodensitometric distributions. The primary aim of this study was to present a comparison of nonlinear trimodal regression analysis (NTRA) parameters of entire radiodensitometric muscle distributions against extant CT metrics and their correlation with lower extremity function (LEF) biometrics (normal/fast gait speed, timed up-and-go, and isometric leg strength) and biochemical and nutritional parameters, such as total solubilized cholesterol (SCHOL) and body mass index (BMI). Data were obtained from 3,162 subjects, aged 66–96 years, from the population-based AGES-Reykjavik Study. 1-D k-means clustering was employed to discretize each biometric and comorbidity dataset into twelve subpopulations, in accordance with Sturges’ Formula for Class Selection. Dataset linear regressions were performed against eleven NTRA distribution parameters and standard CT analyses (fat/muscle cross-sectional area and average HU value). Parameters from NTRA and CT standards were analogously assembled by age and sex. Analysis of specific NTRA parameters with standard CT results showed linear correlation coefficients greater than 0.85, but multiple regression analysis of correlative NTRA parameters yielded a correlation coefficient of 0.99 (P<0.005). 
These results highlight the specificities of each muscle quality metric to LEF biometrics, SCHOL, and BMI, and particularly highlight the value of the connective tissue regime in this regard. PMID:29513690

  19. Advanced quantitative methods in correlating sarcopenic muscle degeneration with lower extremity function biometrics and comorbidities.

    PubMed

    Edmunds, Kyle; Gíslason, Magnús; Sigurðsson, Sigurður; Guðnason, Vilmundur; Harris, Tamara; Carraro, Ugo; Gargiulo, Paolo

    2018-01-01

    Sarcopenic muscular degeneration has been consistently identified as an independent risk factor for mortality in aging populations. Recent investigations have realized the quantitative potential of computed tomography (CT) image analysis to describe skeletal muscle volume and composition; however, the optimum approach to assessing these data remains debated. Current literature reports average Hounsfield unit (HU) values and/or segmented soft tissue cross-sectional areas to investigate muscle quality. However, standardized methods for CT analyses and their utility as a comorbidity index remain undefined, and no existing studies compare these methods to the assessment of entire radiodensitometric distributions. The primary aim of this study was to present a comparison of nonlinear trimodal regression analysis (NTRA) parameters of entire radiodensitometric muscle distributions against extant CT metrics and their correlation with lower extremity function (LEF) biometrics (normal/fast gait speed, timed up-and-go, and isometric leg strength) and biochemical and nutritional parameters, such as total solubilized cholesterol (SCHOL) and body mass index (BMI). Data were obtained from 3,162 subjects, aged 66-96 years, from the population-based AGES-Reykjavik Study. 1-D k-means clustering was employed to discretize each biometric and comorbidity dataset into twelve subpopulations, in accordance with Sturges' Formula for Class Selection. Dataset linear regressions were performed against eleven NTRA distribution parameters and standard CT analyses (fat/muscle cross-sectional area and average HU value). Parameters from NTRA and CT standards were analogously assembled by age and sex. Analysis of specific NTRA parameters with standard CT results showed linear correlation coefficients greater than 0.85, but multiple regression analysis of correlative NTRA parameters yielded a correlation coefficient of 0.99 (P<0.005). 
These results highlight the specificities of each muscle quality metric to LEF biometrics, SCHOL, and BMI, and particularly highlight the value of the connective tissue regime in this regard.

  20. Adjustment of ionized calcium concentration for serum pH is not a valid marker of calcium homeostasis: implications for identifying individuals at risk of calcium metabolic disorders.

    PubMed

    Lam, Virginie; Dhaliwal, Satvinder S; Mamo, John C

    2013-05-01

    Ionized calcium (iCa) is the biologically active form of this micronutrient. Serum iCa is measured via ion-electrode potentiometry (IEP), and reporting iCa relative to pH 7.4 is normally utilized to avoid the potential confounding effects of ex vivo changes to serum pH. Adjustment of iCa for pH, however, has not been adequately justified. In this study, utilizing carefully standardized protocols for blood collection, the preparation of serum, and controlling time of collection-to-analysis, we determined serum iCa and pH utilizing an IEP analyser hosted at an accredited diagnostic laboratory. The association of unadjusted iCa (iCa(raw)) concentration with pH was described by linear regression, which accounted for 37% of serum iCa(raw) variability. iCa(raw) was then expressed at pH 7.4 by either adjusting iCa(raw) based on the linear regression equation describing the association of iCa with serum pH (iCa(regr)) or using IEP-coded published normative equations (iCa(pub)). iCa(regr) was comparable to iCa(raw), indicating that blood collection and processing methodologies were sound. However, iCa(pub) yielded values that were significantly lower than iCa(raw). iCa(pub) did not identify the 15% of subjects who had a greater than desirable serum concentration of iCa based on iCa(raw). Sixty percent of subjects with low levels of iCa(raw) were also not detected by iCa(pub). The kappa measure of agreement for iCa(raw) versus iCa(pub) showed relatively poor concordance (κ = 0.42). With simple protocols that avoid sampling artefacts, expressing iCa(raw) is likely to be a more valid and physiologically relevant marker of calcium homeostasis than is iCa(pub).
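
The kappa value reported above is a standard measure of classification agreement, commonly computed as Cohen's kappa; a generic sketch follows, with an illustrative contingency table rather than the study's data:

```python
def cohens_kappa(table):
    # table[i][j]: count of cases assigned class i by one method and class j
    # by the other. Kappa = (observed - chance agreement) / (1 - chance).
    total = sum(sum(row) for row in table)
    k = len(table)
    po = sum(table[i][i] for i in range(k)) / total  # observed agreement
    pe = sum((sum(table[i]) / total) *
             (sum(row[i] for row in table) / total)  # chance agreement
             for i in range(k))
    return (po - pe) / (1 - pe)
```

Perfect agreement gives kappa = 1, while agreement no better than chance gives kappa = 0; intermediate values such as the study's 0.42 indicate only moderate concordance between the two classifications.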

  1. Instantaneous global spatial interaction? Exploring the Gaussian inequality, distance and Internet pings in a global network

    NASA Astrophysics Data System (ADS)

    Baker, R. G. V.

    2005-12-01

    The Internet has been publicly portrayed as a new technological horizon yielding instantaneous interaction to a point where geography no longer matters. This research aims to dispel this impression by applying a dynamic form of trip modelling to investigate pings in a global computer network compiled by the Stanford Linear Accelerator Centre (SLAC) from 1998 to 2004. Internet flows have been predicted to have the same mathematical operators as trips to a supermarket, since they are both periodic and constrained by a distance metric. Both actual and virtual trips are part of a spectrum of origin-destination pairs in the time-space convergence of trip time-lines. Internet interaction is very near to the convergence of these time-lines (at a very small time scale in milliseconds, but with interactions over thousands of kilometres). There is a lag effect, and this is formalised by the derivation of Gaussian and gravity inequalities between the time taken (Δt) and the partitioning of distance (Δx). This inequality seems to be robust for a regression of Δt on Δx in the SLAC data set for each year (1998 to 2004). There is a constant ‘forbidden zone’ in the interaction, underpinned by the fact that pings do not travel faster than the speed of light. Superimposed upon this zone is the network capacity, where a linear regression of Δt on Δx is a proxy summarising global Internet connectivity for that year. The results suggest that there has been a substantial improvement in connectivity over the period, with R² increasing steadily from 0.39 to 0.65 from less Gaussian spreading of the ping latencies. Further, the regression line shifts towards the inequality boundary from 1998 to 2004, where the increased slope shows a greater proportional rise in local connectivity over global connectivity. A conclusion is that national geography still does matter in spatial interaction modelling of the Internet.

  2. Aspirin as a potential modality for the chemoprevention of breast cancer: A dose-response meta-analysis of cohort studies from 857,831 participants

    PubMed Central

    Lu, Liming; Shi, Leiyu; Zeng, Jingchun; Wen, Zehuai

    2017-01-01

    Background Previous meta-analyses on the relationship between aspirin use and breast cancer risk have yielded inconsistent results. In addition, the threshold effects of different doses, frequencies, and durations of aspirin use in preventing breast cancer have yet to be established. Results The search yielded 13 prospective cohort studies (N=857,831 participants) that reported an average of 7.6 cases/1,000 person-years of breast cancer during a follow-up period of 4.4 to 14 years. With a random-effects model, a borderline significant inverse association was observed between overall aspirin use and breast cancer risk, with a summarized RR = 0.94 (P = 0.051, 95% CI 0.87-1.01). The linear regression model was a better fit for the dose-response relationship, which displayed a potential relationship between the frequency of aspirin use and breast cancer risk (RR = 0.97, 0.95 and 0.90 for 5, 10 and 20 times/week aspirin use, respectively). It was also a better fit for the duration of aspirin use and breast cancer risk (RR = 0.86, 0.73 and 0.54 for 5, 10 and 20 years of aspirin use). Methods We searched the MEDLINE, EMBASE and CENTRAL databases through early October 2016 for relevant prospective cohort studies of aspirin use and breast cancer risk. Meta-analysis of relative risk (RR) estimates associated with aspirin intake was performed with fixed- or random-effects models. The dose-response meta-analysis was performed by linear trend regression and restricted cubic spline regression. Conclusion Our study confirmed a dose-response relationship between aspirin use and breast cancer risk. For clinical prevention, long-term (>5 years) consistent use (2-7 times/week) of aspirin appears to be more effective in achieving a protective effect against breast cancer. PMID:28418881

  3. A web-based normative calculator for the uniform data set (UDS) neuropsychological test battery.

    PubMed

    Shirk, Steven D; Mitchell, Meghan B; Shaughnessy, Lynn W; Sherman, Janet C; Locascio, Joseph J; Weintraub, Sandra; Atri, Alireza

    2011-11-11

    With the recent publication of new criteria for the diagnosis of preclinical Alzheimer's disease (AD), there is a need for neuropsychological tools that take premorbid functioning into account in order to detect subtle cognitive decline. Using demographic adjustments is one method for increasing the sensitivity of commonly used measures. We sought to provide a useful online z-score calculator that yields estimates of percentile ranges and adjusts individual performance based on sex, age, and/or education for each of the neuropsychological tests of the National Alzheimer's Coordinating Center Uniform Data Set (NACC UDS). In addition, we aimed to provide an easily accessible method of creating norms for other clinical researchers for their own, unique data sets. Data from 3,268 clinically cognitively normal older UDS subjects from a cohort reported by Weintraub and colleagues (2009) were included. For all neuropsychological tests, z-scores were estimated by subtracting the predicted mean from the raw score and then dividing this difference score by the root mean squared error term (RMSE) for a given linear regression model. For each neuropsychological test, an estimated z-score was calculated for any raw score based on five different models that adjust for the demographic predictors of SEX, AGE and EDUCATION, either concurrently, individually or without covariates. The interactive online calculator allows the entry of a raw score and provides five corresponding estimated z-scores based on predictions from each corresponding linear regression model. The calculator produces percentile ranks and graphical output.
An interactive, regression-based, normative score online calculator was created to serve as an additional resource for UDS clinical researchers, especially in guiding interpretation of individual performances that appear to fall in borderline realms and may be of particular utility for operationalizing subtle cognitive impairment present according to the newly proposed criteria for Stage 3 preclinical Alzheimer's disease.
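
The calculator's core computation is a regression-based z-score followed by a percentile conversion. A minimal sketch, using the conventional sign (raw score minus model-predicted mean) and a standard-normal percentile; the function names are illustrative and the coefficients of the published UDS regression models are not reproduced here:

```python
import math

def demographic_z(raw, predicted_mean, rmse):
    # z = (observed score - regression-predicted mean) / RMSE of the model,
    # so a higher-than-expected score yields a positive z.
    return (raw - predicted_mean) / rmse

def percentile(z):
    # Standard-normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

For instance, a raw score of 26 on a test where the demographic model predicts 28 with an RMSE of 2 gives z = -1.0, which corresponds to roughly the 16th percentile.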

  4. Aspirin as a potential modality for the chemoprevention of breast cancer: A dose-response meta-analysis of cohort studies from 857,831 participants.

    PubMed

    Lu, Liming; Shi, Leiyu; Zeng, Jingchun; Wen, Zehuai

    2017-06-20

    Previous meta-analyses on the relationship between aspirin use and breast cancer risk have yielded inconsistent results. In addition, the threshold effects of different doses, frequencies, and durations of aspirin use in preventing breast cancer have yet to be established. The search yielded 13 prospective cohort studies (N=857,831 participants) that reported an average of 7.6 cases/1,000 person-years of breast cancer during a follow-up period of 4.4 to 14 years. With a random-effects model, a borderline significant inverse association was observed between overall aspirin use and breast cancer risk, with a summarized RR = 0.94 (P = 0.051, 95% CI 0.87-1.01). The linear regression model was a better fit for the dose-response relationship, which displayed a potential relationship between the frequency of aspirin use and breast cancer risk (RR = 0.97, 0.95 and 0.90 for 5, 10 and 20 times/week aspirin use, respectively). It was also a better fit for the duration of aspirin use and breast cancer risk (RR = 0.86, 0.73 and 0.54 for 5, 10 and 20 years of aspirin use). We searched the MEDLINE, EMBASE and CENTRAL databases through early October 2016 for relevant prospective cohort studies of aspirin use and breast cancer risk. Meta-analysis of relative risk (RR) estimates associated with aspirin intake was performed with fixed- or random-effects models. The dose-response meta-analysis was performed by linear trend regression and restricted cubic spline regression. Our study confirmed a dose-response relationship between aspirin use and breast cancer risk. For clinical prevention, long-term (>5 years) consistent use (2-7 times/week) of aspirin appears to be more effective in achieving a protective effect against breast cancer.

  5. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  6. Bark analysis as a guide to cassava nutrition in Sierra Leone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Godfrey-Sam-Aggrey, W.; Garber, M.J.

    1979-01-01

    Cassava main stem barks from two experiments in which similar fertilizers were applied directly in a 2⁵ confounded factorial design were analyzed, and the bark nutrients were used as a guide to cassava nutrition. The application of multiple regression analysis to the respective root yields and bark nutrient concentrations enabled nutrient levels and optimum adjusted root yields to be derived. Differences in bark nutrient concentrations reflected soil fertility levels. Bark analysis and the application of multiple regression analysis to root yields and bark nutrients appear to be useful tools for predicting fertilizer recommendations for cassava production.

  7. Hydrology of the U.S. Army Pinon Canyon maneuver site, Las Animas County, Colorado

    USGS Publications Warehouse

    Von Guerard, Paul; Abbott, P.O.; Nickless, Raymond C.

    1987-01-01

    The U.S. Department of the Army (Fort Carson Military Reservation) has acquired 381 sq mi of semiarid rangeland in southeastern Colorado for mechanized military maneuvers. The study area, known as the Pinon Canyon Maneuver Site, drains into the Purgatoire River, a major tributary of the upper Arkansas River. A multidisciplinary hydrologic investigation began in October 1982. The primary aquifer in the Maneuver Site is the Dakota-Purgatoire. Well yields generally range from 10 to 500 gal/min. Dissolved-solids concentrations in groundwater ranged from 195 to 6,150 mg/L. Streamflow in the Purgatoire River is perennial. Tributaries draining the Maneuver Site are intermittent or ephemeral and contribute only about 4.4% of the streamflow of the Purgatoire River downstream from the Maneuver Site. Flood frequencies were calculated by using the log Pearson III procedure and compared well with a regional estimating technique, developed in this study, that uses physical drainage-basin characteristics. Calcium and sulfate are the predominant ions in the surface water of the area. Time-series plots indicate that instream water-quality standards for nitrate and metals are exceeded. About 80% of the suspended-sediment load is transported by rainfall runoff, which occurs less than 8% of the time. Ephemeral tributaries contributed less than 25% of the suspended-sediment load transported to the Purgatoire River downstream from the Maneuver Site. Historic annual mean sediment yields were measured for 29 small watersheds and ranged from 9.5 to 1,700 tons/sq mi. Sediment yields were also estimated by a multiple-linear-regression model developed using physical drainage-basin characteristics and by the Pacific Southwest Interagency Committee method. (USGS)

  8. REML/BLUP and sequential path analysis in estimating genotypic values and interrelationships among simple maize grain yield-related traits.

    PubMed

    Olivoto, T; Nardino, M; Carvalho, I R; Follmann, D N; Ferrari, M; Szareski, V J; de Pelegrin, A J; de Souza, V Q

    2017-03-22

    Methodologies using restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) in combination with sequential path analysis in maize are still limited in the literature. Therefore, the aims of this study were: i) to use REML/BLUP-based procedures to estimate variance components, genetic parameters, and genotypic values of simple maize hybrids, and ii) to fit stepwise regressions considering genotypic values to form a path diagram with multi-order predictors and minimum multicollinearity that explains the cause-and-effect relationships among grain yield-related traits. Fifteen commercial simple maize hybrids were evaluated in multi-environment trials in a randomized complete block design with four replications. The environmental variance (78.80%) and genotype-by-environment variance (20.83%) accounted for more than 99% of the phenotypic variance of grain yield, which makes direct selection by breeders for this trait difficult. The sequential path analysis model allowed the selection of traits with high explanatory power and minimum multicollinearity, resulting in models with elevated fit (R² > 0.9 and ε < 0.3). The number of kernels per ear (NKE) and thousand-kernel weight (TKW) are the traits with the largest direct effects on grain yield (r = 0.66 and 0.73, respectively). The high accuracy of selection (0.86 and 0.89) associated with the high heritability of the average (0.732 and 0.794) for NKE and TKW, respectively, indicated good reliability and prospects of success in the indirect selection of hybrids with high-yield potential through these traits. The negative direct effect of NKE on TKW (r = -0.856), however, must be considered. The joint use of mixed models and sequential path analysis is effective in the evaluation of maize-breeding trials.

  9. Canopy Chlorophyll Density Based Index for Estimating Nitrogen Status and Predicting Grain Yield in Rice

    PubMed Central

    Liu, Xiaojun; Zhang, Ke; Zhang, Zeyu; Cao, Qiang; Lv, Zunfu; Yuan, Zhaofeng; Tian, Yongchao; Cao, Weixing; Zhu, Yan

    2017-01-01

    Canopy chlorophyll density (Chl) has a pivotal role in diagnosing crop growth and nutrition status. The purpose of this study was to develop Chl-based models for estimating N status and predicting grain yield of rice (Oryza sativa L.) from leaf area index (LAI) and the chlorophyll concentration of the upper leaves. Six field experiments were conducted in Jiangsu Province of East China during 2007, 2008, 2009, 2013, and 2014. Different N rates were applied to generate contrasting conditions of N availability in six Japonica cultivars (9915, 27123, Wuxiangjing 14, Wuyunjing 19, Yongyou 8, and Wuyunjing 24) and two Indica cultivars (Liangyoupei 9, YLiangyou 1). The SPAD values of the four uppermost leaves and LAI were measured from the tillering to flowering growth stages. Two N indicators, leaf N accumulation (LNA) and plant N accumulation (PNA), were measured. The LAI values estimated by LAI-2000 and LI-3050C were compared and calibrated with a conversion equation. A linear regression analysis showed significant relationships between Chl value and the N indicators; the equations were as follows: PNA = (0.092 × Chl) − 1.179 (R² = 0.94, P < 0.001, relative root mean square error (RRMSE) = 0.196), LNA = (0.052 × Chl) − 0.269 (R² = 0.93, P < 0.001, RRMSE = 0.185). A standardized method was used to quantify the correlation between Chl value and grain yield: normalized yield = (0.601 × normalized Chl) + 0.400 (R² = 0.81, P < 0.001, RRMSE = 0.078). Independent experimental data also validated the use of Chl value to accurately estimate rice N status and predict grain yield. PMID:29163568
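
The fitted relationships above translate directly into code; a minimal sketch using the published coefficients (the function names are ours):

```python
def plant_n_accumulation(chl):
    # PNA = 0.092 * Chl - 1.179  (reported R^2 = 0.94)
    return 0.092 * chl - 1.179

def leaf_n_accumulation(chl):
    # LNA = 0.052 * Chl - 0.269  (reported R^2 = 0.93)
    return 0.052 * chl - 0.269

def normalized_yield(normalized_chl):
    # normalized yield = 0.601 * normalized Chl + 0.400  (reported R^2 = 0.81)
    return 0.601 * normalized_chl + 0.400
```

For a canopy chlorophyll density of 100, for example, the equations predict a PNA of about 8.02 and an LNA of about 4.93 in the units of the original study.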

  10. Image interpolation via regularized local linear regression.

    PubMed

    Liu, Xianming; Zhao, Debin; Xiong, Ruiqin; Ma, Siwei; Gao, Wen; Sun, Huifang

    2011-12-01

    The linear regression model is a very attractive tool for designing effective image interpolation schemes. Some regression-based image interpolation algorithms have been proposed in the literature, in which the objective functions are optimized by ordinary least squares (OLS). However, interpolation with OLS may have some undesirable properties from a robustness point of view: even small amounts of outliers can dramatically affect the estimates. To address these issues, in this paper we propose a novel image interpolation algorithm based on regularized local linear regression (RLLR). Starting with the linear regression model, we replace the OLS error norm with the moving least squares (MLS) error norm, which leads to a robust estimator of local image structure. To keep the solution stable and avoid overfitting, we incorporate the l2-norm as the estimator complexity penalty. Moreover, motivated by recent progress on manifold-based semi-supervised learning, we explicitly consider the intrinsic manifold structure by making use of both measured and unmeasured data points. Specifically, our framework incorporates the geometric structure of the marginal probability distribution induced by unmeasured samples as an additional local smoothness-preserving constraint. The optimal model parameters can be obtained with a closed-form solution by solving a convex optimization problem. Experimental results on benchmark test images demonstrate that the proposed method achieves very competitive performance with the state-of-the-art interpolation algorithms, especially in image edge structure preservation. © 2011 IEEE
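
The closed-form solve described above can be illustrated in a drastically simplified setting: regularized local linear regression for a 1-D signal, fitting y ≈ a·x + b to the known neighbors of a missing sample with an l2 (ridge) penalty on the coefficients. This sketch omits the paper's MLS weighting and manifold constraint; the function name and default penalty are ours.

```python
# 1-D sketch of regularized local linear regression: solve the ridge
# normal equations (X^T X + lam*I) [a, b]^T = X^T y, where each row of X
# is [x, 1], then evaluate the fitted line at the missing position x0.
def rllr_interpolate(xs, ys, x0, lam=0.01):
    sxx = sum(x * x for x in xs) + lam   # X^T X (1,1) entry + penalty
    sx = sum(xs)                         # X^T X off-diagonal entry
    n = len(xs) + lam                    # X^T X (2,2) entry + penalty
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)
    det = sxx * n - sx * sx
    a = (n * sxy - sx * sy) / det        # slope, via Cramer's rule
    b = (sxx * sy - sx * sxy) / det      # intercept
    return a * x0 + b
```

For samples on a straight line (e.g. y = x at positions 0, 1, 3, 4), the unpenalized fit recovers the missing value at x = 2 exactly, and a small penalty perturbs it only slightly; the l2 term matters when the neighborhood is noisy or nearly degenerate.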

  11. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics

    PubMed Central

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of the relationship and the key features underpinning it. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms outperforming linear regression in most situations. In addition, we identify the features that underpin these relationships and are applicable across multiple applications. PMID:27806075

  12. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    PubMed

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of the relationship and the key features underpinning it. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms outperforming linear regression in most situations. In addition, we identify the features that underpin these relationships and are applicable across multiple applications.

  13. Comparison of various error functions in predicting the optimum isotherm by linear and non-linear regression analysis for the sorption of basic red 9 by activated carbon.

    PubMed

    Kumar, K Vasanth; Porkodi, K; Rocha, F

    2008-01-15

    A comparison of linear and non-linear regression methods for selecting the optimum isotherm was made using the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best-fit linear theoretical isotherm. In the case of the non-linear regression method, six error functions, namely the coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), average relative error (ARE), sum of the errors squared (ERRSQ), and sum of the absolute errors (EABS), were used to predict the parameters involved in the two- and three-parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For the two-parameter isotherms, MPSD was found to be the best error function for minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of the three-parameter isotherms, r(2) was found to be the best error function for minimizing the error distribution between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor in choosing the optimum isotherm. In addition to the size of the error function, the theory behind the predicted isotherm should be verified against the experimental data when selecting the optimum isotherm. A coefficient of non-determination, K(2), was explained and found to be very useful in identifying the best error function when selecting the optimum isotherm.
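
The error functions named in this abstract have standard forms in the isotherm-fitting literature; a sketch of those standard definitions (q_exp are measured equilibrium uptakes, q_cal are model predictions, p is the number of isotherm parameters):

```python
def errsq(q_exp, q_cal):                       # sum of squared errors
    return sum((e - c) ** 2 for e, c in zip(q_exp, q_cal))

def eabs(q_exp, q_cal):                        # sum of absolute errors
    return sum(abs(e - c) for e, c in zip(q_exp, q_cal))

def are(q_exp, q_cal):                         # average relative error, %
    n = len(q_exp)
    return 100.0 / n * sum(abs((e - c) / e) for e, c in zip(q_exp, q_cal))

def hybrid(q_exp, q_cal, p):                   # hybrid fractional error, %
    n = len(q_exp)
    return 100.0 / (n - p) * sum((e - c) ** 2 / e for e, c in zip(q_exp, q_cal))

def mpsd(q_exp, q_cal, p):                     # Marquardt's % std deviation
    n = len(q_exp)
    s = sum(((e - c) / e) ** 2 for e, c in zip(q_exp, q_cal))
    return 100.0 * (s / (n - p)) ** 0.5
```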

  14. SU-E-J-237: Image Feature Based DRR and Portal Image Registration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, X; Chang, J

    Purpose: Two-dimensional (2D) matching of the kV X-ray and digitally reconstructed radiography (DRR) images is an important setup technique for image-guided radiotherapy (IGRT). In our clinics, mutual-information-based methods are used for this purpose on commercial linear accelerators, but they often need manual corrections. This work demonstrated the feasibility of using feature-based image transforms to register kV and DRR images. Methods: The scale invariant feature transform (SIFT) method was implemented to detect matching image details (or key points) between the kV and DRR images. These key points represent high image intensity gradients and thus scale-invariant features. Due to the poor contrast of our kV images, direct application of the SIFT method yielded many detection errors. To assist the finding of key points, the center coordinates of the kV and DRR images were read from the DICOM header, and the two groups of key points with similar relative positions to their corresponding centers were paired up. Using these points, a rigid transform (with scaling, horizontal and vertical shifts) was estimated. We also artificially introduced vertical and horizontal shifts to test the accuracy of our registration method on anterior-posterior (AP) and lateral pelvic images. Results: The results provided a satisfactory overlay of the transformed kV image onto the DRR image. The introduced vs. detected shifts were fit with a linear regression. In the AP image experiments, linear regression analysis showed a slope of 1.15 and 0.98 with an R2 of 0.89 and 0.99 for the horizontal and vertical shifts, respectively. The corresponding results were 1.2 and 1.3 with R2 of 0.72 and 0.82 for the lateral image shifts. Conclusion: This work provided an alternative technique for kV-to-DRR alignment. Further improvements in the estimation accuracy and image contrast tolerance are underway.
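
The validation step (regressing detected shifts against the artificially introduced ones and reporting slope and R2) can be sketched generically; this is not the authors' code, just the standard least-squares calculation:

```python
import numpy as np

def fit_shift_calibration(introduced, detected):
    """Least-squares line detected = slope * introduced + intercept,
    plus R^2; a slope near 1 and high R^2 indicate accurate detection."""
    x = np.asarray(introduced, float)
    y = np.asarray(detected, float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    r2 = 1 - np.sum(resid**2) / np.sum((y - y.mean())**2)
    return slope, intercept, r2
```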

  15. Applied Multiple Linear Regression: A General Research Strategy

    ERIC Educational Resources Information Center

    Smith, Brandon B.

    1969-01-01

    Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)

  16. Estimating the Depth of the Navy Recruiting Market

    DTIC Science & Technology

    2016-09-01

    Recommends that NRC make use of the Poisson regression model to determine high-yield ZIP codes for market depth. Thesis by Emilie M. Monaghan, September 2016; Thesis Advisor: Lyn R. Whitaker; Second Reader: Jonathan K. Alt.

  17. Variable screening via quantile partial correlation

    PubMed Central

    Ma, Shujie; Tsai, Chih-Ling

    2016-01-01

    In quantile linear regression with ultra-high dimensional data, we propose an algorithm for screening all candidate variables and subsequently selecting relevant predictors. Specifically, we first employ quantile partial correlation for screening, and then we apply the extended Bayesian information criterion (EBIC) for best subset selection. Our proposed method can successfully select predictors when the variables are highly correlated, and it can also identify variables that make a contribution to the conditional quantiles but are marginally uncorrelated or weakly correlated with the response. Theoretical results show that the proposed algorithm can yield the sure screening set. By controlling the false selection rate, model selection consistency can be achieved theoretically. In practice, we proposed using EBIC for best subset selection so that the resulting model is screening consistent. Simulation studies demonstrate that the proposed algorithm performs well, and an empirical example is presented. PMID:28943683

  18. Estimate the contribution of incubation parameters influence egg hatchability using multiple linear regression analysis

    PubMed Central

    Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma

    2016-01-01

    Aim: This research was conducted to determine the parameters most affecting the hatchability of indigenous and improved local chickens' eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the one most influencing hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. The Alexandria strain had the highest significant commercial hatchability (80.70%). Highly significant differences in hatching chick weight among strains were also observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: Prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666
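
The abstract does not state how percent contributions were computed; one common convention is to standardize all variables, fit by least squares, and express each squared standardized coefficient as a share of their sum. A sketch under that assumption:

```python
import numpy as np

def percent_contributions(X, y):
    """Illustrative percent-contribution convention: squared standardized
    regression coefficients, normalized to sum to 100%. (The paper's
    exact formula is not given in this excerpt.)"""
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    Xs = (X - X.mean(0)) / X.std(0)            # standardize predictors
    ys = (y - y.mean()) / y.std()              # standardize response
    A = np.c_[np.ones(len(y)), Xs]             # add intercept column
    beta, *_ = np.linalg.lstsq(A, ys, rcond=None)
    b2 = beta[1:] ** 2                         # drop intercept, square
    return 100.0 * b2 / b2.sum()
```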

  19. Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.

    PubMed

    Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko

    2016-03-01

    In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
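
The split-half evaluation described here (fit on one random half, score fit on both the estimation half and the hold-out half) can be sketched for the linear-regression side; the fuzzy-logic model is beyond this illustration:

```python
import numpy as np

def holdout_fit(X, y, rng_seed=0):
    """Fit a linear model on a random half of the data and return R^2
    on the estimation half and on the hold-out half."""
    y = np.asarray(y, float)
    rng = np.random.default_rng(rng_seed)
    idx = rng.permutation(len(y))
    half = len(y) // 2
    tr, te = idx[:half], idx[half:]            # estimation / hold-out split
    A = np.c_[np.ones(len(y)), np.asarray(X, float)]
    beta, *_ = np.linalg.lstsq(A[tr], y[tr], rcond=None)
    def r2(rows):
        resid = y[rows] - A[rows] @ beta
        return 1 - np.sum(resid**2) / np.sum((y[rows] - y[rows].mean())**2)
    return r2(tr), r2(te)
```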

  20. Patterns of medicinal plant use: an examination of the Ecuadorian Shuar medicinal flora using contingency table and binomial analyses.

    PubMed

    Bennett, Bradley C; Husby, Chad E

    2008-03-28

    Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet the requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated the assumptions of linear regression) reduced R(2) from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
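
The binomial test of a single family reduces to an exact tail probability: given n species in a family, k of them medicinal, and the flora-wide medicinal proportion p as the null, the upper-tail P(X >= k) tests for over-representation. A self-contained sketch:

```python
from math import comb

def binom_sf(k, n, p):
    """Exact upper tail P(X >= k) for X ~ Binomial(n, p): the p-value
    for a family contributing more medicinal species than expected
    under the flora-wide proportion p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
```

For example, a family with many more medicinal species than the flora-wide rate predicts yields a small `binom_sf` value, flagging it as over-represented (the figures here are illustrative, not the Shuar data).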

  1. An Application to the Prediction of LOD Change Based on General Regression Neural Network

    NASA Astrophysics Data System (ADS)

    Zhang, X. H.; Wang, Q. J.; Zhu, J. J.; Zhang, H.

    2011-07-01

    Traditional prediction of the LOD (length of day) change was based on linear models, such as the least squares model and the autoregressive technique. Due to the complex non-linear features of the LOD variation, the performance of linear model predictors is not fully satisfactory. This paper applies a non-linear neural network, the general regression neural network (GRNN) model, to forecast the LOD change, and the results are analyzed and compared with those obtained with the back propagation neural network and other models. The comparison shows that the performance of the GRNN model in the prediction of the LOD change is efficient and feasible.
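
At its core, a GRNN prediction is a kernel-weighted average of training targets (a Nadaraya-Watson estimator); a minimal sketch of that prediction step, not the authors' full forecasting pipeline:

```python
import numpy as np

def grnn_predict(x_query, X_train, y_train, sigma=0.5):
    """GRNN (Nadaraya-Watson) prediction: average of training targets
    weighted by a Gaussian kernel of the distance to the query point;
    sigma is the single smoothing parameter of the network."""
    X = np.atleast_2d(np.asarray(X_train, float))
    d2 = np.sum((X - np.asarray(x_query, float))**2, axis=1)
    w = np.exp(-d2 / (2 * sigma**2))
    return float(np.dot(w, np.asarray(y_train, float)) / np.sum(w))
```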

  2. Impacts of land use change on watershed streamflow and sediment yield: An assessment using hydrologic modelling and partial least squares regression

    NASA Astrophysics Data System (ADS)

    Yan, B.; Fang, N. F.; Zhang, P. C.; Shi, Z. H.

    2013-03-01

    Summary: Understanding how changes in individual land use types influence the dynamics of streamflow and sediment yield would greatly improve the predictability of the hydrological consequences of land use changes and could thus help stakeholders to make better decisions. Multivariate statistics are commonly used to assess how individual land use types control the dynamics of streamflow or sediment yield. However, one issue with the use of conventional statistical methods to address relationships between land use types and streamflow or sediment yield is multicollinearity. In this study, an integrated approach involving hydrological modelling and partial least squares regression (PLSR) was used to quantify the contributions of changes in individual land use types to changes in streamflow and sediment yield. In a case study, hydrological modelling was conducted using land use maps from four time periods (1978, 1987, 1999, and 2007) for the Upper Du watershed (8973 km2) in China using the Soil and Water Assessment Tool (SWAT). Changes in streamflow and sediment yield between the two simulations conducted using the 1978 and 2007 land use maps were related to land use changes according to a PLSR, which was used to quantify the effect of this influence at the sub-basin scale. The major land use changes that affected streamflow in the studied catchment areas were related to changes in the farmland, forest and urban areas between 1978 and 2007; the corresponding regression coefficients were 0.232, -0.147 and 1.256, respectively, and the Variable Influence on Projection (VIP) was greater than 1. The dominant first-order factors affecting the changes in sediment yield in our study were farmland (VIP 1.762, regression coefficient 14.343) and forest (VIP 1.517, regression coefficient -7.746).
The PLSR methodology presented in this paper is beneficial and novel, as it partially eliminates the co-dependency of the variables and facilitates a more unbiased view of the contribution of the changes in individual land use types to changes in streamflow and sediment yield. This practicable and simple approach could be applied to a variety of other watersheds for which time-sequenced digital land use maps are available.
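
The VIP score used above has a simple closed form for a one-component PLS1 model: with the weight vector normalized, VIP_j = sqrt(p) * |w_j|, where p is the number of predictors. A minimal sketch of that special case (the study itself would use more components):

```python
import numpy as np

def pls1_vip(X, y):
    """VIP scores for a one-component PLS1 model. The first PLS weight
    vector is the (normalized) covariance of each centered predictor
    with the centered response; with one component the VIP formula
    collapses to sqrt(p) * |w_j|, and sum(VIP^2) = p by construction."""
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    Xc = X - X.mean(0)
    yc = y - y.mean()
    w = Xc.T @ yc                     # first PLS weight direction
    w /= np.linalg.norm(w)
    p = X.shape[1]
    return np.sqrt(p) * np.abs(w)     # VIP > 1 flags influential predictors
```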

  3. Solving a mixture of many random linear equations by tensor decomposition and alternating minimization.

    DOT National Transportation Integrated Search

    2016-09-01

    We consider the problem of solving mixed random linear equations with k components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample...

  4. Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation

    NASA Astrophysics Data System (ADS)

    Saylor, Rick D.; Edgerton, Eric S.; Hartsell, Benjamin E.

    A variety of linear regression techniques and simple slope estimators are evaluated for use in the elemental carbon (EC) tracer method of secondary organic carbon (OC) estimation. Linear regression techniques based on ordinary least squares are not suitable for situations where measurement uncertainties exist in both regressed variables. In the past, regression based on the method of Deming [1943. Statistical Adjustment of Data. Wiley, London] has been the preferred choice for EC tracer method parameter estimation. In agreement with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], we find that in the limited case where primary non-combustion OC (OC non-comb) is assumed to be zero, the ratio of averages (ROA) approach provides a stable and reliable estimate of the primary OC-EC ratio, (OC/EC) pri. In contrast with Chu [2005. Stable estimate of primary OC/EC ratios in the EC tracer method. Atmospheric Environment 39, 1383-1392], however, we find that the optimal use of Deming regression (and the more general York et al. [2004. Unified equations for the slope, intercept, and standard errors of the best straight line. American Journal of Physics 72, 367-375] regression) provides excellent results as well. For the more typical case where OC non-comb is allowed to obtain a non-zero value, we find that regression based on the method of York is the preferred choice for EC tracer method parameter estimation. In the York regression technique, detailed information on uncertainties in the measurement of OC and EC is used to improve the linear best fit to the given data. If only limited information is available on the relative uncertainties of OC and EC, then Deming regression should be used. On the other hand, use of ROA in the estimation of secondary OC, and thus the assumption of a zero OC non-comb value, generally leads to an overestimation of the contribution of secondary OC to total measured OC.
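
Deming regression, referenced throughout this abstract, has a standard closed-form slope once the ratio of measurement-error variances is specified; a self-contained sketch of that textbook formula (not the authors' implementation):

```python
def deming_slope(x, y, lam=1.0):
    """Deming regression slope. `lam` is the assumed ratio of the
    error variance in y to the error variance in x; lam = 1 gives
    orthogonal regression. Uses the standard closed-form solution."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    d = syy - lam * sxx
    return (d + (d * d + 4 * lam * sxy ** 2) ** 0.5) / (2 * sxy)
```

On error-free collinear data the slope is recovered exactly for any `lam`; the choice of `lam` matters once both variables carry noise, which is exactly the OC/EC situation the abstract discusses.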

  5. Genetic parameters of linear conformation type traits and their relationship with milk yield throughout lactation in mixed-breed dairy goats.

    PubMed

    McLaren, A; Mucha, S; Mrode, R; Coffey, M; Conington, J

    2016-07-01

    Conformation traits are of interest to many dairy goat breeders not only as descriptive traits in their own right, but also because of their influence on production, longevity, and profitability. If these traits are to be considered for inclusion in future dairy goat breeding programs, relationships between them and production traits such as milk yield must be considered. With the increased use of regression models to estimate genetic parameters, an opportunity now exists to investigate correlations between conformation traits and milk yield throughout lactation in more detail. The aims of this study were therefore to (1) estimate genetic parameters for conformation traits in a population of crossbred dairy goats, (2) estimate correlations between all conformation traits, and (3) assess the relationship between conformation traits and milk yield throughout lactation. No information on milk composition was available. Data were collected from goats based on 2 commercial goat farms during August and September in 2013 and 2014. Ten conformation traits, relating to udder, teat, leg, and feet characteristics, were scored on a linear scale (1-9). The overall data set comprised data available for 4,229 goats, all in their first lactation. The population of goats used in the study was created using random crossings between 3 breeds: British Alpine, Saanen, and Toggenburg. In each generation, the best performing animals were selected for breeding, leading to the formation of a synthetic breed. The pedigree file used in the analyses contained sire and dam information for a total of 30,139 individuals. The models fitted relevant fixed and random effects. Heritability estimates for the conformation traits were low to moderate, ranging from 0.02 to 0.38. 
A range of positive and negative phenotypic and genetic correlations between the traits were observed, with the highest correlations found between udder depth and udder attachment (0.78), teat angle and teat placement (0.70), and back legs and back feet (0.64). The genetic correlations estimated between conformation traits and milk yield across the first lactation demonstrated changes during this period. The majority of correlations estimated between milk yield and the udder and teat traits were negative. Therefore, future breeding programs would benefit from including these traits to ensure that selection for increased productivity is not accompanied by any unwanted change in functional fitness. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  6. Hypothesis testing in functional linear regression models with Neyman's truncation and wavelet thresholding for longitudinal data.

    PubMed

    Yang, Xiaowei; Nie, Kun

    2008-03-15

    Longitudinal data sets in biomedical research often consist of large numbers of repeated measures. In many cases, the trajectories do not look globally linear or polynomial, making it difficult to summarize the data or test hypotheses using standard longitudinal data analysis based on various linear models. An alternative approach is to apply the approaches of functional data analysis, which directly target the continuous nonlinear curves underlying discretely sampled repeated measures. For the purposes of data exploration, many functional data analysis strategies have been developed based on various schemes of smoothing, but fewer options are available for making causal inferences regarding predictor-outcome relationships, a common task seen in hypothesis-driven medical studies. To compare groups of curves, two testing strategies with good power have been proposed for high-dimensional analysis of variance: the Fourier-based adaptive Neyman test and the wavelet-based thresholding test. Using a smoking cessation clinical trial data set, this paper demonstrates how to extend the strategies for hypothesis testing into the framework of functional linear regression models (FLRMs) with continuous functional responses and categorical or continuous scalar predictors. The analysis procedure consists of three steps: first, apply the Fourier or wavelet transform to the original repeated measures; then fit a multivariate linear model in the transformed domain; and finally, test the regression coefficients using either adaptive Neyman or thresholding statistics. Since a FLRM can be viewed as a natural extension of the traditional multiple linear regression model, the development of this model and computational tools should enhance the capacity of medical statistics for longitudinal data.

  7. Development of non-linear models predicting daily fine particle concentrations using aerosol optical depth retrievals and ground-based measurements at a municipality in the Brazilian Amazon region

    NASA Astrophysics Data System (ADS)

    Gonçalves, Karen dos Santos; Winkler, Mirko S.; Benchimol-Barbosa, Paulo Roberto; de Hoogh, Kees; Artaxo, Paulo Eduardo; de Souza Hacon, Sandra; Schindler, Christian; Künzli, Nino

    2018-07-01

    Epidemiological studies generally use particulate matter measurements with diameter less than 2.5 μm (PM2.5) from monitoring networks. Satellite aerosol optical depth (AOD) data have considerable potential for predicting PM2.5 concentrations, and thus provide an alternative method for producing knowledge regarding the level of pollution and its health impact in areas where no ground PM2.5 measurements are available. This is the case in the Brazilian Amazon rainforest region, where forest fires are frequent sources of high pollution. In this study, we applied a non-linear model for predicting PM2.5 concentration from AOD retrievals using interaction terms between average temperature, relative humidity, the sine and cosine of the date over a period of 365.25 days, and the square of the lagged relative residual. Regression performance was tested by comparing the goodness of fit and R2 from linear and non-linear regressions for six different models. The non-linear prediction showed the best performance, explaining on average 82% of the daily PM2.5 concentrations over the whole study period. In the context of Amazonia, this was the first study to predict PM2.5 concentrations using the latest high-resolution AOD products in combination with a test of non-linear model performance. Our results permitted a reliable prediction based on the AOD-PM2.5 relationship and set the basis for further investigations of air pollution impacts in the complex context of the Brazilian Amazon region.
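
The seasonal terms mentioned (sine and cosine of the date over a 365.25-day period, interacting with meteorology) can be laid out as a design matrix. The exact column layout below is an assumption for illustration, not the authors' model specification:

```python
import numpy as np

def seasonal_design(doy, temp, rh):
    """Illustrative design matrix: intercept, annual harmonics with a
    365.25-day period, temperature and relative humidity, and simple
    harmonic-by-meteorology interaction columns."""
    t = 2 * np.pi * np.asarray(doy, float) / 365.25
    s, c = np.sin(t), np.cos(t)
    temp = np.asarray(temp, float)
    rh = np.asarray(rh, float)
    return np.column_stack([np.ones_like(s), s, c, temp, rh,
                            temp * s, temp * c, rh * s, rh * c])
```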

  8. Sex discrimination from the acetabulum in a twentieth-century skeletal sample from France using digital photogrammetry.

    PubMed

    Macaluso, P J

    2011-02-01

    Digital photogrammetric methods were used to collect diameter, area, and perimeter data of the acetabulum for a twentieth-century skeletal sample from France (Georges Olivier Collection, Musée de l'Homme, Paris) consisting of 46 males and 36 females. The measurements were then subjected to both discriminant function and logistic regression analyses in order to develop osteometric standards for sex assessment. Univariate discriminant functions and logistic regression equations yielded overall correct classification accuracy rates for both the left and the right acetabula ranging from 84.1% to 89.6%. The multivariate models developed in this study did not provide increased accuracy over those using only a single variable. Classification sex bias ratios ranged between 1.1% and 7.3% for the majority of models. The results of this study, therefore, demonstrate that metric analysis of acetabular size provides a highly accurate, and easily replicable, method of discriminating sex in this documented skeletal collection. The results further suggest that the addition of area and perimeter data derived from digital images may provide a more effective method of sex assessment than that offered by traditional linear measurements alone. Copyright © 2010 Elsevier GmbH. All rights reserved.

  9. Evaluation of genotype x environment interactions in cotton using the method proposed by Eberhart and Russell and reaction norm models.

    PubMed

    Alves, R S; Teodoro, P E; Farias, F C; Farias, F J C; Carvalho, L P; Rodrigues, J I S; Bhering, L L; Resende, M D V

    2017-08-17

    Cotton produces one of the most important textile fibers of the world and has great relevance in the world economy. It is an economically important crop in Brazil, which is the world's fifth largest producer. However, studies evaluating the genotype x environment (G x E) interactions in cotton are scarce in this country. Therefore, the goal of this study was to evaluate the G x E interactions in two important traits in cotton (fiber yield and fiber length) using the method proposed by Eberhart and Russell (simple linear regression) and reaction norm models (random regression). Eight trials with sixteen upland cotton genotypes, conducted in a randomized block design, were used. It was possible to identify a genotype with wide adaptability and stability for both traits. Reaction norm models have excellent theoretical and practical properties and led to more informative and accurate results than the method proposed by Eberhart and Russell and should, therefore, be preferred. Curves of genotypic values as a function of the environmental gradient, which predict the behavior of the genotypes along the environmental gradient, were generated. These curves make possible the recommendation to untested environmental levels.
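
The Eberhart and Russell method regresses each genotype's performance on an environmental index (the environment mean minus the grand mean); a slope near 1 indicates average adaptability. A minimal sketch of that calculation, not the authors' full analysis:

```python
import numpy as np

def eberhart_russell(yields):
    """Eberhart & Russell adaptability slopes. `yields` is a
    (genotypes x environments) matrix; each genotype's b_i is its
    simple-linear-regression slope on the environmental index.
    The slopes always average to 1 by construction."""
    Y = np.asarray(yields, float)
    env_index = Y.mean(axis=0) - Y.mean()      # environmental index
    slopes = [np.polyfit(env_index, g, 1)[0]   # one regression per genotype
              for g in Y]
    return np.array(slopes)
```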

  10. Lipase-catalyzed synthesis of palmitanilide: Kinetic model and antimicrobial activity study.

    PubMed

    Liu, Kuan-Miao; Liu, Kuan-Ju

    2016-01-01

    Enzymatic syntheses of fatty acid anilides are important owing to their wide range of industrial applications in detergents, shampoos, cosmetics, and surfactant formulations. The amidation activity of Mucor miehei lipase Lipozyme IM20 was investigated for the direct amidation of triacylglycerol in organic solvents. The process parameters (reaction temperature, substrate molar ratio, enzyme amount) were optimized to achieve the highest yield of anilide. The maximum yield of palmitanilide (88.9%) was achieved after 24 h of reaction at 40 °C at an enzyme concentration of 1.4% (70 mg). The kinetics of lipase-catalyzed amidation of aniline with tripalmitin were investigated. The reaction rate could be described in terms of the Michaelis-Menten equation with a Ping-Pong Bi-Bi mechanism and competitive inhibition by both substrates. The kinetic constants were estimated by non-linear regression using enzyme kinetics modules. The operational stability study showed that Lipozyme IM20 retained 38.1% of its initial activity for the synthesis of palmitanilide (even after repeated use for 48 h). Palmitanilide, a fatty acid amide, exhibited potent antimicrobial activity toward Bacillus cereus. Copyright © 2015 Elsevier Inc. All rights reserved.
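
Non-linear regression of kinetic constants can be sketched with the single-substrate Michaelis-Menten form and synthetic data; the paper's Ping-Pong Bi-Bi model with substrate inhibition is more elaborate, so this is an illustrative simplification only:

```python
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(s, vmax, km):
    """Single-substrate Michaelis-Menten rate law v = Vmax*S/(Km+S)."""
    return vmax * s / (km + s)

# Synthetic (substrate, rate) data generated from known constants,
# then recovered by non-linear least squares.
s = np.array([0.1, 0.25, 0.5, 1.0, 2.0, 5.0])
v = michaelis_menten(s, 2.0, 0.5)
(vmax_hat, km_hat), _ = curve_fit(michaelis_menten, s, v, p0=[1.0, 1.0])
```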

  11. Economic consequences of aviation system disruptions: A reduced-form computable general equilibrium analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Zhenhua; Rose, Adam Z.; Prager, Fynnwin

    The state-of-the-art approach to economic consequence analysis (ECA) is computable general equilibrium (CGE) modeling. However, such models contain thousands of equations and cannot readily be incorporated into the computerized systems used by policy analysts to yield estimates of the economic impacts of various types of transportation system failures due to natural hazards, human-related attacks or technological accidents. This paper presents a reduced-form approach that simplifies the analytical content of CGE models to make them more transparent and enhance their utilization potential. The reduced-form CGE analysis is conducted by first running simulations one hundred times, varying key parameters, such as the magnitude of the initial shock, duration, location, remediation, and resilience, according to a Latin Hypercube sampling procedure. Statistical analysis is then applied to the resulting "synthetic data" in the form of both ordinary least squares and quantile regression. The analysis yields linear equations that are incorporated into a computerized system and utilized along with Monte Carlo simulation methods for propagating uncertainties in economic consequences. Although our demonstration and discussion focus on aviation system disruptions caused by terrorist attacks, the approach can be applied to a broad range of threat scenarios.

  12. Effects of porosity on weld-joint tensile strength of aluminum alloys

    NASA Technical Reports Server (NTRS)

    Lovoy, C. V.

    1974-01-01

    Tensile properties in defect-free weldments of aluminum alloys 2014-T6 and 2219-T87 (sheet and plate) are shown to be related to the level or concentration of induced simulated porosity. The scatter diagram shows that the ultimate tensile strength of the weldments displays the most pronounced linear relationship with the level of porosity. The relationships between yield strength or elongation and porosity are either trivial or inconsequential at lower and intermediate levels of porosity content. At highly concentrated levels of porosity, both yield strength and elongation values decrease markedly. Correlation coefficients were obtained by simple straight-line regression between ultimate tensile strength and pore level. The coefficients were greater, indicating a better correlation, when using accumulated pore area or accumulated pore volume rather than accumulated pore diameters. These relationships provide a useful tool for assessing the existing aerospace radiographic acceptance standards with respect to permissible porosity. In addition, these relationships, in combination with known design load requirements, will serve as an engineering guideline in determining when a weld repair is necessary based on the accumulated pore level as detected by radiographic techniques.

  13. Mapping Quantitative Trait Loci Associated with Root Traits Using Sequencing-Based Genotyping Chromosome Segment Substitution Lines Derived from 9311 and Nipponbare in Rice (Oryza sativa L.).

    PubMed

    Zhou, Yong; Dong, Guichun; Tao, Yajun; Chen, Chen; Yang, Bin; Wu, Yue; Yang, Zefeng; Liang, Guohua; Wang, Baohe; Wang, Yulong

    2016-01-01

    Identification of quantitative trait loci (QTLs) associated with rice root morphology provides useful information for avoiding drought stress and maintaining yield under irrigation. In this study, a set of chromosome segment substitution lines derived from 9311 as the recipient and Nipponbare as the donor was used to analyze root morphology. By combining the resequencing-based bin-map with multiple linear regression analysis, QTL identification was conducted for root number (RN), total root length (TRL), root dry weight (RDW), maximum root length (MRL), root thickness (RTH), total absorption area (TAA) and root vitality (RV), using the CSSL population grown under hydroponic conditions. A total of thirty-eight QTLs were identified: six for TRL, six for RDW, eight for MRL, four for RTH, seven for RN, two for TAA, and five for RV. The phenotypic variance explained by these QTLs ranged from 2.23% to 37.08%, and four single QTLs each explained more than 10% of the phenotypic variance for three root traits. We also examined the correlations between grain yield (GY) and root traits, and found that TRL, RTH and MRL had significantly positive correlations with GY, while TRL, RDW and MRL had significantly positive correlations with biomass yield (BY). Several QTLs identified in our population were co-localized with loci for grain yield or biomass. This information may be immediately exploited for improving rice water and fertilizer use efficiency in molecular breeding of root system architectures.

  14. Stratification for the propensity score compared with linear regression techniques to assess the effect of treatment or exposure.

    PubMed

    Senn, Stephen; Graf, Erika; Caputo, Angelika

    2007-12-30

    Stratifying and matching by the propensity score are increasingly popular approaches to deal with confounding in medical studies investigating effects of a treatment or exposure. A more traditional alternative technique is the direct adjustment for confounding in regression models. This paper discusses fundamental differences between the two approaches, with a focus on linear regression and propensity score stratification, and identifies points to be considered for an adequate comparison. The treatment estimators are examined for unbiasedness and efficiency. This is illustrated in an application to real data and supplemented by an investigation on properties of the estimators for a range of underlying linear models. We demonstrate that in specific circumstances the propensity score estimator is identical to the effect estimated from a full linear model, even if it is built on coarser covariate strata than the linear model. As a consequence the coarsening property of the propensity score-adjustment for a one-dimensional confounder instead of a high-dimensional covariate-may be viewed as a way to implement a pre-specified, richly parametrized linear model. We conclude that the propensity score estimator inherits the potential for overfitting and that care should be taken to restrict covariates to those relevant for outcome. Copyright (c) 2007 John Wiley & Sons, Ltd.
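A minimal numerical sketch of the stratified propensity score estimator discussed above, on a simulated cohort with a single confounder. For simplicity the true propensity model is used to form strata; a real analysis would estimate the score, for example by logistic regression.

```python
import math
import random
import statistics

random.seed(1)
n = 2000
# a single confounder z raises both the chance of treatment and the outcome
z = [random.gauss(0, 1) for _ in range(n)]
e = [1 / (1 + math.exp(-zi)) for zi in z]           # true propensity score
t = [1 if random.random() < ei else 0 for ei in e]  # treatment indicator
y = [2.0 * ti + 1.5 * zi + random.gauss(0, 1) for ti, zi in zip(t, z)]

# stratify on propensity-score quintiles; average within-stratum contrasts
order = sorted(range(n), key=lambda i: e[i])
effects, weights = [], []
for s in range(5):
    idx = order[s * n // 5:(s + 1) * n // 5]
    treated = [y[i] for i in idx if t[i] == 1]
    control = [y[i] for i in idx if t[i] == 0]
    if treated and control:
        effects.append(statistics.mean(treated) - statistics.mean(control))
        weights.append(len(idx))
ate = sum(w * d for w, d in zip(weights, effects)) / sum(weights)

# the naive contrast, ignoring the confounder, is biased upward
naive = statistics.mean([yi for yi, ti in zip(y, t) if ti == 1]) - \
        statistics.mean([yi for yi, ti in zip(y, t) if ti == 0])
```

With the true effect set to 2.0, the stratified estimate lands near 2 while the unadjusted contrast absorbs the confounding through z.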

  15. Decomposition of Near-Infrared Spectroscopy Signals Using Oblique Subspace Projections: Applications in Brain Hemodynamic Monitoring.

    PubMed

    Caicedo, Alexander; Varon, Carolina; Hunyadi, Borbala; Papademetriou, Maria; Tachtsidis, Ilias; Van Huffel, Sabine

    2016-01-01

    Clinical data comprise a large number of synchronously collected biomedical signals measured at different locations. Deciphering the interrelationships of these signals can yield important information about their dependence, providing useful clinical diagnostic data. For instance, by computing the coupling between near-infrared spectroscopy (NIRS) signals and systemic variables, the status of the hemodynamic regulation mechanisms can be assessed. In this paper we introduce an algorithm for the decomposition of NIRS signals into additive components. The algorithm, SIgnal DEcomposition based on Oblique Subspace Projections (SIDE-ObSP), assumes that the measured NIRS signal is a linear combination of the systemic measurements, following the linear regression model y = Ax + ϵ. SIDE-ObSP decomposes the output such that each component in the decomposition represents the sole linear influence of one corresponding regressor variable. This decomposition scheme aims at providing a better understanding of the relation between NIRS and systemic variables, and at providing a framework for the clinical interpretation of regression algorithms, thereby facilitating their introduction into clinical practice. SIDE-ObSP combines oblique subspace projections (ObSP) with the structure of a mean average system in order to define adequate signal subspaces. To guarantee smoothness in the estimated regression parameters, as observed in normal physiological processes, we impose Tikhonov regularization using a matrix differential operator. We evaluate the performance of SIDE-ObSP by using a synthetic dataset, and present two case studies in the field of cerebral hemodynamics monitoring using NIRS. In addition, we compare the performance of this method with other system identification techniques. 
    In the first case study, data from 20 neonates during the first 3 days of life were used; here SIDE-ObSP decoupled the influence of changes in arterial oxygen saturation from the NIRS measurements, facilitating the use of NIRS as a surrogate measure for cerebral blood flow (CBF). The second case study used data from a 3-year-old infant under extracorporeal membrane oxygenation (ECMO); here SIDE-ObSP decomposed cerebral/peripheral tissue oxygenation as a sum of the partial contributions from different systemic variables, facilitating the comparison between the effects of each systemic variable on the cerebral/peripheral hemodynamics.
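The regression-based decomposition idea behind the model y = Ax + ϵ (though not the oblique-projection machinery of SIDE-ObSP itself) can be sketched with two regressors and a plain Tikhonov penalty on the coefficient norm; the signals and coefficients below are synthetic stand-ins.

```python
import random

random.seed(2)
n = 200
# synthetic "systemic" regressors (e.g. oxygen saturation, blood pressure)
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
# synthetic "NIRS" output following y = Ax + eps
y = [0.8 * a + 0.3 * b + random.gauss(0, 0.2) for a, b in zip(x1, x2)]

lam = 0.1  # Tikhonov penalty (identity operator here, not a differential one)
# solve (A^T A + lam I) w = A^T y for the two coefficients by hand
s11 = sum(a * a for a in x1) + lam
s22 = sum(b * b for b in x2) + lam
s12 = sum(a * b for a, b in zip(x1, x2))
r1 = sum(a * c for a, c in zip(x1, y))
r2 = sum(b * c for b, c in zip(x2, y))
det = s11 * s22 - s12 * s12
w1 = (s22 * r1 - s12 * r2) / det
w2 = (s11 * r2 - s12 * r1) / det

# additive decomposition: one component per regressor, plus a residual
comp1 = [w1 * a for a in x1]
comp2 = [w2 * b for b in x2]
residual = [c - c1 - c2 for c, c1, c2 in zip(y, comp1, comp2)]
```

Each component isolates the linear influence of one regressor on the output, which is the interpretation the paper builds its clinical framework around.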

  16. Yield Strength Testing in Human Cadaver Nasal Septal Cartilage and L-Strut Constructs.

    PubMed

    Liu, Yuan F; Messinger, Kelton; Inman, Jared C

    2017-01-01

    To our knowledge, yield strength testing in human nasal septal cartilage has not been reported to date. An understanding of the basic mechanics of the nasal septum may help surgeons decide how much of an L-strut to preserve and how much grafting is needed. To determine the factors correlated with yield strength of the cartilaginous nasal septum and to explore the association between L-strut width and thickness in determining yield strength. In an anatomy laboratory, yield strength of rectangular pieces of fresh cadaver nasal septal cartilage was measured, and regression was performed to identify the factors correlated with yield strength. To measure yield strength in L-shaped models, 4 bonded paper L-struts models were constructed for every possible combination of the width and thickness, for a total of 240 models. Mathematical modeling using the resultant data with trend lines and surface fitting was performed to quantify the associations among L-strut width, thickness, and yield strength. The study dates were November 1, 2015, to April 1, 2016. The factors correlated with nasal cartilage yield strength and the associations among L-strut width, thickness, and yield strength in L-shaped models. Among 95 cartilage pieces from 12 human cadavers (mean [SD] age, 67.7 [12.6] years) and 240 constructed L-strut models, L-strut thickness was the only factor correlated with nasal septal cartilage yield strength (coefficient for thickness, 5.54; 95% CI, 4.08-7.00; P < .001), with an adjusted R2 correlation coefficient of 0.37. The mean (SD) yield strength R2 varied with L-strut thickness exponentially (0.93 [0.06]) for set widths, and it varied with L-strut width linearly (0.82 [0.11]) or logarithmically (0.85 [0.17]) for set thicknesses. A 3-dimensional surface model of yield strength with L-strut width and thickness as variables was created using a 2-dimensional gaussian function (adjusted R2 = 0.94). 
    Estimated yield strengths were generated from the model to allow determination of the desired yield strength with different permutations of L-strut width and thickness. In this study of human cadaver nasal septal cartilage, L-strut thickness was significantly associated with yield strength. In a bonded paper L-strut model, L-strut thickness had a more important role in determining yield strength than L-strut width. Surgeons should consider the thickness of potential L-struts when determining the amount of cartilaginous septum to harvest and graft.

  17. Non-Linear Approach in Kinesiology Should Be Preferred to the Linear--A Case of Basketball.

    PubMed

    Trninić, Marko; Jeličić, Mario; Papić, Vladan

    2015-07-01

    In kinesiology, medicine, biology and psychology, where research focuses on dynamical self-organized systems, complex connections exist between variables. The non-linear nature of complex systems is discussed and explained using the example of non-linear anthropometric predictors of performance in basketball. Previous studies interpreted relations between anthropometric features and measures of effectiveness in basketball by (a) using linear correlation models, and (b) including all basketball athletes in the same sample of participants regardless of their playing position. In this paper, the significance and character of linear and non-linear relations between simple anthropometric predictors (AP) and performance criteria consisting of situation-related measures of effectiveness (SE) in basketball were determined and evaluated. The sample of participants consisted of top-level junior basketball players divided into three groups according to their playing time (8 minutes or more per game) and playing position: guards (N = 42), forwards (N = 26) and centers (N = 40). Linear and non-linear regression models were calculated simultaneously and separately for each group. The conclusion is that non-linear regressions are frequently superior to linear correlations when interpreting the actual association logic among research variables.

  18. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background: Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective: We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design: Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results: At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions: Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

  19. Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

    PubMed

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A

    2013-01-01

    Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.
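Quantile regression, the central tool of this study, minimizes the pinball (check) loss instead of squared error, so different quantiles of the outcome get their own regression lines. A minimal single-predictor fit by subgradient descent is sketched below; the data are simulated and the learning rate and epoch count are chosen ad hoc.

```python
import random

def fit_quantile(x, y, tau, lr=0.01, epochs=2000):
    """Linear quantile fit by subgradient descent on the pinball loss."""
    a, b = 0.0, 0.0  # intercept, slope
    n = len(x)
    for _ in range(epochs):
        ga = gb = 0.0
        for xi, yi in zip(x, y):
            r = yi - (a + b * xi)
            g = -tau if r > 0 else (1 - tau)  # subgradient w.r.t. the prediction
            ga += g
            gb += g * xi
        a -= lr * ga / n
        b -= lr * gb / n
    return a, b

random.seed(3)
x = [random.uniform(0, 10) for _ in range(500)]
y = [1.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]
a50, b50 = fit_quantile(x, y, 0.5)  # median regression line
a10, b10 = fit_quantile(x, y, 0.1)  # 10th-percentile line sits lower
```

Fitting several quantiles side by side, as the paper does for height-for-age, shows whether covariate effects differ across the outcome distribution.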

  20. Analysis and prediction of flow from local source in a river basin using a Neuro-fuzzy modeling tool.

    PubMed

    Aqil, Muhammad; Kita, Ichiro; Yano, Akira; Nishiyama, Soichi

    2007-10-01

    Traditionally, the multiple linear regression technique has been one of the most widely used models for simulating hydrological time series. However, when the nonlinear phenomenon is significant, multiple linear regression fails to develop an appropriate predictive model. Recently, neuro-fuzzy systems have gained much popularity for calibrating nonlinear relationships. This study evaluated the potential of a neuro-fuzzy system as an alternative to the traditional statistical regression technique for predicting flow from a local source in a river basin. The effectiveness of the proposed identification technique was demonstrated through a simulation study of the river flow time series of the Citarum River in Indonesia. Furthermore, in order to quantify the uncertainty associated with the estimation of river flow, a Monte Carlo simulation was performed. As a comparison, the multiple linear regression analysis being used by the Citarum River Authority was also examined using various statistical indices. The simulation results using 95% confidence intervals indicated that the neuro-fuzzy model consistently underestimated the magnitude of high flows, while low and medium flow magnitudes were estimated closer to the observed data. The comparison of prediction accuracy indicated that the neuro-fuzzy approach was more accurate in predicting river flow dynamics. The neuro-fuzzy model improved the root mean square error (RMSE) and mean absolute percentage error (MAPE) of the multiple linear regression forecasts by about 13.52% and 10.73%, respectively. Considering its simplicity and efficiency, the neuro-fuzzy model is recommended as an alternative tool for modeling flow dynamics in the study area.
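The RMSE and MAPE criteria used for the model comparison above are straightforward to compute; the observed and predicted flows below are invented for the sketch, with the second prediction series deliberately closer to the observations.

```python
import math

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mape(obs, pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

# hypothetical daily flows (m^3/s): observed vs two competing model outputs
obs        = [120.0, 80.0, 95.0, 150.0, 60.0]
pred_mlr   = [100.0, 90.0, 110.0, 120.0, 70.0]
pred_fuzzy = [115.0, 83.0, 99.0, 140.0, 63.0]
```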

  1. An hourly PM10 diagnosis model for the Bilbao metropolitan area using a linear regression methodology.

    PubMed

    González-Aparicio, I; Hidalgo, J; Baklanov, A; Padró, A; Santa-Coloma, O

    2013-07-01

    There is extensive evidence of the negative health impacts linked to rises in the regional background of particulate matter (PM10) levels. These levels are often elevated over urban areas, making them one of the main air pollution concerns. This is the case in the Bilbao metropolitan area, Spain. This study describes a data-driven model to diagnose PM10 levels in Bilbao at hourly intervals. The model is built with a training period of 7 years of historical data covering different urban environments (inland, city centre and coastal sites). The explanatory variables are quantitative (log[NO2], temperature, short-wave incoming radiation, wind speed and direction, specific humidity, hour and vehicle intensity) and qualitative (working days/weekends, season (winter/summer), the hour (from 00 to 23 UTC) and precipitation/no precipitation). Three different linear regression models are compared: simple linear regression; linear regression with interaction terms (INT); and linear regression with interaction terms selected following Sawa's Bayesian Information Criterion (INT-BIC). Each type of model is calculated over two different periods: the training dataset (6 years) and the testing dataset (1 year). The results show that the INT-BIC-based model (R² = 0.42) is the best. Results were R of 0.65, 0.63 and 0.60 for the city centre, inland and coastal sites, respectively, a level of confidence similar to that of state-of-the-art methodology. The related error calculated for longer time intervals (monthly or seasonal means) diminished significantly with respect to shorter periods (R of 0.75-0.80 for monthly means and R of 0.80-0.98 for seasonal means).

  2. Visual field progression in glaucoma: estimating the overall significance of deterioration with permutation analyses of pointwise linear regression (PoPLR).

    PubMed

    O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H

    2012-10-01

    To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance with that of pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42% at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration, individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.
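The permutation logic of PoPLR (reorder the visit sequence, recompute a combined deterioration statistic, and compare with the observed value) can be sketched as follows. For simplicity the combined statistic here is the negated sum of negative pointwise slopes, a crude stand-in for the Truncated Product combination of pointwise p-values used in the paper, and the "visual field" data are simulated.

```python
import random

def slope(t, v):
    """Least squares slope of values v against times t."""
    n = len(t)
    mt, mv = sum(t) / n, sum(v) / n
    return sum((a - mt) * (b - mv) for a, b in zip(t, v)) / \
        sum((a - mt) ** 2 for a in t)

def poplr_like_p(series, n_perm=500):
    """Permutation p-value for overall deterioration. The same reordering of
    visits is applied at every location, as in PoPLR."""
    n_visits = len(series[0])
    times = list(range(n_visits))
    def stat(order):
        return -sum(min(0.0, slope(times, [loc[i] for i in order]))
                    for loc in series)
    observed = stat(times)
    random.seed(4)
    count = 0
    for _ in range(n_perm):
        perm = times[:]
        random.shuffle(perm)
        if stat(perm) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

random.seed(5)
# simulated field: 10 deteriorating locations among 44 stable ones, 10 visits
worsening = [[30 - 0.8 * t + random.gauss(0, 1) for t in range(10)]
             for _ in range(10)]
stable = [[30 + random.gauss(0, 1) for _ in range(10)] for _ in range(44)]
p = poplr_like_p(worsening + stable)
```

Because permutation destroys any genuine time trend while preserving the cross-location correlation within a visit, the observed statistic falls far into the tail of the permutation distribution when real deterioration is present.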

  3. A Model Comparison for Count Data with a Positively Skewed Distribution with an Application to the Number of University Mathematics Courses Completed

    ERIC Educational Resources Information Center

    Liou, Pey-Yan

    2009-01-01

    The current study examines three regression models: OLS (ordinary least square) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…

  4. A non-destructive selection criterion for fibre content in jute : II. Regression approach.

    PubMed

    Arunachalam, V; Iyer, R D

    1974-01-01

    An experiment with ten populations of jute, comprising varieties and mutants of the two species Corchorus olitorius and C. capsularis, was conducted at two different locations with the object of evolving an effective criterion for selecting superior single plants for fibre yield. At Delhi, variation existed only between varieties as a group and mutants as a group, while at Pusa variation also existed among the mutant populations of C. capsularis. A multiple regression approach was used to find the optimum combination of characters for prediction of fibre yield. A process of successive elimination of characters, based on the coefficient of determination provided by individual regression equations, was employed to arrive at the optimal set of characters for predicting fibre yield. It was found that plant height, basal and mid-diameters, and basal and mid-dry fibre weights would provide such an optimal set.

  5. Genetic parameters for body condition score, body weight, milk yield, and fertility estimated using random regression models.

    PubMed

    Berry, D P; Buckley, F; Dillon, P; Evans, R D; Rath, M; Veerkamp, R F

    2003-11-01

    Genetic (co)variances between body condition score (BCS), body weight (BW), milk yield, and fertility were estimated using a random regression animal model extended to multivariate analysis. The data analyzed included 81,313 BCS observations, 91,937 BW observations, and 100,458 milk test-day yields from 8725 multiparous Holstein-Friesian cows. A cubic random regression was sufficient to model the changing genetic variances for BCS, BW, and milk across different days in milk. The genetic correlations between BCS and fertility changed little over the lactation; genetic correlations between BCS and interval to first service and between BCS and pregnancy rate to first service varied from -0.47 to -0.31, and from 0.15 to 0.38, respectively. This suggests that maximum genetic gain in fertility from indirect selection on BCS should be based on measurements taken in midlactation when the genetic variance for BCS is largest. Selection for increased BW resulted in shorter intervals to first service, but more services and poorer pregnancy rates; genetic correlations between BW and pregnancy rate to first service varied from -0.52 to -0.45. Genetic selection for higher lactation milk yield alone through selection on increased milk yield in early lactation is likely to have a more deleterious effect on genetic merit for fertility than selection on higher milk yield in late lactation.

  6. FIRE: an SPSS program for variable selection in multiple linear regression analysis via the relative importance of predictors.

    PubMed

    Lorenzo-Seva, Urbano; Ferrando, Pere J

    2011-03-01

    We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the regression equation, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

  7. Linear regression based on Minimum Covariance Determinant (MCD) and TELBS methods on the productivity of phytoplankton

    NASA Astrophysics Data System (ADS)

    Gusriani, N.; Firdaniza

    2018-03-01

    The existence of outliers in multiple linear regression analysis causes the Gaussian assumption to be unfulfilled. If the least squares method is nevertheless applied to such data, it produces a model that cannot represent most of the data. A regression method robust to outliers is therefore needed. This paper compares the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contain outliers. Based on the robust coefficient of determination, the MCD method produces a better model than the TELBS method.

  8. Orthogonal Projection in Teaching Regression and Financial Mathematics

    ERIC Educational Resources Information Center

    Kachapova, Farida; Kachapov, Ilias

    2010-01-01

    Two improvements in teaching linear regression are suggested. The first is to include the population regression model at the beginning of the topic. The second is to use a geometric approach: to interpret the regression estimate as an orthogonal projection and the estimation error as the distance (which is minimized by the projection). Linear…

  9. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  10. Simultaneous fitting of genomic-BLUP and Bayes-C components in a genomic prediction model.

    PubMed

    Iheshiulor, Oscar O M; Woolliams, John A; Svendsen, Morten; Solberg, Trygve; Meuwissen, Theo H E

    2017-08-24

    The rapid adoption of genomic selection is due to two key factors: availability of both high-throughput dense genotyping and statistical methods to estimate and predict breeding values. The development of such methods is still ongoing and, so far, there is no consensus on the best approach. Currently, the linear and non-linear methods for genomic prediction (GP) are treated as distinct approaches. The aim of this study was to evaluate the implementation of an iterative method (called GBC) that incorporates aspects of both linear [genomic-best linear unbiased prediction (G-BLUP)] and non-linear (Bayes-C) methods for GP. The iterative nature of GBC makes it less computationally demanding similar to other non-Markov chain Monte Carlo (MCMC) approaches. However, as a Bayesian method, GBC differs from both MCMC- and non-MCMC-based methods by combining some aspects of G-BLUP and Bayes-C methods for GP. Its relative performance was compared to those of G-BLUP and Bayes-C. We used an imputed 50 K single-nucleotide polymorphism (SNP) dataset based on the Illumina Bovine50K BeadChip, which included 48,249 SNPs and 3244 records. Daughter yield deviations for somatic cell count, fat yield, milk yield, and protein yield were used as response variables. GBC was frequently (marginally) superior to G-BLUP and Bayes-C in terms of prediction accuracy and was significantly better than G-BLUP only for fat yield. On average across the four traits, GBC yielded a 0.009 and 0.006 increase in prediction accuracy over G-BLUP and Bayes-C, respectively. Computationally, GBC was very much faster than Bayes-C and similar to G-BLUP. Our results show that incorporating some aspects of G-BLUP and Bayes-C in a single model can improve accuracy of GP over the commonly used method: G-BLUP. Generally, GBC did not statistically perform better than G-BLUP and Bayes-C, probably due to the close relationships between reference and validation individuals. 
Nevertheless, it is a flexible tool, in the sense, that it simultaneously incorporates some aspects of linear and non-linear models for GP, thereby exploiting family relationships while also accounting for linkage disequilibrium between SNPs and genes with large effects. The application of GBC in GP merits further exploration.

  11. On summary measure analysis of linear trend repeated measures data: performance comparison with two competing methods.

    PubMed

    Vossoughi, Mehrdad; Ayatollahi, S M T; Towhidi, Mina; Ketabchi, Farzaneh

    2012-03-22

    The summary measure approach (SMA) is sometimes the only applicable tool for the analysis of repeated measurements in medical research, especially when the number of measurements is relatively large. This study aimed to describe techniques based on summary measures for the analysis of linear-trend repeated measures data and to compare the performance of the SMA with that of the linear mixed model (LMM) and the unstructured multivariate approach (UMA). Practical guidelines based on the least squares regression slope and the mean response over time for each subject were provided to test time, group, and interaction effects. Through Monte Carlo simulation studies, the efficacy of the SMA versus the LMM and the traditional UMA, under different types of covariance structures, was illustrated. All the methods were also employed to analyze two real data examples. Based on the simulation and example results, the SMA completely dominated the traditional UMA and performed convincingly close to the best-fitting LMM in testing all the effects. However, the LMM was often not robust and led to non-sensible results when the covariance structure for the errors was misspecified. The results argue for discarding the UMA, which often yielded extremely conservative inferences for such data. Summary measures were shown to be a simple, safe and powerful approach, with a generally negligible loss of efficiency compared to the best-fitting LMM. The SMA is recommended as the first choice for reliably analyzing linear-trend data with a moderate to large number of measurements and/or small to moderate sample sizes.
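The slope-based summary measure approach reduces each subject's series to a single number before a plain two-group comparison; a sketch on simulated linear-trend data follows (group sizes, visit times, and effect sizes are arbitrary).

```python
import random
import statistics

def ls_slope(times, values):
    """Least squares slope: each subject's series reduced to one number."""
    n = len(times)
    mt, mv = sum(times) / n, sum(values) / n
    return sum((t - mt) * (v - mv) for t, v in zip(times, values)) / \
        sum((t - mt) ** 2 for t in times)

random.seed(6)
times = [0, 1, 2, 3, 4]
# simulated trial: treatment halves the rate of increase of the response
control = [[10 + 2.0 * t + random.gauss(0, 1) for t in times] for _ in range(20)]
treated = [[10 + 1.0 * t + random.gauss(0, 1) for t in times] for _ in range(20)]
s_ctrl = [ls_slope(times, v) for v in control]
s_trt = [ls_slope(times, v) for v in treated]

# the group effect is then a plain two-sample contrast on the summaries
diff = statistics.mean(s_ctrl) - statistics.mean(s_trt)
```

A standard two-sample t-test on `s_ctrl` versus `s_trt` would supply the inference step; the point of the approach is that the repeated-measures structure has already been absorbed into the per-subject slopes.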

  12. Analysis of Learning Curve Fitting Techniques.

    DTIC Science & Technology

    1987-09-01

    1986. 15. Neter, John and others. Applied Linear Regression Models. Homewood IL: Irwin, 19-33. 16. SAS User's Guide: Basics, Version 5 Edition. SAS... Linear Regression Techniques (15:23-52). Random errors are assumed to be normally distributed when using ordinary least-squares, according to Johnston... lot estimated by the improvement curve formula. For a more detailed explanation of the ordinary least-squares technique, see Neter et al., Applied

  13. On vertical profile of ozone at Syowa

    NASA Technical Reports Server (NTRS)

    Chubachi, Shigeru

    1994-01-01

    The difference in the vertical ozone profile at Syowa between 1966-1981 and 1982-1988 is shown. The month-height cross section of the slope of the linear regressions between ozone partial pressure and 100-mb temperature is also shown. The vertically integrated values of the slopes are in close agreement with the slopes calculated by linear regression of Dobson total ozone on 100-mb temperature in the period of 1982-1988.

  14. Binding affinity toward human prion protein of some anti-prion compounds - Assessment based on QSAR modeling, molecular docking and non-parametric ranking.

    PubMed

    Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija

    2018-01-01

    The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrPC) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs, applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done by applying the sum of ranking differences, a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.
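    The multiple linear regression step of a QSAR analysis can be sketched as below. The descriptor matrix and responses are invented (the descriptor names are stand-ins, e.g. lipophilicity, molar mass, polar surface area), and the responses are constructed to be exactly linear in the descriptors so the fit is recoverable.

```python
import numpy as np

# Hypothetical QSAR descriptor matrix: rows = analogs, columns = descriptors
# (e.g. logP, molar mass, polar surface area); all values are illustrative.
X = np.array([[2.1, 310.0, 45.0],
              [3.4, 298.0, 38.0],
              [1.8, 325.0, 52.0],
              [4.0, 305.0, 30.0],
              [2.9, 315.0, 41.0]])
# Hypothetical binding affinities, constructed to lie exactly in the
# column space of the descriptors plus an intercept.
y = np.array([1.05, 3.12, 0.45, 3.95, 2.20])

A = np.column_stack([np.ones(len(X)), X])    # prepend an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares
pred = A @ coef                               # fitted binding affinities
```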

  15. Classification of sodium MRI data of cartilage using machine learning.

    PubMed

    Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R

    2015-11-01

    To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers included discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.
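    A toy sketch of the k-nearest-neighbors classifier named in this abstract, using a single hypothetical scalar feature (standing in for, e.g., the Min [STD] sodium concentration); the feature values and class sizes are invented.

```python
def knn_predict(train, labels, x, k=3):
    """Classify x by majority vote among the k nearest training points
    (single scalar feature, absolute-difference distance)."""
    order = sorted(range(len(train)), key=lambda i: abs(train[i] - x))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

# Hypothetical 1-D feature values for 3 controls and 3 OA patients.
feature = [10.0, 11.0, 12.0, 30.0, 31.0, 32.0]
label = ["control"] * 3 + ["OA"] * 3

pred_class = knn_predict(feature, label, 11.5)  # falls in the control cluster
```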

  16. Nonlinear isochrones in murine left ventricular pressure-volume loops: how well does the time-varying elastance concept hold?

    PubMed

    Claessens, T E; Georgakopoulos, D; Afanasyeva, M; Vermeersch, S J; Millar, H D; Stergiopulos, N; Westerhof, N; Verdonck, P R; Segers, P

    2006-04-01

    The linear time-varying elastance theory is frequently used to describe the change in ventricular stiffness during the cardiac cycle. The concept assumes that all isochrones (i.e., curves that connect pressure-volume data occurring at the same time) are linear and have a common volume intercept. Of specific interest is the steepest isochrone, the end-systolic pressure-volume relationship (ESPVR), of which the slope serves as an index for cardiac contractile function. Pressure-volume measurements, achieved with a combined pressure-conductance catheter in the left ventricle of 13 open-chest anesthetized mice, showed a marked curvilinearity of the isochrones. We therefore analyzed the shape of the isochrones by using six regression algorithms (two linear, two quadratic, and two logarithmic, each with a fixed or time-varying intercept) and discussed the consequences for the elastance concept. Our main observations were 1) the volume intercept varies considerably with time; 2) isochrones are equally well described by using quadratic or logarithmic regression; 3) linear regression with a fixed intercept shows poor correlation (R² < 0.75) during isovolumic relaxation and early filling; and 4) logarithmic regression is superior in estimating the fixed volume intercept of the ESPVR. In conclusion, the linear time-varying elastance fails to provide a sufficiently robust model to account for changes in pressure and volume during the cardiac cycle in the mouse ventricle. A new framework accounting for the nonlinear shape of the isochrones needs to be developed.
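    The comparison of linear versus higher-order isochrone fits can be sketched as follows. The pressure-volume samples are synthetic, generated from a logarithmic relation with a volume intercept of 15, not the murine data.

```python
import numpy as np

def r_squared(y, yhat):
    """Coefficient of determination for a fitted isochrone."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic isochrone: volume (uL) and pressure (mmHg) samples at one time
# point, generated from a logarithmic curve with volume intercept V0 = 15.
V = np.array([20.0, 25.0, 30.0, 35.0, 40.0])
P = 5.0 * np.log(V - 15.0) + 2.0

lin = np.polyfit(V, P, 1)    # linear isochrone (time-varying elastance form)
quad = np.polyfit(V, P, 2)   # quadratic isochrone
r2_lin = r_squared(P, np.polyval(lin, V))
r2_quad = r_squared(P, np.polyval(quad, V))
# The quadratic fit captures the curvilinearity the linear fit misses.
```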

  17. Does Nonlinear Modeling Play a Role in Plasmid Bioprocess Monitoring Using Fourier Transform Infrared Spectra?

    PubMed

    Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M

    2017-06-01

    The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performance than the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version cannot be found. The use of nonlinear methods, however, must be weighed against the additional computational cost of tuning their extra parameters, especially when the less computationally demanding linear methods studied here are able to successfully monitor the variables of interest.
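    A minimal numpy sketch of kernel ridge regression (kRR) next to plain ridge regression (RR), on a synthetic strongly nonlinear target; the data, kernel width, and regularization strength are illustrative, not the calibration settings from the study.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

# Synthetic 1-D "calibration" problem with a strongly nonlinear response.
X = np.linspace(0.0, 3.0, 30)[:, None]
y = np.sin(X[:, 0])
lam = 1e-3

# Plain ridge regression (linear in X; intercept omitted for brevity).
w = np.linalg.solve(X.T @ X + lam * np.eye(1), X.T @ y)
lin_pred = X @ w

# Kernel ridge regression: dual coefficients alpha = (K + lam*I)^-1 y.
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
krr_pred = K @ alpha
# The kernelized version fits the nonlinearity the linear model cannot.
```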

  18. Application of General Regression Neural Network to the Prediction of LOD Change

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-Hong; Wang, Qi-Jie; Zhu, Jian-Jun; Zhang, Hao

    2012-01-01

    Traditional methods for predicting the change in length of day (LOD change) are mainly based on linear models, such as the least squares model and the autoregression model. However, the LOD change comprises complicated non-linear factors, and the prediction performance of the linear models is often unsatisfactory. Thus, a non-linear model, the general regression neural network (GRNN), is applied to the prediction of the LOD change, and the result is compared with the predictions obtained from the BP (back propagation) neural network model and other models. The comparison shows that applying the GRNN to the prediction of the LOD change is highly effective and feasible.
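    At its core a GRNN is a Gaussian-kernel weighted average of training targets (a Nadaraya-Watson estimator). A minimal sketch on an invented lag-embedded series standing in for LOD-change data; the values and smoothing parameter are illustrative.

```python
import numpy as np

def grnn_predict(X_train, y_train, x, sigma=0.1):
    """GRNN prediction: Gaussian-kernel weighted average of training targets."""
    d2 = np.sum((X_train - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return float(np.dot(w, y_train) / np.sum(w))

# Invented stand-in for an LOD-change series; predict the next value
# from the two previous ones (lag-2 embedding).
series = np.array([0.1, 0.3, 0.2, 0.4, 0.3, 0.5, 0.4, 0.6])
X = np.column_stack([series[:-2], series[1:-1]])
y = series[2:]

next_value = grnn_predict(X, y, np.array([0.5, 0.4]))
```

    The single smoothing parameter sigma controls how local the average is, which is why GRNN training reduces to a one-dimensional parameter search.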

  19. Estimating effects of limiting factors with regression quantiles

    USGS Publications Warehouse

    Cade, B.S.; Terrell, J.W.; Schroeder, R.L.

    1999-01-01

    In a recent Concepts paper in Ecology, Thomson et al. emphasized that assumptions of conventional correlation and regression analyses fundamentally conflict with the ecological concept of limiting factors, and they called for new statistical procedures to address this problem. The analytical issue is that unmeasured factors may be the active limiting constraint and may induce a pattern of unequal variation in the biological response variable through an interaction with the measured factors. Consequently, changes near the maxima, rather than at the center of response distributions, are better estimates of the effects expected when the observed factor is the active limiting constraint. Regression quantiles provide estimates for linear models fit to any part of a response distribution, including near the upper bounds, and require minimal assumptions about the form of the error distribution. Regression quantiles extend the concept of one-sample quantiles to the linear model by solving an optimization problem of minimizing an asymmetric function of absolute errors. Rank-score tests for regression quantiles provide tests of hypotheses and confidence intervals for parameters in linear models with heteroscedastic errors, conditions likely to occur in models of limiting ecological relations. We used selected regression quantiles (e.g., 5th, 10th, ..., 95th) and confidence intervals to test hypotheses that parameters equal zero for estimated changes in average annual acorn biomass due to forest canopy cover of oak (Quercus spp.) and oak species diversity. Regression quantiles also were used to estimate changes in glacier lily (Erythronium grandiflorum) seedling numbers as a function of lily flower numbers, rockiness, and pocket gopher (Thomomys talpoides fossor) activity, data that motivated the query by Thomson et al. for new statistical procedures. 
Both example applications showed that effects of limiting factors estimated by changes in some upper regression quantile (e.g., 90th-95th) were greater than if effects were estimated by changes in the means from standard linear model procedures. Estimating a range of regression quantiles (e.g., 5th-95th) provides a comprehensive description of biological response patterns for exploratory and inferential analyses in observational studies of limiting factors, especially when sampling large spatial and temporal scales.
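    The optimization behind regression quantiles minimizes an asymmetric ("pinball") absolute-error loss; for one predictor and a small dataset an optimal line passes through two data points, so a brute-force search over point pairs suffices. The heteroscedastic data below are invented to mimic an unmeasured limiting factor.

```python
def pinball_loss(residuals, tau):
    """Asymmetric absolute loss minimized by the tau-th regression quantile."""
    return sum(tau * r if r >= 0 else (tau - 1.0) * r for r in residuals)

def quantile_regression(x, y, tau):
    """Brute-force tau-th regression quantile for one predictor: search all
    lines through pairs of data points and keep the minimum pinball loss."""
    best = None
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if x[i] == x[j]:
                continue
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = pinball_loss([yi - (a + b * xi) for xi, yi in zip(x, y)], tau)
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best[1], best[2]   # intercept, slope

# Invented data with variance increasing in x: responses alternate between
# a lower bound y = x and an upper bound y = 3x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 6, 3, 12, 5, 18, 7, 24]

a10, b10 = quantile_regression(x, y, 0.10)   # tracks the lower bound
a90, b90 = quantile_regression(x, y, 0.90)   # tracks the upper (limiting) bound
```

    The upper quantile's slope is much steeper than the lower quantile's, which is exactly the signature of a limiting factor that mean regression would average away.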

  20. 40 CFR 1066.220 - Linearity verification for chassis dynamometer systems.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... dynamometer speed and torque at least as frequently as indicated in Table 1 of § 1066.215. The intent of... linear regression and the linearity criteria specified in Table 1 of this section. (b) Performance requirements. If a measurement system does not meet the applicable linearity criteria in Table 1 of this...
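    The linearity verification referenced in this excerpt amounts to regressing measured values on reference values and comparing slope, intercept, and goodness of fit against tabulated criteria. The sketch below uses illustrative tolerances and invented readings, not the actual Table 1 values of § 1066.220.

```python
def linearity_check(ref, meas, slope_tol=0.02, icpt_frac=0.01, r2_min=0.990):
    """Least-squares regression of measured vs. reference values, with
    pass/fail checks in the spirit of a linearity verification.
    The tolerance defaults are illustrative, not the regulatory criteria."""
    n = len(ref)
    xbar = sum(ref) / n
    ybar = sum(meas) / n
    sxx = sum((x - xbar) ** 2 for x in ref)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(ref, meas))
    a1 = sxy / sxx                     # regression slope
    a0 = ybar - a1 * xbar              # regression intercept
    fit = [a0 + a1 * x for x in ref]
    ss_res = sum((y - f) ** 2 for y, f in zip(meas, fit))
    ss_tot = sum((y - ybar) ** 2 for y in meas)
    r2 = 1.0 - ss_res / ss_tot
    max_ref = max(abs(x) for x in ref)
    passed = (abs(a1 - 1.0) <= slope_tol
              and abs(a0) <= icpt_frac * max_ref
              and r2 >= r2_min)
    return a1, a0, r2, passed

# Hypothetical reference speeds and dynamometer readings over a sweep.
ref = [0.0, 25.0, 50.0, 75.0, 100.0]
meas = [0.1, 25.2, 49.9, 75.3, 99.8]
a1, a0, r2, passed = linearity_check(ref, meas)
```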
